1. Breakthrough Scalability for Ruby on Rails with MySQL
How the Clustrix Database scales Ruby on Rails
Neil Harkins, Performance Test Engineer, Clustrix
Clayton Cole, Software Deployment Engineer, Clustrix
2. What is Clustrix?
Clustrix is NewSQL:
• Scalability and Fault-Tolerance… without sacrificing ACID compliance.
• Drop-in scalable replacement for MySQL (dump, restore, change the ip address!)
“Radical Scalability, Radical Simplicity”
PLUS…
• Distributed query processing
• Online schema changes
• Multi-master replication slave
• Multi-binlog replication master
• Fast Parallel Backup/Restore
And now…
• DBaaS via partnerships with multiple Cloud/Hosting Providers
3. The Clustrix Database: Previous Benchmarks
October, 2011:
Percona-Clustrix TPC-C Evaluation
Compared:
• Clustrix 3/6/9 nodes
• MySQL w/ Intel SSD
• MySQL w/ FusionIO
Demonstrated how Clustrix provides
linear scale:
more nodes = more performance
4. Ruby on Rails: A popular web application framework
• Open-source “full-stack” web app framework
for the Ruby programming language.
• Uses Model-View-Controller (MVC) paradigm,
abstracts data store behind ORM (“ActiveRecord”),
and lets you focus on your webapp’s features, not infrastructure.
• Philosophy:
• Convention over Configuration
• Don’t Repeat Yourself
• Quick Turnaround / Short Development Cycle
• Promises prototype -> full-featured website in record time
…and delivers.
• As of Feb 2012, Gartner estimates >235k websites use Rails!
5. What’s the catch? Does Rails scale?
“By various metrics Twitter is the biggest Rails site on the net right now.
… The common wisdom in the Rails community at this time
is that scaling Rails is a matter of cost: just throw more CPUs at it.
The problem is that more instances of Rails means more requests to your database.
… Once you hit a certain threshold of traffic, either you need to strip out
all the costly neat stuff that Rails does for you (RJS, ActiveRecord, ActiveSupport, etc.)
or move the slow parts of your application out of Rails, or both.”
- Twitter Developer Alex Payne, 2007-03-29
http://tumblr.yasulab.jp/post/10271634919/5-question-interview-with-twitter-developer-alex-payne
The Answer lies in whether your database can scale
6. Our Goal: Prove that Rails can Scale
To run a performance benchmark that…
• Allows us to observe the database workload from real-world webapp
“scenarios” written in Ruby on Rails,
• At a commercial Rails-hosting environment,
• Against both a standard MySQL server offered by that service, and the same
Clustrix nodes used by our various DBaaS partners.
…does such a benchmark already exist?
Our search resulted only in modules which could be used to measure timing, but
we did not find a comprehensive RoR simulation comparable to a TPC
benchmark.
7. Designing the Benchmark – keeping it real
Tacit Knowledge supplied some Ruby coders to write the benchmark in “The Rails Way”.
• Cut out the View and Controller, concentrate solely on the Model
• Avoids complexities and latencies associated with HTTP load testing
• Still uses the Rails “core”, in particular ActiveRecord
• Tacit added a lot of “knobs” in order to test different ratios of scenario “ingredients”, etc.
• BlueBox was our Rails-hosting service offering physical MySQL servers and virtual Rails servers
• We then engaged with Percona to review that the MySQL server configuration is optimal.
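One way to picture the scenario-ratio “knobs” is as a weighted random choice among scenarios. The following is a minimal Ruby sketch under that assumption; the scenario names, the weights, and the helpers `pick_scenario` and `sample_mix` are hypothetical and are not taken from the Tacit Knowledge benchmark code.

```ruby
# Hypothetical scenario mix: names and weights are illustrative only.
SCENARIO_WEIGHTS = {
  create_user:    1,
  create_auction: 2,
  view_auction:   10,
  place_bid:      5
}.freeze

# Pick one scenario name at random, with probability proportional to its weight.
def pick_scenario(weights, rng = Random.new)
  total = weights.values.sum
  roll = rng.rand(total)            # integer in 0...total
  weights.each do |name, weight|
    return name if roll < weight
    roll -= weight
  end
end

# Tally a sample of picks so the resulting mix can be compared to the weights.
def sample_mix(weights, draws, rng = Random.new)
  counts = Hash.new(0)
  draws.times { counts[pick_scenario(weights, rng)] += 1 }
  counts
end

mix = sample_mix(SCENARIO_WEIGHTS, 18_000, Random.new(42))
SCENARIO_WEIGHTS.each_key { |name| puts "#{name}: #{mix[name]}" }
```

Changing a weight changes how often that scenario runs relative to the others, which is the kind of ratio a read-heavy versus write-heavy test would vary.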
8. Sample Scenarios used in the Benchmark
Tacit took the data model we created for an auction application and created
13 scenarios for how a user might interact with the site:
• Create User, optionally with metadata located in other tables.
• Create Auction, optionally with url to a picture.
• View Auction with its most recent comments.
• Add Tags and/or Comments to an Auction.
• Bid on an Auction, Determine current highest bidder.
• Generate “tag cloud” for items recently commented upon.
• etc.
9. Sample Schema for a Social-Media Auction website
This is the
relational database
model used to back
the site used for
this benchmark. All
relational, all ACID.
10. Sample Scenario: Ruby code snippet for creating a user
Ruby: lib/benchmark/scenarios.rb

    # Create User with 2 Phones
    def scenario_3
      user = User.new(
        login: "#{Faker::Internet.user_name}_#{Random.rand(65536)}_#{Faker::Internet.user_name.reverse}",
        email: Faker::Internet.email,
        first: Faker::Name.first_name,
        last: Faker::Name.last_name,
        status: Random.rand(0..5))
      user.user_phones << UserPhone.new(phone_type: Random.rand(0..1),
                                        number: Faker::PhoneNumber.phone_number)
      user.user_phones << UserPhone.new(phone_type: Random.rand(0..1),
                                        number: Faker::PhoneNumber.phone_number)
      user.save
    end

Sample SQL generated:

    BEGIN;
    INSERT INTO users (created_at, email, first, last, login, status, updated_at)
      VALUES ('2012-04-17 22:08:01', 'bill@walker.biz', 'Dillan', 'Hamill',
              'harrison_34356_ztem_sivart', 0, '2012-04-17 22:08:01');
    INSERT INTO user_phones (number, phone_type, user_id)
      VALUES ('530-209-0599', 1, 4426);
    INSERT INTO user_phones (number, phone_type, user_id)
      VALUES ('1-978-714-2317', 1, 4426);
    COMMIT;
11. Test Hardware Setup: Clustrix vs MySQL
Clustrix
CLX 4110 nodes that we formed into 3-node and 6-node clusters, each with
• 8 Cores
• 48GB RAM
• 896GB SSD
• 600GB HDD
• Clustrix VIP (software load balancer)
MySQL Instance
• 8 cores (Quantity 2 of Intel Xeon 5450 (3GHz, 12MB Cache))
• 128GB RAM
• 1.6TB of spinning HDD space (12 x 300GB Seagate 15k RPM SAS)
• Hardware RAID 10
• Running Scientific Linux 6.2
• MySQL version 5.1.61
• Settings tuned by Percona
12. Test Sequence
Ruby on Rails Benchmark
• Starts desired number of threads, connects
each to target DB system
• Each thread begins running the prescribed
workload during a warm-up period
• After warm-up, statistics collection is turned
on and test runs for a set time (10 minutes)
• Results of test are saved locally as a JSON
file
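The sequence above can be sketched in plain Ruby. This is a simplified illustration, not the benchmark's actual harness: `run_benchmark`, its parameters, and the output file name are assumptions, the workload is a stand-in `sleep` rather than a database call, and the durations are shortened from the 10-minute runs described on the slide.

```ruby
require "json"
require "tmpdir"

# Run `workload` on `threads` threads: warm up without counting, then count
# completed calls for `measure_s` seconds. All names here are illustrative.
def run_benchmark(threads:, warmup_s:, measure_s:, &workload)
  collecting = false
  stop = false
  counts = Array.new(threads, 0)   # one slot per thread, so threads never share a counter

  workers = Array.new(threads) do |i|
    Thread.new do
      until stop
        workload.call
        counts[i] += 1 if collecting
      end
    end
  end

  sleep warmup_s                   # warm-up: workload runs, statistics are off
  collecting = true
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  sleep measure_s                  # measured window
  collecting = false
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  stop = true
  workers.each(&:join)

  total = counts.sum
  { threads: threads, seconds: elapsed.round(3),
    transactions: total, tps: (total / elapsed).round(1) }
end

# Stand-in workload: a real run would execute a Rails scenario against the database.
result = run_benchmark(threads: 4, warmup_s: 0.1, measure_s: 0.3) { sleep 0.001 }
path = File.join(Dir.tmpdir, "rails_benchmark_results.json")   # hypothetical file name
File.write(path, JSON.pretty_generate(result))
puts JSON.generate(result)
```

Keeping statistics off during warm-up means connection setup and cold caches do not drag down the reported TPS.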
13. Benchmark Results: MySQL only
Summary:
• This graph shows throughput (TPS) over a range of concurrencies
• MySQL maxes out around a TPS of 5-6k at concurrency of 256
• This drops down to 3-4k TPS as concurrency approaches 1024, a 1/3 decrease in performance
Put in perspective:
• 256 threads might represent a small and growing organization
• 1024 threads might represent when that site starts getting more popular
Graph callouts: 5,000 – 6,000 TPS @ 256 Threads; 3,000 – 4,000 TPS @ 1024 Threads
14. Benchmark Results: 3-Node Clustrix v. MySQL
• Same graph as before, now adding Clustrix database
• Same axis, with greater scale because Clustrix outperforms MySQL
• At 256 threads (MySQL’s peak), Clustrix performs 2.5x faster
• At 1024 threads, a 3-node Clustrix database achieves peak performance of 30,000 TPS
Graph callouts: 2.5x TPS Performance @ 256 Threads; 8x TPS Performance @ 1024 Threads
Series: Clustrix 3-Node, MySQL
But wait! There is more
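As a rough check on the “1/3 decrease” and “8x” figures, the arithmetic can be worked through in a few lines of Ruby. The inputs are the numbers quoted on the slides; reducing each MySQL range to its midpoint is an assumption made here for illustration, and the helper names are hypothetical.

```ruby
# Figures quoted on the slides; MySQL ranges are reduced to midpoints (an assumption).
MYSQL_TPS_256      = (5_000 + 6_000) / 2.0   # midpoint of 5,000-6,000 TPS
MYSQL_TPS_1024     = (3_000 + 4_000) / 2.0   # midpoint of 3,000-4,000 TPS
CLUSTRIX3_TPS_1024 = 30_000.0                # 3-node peak quoted at 1024 threads

# Fraction of throughput lost going from `before` to `after`.
def fractional_drop(before, after)
  (before - after) / before
end

# How many times faster `candidate` is than `baseline`.
def speedup(candidate, baseline)
  candidate / baseline
end

drop  = fractional_drop(MYSQL_TPS_256, MYSQL_TPS_1024)
ratio = speedup(CLUSTRIX3_TPS_1024, MYSQL_TPS_1024)
puts format("MySQL drop from 256 to 1024 threads: %.0f%%", drop * 100)     # 36%
puts format("3-node Clustrix vs MySQL at 1024 threads: %.1fx", ratio)     # 8.6x
```

Using the ends of the MySQL range instead of the midpoint gives 7.5x to 10x at 1024 threads, which brackets the 8x callout on the graph.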
15. Benchmark Results: Clustrix 6-Node, 3-Node, MySQL
• Again, same graph as before, now adding a 6-node Clustrix database
• Clustrix has 15x performance of MySQL at 1024 threads
• Throughput peaks near 55,000 TPS and eases to about 45,000 TPS as concurrency reaches 10,000
How’s that for scale?
Graph callouts: 15x TPS Performance @ 1024 Threads
Series: Clustrix 6-Node, Clustrix 3-Node, MySQL
17. Conclusion
• Ruby on Rails: Great rapid development framework for
the web
• MySQL & Rails: frequent development strategy
• Scaling limits for Rails = Scaling limits for MySQL
• The Clustrix database breaks through traditional Rails
limitations by providing:
Linear scalability
Drop-in replacement for MySQL
Superior performance
High availability and inherent fault tolerance
Here’s code for one of the scenarios. Pretty simple. Uses a Ruby gem to create fake user information and 2 telephone numbers. This function takes arguments from the controller and stores them durably (we hope). Describe MVC single-page “form” -> might be writing to multiple normalized objects in the backend.
You can see the Clustrix 4110 nodes off to the right in orange. These were formed into 3-node and 6-node cluster configurations. Each of these nodes had 8 cores and 48GB RAM. Made use of the Clustrix VIP (software load balancer built into the product) to evenly distribute queries to all of the nodes.
Now both parts of the automation are in place and running. The master client has started the slaves and the slaves have launched their Ruby benchmark instances. Each of these Ruby instances connects several threads to the database being tested. A warm-up period is provided to allow time for all these threads to connect to the database and to start running the actual workload, but with statistics collection turned off. When this warm-up ends, stats collection is turned back on and the benchmark runs for 10 minutes. At the end the results are written to a JSON file.
So let’s take a look at the results of the benchmark run against MySQL. This graph shows performance in terms of TPS over a range of concurrencies. X-axis: threads. Unlike the graph from earlier, this and all following graphs use a linear scale. Y-axis: Transactions per Second, TPS. For this and for each following group of curves, 2 representative runs are shown. Here we see two runs of the benchmark for MySQL. Let’s look at the data. MySQL maxes out around 5000 or 6000 TPS at concurrency of 256. Drops down to about 3,000 or 4,000 TPS as concurrency hits 1024. About 1/3rd decrease in performance. Beyond that, performance decays steadily as concurrency increases. Note also that it makes it out to ~9,000 connections but not all the way to 10,000. Could not complete the test at that concurrency because the Ruby instances hung on the clients. Put in perspective: 256 threads might represent the point where a small and growing organization, such as a company running our demo auction website, has established itself. 1024 threads would be where that company has started to become more popular. So right as users are starting to pay attention, it’s becoming tougher to meet their demands. Now I’m going to add 2 more lines to this same graph…
Same graph as before: exact same x-axis and same y-axis, except with greater scale. MySQL lines are now compressed down because the scale has increased. And we’ve added 2 new lines, each representing separate runs of a 3-node Clustrix cluster. At 256 threads, MySQL’s peak performance, Clustrix has about 2.5x performance advantage. But at 1024 threads, the 3-node cluster is hitting its peak performance of around 30,000 TPS. And now, I’m going to add the last 2 lines to this graph…
Again, same graph as before, with the addition of a pair of lines, each representing a run of a 6-node Clustrix cluster. Clustrix has 15x performance of MySQL @ 1024 threads. Reaches peak of ~55,000 TPS. Drops very gently down to ~45,000 TPS as concurrency reaches all the way out to 10,000.
So let’s look at those same test runs again, except now let’s concentrate on how long it takes each scenario to execute. Remember, scenarios are things like adding a user, placing a bid on an auction, uploading a picture, etc. This graph has the same x-axis, concurrency. But now the y-axis shows how long the average transaction took to complete, in whole seconds. At 256 threads, everybody is completing requests in about 0.1 seconds. Great; everything is working fine. But along the way to 1024 threads, MySQL starts taking up to 3 seconds to answer requests. This is going to confuse and upset users. Meanwhile Clustrix is completing just as fast as before (~0.1 seconds). Even out @ 9250 threads, the 6-node Clustrix cluster is still operating relatively quickly. But MySQL is reacting so slowly that various timeouts will likely occur, possibly bringing the revenue-producing activities of the site down.
In conclusion: Ruby on Rails is a popular framework for deploying web sites. MySQL is frequently used as the backend for Rails and this works well when organizations are first putting their ideas together or dealing with their initial customer base. But as one’s organization becomes more successful, MySQL eventually hits various limits and this keeps Rails from scaling properly. Clustrix breaks through traditional Rails limitations, allowing Rails to scale and providing full MySQL compliance, high availability, and inherent fault tolerance. If you think back to the Twitter quote that Neil showed about Rails limitations, know that you don’t have to give up what you love about Rails to make it scale; you just need a truly scalable database solution.
Thanks very much for your attention. I’ve got a little QR code on here that links to the white paper up on our web site. We have Robert from Blue Box with us here today, and we invite you all to our booth #16 after the break to learn more about Clustrix, Blue Box, and our Ruby on Rails benchmark. I hope to see you there. Enjoy the rest of Percona Live!!