SlideShare une entreprise Scribd logo
1  sur  49
Télécharger pour lire hors ligne
Scaling social games
   “the order of magnitude
          challenge”
  Paolo Negri @hungryblank
Order of magnitude
                                      DAU

                     1000000

                      750000
DAU:
                      500000
daily active users
                      250000

                           0
                               July         December
Social Games
Flash client (game)    HTTP API




                      http://www.flickr.com/photos/stars6/4381851322
Social Games
Flash client


               • Game actions need to be
                 persisted and validated

               • 1 API call every few secs
Social Games
                          HTTP API



• 5000 HTTP reqs/sec
• more than 90% writes
• 60K queries/sec


                         http://www.flickr.com/photos/stars6/4381851322
July 2010
                                HAproxy
• ~ 170 000 daily users
• Plain Ruby on Rails app
• Persistency 100% SQL        Ruby on Rails




                                MySQL
July 2010
• 1 haproxy server              HAproxy


• multiple RoR servers
• 4 mysql servers             Ruby on Rails
  (sharded dataset)

                                MySQL
July 2010
                      HAproxy




                    Ruby on Rails




Slow down             MySQL
July 2010
                            HAproxy



High queries/request      Ruby on Rails
       ratio


    Slow down               MySQL
Queries/request


• Which code is triggering extra queries?
• Why in our test environment the ratio is
  lower than live?
Queries/request
       Running code of live system


Application   Plugins   Ruby on Rails
Queries/request
 Source of extra queries

              •   sharding plugin “breaks” std
                  Rails query cache
    Plugins
              •   Flash wire protocol plugin
                  generates extra queries
Plugins

• Deceiving “feature for free”
• Might provide the right feature
• But might not meet scaling need
Plugins

• Instant code legacy, for new projects also!
• Once added it’s your code
• Even if it’s maintained, will it follow your
  needs?
Plugins


• Assess code quality when you add it
• Can you afford to maintain/change it?
Plugins


• We fixed it!
• Query cut up to 40% on some requests
Early August
 30

22.5

 15

 7.5

  0
   6:00 6:10 6:20 6:30 6:40 6:50 7:00 7:10 7:20 7:30 7:40 7:50 8:00 8:10
                                                    query time in ms


• The MySQL hiccup
• every 70 mins query time spikes x7
Hiccup causes
    Who is periodically blocking MySQL

• Code (app + plugins + Rails)?
• Some periodic job?
• The devil (AWS)?
Hiccup quick fix
• We shard out the top queried table
  (40% of all queries)

                 MySQL servers

      shard 1   shard 2   shard 3   shard 4
Hiccup quick fix
• We shard out the top queried table
  (40% of all queries)

     Top table      Top table      Top table      Top table
      shard 1        shard 2        shard 3        shard 4



    Other tables   Other tables   Other tables   Other tables
      shard 1        shard 2        shard 3        shard 4
Hiccup quick fix
• Mysql likes it
• “top table” shards will go a long way in the
  scaling process

     Top table      Top table      Top table      Top table
      shard 1        shard 2        shard 3        shard 4



    Other tables   Other tables   Other tables   Other tables
      shard 1        shard 2        shard 3        shard 4
Hiccup causes
    Who is periodically blocking MySQL

• Code (app + plugins + Rails)?
• Some periodic job?
• The devil (AWS)?
             None of the Above
Hiccup real cause

• Emerging MySQL internal at high volume
• MySQL flushes its buffer
• Under heavy write IO it’s blocking
Hiccup solution

• Percona MySQL patches (XtraDB) avoid
  blocking behavior
• Query time profile gets smooth
• IO capacity limit manifested with gradual
  performance decay
Write through cache

• Memcache in front of MySQL
• Evaluated before sharding
• Was discarded
• Because of our read/write reatio
Write through cache


 90% of the times we read data
     in order to modify it
Write through cache

  It means 90% of the times

      1. read cache
      2. write cache
      3. write SQL
Write through cache
                   Bound to

   Read heavy                  Write heavy

                         • Mysql write
                              (unless async)
• memcache perfs
                         • Write through lib
                              optimized for
                              writes?
MySQL

• Sharding SQL is a painful way to scale
• Data migrations at high load imply
  downtime
• ACID benefits all lost because of sharding
  or in name of performance
Redis
• A persistent cache
• Fast 60000 qps on AWS hardware
• Interesting data structures, not only KV
• Already some small scale experince in
  house
Redis adoption

• Which data to start from?
• How do we migrate without downtime?
• Which Ruby object - Redis structure lib?
Redis adoption
• Which data to start from?
• Best data fit for Redis hashes
• Top 3rd queried table
• a collection of integer fields that need only
  increment / decrement
Redis adoption
• How do we migrate without downtime?
• Migrate one user at a time
• Use a Redis set to keep note of migrated/
  non migrated
• No downtime, transparent to users
Redis adoption
• How do we migrate without downtime?
                            MySQL
User 123
              RoR
             Server



                             Redis
Redis adoption
• How do we migrate without downtime?
              read original data
                                   MySQL
User 123
              RoR
             Server



                                   Redis
Redis adoption
• How do we migrate without downtime?
                                  MySQL
User 123
               RoR
              Server



                                  Redis
            write migrated data
Redis adoption
• How do we migrate without downtime?

• Migration might never complete
• SQL + Redis set information to generate
  final batch migration
Redis 1st result

10% query load from 4 MySQL server
is moved to 1 Redis server
Redis server load is 0.05
Redis


• Becomes the tool to use
• Migration plan for all write intensive data
• Migrate one “class” at a time
Redis honeymoon end

• Memory usage grows more than data
• Snapshot to disk causes spikes in query
  time
• Starting new slaves eats memory on the
  master node
Redis honeymoon end
           Russian Roulette Feeling



• Redis machine sized with overabundant
  RAM
• Rigorous slave/master starting plan
Redis


• Redis team acknowledges persistency/
  replication problems
• Redis 2.4 diskstore plan starts
1.000.000


And counting...
1.000.000
painless scaling          HAproxy




                        Ruby on Rails




                         Persistency
1.000.000
                            HAproxy



just add servers          Ruby on Rails
 as load grows


                           Peristency
1.000.000
                         HAproxy




                       Ruby on Rails



 Painful and            Peristency
troublesome
Infrastructure

• AWS
• Chef - through Scalarium
• Ganglia
Thanks
  ...
wooga
        Is looking for
Business Intelligence Engineer

   http://wooga.com/jobs

Contenu connexe

Tendances

Training Slides: 101 - Basics: Tungsten Clustering - Under The Hood
Training Slides: 101 - Basics: Tungsten Clustering - Under The HoodTraining Slides: 101 - Basics: Tungsten Clustering - Under The Hood
Training Slides: 101 - Basics: Tungsten Clustering - Under The HoodContinuent
 
How you can contribute to Apache Cassandra
How you can contribute to Apache CassandraHow you can contribute to Apache Cassandra
How you can contribute to Apache CassandraYuki Morishita
 
A multi-tenant architecture for Apache Axis2
A multi-tenant architecture for Apache Axis2A multi-tenant architecture for Apache Axis2
A multi-tenant architecture for Apache Axis2Afkham Azeez
 
Maria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High AvailabilityMaria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High AvailabilityOSSCube
 
Cassandra Summit 2014: Deploying Cassandra for Call of Duty
Cassandra Summit 2014: Deploying Cassandra for Call of DutyCassandra Summit 2014: Deploying Cassandra for Call of Duty
Cassandra Summit 2014: Deploying Cassandra for Call of DutyDataStax Academy
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayDataStax Academy
 
Achieving Infrastructure Portability with Chef
Achieving Infrastructure Portability with ChefAchieving Infrastructure Portability with Chef
Achieving Infrastructure Portability with ChefMatt Ray
 
under the covers -- chef in 20 minutes or less
under the covers -- chef in 20 minutes or lessunder the covers -- chef in 20 minutes or less
under the covers -- chef in 20 minutes or lesssarahnovotny
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaShiao-An Yuan
 
Redis everywhere - PHP London
Redis everywhere - PHP LondonRedis everywhere - PHP London
Redis everywhere - PHP LondonRicard Clau
 
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Erik Onnen
 
Master master vs master-slave database
Master master vs master-slave databaseMaster master vs master-slave database
Master master vs master-slave databaseWipro
 
События, шины и интеграция данных в непростом мире микросервисов / Валентин Г...
События, шины и интеграция данных в непростом мире микросервисов / Валентин Г...События, шины и интеграция данных в непростом мире микросервисов / Валентин Г...
События, шины и интеграция данных в непростом мире микросервисов / Валентин Г...Ontico
 
Speed up your Symfony2 application and build awesome features with Redis
Speed up your Symfony2 application and build awesome features with RedisSpeed up your Symfony2 application and build awesome features with Redis
Speed up your Symfony2 application and build awesome features with RedisRicard Clau
 
Developing with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaDeveloping with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaJoe Stein
 
Hashicorp: Delivering the Tao of DevOps
Hashicorp: Delivering the Tao of DevOpsHashicorp: Delivering the Tao of DevOps
Hashicorp: Delivering the Tao of DevOpsRamit Surana
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache KafkaJoe Stein
 

Tendances (17)

Training Slides: 101 - Basics: Tungsten Clustering - Under The Hood
Training Slides: 101 - Basics: Tungsten Clustering - Under The HoodTraining Slides: 101 - Basics: Tungsten Clustering - Under The Hood
Training Slides: 101 - Basics: Tungsten Clustering - Under The Hood
 
How you can contribute to Apache Cassandra
How you can contribute to Apache CassandraHow you can contribute to Apache Cassandra
How you can contribute to Apache Cassandra
 
A multi-tenant architecture for Apache Axis2
A multi-tenant architecture for Apache Axis2A multi-tenant architecture for Apache Axis2
A multi-tenant architecture for Apache Axis2
 
Maria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High AvailabilityMaria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High Availability
 
Cassandra Summit 2014: Deploying Cassandra for Call of Duty
Cassandra Summit 2014: Deploying Cassandra for Call of DutyCassandra Summit 2014: Deploying Cassandra for Call of Duty
Cassandra Summit 2014: Deploying Cassandra for Call of Duty
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
 
Achieving Infrastructure Portability with Chef
Achieving Infrastructure Portability with ChefAchieving Infrastructure Portability with Chef
Achieving Infrastructure Portability with Chef
 
under the covers -- chef in 20 minutes or less
under the covers -- chef in 20 minutes or lessunder the covers -- chef in 20 minutes or less
under the covers -- chef in 20 minutes or less
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Redis everywhere - PHP London
Redis everywhere - PHP LondonRedis everywhere - PHP London
Redis everywhere - PHP London
 
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
 
Master master vs master-slave database
Master master vs master-slave databaseMaster master vs master-slave database
Master master vs master-slave database
 
События, шины и интеграция данных в непростом мире микросервисов / Валентин Г...
События, шины и интеграция данных в непростом мире микросервисов / Валентин Г...События, шины и интеграция данных в непростом мире микросервисов / Валентин Г...
События, шины и интеграция данных в непростом мире микросервисов / Валентин Г...
 
Speed up your Symfony2 application and build awesome features with Redis
Speed up your Symfony2 application and build awesome features with RedisSpeed up your Symfony2 application and build awesome features with Redis
Speed up your Symfony2 application and build awesome features with Redis
 
Developing with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaDeveloping with the Go client for Apache Kafka
Developing with the Go client for Apache Kafka
 
Hashicorp: Delivering the Tao of DevOps
Hashicorp: Delivering the Tao of DevOpsHashicorp: Delivering the Tao of DevOps
Hashicorp: Delivering the Tao of DevOps
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
 

En vedette

A Documentation Crash Course, LinuxCon 2016
A Documentation Crash Course, LinuxCon 2016A Documentation Crash Course, LinuxCon 2016
A Documentation Crash Course, LinuxCon 2016Chris Ward
 
Content Management Systems and Refactoring - Drupal, WordPress and eZ Publish
Content Management Systems and Refactoring - Drupal, WordPress and eZ PublishContent Management Systems and Refactoring - Drupal, WordPress and eZ Publish
Content Management Systems and Refactoring - Drupal, WordPress and eZ PublishJani Tarvainen
 
Electron - Solving our cross platform dreams?
Electron - Solving our cross platform dreams?Electron - Solving our cross platform dreams?
Electron - Solving our cross platform dreams?Chris Ward
 
Entrez dans le mouvement Maker à l’aide des technologies Microsoft
Entrez dans le mouvement Maker à l’aide des technologies MicrosoftEntrez dans le mouvement Maker à l’aide des technologies Microsoft
Entrez dans le mouvement Maker à l’aide des technologies MicrosoftFabrice BARBIN
 
SimpleDb, an introduction
SimpleDb, an introductionSimpleDb, an introduction
SimpleDb, an introductionPaolo Negri
 
Why you should come to DrupalSouth
Why you should come to DrupalSouthWhy you should come to DrupalSouth
Why you should come to DrupalSouthChris Ward
 
Offre développeur Javascript Back-end
Offre développeur Javascript Back-endOffre développeur Javascript Back-end
Offre développeur Javascript Back-endSite Analyzer
 
Automate your docs, automate yourself
Automate your docs, automate yourselfAutomate your docs, automate yourself
Automate your docs, automate yourselfChris Ward
 
Erlang introduction geek2geek Berlin
Erlang introduction geek2geek BerlinErlang introduction geek2geek Berlin
Erlang introduction geek2geek BerlinPaolo Negri
 
Contentful Berlin Offices
Contentful Berlin OfficesContentful Berlin Offices
Contentful Berlin OfficesIrina Botea
 
The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...
The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...
The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...Joe Gollner
 
AWS Lambda in infrastructure
AWS Lambda in infrastructureAWS Lambda in infrastructure
AWS Lambda in infrastructurePaolo Negri
 
Le futur de Drupal et des applications web
Le futur de Drupal et des applications webLe futur de Drupal et des applications web
Le futur de Drupal et des applications webJulien Dubreuil
 
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...Paolo Negri
 
Devoxx France 2015 - Se préparer à l'arrivée d'Angular 2
Devoxx France 2015 - Se préparer à l'arrivée d'Angular 2Devoxx France 2015 - Se préparer à l'arrivée d'Angular 2
Devoxx France 2015 - Se préparer à l'arrivée d'Angular 2Romain Linsolas
 
Back to the future with static site generators
Back to the future with static site generatorsBack to the future with static site generators
Back to the future with static site generatorsChris Ward
 
ParisJS #10 : PhantomJs
ParisJS #10 : PhantomJs ParisJS #10 : PhantomJs
ParisJS #10 : PhantomJs Maurice Svay
 
Erlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughputErlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughputPaolo Negri
 
Google : Prise en charge de l'Ajax et de l'Angular JS
Google : Prise en charge de l'Ajax et de l'Angular JSGoogle : Prise en charge de l'Ajax et de l'Angular JS
Google : Prise en charge de l'Ajax et de l'Angular JSPeak Ace
 
API Days Australia - Automatic Testing of (RESTful) API Documentation
API Days Australia  - Automatic Testing of (RESTful) API DocumentationAPI Days Australia  - Automatic Testing of (RESTful) API Documentation
API Days Australia - Automatic Testing of (RESTful) API DocumentationRouven Weßling
 

En vedette (20)

A Documentation Crash Course, LinuxCon 2016
A Documentation Crash Course, LinuxCon 2016A Documentation Crash Course, LinuxCon 2016
A Documentation Crash Course, LinuxCon 2016
 
Content Management Systems and Refactoring - Drupal, WordPress and eZ Publish
Content Management Systems and Refactoring - Drupal, WordPress and eZ PublishContent Management Systems and Refactoring - Drupal, WordPress and eZ Publish
Content Management Systems and Refactoring - Drupal, WordPress and eZ Publish
 
Electron - Solving our cross platform dreams?
Electron - Solving our cross platform dreams?Electron - Solving our cross platform dreams?
Electron - Solving our cross platform dreams?
 
Entrez dans le mouvement Maker à l’aide des technologies Microsoft
Entrez dans le mouvement Maker à l’aide des technologies MicrosoftEntrez dans le mouvement Maker à l’aide des technologies Microsoft
Entrez dans le mouvement Maker à l’aide des technologies Microsoft
 
SimpleDb, an introduction
SimpleDb, an introductionSimpleDb, an introduction
SimpleDb, an introduction
 
Why you should come to DrupalSouth
Why you should come to DrupalSouthWhy you should come to DrupalSouth
Why you should come to DrupalSouth
 
Offre développeur Javascript Back-end
Offre développeur Javascript Back-endOffre développeur Javascript Back-end
Offre développeur Javascript Back-end
 
Automate your docs, automate yourself
Automate your docs, automate yourselfAutomate your docs, automate yourself
Automate your docs, automate yourself
 
Erlang introduction geek2geek Berlin
Erlang introduction geek2geek BerlinErlang introduction geek2geek Berlin
Erlang introduction geek2geek Berlin
 
Contentful Berlin Offices
Contentful Berlin OfficesContentful Berlin Offices
Contentful Berlin Offices
 
The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...
The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...
The Anatomy of Content Management (workshop by J Gollner at Intelligent Conte...
 
AWS Lambda in infrastructure
AWS Lambda in infrastructureAWS Lambda in infrastructure
AWS Lambda in infrastructure
 
Le futur de Drupal et des applications web
Le futur de Drupal et des applications webLe futur de Drupal et des applications web
Le futur de Drupal et des applications web
 
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...
 
Devoxx France 2015 - Se préparer à l'arrivée d'Angular 2
Devoxx France 2015 - Se préparer à l'arrivée d'Angular 2Devoxx France 2015 - Se préparer à l'arrivée d'Angular 2
Devoxx France 2015 - Se préparer à l'arrivée d'Angular 2
 
Back to the future with static site generators
Back to the future with static site generatorsBack to the future with static site generators
Back to the future with static site generators
 
ParisJS #10 : PhantomJs
ParisJS #10 : PhantomJs ParisJS #10 : PhantomJs
ParisJS #10 : PhantomJs
 
Erlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughputErlang as a cloud citizen, a fractal approach to throughput
Erlang as a cloud citizen, a fractal approach to throughput
 
Google : Prise en charge de l'Ajax et de l'Angular JS
Google : Prise en charge de l'Ajax et de l'Angular JSGoogle : Prise en charge de l'Ajax et de l'Angular JS
Google : Prise en charge de l'Ajax et de l'Angular JS
 
API Days Australia - Automatic Testing of (RESTful) API Documentation
API Days Australia  - Automatic Testing of (RESTful) API DocumentationAPI Days Australia  - Automatic Testing of (RESTful) API Documentation
API Days Australia - Automatic Testing of (RESTful) API Documentation
 

Similaire à Scaling Social Games: Overcoming Order of Magnitude Challenges with Redis, MySQL Sharding, and More

Getting started with Riak in the Cloud
Getting started with Riak in the CloudGetting started with Riak in the Cloud
Getting started with Riak in the CloudInes Sombra
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterJohn Adams
 
Scaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLScaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLRichard Schneeman
 
Scaling with swagger
Scaling with swaggerScaling with swagger
Scaling with swaggerTony Tam
 
Scaling MySQL Using Fabric
Scaling MySQL Using FabricScaling MySQL Using Fabric
Scaling MySQL Using FabricRemote MySQL DBA
 
Scaling MySQL using Fabric
Scaling MySQL using FabricScaling MySQL using Fabric
Scaling MySQL using FabricKarthik .P.R
 
Stack Exchange Infrastructure - LISA 14
Stack Exchange Infrastructure - LISA 14Stack Exchange Infrastructure - LISA 14
Stack Exchange Infrastructure - LISA 14GABeech
 
High Performance WordPress II
High Performance WordPress IIHigh Performance WordPress II
High Performance WordPress IIBarry Abrahamson
 
A closer look to locaweb IaaS
A closer look to locaweb IaaSA closer look to locaweb IaaS
A closer look to locaweb IaaSGleicon Moraes
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyCeph Community
 
Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3Wen-Tien Chang
 
Mindtalk Tech - Behind the scenes
Mindtalk Tech - Behind the scenesMindtalk Tech - Behind the scenes
Mindtalk Tech - Behind the scenesrobin_sy
 
RedisDay London 2018 - Stack Overflow's Next Steps in Redis
RedisDay London 2018 - Stack Overflow's Next Steps in RedisRedisDay London 2018 - Stack Overflow's Next Steps in Redis
RedisDay London 2018 - Stack Overflow's Next Steps in RedisRedis Labs
 
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #4: MS Azure Database MySQL
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #4: MS Azure Database MySQLWebinar Slides: MySQL HA/DR/Geo-Scale - High Noon #4: MS Azure Database MySQL
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #4: MS Azure Database MySQLContinuent
 
Big Data Developers Moscow Meetup 1 - sql on hadoop
Big Data Developers Moscow Meetup 1  - sql on hadoopBig Data Developers Moscow Meetup 1  - sql on hadoop
Big Data Developers Moscow Meetup 1 - sql on hadoopbddmoscow
 

Similaire à Scaling Social Games: Overcoming Order of Magnitude Challenges with Redis, MySQL Sharding, and More (20)

Getting started with Riak in the Cloud
Getting started with Riak in the CloudGetting started with Riak in the Cloud
Getting started with Riak in the Cloud
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling Twitter
 
Scaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLScaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQL
 
Scaling with swagger
Scaling with swaggerScaling with swagger
Scaling with swagger
 
Scaling MySQL Using Fabric
Scaling MySQL Using FabricScaling MySQL Using Fabric
Scaling MySQL Using Fabric
 
Scaling MySQL using Fabric
Scaling MySQL using FabricScaling MySQL using Fabric
Scaling MySQL using Fabric
 
Stack Exchange Infrastructure - LISA 14
Stack Exchange Infrastructure - LISA 14Stack Exchange Infrastructure - LISA 14
Stack Exchange Infrastructure - LISA 14
 
High Performance WordPress II
High Performance WordPress IIHigh Performance WordPress II
High Performance WordPress II
 
redis
redisredis
redis
 
A closer look to locaweb IaaS
A closer look to locaweb IaaSA closer look to locaweb IaaS
A closer look to locaweb IaaS
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case Study
 
Why ruby and rails
Why ruby and railsWhy ruby and rails
Why ruby and rails
 
KeyValue Stores
KeyValue StoresKeyValue Stores
KeyValue Stores
 
Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3
 
High availability
High availabilityHigh availability
High availability
 
MySQL in the Cloud
MySQL in the CloudMySQL in the Cloud
MySQL in the Cloud
 
Mindtalk Tech - Behind the scenes
Mindtalk Tech - Behind the scenesMindtalk Tech - Behind the scenes
Mindtalk Tech - Behind the scenes
 
RedisDay London 2018 - Stack Overflow's Next Steps in Redis
RedisDay London 2018 - Stack Overflow's Next Steps in RedisRedisDay London 2018 - Stack Overflow's Next Steps in Redis
RedisDay London 2018 - Stack Overflow's Next Steps in Redis
 
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #4: MS Azure Database MySQL
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #4: MS Azure Database MySQLWebinar Slides: MySQL HA/DR/Geo-Scale - High Noon #4: MS Azure Database MySQL
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #4: MS Azure Database MySQL
 
Big Data Developers Moscow Meetup 1 - sql on hadoop
Big Data Developers Moscow Meetup 1  - sql on hadoopBig Data Developers Moscow Meetup 1  - sql on hadoop
Big Data Developers Moscow Meetup 1 - sql on hadoop
 

Plus de Paolo Negri

Turning the web stack upside down rethinking how data flows through systems
Turning the web stack upside down  rethinking how data flows through systemsTurning the web stack upside down  rethinking how data flows through systems
Turning the web stack upside down rethinking how data flows through systemsPaolo Negri
 
Getting real with erlang
Getting real with erlangGetting real with erlang
Getting real with erlangPaolo Negri
 
Erlang factory 2011 london
Erlang factory 2011 londonErlang factory 2011 london
Erlang factory 2011 londonPaolo Negri
 
Erlang factory SF 2011 "Erlang and the big switch in social games"
Erlang factory SF 2011 "Erlang and the big switch in social games"Erlang factory SF 2011 "Erlang and the big switch in social games"
Erlang factory SF 2011 "Erlang and the big switch in social games"Paolo Negri
 
RabbitMQ with python and ruby RuPy 2009
RabbitMQ with python and ruby RuPy 2009RabbitMQ with python and ruby RuPy 2009
RabbitMQ with python and ruby RuPy 2009Paolo Negri
 
%w(map reduce).first - A Tale About Rabbits, Latency, and Slim Crontabs
%w(map reduce).first - A Tale About Rabbits, Latency, and Slim Crontabs%w(map reduce).first - A Tale About Rabbits, Latency, and Slim Crontabs
%w(map reduce).first - A Tale About Rabbits, Latency, and Slim CrontabsPaolo Negri
 

Plus de Paolo Negri (6)

Turning the web stack upside down rethinking how data flows through systems
Turning the web stack upside down  rethinking how data flows through systemsTurning the web stack upside down  rethinking how data flows through systems
Turning the web stack upside down rethinking how data flows through systems
 
Getting real with erlang
Getting real with erlangGetting real with erlang
Getting real with erlang
 
Erlang factory 2011 london
Erlang factory 2011 londonErlang factory 2011 london
Erlang factory 2011 london
 
Erlang factory SF 2011 "Erlang and the big switch in social games"
Erlang factory SF 2011 "Erlang and the big switch in social games"Erlang factory SF 2011 "Erlang and the big switch in social games"
Erlang factory SF 2011 "Erlang and the big switch in social games"
 
RabbitMQ with python and ruby RuPy 2009
RabbitMQ with python and ruby RuPy 2009RabbitMQ with python and ruby RuPy 2009
RabbitMQ with python and ruby RuPy 2009
 
%w(map reduce).first - A Tale About Rabbits, Latency, and Slim Crontabs
%w(map reduce).first - A Tale About Rabbits, Latency, and Slim Crontabs%w(map reduce).first - A Tale About Rabbits, Latency, and Slim Crontabs
%w(map reduce).first - A Tale About Rabbits, Latency, and Slim Crontabs
 

Dernier

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 

Dernier (20)

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 

Scaling Social Games: Overcoming Order of Magnitude Challenges with Redis, MySQL Sharding, and More

  • 1. Scaling social games “the order of magnitude challenge” Paolo Negri @hungryblank
  • 2. Order of magnitude DAU 1000000 750000 DAU: 500000 daily active users 250000 0 July December
  • 3. Social Games Flash client (game) HTTP API http://www.flickr.com/photos/stars6/4381851322
  • 4. Social Games Flash client • Game actions need to be persisted and validated • 1 API call every few secs
  • 5. Social Games HTTP API • 5000 HTTP reqs/sec • more than 90% writes • 60K queries/sec
 http://www.flickr.com/photos/stars6/4381851322
  • 6. July 2010 HAproxy • ~ 170 000 daily users • Plain Ruby on Rails app • Persistency 100% SQL Ruby on Rails MySQL
  • 7. July 2010 • 1 haproxy server HAproxy • multiple RoR servers • 4 mysql servers Ruby on Rails (sharded dataset) MySQL
  • 8. July 2010 HAproxy Ruby on Rails Slow down MySQL
  • 9. July 2010 HAproxy High queries/request Ruby on Rails ratio Slow down MySQL
  • 10. Queries/request • Which code is triggering extra queries? • Why in our test environment the ratio is lower than live?
  • 11. Queries/request Running code of live system Application Plugins Ruby on Rails
  • 12. Queries/request Source of extra queries • sharding plugin “breaks” std Rails query cache Plugins • Flash wire protocol plugin generates extra queries
  • 13. Plugins • Deceiving “feature for free” • Might provide the right feature • But might not meet scaling need
  • 14. Plugins • Instant code legacy, for new projects also! • Once added it’s your code • Even if it’s maintained, will it follow your needs?
  • 15. Plugins • Assess code quality when you add it • Can you afford to maintain/change it?
  • 16. Plugins • We fixed it! • Query cut up to 40% on some requests
  • 17. Early August 30 22.5 15 7.5 0 6:00 6:10 6:20 6:30 6:40 6:50 7:00 7:10 7:20 7:30 7:40 7:50 8:00 8:10 query time in ms • The MySQL hiccup • every 70 mins query time spikes x7
  • 18. Hiccup causes Who is periodically blocking MySQL • Code (app + plugins + Rails)? • Some periodic job? • The devil (AWS)?
  • 19. Hiccup quick fix • We shard out the top queried table (40% of all queries) MySQL servers shard 1 shard 2 shard 3 shard 4
  • 20. Hiccup quick fix • We shard out the top queried table (40% of all queries) Top table Top table Top table Top table shard 1 shard 2 shard 3 shard 4 Other tables Other tables Other tables Other tables shard 1 shard 2 shard 3 shard 4
  • 21. Hiccup quick fix • Mysql likes it • “top table” shards will go a long way in the scaling process Top table Top table Top table Top table shard 1 shard 2 shard 3 shard 4 Other tables Other tables Other tables Other tables shard 1 shard 2 shard 3 shard 4
  • 22. Hiccup causes Who is periodically blocking MySQL • Code (app + plugins + Rails)? • Some periodic job? • The devil (AWS)? None of the Above
  • 23. Hiccup real cause • Emerging MySQL internal at high volume • MySQL flushes its buffer • Under heavy write IO it’s blocking
  • 24. Hiccup solution • Percona MySQL patches (XtraDB) avoid blocking behavior • Query time profile gets smooth • IO capacity limit manifested with gradual performance decay
  • 25. Write through cache • Memcache in front of MySQL • Evaluated before sharding • Was discarded • Because of our read/write reatio
  • 26. Write through cache 90% of the times we read data in order to modify it
  • 27. Write through cache It means 90% of the times 1. read cache 2. write cache 3. write SQL
  • 28. Write through cache Bound to Read heavy Write heavy • Mysql write (unless async) • memcache perfs • Write through lib optimized for writes?
  • 29. MySQL • Sharding SQL is a painful way to scale • Data migrations at high load imply downtime • ACID benefits all lost because of sharding or in name of performance
  • 30. Redis • A persistent cache • Fast 60000 qps on AWS hardware • Interesting data structures, not only KV • Already some small scale experince in house
  • 31. Redis adoption • Which data to start from? • How do we migrate without downtime? • Which Ruby object - Redis structure lib?
  • 32. Redis adoption • Which data to start from? • Best data fit for Redis hashes • Top 3rd queried table • a collection of integer fields that need only increment / decrement
  • 33. Redis adoption • How do we migrate without downtime? • Migrate one user at a time • Use a Redis set to keep note of migrated/ non migrated • No downtime, transparent to users
  • 34. Redis adoption • How do we migrate without downtime? MySQL User 123 RoR Server Redis
  • 35. Redis adoption • How do we migrate without downtime? read original data MySQL User 123 RoR Server Redis
  • 36. Redis adoption • How do we migrate without downtime? MySQL User 123 RoR Server Redis write migrated data
  • 37. Redis adoption • How do we migrate without downtime? • Migration might never complete • SQL + Redis set information to generate final batch migration
  • 38. Redis 1st result 10% query load from 4 MySQL server is moved to 1 Redis server Redis server load is 0.05
  • 39. Redis • Becomes the tool to use • Migration plan for all write intensive data • Migrate one “class” at a time
  • 40. Redis honeymoon end • Memory usage grows more than data • Snapshot to disk causes spikes in query time • Starting new slaves eats memory on the master node
  • 41. Redis honeymoon end Russian Roulette Feeling • Redis machine sized with overabundant RAM • Rigorous slave/master starting plan
  • 42. Redis • Redis team acknowledges persistency/ replication problems • Redis 2.4 diskstore plan starts
  • 44. 1.000.000 painless scaling HAproxy Ruby on Rails Persistency
  • 45. 1.000.000 HAproxy just add servers Ruby on Rails as load grows Peristency
  • 46. 1.000.000 HAproxy Ruby on Rails Painful and Peristency troublesome
  • 47. Infrastructure • AWS • Chef - through Scalarium • Ganglia
  • 49. wooga Is looking for Business Intelligence Engineer http://wooga.com/jobs