Eugene Kovshilovsky
Technical Director, Operations
Rick Chen
Technical Director
Cie Games, Inc.
Rick Chen
• Rick Chen is a Technical Director at Cie
Games, overseeing the company's
engineering department. In 2012, Cie Games,
the leading developer and publisher of social
and mobile games, was recognized as one of
the Fastest Growing Companies in North
America on Deloitte's Technology Fast 500 list.
Eugene Kovshilovsky
• Eugene Kovshilovsky is a Technical Director at
Cie Games, a leading developer and publisher
of social and mobile games. Its popular game,
Car Town, is the first Facebook game built
around brands and has attracted more than 50
million players worldwide. The company's first
iOS game, Car Town Streets, for iPhone, iPad
and iPod touch, is one of the Top 10 Games
on the App Store in multiple countries.
What is Sharding?
• (Horizontal) partitioning in a database, where data is split by
rows (documents within collections) instead of columns
Why Shard in Games?
• Write throughput exceeds I/O capacity
• Each click updates the user's currency amount.
Why Shard in Games?
• Memory isn't large enough to hold the active dataset
Why Shard in Games?
• You want to run on commodity hardware and scale
horizontally vs. vertically
MongoDB Sharding
• Distributes objects within collections automatically
• You choose how data is partitioned
• Single master to sharded cluster with zero downtime
• Fully consistent
• Works well in the cloud
• Automated failover
• Java support with Spring integration for easy JSON-to-object
mapping
Shard Design
• Share the load amongst all shards equally
• High randomness
• Have the smallest working set possible
• Roll with time
• Recreate shard key from ObjectId
• Be careful how you shard; there is no way back
99.999%
• For most of our queries we have the ObjectId
• The client saves the ObjectId for re-login at a later time
• "I'm this user and I want to do xyz with my car."
• No range-based queries
• We don't run "find this user in some range" queries against our
master db; we can do those on our secondary data stores that are
not sharded, or on additional reporting systems.
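A toy sketch of why this query pattern suits a sharded cluster: a lookup that carries the key can be sent to the one shard that owns it, while a range has to visit every shard whose chunks overlap it. The chunk table, key values, and shard names below are made up for illustration and are not taken from the Cie Games cluster.

```javascript
// Hypothetical chunk table: each chunk owns keys in [min, max) and lives on one shard.
const chunks = [
  { min: 0, max: 100, shard: "shard0" },
  { min: 100, max: 200, shard: "shard1" },
  { min: 200, max: 300, shard: "shard2" },
];

// Point lookup: at most one chunk (and so one shard) can own the key.
function shardForKey(key) {
  const chunk = chunks.find((c) => key >= c.min && key < c.max);
  return chunk ? chunk.shard : null;
}

// Range lookup: every chunk overlapping [lo, hi) has to be consulted.
function shardsForRange(lo, hi) {
  const hit = chunks
    .filter((c) => c.min < hi && c.max > lo)
    .map((c) => c.shard);
  return [...new Set(hit)];
}

console.log(shardForKey(142));        // a single shard
console.log(shardsForRange(50, 250)); // several shards
```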
Shard Behavior
• Attempt to insert data into pre-split chunks
• Increases cluster throughput by reducing overhead
• User lifetime of 3 days
• Most new players are expected to play for an average of 3 days
• Unbalanced chunks will be rebalanced during off-peak hours
• Rarely happens
Pre Split
• Creates chunks and moves them to different shards ahead of
time
• Creates 0-byte chunks spread equally across all shards
• One chunk may grow to 5-6 chunks depending on how many
new users come in within a 3-day window
• Used instead of waiting for the balancer to detect that there
are too many chunks in one shard
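As a rough sketch of what a pre-split plan could look like, the snippet below computes evenly spaced split points and prints the matching `sh.splitAt()` and `sh.moveChunk()` mongo shell commands as text. The key range of 0 to 2^18, the shard names, and the assumption that `shards[0]` is the shard initially holding the collection are all illustrative; the namespace and field name are borrowed from the `sh.shardCollection()` example later in the deck.

```javascript
// Evenly spaced split points over [keyMin, keyMax):
// numChunks chunks need numChunks - 1 split points.
function splitPoints(keyMin, keyMax, numChunks) {
  const step = (keyMax - keyMin) / numChunks;
  const points = [];
  for (let i = 1; i < numChunks; i++) {
    points.push(Math.floor(keyMin + i * step));
  }
  return points;
}

// Build the shell commands that create the empty chunks and spread them
// round-robin over the shards.
// ASSUMPTION: shards[0] already holds the collection, so chunks assigned
// to it need no move.
function preSplitCommands(namespace, field, keyMin, keyMax, shards, chunksPerShard) {
  const points = splitPoints(keyMin, keyMax, shards.length * chunksPerShard);
  const commands = [];
  points.forEach((p, i) => {
    commands.push(`sh.splitAt("${namespace}", { ${field}: ${p} })`);
    const target = shards[(i + 1) % shards.length];
    if (target !== shards[0]) {
      commands.push(`sh.moveChunk("${namespace}", { ${field}: ${p} }, "${target}")`);
    }
  });
  return commands;
}

// Illustrative values only.
const plan = preSplitCommands(
  "mydb.shardcollection", "userId", 0, 2 ** 18, ["shard0", "shard1"], 2
);
plan.forEach((line) => console.log(line));
```

The printed lines would be pasted into a mongo shell connected to mongos, after the collection has been sharded and before any data is inserted.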
3 Day Shard Key
• 2^18 = 262,144 seconds = 3.03 days
• The ObjectId contains the timestamp of when the _id was created
• "Consistent random" value
• Generate an MD5 hash from the string version of the ObjectId
• Take the first 18 bits of the MD5
• Append the 18-bit random value to the timestamp
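One possible reading of these steps, sketched in Node.js. The slide does not say exactly how the 18 bits are combined with the timestamp; this sketch assumes the low 18 bits of the timestamp (one 2^18-second window) are replaced by the MD5-derived value, so keys roll forward every 3.03 days and are spread randomly inside each window. The function names and the sample ObjectId are hypothetical.

```javascript
const crypto = require("crypto");

const WINDOW_BITS = 18;
const WINDOW_SECONDS = 2 ** WINDOW_BITS; // 262,144 s, about 3.03 days

// The first 4 bytes (8 hex characters) of an ObjectId hold its creation
// time in seconds since the Unix epoch.
function objectIdTimestamp(objectIdHex) {
  return parseInt(objectIdHex.slice(0, 8), 16);
}

// "Consistent random" value: the first 18 bits of the MD5 of the ObjectId
// string. The same ObjectId always yields the same value.
function first18BitsOfMd5(objectIdHex) {
  const digest = crypto.createHash("md5").update(objectIdHex).digest();
  return (digest[0] << 10) | (digest[1] << 2) | (digest[2] >> 6);
}

// ASSUMPTION: "append" is read as replacing the low 18 bits of the
// timestamp with the 18-bit value.
function shardKeyFromObjectId(objectIdHex) {
  const ts = objectIdTimestamp(objectIdHex);
  const windowStart = Math.floor(ts / WINDOW_SECONDS) * WINDOW_SECONDS;
  return windowStart + first18BitsOfMd5(objectIdHex);
}

const exampleId = "507f1f77bcf86cd799439011"; // illustrative ObjectId
console.log((WINDOW_SECONDS / 86400).toFixed(2), "days per window");
console.log(shardKeyFromObjectId(exampleId));
```

Because the key is a pure function of the ObjectId, a client that only stored its ObjectId can recompute the shard key at query time.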
MongoDB 2.4 and New Hashed Sharding
• No need for pre-splitting
• Easy setup of a randomized shard key
• Better distribution of chunks on shards for isolated document
writes and reads
• Uses the first 64 bits of the MD5 of the field
• Doesn't support compound indexes
• Don't use floating point numbers
• Range-based queries go to all shards
• Working set is distributed to all shards
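A small illustration of two of these points, written as a sketch rather than a reproduction of MongoDB's hash. It takes the first 64 bits of the MD5 of a value's string form; MongoDB hashes its own internal encoding of the field, so real hash values will differ. The sketch also assumes that floating point values are truncated to whole numbers before hashing, which is why different floats can land on the same hashed key. The function names are hypothetical.

```javascript
const crypto = require("crypto");

// Illustration only: first 64 bits of the MD5 of the value's string form.
function sketchHash64(value) {
  const digest = crypto.createHash("md5").update(String(value)).digest();
  return digest.readBigUInt64BE(0);
}

// ASSUMPTION: floats are cut down to whole numbers before hashing.
function sketchHashNumber(n) {
  return sketchHash64(Math.trunc(n));
}

// Neighbouring values hash to unrelated positions, so a range of field
// values is scattered across the hashed key space (and across shards).
for (const id of [1000, 1001, 1002]) {
  console.log(id, sketchHash64(id).toString(16));
}

// 2.3 and 2.9 both truncate to 2 and therefore share one hash.
console.log(sketchHashNumber(2.3) === sketchHashNumber(2.9));
```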
Development Environment
Sharded Cluster
Deploy
• Start the config server (mongod) instances
• Start the mongos instances
• Add shards to the cluster
• Enable sharding for a database
• Enable sharding for a collection
Start the Config Server Database Instances
• Create a data dir for each of the 3 config server instances
• Start the 3 config server instances
• If starting them on the same machine for study purposes, then
configure each one to run on a different port
mkdir /data/configdb
mongod --configsvr --dbpath /data/configdb --port 27019
Start the mongos Instances
• Ideally these lightweight instances run on the application
server(s)
• Start the mongos instances
• mongos --configdb <config server hostnames>
• If connecting to the demo configs on localhost, then specify the
config servers with the different ports.
• mongos runs on port 27017 by default
mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019
mongos --configdb localhost:27019,localhost:27020,localhost:27021
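For the single-machine study setup, the per-instance directories and ports are easy to get out of step, so a small Node.js helper can print matching commands. This is only a sketch: the helper names, the numbered directory scheme, and the use of `mkdir -p` are assumptions, while the `mongod` and `mongos` options are the ones shown on the slides.

```javascript
// Hypothetical helper: one data directory and one port per config server.
function configServerCommands(count, basePort, baseDir) {
  const commands = [];
  for (let i = 0; i < count; i++) {
    const dir = `${baseDir}${i}`;
    commands.push(`mkdir -p ${dir}`);
    commands.push(`mongod --configsvr --dbpath ${dir} --port ${basePort + i}`);
  }
  return commands;
}

// The mongos command must list the same hosts and ports.
function mongosCommand(count, basePort, host) {
  const servers = [];
  for (let i = 0; i < count; i++) {
    servers.push(`${host}:${basePort + i}`);
  }
  return `mongos --configdb ${servers.join(",")}`;
}

configServerCommands(3, 27019, "/data/configdb").forEach((c) => console.log(c));
console.log(mongosCommand(3, 27019, "localhost"));
```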
Add Shards to the Cluster
• Connect to mongos from the mongo shell
• Add each shard to the cluster using sh.addShard()
• In production, always specify the replica set
mongo --host localhost --port 27017
sh.addShard("rs1/localhost:27017")
Enable Sharding for a Database
• To shard collections you need to enable this first
• Add sharding to a database using
sh.enableSharding("<database>")
sh.enableSharding("mydb")
Enable Sharding for a Collection
• Enable sharding using
sh.shardCollection("<database>.<collection>", shard-key-pattern)
sh.shardCollection("mydb.shardcollection", {userId: 1})
Demo
• Please check out our demo
Resources
• http://docs.mongodb.org/manual/sharding/
• https://github.com/ekovshilovsky/mongo_java
• http://docs.mongodb.org/manual/administration/sharded-clusters/
• http://docs.mongodb.org/manual/tutorial/deploy-shard-cluster/
We are hiring!
• http://www.ciegames.com/careers
