SlideShare une entreprise Scribd logo
1  sur  68
Télécharger pour lire hors ligne
Introduction to Sharding
      Tyler Brock
      Software Engineer, 10gen




Wednesday, March 27, 13
Agenda

      • Scaling Data
      • MongoDB's Approach
      • Architecture
      • Configuration
      • Mechanics




Wednesday, March 27, 13
Scaling Data




Wednesday, March 27, 13
Examining Growth




Wednesday, March 27, 13
Examining Growth

      • User Growth
             – 1995: 0.4% of the world’s population
             – Today: 30% of the world is online (~2.2B)
             – Emerging Markets & Mobile




Wednesday, March 27, 13
Examining Growth

      • User Growth
             – 1995: 0.4% of the world’s population
             – Today: 30% of the world is online (~2.2B)
             – Emerging Markets & Mobile

      • Data Set Growth
             – Facebook’s data set is around 100 petabytes
             – 4 billion photos taken in the last year (4x a decade ago)




Wednesday, March 27, 13
Working Set Exceeds Physical Memory


Wednesday, March 27, 13
Read/Write Throughput Exceeds I/O


Wednesday, March 27, 13
Vertical Scalability (Scale Up)


Wednesday, March 27, 13
Horizontal Scalability (Scale Out)


Wednesday, March 27, 13
Data Store Scalability

      • Custom Hardware
             – Oracle

      • Custom Software
             – Facebook + MySQL
             – Google




Wednesday, March 27, 13
Data Store Scalability Today

      • MongoDB Auto-Sharding
      • A data store that is
             – Free
             – Publicly available
             – Open Source (https://github.com/mongodb/mongo)
             – Horizontally scalable
             – Application independent




Wednesday, March 27, 13
MongoDB's Approach to
      Sharding




Wednesday, March 27, 13
Partitioning

      • User defines shard key
      • Shard key defines range of data
      • Key space is like points on a line
      • Range is a segment of that line




Wednesday, March 27, 13
Data Distribution

      • Initially 1 chunk
      • Default max chunk size: 64mb
      • MongoDB automatically splits & migrates chunks
          when max reached




Wednesday, March 27, 13
Routing and Balancing

      • Queries routed to specific
          shards
      • MongoDB balances cluster
      • MongoDB migrates data to
          new nodes




Wednesday, March 27, 13
MongoDB Auto-Sharding

      • Minimal effort required
             – Same interface as single mongod

      • Two steps
             – Enable Sharding for a database
             – Shard collection within database




Wednesday, March 27, 13
Architecture




Wednesday, March 27, 13
What is a Shard?
      • Shard is a node of the cluster
      • Shard can be a single mongod or a replica set




Wednesday, March 27, 13
Meta Data Storage

      • Config Server
         – Stores cluster chunk ranges and locations
         – Can have only 1 or 3 (production must have 3)
         – Not a replica set




Wednesday, March 27, 13
Routing and Managing Data

      • Mongos
         – Acts as a router / balancer
         – No local data (persists to config database)
         – Can have 1 or many




Wednesday, March 27, 13
Sharding infrastructure


Wednesday, March 27, 13
Configuration




Wednesday, March 27, 13
Example Cluster




        • Don’t use this setup in production!
           - Only one Config server (No Fault Tolerance)
           - Shard not in a replica set (Low Availability)
           - Only one mongos and shard (No Performance Improvement)




Wednesday, March 27, 13
Starting the Configuration Server




        • mongod --configsvr
        • Starts a configuration server on the default port (27019)




Wednesday, March 27, 13
Start the mongos Router




        • mongos --configdb <hostname>:27019
        • For 3 configuration servers:
           mongos --configdb
           <host1>:<port1>,<host2>:<port2>,<host3>:<port3>
        • This is always how to start a new mongos, even if the cluster is
           already running


Wednesday, March 27, 13
Start the shard database




        •   mongod --shardsvr
        •   Starts a mongod with the default shard port (27018)
        •   Shard is not yet connected to the rest of the cluster
        •   Shard may have already been running in production



Wednesday, March 27, 13
Add the Shard




        • On mongos:
           - sh.addShard(‘<host>:27018’)
        • Adding a replica set:




Wednesday, March 27, 13
Verify that the shard was added




        • db.runCommand({ listshards:1 })
            { "shards" :
            ! [{"_id”: "shard0000”,"host”: ”<hostname>:27018” } ],
                 "ok" : 1
             }




Wednesday, March 27, 13
Enabling Sharding

      • Enable sharding on a database

             sh.enableSharding(“<dbname>”)

      • Shard a collection with the given key

             sh.shardCollection(“<dbname>.people”,{“country”:1})

      • Use a compound shard key to prevent duplicates

             sh.shardCollection(“<dbname>.cars”,{“year”:1, ”uniqueid”:1})




Wednesday, March 27, 13
Tag Aware Sharding

      • Tag aware sharding allows you to control the
          distribution of your data
      • Tag a range of shard keys
             – sh.addTagRange(<collection>,<min>,<max>,<tag>)

      • Tag a shard
             – sh.addShardTag(<shard>,<tag>)




Wednesday, March 27, 13
Mechanics




Wednesday, March 27, 13
Partitioning

      • Remember it's based on ranges




Wednesday, March 27, 13
Chunk is a section of the entire range


Wednesday, March 27, 13
Chunk splitting




     • A chunk is split once it exceeds the maximum size
     • There is no split point if all documents have the same shard key
     • Chunk split is a logical operation (no data is moved)




Wednesday, March 27, 13
Balancing




        • Balancer is running on mongos
        • Once the difference in chunks between the most dense shard
           and the least dense shard is above the migration threshold, a
           balancing round starts




Wednesday, March 27, 13
Acquiring the Balancer Lock




        • The balancer on mongos takes out a “balancer lock”
        • To see the status of these locks:
           use config
           db.locks.find({ _id: “balancer” })




Wednesday, March 27, 13
Moving the chunk




        • The mongos sends a moveChunk command to source shard
        • The source shard then notifies destination shard
        • Destination shard starts pulling documents from source shard




Wednesday, March 27, 13
Committing Migration




        • When complete, destination shard updates config server
           - Provides new locations of the chunks




Wednesday, March 27, 13
Cleanup




        • Source shard deletes moved data
           - Must wait for open cursors to either close or time out
           - NoTimeout cursors may prevent the release of the lock
        • The mongos releases the balancer lock after old chunks are




Wednesday, March 27, 13
Routing Requests




Wednesday, March 27, 13
Cluster Request Routing

      • Targeted Queries
      • Scatter Gather Queries
      • Scatter Gather Queries with Sort




Wednesday, March 27, 13
Cluster Request Routing: Targeted Query


Wednesday, March 27, 13
Routable request received


Wednesday, March 27, 13
Request routed to appropriate shard


Wednesday, March 27, 13
Shard returns results


Wednesday, March 27, 13
Mongos returns results to client


Wednesday, March 27, 13
Cluster Request Routing: Non-Targeted Query


Wednesday, March 27, 13
Non-Targeted Request Received


Wednesday, March 27, 13
Request sent to all shards


Wednesday, March 27, 13
Shards return results to mongos


Wednesday, March 27, 13
Mongos returns results to client


Wednesday, March 27, 13
Cluster Request Routing: Non-Targeted Query
      with Sort


Wednesday, March 27, 13
Non-Targeted request with sort received


Wednesday, March 27, 13
Request sent to all shards


Wednesday, March 27, 13
Query and sort performed locally


Wednesday, March 27, 13
Shards return results to mongos


Wednesday, March 27, 13
Mongos merges sorted results


Wednesday, March 27, 13
Mongos returns results to client


Wednesday, March 27, 13
Shard Key




Wednesday, March 27, 13
Shard Key

      • Shard key is immutable
      • Shard key values are immutable
      • Shard key must be indexed
      • Shard key limited to 512 bytes in size
      • Shard key used to route queries
             – Choose a field commonly used in queries

      • Only shard key can be unique across shards
             – `_id` field is only unique within individual shard



Wednesday, March 27, 13
Shard Key Considerations

      • Cardinality
      • Write Distribution
      • Query Isolation
      • Reliability
      • Index Locality




Wednesday, March 27, 13
Conclusion




Wednesday, March 27, 13
Read/Write Throughput Exceeds I/O


Wednesday, March 27, 13
Working Set Exceeds Physical Memory


Wednesday, March 27, 13
Sharding Enables Scale

      • MongoDB’s Auto-Sharding
             – Easy to Configure
             – Consistent Interface
             – Free and Open Source




Wednesday, March 27, 13
• What’s next?
             – [ Insert Related Talks ]
             – [ Insert Upcoming Webinars ]
             – MongoDB User Group

      • Resources
                  https://education.10gen.com/
                  http://www.10gen.com/presentations




Wednesday, March 27, 13
Thank You
      Tyler Brock
      Software Engineer, 10gen




Wednesday, March 27, 13

Contenu connexe

Tendances

Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performanceDaum DNA
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB ShardingRob Walters
 
MongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo SeattleMongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo SeattleMongoDB
 
Lightning Talk: MongoDB Sharding
Lightning Talk: MongoDB ShardingLightning Talk: MongoDB Sharding
Lightning Talk: MongoDB ShardingMongoDB
 
I have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced ShardingI have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced ShardingDavid Murphy
 
Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Randall Hunt
 
MongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentMongoDB
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About ShardingMongoDB
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB
 
Шардинг в MongoDB, Henrik Ingo (MongoDB)
Шардинг в MongoDB, Henrik Ingo (MongoDB)Шардинг в MongoDB, Henrik Ingo (MongoDB)
Шардинг в MongoDB, Henrik Ingo (MongoDB)Ontico
 
Sharding
ShardingSharding
ShardingMongoDB
 
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...leifwalsh
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to ShardingMongoDB
 
Exploring the replication in MongoDB
Exploring the replication in MongoDBExploring the replication in MongoDB
Exploring the replication in MongoDBIgor Donchovski
 
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops TeamEvolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops TeamMydbops
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to ShardingMongoDB
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingMongoDB
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDBRick Copeland
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity PlanningNorberto Leite
 

Tendances (20)

Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performance
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Sharding
 
MongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo SeattleMongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo Seattle
 
Lightning Talk: MongoDB Sharding
Lightning Talk: MongoDB ShardingLightning Talk: MongoDB Sharding
Lightning Talk: MongoDB Sharding
 
I have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced ShardingI have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced Sharding
 
Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013
 
MongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB Deployment Checklist
MongoDB Deployment Checklist
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About Sharding
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: Sharding
 
Шардинг в MongoDB, Henrik Ingo (MongoDB)
Шардинг в MongoDB, Henrik Ingo (MongoDB)Шардинг в MongoDB, Henrik Ingo (MongoDB)
Шардинг в MongoDB, Henrik Ingo (MongoDB)
 
Sharding
ShardingSharding
Sharding
 
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
Exploring the replication in MongoDB
Exploring the replication in MongoDBExploring the replication in MongoDB
Exploring the replication in MongoDB
 
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops TeamEvolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to Sharding
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDB
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity Planning
 

Similaire à Sharding

Webinar: Sharding
Webinar: ShardingWebinar: Sharding
Webinar: ShardingMongoDB
 
Sharding - Seoul 2012
Sharding - Seoul 2012Sharding - Seoul 2012
Sharding - Seoul 2012MongoDB
 
Sharding
ShardingSharding
ShardingMongoDB
 
Sharding
ShardingSharding
ShardingMongoDB
 
What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?Ivan Zoratti
 
Sharding Overview
Sharding OverviewSharding Overview
Sharding OverviewMongoDB
 
Shard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stackShard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stackJustin Swanhart
 
Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB MongoDB
 
Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.MongoDB
 
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...MongoDB
 
Cloud east shutl_talk
Cloud east shutl_talkCloud east shutl_talk
Cloud east shutl_talkVolker Pacher
 
Yet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRepYet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRepDenish Patel
 
Scaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTPScaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTPdarkdata
 
Back to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to shardingBack to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to shardingMongoDB
 
MongoDB Pros and Cons
MongoDB Pros and ConsMongoDB Pros and Cons
MongoDB Pros and Consjohnrjenson
 
Conceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQLConceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQLMongoDB
 
Fractal Tree Indexes : From Theory to Practice
Fractal Tree Indexes : From Theory to PracticeFractal Tree Indexes : From Theory to Practice
Fractal Tree Indexes : From Theory to PracticeTim Callaghan
 
Getting Started with MongoDB (TCF ITPC 2014)
Getting Started with MongoDB (TCF ITPC 2014)Getting Started with MongoDB (TCF ITPC 2014)
Getting Started with MongoDB (TCF ITPC 2014)Michael Redlich
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceSasidhar Gogulapati
 
Mongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappeMongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappeSpyros Passas
 

Similaire à Sharding (20)

Webinar: Sharding
Webinar: ShardingWebinar: Sharding
Webinar: Sharding
 
Sharding - Seoul 2012
Sharding - Seoul 2012Sharding - Seoul 2012
Sharding - Seoul 2012
 
Sharding
ShardingSharding
Sharding
 
Sharding
ShardingSharding
Sharding
 
What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?
 
Sharding Overview
Sharding OverviewSharding Overview
Sharding Overview
 
Shard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stackShard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stack
 
Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB
 
Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.
 
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
 
Cloud east shutl_talk
Cloud east shutl_talkCloud east shutl_talk
Cloud east shutl_talk
 
Yet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRepYet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRep
 
Scaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTPScaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTP
 
Back to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to shardingBack to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to sharding
 
MongoDB Pros and Cons
MongoDB Pros and ConsMongoDB Pros and Cons
MongoDB Pros and Cons
 
Conceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQLConceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQL
 
Fractal Tree Indexes : From Theory to Practice
Fractal Tree Indexes : From Theory to PracticeFractal Tree Indexes : From Theory to Practice
Fractal Tree Indexes : From Theory to Practice
 
Getting Started with MongoDB (TCF ITPC 2014)
Getting Started with MongoDB (TCF ITPC 2014)Getting Started with MongoDB (TCF ITPC 2014)
Getting Started with MongoDB (TCF ITPC 2014)
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & Performance
 
Mongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappeMongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappe
 

Plus de MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Plus de MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Dernier

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 

Dernier (20)

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 

Sharding

  • 1. Introduction to Sharding Tyler Brock Software Engineer, 10gen Wednesday, March 27, 13
  • 2. Agenda • Scaling Data • MongoDB's Approach • Architecture • Configuration • Mechanics Wednesday, March 27, 13
  • 5. Examining Growth • User Growth – 1995: 0.4% of the world’s population – Today: 30% of the world is online (~2.2B) – Emerging Markets & Mobile Wednesday, March 27, 13
  • 6. Examining Growth • User Growth – 1995: 0.4% of the world’s population – Today: 30% of the world is online (~2.2B) – Emerging Markets & Mobile • Data Set Growth – Facebook’s data set is around 100 petabytes – 4 billion photos taken in the last year (4x a decade ago) Wednesday, March 27, 13
  • 7. Working Set Exceeds Physical Memory Wednesday, March 27, 13
  • 8. Read/Write Throughput Exceeds I/O Wednesday, March 27, 13
  • 9. Vertical Scalability (Scale Up) Wednesday, March 27, 13
  • 10. Horizontal Scalability (Scale Out) Wednesday, March 27, 13
  • 11. Data Store Scalability • Custom Hardware – Oracle • Custom Software – Facebook + MySQL – Google Wednesday, March 27, 13
  • 12. Data Store Scalability Today • MongoDB Auto-Sharding • A data store that is – Free – Publicly available – Open Source (https://github.com/mongodb/mongo) – Horizontally scalable – Application independent Wednesday, March 27, 13
  • 13. MongoDB's Approach to Sharding Wednesday, March 27, 13
  • 14. Partitioning • User defines shard key • Shard key defines range of data • Key space is like points on a line • Range is a segment of that line Wednesday, March 27, 13
  • 15. Data Distribution • Initially 1 chunk • Default max chunk size: 64mb • MongoDB automatically splits & migrates chunks when max reached Wednesday, March 27, 13
  • 16. Routing and Balancing • Queries routed to specific shards • MongoDB balances cluster • MongoDB migrates data to new nodes Wednesday, March 27, 13
  • 17. MongoDB Auto-Sharding • Minimal effort required – Same interface as single mongod • Two steps – Enable Sharding for a database – Shard collection within database Wednesday, March 27, 13
  • 19. What is a Shard? • Shard is a node of the cluster • Shard can be a single mongod or a replica set Wednesday, March 27, 13
  • 20. Meta Data Storage • Config Server – Stores cluster chunk ranges and locations – Can have only 1 or 3 (production must have 3) – Not a replica set Wednesday, March 27, 13
  • 21. Routing and Managing Data • Mongos – Acts as a router / balancer – No local data (persists to config database) – Can have 1 or many Wednesday, March 27, 13
  • 24. Example Cluster • Don’t use this setup in production! - Only one Config server (No Fault Tolerance) - Shard not in a replica set (Low Availability) - Only one mongos and shard (No Performance Improvement) Wednesday, March 27, 13
  • 25. Starting the Configuration Server • mongod --configsvr • Starts a configuration server on the default port (27019) Wednesday, March 27, 13
  • 26. Start the mongos Router • mongos --configdb <hostname>:27019 • For 3 configuration servers: mongos --configdb <host1>:<port1>,<host2>:<port2>,<host3>:<port3> • This is always how to start a new mongos, even if the cluster is already running Wednesday, March 27, 13
  • 27. Start the shard database • mongod --shardsvr • Starts a mongod with the default shard port (27018) • Shard is not yet connected to the rest of the cluster • Shard may have already been running in production Wednesday, March 27, 13
  • 28. Add the Shard • On mongos: - sh.addShard(‘<host>:27018’) • Adding a replica set: Wednesday, March 27, 13
  • 29. Verify that the shard was added • db.runCommand({ listshards:1 }) { "shards" : ! [{"_id”: "shard0000”,"host”: ”<hostname>:27018” } ], "ok" : 1 } Wednesday, March 27, 13
  • 30. Enabling Sharding • Enable sharding on a database sh.enableSharding(“<dbname>”) • Shard a collection with the given key sh.shardCollection(“<dbname>.people”,{“country”:1}) • Use a compound shard key to prevent duplicates sh.shardCollection(“<dbname>.cars”,{“year”:1, ”uniqueid”:1}) Wednesday, March 27, 13
  • 31. Tag Aware Sharding • Tag aware sharding allows you to control the distribution of your data • Tag a range of shard keys – sh.addTagRange(<collection>,<min>,<max>,<tag>) • Tag a shard – sh.addShardTag(<shard>,<tag>) Wednesday, March 27, 13
  • 33. Partitioning • Remember it's based on ranges Wednesday, March 27, 13
  • 34. Chunk is a section of the entire range Wednesday, March 27, 13
  • 35. Chunk splitting • A chunk is split once it exceeds the maximum size • There is no split point if all documents have the same shard key • Chunk split is a logical operation (no data is moved) Wednesday, March 27, 13
  • 36. Balancing • Balancer is running on mongos • Once the difference in chunks between the most dense shard and the least dense shard is above the migration threshold, a balancing round starts Wednesday, March 27, 13
  • 37. Acquiring the Balancer Lock • The balancer on mongos takes out a “balancer lock” • To see the status of these locks: use config db.locks.find({ _id: “balancer” }) Wednesday, March 27, 13
  • 38. Moving the chunk • The mongos sends a moveChunk command to source shard • The source shard then notifies destination shard • Destination shard starts pulling documents from source shard Wednesday, March 27, 13
  • 39. Committing Migration • When complete, destination shard updates config server - Provides new locations of the chunks Wednesday, March 27, 13
  • 40. Cleanup • Source shard deletes moved data - Must wait for open cursors to either close or time out - NoTimeout cursors may prevent the release of the lock • The mongos releases the balancer lock after old chunks are Wednesday, March 27, 13
  • 42. Cluster Request Routing • Targeted Queries • Scatter Gather Queries • Scatter Gather Queries with Sort Wednesday, March 27, 13
  • 43. Cluster Request Routing: Targeted Query Wednesday, March 27, 13
  • 45. Request routed to appropriate shard Wednesday, March 27, 13
  • 47. Mongos returns results to client Wednesday, March 27, 13
  • 48. Cluster Request Routing: Non-Targeted Query Wednesday, March 27, 13
  • 50. Request sent to all shards Wednesday, March 27, 13
  • 51. Shards return results to mongos Wednesday, March 27, 13
  • 52. Mongos returns results to client Wednesday, March 27, 13
  • 53. Cluster Request Routing: Non-Targeted Query with Sort Wednesday, March 27, 13
  • 54. Non-Targeted request with sort received Wednesday, March 27, 13
  • 55. Request sent to all shards Wednesday, March 27, 13
  • 56. Query and sort performed locally Wednesday, March 27, 13
  • 57. Shards return results to mongos Wednesday, March 27, 13
  • 58. Mongos merges sorted results Wednesday, March 27, 13
  • 59. Mongos returns results to client Wednesday, March 27, 13
  • 61. Shard Key • Shard key is immutable • Shard key values are immutable • Shard key must be indexed • Shard key limited to 512 bytes in size • Shard key used to route queries – Choose a field commonly used in queries • Only shard key can be unique across shards – `_id` field is only unique within individual shard Wednesday, March 27, 13
  • 62. Shard Key Considerations • Cardinality • Write Distribution • Query Isolation • Reliability • Index Locality Wednesday, March 27, 13
  • 64. Read/Write Throughput Exceeds I/O Wednesday, March 27, 13
  • 65. Working Set Exceeds Physical Memory Wednesday, March 27, 13
  • 66. Sharding Enables Scale • MongoDB’s Auto-Sharding – Easy to Configure – Consistent Interface – Free and Open Source Wednesday, March 27, 13
  • 67. • What’s next? – [ Insert Related Talks ] – [ Insert Upcoming Webinars ] – MongoDB User Group • Resources https://education.10gen.com/ http://www.10gen.com/presentations Wednesday, March 27, 13
  • 68. Thank You Tyler Brock Software Engineer, 10gen Wednesday, March 27, 13