SlideShare une entreprise Scribd logo
1  sur  68
Télécharger pour lire hors ligne
Introduction to Sharding
      Tyler Brock
      Software Engineer, 10gen




Wednesday, March 27, 13
Agenda

      • Scaling Data
      • MongoDB's Approach
      • Architecture
      • Configuration
      • Mechanics




Wednesday, March 27, 13
Scaling Data




Wednesday, March 27, 13
Examining Growth




Wednesday, March 27, 13
Examining Growth

      • User Growth
             – 1995: 0.4% of the world’s population
             – Today: 30% of the world is online (~2.2B)
             – Emerging Markets & Mobile




Wednesday, March 27, 13
Examining Growth

      • User Growth
             – 1995: 0.4% of the world’s population
             – Today: 30% of the world is online (~2.2B)
             – Emerging Markets & Mobile

      • Data Set Growth
             – Facebook’s data set is around 100 petabytes
             – 4 billion photos taken in the last year (4x a decade ago)




Wednesday, March 27, 13
Working Set Exceeds Physical Memory


Wednesday, March 27, 13
Read/Write Throughput Exceeds I/O


Wednesday, March 27, 13
Vertical Scalability (Scale Up)


Wednesday, March 27, 13
Horizontal Scalability (Scale Out)


Wednesday, March 27, 13
Data Store Scalability

      • Custom Hardware
             – Oracle

      • Custom Software
             – Facebook + MySQL
             – Google




Wednesday, March 27, 13
Data Store Scalability Today

      • MongoDB Auto-Sharding
      • A data store that is
             – Free
             – Publicly available
             – Open Source (https://github.com/mongodb/mongo)
             – Horizontally scalable
             – Application independent




Wednesday, March 27, 13
MongoDB's Approach to
      Sharding




Wednesday, March 27, 13
Partitioning

      • User defines shard key
      • Shard key defines range of data
      • Key space is like points on a line
      • Range is a segment of that line




Wednesday, March 27, 13
Data Distribution

      • Initially 1 chunk
      • Default max chunk size: 64mb
      • MongoDB automatically splits & migrates chunks
          when max reached




Wednesday, March 27, 13
Routing and Balancing

      • Queries routed to specific
          shards
      • MongoDB balances cluster
      • MongoDB migrates data to
          new nodes




Wednesday, March 27, 13
MongoDB Auto-Sharding

      • Minimal effort required
             – Same interface as single mongod

      • Two steps
             – Enable Sharding for a database
             – Shard collection within database




Wednesday, March 27, 13
Architecture




Wednesday, March 27, 13
What is a Shard?
      • Shard is a node of the cluster
      • Shard can be a single mongod or a replica set




Wednesday, March 27, 13
Meta Data Storage

      • Config Server
         – Stores cluster chunk ranges and locations
         – Can have only 1 or 3 (production must have 3)
         – Not a replica set




Wednesday, March 27, 13
Routing and Managing Data

      • Mongos
         – Acts as a router / balancer
         – No local data (persists to config database)
         – Can have 1 or many




Wednesday, March 27, 13
Sharding infrastructure


Wednesday, March 27, 13
Configuration




Wednesday, March 27, 13
Example Cluster




        • Don’t use this setup in production!
           - Only one Config server (No Fault Tolerance)
           - Shard not in a replica set (Low Availability)
           - Only one mongos and shard (No Performance Improvement)




Wednesday, March 27, 13
Starting the Configuration Server




        • mongod --configsvr
        • Starts a configuration server on the default port (27019)




Wednesday, March 27, 13
Start the mongos Router




        • mongos --configdb <hostname>:27019
        • For 3 configuration servers:
           mongos --configdb
           <host1>:<port1>,<host2>:<port2>,<host3>:<port3>
        • This is always how to start a new mongos, even if the cluster is
           already running


Wednesday, March 27, 13
Start the shard database




        •   mongod --shardsvr
        •   Starts a mongod with the default shard port (27018)
        •   Shard is not yet connected to the rest of the cluster
        •   Shard may have already been running in production



Wednesday, March 27, 13
Add the Shard




        • On mongos:
           - sh.addShard(‘<host>:27018’)
        • Adding a replica set:




Wednesday, March 27, 13
Verify that the shard was added




        • db.runCommand({ listshards:1 })
            { "shards" :
            ! [{"_id”: "shard0000”,"host”: ”<hostname>:27018” } ],
                 "ok" : 1
             }




Wednesday, March 27, 13
Enabling Sharding

      • Enable sharding on a database

             sh.enableSharding(“<dbname>”)

      • Shard a collection with the given key

             sh.shardCollection(“<dbname>.people”,{“country”:1})

      • Use a compound shard key to prevent duplicates

             sh.shardCollection(“<dbname>.cars”,{“year”:1, ”uniqueid”:1})




Wednesday, March 27, 13
Tag Aware Sharding

      • Tag aware sharding allows you to control the
          distribution of your data
      • Tag a range of shard keys
             – sh.addTagRange(<collection>,<min>,<max>,<tag>)

      • Tag a shard
             – sh.addShardTag(<shard>,<tag>)




Wednesday, March 27, 13
Mechanics




Wednesday, March 27, 13
Partitioning

      • Remember it's based on ranges




Wednesday, March 27, 13
Chunk is a section of the entire range


Wednesday, March 27, 13
Chunk splitting




     • A chunk is split once it exceeds the maximum size
     • There is no split point if all documents have the same shard key
     • Chunk split is a logical operation (no data is moved)




Wednesday, March 27, 13
Balancing




        • Balancer is running on mongos
        • Once the difference in chunks between the most dense shard
           and the least dense shard is above the migration threshold, a
           balancing round starts




Wednesday, March 27, 13
Acquiring the Balancer Lock




        • The balancer on mongos takes out a “balancer lock”
        • To see the status of these locks:
           use config
           db.locks.find({ _id: “balancer” })




Wednesday, March 27, 13
Moving the chunk




        • The mongos sends a moveChunk command to source shard
        • The source shard then notifies destination shard
        • Destination shard starts pulling documents from source shard




Wednesday, March 27, 13
Committing Migration




        • When complete, destination shard updates config server
           - Provides new locations of the chunks




Wednesday, March 27, 13
Cleanup




        • Source shard deletes moved data
           - Must wait for open cursors to either close or time out
           - NoTimeout cursors may prevent the release of the lock
        • The mongos releases the balancer lock after old chunks are




Wednesday, March 27, 13
Routing Requests




Wednesday, March 27, 13
Cluster Request Routing

      • Targeted Queries
      • Scatter Gather Queries
      • Scatter Gather Queries with Sort




Wednesday, March 27, 13
Cluster Request Routing: Targeted Query


Wednesday, March 27, 13
Routable request received


Wednesday, March 27, 13
Request routed to appropriate shard


Wednesday, March 27, 13
Shard returns results


Wednesday, March 27, 13
Mongos returns results to client


Wednesday, March 27, 13
Cluster Request Routing: Non-Targeted Query


Wednesday, March 27, 13
Non-Targeted Request Received


Wednesday, March 27, 13
Request sent to all shards


Wednesday, March 27, 13
Shards return results to mongos


Wednesday, March 27, 13
Mongos returns results to client


Wednesday, March 27, 13
Cluster Request Routing: Non-Targeted Query
      with Sort


Wednesday, March 27, 13
Non-Targeted request with sort received


Wednesday, March 27, 13
Request sent to all shards


Wednesday, March 27, 13
Query and sort performed locally


Wednesday, March 27, 13
Shards return results to mongos


Wednesday, March 27, 13
Mongos merges sorted results


Wednesday, March 27, 13
Mongos returns results to client


Wednesday, March 27, 13
Shard Key




Wednesday, March 27, 13
Shard Key

      • Shard key is immutable
      • Shard key values are immutable
      • Shard key must be indexed
      • Shard key limited to 512 bytes in size
      • Shard key used to route queries
             – Choose a field commonly used in queries

      • Only shard key can be unique across shards
             – `_id` field is only unique within individual shard



Wednesday, March 27, 13
Shard Key Considerations

      • Cardinality
      • Write Distribution
      • Query Isolation
      • Reliability
      • Index Locality




Wednesday, March 27, 13
Conclusion




Wednesday, March 27, 13
Read/Write Throughput Exceeds I/O


Wednesday, March 27, 13
Working Set Exceeds Physical Memory


Wednesday, March 27, 13
Sharding Enables Scale

      • MongoDB’s Auto-Sharding
             – Easy to Configure
             – Consistent Interface
             – Free and Open Source




Wednesday, March 27, 13
• What’s next?
             – [ Insert Related Talks ]
             – [ Insert Upcoming Webinars ]
             – MongoDB User Group

      • Resources
                  https://education.10gen.com/
                  http://www.10gen.com/presentations




Wednesday, March 27, 13
Thank You
      Tyler Brock
      Software Engineer, 10gen




Wednesday, March 27, 13

Contenu connexe

Tendances

Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performance
Daum DNA
 
Lightning Talk: MongoDB Sharding
Lightning Talk: MongoDB ShardingLightning Talk: MongoDB Sharding
Lightning Talk: MongoDB Sharding
MongoDB
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
MongoDB
 

Tendances (20)

Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performance
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Sharding
 
MongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo SeattleMongoDB Auto-Sharding at Mongo Seattle
MongoDB Auto-Sharding at Mongo Seattle
 
Lightning Talk: MongoDB Sharding
Lightning Talk: MongoDB ShardingLightning Talk: MongoDB Sharding
Lightning Talk: MongoDB Sharding
 
I have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced ShardingI have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced Sharding
 
Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013Sharding in MongoDB Days 2013
Sharding in MongoDB Days 2013
 
MongoDB Deployment Checklist
MongoDB Deployment ChecklistMongoDB Deployment Checklist
MongoDB Deployment Checklist
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About Sharding
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: Sharding
 
Шардинг в MongoDB, Henrik Ingo (MongoDB)
Шардинг в MongoDB, Henrik Ingo (MongoDB)Шардинг в MongoDB, Henrik Ingo (MongoDB)
Шардинг в MongoDB, Henrik Ingo (MongoDB)
 
Sharding
ShardingSharding
Sharding
 
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
Exploring the replication in MongoDB
Exploring the replication in MongoDBExploring the replication in MongoDB
Exploring the replication in MongoDB
 
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops TeamEvolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
Evolution of MonogDB Sharding and Its Best Practices - Ranjith A - Mydbops Team
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to Sharding
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDB
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity Planning
 

Similaire à Sharding

Sharding - Seoul 2012
Sharding - Seoul 2012Sharding - Seoul 2012
Sharding - Seoul 2012
MongoDB
 
Sharding
ShardingSharding
Sharding
MongoDB
 
Sharding Overview
Sharding OverviewSharding Overview
Sharding Overview
MongoDB
 
Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB
MongoDB
 
Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.
MongoDB
 
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
MongoDB
 
Cloud east shutl_talk
Cloud east shutl_talkCloud east shutl_talk
Cloud east shutl_talk
Volker Pacher
 

Similaire à Sharding (20)

Webinar: Sharding
Webinar: ShardingWebinar: Sharding
Webinar: Sharding
 
Sharding - Seoul 2012
Sharding - Seoul 2012Sharding - Seoul 2012
Sharding - Seoul 2012
 
Sharding
ShardingSharding
Sharding
 
Sharding
ShardingSharding
Sharding
 
What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?
 
Sharding Overview
Sharding OverviewSharding Overview
Sharding Overview
 
Shard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stackShard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stack
 
Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB Back to Basics: Build Something Big With MongoDB
Back to Basics: Build Something Big With MongoDB
 
Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.Back tobasicswebinar part6-rev.
Back tobasicswebinar part6-rev.
 
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
Webinar: Serie Operazioni per la vostra applicazione - Sessione 6 - Installar...
 
Cloud east shutl_talk
Cloud east shutl_talkCloud east shutl_talk
Cloud east shutl_talk
 
Yet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRepYet Another Replication Tool: RubyRep
Yet Another Replication Tool: RubyRep
 
Scaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTPScaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTP
 
Back to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to shardingBack to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to sharding
 
MongoDB Pros and Cons
MongoDB Pros and ConsMongoDB Pros and Cons
MongoDB Pros and Cons
 
Conceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQLConceptos básicos. Seminario web 1: Introducción a NoSQL
Conceptos básicos. Seminario web 1: Introducción a NoSQL
 
Fractal Tree Indexes : From Theory to Practice
Fractal Tree Indexes : From Theory to PracticeFractal Tree Indexes : From Theory to Practice
Fractal Tree Indexes : From Theory to Practice
 
Getting Started with MongoDB (TCF ITPC 2014)
Getting Started with MongoDB (TCF ITPC 2014)Getting Started with MongoDB (TCF ITPC 2014)
Getting Started with MongoDB (TCF ITPC 2014)
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & Performance
 
Mongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappeMongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappe
 

Plus de MongoDB

Plus de MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Dernier (20)

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Sharding

  • 1. Introduction to Sharding Tyler Brock Software Engineer, 10gen Wednesday, March 27, 13
  • 2. Agenda • Scaling Data • MongoDB's Approach • Architecture • Configuration • Mechanics Wednesday, March 27, 13
  • 5. Examining Growth • User Growth – 1995: 0.4% of the world’s population – Today: 30% of the world is online (~2.2B) – Emerging Markets & Mobile Wednesday, March 27, 13
  • 6. Examining Growth • User Growth – 1995: 0.4% of the world’s population – Today: 30% of the world is online (~2.2B) – Emerging Markets & Mobile • Data Set Growth – Facebook’s data set is around 100 petabytes – 4 billion photos taken in the last year (4x a decade ago) Wednesday, March 27, 13
  • 7. Working Set Exceeds Physical Memory Wednesday, March 27, 13
  • 8. Read/Write Throughput Exceeds I/O Wednesday, March 27, 13
  • 9. Vertical Scalability (Scale Up) Wednesday, March 27, 13
  • 10. Horizontal Scalability (Scale Out) Wednesday, March 27, 13
  • 11. Data Store Scalability • Custom Hardware – Oracle • Custom Software – Facebook + MySQL – Google Wednesday, March 27, 13
  • 12. Data Store Scalability Today • MongoDB Auto-Sharding • A data store that is – Free – Publicly available – Open Source (https://github.com/mongodb/mongo) – Horizontally scalable – Application independent Wednesday, March 27, 13
  • 13. MongoDB's Approach to Sharding Wednesday, March 27, 13
  • 14. Partitioning • User defines shard key • Shard key defines range of data • Key space is like points on a line • Range is a segment of that line Wednesday, March 27, 13
  • 15. Data Distribution • Initially 1 chunk • Default max chunk size: 64mb • MongoDB automatically splits & migrates chunks when max reached Wednesday, March 27, 13
  • 16. Routing and Balancing • Queries routed to specific shards • MongoDB balances cluster • MongoDB migrates data to new nodes Wednesday, March 27, 13
  • 17. MongoDB Auto-Sharding • Minimal effort required – Same interface as single mongod • Two steps – Enable Sharding for a database – Shard collection within database Wednesday, March 27, 13
  • 19. What is a Shard? • Shard is a node of the cluster • Shard can be a single mongod or a replica set Wednesday, March 27, 13
  • 20. Meta Data Storage • Config Server – Stores cluster chunk ranges and locations – Can have only 1 or 3 (production must have 3) – Not a replica set Wednesday, March 27, 13
  • 21. Routing and Managing Data • Mongos – Acts as a router / balancer – No local data (persists to config database) – Can have 1 or many Wednesday, March 27, 13
  • 24. Example Cluster • Don’t use this setup in production! - Only one Config server (No Fault Tolerance) - Shard not in a replica set (Low Availability) - Only one mongos and shard (No Performance Improvement) Wednesday, March 27, 13
  • 25. Starting the Configuration Server • mongod --configsvr • Starts a configuration server on the default port (27019) Wednesday, March 27, 13
  • 26. Start the mongos Router • mongos --configdb <hostname>:27019 • For 3 configuration servers: mongos --configdb <host1>:<port1>,<host2>:<port2>,<host3>:<port3> • This is always how to start a new mongos, even if the cluster is already running Wednesday, March 27, 13
  • 27. Start the shard database • mongod --shardsvr • Starts a mongod with the default shard port (27018) • Shard is not yet connected to the rest of the cluster • Shard may have already been running in production Wednesday, March 27, 13
  • 28. Add the Shard • On mongos: - sh.addShard(‘<host>:27018’) • Adding a replica set: Wednesday, March 27, 13
  • 29. Verify that the shard was added • db.runCommand({ listshards:1 }) { "shards" : ! [{"_id”: "shard0000”,"host”: ”<hostname>:27018” } ], "ok" : 1 } Wednesday, March 27, 13
  • 30. Enabling Sharding • Enable sharding on a database sh.enableSharding(“<dbname>”) • Shard a collection with the given key sh.shardCollection(“<dbname>.people”,{“country”:1}) • Use a compound shard key to prevent duplicates sh.shardCollection(“<dbname>.cars”,{“year”:1, ”uniqueid”:1}) Wednesday, March 27, 13
  • 31. Tag Aware Sharding • Tag aware sharding allows you to control the distribution of your data • Tag a range of shard keys – sh.addTagRange(<collection>,<min>,<max>,<tag>) • Tag a shard – sh.addShardTag(<shard>,<tag>) Wednesday, March 27, 13
  • 33. Partitioning • Remember it's based on ranges Wednesday, March 27, 13
  • 34. Chunk is a section of the entire range Wednesday, March 27, 13
  • 35. Chunk splitting • A chunk is split once it exceeds the maximum size • There is no split point if all documents have the same shard key • Chunk split is a logical operation (no data is moved) Wednesday, March 27, 13
  • 36. Balancing • Balancer is running on mongos • Once the difference in chunks between the most dense shard and the least dense shard is above the migration threshold, a balancing round starts Wednesday, March 27, 13
  • 37. Acquiring the Balancer Lock • The balancer on mongos takes out a “balancer lock” • To see the status of these locks: use config db.locks.find({ _id: “balancer” }) Wednesday, March 27, 13
  • 38. Moving the chunk • The mongos sends a moveChunk command to source shard • The source shard then notifies destination shard • Destination shard starts pulling documents from source shard Wednesday, March 27, 13
  • 39. Committing Migration • When complete, destination shard updates config server - Provides new locations of the chunks Wednesday, March 27, 13
  • 40. Cleanup • Source shard deletes moved data - Must wait for open cursors to either close or time out - NoTimeout cursors may prevent the release of the lock • The mongos releases the balancer lock after old chunks are Wednesday, March 27, 13
  • 42. Cluster Request Routing • Targeted Queries • Scatter Gather Queries • Scatter Gather Queries with Sort Wednesday, March 27, 13
  • 43. Cluster Request Routing: Targeted Query Wednesday, March 27, 13
  • 45. Request routed to appropriate shard Wednesday, March 27, 13
  • 47. Mongos returns results to client Wednesday, March 27, 13
  • 48. Cluster Request Routing: Non-Targeted Query Wednesday, March 27, 13
  • 50. Request sent to all shards Wednesday, March 27, 13
  • 51. Shards return results to mongos Wednesday, March 27, 13
  • 52. Mongos returns results to client Wednesday, March 27, 13
  • 53. Cluster Request Routing: Non-Targeted Query with Sort Wednesday, March 27, 13
  • 54. Non-Targeted request with sort received Wednesday, March 27, 13
  • 55. Request sent to all shards Wednesday, March 27, 13
  • 56. Query and sort performed locally Wednesday, March 27, 13
  • 57. Shards return results to mongos Wednesday, March 27, 13
  • 58. Mongos merges sorted results Wednesday, March 27, 13
  • 59. Mongos returns results to client Wednesday, March 27, 13
  • 61. Shard Key • Shard key is immutable • Shard key values are immutable • Shard key must be indexed • Shard key limited to 512 bytes in size • Shard key used to route queries – Choose a field commonly used in queries • Only shard key can be unique across shards – `_id` field is only unique within individual shard Wednesday, March 27, 13
  • 62. Shard Key Considerations • Cardinality • Write Distribution • Query Isolation • Reliability • Index Locality Wednesday, March 27, 13
  • 64. Read/Write Throughput Exceeds I/O Wednesday, March 27, 13
  • 65. Working Set Exceeds Physical Memory Wednesday, March 27, 13
  • 66. Sharding Enables Scale • MongoDB’s Auto-Sharding – Easy to Configure – Consistent Interface – Free and Open Source Wednesday, March 27, 13
  • 67. • What’s next? – [ Insert Related Talks ] – [ Insert Upcoming Webinars ] – MongoDB User Group • Resources https://education.10gen.com/ http://www.10gen.com/presentations Wednesday, March 27, 13
  • 68. Thank You Tyler Brock Software Engineer, 10gen Wednesday, March 27, 13