Sharding allows you to distribute load across multiple servers and keep your data balanced across those servers. This session will review MongoDB’s sharding support, including an architectural overview, design principles, and automation.
3. Examining Growth
• User Growth
– 1995: 0.4% of the world’s population
– Today: 30% of the world is online (~2.2B)
– Emerging Markets & Mobile
• Data Set Growth
– Facebook’s data set is around 100 petabytes
– 4 billion photos taken in the last year (4x a decade ago)
25. Shard Key
• Defines the range of data, known as the key space
• Defines the distribution of documents in a collection
• Every document must contain the Shard Key
• Shard Keys are immutable
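As a rough illustration of how a shard key carves a collection into ranges, the sketch below routes documents by a numeric key. The chunk boundaries, the shard names other than shard0000, and the shardFor helper are hypothetical; real MongoDB uses MinKey/MaxKey as the outer bounds rather than infinities.

```javascript
// Hypothetical chunk table: each chunk owns the half-open range [min, max)
// of shard key values. -Infinity/Infinity stand in for MinKey/MaxKey.
const chunks = [
  { min: -Infinity, max: 2000, shard: "shard0000" },
  { min: 2000, max: 2010, shard: "shard0001" },
  { min: 2010, max: Infinity, shard: "shard0002" },
];

// Return the shard that owns the document, based on its shard key value.
function shardFor(doc, keyField, chunkTable) {
  // Every document must contain the shard key.
  if (!(keyField in doc)) {
    throw new Error(`document is missing shard key field "${keyField}"`);
  }
  const value = doc[keyField];
  const chunk = chunkTable.find((c) => value >= c.min && value < c.max);
  if (!chunk) {
    throw new Error(`no chunk covers shard key value ${value}`);
  }
  return chunk.shard;
}

console.log(shardFor({ year: 1995, model: "A" }, "year", chunks));
console.log(shardFor({ year: 2005, model: "B" }, "year", chunks));
```

Because the ranges are contiguous and do not overlap, each key value maps to exactly one chunk, which is why a document without the key cannot be placed.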
30. Scatter Gather Queries with Sort
• Query does not contain the shard key
• Each shard sorts its own results first
• Results are merged in Mongos
[Diagram: a mongos routing to three shards, each a replica set with one primary (P) and two secondaries (S)]
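The sort-then-merge flow can be sketched in plain JavaScript. This is a simulation under stated assumptions, not mongos internals: the sample documents and the helper names sortOnShard and mergeSorted are hypothetical.

```javascript
// Each inner array is the unsorted result set held by one shard (sample data).
const shardResults = [
  [{ year: 2004 }, { year: 1999 }],
  [{ year: 2011 }, { year: 2001 }],
  [{ year: 1995 }],
];

// Step 1: every shard sorts its own results locally.
function sortOnShard(docs, sortField) {
  return [...docs].sort((a, b) => a[sortField] - b[sortField]);
}

// Step 2: the router merges the already-sorted lists by repeatedly
// taking the smallest head element among them.
function mergeSorted(sortedLists, sortField) {
  const positions = sortedLists.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < sortedLists.length; i++) {
      if (positions[i] >= sortedLists[i].length) continue; // list exhausted
      const candidate = sortedLists[i][positions[i]][sortField];
      if (best === -1 || candidate < sortedLists[best][positions[best]][sortField]) {
        best = i;
      }
    }
    if (best === -1) return merged; // every list is exhausted
    merged.push(sortedLists[best][positions[best]]);
    positions[best] += 1;
  }
}

const perShard = shardResults.map((docs) => sortOnShard(docs, "year"));
const finalResult = mergeSorted(perShard, "year");
console.log(finalResult.map((d) => d.year));
```

The merge step only compares the head of each list, so the router never has to re-sort the full result set.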
43. Launch Shards
• Nothing special: each shard is launched just like a normal replica set
[Diagram: three config servers, a mongos, and one shard that is a replica set with a primary (P) and two secondaries (S)]
44. Add Shards
• Connect to mongos via the shell
• sh.addShard("<rsname>/<seedlist>")
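The argument to sh.addShard is the replica set name, a slash, then a comma-separated seed list of host:port members. A small sketch of assembling that string follows; the helper shardConnectionString and the replica set and host names are hypothetical.

```javascript
// Build the "<rsname>/<seedlist>" string expected by sh.addShard.
// rsName and seedHosts are illustrative values, not taken from the slides.
function shardConnectionString(rsName, seedHosts) {
  if (!rsName || seedHosts.length === 0) {
    throw new Error("need a replica set name and at least one seed host");
  }
  return `${rsName}/${seedHosts.join(",")}`;
}

const spec = shardConnectionString("rs0", [
  "host1.example.net:27017",
  "host2.example.net:27017",
]);

// The command you would then type in a shell connected to mongos:
console.log(`sh.addShard("${spec}")`);
```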
45. Verify that the shard was added
db.runCommand({ listShards: 1 })
{
  shards: [
    { _id: "shard0000", host: "<hostname>:27017" }
  ],
  "ok": 1
}
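When scripting this check, the response document can be inspected directly. The sketch below assumes a response shaped like the one above; the helper findShard and the sample host name are hypothetical.

```javascript
// Look up a shard by _id in a listShards-style response document.
// Returns the shard entry, or null if it is not registered.
function findShard(response, shardId) {
  if (response.ok !== 1) {
    throw new Error("listShards did not succeed");
  }
  return response.shards.find((s) => s._id === shardId) || null;
}

// Sample response with the same shape as the output shown above.
const sampleResponse = {
  shards: [{ _id: "shard0000", host: "host1.example.net:27017" }],
  ok: 1,
};

console.log(findShard(sampleResponse, "shard0000"));
console.log(findShard(sampleResponse, "shard0001"));
```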
46. Enable Sharding
• Enable sharding on a database
– sh.enableSharding("<dbname>")
• Shard a collection with the given key
– sh.shardCollection("<dbname>.people", { country: 1 })
– sh.shardCollection("<dbname>.cars", { year: 1, uniqueid: 1 })
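With a compound key such as { year: 1, uniqueid: 1 }, documents are ordered by the first field and ties are broken by the second. A minimal sketch of that ordering, using hypothetical sample documents and a hypothetical compareByShardKey helper:

```javascript
// Compare two documents on a compound shard key, field by field in key order.
function compareByShardKey(a, b, keyFields) {
  for (const field of keyFields) {
    if (a[field] < b[field]) return -1;
    if (a[field] > b[field]) return 1;
  }
  return 0; // equal on every key field
}

// Sample documents for a "cars" collection (illustrative values only).
const cars = [
  { year: 2010, uniqueid: 7 },
  { year: 2008, uniqueid: 9 },
  { year: 2010, uniqueid: 2 },
];

const ordered = [...cars].sort((a, b) =>
  compareByShardKey(a, b, ["year", "uniqueid"])
);
console.log(ordered);
```

Adding the second field lets documents that share a year still be split across different ranges.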
47. Tag Aware Sharding
• Tag aware sharding allows you to control the distribution of your data
• Tag a range of shard keys
– sh.addTagRange(<collection>,<min>,<max>,<tag>)
• Tag a shard
– sh.addShardTag(<shard>,<tag>)
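The effect of the two commands can be pictured with a small simulation: a tagged key range may only live on shards carrying the same tag, and untagged ranges may live anywhere. The tag names, ranges, shard names, and the eligibleShards helper below are all hypothetical, and this is a sketch of the placement rule rather than the balancer itself.

```javascript
// Tag ranges, as would be created with sh.addTagRange: [min, max) -> tag.
const tagRanges = [
  { min: 1900, max: 2000, tag: "archive" },
  { min: 2000, max: 2100, tag: "recent" },
];

// Shard tags, as would be created with sh.addShardTag.
const shardTags = {
  shard0000: ["archive"],
  shard0001: ["recent"],
  shard0002: [],
};

// Return the shards allowed to hold a document with the given key value.
function eligibleShards(keyValue) {
  const allShards = Object.keys(shardTags);
  const range = tagRanges.find((r) => keyValue >= r.min && keyValue < r.max);
  if (!range) return allShards; // untagged key range: any shard may hold it
  return allShards.filter((s) => shardTags[s].includes(range.tag));
}

console.log(eligibleShards(1985));
console.log(eligibleShards(2012));
console.log(eligibleShards(2150));
```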