Read these webinar slides to learn how selecting the right shard key can future-proof your application.
The shard key that you select can impact the performance, capability, and functionality of your database.
Webinar: Choosing the Right Shard Key for High Performance and Scale
1. Ger Hartnett
Director of Technical Services (EMEA), MongoDB @ghartnett #MongoDB
Tales from the Field
Part three: Choosing the Right Shard Key for
High-Performance and Scale
3. ●The main talk should take 30-35 minutes
●You can submit questions via the chat box
●We’ll answer as many as possible at the end
●We are recording and will send slides Friday
●This is the final webinar in a series of 3
Before we start
4. ●You work in operations
●You work in development
●You have a MongoDB system in production
●You have contacted MongoDB Technical
Services (support)
●You attended an earlier webinar in the series
(part1, part2)
A quick poll - add a word to the
chat to let me know your
perspective
5. ●We collect observations about common mistakes to share the experience of many
●Names have been changed to protect the
(mostly) innocent
●No animals were harmed during the making
of this presentation (but maybe some DBAs
and engineers had light emotional scarring)
●While you might be new to MongoDB we
have deep experience that you can leverage
Stories
6. 1. Discovering a DR flaw during a data
centre outage
2. Complex documents, memory and
an upgrade “surprise”
3. Wild success “uncovers” the wrong
shard key
The Stories (part three today)
8. Story #1: Recovering from a
disaster
●Prospect in the process of signing up for a
subscription
●Called us late on a Friday: data centre power outage and 30+ servers (11 shards) down
●When they started bringing up the first
shard, the nodes crashed with data
corruption
●17TB of data, very little free disk space,
JOURNALLING DISABLED!
9. Recovering each shard
1. Start secondary read-only
2. Mount NFS storage for repair
3. Repair former primary node
4. Iterative rsync to seed a secondary
(Diagram: replica set with a primary and two secondaries)
10. Key takeaways for you
●If you are departing significantly from the standard config, check with us (e.g. if you think journalling is a bad idea)
●Two DCs in different buildings, on different flood plains, not in the path of the same storm (e.g. secondaries in AWS)
●DR/backups are useless if you haven’t
tested them
11. Story #2: Complex documents,
memory and an upgrade “surprise”
●Well established ecommerce site selling
diverse goods in 20+ countries
●After switching to WiredTiger in production, performance dropped - the opposite of what they were expecting
12. {
  _id: 375,
  en_US : { name : ..., description : ..., <etc...> },
  en_GB : { name : ..., description : ..., <etc...> },
  fr_FR : { name : ..., description : ..., <etc...> },
  de_DE : ...,
  de_CH : ...,
  <... and so on for other locales... >
  inventory: 423
}
Product Catalog: Original Schema
13. Key Takeaways
●When doing a major version/storage-engine
upgrade, test in staging with some
proportion of production data/workload
●Sometimes putting everything into one document is counter-productive (see the sketch below for one alternative)
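One alternative shape (a hedged sketch, not the customer's actual fix) is to split the catalog entry into one document per product/locale pair, so a read for a single storefront only pulls a small document into the WiredTiger cache. Collection and field names here are illustrative:

// Hypothetical per-locale catalog document - "catalog", productId and locale
// are illustrative names, not the customer's schema
db.catalog.insert({
  _id: { productId: 375, locale: "en_US" },
  name: "...",
  description: "..."
  // inventory would live in one locale-independent document per product,
  // so it is not duplicated across locales
});

// A storefront read now touches exactly one small document
db.catalog.findOne({ _id: { productId: 375, locale: "en_GB" } });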
14. Story #3: Wild success uncovers the
wrong shard key
●Started out as the error “[Balancer] caught exception … tag ranges not valid for: db.coll”
●11 shards; they had added 2 new shards to keep up with traffic - 400+ databases
●Lots of code changes ahead of the Super Bowl
●Spotted slow 300+ second queries, decided to build some indexes without telling us
●Then production went down
17. Diagnosing the issues #1
●The red-herring hunt begins
●Transparent Huge Pages enabled in production
●Chaotic call - 20 people talking at once, then
in the middle of the call everything started
working again
●Barrage of tickets and calls
●Connection storms
19. Diagnosing the issues #2
●Got inconsistent and missing log files
●Discovered repeated scatter-gather queries
returning the same results
●Secondary reads
●Heavy load on some shards and low disk
space
21. Diagnosing the issues #3
● Shard key - string with year/month & customer id
{
  _id : ObjectId("4c4ba5e5e8aabf3"),
  count: 1025,
  changes: { … },
  modified : {
    date : "2015_02",
    customerId: 314159
  }
}
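A hedged reading of why this key struggles under write load (the shardCollection call below is an assumption based on the slide; the namespace is hypothetical):

// Assumed form of the original shard key declaration
sh.shardCollection("db.coll", { "modified.date": 1, "modified.customerId": 1 })

// Every new write carries the current year/month in "modified.date", so a
// whole month's inserts fall into the chunk range for that single month value
// and land on whichever one or two shards own it, while the other shards sit
// largely idle. Queries that omit the year/month prefix cannot be targeted
// and go scatter-gather to every shard.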
23. Diagnosing the issues #4
●First heard about the DDoS attack
●Missing tag ranges on some collections
●Stopped the balancer which reduced system
load from chunk moves
●Two clusters had a mongos each on the
same server
24. Fixing the issues
●Script to fix the tag ranges
●Proposed a finer-granularity shard key - but this was not possible because of 30TB of data
●Moved mongos to dedicated servers
●Re-enabled the balancer for short windows with waitForDelete and secondaryThrottle
●Put together scripts to pre-split and move empty chunks to quiet shards, based on traffic from the month before (see the sketch below)
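A minimal sketch of what the monthly routine can look like (the namespace, shard names, split points and customer-id bands are assumptions, not the customer's actual scripts):

// Throttle chunk migrations before the window using the documented balancer
// settings in the config database
var cfg = db.getSiblingDB("config");
cfg.settings.update(
  { _id: "balancer" },
  { $set: { _secondaryThrottle: true, _waitForDelete: true } },
  { upsert: true }
);

// Pre-split empty chunks for the coming month and spread them across the
// quieter shards while they are still empty - empty chunks are fast to move
var ns = "db.coll";                           // hypothetical namespace
var nextMonth = "2015_03";
var quietShards = ["shard0003", "shard0007"]; // hypothetical shard names
for (var i = 0; i < 10; i++) {
  var splitPoint = { "modified.date": nextMonth, "modified.customerId": i * 100000 };
  sh.splitAt(ns, splitPoint);
  sh.moveChunk(ns, splitPoint, quietShards[i % quietShards.length]);
}

// Re-enable the balancer only for the short window, then stop it again
sh.startBalancer();
// ... end of window ...
sh.stopBalancer();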
26. The diagnosis in retrospect
●The outage did not appear to have been related
to either the invalid tag ranges or the earlier
failed moves
●The step-downs did not help resolve the outage but did highlight some queries that needed to be fixed
●The DDoS was the ultimate cause of the outage - and led to the diagnosis of deeper issues
●The deepest issue was the shard key
27. Aftermath and lessons learned
●Signed up for a Named TSE
●Now doing pre-split and move before the
end of every month
●Check before making other changes (e.g. building new indexes)
28. Key takeaways for you
●Choosing a shard key is a pivotal decision -
make it carefully
●Understand your current bottleneck
●Monitor insert distribution and chunk ranges (see the sketch below)
●Look for slow queries (logs & mtools)
●Run mongos, mongod, and config servers on dedicated servers, or use containers/cgroups
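One way to watch chunk distribution from the mongo shell (a hedged sketch; "db.coll" is a hypothetical namespace):

// Count chunks per shard for one collection straight from the config metadata
db.getSiblingDB("config").chunks.aggregate([
  { $match: { ns: "db.coll" } },
  { $group: { _id: "$shard", chunks: { $sum: 1 } } }
]);

// sh.status() shows the same picture plus tag ranges and balancer state
sh.status();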
Some borrowed, some merged into a single narrative
Some of the people that inspired them may well be here in this room today
Bill's Bulk Updates randomly affected an ever-larger data set.
In order to cope with the database size, Bill added more shards.
The cluster scaled linearly, as intended.
Just because you can add horizontal capacity, does not mean it is the optimum solution
Imagine that the sample rate was going to go from once a minute to once every 5 seconds
Empty chunks are fast to move.
We had actually told them to check with us before choosing the shard key, but we only found out three months after they put it in production.