2. Notes to the presenter
Themes for this presentation:
• Balance between cost and redundancy.
• Cover the many scenarios that replication solves, and why.
• The secret sauce for avoiding downtime & data loss.
• If time is short, skip the 'Behind the curtain' section.
• If time is very short, also skip the 'Operational Considerations' section.
4. Why Replication?
• How many have faced node failures?
• How many have been woken up to perform a failover?
• How many have experienced issues due to network latency?
• Different uses for data
– Normal processing
– Simple analytics
21. Client Application
[Diagram: the client application's driver sends writes to the primary; reads go to the secondaries, which offer delayed (eventual) consistency.]
22. Write Concern
• Network acknowledgement
• Wait for error
• Wait for journal sync
• Wait for replication
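These levels can be requested per operation via getLastError on the same connection as the write. A minimal sketch from the mongo shell, assuming a running replica set (the collection and field names are illustrative):

```javascript
// Fire-and-forget insert; the acknowledgement level is chosen
// afterwards by getLastError on the same connection.
db.orders.insert({ sku: "abc123", qty: 1 });

// Wait for error: acknowledged by the primary after applying in memory.
db.runCommand({ getLastError: 1 });

// Wait for journal sync: returns once the write is in the journal.
db.runCommand({ getLastError: 1, j: true });

// Wait for replication to 2 members (primary + 1 secondary),
// giving up after 5 seconds.
db.runCommand({ getLastError: 1, w: 2, wtimeout: 5000 });
```
These are operational commands against a live deployment; stronger levels trade write latency for durability.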
23. Unacknowledged
[Diagram: the driver sends a write to the primary, which applies it in memory; no acknowledgement is returned to the driver.]
24. MongoDB Acknowledged (wait for error)
[Diagram: the driver sends a write followed by getLastError; the primary applies the write in memory and reports back any error.]
25. Wait for Journal Sync
[Diagram: the driver sends a write followed by getLastError with j:true; the primary applies the write in memory, writes it to the journal, and only then acknowledges.]
26. Wait for Replication
[Diagram: the driver sends a write followed by getLastError with w:2; the primary applies the write in memory and acknowledges only after a secondary has replicated it.]
27. Tagging
• Control where data is written to, and read from
• Each member can have one or more tags
– tags: {dc: "ny"}
– tags: {dc: "ny"}
– tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}
• Replica set defines rules for write concerns
• Rules can change without changing app code
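As a sketch (member indexes, tag values, and the allDCs rule name are illustrative), tags and custom write-concern rules both live in the replica set configuration, so they can change without touching application code:

```javascript
// Run against the primary in the mongo shell.
cfg = rs.conf();
cfg.members[0].tags = { dc: "sf" };
cfg.members[1].tags = { dc: "ny" };
cfg.members[2].tags = { dc: "cloud" };
// Custom rule: "allDCs" is satisfied once the write has reached
// members carrying 3 distinct values of the "dc" tag.
cfg.settings = cfg.settings || {};
cfg.settings.getLastErrorModes = { allDCs: { dc: 3 } };
rs.reconfig(cfg);

// Writers now use the rule by name, without knowing the topology:
db.runCommand({ getLastError: 1, w: "allDCs" });
```
This is a configuration fragment; rs.reconfig() must run on the primary and may trigger a brief election.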
29. Wait for Replication (Tagging)
[Diagram: the driver sends a write followed by getLastError with w:allDCs; the primary (SF) applies the write in memory and acknowledges only after the write has replicated to both the NY and Cloud secondaries.]
30. Read Preference Modes
• 5 modes
– primary (only) - Default
– primaryPreferred
– secondary
– secondaryPreferred
– nearest
When more than one node is eligible, the closest node is used for
reads (all modes except primary)
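In the mongo shell a mode can be set per query with cursor.readPref(); a minimal sketch against a running replica set (collection and filter are illustrative):

```javascript
// Route this query to a secondary if one is available,
// otherwise fall back to the primary.
db.orders.find({ status: "open" }).readPref("secondaryPreferred");
```
Drivers expose the same five modes at the connection, database, or query level.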
31. Tagged Read Preference
• Custom read preferences
• Control where you read from by (node) tags
– E.g. { "disk": "ssd", "use": "reporting" }
• Use in conjunction with standard read preferences
– Except primary
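Combining a mode with tag sets might look like this (tag values are the illustrative ones from the slide):

```javascript
// Read from a secondary that carries the given tags.
// Tag sets are tried in order; the empty document {} acts as a
// "match any member" fallback if no tagged member is available.
db.orders.find().readPref("secondary", [
  { disk: "ssd", use: "reporting" },
  {}
]);
```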
33. Maintenance and Upgrade
• No downtime
• Rolling upgrade/maintenance
– Start with Secondary
– Primary last
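A rolling upgrade might be driven from the shell roughly like this (host names and paths are illustrative; this is a sketch, not a complete runbook):

```shell
# For each secondary in turn: shut it down cleanly, upgrade the
# mongod binary, restart it, and wait until it reports SECONDARY
# state again before moving to the next member.
mongo --host member2.example.com/admin --eval 'db.shutdownServer()'
# ... install the new mongod binary, restart, wait for catch-up ...

# Upgrade the primary last: ask it to step down so an election
# promotes an already-upgraded secondary, then upgrade it too.
mongo --host member1.example.com --eval 'rs.stepDown(60)'
```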
34. Replica Set – 1 Data Center
[Diagram: Members 1, 2, and 3 in a single data center.]
• Single data center
• Single switch & power
• Points of failure:
– Power
– Network
– Data center
– Two-node failure
• Automatic recovery from a single-node crash
35. Replica Set – 2 Data Centers
[Diagram: Members 1 and 2 in Datacenter 1; Member 3 in Datacenter 2.]
• Multi data center
• DR node for safety
• Can't safely do a durable multi-data-center write, since there is only 1 node in the distant DC
36. Replica Set – 3 Data Centers
[Diagram: Members 1 and 2 in Datacenter 1; Members 3 and 4 in Datacenter 2; Member 5 in Datacenter 3.]
• Three data centers
• Can survive full data center loss
• Can do w = { dc : 2 } (with tags) to guarantee the write reaches 2 data centers
38. Implementation details
• Heartbeat every 2 seconds
– Times out in 10 seconds
• Local DB (not replicated)
– system.replset
– oplog.rs
• Capped collection
• Idempotent version of operation stored
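These internals are visible from the shell by inspecting the local database on any member (read-only exploration; safe to run):

```javascript
// The replica set config and the oplog live in the local DB,
// which is never replicated.
use local
db.system.replset.findOne();       // current replica set configuration

// Most recent oplog entry (idempotent form of the last operation):
db.oplog.rs.find().sort({ $natural: -1 }).limit(1);

// The oplog is a capped collection of fixed size:
db.oplog.rs.stats().capped;
```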
41. Recent improvements
• Read preference support with sharding
– Drivers too
• Improved replication over WAN/high-latency
networks
• rs.syncFrom command
• buildIndexes setting
• replIndexPrefetch setting
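Two of these features from the shell, as a sketch (host name and member index are illustrative):

```javascript
// Point this member's replication source at a specific member,
// e.g. to pull over a faster or cheaper network link:
rs.syncFrom("member3.example.com:27017");

// A backup member that skips building secondary indexes; such a
// member must have priority 0 so it can never become primary.
cfg = rs.conf();
cfg.members[4].buildIndexes = false;
cfg.members[4].priority = 0;
rs.reconfig(cfg);
```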
42. Just Use It
• Use replica sets
• Easy to setup
– Try on a single machine
• Check doc page for RS tutorials
– http://docs.mongodb.org/manual/replication/#tutorials
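A single-machine trial set can be brought up in a few commands (ports, paths, and the set name "demo" are illustrative):

```shell
# Three mongod processes on one machine, forming one replica set.
mkdir -p /data/rs1 /data/rs2 /data/rs3
mongod --replSet demo --port 27017 --dbpath /data/rs1 --fork --logpath /data/rs1.log
mongod --replSet demo --port 27018 --dbpath /data/rs2 --fork --logpath /data/rs2.log
mongod --replSet demo --port 27019 --dbpath /data/rs3 --fork --logpath /data/rs3.log

# Initiate the set from any one of the three.
mongo --port 27017 --eval 'rs.initiate({_id: "demo", members: [
  {_id: 0, host: "localhost:27017"},
  {_id: 1, host: "localhost:27018"},
  {_id: 2, host: "localhost:27019"}]})'
```
Check progress with rs.status(); one member becomes primary after the initial election.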