SlideShare une entreprise Scribd logo
1  sur  29
Télécharger pour lire hors ligne
Advanced
Replication
Internals
Design and Goals
Goals
■ Highly Available
■ Consistent Data
■ Automatic Failover
■ Multi-Region/DC
■ Dynamic Reads
Design
● All DBs, each node
● Quorum/Election
● Smart clients
● Source selection
● Read Preferences
● Record operations
● Asynchronous
● Write/Replication
acknowledgements
Goal: High Availability
● Node Redundancy: Duplicate Data
● Record Write Operations
● Apply Write Operations
● Use capped collection called "oplog"
Replication Operations
Insert
● oplog entry (fields):
○ op, o
{
"ns" : "test.gamma",
"op" : "i", "v" : 2,
"ts" : Timestamp(1350504342, 5),
"o" : { "_id" : 2, "x" : "hi"} }
Replication Operations
Update
● oplog entry (fields):
○ o = update, o2 = query
{
"ns" : "test.tags",
"op" : "u", "v" : 2,
"ts": Timestamp(1368049619, 1),
"o2" : { "_id" : 1 },
"o" : { "$set" : { "tags.4" : "e" } } }
Operation Transformation
● Idempotent (update by _id)
● Multi-update/delete (results in many ops)
● Array modifications (replacement)
Interchangeable
● All members maintain oplog + dbs
● All able to take over, or be used for same
functions
Replication Process
● Record oplog entry on write
● Idempotent entries
● Pulled by replicas
1. Read over network
2. Buffer locally
3. Apply in batch
4. Repeat
Read + Apply Decoupled
● Background oplog reader thread
● Pool of oplog applier threads (by collection)
Repl Source
Applier
Thread
Pool
16
Buffer
DB4
DB3
DB1 DB2
Local Oplog
Network
Batch
Com
plete
Replication Metrics
"network": {
"bytes": 103830503805,
"readersCreated": 2248,
"getmores": {
"totalMillis": 257461206,
"num": 2152267 },
"ops": 7285440 }
"buffer": {
"sizeBytes": 0,
"maxSizeBytes": 268435456,
"count": 0},
"preload": { "docs": {
"totalMillis":0,"num":0},
"indexes": {
"totalMillis": 23142318,
"num": 14560667 } },
"apply": {
"batches": {
"totalMillis": 231847,
"num": 1797105},
"ops": 7285440 },
"oplog": {
"insertBytes": 106866610253,
"insert": {
"totalMillis": 1756725,
"num": 7285440 } }
Good Replication States
● Initial Sync
○ Record oplog start position
○ Clone/copy all dbs
○ Set minvalid, apply oplog since start
○ Build indexes
● Replication Batch: MinValid
Goal: Consistent Data
● Single Master
● Quorum (majority)
● Ordered Oplog
Consistent Data
Why a single master?
Election Events
Election events:
● Primary failure
● Stepdown (manual)
● Reconfigure
● Quorum loss
Election Nomination
Disqualifications
A replica will nominate itself unless:
● Priority:0 or arbiter
● Not freshest
● Just stepped down (in unelectable state)
● Would be vetoed by anyone because
○ There is a Primary already
○ They don't have us in their config
○ Higher priority member out there
● Higher config version out there
The Election
Nomination:
● If it looks like a tie, sleep random time
(unless first node)
Voting:
● If all goes well, only one nominee
● All voting members vote for one nominee
● Majority of votes wins
Goal: Automatic Failover
● Single Master
● Smart Clients
● Discovery
Discovery
isMaster command:
setName: <name>,
ismaster: true, secondary: false, arbiterOnly:
hosts: [ <visible nodes> ],
passives: [ <prio:0 nodes> ],
arbiters: [ <nodes> ],
primary: <active primary>,
tags: {<tags>},
me: <me>
Failover Scenario
Client
P
S
S
Discovery (isMaster)Active Primary
Failover Scenario
Client
P
S
S
Active Primary
P
Failed Primary
Failover Scenario
Client
Failed
P
S
Discovery (isMaster)
Active Primary
Replication Source Select'n
● Select closest source
○ Limit to non-hidden or slave delayed
○ If nothing, try again with hidden/slave delayed
○ Select node with fastest "ping" time
○ Must be fresher
● Choose source when
○ Starting
○ Any error with existing source (network, query)
○ Any member is 30s ahead of current source
● Manual override
○ replSetSyncSource -- good until we choose again
Goal: Datacenter Aware
● Dynamic replication topologies
● Beachhead data center server
P
Goal: Dynamic Reads
Controls for consistency
● Default to Primary
● Non-primary allowed
● Based on
○ Locality (ping/tags)
○ Tags
Client
S
P
S
Tags: A,
B
Tags: B, C
Asynchronous Replication
● Important considerations
● Additional requirements
● System/Application controls
Write Propagation
● Write Concern
● Replication requirements
● Timing
● Dynamic requirements
Exceptional Conditions
● Multiple Primaries
● Rollback
● Too stale
Design and Goals
Goals
■ Highly Available
■ Consistent Data
■ Automatic Failover
■ Multi-Region/DC
■ Dynamic Reads
Design
● All DBs, each node
● Quorum/Election
● Smart clients
● Source selection
● Read Preferences
● Record operations
● Asynchronous
● Write/Replication
acknowledgements
Thanks
Questions?

Contenu connexe

Tendances

20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs
Computer Science Club
 
Yapcasia2011 - Hello Embed Perl
Yapcasia2011 - Hello Embed PerlYapcasia2011 - Hello Embed Perl
Yapcasia2011 - Hello Embed Perl
Hideaki Ohno
 

Tendances (18)

Java 7
Java 7Java 7
Java 7
 
Request process
Request processRequest process
Request process
 
20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs
 
Swug July 2010 - windows debugging by sainath
Swug July 2010 - windows debugging by sainathSwug July 2010 - windows debugging by sainath
Swug July 2010 - windows debugging by sainath
 
Node js lecture
Node js lectureNode js lecture
Node js lecture
 
Triton and symbolic execution on gdb
Triton and symbolic execution on gdbTriton and symbolic execution on gdb
Triton and symbolic execution on gdb
 
Ejemplos Programas Descompilados
Ejemplos Programas DescompiladosEjemplos Programas Descompilados
Ejemplos Programas Descompilados
 
Asynchronous single page applications without a line of HTML or Javascript, o...
Asynchronous single page applications without a line of HTML or Javascript, o...Asynchronous single page applications without a line of HTML or Javascript, o...
Asynchronous single page applications without a line of HTML or Javascript, o...
 
Yapcasia2011 - Hello Embed Perl
Yapcasia2011 - Hello Embed PerlYapcasia2011 - Hello Embed Perl
Yapcasia2011 - Hello Embed Perl
 
A CTF Hackers Toolbox
A CTF Hackers ToolboxA CTF Hackers Toolbox
A CTF Hackers Toolbox
 
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
 
DConf 2016: Keynote by Walter Bright
DConf 2016: Keynote by Walter Bright DConf 2016: Keynote by Walter Bright
DConf 2016: Keynote by Walter Bright
 
OSMC 2014: Monitoring VoIP Systems | Sebastian Damm
OSMC 2014: Monitoring VoIP Systems | Sebastian DammOSMC 2014: Monitoring VoIP Systems | Sebastian Damm
OSMC 2014: Monitoring VoIP Systems | Sebastian Damm
 
Hachiojipm11
Hachiojipm11Hachiojipm11
Hachiojipm11
 
TestR: generating unit tests for R internals
TestR: generating unit tests for R internalsTestR: generating unit tests for R internals
TestR: generating unit tests for R internals
 
[DSC] Introduction to Binary Exploitation
[DSC] Introduction to Binary Exploitation[DSC] Introduction to Binary Exploitation
[DSC] Introduction to Binary Exploitation
 
Compare mysql5.1.50 mysql5.5.8
Compare mysql5.1.50 mysql5.5.8Compare mysql5.1.50 mysql5.5.8
Compare mysql5.1.50 mysql5.5.8
 
Mysql handle socket
Mysql handle socketMysql handle socket
Mysql handle socket
 

Similaire à Advanced Replication Internals

Advanced Replication Internals
Advanced Replication InternalsAdvanced Replication Internals
Advanced Replication Internals
MongoDB
 
Replication using PostgreSQL Replicator
Replication using PostgreSQL ReplicatorReplication using PostgreSQL Replicator
Replication using PostgreSQL Replicator
Command Prompt., Inc
 
"Sharding - patterns & antipatterns". Доклад Алексея Рыбака (Badoo) и Констан...
"Sharding - patterns & antipatterns". Доклад Алексея Рыбака (Badoo) и Констан..."Sharding - patterns & antipatterns". Доклад Алексея Рыбака (Badoo) и Констан...
"Sharding - patterns & antipatterns". Доклад Алексея Рыбака (Badoo) и Констан...
Badoo Development
 
Fast dynamic analysis, Kostya Serebryany
Fast dynamic analysis, Kostya SerebryanyFast dynamic analysis, Kostya Serebryany
Fast dynamic analysis, Kostya Serebryany
yaevents
 
Finding Xori: Malware Analysis Triage with Automated Disassembly
Finding Xori: Malware Analysis Triage with Automated DisassemblyFinding Xori: Malware Analysis Triage with Automated Disassembly
Finding Xori: Malware Analysis Triage with Automated Disassembly
Priyanka Aash
 

Similaire à Advanced Replication Internals (20)

Advanced Replication Internals
Advanced Replication InternalsAdvanced Replication Internals
Advanced Replication Internals
 
Jvm profiling under the hood
Jvm profiling under the hoodJvm profiling under the hood
Jvm profiling under the hood
 
Learn JavaScript From Scratch
Learn JavaScript From ScratchLearn JavaScript From Scratch
Learn JavaScript From Scratch
 
Sharding: patterns and antipatterns (Osipov, Rybak, HighLoad'2014)
Sharding: patterns and antipatterns (Osipov, Rybak, HighLoad'2014)Sharding: patterns and antipatterns (Osipov, Rybak, HighLoad'2014)
Sharding: patterns and antipatterns (Osipov, Rybak, HighLoad'2014)
 
Go replicator
Go replicatorGo replicator
Go replicator
 
Replication using PostgreSQL Replicator
Replication using PostgreSQL ReplicatorReplication using PostgreSQL Replicator
Replication using PostgreSQL Replicator
 
"Sharding - patterns & antipatterns". Доклад Алексея Рыбака (Badoo) и Констан...
"Sharding - patterns & antipatterns". Доклад Алексея Рыбака (Badoo) и Констан..."Sharding - patterns & antipatterns". Доклад Алексея Рыбака (Badoo) и Констан...
"Sharding - patterns & antipatterns". Доклад Алексея Рыбака (Badoo) и Констан...
 
Sharding - patterns & antipatterns, Константин Осипов, Алексей Рыбак
Sharding -  patterns & antipatterns, Константин Осипов, Алексей РыбакSharding -  patterns & antipatterns, Константин Осипов, Алексей Рыбак
Sharding - patterns & antipatterns, Константин Осипов, Алексей Рыбак
 
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
 
Hidden in Plain Sight: DUAL_EC_DRBG 'n stuff
Hidden in Plain Sight: DUAL_EC_DRBG 'n stuffHidden in Plain Sight: DUAL_EC_DRBG 'n stuff
Hidden in Plain Sight: DUAL_EC_DRBG 'n stuff
 
Replication MongoDB Days 2013
Replication MongoDB Days 2013Replication MongoDB Days 2013
Replication MongoDB Days 2013
 
Microservice Architecture - Beyond HTTP
Microservice Architecture - Beyond HTTPMicroservice Architecture - Beyond HTTP
Microservice Architecture - Beyond HTTP
 
Fast dynamic analysis, Kostya Serebryany
Fast dynamic analysis, Kostya SerebryanyFast dynamic analysis, Kostya Serebryany
Fast dynamic analysis, Kostya Serebryany
 
Константин Серебряный "Быстрый динамичекский анализ программ на примере поиск...
Константин Серебряный "Быстрый динамичекский анализ программ на примере поиск...Константин Серебряный "Быстрый динамичекский анализ программ на примере поиск...
Константин Серебряный "Быстрый динамичекский анализ программ на примере поиск...
 
Finding Xori: Malware Analysis Triage with Automated Disassembly
Finding Xori: Malware Analysis Triage with Automated DisassemblyFinding Xori: Malware Analysis Triage with Automated Disassembly
Finding Xori: Malware Analysis Triage with Automated Disassembly
 
Caching in (DevoxxUK 2013)
Caching in (DevoxxUK 2013)Caching in (DevoxxUK 2013)
Caching in (DevoxxUK 2013)
 
Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @Lendingkart
 
Hyperparameter optimization landscape Berlin ML Group meetup 8/2019
Hyperparameter optimization landscape Berlin ML Group meetup 8/2019Hyperparameter optimization landscape Berlin ML Group meetup 8/2019
Hyperparameter optimization landscape Berlin ML Group meetup 8/2019
 
Streaming huge databases using logical decoding
Streaming huge databases using logical decodingStreaming huge databases using logical decoding
Streaming huge databases using logical decoding
 
Implementation and analysis of search algorithms in single player connect fou...
Implementation and analysis of search algorithms in single player connect fou...Implementation and analysis of search algorithms in single player connect fou...
Implementation and analysis of search algorithms in single player connect fou...
 

Plus de Scott Hernandez

Mongo sf easy java persistence
Mongo sf   easy java persistenceMongo sf   easy java persistence
Mongo sf easy java persistence
Scott Hernandez
 
MongoDB Aug2010 SF Meetup
MongoDB Aug2010 SF MeetupMongoDB Aug2010 SF Meetup
MongoDB Aug2010 SF Meetup
Scott Hernandez
 
MongoDB: tips, trick and hacks
MongoDB: tips, trick and hacksMongoDB: tips, trick and hacks
MongoDB: tips, trick and hacks
Scott Hernandez
 
Mastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript ShellMastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript Shell
Scott Hernandez
 
Java Development with MongoDB
Java Development with MongoDBJava Development with MongoDB
Java Development with MongoDB
Scott Hernandez
 

Plus de Scott Hernandez (13)

Realtime Analytics with MongoDB Counters (mongonyc 2012)
Realtime Analytics with MongoDB Counters (mongonyc 2012)Realtime Analytics with MongoDB Counters (mongonyc 2012)
Realtime Analytics with MongoDB Counters (mongonyc 2012)
 
MongoDB Operational Best Practices (mongosf2012)
MongoDB Operational Best Practices (mongosf2012)MongoDB Operational Best Practices (mongosf2012)
MongoDB Operational Best Practices (mongosf2012)
 
MongoDB Datacenter Awareness (mongosf2012)
MongoDB Datacenter Awareness (mongosf2012)MongoDB Datacenter Awareness (mongosf2012)
MongoDB Datacenter Awareness (mongosf2012)
 
Mongo sf easy java persistence
Mongo sf   easy java persistenceMongo sf   easy java persistence
Mongo sf easy java persistence
 
MongoDB: Easy Java Persistence with Morphia
MongoDB: Easy Java Persistence with MorphiaMongoDB: Easy Java Persistence with Morphia
MongoDB: Easy Java Persistence with Morphia
 
MongoDB: Mastering the shell
MongoDB: Mastering the shellMongoDB: Mastering the shell
MongoDB: Mastering the shell
 
MongoDB: Backup, Restore, and DR
MongoDB: Backup, Restore, and DRMongoDB: Backup, Restore, and DR
MongoDB: Backup, Restore, and DR
 
A Brief MongoDB Intro
A Brief MongoDB IntroA Brief MongoDB Intro
A Brief MongoDB Intro
 
What's new in the MongoDB Java Driver (2.5)?
What's new in the MongoDB Java Driver (2.5)?What's new in the MongoDB Java Driver (2.5)?
What's new in the MongoDB Java Driver (2.5)?
 
MongoDB Aug2010 SF Meetup
MongoDB Aug2010 SF MeetupMongoDB Aug2010 SF Meetup
MongoDB Aug2010 SF Meetup
 
MongoDB: tips, trick and hacks
MongoDB: tips, trick and hacksMongoDB: tips, trick and hacks
MongoDB: tips, trick and hacks
 
Mastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript ShellMastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript Shell
 
Java Development with MongoDB
Java Development with MongoDBJava Development with MongoDB
Java Development with MongoDB
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Dernier (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 

Advanced Replication Internals