Distributed Systems + NodeJS
Bruno Bossola
MILAN 25-26 NOVEMBER 2016
@bbossola
Whoami
● Developer since 1988
● XP Coach 2000+
● Co-founder of JUG Torino
● Java Champion since 2005
● CTO @ EF (Education First)
I live in London, love the weather...
Agenda
● Distributed programming
● How does it work, what does it mean
● The CAP theorem
● CAP explained with code
– CA system using two phase commit
– AP system using sloppy quorums
– CP system using majority quorums
● What next?
● Q&A
Distributed programming
● Do we need it?
Distributed programming
● Any system should deal with two tasks:
– Storage
– Computation
● How do we deal with scale?
● How do we use multiple computers to do what we used to
do on one?
What do we want to achieve?
● Scalability
● Availability
● Consistency
Scalability
● The ability of a system/network/process to:
– handle a growing amount of work
– be enlarged to accommodate new growth
A scalable system continues to meet the needs of its users as the
scale increases
clipart courtesy of openclipart.org
Scalability flavours
● size:
– more nodes, more speed
– more nodes, more space
– more data, same latency
● geographic:
– more data centers, quicker response
● administrative:
– more machines, no additional work
How do we scale? partitioning
● Slice the dataset into smaller independent sets
● reduces the impact of dataset growth
– improves performance by limiting the amount of data to
be examined
– improves availability, because partitions can fail
independently
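A minimal sketch of the slicing idea, assuming hash-based partitioning (the slides do not say how the dataset is sliced). The names `hashKey`, `partitionFor` and the partition count of 4 are hypothetical, chosen only for illustration:

```javascript
// Hypothetical sketch: route each key to one of N independent partitions.
// The hash below is a simple illustrative string hash, not a production choice.
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit integer
  }
  return h;
}

function partitionFor(key, partitionCount) {
  return hashKey(key) % partitionCount;
}

// Each partition is an independent Map, so a lookup examines one slice only.
const partitions = Array.from({ length: 4 }, () => new Map());

function set(key, value) {
  partitions[partitionFor(key, partitions.length)].set(key, value);
}

function get(key) {
  return partitions[partitionFor(key, partitions.length)].get(key);
}

set('alice', 1);
set('bob', 2);
console.log(get('alice'), get('bob'));
```

Because every key maps to exactly one partition, losing one partition makes only its own keys unavailable.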
How do we scale? partitioning
● But it can also be a source of problems
– what happens if a partition becomes unavailable?
– what if it becomes slower?
– what if it becomes unresponsive?
How do we scale? replication
● Copies of the same data on multiple machines
● Benefits:
– allows more servers to take part in the computation
– improves performance by making additional computing
power and bandwidth available
– improves availability by creating additional copies of the data
How do we scale? replication
● But it's also a source of problems
– there are independent copies of the data
– they need to be kept in sync across multiple machines
● Your system must follow a consistency model
[Figure: replicas of the same item holding different versions: v4, v5, v7, v8]
Availability
● The proportion of time a system is in a functioning condition
● The system is fault-tolerant
– the ability of your system to behave in a well defined
manner once a fault occurs
● All clients can always read and write
– In distributed systems this
is achieved by redundancy
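As a worked illustration of why redundancy helps, here is a small sketch. It assumes that replicas fail independently of each other, an idealization that the slides do not state. Under that assumption, if one node is available a fraction a of the time, at least one of n replicas is available 1 - (1 - a)^n of the time. The function names are hypothetical:

```javascript
// Availability as a proportion of time, and the effect of redundancy.
// Assumption: replicas fail independently of each other.
function availability(uptimeHours, downtimeHours) {
  return uptimeHours / (uptimeHours + downtimeHours);
}

function withRedundancy(singleNodeAvailability, replicas) {
  // The system is down only when every replica is down at the same time.
  return 1 - Math.pow(1 - singleNodeAvailability, replicas);
}

const single = availability(99, 1); // one node, up 99 hours out of 100
console.log(single);
console.log(withRedundancy(single, 3)); // three replicas of that node
```

Correlated failures (a shared rack, a shared network link) make the real figure lower than this estimate.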
Introducing: performance
● The amount of useful work accomplished compared to the
time and resources used
● Basically:
– short response time for a unit of work
– high rate of processing
– low utilization of resources
Introducing: latency
● The period between the initiation of something and its
occurrence
● The time between the moment something happens and the moment it
has an impact or becomes visible
● Some high-level examples:
– how long until you become a zombie
after a bite?
– how long until my post is visible
to others?
clipart courtesy of cliparts.co
Consistency
● Any read on a data item X returns a value corresponding
to the result of the most recent write on X.
● Each client always has the same view of the data
● Also known as “Strong Consistency”
Consistency flavours
● Strong consistency
– every replica sees every update in the same order.
– no two replicas may have different values at the same time.
● Weak consistency
– every replica will see every update, but possibly in different
orders.
● Eventual consistency
– every replica will eventually see every update and will
eventually agree on all values.
The CAP theorem
CONSISTENCY, AVAILABILITY, PARTITION TOLERANCE
The CAP theorem
● You cannot have all three :(
● You can select at most two
properties at once
Sorry, this has been mathematically proven and no, it has not been debunked.
The CAP theorem
CA systems!
● You selected consistency
and availability!
● Strict quorum protocols
(two/multi phase commit)
● Most RDBMS
Hey! A network partition will
f**k you up good!
The CAP theorem
AP systems!
● You selected availability
and partition tolerance!
● Sloppy quorums and
conflict resolution protocols
● Amazon Dynamo, Riak,
Cassandra
The CAP theorem
CP systems!
● You selected consistency
and partition tolerance!
● Majority quorum protocols
(paxos, raft, zab)
● Apache Zookeeper,
Google Spanner
NodeJS time!
● Let's write our brand new key value store
● We will code all three different flavours
● We will have many nodes, fully replicated
● No sharding
● We will kill servers!
● We will trigger network
partitions!
– (no worries. it's a simulation!)
General design
[Diagram: Node app. Components shown: API (GET (k), SET (k,v)), <proto> API, Storage, Database, <proto> Storage, and <proto> Core with its functions fX, fY, fZ, fK.]
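The layering in the general design can be sketched as follows. This is only an illustration of the idea, not the code from the talk: the class names `Storage`, `LocalCore` and `Api` are hypothetical, and the "protocol" here is a trivial single-node one that simply forwards to local storage.

```javascript
// Hypothetical layering: API -> Core (protocol) -> Storage.
class Storage {
  constructor() {
    this.db = new Map(); // stands in for the database
  }
  read(key) {
    return this.db.get(key);
  }
  write(key, value) {
    this.db.set(key, value);
  }
}

// A trivial single-node "protocol": it forwards every call to local storage.
// A 2PC, quorum or Raft core would expose the same two methods.
class LocalCore {
  constructor(storage) {
    this.storage = storage;
  }
  get(key) {
    return this.storage.read(key);
  }
  set(key, value) {
    this.storage.write(key, value);
    return true;
  }
}

// The API layer only knows about GET(k) and SET(k, v).
class Api {
  constructor(core) {
    this.core = core;
  }
  get(key) {
    return this.core.get(key);
  }
  set(key, value) {
    return this.core.set(key, value);
  }
}

const api = new Api(new LocalCore(new Storage()));
api.set('k', 'v');
console.log(api.get('k'));
```

Keeping the API and storage fixed while swapping the core is what lets one key-value store be built in three flavours.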
CA key-value store
● Uses classic two-phase commit
● Works like a local system
● Not partition tolerant
CA: two phase commit, simplified
[Diagram: Node app. Components shown: API (GET (k), SET (k,v)), 2PC API, Storage, Database, and 2PC Core with the operations propose (tx), commit (tx) and rollback (tx).]
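The propose / commit / rollback flow can be sketched as below. This is a simplified in-memory illustration, not the code from the talk: `Participant`, `twoPhaseSet` and the `reachable` flag (used to simulate a partition) are hypothetical names, and a real node would exchange these messages over the network.

```javascript
// Hypothetical in-memory participants; a real node would talk over the network.
class Participant {
  constructor(name) {
    this.name = name;
    this.data = new Map();
    this.pending = new Map(); // txId -> staged write
    this.reachable = true;    // set to false to simulate a network partition
  }
  propose(tx) {
    if (!this.reachable) return false; // a missing answer counts as a "no"
    this.pending.set(tx.id, tx);
    return true;
  }
  commit(tx) {
    const staged = this.pending.get(tx.id);
    if (staged) this.data.set(staged.key, staged.value);
    this.pending.delete(tx.id);
  }
  rollback(tx) {
    this.pending.delete(tx.id);
  }
}

let nextTxId = 1;

// Phase 1: every node must accept the proposal.
// Phase 2: commit everywhere, or roll back everywhere.
function twoPhaseSet(nodes, key, value) {
  const tx = { id: nextTxId++, key, value };
  const votes = nodes.map((node) => node.propose(tx));
  const allAccepted = votes.every(Boolean);
  for (const node of nodes) {
    if (allAccepted) node.commit(tx);
    else node.rollback(tx);
  }
  return allAccepted;
}

const nodes = [new Participant('a'), new Participant('b'), new Participant('c')];
console.log(twoPhaseSet(nodes, 'x', 1)); // all nodes reachable
nodes[2].reachable = false;              // simulate a partition
console.log(twoPhaseSet(nodes, 'x', 2)); // one node cannot vote
console.log(nodes[0].data.get('x'));     // value seen by node a
```

With one node cut off, the second write is refused on every node, which is why this flavour stays consistent but is not partition tolerant.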
AP key-value store
● Eventually consistent design
● Prioritizes availability over consistency
AP: sloppy quorums, simplified
[Diagram: Node app. Components shown: API (GET (k), SET (k,v)), QUORUM API, Storage, Database, and QUORUM Core with the operations (read), (repair), propose (tx), commit (tx) and rollback (tx).]
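The read and repair steps can be illustrated with a heavily simplified sketch. Several things here are assumptions rather than the talk's design: writes carry an explicit version number, conflicts are resolved by keeping the highest version, and the hinted handoff used by Dynamo-style sloppy quorums is left out. `Replica`, `quorumSet` and `quorumGet` are hypothetical names.

```javascript
// Hypothetical replica: an in-memory store plus a flag to simulate partitions.
class Replica {
  constructor() {
    this.store = new Map(); // key -> { value, version }
    this.reachable = true;
  }
}

function reachableReplicas(replicas) {
  return replicas.filter((replica) => replica.reachable);
}

// Write to whichever replicas are reachable; succeed if at least w of them exist.
function quorumSet(replicas, key, value, version, w) {
  const targets = reachableReplicas(replicas);
  if (targets.length < w) return false;
  for (const replica of targets) {
    const current = replica.store.get(key);
    if (!current || current.version < version) {
      replica.store.set(key, { value, version });
    }
  }
  return true;
}

// Read from reachable replicas, keep the newest version, repair stale copies.
function quorumGet(replicas, key, r) {
  const targets = reachableReplicas(replicas);
  if (targets.length < r) return undefined;
  let newest;
  for (const replica of targets) {
    const entry = replica.store.get(key);
    if (entry && (!newest || entry.version > newest.version)) newest = entry;
  }
  if (!newest) return undefined;
  for (const replica of targets) {
    const entry = replica.store.get(key);
    if (!entry || entry.version < newest.version) {
      replica.store.set(key, { ...newest }); // read repair
    }
  }
  return newest.value;
}

const replicas = [new Replica(), new Replica(), new Replica()];
quorumSet(replicas, 'x', 1, 1, 2);
replicas[2].reachable = false;       // replica 2 misses the next write
quorumSet(replicas, 'x', 2, 2, 2);
replicas[2].reachable = true;        // it comes back holding a stale value
console.log(quorumGet(replicas, 'x', 2));
```

The write during the partition still succeeds, so the store stays available; the stale replica only converges later, when a read repairs it.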
CP key-value store
● Uses majority quorum (raft)
● Guarantees consistency for committed writes; replicas that lag behind catch up eventually
CP: majority quorums (raft, simplified)
[Diagram: Node app. Components shown: API (GET (k), SET (k,v)), RAFT API, Storage, Database, and RAFT Core with the operations beat, voteme and history. Note on the diagram: "Urgently needs refactoring!!!!"]
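The majority rule behind a vote request can be sketched as follows. This is a simplification under stated assumptions, not the talk's code and not a full Raft implementation: real Raft also compares log freshness before granting a vote, and the heartbeat and history parts are omitted. `RaftNode`, `hasMajority` and `runElection` are hypothetical names.

```javascript
// A strict majority of the whole cluster, counting unreachable nodes too.
function hasMajority(acks, clusterSize) {
  return acks > Math.floor(clusterSize / 2);
}

class RaftNode {
  constructor(id) {
    this.id = id;
    this.term = 0;
    this.votedFor = null;
    this.reachable = true; // set to false to simulate a partition
  }
  // Grant at most one vote per term.
  voteme(candidateId, term) {
    if (!this.reachable) return false;
    if (term > this.term) {
      this.term = term;
      this.votedFor = null;
    }
    if (term === this.term && (this.votedFor === null || this.votedFor === candidateId)) {
      this.votedFor = candidateId;
      return true;
    }
    return false;
  }
}

// The candidate is part of the cluster and votes for itself through voteme.
function runElection(candidate, cluster) {
  const term = candidate.term + 1;
  let votes = 0;
  for (const node of cluster) {
    if (node.voteme(candidate.id, term)) votes += 1;
  }
  return hasMajority(votes, cluster.length);
}

const cluster = [0, 1, 2, 3, 4].map((id) => new RaftNode(id));
cluster[3].reachable = false;
cluster[4].reachable = false;
console.log(runElection(cluster[0], cluster)); // 3 of 5 nodes can vote
cluster[2].reachable = false;
console.log(runElection(cluster[1], cluster)); // only 2 of 5 nodes can vote
```

The side of a partition that holds fewer than half of the nodes can never elect a leader, so it refuses writes instead of diverging.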
What about BASE?
● It's just a way to qualify eventually consistent systems
● BAsic Availability
– The database appears to work most of the time.
● Soft-state
– Stores don’t have to be write-consistent, nor do different
replicas have to be mutually consistent all the time.
● Eventual consistency
– Stores exhibit consistency at some later point (e.g.,
lazily at read time).
What about Lamport clocks?
● It's a mechanism to maintain a distributed notion of time
● Each process maintains a counter
– Whenever a process does work, increment the counter
– Whenever a process sends a message, include the
counter
– When a message is received, set the counter to
max(local_counter, received_counter) + 1
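These three rules fit in a few lines. A minimal sketch, assuming that sending a message counts as work and therefore increments the counter before it is attached; the class name `LamportClock` and its method names are hypothetical:

```javascript
class LamportClock {
  constructor() {
    this.counter = 0;
  }
  // Local work: increment the counter.
  tick() {
    this.counter += 1;
    return this.counter;
  }
  // Sending is treated as work; the returned value travels with the message.
  send() {
    return this.tick();
  }
  // On receipt: max(local_counter, received_counter) + 1.
  receive(receivedCounter) {
    this.counter = Math.max(this.counter, receivedCounter) + 1;
    return this.counter;
  }
}

const a = new LamportClock();
const b = new LamportClock();
a.tick();                        // a does some work
const stamp = a.send();          // a sends a message carrying its counter
console.log(b.receive(stamp));   // b's counter jumps past the received value
```

The receiver's counter always ends up higher than the sender's stamp, so a message is never timestamped before the event that caused it.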
What about Vector clocks?
● Maintains an array of N Lamport clocks, one per node
● Whenever a process does work, increment the logical clock
value of the node in the vector
● Whenever a process sends a message, include the full vector
● When a message is received:
– update each element in the vector to max(local, received)
– increment the logical clock of the current node in the vector
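The rules above can be sketched as follows, assuming a fixed cluster size known in advance and that sending counts as work. The class `VectorClock` is a hypothetical name, and `happenedBefore` is an addition not on the slide: it is the usual element-wise comparison used to tell ordered events from concurrent ones.

```javascript
class VectorClock {
  constructor(nodeIndex, size) {
    this.nodeIndex = nodeIndex;            // position of this node in the vector
    this.vector = new Array(size).fill(0);
  }
  // Local work: increment this node's own entry.
  tick() {
    this.vector[this.nodeIndex] += 1;
  }
  // Sending is treated as work; a copy of the full vector travels with the message.
  send() {
    this.tick();
    return [...this.vector];
  }
  // On receipt: element-wise max, then increment this node's own entry.
  receive(received) {
    this.vector = this.vector.map((local, i) => Math.max(local, received[i]));
    this.tick();
  }
}

// a happened before b if a <= b element-wise and a differs from b somewhere.
function happenedBefore(a, b) {
  return a.every((v, i) => v <= b[i]) && a.some((v, i) => v < b[i]);
}

const n0 = new VectorClock(0, 2);
const n1 = new VectorClock(1, 2);
const message = n0.send();   // n0 sends its vector
n1.receive(message);         // n1 merges it and ticks its own entry
n0.tick();                   // n0 does unrelated work afterwards
console.log(happenedBefore(message, n1.vector));
console.log(happenedBefore(n0.vector, n1.vector), happenedBefore(n1.vector, n0.vector));
```

When neither vector is before the other, the two events are concurrent, which a single Lamport counter cannot detect.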
What next?
● Learn the lingo and the basics
● Do your homework
● Start playing with these concepts
● It's complicated, but not rocket science
● Be inspired!
Q&A
Amazon Dynamo:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
The RAFT consensus algorithm:
https://raft.github.io/
http://thesecretlivesofdata.com/raft/
The code used in this presentation:
https://github.com/bbossola/sysdist
clipart courtesy of cliparts.co

More Related Content

What's hot

How to make your ruby code faster with multithreading
How to make your ruby code faster with multithreadingHow to make your ruby code faster with multithreading
How to make your ruby code faster with multithreadingSun-Li Beatteay
 
Adopting language server for apache camel feedback from a java/Eclipse plugi...
Adopting language server for apache camel  feedback from a java/Eclipse plugi...Adopting language server for apache camel  feedback from a java/Eclipse plugi...
Adopting language server for apache camel feedback from a java/Eclipse plugi...Aurélien Pupier
 
Powerlang: a Vehicle for Lively Implementing Programming Languages
Powerlang: a Vehicle for Lively Implementing Programming LanguagesPowerlang: a Vehicle for Lively Implementing Programming Languages
Powerlang: a Vehicle for Lively Implementing Programming LanguagesESUG
 
The Professional Programmer
The Professional ProgrammerThe Professional Programmer
The Professional ProgrammerDave Cross
 
Elephant Carpaccio
Elephant CarpaccioElephant Carpaccio
Elephant CarpaccioLars Thorup
 
Deploying your SaaS stack OnPrem
Deploying your SaaS stack OnPremDeploying your SaaS stack OnPrem
Deploying your SaaS stack OnPremKris Buytaert
 
June 2014 - Building Rabbit MQ based chat on Android
June 2014 - Building Rabbit MQ based chat on AndroidJune 2014 - Building Rabbit MQ based chat on Android
June 2014 - Building Rabbit MQ based chat on AndroidBlrDroid
 
Custom angular libraries
Custom angular librariesCustom angular libraries
Custom angular librariesMattVaughn9
 
Stockholm JAM September 2018
Stockholm JAM September 2018Stockholm JAM September 2018
Stockholm JAM September 2018Andrey Devyatkin
 
Our wish to Flowtype
Our wish to FlowtypeOur wish to Flowtype
Our wish to FlowtypeTeppei Sato
 
Porting 100k Lines of Code to TypeScript
Porting 100k Lines of Code to TypeScriptPorting 100k Lines of Code to TypeScript
Porting 100k Lines of Code to TypeScriptTiny
 
Advantages and disadvantages of a monorepo
Advantages and disadvantages of a monorepoAdvantages and disadvantages of a monorepo
Advantages and disadvantages of a monorepoIanDavidson56
 
The Beam Vision for Portability: "Write once run anywhere"
The Beam Vision for Portability: "Write once run anywhere"The Beam Vision for Portability: "Write once run anywhere"
The Beam Vision for Portability: "Write once run anywhere"Knoldus Inc.
 
React web development
React web developmentReact web development
React web developmentRully Ramanda
 
TDC2016POA | Trilha DevOps - DevOps Anti-Patterns
TDC2016POA | Trilha DevOps - DevOps Anti-PatternsTDC2016POA | Trilha DevOps - DevOps Anti-Patterns
TDC2016POA | Trilha DevOps - DevOps Anti-Patternstdc-globalcode
 

What's hot (20)

How to make your ruby code faster with multithreading
How to make your ruby code faster with multithreadingHow to make your ruby code faster with multithreading
How to make your ruby code faster with multithreading
 
SGCE 2015 REST APIs
SGCE 2015 REST APIsSGCE 2015 REST APIs
SGCE 2015 REST APIs
 
Adopting language server for apache camel feedback from a java/Eclipse plugi...
Adopting language server for apache camel  feedback from a java/Eclipse plugi...Adopting language server for apache camel  feedback from a java/Eclipse plugi...
Adopting language server for apache camel feedback from a java/Eclipse plugi...
 
Powerlang: a Vehicle for Lively Implementing Programming Languages
Powerlang: a Vehicle for Lively Implementing Programming LanguagesPowerlang: a Vehicle for Lively Implementing Programming Languages
Powerlang: a Vehicle for Lively Implementing Programming Languages
 
Enterprise messaging
Enterprise messagingEnterprise messaging
Enterprise messaging
 
The Professional Programmer
The Professional ProgrammerThe Professional Programmer
The Professional Programmer
 
Elephant Carpaccio
Elephant CarpaccioElephant Carpaccio
Elephant Carpaccio
 
Deploying your SaaS stack OnPrem
Deploying your SaaS stack OnPremDeploying your SaaS stack OnPrem
Deploying your SaaS stack OnPrem
 
June 2014 - Building Rabbit MQ based chat on Android
June 2014 - Building Rabbit MQ based chat on AndroidJune 2014 - Building Rabbit MQ based chat on Android
June 2014 - Building Rabbit MQ based chat on Android
 
Monorepo at Pinterest
Monorepo at PinterestMonorepo at Pinterest
Monorepo at Pinterest
 
Custom angular libraries
Custom angular librariesCustom angular libraries
Custom angular libraries
 
Stockholm JAM September 2018
Stockholm JAM September 2018Stockholm JAM September 2018
Stockholm JAM September 2018
 
Our wish to Flowtype
Our wish to FlowtypeOur wish to Flowtype
Our wish to Flowtype
 
Porting 100k Lines of Code to TypeScript
Porting 100k Lines of Code to TypeScriptPorting 100k Lines of Code to TypeScript
Porting 100k Lines of Code to TypeScript
 
Advantages and disadvantages of a monorepo
Advantages and disadvantages of a monorepoAdvantages and disadvantages of a monorepo
Advantages and disadvantages of a monorepo
 
The Beam Vision for Portability: "Write once run anywhere"
The Beam Vision for Portability: "Write once run anywhere"The Beam Vision for Portability: "Write once run anywhere"
The Beam Vision for Portability: "Write once run anywhere"
 
React web development
React web developmentReact web development
React web development
 
TDC2016POA | Trilha DevOps - DevOps Anti-Patterns
TDC2016POA | Trilha DevOps - DevOps Anti-PatternsTDC2016POA | Trilha DevOps - DevOps Anti-Patterns
TDC2016POA | Trilha DevOps - DevOps Anti-Patterns
 
TypeScript
TypeScriptTypeScript
TypeScript
 
Mono Repo
Mono RepoMono Repo
Mono Repo
 

Similar to Distributed Systems

Distributed Systems explained (with NodeJS) - Bruno Bossola, JUG Torino
Distributed Systems explained (with NodeJS) - Bruno Bossola, JUG TorinoDistributed Systems explained (with NodeJS) - Bruno Bossola, JUG Torino
Distributed Systems explained (with NodeJS) - Bruno Bossola, JUG TorinoCodemotion Tel Aviv
 
Distributed System explained (with Java Microservices)
Distributed System explained (with Java Microservices)Distributed System explained (with Java Microservices)
Distributed System explained (with Java Microservices)Mario Romano
 
Is It Fast? : Measuring MongoDB Performance
Is It Fast? : Measuring MongoDB PerformanceIs It Fast? : Measuring MongoDB Performance
Is It Fast? : Measuring MongoDB PerformanceTim Callaghan
 
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling StoryPHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Storyvanphp
 
Event driven architectures with Kinesis
Event driven architectures with KinesisEvent driven architectures with Kinesis
Event driven architectures with KinesisMark Harrison
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Anton Nazaruk
 
Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3LibbySchulze
 
Voxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big Data
Voxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big DataVoxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big Data
Voxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big DataVoxxed Athens
 
Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)StreamNative
 
lessons from managing a pulsar cluster
 lessons from managing a pulsar cluster lessons from managing a pulsar cluster
lessons from managing a pulsar clusterShivji Kumar Jha
 
Couchbase live 2016
Couchbase live 2016Couchbase live 2016
Couchbase live 2016Pierre Mavro
 
Concurrency, Parallelism And IO
Concurrency,  Parallelism And IOConcurrency,  Parallelism And IO
Concurrency, Parallelism And IOPiyush Katariya
 
Software Design for Persistent Memory Systems
Software Design for Persistent Memory SystemsSoftware Design for Persistent Memory Systems
Software Design for Persistent Memory SystemsC4Media
 
Ceph at Work in Bloomberg: Object Store, RBD and OpenStack
Ceph at Work in Bloomberg: Object Store, RBD and OpenStackCeph at Work in Bloomberg: Object Store, RBD and OpenStack
Ceph at Work in Bloomberg: Object Store, RBD and OpenStackRed_Hat_Storage
 
071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephenSteve Feldman
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInLinkedIn
 
Distributed Tensorflow with Kubernetes - data2day - Jakob Karalus
Distributed Tensorflow with Kubernetes - data2day - Jakob KaralusDistributed Tensorflow with Kubernetes - data2day - Jakob Karalus
Distributed Tensorflow with Kubernetes - data2day - Jakob KaralusJakob Karalus
 

Similar to Distributed Systems (20)

Distributed Systems explained (with NodeJS) - Bruno Bossola, JUG Torino
Distributed Systems explained (with NodeJS) - Bruno Bossola, JUG TorinoDistributed Systems explained (with NodeJS) - Bruno Bossola, JUG Torino
Distributed Systems explained (with NodeJS) - Bruno Bossola, JUG Torino
 
Distributed System explained (with Java Microservices)
Distributed System explained (with Java Microservices)Distributed System explained (with Java Microservices)
Distributed System explained (with Java Microservices)
 
Is It Fast? : Measuring MongoDB Performance
Is It Fast? : Measuring MongoDB PerformanceIs It Fast? : Measuring MongoDB Performance
Is It Fast? : Measuring MongoDB Performance
 
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling StoryPHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
 
Event driven architectures with Kinesis
Event driven architectures with KinesisEvent driven architectures with Kinesis
Event driven architectures with Kinesis
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
 
Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3
 
Voxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big Data
Voxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big DataVoxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big Data
Voxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big Data
 
Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)
 
lessons from managing a pulsar cluster
 lessons from managing a pulsar cluster lessons from managing a pulsar cluster
lessons from managing a pulsar cluster
 
Couchbase live 2016
Couchbase live 2016Couchbase live 2016
Couchbase live 2016
 
Concurrency, Parallelism And IO
Concurrency,  Parallelism And IOConcurrency,  Parallelism And IO
Concurrency, Parallelism And IO
 
Software Design for Persistent Memory Systems
Software Design for Persistent Memory SystemsSoftware Design for Persistent Memory Systems
Software Design for Persistent Memory Systems
 
NoSQL Evolution
NoSQL EvolutionNoSQL Evolution
NoSQL Evolution
 
Ceph at Work in Bloomberg: Object Store, RBD and OpenStack
Ceph at Work in Bloomberg: Object Store, RBD and OpenStackCeph at Work in Bloomberg: Object Store, RBD and OpenStack
Ceph at Work in Bloomberg: Object Store, RBD and OpenStack
 
071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
 
Distributed Tensorflow with Kubernetes - data2day - Jakob Karalus
Distributed Tensorflow with Kubernetes - data2day - Jakob KaralusDistributed Tensorflow with Kubernetes - data2day - Jakob Karalus
Distributed Tensorflow with Kubernetes - data2day - Jakob Karalus
 
Megastore by Google
Megastore by GoogleMegastore by Google
Megastore by Google
 

Recently uploaded

Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfkalichargn70th171
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 

Recently uploaded (20)

Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
Odoo Development Company in India | Devintelle Consulting Service
Odoo Development Company in India | Devintelle Consulting ServiceOdoo Development Company in India | Devintelle Consulting Service
Odoo Development Company in India | Devintelle Consulting Service
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 

Distributed Systems

  • 1. Distributed Systems + NodeJS Bruno Bossola MILAN 25-26 NOVEMBER 2016 @bbossola
  • 2. @bbossola Whoami ● Developer since 1988 ● XP Coach 2000+ ● Co-founder of JUG Torino ● Java Champion since 2005 ● CTO @ EF (Education First) I live in London, love the weather...
  • 3. @bbossola Agenda ● Distributed programming ● How does it work, what does it mean ● The CAP theorem ● CAP explained with code – CA system using two phase commit – AP system using sloppy quorums – CP system using majority quorums ● What next? ● Q&A
  • 5. @bbossola Distributed programming ● Any system should deal with two tasks: – Storage – Computation ● How do we deal with scale? ● How do we use multiple computers to do what we used to do on one?
  • 6. @bbossola What do we want to achieve? ● Scalability ● Availability ● Consistency
  • 7. @bbossola Scalability ● The ability of a system/network/process to: – handle a growing amount of work – be enlarged to accommodate new growth A scalable system continue to meet the needs of its users as the scale increase clipart courtesy of openclipart.org clipart courtesy of openclipart.org
  • 8. @bbossola Scalability flavours ● size: – more nodes, more speed – more nodes, more space – more data, same latency ● geographic: – more data centers, quicker response ● administrative: – more machines, no additional work
  • 9. @bbossola How do we scale? partitioning ● Slice the dataset into smaller independent sets ● reduces the impact of dataset growth – improves performance by limiting the amount of data to be examined – improves availability by the ability of partitions to fail indipendently
  • 10. @bbossola How do we scale? partitioning ● But can also be a source of problems – what happens if a partition become unavailable? – what if It becomes slower? – what if it becomes unresponsive? clipart courtesy of openclipart.org
  • 11. @bbossola How do we scale? replication ● Copies of the same data on multiple machines ● Benefits: – allows more servers to take part in the computation – improves performance by making additional computing power and bandwidth – improves availability by creating copy of the data
  • 12. @bbossola How do we scale? replication ● But it's also a source of problems – there are independent copies of the data – they need to be kept in sync on multiple machines ● Your system must follow a consistency model [Figure: replicas holding different versions of the same data: v4 v4 v8 v8 v4 v5 v7 v8] clipart courtesy of openclipart.org
  • 13. @bbossola Availability ● The proportion of time a system is in functioning condition ● The system is fault-tolerant – the ability of your system to behave in a well-defined manner once a fault occurs ● All clients can always read and write – In distributed systems this is achieved by redundancy clipart courtesy of openclipart.org
  • 14. @bbossola Introducing: performance ● The amount of useful work accomplished compared to the time and resources used ● Basically: – short response time for a unit of work – high rate of processing – low utilization of resources clipart courtesy of openclipart.org
  • 15. @bbossola Introducing: latency ● The period between the initiation of something and its occurrence ● The time between when something happens and when it has an impact or becomes visible ● more high level examples: – how long until you become a zombie after a bite? – how long until my post is visible to others? clipart courtesy of cliparts.co
  • 16. @bbossola Consistency ● Any read on a data item X returns a value corresponding to the result of the most recent write on X. ● Each client always has the same view of the data ● Also known as “Strong Consistency” clipart courtesy of cliparts.co
  • 17. @bbossola Consistency flavours ● Strong consistency – every replica sees every update in the same order. – no two replicas may have different values at the same time. ● Weak consistency – every replica will see every update, but possibly in different orders. ● Eventual consistency – every replica will eventually see every update and will eventually agree on all values.
  • 18. @bbossola The CAP theorem CONSISTENCY AVAILABILITY PARTITION TOLERANCE
  • 19. @bbossola The CAP theorem ● You cannot have all three :( ● You can select only two properties at once Sorry, this has been mathematically proven and no, it has not been debunked.
  • 20. @bbossola The CAP theorem CA systems! ● You selected consistency and availability! ● Strict quorum protocols (two/multi phase commit) ● Most RDBMS Hey! A network partition will f**k you up good!
  • 21. @bbossola The CAP theorem AP systems! ● You selected availability and partition tolerance! ● Sloppy quorums and conflict resolution protocols ● Amazon Dynamo, Riak, Cassandra
  • 22. @bbossola The CAP theorem CP systems! ● You selected consistency and partition tolerance! ● Majority quorum protocols (paxos, raft, zab) ● Apache Zookeeper, Google Spanner
  • 23. @bbossola NodeJS time! ● Let's write our brand new key value store ● We will code all three different flavours ● We will have many nodes, fully replicated ● No sharding ● We will kill servers! ● We will trigger network partitions! – (no worries. it's a simulation!) clipart courtesy of cliparts.co
  • 24. @bbossola Node APP General design <proto> APIStorage API GET (k) SET (k,v) <proto> Storage Database <proto> Core fX fY fZ fK
  • 25. @bbossola CA key-value store ● Uses classic two-phase commit ● Works like a local system ● Not partition tolerant
  • 26. @bbossola Nodeapp CA: two phase commit, simplified 2PC API Storage API GET (k) SET (k,v) Storage Database 2PC Core propose (tx) commit (tx) rollback (tx)
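
The `propose` / `commit` / `rollback` calls named on this slide can be sketched as follows. This is a simplified, synchronous, in-memory illustration and not the code from the talk's repository: `Participant`, `twoPhaseSet` and the `reachable` flag are hypothetical, and a real implementation would involve network calls, timeouts and durable logs.

```javascript
// Hypothetical replica taking part in a two-phase commit.
class Participant {
  constructor(name) {
    this.name = name;
    this.data = new Map();    // committed key/value pairs
    this.pending = new Map(); // staged transactions, keyed by tx id
    this.reachable = true;    // flip to false to simulate a partition
  }

  // Phase 1: stage the write and vote yes, or vote no if unreachable.
  propose(tx) {
    if (!this.reachable) return false;
    this.pending.set(tx.id, tx);
    return true;
  }

  // Phase 2a: make the staged write visible.
  commit(tx) {
    const staged = this.pending.get(tx.id);
    if (staged) {
      this.data.set(staged.key, staged.value);
      this.pending.delete(tx.id);
    }
  }

  // Phase 2b: discard the staged write.
  rollback(tx) {
    this.pending.delete(tx.id);
  }
}

// Coordinator: commit only if every participant voted yes.
function twoPhaseSet(participants, tx) {
  const votes = participants.map((p) => p.propose(tx));
  if (votes.every(Boolean)) {
    participants.forEach((p) => p.commit(tx));
    return true;
  }
  participants.forEach((p) => p.rollback(tx));
  return false;
}
```

Because a single missing vote aborts the write, one unreachable node is enough to block every `SET`, which is why this design is consistent and available only while there is no partition.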
  • 27. @bbossola AP key-value store ● Eventually consistent design ● Prioritizes availability over consistency
  • 28. @bbossola Nodeapp AP: sloppy quorums, simplified QUORUM API Storage API GET (k) SET (k,v) Storage Database QUORUM Core (read) (repair) propose (tx) commit (tx) rollback (tx)
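
The `(read)` and `(repair)` steps on this slide can be illustrated with a quorum write followed by a read that repairs stale replicas. This is a sketch under stated assumptions, not the talk's implementation: `Replica`, `quorumWrite` and `quorumRead` are hypothetical names, versions are plain integers supplied by the caller, and Dynamo-style hinted handoff is left out entirely.

```javascript
// Hypothetical replica: stores { value, version } per key.
class Replica {
  constructor() {
    this.store = new Map();
    this.reachable = true; // flip to false to simulate a partition
  }
}

// Write to every reachable replica; succeed if at least `w` accepted it.
function quorumWrite(replicas, w, key, value, version) {
  const alive = replicas.filter((rep) => rep.reachable);
  if (alive.length < w) return false;
  for (const rep of alive) {
    rep.store.set(key, { value, version });
  }
  return true;
}

// Read from reachable replicas, keep the newest version, repair the rest.
function quorumRead(replicas, r, key) {
  const alive = replicas.filter((rep) => rep.reachable);
  if (alive.length < r) return undefined;

  let newest;
  for (const rep of alive) {
    const entry = rep.store.get(key);
    if (entry && (!newest || entry.version > newest.version)) {
      newest = entry;
    }
  }
  if (!newest) return undefined;

  // Read repair: push the newest version to replicas that missed it.
  for (const rep of alive) {
    const current = rep.store.get(key);
    if (!current || current.version < newest.version) {
      rep.store.set(key, { ...newest });
    }
  }
  return newest.value;
}
```

A replica that was partitioned away during a write keeps serving its old value until a later read touches it, which is the eventual consistency the AP slide describes.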
  • 29. @bbossola CP key-value store ● Uses majority quorum (raft) ● Guarantees eventual consistency
  • 30. @bbossola Nodeapp CP: majority quorums (raft, simplified) RAFT API Storage API GET (k) SET (k,v) Storage Database RAFT Core beat voteme history Urgently needs refactoring!!!!
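
The majority rule behind the `voteme` call on this slide can be sketched as a toy leader election. This is a heavily reduced illustration, not Raft itself and not the talk's code: `RaftNode`, `majority` and `runElection` are hypothetical, and the log comparison that real Raft performs before granting a vote (the `history` part) is omitted.

```javascript
// Smallest number of nodes that forms a majority of the cluster.
function majority(clusterSize) {
  return Math.floor(clusterSize / 2) + 1;
}

// Hypothetical node that grants at most one vote per term.
class RaftNode {
  constructor(id) {
    this.id = id;
    this.term = 0;
    this.votedFor = null;
    this.reachable = true; // flip to false to simulate a partition
  }

  voteme(candidateId, term) {
    if (!this.reachable || term < this.term) return false;
    if (term > this.term) {
      // A newer term resets the vote.
      this.term = term;
      this.votedFor = null;
    }
    if (this.votedFor === null || this.votedFor === candidateId) {
      this.votedFor = candidateId;
      return true;
    }
    return false;
  }
}

// The candidate (itself a member of `cluster`) wins only with a majority.
function runElection(candidate, cluster) {
  const term = candidate.term + 1;
  const votes = cluster.filter((node) => node.voteme(candidate.id, term)).length;
  return votes >= majority(cluster.length);
}
```

In a five-node cluster a candidate that can reach two peers wins with three votes, while a candidate stuck on the minority side of a partition can never be elected, so that side stops accepting writes instead of diverging.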
  • 31. @bbossola What about BASE? ● It's just a way to qualify eventually consistent systems ● BAsic Availability – The database appears to work most of the time. ● Soft-state – Stores don’t have to be write-consistent, nor do different replicas have to be mutually consistent all the time. ● Eventual consistency – Stores exhibit consistency at some later point (e.g., lazily at read time).
  • 32. @bbossola What about Lamport clocks? ● It's a mechanism to maintain a distributed notion of time ● Each process maintains a counter – Whenever a process does work, increment the counter – Whenever a process sends a message, include the counter – When a message is received, set the counter to max(local_counter, received_counter) + 1 clipart courtesy of cliparts.co
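
The three rules on this slide translate almost directly into code. The class below is a minimal sketch with hypothetical names (`LamportClock`, `tick`, `send`, `receive`); it assumes that sending a message counts as a unit of work, so `send` increments before returning the counter to attach to the message.

```javascript
// Hypothetical Lamport clock for a single process.
class LamportClock {
  constructor() {
    this.counter = 0;
  }

  // Local work: increment the counter.
  tick() {
    this.counter += 1;
    return this.counter;
  }

  // Sending is treated as work; the returned value travels with the message.
  send() {
    return this.tick();
  }

  // On receipt: jump past both the local and the received counter.
  receive(receivedCounter) {
    this.counter = Math.max(this.counter, receivedCounter) + 1;
    return this.counter;
  }
}
```

If process A ticks once and then sends, the message carries 2; a fresh process B that receives it moves to 3, so the receive event is always ordered after the send.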
  • 33. @bbossola What about Vector clocks? ● Maintains an array of N Lamport clocks, one per node ● Whenever a process does work, increment the logical clock value of the node in the vector ● Whenever a process sends a message, include the full vector ● When a message is received: – update each element in the vector to max(local, received) – increment the logical clock of the current node in the vector clipart courtesy of cliparts.co
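
The vector clock rules on this slide can be sketched the same way. The names `VectorClock` and `happenedBefore` are hypothetical, nodes are identified by their index in the array, and, as an assumption, sending is treated as work and therefore increments the sender's own entry.

```javascript
// Hypothetical vector clock for the node at position `nodeIndex`.
class VectorClock {
  constructor(nodeIndex, size) {
    this.nodeIndex = nodeIndex;
    this.clock = new Array(size).fill(0);
  }

  // Local work: increment this node's own entry and return a snapshot.
  tick() {
    this.clock[this.nodeIndex] += 1;
    return [...this.clock];
  }

  // Sending is treated as work; the full vector travels with the message.
  send() {
    return this.tick();
  }

  // On receipt: element-wise max, then increment this node's own entry.
  receive(received) {
    this.clock = this.clock.map((local, i) => Math.max(local, received[i]));
    return this.tick();
  }
}

// a happened before b if a <= b element-wise and a differs from b.
function happenedBefore(a, b) {
  return a.every((v, i) => v <= b[i]) && a.some((v, i) => v < b[i]);
}
```

Unlike a single Lamport counter, two vectors can be incomparable: if neither `happenedBefore(a, b)` nor `happenedBefore(b, a)` holds, the events were concurrent, which is what conflict resolution in an AP store needs to detect.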
  • 34. @bbossola What next? ● Learn the lingo and the basics ● Do your homework ● Start playing with these concepts ● It's complicated, but not rocket science ● Be inspired!
  • 35. @bbossola Q&A Amazon Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html The RAFT consensus algorithm: https://raft.github.io/ http://thesecretlivesofdata.com/raft/ The code used in this presentation: https://github.com/bbossola/sysdist clipart courtesy of cliparts.co

Editor's Notes

  1. The 93 petaflop Sunway TaihuLight is installed at the National Supercomputing Centre in Wuxi. At its peak, the computer can perform around 93,000 trillion calculations per second. It has more than 10.5 million processing cores and 40,960 nodes and runs on a Linux-based operating system.
  2. There are tradeoffs involved in optimizing for any of these outcomes. For example, a system may achieve a higher throughput by processing larger batches of work thereby reducing operation overhead. The tradeoff would be longer response times for individual pieces of work due to batching.
  3. I find that low latency - achieving a short response time - is the most interesting aspect of performance, because it has a strong connection with physical (rather than financial) limitations. It is harder to address latency using financial resources than the other aspects of performance.
  4. Strong consistency: every replica sees every update in the same order. Updates are made atomically, so that no two replicas may have different values at the same time. Weak consistency: every replica will see every update, but possibly in different orders. Eventual consistency: every replica will eventually see every update (i.e. there is a point in time after which every replica has seen a given update), and will eventually agree on all values. Updates are therefore not atomic.
  5. Consistency means that each client always has the same view of the data. Availability means that all clients can always read and write. Partition tolerance means that the system works well across physical network partitions. Consistency is considered strong here: “Atomic, linearizable, consistency: there must exist a total order on all operations such that each operation looks as if it were completed at a single instant. This is equivalent to requiring requests of the distributed shared memory to act as if they were executing on a single node, responding to operations one at a time”
  6. Raft, Paxos and ZooKeeper ZAB all provide linearizable writes. This is intuitive, since they use a leader which publishes the quorum-voted changes atomically and in order, creating a virtual synchrony. CockroachDB and Google Spanner also provide linearizability (Google also uses atomic clocks to optimize latency).
  7. Explain the CAP theorem with a distributed key-value store; move to AP and implement a Lamport clock; move to CP and implement consensus.
  8. It provides the illusion of behaving like a single system but cannot tolerate network partitions or failures of its parts.
  9. Example: Amazon Dynamo (Riak, Cassandra...) Dynamo prioritizes availability over consistency; it does not guarantee single-copy consistency. Instead, replicas may diverge from each other when values are written; when a key is read, there is a read reconciliation phase that attempts to reconcile differences between replicas before returning the value back to the client. For many features on Amazon, it is more important to avoid outages than it is to ensure that data is perfectly consistent, as an outage can lead to lost business and a loss of credibility. Furthermore, if the data is not particularly important, then a weakly consistent system can provide better performance and higher availability at a lower cost than a traditional RDBMS.
  10. Will use the RAFT protocol