Distributed Systems

Distributed Systems + NodeJS
Bruno Bossola
MILAN 25-26 NOVEMBER 2016
@bbossola

@bbossola
Whoami
● Developer since 1988
● XP Coach 2000+
● Co-founder of JUG Torino
● Java Champion since 2005
● CTO @ EF (Education First)
I live in London, love the weather...

@bbossola
Agenda
● Distributed programming
● How does it work, what does it mean
● The CAP theorem
● CAP explained with code
– CA system using two phase commit
– AP system using sloppy quorums
– CP system using majority quorums
● What next?
● Q&A

@bbossola
Distributed programming
● Do we need it?

@bbossola
Distributed programming
● Any system should deal with two tasks:
– Storage
– Computation
● How do we deal with scale?
● How do we use multiple computers to do what we used to
do on one?

@bbossola
What do we want to achieve?
● Scalability
● Availability
● Consistency

@bbossola
Scalability
● The ability of a system/network/process to:
– handle a growing amount of work
– be enlarged to accommodate new growth
A scalable system continue to meet the needs of its users as the
scale increase
clipart courtesy of openclipart.org

@bbossola
Scalability flavours
● size:
– more nodes, more speed
– more nodes, more space
– more data, same latency
● geographic:
– more data centers, quicker response
● administrative:
– more machines, no additional work

@bbossola
How do we scale? partitioning
● Slice the dataset into smaller independent sets
● reduces the impact of dataset growth
– improves performance by limiting the amount of data to
be examined
– improves availability by the ability of partitions to fail
indipendently

@bbossola
How do we scale? partitioning
● But can also be a source of problems
– what happens if a partition become unavailable?
– what if It becomes slower?
– what if it becomes unresponsive?

@bbossola
How do we scale? replication
● Copies of the same data on multiple machines
● Benefits:
– allows more servers to take part in the computation
– improves performance by making additional computing
power and bandwidth
– improves availability by creating copy of the data

@bbossola
How do we scale? replication
● But it's also a source of problems
– there are independent copies of the data
– need to be kept in sync on multiple machines
● Your system must follow a consistency model
v4 v4
v8
v8 v4 v5
v7
v8

@bbossola
Availability
● The proportion of time a system is in functioning conditions
● The system is fault-tolerant
– the ability of your system to behave in a well defined
manner once a fault occurs
● All clients can always read and write
– In distributed systems this
is achieved by redundancy

@bbossola
Introducing: performance
● The amount of useful work accomplished compared to the
time and resources used
● Basically:
– short response time for a unit of work
– high rate of processing
– low utilization of resources

@bbossola
Introducing: latency
● The period between the initiation of something and the
occurrence
● The time between something happened and the time it has
an impact or become visible
● more high level examples:
– how long until you become a zombie
after a bite?
– how long until my post is visible
to others?
clipart courtesy of cliparts.co

@bbossola
Consistency
● Any read on a data item X returns a value corresponding
to the result of the most recent write on X.
● Each client always has the same view of the data
● Also know as “Strong Consistency”

@bbossola
Consistency flavours
● Strong consistency
– every replica sees every update in the same order.
– no two replicas may have different values at the same time.
● Weak consistency
– every replica will see every update, but possibly in different
orders.
● Eventual consistency
– every replica will eventually see every update and will
eventually agree on all values.

@bbossola
The CAP theorem
CONSISTENCY AVAILABILITY
PARTITION
TOLERANCE

@bbossola
The CAP theorem
● You cannot have all :(
● You can select two
properties at once
Sorry, this has been mathematically proven and no, has not been debunked.

@bbossola
The CAP theorem
CA systems!
● You selected consistency
and availability!
● Strict quorum protocols
(two/multi phase commit)
● Most RDBMS
Hey! A network partition will
f**k you up good!

@bbossola
The CAP theorem
AP systems!
● You selected availability
and partition tolerance!
● Sloppy quorums and
conflict resolution protocols
● Amazon Dynamo, Riak,
Cassandra

@bbossola
The CAP theorem
CP systems!
● You selected consistency
and partition tolerance!
● Majority quorum protocols
(paxos, raft, zab)
● Apache Zookeeper,
Google Spanner

@bbossola
NodeJS time!
● Let's write our brand new key value store
● We will code all three different flavours
● We will have many nodes, fully replicated
● No sharding
● We will kill servers!
● We will trigger network
partitions!
– (no worries. it's a simulation!)

@bbossola
Node APP
General design
<proto>
APIStorage
API
GET (k) SET (k,v)
<proto>
Storage
Database
<proto>
Core
fX fY fZ fK

@bbossola
CA key-value store
● Uses classic two-phase commit
● Works like a local system
● Not partition tolerant

@bbossola
Nodeapp
CA: two phase commit, simplified
2PC
API
Storage
API
GET (k) SET (k,v)
Storage
Database
2PC
Core
propose
(tx)
commit
(tx)
rollback
(tx)

@bbossola
AP key-value store
● Eventually consistent design
● Prioritizes availability over consistency

@bbossola
Nodeapp`
AP: sloppy quorums, simplified
QUORUM
API
Storage
API
GET (k) SET (k,v)
Storage
Database
QUORUM
Core
(read) (repair)
propose
(tx)
commit
(tx)
rollback
(tx)

@bbossola
CP key-value store
● Uses majority quorum (raft)
● Guarantees eventual consistency

@bbossola
CP: majority quorums (raft, simplified)
RAFT
API
Storage
API
GET (k) SET (k,v)
Storage
Database
RAFT
Core
beat
voteme history
Nodeapp`
Urgently needs
refactoring!!!!

@bbossola
What about BASE?
● It's just a way to qualify eventually consistent systems
● BAsic Availability
– The database appears to work most of the time.
● Soft-state
– Stores don’t have to be write-consistent, nor do different
replicas have to be mutually consistent all the time.
● Eventual consistency
– Stores exhibit consistency at some later point (e.g.,
lazily at read time).

@bbossola
What about Lamport clocks?
● It's a mechanism to maintain a distributed notion of time
● Each process maintains a counter
– Whenever a process does work, increment the counter
– Whenever a process sends a message, include the
counter
– When a message is received, set the counter to
max(local_counter, received_counter) + 1

@bbossola
What about Vector clocks?
● Maintains an array of N Lamport clocks, one per each node
● Whenever a process does work, increment the logical clock
value of the node in the vector
● Whenever a process sends a message, include the full vector
● When a message is received:
– update each element in
● max(local, received)
– increment the logical clock
– of the current node in the vector

@bbossola
What next?
● Learn the lingo and the basics
● Do your homework
● Start playing with these concepts
● It's complicated, but not rocket science
● Be inspired!

@bbossola
Q&A
Amazon Dynamo:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
The RAFT consensus algorithm:
https://raft.github.io/
http://thesecretlivesofdata.com/raft/
The code used into this presentation:
https://github.com/bbossola/sysdist

Distributed Systems

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Distributed Systems

Similar to Distributed Systems (20)

Recently uploaded

Recently uploaded (20)

Distributed Systems

Editor's Notes