Distributed applications

Distributed
Applications
Nodes = [ra@node1, ra@node2, ra@node3].
start_cluster("My raft cluster", Nodes).

3
Agenda
● Distribution
● Consistency
● Replication
● Distributed KV in Erlang
● Pause for Q&A
● Scaling
● Partitioning
● Observability
● Q&A

4
Microservices Distribution
● Microservices and data
● Brokers, databases, caching, point-to-point, etc.
● Ignoring or delegating data distribution

5
Microservices Data
Avoid a single point of failure with a fault-tolerant system

6
High Availability <> Scaling

7
Data in distributed systems
CAP theorem: it is impossible for a distributed data store to simultaneously
provide more than two out of the following three guarantees:
1 - Consistency: Every read receives the most recent write or an error
2 - Availability: Every request receives a (non-error) response, without
the guarantee that it contains the most recent write
3 - Partition tolerance: The system continues to operate despite an
arbitrary number of messages being dropped (or delayed) by the
network between nodes

8
Consistency
● Strong consistency
● Eventual consistency

9
Replication with state machines
State machine replication (deterministic)
Fault tolerance with state machines
Determinism is an ideal characteristic for providing fault tolerance.
Intuitively, if multiple copies of a system exist, a fault in one would be
noticeable as a difference in the state or output from the others.
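
The idea above can be sketched in a few lines of Erlang. This is only an illustration under stated assumptions: the module name rsm_sketch, the command shapes {set, Key, Value} and {delete, Key}, and the function names are all hypothetical and do not come from the talk. Every replica replays the same command log through the same deterministic function, so healthy replicas end up with equal states and a faulty one stands out.

```erlang
-module(rsm_sketch).
-export([apply_command/2, replay/1, agree/1]).

%% Deterministic transition: the new state depends only on the
%% command and the previous state (no clocks, no randomness).
apply_command({set, Key, Value}, State) ->
    maps:put(Key, Value, State);
apply_command({delete, Key}, State) ->
    maps:remove(Key, State).

%% Replay a whole command log, starting from the empty state.
replay(Log) ->
    lists:foldl(fun apply_command/2, #{}, Log).

%% True when every replica state in the list is identical.
agree([]) ->
    true;
agree([First | Rest]) ->
    lists:all(fun(State) -> State =:= First end, Rest).
```

For example, replaying [{set, x, 3}, {set, y, 1}, {delete, y}] on two replicas gives two equal maps, so agree/1 returns true; if one copy is then altered, agree/1 returns false.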

10
Raft Protocol
Consensus algorithm
Consensus is a fundamental problem in fault-tolerant distributed
systems. Consensus involves multiple servers agreeing on values.
Once they reach a decision on a value, that decision is final.
Leader, Candidate(s), Follower(s)
(RabbitMQ uses it for its new “quorum” queues)

11
Leader/Master implementations
● Leader has the truth
● Leader handles all the calls
● A wrong entry on the Leader is replicated to the followers
● A majority is always needed
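
The majority rule in the last bullet can be made concrete with a small sketch. It is not the Raft algorithm itself, only the counting rule behind it; the module name quorum, the function names, and the error term no_majority are assumptions made for illustration.

```erlang
-module(quorum).
-export([majority/1, write/2]).

%% Smallest number of nodes that forms a majority of the cluster.
majority(ClusterSize) when is_integer(ClusterSize), ClusterSize > 0 ->
    ClusterSize div 2 + 1.

%% Accept a write only when a majority of the cluster
%% (leader included) has acknowledged it.
write(Acks, ClusterSize) ->
    case Acks >= majority(ClusterSize) of
        true -> ok;
        false -> {error, no_majority}
    end.
```

With three nodes the majority is two, so quorum:write(2, 3) returns ok, while a leader cut off from both followers only has its own acknowledgement and quorum:write(1, 3) returns {error, no_majority}.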

12
Raft and RSM
[Diagram: a Client sends X = 3 to the Leader; the Leader replicates X = 3 to the two Followers and the Client receives Ok.]

13
Partitions
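[Diagram, first part: a Client sends X = 3 to the Leader; the Leader replicates X = 3 to the two Followers and the Client receives Ok.]
[Diagram, second part: with the cluster partitioned, a Client sends X = 6 and receives ERROR.]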

14
What about BEAM?
● Cluster by design
● State machine by design (gen_statem, process, immutable)
● Demo demo demo
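
As a minimal sketch of the "state machine by design" bullet, here is a tiny key-value store written as a gen_statem process. It is not the demo from the talk: the module name kv_statem, the single state ready, and the store/3 and fetch/2 functions are assumptions chosen for illustration. The data is an immutable map that is replaced, never mutated, on each write.

```erlang
-module(kv_statem).
-behaviour(gen_statem).

-export([start_link/0, store/3, fetch/2]).
-export([callback_mode/0, init/1, ready/3]).

start_link() ->
    gen_statem:start_link(?MODULE, [], []).

%% Synchronous client calls.
store(Pid, Key, Value) ->
    gen_statem:call(Pid, {store, Key, Value}).

fetch(Pid, Key) ->
    gen_statem:call(Pid, {fetch, Key}).

%% One callback function per state name.
callback_mode() ->
    state_functions.

%% Start in the 'ready' state with an empty map as data.
init([]) ->
    {ok, ready, #{}}.

%% 'ready' state: a write returns a new map, a read leaves the data as is.
ready({call, From}, {store, Key, Value}, Data) ->
    {keep_state, maps:put(Key, Value, Data), [{reply, From, ok}]};
ready({call, From}, {fetch, Key}, Data) ->
    {keep_state_and_data, [{reply, From, maps:find(Key, Data)}]}.
```

After {ok, Pid} = kv_statem:start_link(), calling kv_statem:store(Pid, x, 3) returns ok and kv_statem:fetch(Pid, x) returns {ok, 3}; a missing key returns error.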

15
End First Part
Any questions?

16
Scaling...
● Vertical scaling
● Horizontal scaling
I add more hardware, so the system is faster...
I create another Docker machine... so the system is faster
If the design does not scale, the application won’t scale
If you are using the wrong tools, the application won’t scale

17
Scaling using partitioning
● Divide a single data stream into multiple partitions
● Partitions can be stored on different machines
● Partitions/shards/queues are (usually) the units of scaling
THE UNIT OF SCALING <<----- the key

18
Partitioning Example
{category = “food”, product = “wine”}
{category = “food”, product = “pasta”}
{category = “Computer”, product = “Dell”}

19
Partitioning Example
Partition 1:
{category = “food”, product = “wine”}
{category = “food”, product = “pasta”}
Partition 2:
{category = “Computer”, product = “Dell”}
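
A minimal sketch of the example above: items are routed to a partition by hashing their category, so every item of a category lands in the same partition. The module name partitioner, the map-based item shape, and the function names are assumptions for illustration; erlang:phash2/2 is the standard BIF that hashes any term into the range 0..Range-1.

```erlang
-module(partitioner).
-export([partition_for/2, group/2]).

%% Map a key to a partition number in 0..Partitions-1.
partition_for(Key, Partitions) when is_integer(Partitions), Partitions > 0 ->
    erlang:phash2(Key, Partitions).

%% Group items by the partition of their category.
%% Items inside a partition end up in reverse insertion order.
group(Items, Partitions) ->
    lists:foldl(
      fun(#{category := Category} = Item, Acc) ->
              P = partition_for(Category, Partitions),
              maps:update_with(P, fun(L) -> [Item | L] end, [Item], Acc)
      end,
      #{},
      Items).
```

Note that with only a few partitions two different categories may hash to the same partition; the guarantee is only that one category never spans two partitions.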

20
Partitioning and distribution
● Equal distribution problem
● Consistent hashing (partition hashing)
○ Language hash (object.hashCode())
○ Use UUID version 3 (name-based, compatible across languages)
● Distribution by key (e.g. A, B, C, etc.)
○ Won’t save… but
● Rebalance partitions
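
One common way to implement consistent hashing is a hash ring, sketched below. This is an illustration and not necessarily the scheme used in the talk: the module name hash_ring, the use of virtual nodes, and the choice of erlang:phash2/2 as the hash are all assumptions. Each node is placed at several points on the ring, and a key belongs to the first point at or after its own hash, wrapping around at the end.

```erlang
-module(hash_ring).
-export([new/2, lookup/2]).

%% Size of the hash space (2^27, the default range of erlang:phash2/1).
-define(RANGE, 134217728).

%% Build a ring as a sorted list of {Point, Node};
%% each node is placed at VNodes points.
new(Nodes, VNodes) ->
    lists:sort([{erlang:phash2({Node, I}, ?RANGE), Node}
                || Node <- Nodes, I <- lists:seq(1, VNodes)]).

%% Find the owner of Key: the first point at or after the key's hash,
%% wrapping around to the first point of the ring.
lookup(Key, [{_, FirstNode} | _] = Ring) ->
    Hash = erlang:phash2(Key, ?RANGE),
    case lists:dropwhile(fun({Point, _}) -> Point < Hash end, Ring) of
        [{_, Node} | _] -> Node;
        [] -> FirstNode
    end.
```

The useful property for rebalancing is that removing a node only moves the keys that node owned; keys owned by the remaining nodes stay where they were.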

21
Joke about Partitioning Consistency
Mr. Poons:
How far into the future can you see, Mrs. Cake?
Mrs. Cake:
About ten seconds usually, Mr. Poons.
Mrs. Cake:
About ten seconds usually, Mr. Poons.
Mr. Poons:
How far into the future can you see, Mrs. Cake?

22
Read-only scaling clusters
Querying read-only replicas is also called a read-scaling
architecture. You may read an old value (eventual
consistency). Read scaling is cool, but you could read
stale data (see what the CAP theorem says about
consistency).

23
Handle failures
● Failures in distributed systems are hard to understand
● Monitoring systems are not enough
● Metrics/Logs are not enough
● Crash: the best you can hope for
● Silent failures: by the time you notice them it is (usually) too late
● The message didn’t arrive
The message didn’t arrive!!!! -- let’s talk about ...

24
What I learnt
● Spend time on your design (even if some “methodologies” say the
opposite)
● Avoid “spaghetti” design, divide into contexts (or units of scaling)
● Write tests close to your environment, not only “how fast is it” tests
● Add the right metrics, logs and monitoring; anticipate failures
● The client… too often overlooked
● Simulate failures constantly
● Study distributed systems
● Predict failures using ML (working on it at SUSE)

25
About ME
● Senior Developer @SUSE
● Work on OpenStack middleware modules
● RabbitMQ member/supporter
● Co-Author of RabbitMQ cookbook
● @gsantomaggio /twitter/github/IRC

No one is born hating another person because of
the color of his skin, or his background, or his
religion. People must learn to hate, and if they
can learn to hate, they can be taught to love, for
love comes more naturally to the human heart
than its opposite
Any questions?
You can find me at
@gsantomaggio
26

27
Not only data but also to nodes
https://blog.discordapp.com/scaling-elixir-f9b8e1e7c29b
Example I did
Different hashing type
