SlideShare a Scribd company logo
1 of 114
Download to read offline
Model Checking of Fault-Tolerant Distributed Algorithms 
Part I: Fault-Tolerant Distributed Algorithms 
Annu Gmeiner Igor Konnov Ulrich Schmid 
Helmut Veith Josef Widder 
TMPA 2014, Kostroma, Russia 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 1 / 52
Distributed Systems 
114 7.1 Assessing and validating the standard node HITS design 
Figure 7.1: DARTS prototype board, comprising 8 interconnected HITS chips
Distributed Systems 
114 7.1 Assessing and validating the standard node HITS design 
Figure 7.1: DARTS prototype board, comprising 8 interconnected HITS chips 
Are they always working? 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 2 / 52
No. . . some failing systems 
Therac-25 (1985) 
radiation therapy machine 
gave massive overdoses, e.g., due to race conditions of concurrent tasks 
Quantas Airbus in-
ight Learmonth upset (2008) 
1 out of 3 replicated components failed 
computer initiated dangerous altitude drop 
Ariane 501 maiden 
ight (1996) 
primary/backup, i.e., 2 replicated computers 
both run into the same integer over
ow 
Net
ix outages due to Amazon's cloud (ongoing) 
one is not sure what is going on there 
hundreds of computers involved 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 3 / 52
Why do they fail? 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 4 / 52
Why do they fail? 
faults at design/implementation phase 
faults at runtime 
outside of control of designer/developer 
e.g., to the right: crack in a diode in the 
data link interface of the Space Shuttle 
) led to erroneous messages being sent 
Driscoll (Honeywell) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 4 / 52
Why do they fail? 
faults at design/implementation phase 
approach:
nd and
x faults before operation 
) model checking 
faults at runtime 
outside of control of designer/developer 
e.g., to the right: crack in a diode in the 
data link interface of the Space Shuttle 
) led to erroneous messages being sent 
Driscoll (Honeywell) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 4 / 52
Why do they fail? 
faults at design/implementation phase 
approach:
nd and
x faults before operation 
) model checking 
faults at runtime 
outside of control of designer/developer 
e.g., to the right: crack in a diode in the 
data link interface of the Space Shuttle 
) led to erroneous messages being sent 
approach: 
keep system operational despite faults 
) fault-tolerant distributed algorithms Driscoll (Honeywell) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 4 / 52
Bringing both together 
Goal: automatically veri
ed fault-tolerant distributed algorithms 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 5 / 52
Bringing both together 
Goal: automatically veri
ed fault-tolerant distributed algorithms 
model checking FTDAs is a research challenge: 
computers run independently at dierent speeds 
exchange messages with uncertain delays 
faults 
parameterization 
. . . fault-tolerance makes model checking harder 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 5 / 52
Lecture overview 
Part I: Fault-tolerant distributed algorithms 
introduction to distributed algorithms 
details of our case study algorithm 
motivation why model checking is cool 
Part II: Modeling fault-tolerant distributed algorithms 
model checking challenges in distributed algorithms 
Promela, control 
ow automata, etc. 
model checking of small instances with Spin 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 6 / 52
Lecture overview cont. 
We will have to skip due to time limitations: 
Part III: Introduction into Parameterized model checking 
the general problem statement 
well-known undecidability results 
Igor Konnov: 
Part IV: Parameterized model checking of FTDAs by abstraction 
parametric interval abstraction (PIA) 
PIA data and counter abstraction 
counterexample-guided abstraction re
nement (CEGAR) 
Part V: Parameterized model checking of FTDAs by BMC 
bounding the diameter 
bounded model checking as a complete method 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 7 / 52
Part I: Fault-Tolerant Distributed 
Algorithms 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 8 / 52
Distributed Systems are everywhere 
What they allow to do 
share resources 
communicate 
increase performance 
speed 
fault tolerance 
Dierence to centralized systems 
independent activities (concurrency) 
components do not have access to the global state (only local view) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 9 / 52
Application areas 
buzzwords from the 60ies 
operating systems 
(distributed) data base systems 
communication networks 
multiprocessor architectures 
control systems 
New buzzwords 
cloud computing 
social networks 
multi core 
cyber-physical systems 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 10 / 52
Major challenge 
Uncertainty 
computers run independently at dierent speeds 
exchange messages with (unknown) delays 
faults 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 11 / 52
Major challenge 
Uncertainty 
computers run independently at dierent speeds 
exchange messages with (unknown) delays 
faults 
challenge in design of distributed algorithms 
a process has access only to its local state 
but one wants to achieve some global property 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 11 / 52
Major challenge 
Uncertainty 
computers run independently at dierent speeds 
exchange messages with (unknown) delays 
faults 
challenge in design of distributed algorithms 
a process has access only to its local state 
but one wants to achieve some global property 
challenge in proving them correct 
large degree of non-determinism 
) large execution and state space 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 11 / 52
From dependability to a distributed system 
P 
Process P provides a service. We want to access it reliably 
but P may fail 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 12 / 52
From dependability to a distributed system 
P 
P1 P2 
P3 
replication 
Process P provides a service. We want to access it reliably 
but P may fail 
canonical approach: replication, i.e., several copies of P 
Due to non-determinism, the behavior of the copies might deviate 
(e.g. in a replicated database, transactions are committed in dierent 
orders at dierent sites) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 12 / 52
From dependability to a distributed system 
P 
replication P P 
P1 P2 
P3 
P 
consistency 
Process P provides a service. We want to access it reliably 
but P may fail 
canonical approach: replication, i.e., several copies of P 
Due to non-determinism, the behavior of the copies might deviate 
(e.g. in a replicated database, transactions are committed in dierent 
orders at dierent sites) 
) we have to enforce that the copies behave as one. 
) Consistency in a distributed system: what does it mean to behave 
as one. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 12 / 52
Replication|distributed systems 
P 
P1 P2 
P3 
replication 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 13 / 52
Distributed message passing system 
multiple distributed processes pi 
pi send receive internal 
dots represent states 
a step of a process can be 
a send step (a process sends messages to other processes) 
a receive step (a process receives a subset of messages sent to it) 
an internal step (a local computation) 
steps are the atomic (indivisible) units of computations 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 14 / 52
Types of Distributed Algorithms: 
Synchronous vs. Asynchronous 
Synchronous 
all processes move in lock-step 
rounds 
a message sent in a round is received in the same round 
idealized view 
impossible or expensive to implement 
Asynchronous 
only one process moves at a time 
arbitrary interleavings of steps 
a message sent is received eventually 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 15 / 52
Types of Distributed Algorithms: 
Synchronous vs. Asynchronous 
Synchronous 
all processes move in lock-step 
rounds 
a message sent in a round is received in the same round 
idealized view 
impossible or expensive to implement 
Asynchronous 
only one process moves at a time 
arbitrary interleavings of steps 
a message sent is received eventually 
important problems not solvable (Fischer et al., 1985)! 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 15 / 52
Types of Distributed Algorithms: 
Synchronous vs. Asynchronous 
Synchronous 
all processes move in lock-step 
rounds 
a message sent in a round is received in the same round 
idealized view 
impossible or expensive to implement 
Asynchronous 
only one process moves at a time 
arbitrary interleavings of steps 
a message sent is received eventually 
important problems not solvable (Fischer et al., 1985)! 
We focus on asynchronous algorithms here. . . 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 15 / 52
Asynchronous system 
has very mild restrictions on the environment 
interleaving semantics 
unbounded message delays 
very little can be done. . . 
there is no distributed algorithm that solves consensus in the presence 
of one faulty process 
(as we will see, consensus is the paradigm of consistency) 
folklore explanation: 
you cannot distinguish a slow process from a crashed one 
real explanation: 
see intricate proof by Fischer, Lynch, and Paterson (JACM 1985) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 16 / 52
Where we stand 
P 
P1 P2 
P3 
replication 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 17 / 52
What we still need. . . 
P1 P2 
P3 
P P 
P 
consistency 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 18 / 52
What we still need. . . 
P1 P2 
P3 
P P 
P 
consistency 
consistency requirements have been formalized under several names, 
e.g., 
consensus 
atomic broadcast 
Byzantine Generals problem 
Byzantine agreement 
atomic commitment 
de
nitions are similar but may have subtle dierences 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 18 / 52
What we still need. . . 
P1 P2 
P3 
P P 
P 
consistency 
consistency requirements have been formalized under several names, 
e.g., 
consensus 
atomic broadcast 
Byzantine Generals problem 
Byzantine agreement 
atomic commitment 
de
nitions are similar but may have subtle dierences 
We use the famous Byzantine Generals to introduce this problem 
domain. . . 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 18 / 52
Fault tolerance { The Byzantine generals problem 
Wiktionary: 
Byzantine: adj. of a devious, usually stealthy manner, of practice. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 19 / 52
Fault tolerance { The Byzantine generals problem 
Lamport (this year's Turing laureate), Shostak, and Pease wrote in their 
Dijkstra Prize in Distributed Computing winning paper (Lamport et al., 
1982): 
[: : :] several divisions of the Byzantine army are camped outside 
an enemy city, each division commanded by its own general. [: : :] 
However, some of the generals may be traitors [: : :] 
if the divisions of loyal generals attack together, the city falls 
if only some loyal generals attack, their armies fall 
generals communicate by obedient messengers 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 20 / 52
Fault tolerance { The Byzantine generals problem 
Lamport (this year's Turing laureate), Shostak, and Pease wrote in their 
Dijkstra Prize in Distributed Computing winning paper (Lamport et al., 
1982): 
[: : :] several divisions of the Byzantine army are camped outside 
an enemy city, each division commanded by its own general. [: : :] 
However, some of the generals may be traitors [: : :] 
if the divisions of loyal generals attack together, the city falls 
if only some loyal generals attack, their armies fall 
generals communicate by obedient messengers 
The Byzantine generals problem: 
the loyal generals have to agree on whether to attack. 
if all want to attack they must attack, if no-one wants to attack they 
must not attack 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 20 / 52
Fault tolerance { The Byzantine generals problem 
Lamport (this year's Turing laureate), Shostak, and Pease wrote in their 
Dijkstra Prize in Distributed Computing winning paper (Lamport et al., 
1982): 
[: : :] several divisions of the Byzantine army are camped outside 
an enemy city, each division commanded by its own general. [: : :] 
However, some of the generals may be traitors [: : :] 
if the divisions of loyal generals attack together, the city falls 
if only some loyal generals attack, their armies fall 
generals communicate by obedient messengers 
The Byzantine generals problem: 
the loyal generals have to agree on whether to attack. 
if all want to attack they must attack, if no-one wants to attack they 
must not attack 
metaphor for a distributed system where correct processes (loyal generals) 
act as one in the presence of faulty processes (traitors) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 20 / 52
Byzantine generals problem cont. 
In the absence of faults it is trivial to solve: 
send proposed plan (attack or not attack) to all 
wait until received messages from everyone 
if a process proposed attack decide to attack 
otherwise, decide to not attack 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 21 / 52
Byzantine generals problem cont. 
In the absence of faults it is trivial to solve: 
send proposed plan (attack or not attack) to all 
wait until received messages from everyone 
if a process proposed attack decide to attack 
otherwise, decide to not attack 
In the presence of faults it becomes tricky 
if a process may crash, some processes may not receive messages 
from everyone (but some may) 
if a process may send faulty messages, contradictory information may 
be received, e.g., 
A tells B that C told A that C wants to attack, while C tells B 
that C does not want to attack 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 21 / 52
Byzantine generals problem cont. 
In the absence of faults it is trivial to solve: 
send proposed plan (attack or not attack) to all 
wait until received messages from everyone 
if a process proposed attack decide to attack 
otherwise, decide to not attack 
In the presence of faults it becomes tricky 
if a process may crash, some processes may not receive messages 
from everyone (but some may) 
if a process may send faulty messages, contradictory information may 
be received, e.g., 
A tells B that C told A that C wants to attack, while C tells B 
that C does not want to attack Who is lying to whom? 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 21 / 52
Fault-tolerant distributed algorithms 
n 
n processes communicate by messages (reliable communication) 
all processes know that at most t of them might be faulty 
f are actually faulty 
resilience conditions, e.g., n  3t ^ t  f  0 
no masquerading: the processes know the origin of incoming messages 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 22 / 52
Fault-tolerant distributed algorithms 
n 
? 
? 
? 
t 
n processes communicate by messages (reliable communication) 
all processes know that at most t of them might be faulty 
f are actually faulty 
resilience conditions, e.g., n  3t ^ t  f  0 
no masquerading: the processes know the origin of incoming messages 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 22 / 52
Fault-tolerant distributed algorithms 
n 
? 
? 
? 
t f 
n processes communicate by messages (reliable communication) 
all processes know that at most t of them might be faulty 
f are actually faulty 
resilience conditions, e.g., n  3t ^ t  f  0 
no masquerading: the processes know the origin of incoming messages 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 22 / 52
Fault models|abstractions of reality 
clean crashes: least severe 
faulty processes prematurely halt after/before send to all 
crash faults: 
faulty processes prematurely halt (also) in the middle of send to all 
omission faults: 
faulty processes follow the algorithm, but some messages sent by them 
might be lost 
symmetric faults: 
faulty processes send arbitrarily to all or nobody 
Byzantine faults: most severe 
faulty processes can do anything 
encompass all behaviors of above models 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 23 / 52
Fault models|the ugly truth 
A photo of a Byzantine fault: 
photo by Driscoll (Honeywell) 
he reports Byzantine behavior on the Space Shuttle computer network 
other sources of faults: bit-
ips in memory, power outage, disconnection 
from the network, etc. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 24 / 52
Model vs. reality: impossibilities 
Hence, we would like the weakest assumptions possible. But there are 
theoretical limits on how weak assumptions can be made: 
consensus is impossible in asynchronous systems if there may be a 
crash fault, i.e., t = 1 (Fischer et al., 1985) 
consensus is possible in synchronous systems in the presence of 
Byzantine faults i n  3t (Lamport et al., 1982) 
consensus is impossible in (synchronous) round-based systems if 
bn=2c messages can be lost per round (Santoro  Widmayer, 1989) 
fast Byzantine consensus is solvable i n  5t (Martin  Alvisi, 2006) 
32 dierent degrees of synchrony and whether consensus can be 
solved in the presence of how many faults investigated in (Dolev 
et al., 1987) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 25 / 52
Model vs. reality: impossibilities 
Hence, we would like the weakest assumptions possible. But there are 
theoretical limits on how weak assumptions can be made: 
consensus is impossible in asynchronous systems if there may be a 
crash fault, i.e., t = 1 (Fischer et al., 1985) 
consensus is possible in synchronous systems in the presence of 
Byzantine faults i n  3t (Lamport et al., 1982) 
consensus is impossible in (synchronous) round-based systems if 
bn=2c messages can be lost per round (Santoro  Widmayer, 1989) 
fast Byzantine consensus is solvable i n  5t (Martin  Alvisi, 2006) 
32 dierent degrees of synchrony and whether consensus can be 
solved in the presence of how many faults investigated in (Dolev 
et al., 1987) 
arithmetic resilience conditions play crucial role! 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 25 / 52
After this excursion to faults, let's 
go back to the problem of de
ning 
consistency 
(asynchronous systems) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 26 / 52
De
ning consistency|e.g., binary consensus 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 27 / 52
De
ning consistency|e.g., binary consensus 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
agreement. No two correct processes decide on dierent value. 
either all attack or no-one 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 27 / 52
De
ning consistency|e.g., binary consensus 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
agreement. No two correct processes decide on dierent value. 
either all attack or no-one 
validity. If all correct processes have the same initial value v, then v 
is the only possible decision value 
the decision on whether to attack must be consistent with 
the will of at least one loyal general 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 27 / 52
De
ning consistency|e.g., binary consensus 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
agreement. No two correct processes decide on dierent value. 
either all attack or no-one 
validity. If all correct processes have the same initial value v, then v 
is the only possible decision value 
the decision on whether to attack must be consistent with 
the will of at least one loyal general 
termination. Every correct process eventually decides. 
at some point negotiations must be over 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 27 / 52
De
ning consistency|e.g., binary consensus 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
agreement. No two correct processes decide on dierent value. 
either all attack or no-one 
validity. If all correct processes have the same initial value v, then v 
is the only possible decision value 
the decision on whether to attack must be consistent with 
the will of at least one loyal general 
termination. Every correct process eventually decides. 
at some point negotiations must be over 
Interplay of safety and liveness makes the problem hard. . . 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 27 / 52
What if only two properties have to be satis
ed? 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
validity. If all correct processes have the same initial value v, then v 
is the only possible decision value. 
termination. Every correct process eventually decides. 
Give an algorithm that solves validity and termination! 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
What if only two properties have to be satis
ed? 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
validity. If all correct processes have the same initial value v, then v 
is the only possible decision value. 
termination. Every correct process eventually decides. 
Solution: decide my own proposed value. (no need to agree) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
What if only two properties have to be satis
ed? 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
agreement. No two correct processes decide on dierent value. 
termination. Every correct process eventually decides. 
Give an algorithm that solves agreement and termination! 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
What if only two properties have to be satis
ed? 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
agreement. No two correct processes decide on dierent value. 
termination. Every correct process eventually decides. 
Solution: decide 0. (no relation to initial values required) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
What if only two properties have to be satis
ed? 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
agreement. No two correct processes decide on dierent value. 
validity. If all correct processes have the same initial value v, then v 
is the only possible decision value. 
Give an algorithm that solves agreement and validity! 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
What if only two properties have to be satis
ed? 
Every process has some initial value v 2 f0; 1g and has to decide 
irrevocably on some value in concordance with the following properties: 
agreement. No two correct processes decide on dierent value. 
validity. If all correct processes have the same initial value v, then v 
is the only possible decision value. 
Solution: do nothing (doing nothing is always safe) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
Wrap-up: Intro to FTDAs 
distributed systems 
replication and consistency 
synchronous vs. asynchronous 
fault models 
example for an agreement problem: Byzantine Generals 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 29 / 52
Our case study. . . 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 30 / 52
Asynchronous FTDAs 
In this lecture we consider methods for asynchronous FTDAs that either 
solve problems that are less hard than consensus: 
reliable broadcast. termination required only for speci
c initial state 
(Srikanth  Toueg, 1987). [Veri
ed in Parts II, III, V] 
condition-based consensus properties required only in runs from 
speci
c initial states (Mostefaoui et al., 2003) 
[Veri
ed in Part II] 
The Paxos idea fault-tolerant distributed algorithms that are safe and 
make progress only if you are lucky (Lamport, 1998) 
[Serious challenge] 
are asynchronous but use information on faults as a black box 
failure detector based atomic commitment. distributed databases 
(Raynal, 1997) [Challenge] 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 31 / 52
Asynchronous FTDAs 
In this lecture we consider methods for asynchronous FTDAs that either 
solve problems that are less hard than consensus: 
reliable broadcast. termination required only for speci
c initial state 
(Srikanth  Toueg, 1987). [Veri
ed in Parts II, III, V] 
condition-based consensus properties required only in runs from 
speci
c initial states (Mostefaoui et al., 2003) 
[Veri
ed in Part II] 
The Paxos idea fault-tolerant distributed algorithms that are safe and 
make progress only if you are lucky (Lamport, 1998) 
[Serious challenge] 
are asynchronous but use information on faults as a black box 
failure detector based atomic commitment. distributed databases 
(Raynal, 1997) [Challenge] 
We use the algorithm from (Srikanth  Toueg, 1987) as running example 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 31 / 52
Asynchronous Reliable Broadcast (Srikanth  Toueg, 87) 
The core of the classic broadcast algorithm from the DA literature. 
1 Variables of process i 
2 vi : f0 , 1g ini t i a l l y 0 or 1 
3 accepti : f0 , 1g ini t i a l l y 0 
4 
5 An atomic step: 
6 i f vi = 1 
7 then send ( echo ) to all ; 
8 
9 i f received (echo) from at l e a s t 
10 t + 1 distinct p r o c e s s e s 
11 and not s ent ( echo ) be f o r e 
12 then send ( echo ) to all ; 
13 
14 i f received ( echo ) from at l e a s t 
15 n - t distinct p r o c e s s e s 
16 then accepti := 1 ; 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 32 / 52
Assumptions from (Srikanth  Toueg, 87) 
asynchronous interleaving 
reliable message passing (no bounds on message delays) 
at most t Byzantine faults 
resilience condition: n  3t ^ t  f 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 33 / 52
The spec of our case-study 
Unforgeability. If vi = false for all correct processes i , then for all correct 
processes j , acceptj remains false forever. 
Completeness. If vi = true for all correct processes i , then there is a 
correct process j that eventually sets acceptj to true. 
Relay. If a correct process i sets accepti to true, then eventually all 
correct processes j set acceptj to true. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 34 / 52
The spec of our case-study 
Unforgeability. If vi = false for all correct processes i , then for all correct 
processes j , acceptj remains false forever. 
if no loyal general wants to attack, then traitors should not 
be able to force one. 
Completeness. If vi = true for all correct processes i , then there is a 
correct process j that eventually sets acceptj to true. 
If all loyal generals want to attack, there shall be an attack. 
Relay. If a correct process i sets accepti to true, then eventually all 
correct processes j set acceptj to true. 
If one loyal general attacks, then all loyal generals should attack. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 34 / 52
The spec of our case-study 
Unforgeability. If vi = false for all correct processes i , then for all correct 
processes j , acceptj remains false forever. 
if no loyal general wants to attack, then traitors should not 
be able to force one. 
Completeness. If vi = true for all correct processes i , then there is a 
correct process j that eventually sets acceptj to true. 
If all loyal generals want to attack, there shall be an attack. 
Relay. If a correct process i sets accepti to true, then eventually all 
correct processes j set acceptj to true. 
If one loyal general attacks, then all loyal generals should attack. 
These are the specs as given in literature: they can be formalized in LTL 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 34 / 52
Reliable broadcast vs. Consensus 
Reliable broadcast: Completeness. If vi = true for all correct processes i , 
then there is a correct process j that eventually sets acceptj 
to true. 
Consensus: Termination. Every correct process eventually decides. 
Dierence: 
Completeness requires to do something only if 8i : vi = true, 
i.e., only for one speci
c initial state 
Termination requires to do something in all runs (from all initial 
states) 
weakening of spec makes reliable broadcast solvable in async, 
while consensus is not solvable 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 35 / 52
Asynchronous Reliable Broadcast (Srikanth  Toueg, 87) 
The core of the classic broadcast algorithm from the DA literature. 
1 Variables of process i 
2 vi : f0 , 1g ini t i a l l y 0 or 1 
3 accepti : f0 , 1g ini t i a l l y 0 
4 
5 An atomic step: 
6 i f vi = 1 
7 then send ( echo ) to all ; 
8 
9 i f received (echo) from at l e a s t 
10 t + 1 distinct p r o c e s s e s 
11 and not s ent ( echo ) be f o r e 
12 then send ( echo ) to all ; 
13 
14 i f received ( echo ) from at l e a s t 
15 n - t distinct p r o c e s s e s 
16 then accepti := 1 ; 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 36 / 52
Threshold-Guarded Distributed Algorithms 
Standard construct: quanti
ed guards (t=f=0) 
Existential Guard 
if received m from some process then ... 
Universal Guard 
if received m from all processes then ... 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 37 / 52
Threshold-Guarded Distributed Algorithms 
Standard construct: quanti
ed guards (t=f=0) 
Existential Guard 
if received m from some process then ... 
Universal Guard 
if received m from all processes then ... 
what if faults might occur? 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 37 / 52
Threshold-Guarded Distributed Algorithms 
Standard construct: quanti
ed guards (t=f=0) 
Existential Guard 
if received m from some process then ... 
Universal Guard 
if received m from all processes then ... 
what if faults might occur? 
Fault-Tolerant Algorithms: n processes, at most t are Byzantine 
Threshold Guard 
if received m from n  t processes then ... 
(the processes cannot refer to f!) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 37 / 52
Basic mechanisms used by the algorithm: thresholds 
n 
t f 
t + 1 
if received m from t + 1 processes then ... 
(threshold) 
Correct processes count distinct incoming messages 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 38 / 52
Basic mechanisms used by the algorithm: thresholds 
n 
t f 
t + 1 
if received m from t + 1 processes then ... 
(threshold) 
Correct processes count distinct incoming messages 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 38 / 52
Basic mechanisms used by the algorithm: thresholds 
n 
t f 
t + 1 
at least one non-faulty sent the message 
if received m from t + 1 processes then ... 
(threshold) 
Correct processes count distinct incoming messages 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 38 / 52
Classic correctness argument| 
hand-written proofs 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 39 / 52
Proof: Unforgeability 
If vi = false for all correct processes i , then for all correct processes j , 
acceptj remains false forever. 
1 Variables of process i 
2 vi : f0 , 1g ini t i a l l y 0 or 1 
3 accepti : f0 , 1g ini t i a l l y 0 
4 
5 An atomic step: 
6 i f vi = 1 
7 then send ( echo ) to all ; 
8 
9 i f received (echo) from at l e a s t 
10 t + 1 distinct p r o c e s s e s 
11 and not s ent ( echo ) be f o r e 
12 then send ( echo ) to all ; 
13 
14 i f received ( echo ) from at l e a s t 
15 n - t distinct p r o c e s s e s 
16 then accepti := 1 ; 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
Proof: Unforgeability 
If vi = false for all correct processes i , then for all correct processes j , 
acceptj remains false forever. 
1 Variables of process i 
2 vi : f0 , 1g ini t i a l l y 0 or 1 
3 accepti : f0 , 1g ini t i a l l y 0 
4 
5 An atomic step: 
6 i f vi = 1 
7 then send ( echo ) to all ; 
8 
9 i f received (echo) from at l e a s t 
10 t + 1 distinct p r o c e s s e s 
11 and not s ent ( echo ) be f o r e 
12 then send ( echo ) to all ; 
13 
14 i f received ( echo ) from at l e a s t 
15 n - t distinct p r o c e s s e s 
16 then accepti := 1 ; 
By contradiction assume a 
correct process sets acceptj = 1 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
Proof: Unforgeability 
If vi = false for all correct processes i , then for all correct processes j , 
acceptj remains false forever. 
1 Variables of process i 
2 vi : f0 , 1g ini t i a l l y 0 or 1 
3 accepti : f0 , 1g ini t i a l l y 0 
4 
5 An atomic step: 
6 i f vi = 1 
7 then send ( echo ) to all ; 
8 
9 i f received (echo) from at l e a s t 
10 t + 1 distinct p r o c e s s e s 
11 and not s ent ( echo ) be f o r e 
12 then send ( echo ) to all ; 
13 
14 i f received ( echo ) from at l e a s t 
15 n - t distinct p r o c e s s e s 
16 then accepti := 1 ; 
By contradiction assume a 
correct process sets acceptj = 1 
Thus it has executed line 16 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
Proof: Unforgeability 
If vi = false for all correct processes i , then for all correct processes j , 
acceptj remains false forever. 
1 Variables of process i 
2 vi : f0 , 1g ini t i a l l y 0 or 1 
3 accepti : f0 , 1g ini t i a l l y 0 
4 
5 An atomic step: 
6 i f vi = 1 
7 then send ( echo ) to all ; 
8 
9 i f received (echo) from at l e a s t 
10 t + 1 distinct p r o c e s s e s 
11 and not s ent ( echo ) be f o r e 
12 then send ( echo ) to all ; 
13 
14 i f received ( echo ) from at l e a s t 
15 n - t distinct p r o c e s s e s 
16 then accepti := 1 ; 
By contradiction assume a 
correct process sets acceptj = 1 
Thus it has executed line 16 
Thus it has received n  t 
messages by distinct processes 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
Proof: Unforgeability 
If vi = false for all correct processes i , then for all correct processes j , 
acceptj remains false forever. 
1 Variables of process i 
2 vi : f0 , 1g ini t i a l l y 0 or 1 
3 accepti : f0 , 1g ini t i a l l y 0 
4 
5 An atomic step: 
6 i f vi = 1 
7 then send ( echo ) to all ; 
8 
9 i f received (echo) from at l e a s t 
10 t + 1 distinct p r o c e s s e s 
11 and not s ent ( echo ) be f o r e 
12 then send ( echo ) to all ; 
13 
14 i f received ( echo ) from at l e a s t 
15 n - t distinct p r o c e s s e s 
16 then accepti := 1 ; 
By contradiction assume a 
correct process sets acceptj = 1 
Thus it has executed line 16 
Thus it has received n  t 
messages by distinct processes 
That means messages by at 
n  2t correct processes 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
Proof: Unforgeability 
If vi = false for all correct processes i , then for all correct processes j , 
acceptj remains false forever. 
1 Variables of process i 
2 vi : f0 , 1g ini t i a l l y 0 or 1 
3 accepti : f0 , 1g ini t i a l l y 0 
4 
5 An atomic step: 
6 i f vi = 1 
7 then send ( echo ) to all ; 
8 
9 i f received (echo) from at l e a s t 
10 t + 1 distinct p r o c e s s e s 
11 and not s ent ( echo ) be f o r e 
12 then send ( echo ) to all ; 
13 
14 i f received ( echo ) from at l e a s t 
15 n - t distinct p r o c e s s e s 
16 then accepti := 1 ; 
By contradiction assume a 
correct process sets acceptj = 1 
Thus it has executed line 16 
Thus it has received n  t 
messages by distinct processes 
That means messages by at 
n  2t correct processes 
Let p be the

More Related Content

What's hot

How Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protectionsHow Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protectionsJonathan Salwan
 
Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Jonathan Salwan
 
Process synchronization(deepa)
Process synchronization(deepa)Process synchronization(deepa)
Process synchronization(deepa)Nagarajan
 
Agilent flash programming agilent utility card versus deep serial memory-ca...
Agilent flash programming   agilent utility card versus deep serial memory-ca...Agilent flash programming   agilent utility card versus deep serial memory-ca...
Agilent flash programming agilent utility card versus deep serial memory-ca...AgilentT&M EMEA
 
Μεταπρογραµµατισµός κώδικα Python σε γλώσσα γραµµικού χρόνου για αυτόµατη επα...
Μεταπρογραµµατισµός κώδικα Python σε γλώσσα γραµµικού χρόνου για αυτόµατη επα...Μεταπρογραµµατισµός κώδικα Python σε γλώσσα γραµµικού χρόνου για αυτόµατη επα...
Μεταπρογραµµατισµός κώδικα Python σε γλώσσα γραµµικού χρόνου για αυτόµατη επα...ISSEL
 
20080501 software verification_sharygina_lecture01
20080501 software verification_sharygina_lecture0120080501 software verification_sharygina_lecture01
20080501 software verification_sharygina_lecture01Computer Science Club
 
الإعتيان والتبديل التمثيلي الرقمي
الإعتيان والتبديل التمثيلي الرقميالإعتيان والتبديل التمثيلي الرقمي
الإعتيان والتبديل التمثيلي الرقميDr. Munthear Alqaderi
 
Pragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization ApproachesPragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization ApproachesMarina Kolpakova
 
Sensors and Journalism Seminar, University of British Columbia - Day 1
Sensors and Journalism Seminar, University of British Columbia - Day 1Sensors and Journalism Seminar, University of British Columbia - Day 1
Sensors and Journalism Seminar, University of British Columbia - Day 1ferguspitt
 
Operating System - Monitors (Presentation)
Operating System - Monitors (Presentation)Operating System - Monitors (Presentation)
Operating System - Monitors (Presentation)Experts Desk
 
Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...Marina Kolpakova
 
20051019 automating regression testing for evolving gui software
20051019 automating regression testing for evolving gui software20051019 automating regression testing for evolving gui software
20051019 automating regression testing for evolving gui softwareWill Shen
 
Mca ii os u-2 process management & communication
Mca  ii  os u-2 process management & communicationMca  ii  os u-2 process management & communication
Mca ii os u-2 process management & communicationRai University
 

What's hot (20)

How Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protectionsHow Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protections
 
Monitors
MonitorsMonitors
Monitors
 
Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes
 
Semaphores
SemaphoresSemaphores
Semaphores
 
Process synchronization(deepa)
Process synchronization(deepa)Process synchronization(deepa)
Process synchronization(deepa)
 
Os3
Os3Os3
Os3
 
SYNCHRONIZATION
SYNCHRONIZATIONSYNCHRONIZATION
SYNCHRONIZATION
 
Agilent flash programming agilent utility card versus deep serial memory-ca...
Agilent flash programming   agilent utility card versus deep serial memory-ca...Agilent flash programming   agilent utility card versus deep serial memory-ca...
Agilent flash programming agilent utility card versus deep serial memory-ca...
 
Μεταπρογραµµατισµός κώδικα Python σε γλώσσα γραµµικού χρόνου για αυτόµατη επα...
Μεταπρογραµµατισµός κώδικα Python σε γλώσσα γραµµικού χρόνου για αυτόµατη επα...Μεταπρογραµµατισµός κώδικα Python σε γλώσσα γραµµικού χρόνου για αυτόµατη επα...
Μεταπρογραµµατισµός κώδικα Python σε γλώσσα γραµµικού χρόνου για αυτόµατη επα...
 
Real time SHVC decoder
Real time SHVC decoderReal time SHVC decoder
Real time SHVC decoder
 
20080501 software verification_sharygina_lecture01
20080501 software verification_sharygina_lecture0120080501 software verification_sharygina_lecture01
20080501 software verification_sharygina_lecture01
 
الإعتيان والتبديل التمثيلي الرقمي
الإعتيان والتبديل التمثيلي الرقميالإعتيان والتبديل التمثيلي الرقمي
الإعتيان والتبديل التمثيلي الرقمي
 
Pragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization ApproachesPragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization Approaches
 
SCP
SCPSCP
SCP
 
Sensors and Journalism Seminar, University of British Columbia - Day 1
Sensors and Journalism Seminar, University of British Columbia - Day 1Sensors and Journalism Seminar, University of British Columbia - Day 1
Sensors and Journalism Seminar, University of British Columbia - Day 1
 
Operating System - Monitors (Presentation)
Operating System - Monitors (Presentation)Operating System - Monitors (Presentation)
Operating System - Monitors (Presentation)
 
Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...
 
Code GPU with CUDA - SIMT
Code GPU with CUDA - SIMTCode GPU with CUDA - SIMT
Code GPU with CUDA - SIMT
 
20051019 automating regression testing for evolving gui software
20051019 automating regression testing for evolving gui software20051019 automating regression testing for evolving gui software
20051019 automating regression testing for evolving gui software
 
Mca ii os u-2 process management & communication
Mca  ii  os u-2 process management & communicationMca  ii  os u-2 process management & communication
Mca ii os u-2 process management & communication
 

Similar to Introduction into Fault-tolerant Distributed Algorithms and their Modeling (Part1)

Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...Iosif Itkin
 
Crash Analysis with Reverse Taint
Crash Analysis with Reverse TaintCrash Analysis with Reverse Taint
Crash Analysis with Reverse Taintmarekzmyslowski
 
Eventual Consistency - Du musst keine Angst haben
Eventual Consistency - Du musst keine Angst habenEventual Consistency - Du musst keine Angst haben
Eventual Consistency - Du musst keine Angst habenSusanne Braun
 
OOP 2021 - Eventual Consistency - Du musst keine Angst haben
OOP 2021 - Eventual Consistency - Du musst keine Angst habenOOP 2021 - Eventual Consistency - Du musst keine Angst haben
OOP 2021 - Eventual Consistency - Du musst keine Angst habenSusanne Braun
 
An Architectural Concept for Intrusion Tolerance in Air Traffic Networks
An Architectural Concept for Intrusion Tolerance in Air Traffic NetworksAn Architectural Concept for Intrusion Tolerance in Air Traffic Networks
An Architectural Concept for Intrusion Tolerance in Air Traffic NetworksÜlger Ahmet
 
Data Engineering for Data Scientists
Data Engineering for Data Scientists Data Engineering for Data Scientists
Data Engineering for Data Scientists jlacefie
 
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...NETWAYS
 
Joe armstrong erlanga_languageforprogrammingreliablesystems
Joe armstrong erlanga_languageforprogrammingreliablesystemsJoe armstrong erlanga_languageforprogrammingreliablesystems
Joe armstrong erlanga_languageforprogrammingreliablesystemsSentifi
 
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...Maksim Shudrak
 
Identity Providers-as-a-Service built as Cloud-of-Clouds: challenges and oppo...
Identity Providers-as-a-Service built as Cloud-of-Clouds: challenges and oppo...Identity Providers-as-a-Service built as Cloud-of-Clouds: challenges and oppo...
Identity Providers-as-a-Service built as Cloud-of-Clouds: challenges and oppo...Diego Kreutz
 
The Computer Science Behind a modern Distributed Database
The Computer Science Behind a modern Distributed DatabaseThe Computer Science Behind a modern Distributed Database
The Computer Science Behind a modern Distributed DatabaseArangoDB Database
 
The computer science behind a modern disributed data store
The computer science behind a modern disributed data storeThe computer science behind a modern disributed data store
The computer science behind a modern disributed data storeJ On The Beach
 
An Ad-hoc Smart Gateway Platform for the Web of Things (IEEE iThings 2013 Bes...
An Ad-hoc Smart Gateway Platform for the Web of Things (IEEE iThings 2013 Bes...An Ad-hoc Smart Gateway Platform for the Web of Things (IEEE iThings 2013 Bes...
An Ad-hoc Smart Gateway Platform for the Web of Things (IEEE iThings 2013 Bes...Darren Carlson
 
DSD-INT 2022 Singularity containers - simplifying the use of Delft3D FM on Hi...
DSD-INT 2022 Singularity containers - simplifying the use of Delft3D FM on Hi...DSD-INT 2022 Singularity containers - simplifying the use of Delft3D FM on Hi...
DSD-INT 2022 Singularity containers - simplifying the use of Delft3D FM on Hi...Deltares
 
Eventual Consistency - JUG DA
Eventual Consistency - JUG DAEventual Consistency - JUG DA
Eventual Consistency - JUG DASusanne Braun
 
What the music of the 1980s taught me about shipping software
What the music of the 1980s taught me about shipping softwareWhat the music of the 1980s taught me about shipping software
What the music of the 1980s taught me about shipping softwareMichael Ewins
 
Enabling Worm and Malware Investigation Using Virtualization
Enabling Worm and Malware Investigation Using VirtualizationEnabling Worm and Malware Investigation Using Virtualization
Enabling Worm and Malware Investigation Using Virtualizationamiable_indian
 
The influence of "Distributed platforms" on #devops
The influence of "Distributed platforms" on #devopsThe influence of "Distributed platforms" on #devops
The influence of "Distributed platforms" on #devopsKris Buytaert
 
FusionInventory at LSM/RMLL 2012
FusionInventory at LSM/RMLL 2012FusionInventory at LSM/RMLL 2012
FusionInventory at LSM/RMLL 2012Nouh Walid
 

Similar to Introduction into Fault-tolerant Distributed Algorithms and their Modeling (Part1) (20)

Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
 
Crash Analysis with Reverse Taint
Crash Analysis with Reverse TaintCrash Analysis with Reverse Taint
Crash Analysis with Reverse Taint
 
Eventual Consistency - Du musst keine Angst haben
Eventual Consistency - Du musst keine Angst habenEventual Consistency - Du musst keine Angst haben
Eventual Consistency - Du musst keine Angst haben
 
OOP 2021 - Eventual Consistency - Du musst keine Angst haben
OOP 2021 - Eventual Consistency - Du musst keine Angst habenOOP 2021 - Eventual Consistency - Du musst keine Angst haben
OOP 2021 - Eventual Consistency - Du musst keine Angst haben
 
An Architectural Concept for Intrusion Tolerance in Air Traffic Networks
An Architectural Concept for Intrusion Tolerance in Air Traffic NetworksAn Architectural Concept for Intrusion Tolerance in Air Traffic Networks
An Architectural Concept for Intrusion Tolerance in Air Traffic Networks
 
Data Engineering for Data Scientists
Data Engineering for Data Scientists Data Engineering for Data Scientists
Data Engineering for Data Scientists
 
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
 
Joe armstrong erlanga_languageforprogrammingreliablesystems
Joe armstrong erlanga_languageforprogrammingreliablesystemsJoe armstrong erlanga_languageforprogrammingreliablesystems
Joe armstrong erlanga_languageforprogrammingreliablesystems
 
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
Fuzzing malware for fun & profit. Applying Coverage-Guided Fuzzing to Find Bu...
 
Identity Providers-as-a-Service built as Cloud-of-Clouds: challenges and oppo...
Identity Providers-as-a-Service built as Cloud-of-Clouds: challenges and oppo...Identity Providers-as-a-Service built as Cloud-of-Clouds: challenges and oppo...
Identity Providers-as-a-Service built as Cloud-of-Clouds: challenges and oppo...
 
The Computer Science Behind a modern Distributed Database
The Computer Science Behind a modern Distributed DatabaseThe Computer Science Behind a modern Distributed Database
The Computer Science Behind a modern Distributed Database
 
The computer science behind a modern disributed data store
The computer science behind a modern disributed data storeThe computer science behind a modern disributed data store
The computer science behind a modern disributed data store
 
An Ad-hoc Smart Gateway Platform for the Web of Things (IEEE iThings 2013 Bes...
An Ad-hoc Smart Gateway Platform for the Web of Things (IEEE iThings 2013 Bes...An Ad-hoc Smart Gateway Platform for the Web of Things (IEEE iThings 2013 Bes...
An Ad-hoc Smart Gateway Platform for the Web of Things (IEEE iThings 2013 Bes...
 
DSD-INT 2022 Singularity containers - simplifying the use of Delft3D FM on Hi...
DSD-INT 2022 Singularity containers - simplifying the use of Delft3D FM on Hi...DSD-INT 2022 Singularity containers - simplifying the use of Delft3D FM on Hi...
DSD-INT 2022 Singularity containers - simplifying the use of Delft3D FM on Hi...
 
Eventual Consistency - JUG DA
Eventual Consistency - JUG DAEventual Consistency - JUG DA
Eventual Consistency - JUG DA
 
What the music of the 1980s taught me about shipping software
What the music of the 1980s taught me about shipping softwareWhat the music of the 1980s taught me about shipping software
What the music of the 1980s taught me about shipping software
 
Enabling Worm and Malware Investigation Using Virtualization
Enabling Worm and Malware Investigation Using VirtualizationEnabling Worm and Malware Investigation Using Virtualization
Enabling Worm and Malware Investigation Using Virtualization
 
The influence of "Distributed platforms" on #devops
The influence of "Distributed platforms" on #devopsThe influence of "Distributed platforms" on #devops
The influence of "Distributed platforms" on #devops
 
FusionInventory at LSM/RMLL 2012
FusionInventory at LSM/RMLL 2012FusionInventory at LSM/RMLL 2012
FusionInventory at LSM/RMLL 2012
 
Vert.x
Vert.xVert.x
Vert.x
 

More from Iosif Itkin

Foundations of Software Testing Lecture 4
Foundations of Software Testing Lecture 4Foundations of Software Testing Lecture 4
Foundations of Software Testing Lecture 4Iosif Itkin
 
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...Iosif Itkin
 
Exactpro FinTech Webinar - Global Exchanges Test Oracles
Exactpro FinTech Webinar - Global Exchanges Test OraclesExactpro FinTech Webinar - Global Exchanges Test Oracles
Exactpro FinTech Webinar - Global Exchanges Test OraclesIosif Itkin
 
Exactpro FinTech Webinar - Global Exchanges FIX Protocol
Exactpro FinTech Webinar - Global Exchanges FIX ProtocolExactpro FinTech Webinar - Global Exchanges FIX Protocol
Exactpro FinTech Webinar - Global Exchanges FIX ProtocolIosif Itkin
 
Operational Resilience in Financial Market Infrastructures
Operational Resilience in Financial Market InfrastructuresOperational Resilience in Financial Market Infrastructures
Operational Resilience in Financial Market InfrastructuresIosif Itkin
 
20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season
20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season
20 Simple Questions from Exactpro for Your Enjoyment This Holiday SeasonIosif Itkin
 
Testing the Intelligence of your AI
Testing the Intelligence of your AITesting the Intelligence of your AI
Testing the Intelligence of your AIIosif Itkin
 
EXTENT 2019: Exactpro Quality Assurance for Financial Market Infrastructures
EXTENT 2019: Exactpro Quality Assurance for Financial Market InfrastructuresEXTENT 2019: Exactpro Quality Assurance for Financial Market Infrastructures
EXTENT 2019: Exactpro Quality Assurance for Financial Market InfrastructuresIosif Itkin
 
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...Iosif Itkin
 
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan Shamrai
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan ShamraiEXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan Shamrai
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan ShamraiIosif Itkin
 
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference Open
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference OpenEXTENT Talks QA Community Tbilisi 20 April 2019 - Conference Open
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference OpenIosif Itkin
 
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...Iosif Itkin
 
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...Iosif Itkin
 
QA Community Saratov: Past, Present, Future (2019-02-08)
QA Community Saratov: Past, Present, Future (2019-02-08)QA Community Saratov: Past, Present, Future (2019-02-08)
QA Community Saratov: Past, Present, Future (2019-02-08)Iosif Itkin
 
Machine Learning and RoboCop Testing
Machine Learning and RoboCop TestingMachine Learning and RoboCop Testing
Machine Learning and RoboCop TestingIosif Itkin
 
Behaviour Driven Development: Oltre i limiti del possibile
Behaviour Driven Development: Oltre i limiti del possibileBehaviour Driven Development: Oltre i limiti del possibile
Behaviour Driven Development: Oltre i limiti del possibileIosif Itkin
 
2018 - Exactpro Year in Review
2018 - Exactpro Year in Review2018 - Exactpro Year in Review
2018 - Exactpro Year in ReviewIosif Itkin
 
Exactpro Discussion about Joy and Strategy
Exactpro Discussion about Joy and StrategyExactpro Discussion about Joy and Strategy
Exactpro Discussion about Joy and StrategyIosif Itkin
 
FIX EMEA Conference 2018 - Post Trade Software Testing Challenges
FIX EMEA Conference 2018 - Post Trade Software Testing ChallengesFIX EMEA Conference 2018 - Post Trade Software Testing Challenges
FIX EMEA Conference 2018 - Post Trade Software Testing ChallengesIosif Itkin
 
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)Iosif Itkin
 

More from Iosif Itkin (20)

Foundations of Software Testing Lecture 4
Foundations of Software Testing Lecture 4Foundations of Software Testing Lecture 4
Foundations of Software Testing Lecture 4
 
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...
 
Exactpro FinTech Webinar - Global Exchanges Test Oracles
Exactpro FinTech Webinar - Global Exchanges Test OraclesExactpro FinTech Webinar - Global Exchanges Test Oracles
Exactpro FinTech Webinar - Global Exchanges Test Oracles
 
Exactpro FinTech Webinar - Global Exchanges FIX Protocol
Exactpro FinTech Webinar - Global Exchanges FIX ProtocolExactpro FinTech Webinar - Global Exchanges FIX Protocol
Exactpro FinTech Webinar - Global Exchanges FIX Protocol
 
Operational Resilience in Financial Market Infrastructures
Operational Resilience in Financial Market InfrastructuresOperational Resilience in Financial Market Infrastructures
Operational Resilience in Financial Market Infrastructures
 
20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season
20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season
20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season
 
Testing the Intelligence of your AI
Testing the Intelligence of your AITesting the Intelligence of your AI
Testing the Intelligence of your AI
 
EXTENT 2019: Exactpro Quality Assurance for Financial Market Infrastructures
EXTENT 2019: Exactpro Quality Assurance for Financial Market InfrastructuresEXTENT 2019: Exactpro Quality Assurance for Financial Market Infrastructures
EXTENT 2019: Exactpro Quality Assurance for Financial Market Infrastructures
 
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...
 
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan Shamrai
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan ShamraiEXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan Shamrai
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan Shamrai
 
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference Open
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference OpenEXTENT Talks QA Community Tbilisi 20 April 2019 - Conference Open
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference Open
 
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...
 
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...
 
QA Community Saratov: Past, Present, Future (2019-02-08)
QA Community Saratov: Past, Present, Future (2019-02-08)QA Community Saratov: Past, Present, Future (2019-02-08)
QA Community Saratov: Past, Present, Future (2019-02-08)
 
Machine Learning and RoboCop Testing
Machine Learning and RoboCop TestingMachine Learning and RoboCop Testing
Machine Learning and RoboCop Testing
 
Behaviour Driven Development: Oltre i limiti del possibile
Behaviour Driven Development: Oltre i limiti del possibileBehaviour Driven Development: Oltre i limiti del possibile
Behaviour Driven Development: Oltre i limiti del possibile
 
2018 - Exactpro Year in Review
2018 - Exactpro Year in Review2018 - Exactpro Year in Review
2018 - Exactpro Year in Review
 
Exactpro Discussion about Joy and Strategy
Exactpro Discussion about Joy and StrategyExactpro Discussion about Joy and Strategy
Exactpro Discussion about Joy and Strategy
 
FIX EMEA Conference 2018 - Post Trade Software Testing Challenges
FIX EMEA Conference 2018 - Post Trade Software Testing ChallengesFIX EMEA Conference 2018 - Post Trade Software Testing Challenges
FIX EMEA Conference 2018 - Post Trade Software Testing Challenges
 
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)
 

Recently uploaded

REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...Universidade Federal de Sergipe - UFS
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》rnrncn29
 
Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...navyadasi1992
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubaikojalkojal131
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...D. B. S. College Kanpur
 
basic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomybasic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomyDrAnita Sharma
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptxpallavirawat456
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxMurugaveni B
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxRitchAndruAgustin
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
Bioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptxBioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptx023NiWayanAnggiSriWa
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuinethapagita
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 

Recently uploaded (20)

REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
 
Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 
basic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomybasic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomy
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptx
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
Bioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptxBioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptx
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 

Introduction into Fault-tolerant Distributed Algorithms and their Modeling (Part1)

  • 1. Model Checking of Fault-Tolerant Distributed Algorithms Part I: Fault-Tolerant Distributed Algorithms Annu Gmeiner Igor Konnov Ulrich Schmid Helmut Veith Josef Widder TMPA 2014, Kostroma, Russia Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 1 / 52
  • 2. Distributed Systems 114 7.1 Assessing and validating the standard node HITS design Figure 7.1: DARTS prototype board, comprising 8 interconnected HITS chips
  • 3. Distributed Systems 114 7.1 Assessing and validating the standard node HITS design Figure 7.1: DARTS prototype board, comprising 8 interconnected HITS chips Are they always working? Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 2 / 52
  • 4. No. . . some failing systems Therac-25 (1985) radiation therapy machine gave massive overdoses, e.g., due to race conditions of concurrent tasks Quantas Airbus in- ight Learmonth upset (2008) 1 out of 3 replicated components failed computer initiated dangerous altitude drop Ariane 501 maiden ight (1996) primary/backup, i.e., 2 replicated computers both run into the same integer over ow Net ix outages due to Amazon's cloud (ongoing) one is not sure what is going on there hundreds of computers involved Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 3 / 52
  • 5. Why do they fail? Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 4 / 52
  • 6. Why do they fail? faults at design/implementation phase faults at runtime outside of control of designer/developer e.g., to the right: crack in a diode in the data link interface of the Space Shuttle ) led to erroneous messages being sent Driscoll (Honeywell) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 4 / 52
  • 7. Why do they fail? faults at design/implementation phase approach:
  • 9. x faults before operation ) model checking faults at runtime outside of control of designer/developer e.g., to the right: crack in a diode in the data link interface of the Space Shuttle ) led to erroneous messages being sent Driscoll (Honeywell) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 4 / 52
  • 10. Why do they fail? faults at design/implementation phase approach:
  • 12. x faults before operation ) model checking faults at runtime outside of control of designer/developer e.g., to the right: crack in a diode in the data link interface of the Space Shuttle ) led to erroneous messages being sent approach: keep system operational despite faults ) fault-tolerant distributed algorithms Driscoll (Honeywell) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 4 / 52
  • 13. Bringing both together Goal: automatically veri
  • 14. ed fault-tolerant distributed algorithms Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 5 / 52
  • 15. Bringing both together Goal: automatically veri
  • 16. ed fault-tolerant distributed algorithms model checking FTDAs is a research challenge: computers run independently at dierent speeds exchange messages with uncertain delays faults parameterization . . . fault-tolerance makes model checking harder Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 5 / 52
  • 17. Lecture overview Part I: Fault-tolerant distributed algorithms introduction to distributed algorithms details of our case study algorithm motivation why model checking is cool Part II: Modeling fault-tolerant distributed algorithms model checking challenges in distributed algorithms Promela, control ow automata, etc. model checking of small instances with Spin Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 6 / 52
  • 18. Lecture overview cont. We will have to skip due to time limitations: Part III: Introduction into Parameterized model checking the general problem statement well-known undecidability results Igor Konnov: Part IV: Parameterized model checking of FTDAs by abstraction parametric interval abstraction (PIA) PIA data and counter abstraction counterexample-guided abstraction re
  • 19. nement (CEGAR) Part V: Parameterized model checking of FTDAs by BMC bounding the diameter bounded model checking as a complete method Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 7 / 52
  • 20. Part I: Fault-Tolerant Distributed Algorithms Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 8 / 52
  • 21. Distributed Systems are everywhere What they allow to do share resources communicate increase performance speed fault tolerance Dierence to centralized systems independent activities (concurrency) components do not have access to the global state (only local view) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 9 / 52
  • 22. Application areas buzzwords from the 60ies operating systems (distributed) data base systems communication networks multiprocessor architectures control systems New buzzwords cloud computing social networks multi core cyber-physical systems Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 10 / 52
  • 23. Major challenge Uncertainty computers run independently at dierent speeds exchange messages with (unknown) delays faults Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 11 / 52
  • 24. Major challenge Uncertainty computers run independently at dierent speeds exchange messages with (unknown) delays faults challenge in design of distributed algorithms a process has access only to its local state but one wants to achieve some global property Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 11 / 52
  • 25. Major challenge Uncertainty computers run independently at dierent speeds exchange messages with (unknown) delays faults challenge in design of distributed algorithms a process has access only to its local state but one wants to achieve some global property challenge in proving them correct large degree of non-determinism ) large execution and state space Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 11 / 52
  • 26. From dependability to a distributed system P Process P provides a service. We want to access it reliably but P may fail Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 12 / 52
  • 27. From dependability to a distributed system P P1 P2 P3 replication Process P provides a service. We want to access it reliably but P may fail canonical approach: replication, i.e., several copies of P Due to non-determinism, the behavior of the copies might deviate (e.g. in a replicated database, transactions are committed in dierent orders at dierent sites) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 12 / 52
  • 28. From dependability to a distributed system P replication P P P1 P2 P3 P consistency Process P provides a service. We want to access it reliably but P may fail canonical approach: replication, i.e., several copies of P Due to non-determinism, the behavior of the copies might deviate (e.g. in a replicated database, transactions are committed in dierent orders at dierent sites) ) we have to enforce that the copies behave as one. ) Consistency in a distributed system: what does it mean to behave as one. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 12 / 52
  • 29. Replication|distributed systems P P1 P2 P3 replication Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 13 / 52
  • 30. Distributed message passing system multiple distributed processes pi pi send receive internal dots represent states a step of a process can be a send step (a process sends messages to other processes) a receive step (a process receives a subset of messages sent to it) an internal step (a local computation) steps are the atomic (indivisible) units of computations Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 14 / 52
  • 31. Types of Distributed Algorithms: Synchronous vs. Asynchronous Synchronous all processes move in lock-step rounds a message sent in a round is received in the same round idealized view impossible or expensive to implement Asynchronous only one process moves at a time arbitrary interleavings of steps a message sent is received eventually Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 15 / 52
  • 32. Types of Distributed Algorithms: Synchronous vs. Asynchronous Synchronous all processes move in lock-step rounds a message sent in a round is received in the same round idealized view impossible or expensive to implement Asynchronous only one process moves at a time arbitrary interleavings of steps a message sent is received eventually important problems not solvable (Fischer et al., 1985)! Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 15 / 52
  • 33. Types of Distributed Algorithms: Synchronous vs. Asynchronous Synchronous all processes move in lock-step rounds a message sent in a round is received in the same round idealized view impossible or expensive to implement Asynchronous only one process moves at a time arbitrary interleavings of steps a message sent is received eventually important problems not solvable (Fischer et al., 1985)! We focus on asynchronous algorithms here. . . Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 15 / 52
  • 34. Asynchronous system has very mild restrictions on the environment interleaving semantics unbounded message delays very little can be done. . . there is no distributed algorithm that solves consensus in the presence of one faulty process (as we will see, consensus is the paradigm of consistency) folklore explanation: you cannot distinguish a slow process from a crashed one real explanation: see intricate proof by Fischer, Lynch, and Paterson (JACM 1985) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 16 / 52
  • 35. Where we stand P P1 P2 P3 replication Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 17 / 52
  • 36. What we still need. . . P1 P2 P3 P P P consistency Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 18 / 52
  • 37. What we still need. . . P1 P2 P3 P P P consistency consistency requirements have been formalized under several names, e.g., consensus atomic broadcast Byzantine Generals problem Byzantine agreement atomic commitment de
  • 38. nitions are similar but may have subtle dierences Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 18 / 52
  • 39. What we still need. . . P1 P2 P3 P P P consistency consistency requirements have been formalized under several names, e.g., consensus atomic broadcast Byzantine Generals problem Byzantine agreement atomic commitment de
  • 40. nitions are similar but may have subtle dierences We use the famous Byzantine Generals to introduce this problem domain. . . Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 18 / 52
  • 41. Fault tolerance { The Byzantine generals problem Wiktionary: Byzantine: adj. of a devious, usually stealthy manner, of practice. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 19 / 52
  • 42. Fault tolerance { The Byzantine generals problem Lamport (this year's Turing laureate), Shostak, and Pease wrote in their Dijkstra Prize in Distributed Computing winning paper (Lamport et al., 1982): [: : :] several divisions of the Byzantine army are camped outside an enemy city, each division commanded by its own general. [: : :] However, some of the generals may be traitors [: : :] if the divisions of loyal generals attack together, the city falls if only some loyal generals attack, their armies fall generals communicate by obedient messengers Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 20 / 52
  • 43. Fault tolerance { The Byzantine generals problem Lamport (this year's Turing laureate), Shostak, and Pease wrote in their Dijkstra Prize in Distributed Computing winning paper (Lamport et al., 1982): [: : :] several divisions of the Byzantine army are camped outside an enemy city, each division commanded by its own general. [: : :] However, some of the generals may be traitors [: : :] if the divisions of loyal generals attack together, the city falls if only some loyal generals attack, their armies fall generals communicate by obedient messengers The Byzantine generals problem: the loyal generals have to agree on whether to attack. if all want to attack they must attack, if no-one wants to attack they must not attack Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 20 / 52
  • 44. Fault tolerance { The Byzantine generals problem Lamport (this year's Turing laureate), Shostak, and Pease wrote in their Dijkstra Prize in Distributed Computing winning paper (Lamport et al., 1982): [: : :] several divisions of the Byzantine army are camped outside an enemy city, each division commanded by its own general. [: : :] However, some of the generals may be traitors [: : :] if the divisions of loyal generals attack together, the city falls if only some loyal generals attack, their armies fall generals communicate by obedient messengers The Byzantine generals problem: the loyal generals have to agree on whether to attack. if all want to attack they must attack, if no-one wants to attack they must not attack metaphor for a distributed system where correct processes (loyal generals) act as one in the presence of faulty processes (traitors) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 20 / 52
  • 45. Byzantine generals problem cont. In the absence of faults it is trivial to solve: send proposed plan (attack or not attack) to all wait until received messages from everyone if a process proposed attack decide to attack otherwise, decide to not attack Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 21 / 52
  • 46. Byzantine generals problem cont. In the absence of faults it is trivial to solve: send proposed plan (attack or not attack) to all wait until received messages from everyone if a process proposed attack decide to attack otherwise, decide to not attack In the presence of faults it becomes tricky if a process may crash, some processes may not receive messages from everyone (but some may) if a process may send faulty messages, contradictory information may be received, e.g., A tells B that C told A that C wants to attack, while C tells B that C does not want to attack Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 21 / 52
  • 47. Byzantine generals problem cont. In the absence of faults it is trivial to solve: send proposed plan (attack or not attack) to all wait until received messages from everyone if a process proposed attack decide to attack otherwise, decide to not attack In the presence of faults it becomes tricky if a process may crash, some processes may not receive messages from everyone (but some may) if a process may send faulty messages, contradictory information may be received, e.g., A tells B that C told A that C wants to attack, while C tells B that C does not want to attack Who is lying to whom? Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 21 / 52
  • 48. Fault-tolerant distributed algorithms n n processes communicate by messages (reliable communication) all processes know that at most t of them might be faulty f are actually faulty resilience conditions, e.g., n 3t ^ t f 0 no masquerading: the processes know the origin of incoming messages Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 22 / 52
  • 49. Fault-tolerant distributed algorithms n ? ? ? t n processes communicate by messages (reliable communication) all processes know that at most t of them might be faulty f are actually faulty resilience conditions, e.g., n 3t ^ t f 0 no masquerading: the processes know the origin of incoming messages Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 22 / 52
  • 50. Fault-tolerant distributed algorithms n ? ? ? t f n processes communicate by messages (reliable communication) all processes know that at most t of them might be faulty f are actually faulty resilience conditions, e.g., n 3t ^ t f 0 no masquerading: the processes know the origin of incoming messages Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 22 / 52
  • 51. Fault models|abstractions of reality clean crashes: least severe faulty processes prematurely halt after/before send to all crash faults: faulty processes prematurely halt (also) in the middle of send to all omission faults: faulty processes follow the algorithm, but some messages sent by them might be lost symmetric faults: faulty processes send arbitrarily to all or nobody Byzantine faults: most severe faulty processes can do anything encompass all behaviors of above models Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 23 / 52
  • 52. Fault models|the ugly truth A photo of a Byzantine fault: photo by Driscoll (Honeywell) he reports Byzantine behavior on the Space Shuttle computer network other sources of faults: bit- ips in memory, power outage, disconnection from the network, etc. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 24 / 52
  • 53. Model vs. reality: impossibilities Hence, we would like the weakest assumptions possible. But there are theoretical limits on how weak assumptions can be made: consensus is impossible in asynchronous systems if there may be a crash fault, i.e., t = 1 (Fischer et al., 1985) consensus is possible in synchronous systems in the presence of Byzantine faults i n 3t (Lamport et al., 1982) consensus is impossible in (synchronous) round-based systems if bn=2c messages can be lost per round (Santoro Widmayer, 1989) fast Byzantine consensus is solvable i n 5t (Martin Alvisi, 2006) 32 dierent degrees of synchrony and whether consensus can be solved in the presence of how many faults investigated in (Dolev et al., 1987) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 25 / 52
  • 54. Model vs. reality: impossibilities Hence, we would like the weakest assumptions possible. But there are theoretical limits on how weak assumptions can be made: consensus is impossible in asynchronous systems if there may be a crash fault, i.e., t = 1 (Fischer et al., 1985) consensus is possible in synchronous systems in the presence of Byzantine faults i n 3t (Lamport et al., 1982) consensus is impossible in (synchronous) round-based systems if bn=2c messages can be lost per round (Santoro Widmayer, 1989) fast Byzantine consensus is solvable i n 5t (Martin Alvisi, 2006) 32 dierent degrees of synchrony and whether consensus can be solved in the presence of how many faults investigated in (Dolev et al., 1987) arithmetic resilience conditions play crucial role! Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 25 / 52
  • 55. After this excursion to faults, let's go back to the problem of de
  • 56. ning consistency (asynchronous systems) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 26 / 52
  • 57. De
  • 58. ning consistency|e.g., binary consensus Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 27 / 52
  • 59. De
  • 60. ning consistency|e.g., binary consensus Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: agreement. No two correct processes decide on dierent value. either all attack or no-one Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 27 / 52
  • 61. De
  • 62. ning consistency|e.g., binary consensus Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: agreement. No two correct processes decide on dierent value. either all attack or no-one validity. If all correct processes have the same initial value v, then v is the only possible decision value the decision on whether to attack must be consistent with the will of at least one loyal general Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 27 / 52
  • 63. De
  • 64. ning consistency|e.g., binary consensus Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: agreement. No two correct processes decide on dierent value. either all attack or no-one validity. If all correct processes have the same initial value v, then v is the only possible decision value the decision on whether to attack must be consistent with the will of at least one loyal general termination. Every correct process eventually decides. at some point negotiations must be over Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 27 / 52
  • 65. De
  • 66. ning consistency|e.g., binary consensus Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: agreement. No two correct processes decide on dierent value. either all attack or no-one validity. If all correct processes have the same initial value v, then v is the only possible decision value the decision on whether to attack must be consistent with the will of at least one loyal general termination. Every correct process eventually decides. at some point negotiations must be over Interplay of safety and liveness makes the problem hard. . . Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 27 / 52
  • 67. What if only two properties have to be satis
  • 68. ed? Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: validity. If all correct processes have the same initial value v, then v is the only possible decision value. termination. Every correct process eventually decides. Give an algorithm that solves validity and termination! Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
  • 69. What if only two properties have to be satis
  • 70. ed? Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: validity. If all correct processes have the same initial value v, then v is the only possible decision value. termination. Every correct process eventually decides. Solution: decide my own proposed value. (no need to agree) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
  • 71. What if only two properties have to be satis
  • 72. ed? Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: agreement. No two correct processes decide on dierent value. termination. Every correct process eventually decides. Give an algorithm that solves agreement and termination! Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
  • 73. What if only two properties have to be satis
  • 74. ed? Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: agreement. No two correct processes decide on dierent value. termination. Every correct process eventually decides. Solution: decide 0. (no relation to initial values required) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
  • 75. What if only two properties have to be satis
  • 76. ed? Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: agreement. No two correct processes decide on dierent value. validity. If all correct processes have the same initial value v, then v is the only possible decision value. Give an algorithm that solves agreement and validity! Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
  • 77. What if only two properties have to be satis
  • 78. ed? Every process has some initial value v 2 f0; 1g and has to decide irrevocably on some value in concordance with the following properties: agreement. No two correct processes decide on dierent value. validity. If all correct processes have the same initial value v, then v is the only possible decision value. Solution: do nothing (doing nothing is always safe) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 28 / 52
  • 79. Wrap-up: Intro to FTDAs distributed systems replication and consistency synchronous vs. asynchronous fault models example for an agreement problem: Byzantine Generals Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 29 / 52
  • 80. Our case study. . . Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 30 / 52
  • 81. Asynchronous FTDAs In this lecture we consider methods for asynchronous FTDAs that either solve problems that are less hard than consensus: reliable broadcast. termination required only for speci
  • 82. c initial state (Srikanth Toueg, 1987). [Veri
  • 83. ed in Parts II, III, V] condition-based consensus properties required only in runs from speci
  • 84. c initial states (Mostefaoui et al., 2003) [Veri
  • 85. ed in Part II] The Paxos idea fault-tolerant distributed algorithms that are safe and make progress only if you are lucky (Lamport, 1998) [Serious challenge] are asynchronous but use information on faults as a black box failure detector based atomic commitment. distributed databases (Raynal, 1997) [Challenge] Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 31 / 52
  • 86. Asynchronous FTDAs In this lecture we consider methods for asynchronous FTDAs that either solve problems that are less hard than consensus: reliable broadcast. termination required only for speci
  • 87. c initial state (Srikanth Toueg, 1987). [Veri
  • 88. ed in Parts II, III, V] condition-based consensus properties required only in runs from speci
  • 89. c initial states (Mostefaoui et al., 2003) [Veri
  • 90. ed in Part II] The Paxos idea fault-tolerant distributed algorithms that are safe and make progress only if you are lucky (Lamport, 1998) [Serious challenge] are asynchronous but use information on faults as a black box failure detector based atomic commitment. distributed databases (Raynal, 1997) [Challenge] We use the algorithm from (Srikanth Toueg, 1987) as running example Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 31 / 52
  • 91. Asynchronous Reliable Broadcast (Srikanth Toueg, 87) The core of the classic broadcast algorithm from the DA literature. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 32 / 52
  • 92. Assumptions from (Srikanth Toueg, 87) asynchronous interleaving reliable message passing (no bounds on message delays) at most t Byzantine faults resilience condition: n 3t ^ t f Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 33 / 52
  • 93. The spec of our case-study Unforgeability. If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. Completeness. If vi = true for all correct processes i , then there is a correct process j that eventually sets acceptj to true. Relay. If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 34 / 52
  • 94. The spec of our case-study Unforgeability. If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. if no loyal general wants to attack, then traitors should not be able to force one. Completeness. If vi = true for all correct processes i , then there is a correct process j that eventually sets acceptj to true. If all loyal generals want to attack, there shall be an attack. Relay. If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. If one loyal general attacks, then all loyal generals should attack. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 34 / 52
  • 95. The spec of our case-study Unforgeability. If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. if no loyal general wants to attack, then traitors should not be able to force one. Completeness. If vi = true for all correct processes i , then there is a correct process j that eventually sets acceptj to true. If all loyal generals want to attack, there shall be an attack. Relay. If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. If one loyal general attacks, then all loyal generals should attack. These are the specs as given in literature: they can be formalized in LTL Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 34 / 52
  • 96. Reliable broadcast vs. Consensus Reliable broadcast: Completeness. If vi = true for all correct processes i , then there is a correct process j that eventually sets acceptj to true. Consensus: Termination. Every correct process eventually decides. Dierence: Completeness requires to do something only if 8i : vi = true, i.e., only for one speci
  • 97. c initial state Termination requires to do something in all runs (from all initial states) weakening of spec makes reliable broadcast solvable in async, while consensus is not solvable Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 35 / 52
  • 98. Asynchronous Reliable Broadcast (Srikanth Toueg, 87) The core of the classic broadcast algorithm from the DA literature. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 36 / 52
  • 99. Threshold-Guarded Distributed Algorithms Standard construct: quanti
  • 100. ed guards (t=f=0) Existential Guard if received m from some process then ... Universal Guard if received m from all processes then ... Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 37 / 52
  • 101. Threshold-Guarded Distributed Algorithms Standard construct: quanti
  • 102. ed guards (t=f=0) Existential Guard if received m from some process then ... Universal Guard if received m from all processes then ... what if faults might occur? Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 37 / 52
  • 103. Threshold-Guarded Distributed Algorithms Standard construct: quanti
  • 104. ed guards (t=f=0) Existential Guard if received m from some process then ... Universal Guard if received m from all processes then ... what if faults might occur? Fault-Tolerant Algorithms: n processes, at most t are Byzantine Threshold Guard if received m from n t processes then ... (the processes cannot refer to f!) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 37 / 52
  • 105. Basic mechanisms used by the algorithm: thresholds n t f t + 1 if received m from t + 1 processes then ... (threshold) Correct processes count distinct incoming messages Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 38 / 52
  • 106. Basic mechanisms used by the algorithm: thresholds n t f t + 1 if received m from t + 1 processes then ... (threshold) Correct processes count distinct incoming messages Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 38 / 52
  • 107. Basic mechanisms used by the algorithm: thresholds n t f t + 1 at least one non-faulty sent the message if received m from t + 1 processes then ... (threshold) Correct processes count distinct incoming messages Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 38 / 52
  • 108. Classic correctness argument| hand-written proofs Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 39 / 52
  • 109. Proof: Unforgeability If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
  • 110. Proof: Unforgeability If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; By contradiction assume a correct process sets acceptj = 1 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
  • 111. Proof: Unforgeability If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; By contradiction assume a correct process sets acceptj = 1 Thus it has executed line 16 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
  • 112. Proof: Unforgeability If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; By contradiction assume a correct process sets acceptj = 1 Thus it has executed line 16 Thus it has received n t messages by distinct processes Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
  • 113. Proof: Unforgeability If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; By contradiction assume a correct process sets acceptj = 1 Thus it has executed line 16 Thus it has received n t messages by distinct processes That means messages by at n 2t correct processes Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
  • 114. Proof: Unforgeability If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; By contradiction assume a correct process sets acceptj = 1 Thus it has executed line 16 Thus it has received n t messages by distinct processes That means messages by at n 2t correct processes Let p be the
  • 115. rst correct processes that has sent (echo) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
  • 116. Proof: Unforgeability If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; By contradiction assume a correct process sets acceptj = 1 Thus it has executed line 16 Thus it has received n t messages by distinct processes That means messages by at n 2t correct processes Let p be the
  • 117. rst correct processes that has sent (echo) It did not send in line 7, as vp = 0 by assumption Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
  • 118. Proof: Unforgeability If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; By contradiction assume a correct process sets acceptj = 1 Thus it has executed line 16 Thus it has received n t messages by distinct processes That means messages by at n 2t correct processes Let p be the
  • 119. rst correct processes that has sent (echo) It did not send in line 7, as vp = 0 by assumption Thus, p sent in line 12 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
  • 120. Proof: Unforgeability If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; By contradiction assume a correct process sets acceptj = 1 Thus it has executed line 16 Thus it has received n t messages by distinct processes That means messages by at n 2t correct processes Let p be the
  • 121. rst correct processes that has sent (echo) It did not send in line 7, as vp = 0 by assumption Thus, p sent in line 12 Based on t + 1 messages, i.e., 1 sent by a correct processes Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
  • 122. Proof: Unforgeability If vi = false for all correct processes i , then for all correct processes j , acceptj remains false forever. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; By contradiction assume a correct process sets acceptj = 1 Thus it has executed line 16 Thus it has received n t messages by distinct processes That means messages by at n 2t correct processes Let p be the
  • 123. rst correct processes that has sent (echo) It did not send in line 7, as vp = 0 by assumption Thus, p sent in line 12 Based on t + 1 messages, i.e., 1 sent by a correct processes contradiction to p being the
  • 124. rst one. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 40 / 52
  • 125. Proof: Completeness If vi = true for all correct processes i , then there is a correct process j that eventually sets acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 41 / 52
  • 126. Proof: Completeness If vi = true for all correct processes i , then there is a correct process j that eventually sets acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; all, i.e., at least n t correct processes execute line 7 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 41 / 52
  • 127. Proof: Completeness If vi = true for all correct processes i , then there is a correct process j that eventually sets acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; all, i.e., at least n t correct processes execute line 7 by reliable communication all correct processes receive all messages sent by correct processes Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 41 / 52
  • 128. Proof: Completeness If vi = true for all correct processes i , then there is a correct process j that eventually sets acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; all, i.e., at least n t correct processes execute line 7 by reliable communication all correct processes receive all messages sent by correct processes Thus, a correct process receives n t (echo) messages Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 41 / 52
  • 129. Proof: Completeness If vi = true for all correct processes i , then there is a correct process j that eventually sets acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; all, i.e., at least n t correct processes execute line 7 by reliable communication all correct processes receive all messages sent by correct processes Thus, a correct process receives n t (echo) messages Thus, a correct process executes line 16 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 41 / 52
  • 130. Proof: Relay If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 42 / 52
  • 131. Proof: Relay If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Correct process executes line 16 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 42 / 52
  • 132. Proof: Relay If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Correct process executes line 16 Thus it has received n t messages by distinct processes Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 42 / 52
  • 133. Proof: Relay If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Correct process executes line 16 Thus it has received n t messages by distinct processes That means messages by n 2t correct processes Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 42 / 52
  • 134. Proof: Relay If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Correct process executes line 16 Thus it has received n t messages by distinct processes That means messages by n 2t correct processes By the resilience condition n 3t, we have n 2t t + 1 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 42 / 52
  • 135. Proof: Relay If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Correct process executes line 16 Thus it has received n t messages by distinct processes That means messages by n 2t correct processes By the resilience condition n 3t, we have n 2t t + 1 Thus at least t + 1 correct processes have sent (echo) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 42 / 52
  • 136. Proof: Relay If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Correct process executes line 16 Thus it has received n t messages by distinct processes That means messages by n 2t correct processes By the resilience condition n 3t, we have n 2t t + 1 Thus at least t + 1 correct processes have sent (echo) By reliable communication, these messages are received by all correct processes Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 42 / 52
  • 137. Proof: Relay If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Correct process executes line 16 Thus it has received n t messages by distinct processes That means messages by n 2t correct processes By the resilience condition n 3t, we have n 2t t + 1 Thus at least t + 1 correct processes have sent (echo) By reliable communication, these messages are received by all correct processes Thus, all correct processes send (echo) in line 12 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 42 / 52
  • 138. Proof: Relay If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Correct process executes line 16 Thus it has received n t messages by distinct processes That means messages by n 2t correct processes By the resilience condition n 3t, we have n 2t t + 1 Thus at least t + 1 correct processes have sent (echo) By reliable communication, these messages are received by all correct processes Thus, all correct processes send (echo) in line 12 There are at least n t correct Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 42 / 52
  • 139. Proof: Relay If a correct process i sets accepti to true, then eventually all correct processes j set acceptj to true. 1 Variables of process i 2 vi : f0 , 1g ini t i a l l y 0 or 1 3 accepti : f0 , 1g ini t i a l l y 0 4 5 An atomic step: 6 i f vi = 1 7 then send ( echo ) to all ; 8 9 i f received (echo) from at l e a s t 10 t + 1 distinct p r o c e s s e s 11 and not s ent ( echo ) be f o r e 12 then send ( echo ) to all ; 13 14 i f received ( echo ) from at l e a s t 15 n - t distinct p r o c e s s e s 16 then accepti := 1 ; Correct process executes line 16 Thus it has received n t messages by distinct processes That means messages by n 2t correct processes By the resilience condition n 3t, we have n 2t t + 1 Thus at least t + 1 correct processes have sent (echo) By reliable communication, these messages are received by all correct processes Thus, all correct processes send (echo) in line 12 There are at least n t correct Thus, all correct processes eventually execute line 16 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 42 / 52
  • 140. Problems with hand-written proofs code inspection becomes confusing for larger algorithms Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 43 / 52
  • 141. Bracha Toueg's algorithm (JACM 1985) Part II, Part V Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 44 / 52
  • 142. Condition-based consensus (Mostefaoui et al., 2003) Part II, Part V Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 45 / 52
  • 143. Problems with hand-written proofs code inspection becomes confusing for larger algorithms hidden assumptions resilience condition reliable communication (fairness) non-masquerading failure model Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 46 / 52
  • 144. Problems with hand-written proofs code inspection becomes confusing for larger algorithms hidden assumptions resilience condition reliable communication (fairness) non-masquerading failure model re-using proofs if one of the ingredients changes? if I cannot prove it correct, that does not mean the algorithm is wrong . . . how to come up with counterexamples? ultimate goal: verify the actual source code. . . . it is not realistic that developers do mathematical proofs. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 46 / 52
  • 145. We have convinced a human, . . . . . . why should we convince a computer? it is easy to make mistakes in proofs Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 47 / 52
  • 146. We have convinced a human, . . . . . . why should we convince a computer? it is easy to make mistakes in proofs it is easier to overlook mistakes in proofs distributed algorithms require non-centralized thinking (untypical for the human mind) many issues to consider at the same time (interleaving of steps, faults, timing assumptions) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 47 / 52
  • 147. We have convinced a human, . . . . . . why should we convince a computer? it is easy to make mistakes in proofs it is easier to overlook mistakes in proofs distributed algorithms require non-centralized thinking (untypical for the human mind) many issues to consider at the same time (interleaving of steps, faults, timing assumptions) people who tried to convince computers found bugs in published. . . Byzantine agreement algorithm (Lincoln Rushby, 1993) clock synchronization algorithm (Malekpour Siminiceanu, 2006) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 47 / 52
  • 148. End of Part I Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 48 / 52
  • 149. References I Dolev, Danny, Dwork, Cynthia, Stockmeyer, Larry. 1987. On the minimal synchronism needed for distributed consensus. J. ACM, 34, 77{97. http://doi.acm.org/10.1145/7531.7533. Fischer, Michael J., Lynch, Nancy A., Paterson, M. S. 1985. Impossibility of Distributed Consensus with one Faulty Process. J. ACM, 32(2), 374{382. http://doi.acm.org/10.1145/3149.214121. Lamport, Leslie. 1998. The part-time parliament. ACM Trans. Comput. Syst., 16, 133{169. http://doi.acm.org/10.1145/279227.279229. Lamport, Leslie, Shostak, Robert E., Pease, Marshall C. 1982. The Byzantine Generals Problem. ACM Trans. Program. Lang. Syst., 4(3), 382{401. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 49 / 52
  • 150. References II Lincoln, P., Rushby, J. 1993. A formally veri
  • 151. ed algorithm for interactive consistency under a hybrid fault model. Pages 402{411 of: FTCS-23. http://dx.doi.org/10.1109/FTCS.1993.627343. Malekpour, Mahyar R., Siminiceanu, Radu. 2006. Comments on the Byzantine Self-Stabilizing Pulse Synchronization Protocol: Counterexamples. Tech. rept. TM-2006-213951. NASA. Martin, Jean-Philippe, Alvisi, Lorenzo. 2006. Fast Byzantine Consensus. IEEE Trans. Dependable Sec. Comput., 3(3), 202{215. Mostefaoui, Achour, Mourgaya, Eric, Parvedy, Philippe Raipin, Raynal, Michel. 2003. Evaluating the Condition-Based Approach to Solve Consensus. Pages 541{550 of: DSN. Raynal, Michel. 1997. A Case Study of Agreement Problems in Distributed Systems: Non-Blocking Atomic Commitment. Pages 209{214 of: HASE. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 50 / 52
  • 152. References III Santoro, Nicola, Widmayer, Peter. 1989. Time is Not a Healer. Pages 304{313 of: STACS. LNCS, vol. 349. http://dx.doi.org/10.1007/BFb0028994. Srikanth, T. K., Toueg, Sam. 1987. Optimal clock synchronization. J. ACM, 34, 626{645. http://doi.acm.org/10.1145/28869.28876. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 51 / 52
  • 153. Model vs. reality: assumption coverage Every assumption has a probability that it is satis
  • 154. ed in the actual system: n 3t less likely than n t every message sent is received within bounded time less likely than that it is eventually received processes fail by crashing less likely than they deviate arbitrarily from the prescribed behavior non-masquerading less likely than processes that can pretend to be someone else Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 52 / 52
  • 155. Model vs. reality: assumption coverage Every assumption has a probability that it is satis
  • 156. ed in the actual system: n 3t less likely than n t every message sent is received within bounded time less likely than that it is eventually received processes fail by crashing less likely than they deviate arbitrarily from the prescribed behavior non-masquerading less likely than processes that can pretend to be someone else To use a distributed algorithm in practice: one must ensure that an assumption is suitable for a given system the probability that the system is working correctly is the probability that the assumptions hold (given that the distributed algorithm actually is correct) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA'14, Nov. 2014 52 / 52