SlideShare une entreprise Scribd logo
1  sur  72
1
Distributed Systems
Consistency and
Replication
Chapter 7
2
Course/Slides Credits
Note: all course presentations are based on those
developed by Andrew S. Tanenbaum and
Maarten van Steen. They accompany their
"Distributed Systems: Principles and
Paradigms" textbook (1st & 2nd editions).
http://www.prenhall.com/divisions/esm/app/aut
hor_tanenbaum/custom/dist_sys_1e/index.html
And additions made by Paul Barry in course
CW046-4: Distributed Systems
http://glasnost.itcarlow.ie/~barryp/net4.html
3
Reasons for Replication
• Data are replicated to increase the reliability
of a system.
• Replication for performance:
– Scaling in numbers
– Scaling in geographical area
• Caveat:
– Gain in performance
– Cost of increased bandwidth for maintaining
replication
4
More on Replication
• Replicas allows remote sites to continue working in
the event of local failures.
• It is also possible to protect against data corruption.
• Replicas allow data to reside close to where
it is used.
• Even a large number of replicated “local” systems
can improve performance: think of clusters.
• This directly supports the distributed systems goal
of enhanced scalability.
5
Replication and Scalability
• Replication is a widely-used scalability technique: think of
Web clients and Web proxies.
• When systems scale, the first problems to surface are those
associated with performance – as the systems get bigger
(e.g., more users), they get often slower.
• Replicating the data and moving it closer to where it is
needed helps to solve this scalability problem.
• A problem remains: how to efficiently synchronize all of
the replicas created to solve the scalability issue?
• Dilemma: adding replicas improves scalability, but incurs
the (oftentimes considerable) overhead of keeping the
replicas up-to-date!!!
• As we shall see, the solution often results in a relaxation of
any consistency constraints.
6
Replication and Consistency
• But if there are many replicas of the same thing,
how do we keep all of them up-to-date? How
do we keep the replicas consistent?
• Consistency can be achieved in a number of
ways. We will study a number of consistency
models, as well as protocols for implementing
the models.
• So, what’s the catch?
– It is not easy to keep all those replicas consistent.
7
Data-centric Consistency Models
The general organization of a logical data store,
physically distributed and replicated across
multiple processes
8
Continuous Consistency (1)
An example of keeping track of consistency
deviations [adapted from (Yu and Vahdat, 2002)]
9
Continuous Consistency (2)
Choosing the appropriate granularity for a Conit.
(a) Two updates lead to update propagation.
10
Continuous Consistency (3)
Choosing the appropriate granularity for a Conit.
(b) No update propagation is needed (yet).
11
What is a Consistency Model?
• A “consistency model” is a CONTRACT
between a DS data-store and its processes.
• If the processes agree to the rules, the
data-store will perform properly and as
advertised.
• We start with Strict Consistency, which is
defined as:
– Any read on a data item ‘x’ returns a value
corresponding to the result of the most recent write
on ‘x’ (regardless of where the write occurred).
12
Consistency Model Diagram Notation
• Wi(x)a – a write by process ‘i’ to item ‘x’ with
a value of ‘a’. That is, ‘x’ is set to ‘a’.
• (Note: The process is often shown as ‘Pi’).
• Ri(x)b – a read by process ‘i’ from item ‘x’
producing the value ‘b’. That is, reading ‘x’
returns ‘b’.
• Time moves from left to right in all diagrams.
13
Strict Consistency
• Behavior of two processes, operating on same data item:
a) A strictly consistent data-store.
b) A data-store that is not strictly consistent.
• With Strict Consistency, all writes are instantaneously
visible to all processes and absolute global time order is
maintained throughout the DS. This is the consistency
model “Holy Grail” – not at all easy in the real world,
and all but impossible within a DS. So, other, less strict
(or “weaker”) models have been developed …
14
Sequential Consistency (1)
• A weaker consistency model, which represents
a relaxation of the rules.
• It is also much easier (possible) to implement.
• Definition of “Sequential Consistency”:
– The result of any execution is the same as if the
(read and write) operations by all processes on the
data-store were executed in the same sequential
order and the operations of each individual process
appear in this sequence in the order specified by its
program.
15
Sequential Consistency (2)
(a) A sequentially consistent data store.
(b) A data store that is not sequentially consistent.
16
Sequential Consistency (3)
Three concurrently-executing processes
17
Sequential Consistency (4)
Four valid execution sequences for these
processes. The vertical axis is time.
18
Problem with Sequential Consistency
• With this consistency model, adjusting the
protocol to favor reads over writes (or vice-
versa) can have a devastating impact on
performance (refer to the textbook for the gory
details).
• For this reason, other weaker consistency
models have been proposed and developed.
• Again, a relaxation of the rules allows for these
weaker models to make sense.
19
Causal Consistency
• This model distinguishes between events that
are “causally related” and those that are not.
• If event B is caused or influenced by an earlier
event A, then causal consistency requires that
every other process see event A, then event B.
• Operations that are not causally related are
said to be concurrent.
20
Causal Consistency (1)
• For a data store to be considered causally
consistent, it is necessary that the store
obeys the following conditions:
– Writes that are potentially causally related
• must be seen by all processes in the same
order.
– Concurrent writes
• may be seen in a different order on different
machines.
21
Causal Consistency (2)
(a) A violation of a causally-consistent store
22
Causal Consistency (3)
(b) A correct sequence of events in
a causally-consistent store
23
Causal Consistency (4)
This sequence is allowed with a
causally-consistent store, but not with a
sequentially consistent store
24
Introducing Weak Consistency
• Not all applications need to see all writes, let
alone seeing them in the same order.
• This leads to “Weak Consistency” (which is
primarily designed to work with distributed
critical sections).
• This model introduces the notion of a
“synchronization variable”, which is used to
update all copies of the data-store.
25
Weak Consistency Properties
• The three properties of Weak Consistency:
1. Accesses to synchronization variables associated
with a data-store are sequentially consistent.
2. No operation on a synchronization variable is
allowed to be performed until all previous writes
have been completed everywhere.
3. No read or write operation on data items are allowed
to be performed until all previous operations to
synchronization variables have been performed.
26
Weak Consistency: What It Means
• So …
• By doing a sync., a process can force the just written
value out to all the other replicas.
• Also, by doing a sync., a process can be sure it’s
getting the most recently written value before it reads.
• In essence, the weak consistency models enforce
consistency on a group of operations, as opposed to
individual reads and writes (as is the case with strict,
sequential, causal and FIFO consistency).
27
Grouping Operations
• Necessary criteria for correct synchronization:
• An acquire access of a synchronization variable, not allowed
to perform until all updates to guarded shared data have been
performed with respect to that process.
• Before exclusive mode access to synchronization variable by
process is allowed to perform with respect to that process, no
other process may hold synchronization variable, not even in
nonexclusive mode.
• After exclusive mode access to synchronization variable has
been performed, any other process’ next nonexclusive mode
access to that synchronization variable may not be performed
until it has performed with respect to that variable’s owner.
28
Entry Consistency
• At an acquire, all remote changes to guarded data
must be brought up to date.
• Before a write to a data item, a process must ensure
that no other process is trying to write at same time.
• Locks associate with individual data items, as opposed
to the entire data-store. Note: P2’s read on ‘y’ returns
NIL as no locks have been requested.
29
Client-Centric Consistency Models
• The previously studied consistency models concern
themselves with maintaining a consistent (globally
accessible) data-store in the presence of concurrent
read/write operations
• Another class of distributed data-store is that which is
characterized by the lack of simultaneous updates.
Here, the emphasis is more on maintaining a
consistent view of things for the individual client
process that is currently operating on the data-store.
30
More Client-Centric Consistency
• How fast should updates (writes) be made
available to read-only processes?
– Think of most database systems: mainly read.
– Think of the DNS: write-write conflicts do no
occur, only read-write conflicts.
– Think of WWW: as with DNS, except that heavy
use of client-side caching is present: even the
return of stale pages is acceptable to most users.
• These systems all exhibit a high degree of
acceptable inconsistency … with the replicas
gradually becoming consistent over time.
31
Toward Eventual Consistency
• The only requirement is that all replicas will
eventually be the same.
• All updates must be guaranteed to propagate to
all replicas … eventually!
• This works well if every client always updates
the same replica.
• Things are a little difficult if the clients are
mobile.
32
Eventual Consistency
The principle of a mobile user accessing
different replicas of a distributed database
33
Monotonic Reads (1)
• A data store is said to provide
monotonic-read consistency if the
following condition holds:
– If a process reads the value of a data item x
 any successive read operation on x by
that process will always return that same
value or a more recent value.
34
Monotonic Reads (2)
The read operations performed by a single process
P at two different local copies of the same
data store.
(a) A monotonic-read consistent data store.
35
Monotonic Reads (3)
The read operations performed by a single process
P at two different local copies of the same
data store.
(b) A data store that does not provide
monotonic reads.
36
Monotonic Writes (1)
• In a monotonic-write consistent
store, the following condition holds:
– A write operation by a process on a
data item x
 is completed before any successive write
operation on x by the same process.
37
Monotonic Writes (2)
The write operations performed by a single
process P at two different local copies of the
same data store.
(a) A monotonic-write consistent data store.
38
Monotonic Writes (3)
The write operations performed by a single
process P at two different local copies of the
same data store. (b) A data store that does not
provide monotonic-write consistency.
39
Read Your Writes (1)
• A data store is said to provide
read-your-writes consistency, if the
following condition holds:
– The effect of a write operation by a
process on data item x
 will always be seen by a successive read
operation on x by the same process.
40
Read Your Writes (2)
(a) A data store that provides
read-your-writes consistency
41
Read Your Writes (3)
(b) A data store that does not
42
Writes Follow Reads (1)
• A data store is said to provide
writes-follow-reads consistency,
if the following holds:
– A write operation by a process
 on a data item x following a previous
read operation on x by the same process
is guaranteed to take place on the same
or a more recent value of x that was read.
43
Writes Follow Reads (2)
(a) A writes-follow-reads consistent data store
44
Writes Follow Reads (3)
(b) A data store that does not provide
writes-follow-reads consistency
45
Replica-Server Placement
Choosing a proper cell size for server placement
46
Content Replication and Placement
• Regardless of which consistency model is
chosen, we need to decide where, when and by
whom copies of the data-store are to be placed.
47
Replica Placement Types
• There are three types of replica:
1. Permanent replicas: tend to be small in number,
organized as COWs (Clusters of Workstations) or
mirrored systems.
2. Server-initiated replicas: used to enhance
performance at the initiation of the owner of the
data-store. Typically used by web hosting
companies to geographically locate replicas close
to where they are needed most. (Often referred to
as “push caches”).
3. Client-initiated replicas: created as a result of client
requests – think of browser caches. Works well
assuming, of course, that the cached data does not
go stale too soon.
48
Server-Initiated Replicas
Counting access requests from different clients
49
Content Distribution
• When a client initiates an update to a distributed
data-store, what gets propagated?
• There are three possibilities:
1. Propagate notification of the update to the other
replicas – this is an “invalidation protocol” which
indicates that the replica’s data is no longer up-to-
date. Can work well when there’s many writes.
2. Transfer the data from one replica to another –
works well when there’s many reads.
3. Propagate the update to the other replicas – this is
“active replication”, and shifts the workload to each
of the replicas upon an “initial write”.
50
Push vs. Pull Protocols
• Another design issue relates to whether or not the
updates are pushed or pulled?
1. Push-based/Server-based Approach: sent
“automatically” by server, the client does not request
the update. This approach is useful when a high
degree of consistency is needed. Often used between
permanent and server-initiated replicas.
2. Pull-based/Client-based Approach: used by client
caches (e.g., browsers), updates are requested by the
client from the server. No request, no update!
51
Pull versus Push Protocols
A comparison between push-based and pull-based
protocols in the case of multiple-client, single-
server systems
52
Hybrid Schemes
• “Leases” – a promise from a server to push
updates to a client for a period of time.
• Once the lease expires, the client reverts to a
pull-based approach (until another lease is
issued).
• Three kinds of leases:
– Age-based Leases
– Renewal-Frequency based Leases
– State-based Leases
53
Epidemic Protocols
• This is an interesting class of protocols that can be
used to implement Eventual Consistency (note: these
protocols are used in Bayou).
• The main concern is the propagation of updates to all
the replicas in as few a number of messages as
possible.
• Of course, here we are spreading updates, not
diseases!
• With this “update propagation model”, the idea is to
“infect” as many replicas as quickly as possible.
54
Epidemic Protocols: Terminology
• Infective replica: a server that holds an update
that can be spread to other replicas.
• Susceptible replica: a yet to be updated server.
• Removed replica: an updated server that will
not (or cannot) spread the update to any other
replicas.
• The trick is to get all susceptible servers to
either infective or removed states as quickly as
possible without leaving any replicas out.
55
The Anti-Entropy Protocol
• Entropy: “a measure of the degradation or
disorganization of the universe”.
• Server P picks Q at random and exchanges
updates, using one of three approaches:
1. P only pushes its own updates to Q
2. P only pulls in new updates from Q
3. P and Q send updates to each other
• Sooner or later, all the servers in the system
will be infected (updated). Works well.
56
The Gossiping Protocol
• This variant is referred to as “gossiping” or
“rumour spreading”, and works as follows:
1. P has just been updated for item ‘x’.
2. It immediately pushes the update of ‘x’ to Q.
3. If Q already knows about ‘x’, P becomes
disinterested in spreading any more updates
(rumours) and is removed.
4. Otherwise P gossips to another server, as does Q.
• This approach is good, but can be shown not
to guarantee the propagation of all updates to
all servers. Oh dear.
57
The Best of Both Worlds
• A mix of anti-entropy and gossiping is regarded as the
best approach to rapidly infecting systems with
updates.
• However, what about removing data?
– Updates are easy, deletion is much, much harder!
– Under certain circumstances, after a deletion, an “old”
reference to the deleted item may appear at some replica and
cause the deleted item to be reactivated!
– One solution is to issue “Death Certificates” for data items –
these are a special type of update.
– Only problem remaining is the eventual removal of “old”
death certificates (with which timeouts can help).
58
Consistency Protocols
• This is a specific implementation of a
consistency model.
• The most widely implemented models
are:
1. Sequential Consistency.
2. Weak Consistency (with sync variables).
3. Atomic Transactions.
59
Primary-Based Protocols
• Each data item is associated with a “primary”
replica.
• The primary is responsible for coordinating
writes to the data item.
• There are two types of Primary-Based
Protocol:
1. Remote-Write
2. Local-Write
60
Remote-Write Protocols
The principle of a primary-backup protocol
61
The Bad and Good of Primary-Backup
• Bad: Performance!
• All of those writes can take a long time (especially
when a “blocking write protocol” is used).
• Using a non-blocking write protocol to handle the
updates can lead to fault tolerant problems (which is
our next topic).
• Good: The benefit of this scheme is, as the primary is
in control, all writes can be sent to each backup replica
IN THE SAME ORDER, making it easy to implement
sequential consistency.
62
Local-Write Protocols
• In this protocol, a single copy of the data item
is still maintained.
• Upon a write, the data item gets transferred to
the replica that is writing.
• That is, the status of primary for a data item is
transferable.
• This is also called a “fully migrating approach”.
63
Local-Write Protocols Example
Primary-based local-write protocol in which a single copy is
migrated between processes (prior to the read/write).
64
Local-Write Issues
• The big question to be answered by any process
about to read from or write to the data item is:
– “Where is the data item right now?”
• It is possible to use some of the dynamic
naming technologies studied earlier in this
course, but scaling quickly becomes an issue.
• Processes can spend more time actually
locating a data item than using it!
65
Local-Write Protocols
Primary-backup protocol in which the primary migrates to the
process wanting to perform an update, then updates the
backups. Consequently, reads are much more efficient.
66
Replicated-Write Protocols
• With these protocols, writes can be
carried out at any replica.
• Another name might be: “Distributed-
Write Protocols”
• There are two types:
1. Active Replication
2. Majority Voting (Quorums)
67
Active Replication
• A special process carries out the update
operations at each replica.
• Lamport’s timestamps can be used to achieve
total ordering, but this does not scale well
within distributed systems.
• An alternative/variation is to use a sequencer,
which is a process that assigns a unique ID# to
each update, which is then propagated to all
replicas.
• This can lead to another problem: replicated
invocations.
68
Replicated Invocations: The Problem
The problem of replicated invocations – ‘B’ is a replicated object
(which itself calls ‘C’). When ‘A’ calls ‘B’, how do we ensure
‘C’ isn’t invoked three times?
69
Replicated Invocations: Solutions
a) Using a coordinator for ‘B’, which is responsible for forwarding an
invocation request from the replicated object to ‘C’.
b) Returning results from ‘C’ using the same idea: a coordinator is
responsible for returning the result to all ‘B’s. Note the single result
returned to ‘A’.
70
Quorum-Based Protocols
• Clients must request and acquire permissions from
multiple replicas before either reading/writing a
replicated data item.
• Consider this example:
– A file is replicated within a distributed file system.
– To update a file, a process must get approval from a
majority of the replicas to perform a write. The replicas
need to agree to also perform the write.
– After the update, the file has a new version # associated
with it (and it is set at all the updated replicas).
– To read, a process contacts a majority of the replicas and
asks for the version # of the files. If the version # is the
same, then the file must be the most recent version, and
the read can proceed.
71
Quorum Protocols: Generalisation
NR + NW > N
NW > N/2
72
Quorum-Based Protocols
Three examples of the voting algorithm:
(a) A correct choice of read and write set.
(b) A choice that may lead to write-write
conflicts. (c) A correct choice, known as
ROWA (read one, write all).

Contenu connexe

Tendances

distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memoryAshish Kumar
 
Communication Asymmetry - Mobile Computing
Communication Asymmetry - Mobile ComputingCommunication Asymmetry - Mobile Computing
Communication Asymmetry - Mobile ComputingAkshay Nagpal
 
Presentation of the IEEE 802.11a MAC Layer
Presentation of the IEEE 802.11a MAC LayerPresentation of the IEEE 802.11a MAC Layer
Presentation of the IEEE 802.11a MAC LayerMahdi Ahmed Jama
 
Distributed concurrency control
Distributed concurrency controlDistributed concurrency control
Distributed concurrency controlBinte fatima
 
Chapter 8 distributed file systems
Chapter 8 distributed file systemsChapter 8 distributed file systems
Chapter 8 distributed file systemsAbDul ThaYyal
 
2. Distributed Systems Hardware & Software concepts
2. Distributed Systems Hardware & Software concepts2. Distributed Systems Hardware & Software concepts
2. Distributed Systems Hardware & Software conceptsPrajakta Rane
 
Introduction to Distributed System
Introduction to Distributed SystemIntroduction to Distributed System
Introduction to Distributed SystemSunita Sahu
 
Synchronization in distributed computing
Synchronization in distributed computingSynchronization in distributed computing
Synchronization in distributed computingSVijaylakshmi
 
Distributed file system
Distributed file systemDistributed file system
Distributed file systemAnamika Singh
 
Distributed document based system
Distributed document based systemDistributed document based system
Distributed document based systemChetan Selukar
 
remote procedure calls
  remote procedure calls  remote procedure calls
remote procedure callsAshish Kumar
 
Distributed File System
Distributed File SystemDistributed File System
Distributed File SystemNtu
 
Clock synchronization in distributed system
Clock synchronization in distributed systemClock synchronization in distributed system
Clock synchronization in distributed systemSunita Sahu
 
82159587 case-study-on-corba
82159587 case-study-on-corba82159587 case-study-on-corba
82159587 case-study-on-corbahomeworkping3
 
GOOGLE FILE SYSTEM
GOOGLE FILE SYSTEMGOOGLE FILE SYSTEM
GOOGLE FILE SYSTEMJYoTHiSH o.s
 
Deadlock in Distributed Systems
Deadlock in Distributed SystemsDeadlock in Distributed Systems
Deadlock in Distributed SystemsPritom Saha Akash
 
Common protocols
Common protocolsCommon protocols
Common protocolsMr SMAK
 

Tendances (20)

distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memory
 
11. dfs
11. dfs11. dfs
11. dfs
 
Communication Asymmetry - Mobile Computing
Communication Asymmetry - Mobile ComputingCommunication Asymmetry - Mobile Computing
Communication Asymmetry - Mobile Computing
 
Common Standards in Cloud Computing
Common Standards in Cloud ComputingCommon Standards in Cloud Computing
Common Standards in Cloud Computing
 
Presentation of the IEEE 802.11a MAC Layer
Presentation of the IEEE 802.11a MAC LayerPresentation of the IEEE 802.11a MAC Layer
Presentation of the IEEE 802.11a MAC Layer
 
Distributed concurrency control
Distributed concurrency controlDistributed concurrency control
Distributed concurrency control
 
Chapter 8 distributed file systems
Chapter 8 distributed file systemsChapter 8 distributed file systems
Chapter 8 distributed file systems
 
2. Distributed Systems Hardware & Software concepts
2. Distributed Systems Hardware & Software concepts2. Distributed Systems Hardware & Software concepts
2. Distributed Systems Hardware & Software concepts
 
Coda file system
Coda file systemCoda file system
Coda file system
 
Introduction to Distributed System
Introduction to Distributed SystemIntroduction to Distributed System
Introduction to Distributed System
 
Synchronization in distributed computing
Synchronization in distributed computingSynchronization in distributed computing
Synchronization in distributed computing
 
Distributed file system
Distributed file systemDistributed file system
Distributed file system
 
Distributed document based system
Distributed document based systemDistributed document based system
Distributed document based system
 
remote procedure calls
  remote procedure calls  remote procedure calls
remote procedure calls
 
Distributed File System
Distributed File SystemDistributed File System
Distributed File System
 
Clock synchronization in distributed system
Clock synchronization in distributed systemClock synchronization in distributed system
Clock synchronization in distributed system
 
82159587 case-study-on-corba
82159587 case-study-on-corba82159587 case-study-on-corba
82159587 case-study-on-corba
 
GOOGLE FILE SYSTEM
GOOGLE FILE SYSTEMGOOGLE FILE SYSTEM
GOOGLE FILE SYSTEM
 
Deadlock in Distributed Systems
Deadlock in Distributed SystemsDeadlock in Distributed Systems
Deadlock in Distributed Systems
 
Common protocols
Common protocolsCommon protocols
Common protocols
 

Similaire à Distributed Systems Consistency and Replication Chapter

This is introduction to distributed systems for the revised curiculum
This is introduction to distributed systems for the revised curiculumThis is introduction to distributed systems for the revised curiculum
This is introduction to distributed systems for the revised curiculummeharikiros2
 
Ch-7-Part-2-Distributed-System.pptx
Ch-7-Part-2-Distributed-System.pptxCh-7-Part-2-Distributed-System.pptx
Ch-7-Part-2-Distributed-System.pptxKabindra Koirala
 
CAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesCAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesYoav Francis
 
Chapter Introductionn to distributed system .pptx
Chapter Introductionn to distributed system .pptxChapter Introductionn to distributed system .pptx
Chapter Introductionn to distributed system .pptxTekle12
 
Chapter 6-Consistency and Replication.ppt
Chapter 6-Consistency and Replication.pptChapter 6-Consistency and Replication.ppt
Chapter 6-Consistency and Replication.pptsirajmohammed35
 
HbaseHivePigbyRohitDubey
HbaseHivePigbyRohitDubeyHbaseHivePigbyRohitDubey
HbaseHivePigbyRohitDubeyRohit Dubey
 
Beyond Strong Consistency
Beyond Strong ConsistencyBeyond Strong Consistency
Beyond Strong Consistencyjsinglet
 
Parallel and Distributed Computing Chapter 6
Parallel and Distributed Computing Chapter 6Parallel and Distributed Computing Chapter 6
Parallel and Distributed Computing Chapter 6AbdullahMunir32
 
Process synchronizationfinal
Process synchronizationfinalProcess synchronizationfinal
Process synchronizationfinalmarangburu42
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7abdulrahmanhelan
 
Week # 1.pdf
Week # 1.pdfWeek # 1.pdf
Week # 1.pdfgiddy5
 

Similaire à Distributed Systems Consistency and Replication Chapter (20)

This is introduction to distributed systems for the revised curiculum
This is introduction to distributed systems for the revised curiculumThis is introduction to distributed systems for the revised curiculum
This is introduction to distributed systems for the revised curiculum
 
chen-06.ppt
chen-06.pptchen-06.ppt
chen-06.ppt
 
Hbase hive pig
Hbase hive pigHbase hive pig
Hbase hive pig
 
Ch-7-Part-2-Distributed-System.pptx
Ch-7-Part-2-Distributed-System.pptxCh-7-Part-2-Distributed-System.pptx
Ch-7-Part-2-Distributed-System.pptx
 
CAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesCAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and Practices
 
Chapter Introductionn to distributed system .pptx
Chapter Introductionn to distributed system .pptxChapter Introductionn to distributed system .pptx
Chapter Introductionn to distributed system .pptx
 
Hbase hivepig
Hbase hivepigHbase hivepig
Hbase hivepig
 
Chapter 6-Consistency and Replication.ppt
Chapter 6-Consistency and Replication.pptChapter 6-Consistency and Replication.ppt
Chapter 6-Consistency and Replication.ppt
 
HbaseHivePigbyRohitDubey
HbaseHivePigbyRohitDubeyHbaseHivePigbyRohitDubey
HbaseHivePigbyRohitDubey
 
Real-Time Design Patterns
Real-Time Design PatternsReal-Time Design Patterns
Real-Time Design Patterns
 
Beyond Strong Consistency
Beyond Strong ConsistencyBeyond Strong Consistency
Beyond Strong Consistency
 
Parallel and Distributed Computing Chapter 6
Parallel and Distributed Computing Chapter 6Parallel and Distributed Computing Chapter 6
Parallel and Distributed Computing Chapter 6
 
Introduction
IntroductionIntroduction
Introduction
 
Lecture 5 inter process communication
Lecture 5 inter process communicationLecture 5 inter process communication
Lecture 5 inter process communication
 
Process synchronizationfinal
Process synchronizationfinalProcess synchronizationfinal
Process synchronizationfinal
 
Master.pptx
Master.pptxMaster.pptx
Master.pptx
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7
 
Lost with data consistency
Lost with data consistencyLost with data consistency
Lost with data consistency
 
Week # 1.pdf
Week # 1.pdfWeek # 1.pdf
Week # 1.pdf
 
NoSQL and Couchbase
NoSQL and CouchbaseNoSQL and Couchbase
NoSQL and Couchbase
 

Dernier

KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 

Dernier (20)

KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 

Distributed Systems Consistency and Replication Chapter

  • 2. 2 Course/Slides Credits Note: all course presentations are based on those developed by Andrew S. Tanenbaum and Maarten van Steen. They accompany their "Distributed Systems: Principles and Paradigms" textbook (1st & 2nd editions). http://www.prenhall.com/divisions/esm/app/aut hor_tanenbaum/custom/dist_sys_1e/index.html And additions made by Paul Barry in course CW046-4: Distributed Systems http://glasnost.itcarlow.ie/~barryp/net4.html
  • 3. 3 Reasons for Replication • Data are replicated to increase the reliability of a system. • Replication for performance: – Scaling in numbers – Scaling in geographical area • Caveat: – Gain in performance – Cost of increased bandwidth for maintaining replication
  • 4. 4 More on Replication • Replicas allows remote sites to continue working in the event of local failures. • It is also possible to protect against data corruption. • Replicas allow data to reside close to where it is used. • Even a large number of replicated “local” systems can improve performance: think of clusters. • This directly supports the distributed systems goal of enhanced scalability.
  • 5. 5 Replication and Scalability • Replication is a widely-used scalability technique: think of Web clients and Web proxies. • When systems scale, the first problems to surface are those associated with performance – as the systems get bigger (e.g., more users), they get often slower. • Replicating the data and moving it closer to where it is needed helps to solve this scalability problem. • A problem remains: how to efficiently synchronize all of the replicas created to solve the scalability issue? • Dilemma: adding replicas improves scalability, but incurs the (oftentimes considerable) overhead of keeping the replicas up-to-date!!! • As we shall see, the solution often results in a relaxation of any consistency constraints.
  • 6. 6 Replication and Consistency • But if there are many replicas of the same thing, how do we keep all of them up-to-date? How do we keep the replicas consistent? • Consistency can be achieved in a number of ways. We will study a number of consistency models, as well as protocols for implementing the models. • So, what’s the catch? – It is not easy to keep all those replicas consistent.
  • 7. 7 Data-centric Consistency Models The general organization of a logical data store, physically distributed and replicated across multiple processes
  • 8. 8 Continuous Consistency (1) An example of keeping track of consistency deviations [adapted from (Yu and Vahdat, 2002)]
  • 9. 9 Continuous Consistency (2) Choosing the appropriate granularity for a Conit. (a) Two updates lead to update propagation.
  • 10. 10 Continuous Consistency (3) Choosing the appropriate granularity for a Conit. (b) No update propagation is needed (yet).
  • 11. 11 What is a Consistency Model? • A “consistency model” is a CONTRACT between a DS data-store and its processes. • If the processes agree to the rules, the data-store will perform properly and as advertised. • We start with Strict Consistency, which is defined as: – Any read on a data item ‘x’ returns a value corresponding to the result of the most recent write on ‘x’ (regardless of where the write occurred).
  • 12. 12 Consistency Model Diagram Notation • Wi(x)a – a write by process ‘i’ to item ‘x’ with a value of ‘a’. That is, ‘x’ is set to ‘a’. • (Note: The process is often shown as ‘Pi’). • Ri(x)b – a read by process ‘i’ from item ‘x’ producing the value ‘b’. That is, reading ‘x’ returns ‘b’. • Time moves from left to right in all diagrams.
  • 13. 13 Strict Consistency • Behavior of two processes, operating on same data item: a) A strictly consistent data-store. b) A data-store that is not strictly consistent. • With Strict Consistency, all writes are instantaneously visible to all processes and absolute global time order is maintained throughout the DS. This is the consistency model “Holy Grail” – not at all easy in the real world, and all but impossible within a DS. So, other, less strict (or “weaker”) models have been developed …
  • 14. 14 Sequential Consistency (1) • A weaker consistency model, which represents a relaxation of the rules. • It is also much easier (possible) to implement. • Definition of “Sequential Consistency”: – The result of any execution is the same as if the (read and write) operations by all processes on the data-store were executed in the same sequential order and the operations of each individual process appear in this sequence in the order specified by its program.
  • 15. 15 Sequential Consistency (2) (a) A sequentially consistent data store. (b) A data store that is not sequentially consistent.
  • 16. 16 Sequential Consistency (3) Three concurrently-executing processes
  • 17. 17 Sequential Consistency (4) Four valid execution sequences for these processes. The vertical axis is time.
  • 18. 18 Problem with Sequential Consistency • With this consistency model, adjusting the protocol to favor reads over writes (or vice- versa) can have a devastating impact on performance (refer to the textbook for the gory details). • For this reason, other weaker consistency models have been proposed and developed. • Again, a relaxation of the rules allows for these weaker models to make sense.
  • 19. 19 Causal Consistency • This model distinguishes between events that are “causally related” and those that are not. • If event B is caused or influenced by an earlier event A, then causal consistency requires that every other process see event A, then event B. • Operations that are not causally related are said to be concurrent.
  • 20. 20 Causal Consistency (1) • For a data store to be considered causally consistent, it is necessary that the store obeys the following conditions: – Writes that are potentially causally related • must be seen by all processes in the same order. – Concurrent writes • may be seen in a different order on different machines.
  • 21. 21 Causal Consistency (2) (a) A violation of a causally-consistent store
  • 22. 22 Causal Consistency (3) (b) A correct sequence of events in a causally-consistent store
  • 23. 23 Causal Consistency (4) This sequence is allowed with a causally-consistent store, but not with a sequentially consistent store
  • 24. 24 Introducing Weak Consistency • Not all applications need to see all writes, let alone seeing them in the same order. • This leads to “Weak Consistency” (which is primarily designed to work with distributed critical sections). • This model introduces the notion of a “synchronization variable”, which is used to update all copies of the data-store.
  • 25. 25 Weak Consistency Properties • The three properties of Weak Consistency: 1. Accesses to synchronization variables associated with a data-store are sequentially consistent. 2. No operation on a synchronization variable is allowed to be performed until all previous writes have been completed everywhere. 3. No read or write operation on data items are allowed to be performed until all previous operations to synchronization variables have been performed.
  • 26. 26 Weak Consistency: What It Means • So … • By doing a sync., a process can force the just written value out to all the other replicas. • Also, by doing a sync., a process can be sure it’s getting the most recently written value before it reads. • In essence, the weak consistency models enforce consistency on a group of operations, as opposed to individual reads and writes (as is the case with strict, sequential, causal and FIFO consistency).
  • 27. 27 Grouping Operations • Necessary criteria for correct synchronization: • An acquire access of a synchronization variable, not allowed to perform until all updates to guarded shared data have been performed with respect to that process. • Before exclusive mode access to synchronization variable by process is allowed to perform with respect to that process, no other process may hold synchronization variable, not even in nonexclusive mode. • After exclusive mode access to synchronization variable has been performed, any other process’ next nonexclusive mode access to that synchronization variable may not be performed until it has performed with respect to that variable’s owner.
  • 28. 28 Entry Consistency • At an acquire, all remote changes to guarded data must be brought up to date. • Before a write to a data item, a process must ensure that no other process is trying to write at same time. • Locks associate with individual data items, as opposed to the entire data-store. Note: P2’s read on ‘y’ returns NIL as no locks have been requested.
  • 29. 29 Client-Centric Consistency Models • The previously studied consistency models concern themselves with maintaining a consistent (globally accessible) data-store in the presence of concurrent read/write operations • Another class of distributed data-store is that which is characterized by the lack of simultaneous updates. Here, the emphasis is more on maintaining a consistent view of things for the individual client process that is currently operating on the data-store.
  • 30. 30 More Client-Centric Consistency • How fast should updates (writes) be made available to read-only processes? – Think of most database systems: mainly read. – Think of the DNS: write-write conflicts do no occur, only read-write conflicts. – Think of WWW: as with DNS, except that heavy use of client-side caching is present: even the return of stale pages is acceptable to most users. • These systems all exhibit a high degree of acceptable inconsistency … with the replicas gradually becoming consistent over time.
  • 31. 31 Toward Eventual Consistency • The only requirement is that all replicas will eventually be the same. • All updates must be guaranteed to propagate to all replicas … eventually! • This works well if every client always updates the same replica. • Things are a little difficult if the clients are mobile.
  • 32. 32 Eventual Consistency The principle of a mobile user accessing different replicas of a distributed database
  • 33. 33 Monotonic Reads (1) • A data store is said to provide monotonic-read consistency if the following condition holds: – If a process reads the value of a data item x  any successive read operation on x by that process will always return that same value or a more recent value.
  • 34. 34 Monotonic Reads (2) The read operations performed by a single process P at two different local copies of the same data store. (a) A monotonic-read consistent data store.
  • 35. 35 Monotonic Reads (3) The read operations performed by a single process P at two different local copies of the same data store. (b) A data store that does not provide monotonic reads.
  • 36. 36 Monotonic Writes (1) • In a monotonic-write consistent store, the following condition holds: – A write operation by a process on a data item x  is completed before any successive write operation on x by the same process.
  • 37. 37 Monotonic Writes (2) The write operations performed by a single process P at two different local copies of the same data store. (a) A monotonic-write consistent data store.
  • 38. 38 Monotonic Writes (3) The write operations performed by a single process P at two different local copies of the same data store. (b) A data store that does not provide monotonic-write consistency.
  • 39. 39 Read Your Writes (1) • A data store is said to provide read-your-writes consistency, if the following condition holds: – The effect of a write operation by a process on data item x  will always be seen by a successive read operation on x by the same process.
  • 40. 40 Read Your Writes (2) (a) A data store that provides read-your-writes consistency
  • 41. 41 Read Your Writes (3) (b) A data store that does not
  • 42. 42 Writes Follow Reads (1) • A data store is said to provide writes-follow-reads consistency, if the following holds: – A write operation by a process  on a data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x that was read.
  • 43. 43 Writes Follow Reads (2) (a) A writes-follow-reads consistent data store
  • 44. 44 Writes Follow Reads (3) (b) A data store that does not provide writes-follow-reads consistency
  • 45. 45 Replica-Server Placement Choosing a proper cell size for server placement
  • 46. 46 Content Replication and Placement • Regardless of which consistency model is chosen, we need to decide where, when and by whom copies of the data-store are to be placed.
  • 47. 47 Replica Placement Types • There are three types of replica: 1. Permanent replicas: tend to be small in number, organized as COWs (Clusters of Workstations) or mirrored systems. 2. Server-initiated replicas: used to enhance performance at the initiation of the owner of the data-store. Typically used by web hosting companies to geographically locate replicas close to where they are needed most. (Often referred to as “push caches”). 3. Client-initiated replicas: created as a result of client requests – think of browser caches. Works well assuming, of course, that the cached data does not go stale too soon.
  • 48. 48 Server-Initiated Replicas Counting access requests from different clients
  • 49. 49 Content Distribution • When a client initiates an update to a distributed data-store, what gets propagated? • There are three possibilities: 1. Propagate notification of the update to the other replicas – this is an “invalidation protocol” which indicates that the replica’s data is no longer up-to- date. Can work well when there’s many writes. 2. Transfer the data from one replica to another – works well when there’s many reads. 3. Propagate the update to the other replicas – this is “active replication”, and shifts the workload to each of the replicas upon an “initial write”.
  • 50. 50 Push vs. Pull Protocols • Another design issue relates to whether or not the updates are pushed or pulled? 1. Push-based/Server-based Approach: sent “automatically” by server, the client does not request the update. This approach is useful when a high degree of consistency is needed. Often used between permanent and server-initiated replicas. 2. Pull-based/Client-based Approach: used by client caches (e.g., browsers), updates are requested by the client from the server. No request, no update!
  • 51. 51 Pull versus Push Protocols A comparison between push-based and pull-based protocols in the case of multiple-client, single- server systems
  • 52. 52 Hybrid Schemes • “Leases” – a promise from a server to push updates to a client for a period of time. • Once the lease expires, the client reverts to a pull-based approach (until another lease is issued). • Three kinds of leases: – Age-based Leases – Renewal-Frequency based Leases – State-based Leases
  • 53. 53 Epidemic Protocols • This is an interesting class of protocols that can be used to implement Eventual Consistency (note: these protocols are used in Bayou). • The main concern is the propagation of updates to all the replicas in as few a number of messages as possible. • Of course, here we are spreading updates, not diseases! • With this “update propagation model”, the idea is to “infect” as many replicas as quickly as possible.
  • 54. 54 Epidemic Protocols: Terminology • Infective replica: a server that holds an update that can be spread to other replicas. • Susceptible replica: a yet to be updated server. • Removed replica: an updated server that will not (or cannot) spread the update to any other replicas. • The trick is to get all susceptible servers to either infective or removed states as quickly as possible without leaving any replicas out.
  • 55. 55 The Anti-Entropy Protocol • Entropy: “a measure of the degradation or disorganization of the universe”. • Server P picks Q at random and exchanges updates, using one of three approaches: 1. P only pushes its own updates to Q 2. P only pulls in new updates from Q 3. P and Q send updates to each other • Sooner or later, all the servers in the system will be infected (updated). Works well.
  • 56. 56 The Gossiping Protocol • This variant is referred to as “gossiping” or “rumour spreading”, and works as follows: 1. P has just been updated for item ‘x’. 2. It immediately pushes the update of ‘x’ to Q. 3. If Q already knows about ‘x’, P becomes disinterested in spreading any more updates (rumours) and is removed. 4. Otherwise P gossips to another server, as does Q. • This approach is good, but can be shown not to guarantee the propagation of all updates to all servers. Oh dear.
  • 57. 57 The Best of Both Worlds • A mix of anti-entropy and gossiping is regarded as the best approach to rapidly infecting systems with updates. • However, what about removing data? – Updates are easy, deletion is much, much harder! – Under certain circumstances, after a deletion, an “old” reference to the deleted item may appear at some replica and cause the deleted item to be reactivated! – One solution is to issue “Death Certificates” for data items – these are a special type of update. – Only problem remaining is the eventual removal of “old” death certificates (with which timeouts can help).
  • 58. 58 Consistency Protocols • This is a specific implementation of a consistency model. • The most widely implemented models are: 1. Sequential Consistency. 2. Weak Consistency (with sync variables). 3. Atomic Transactions.
  • 59. 59 Primary-Based Protocols • Each data item is associated with a “primary” replica. • The primary is responsible for coordinating writes to the data item. • There are two types of Primary-Based Protocol: 1. Remote-Write 2. Local-Write
  • 60. 60 Remote-Write Protocols The principle of a primary-backup protocol
  • 61. 61 The Bad and Good of Primary-Backup • Bad: Performance! • All of those writes can take a long time (especially when a “blocking write protocol” is used). • Using a non-blocking write protocol to handle the updates can lead to fault tolerant problems (which is our next topic). • Good: The benefit of this scheme is, as the primary is in control, all writes can be sent to each backup replica IN THE SAME ORDER, making it easy to implement sequential consistency.
  • 62. 62 Local-Write Protocols • In this protocol, a single copy of the data item is still maintained. • Upon a write, the data item gets transferred to the replica that is writing. • That is, the status of primary for a data item is transferable. • This is also called a “fully migrating approach”.
  • 63. 63 Local-Write Protocols Example Primary-based local-write protocol in which a single copy is migrated between processes (prior to the read/write).
  • 64. 64 Local-Write Issues • The big question to be answered by any process about to read from or write to the data item is: – “Where is the data item right now?” • It is possible to use some of the dynamic naming technologies studied earlier in this course, but scaling quickly becomes an issue. • Processes can spend more time actually locating a data item than using it!
  • 65. 65 Local-Write Protocols Primary-backup protocol in which the primary migrates to the process wanting to perform an update, then updates the backups. Consequently, reads are much more efficient.
  • 66. 66 Replicated-Write Protocols • With these protocols, writes can be carried out at any replica. • Another name might be: “Distributed- Write Protocols” • There are two types: 1. Active Replication 2. Majority Voting (Quorums)
  • 67. 67 Active Replication • A special process carries out the update operations at each replica. • Lamport’s timestamps can be used to achieve total ordering, but this does not scale well within distributed systems. • An alternative/variation is to use a sequencer, which is a process that assigns a unique ID# to each update, which is then propagated to all replicas. • This can lead to another problem: replicated invocations.
  • 68. 68 Replicated Invocations: The Problem The problem of replicated invocations – ‘B’ is a replicated object (which itself calls ‘C’). When ‘A’ calls ‘B’, how do we ensure ‘C’ isn’t invoked three times?
  • 69. 69 Replicated Invocations: Solutions a) Using a coordinator for ‘B’, which is responsible for forwarding an invocation request from the replicated object to ‘C’. b) Returning results from ‘C’ using the same idea: a coordinator is responsible for returning the result to all ‘B’s. Note the single result returned to ‘A’.
  • 70. 70 Quorum-Based Protocols • Clients must request and acquire permissions from multiple replicas before either reading/writing a replicated data item. • Consider this example: – A file is replicated within a distributed file system. – To update a file, a process must get approval from a majority of the replicas to perform a write. The replicas need to agree to also perform the write. – After the update, the file has a new version # associated with it (and it is set at all the updated replicas). – To read, a process contacts a majority of the replicas and asks for the version # of the files. If the version # is the same, then the file must be the most recent version, and the read can proceed.
  • 72. 72 Quorum-Based Protocols Three examples of the voting algorithm: (a) A correct choice of read and write set. (b) A choice that may lead to write-write conflicts. (c) A correct choice, known as ROWA (read one, write all).