Transactions and Concurrency Control in distributed systems. Transaction properties, classification, and transaction implementation. Flat, Nested, and Distributed transactions. Inconsistent Retrievals, Lost Update, Dirty Read, and Premature Writes Problem
1. Transactions & Concurrency
Control
CS4262 Distributed Systems
Dilum Bandara
Dilum.Bandara@uom.lk
Some slides adapted from U Uthaiyashankar (WSO2) and Rajkumar
Buyya’s
3. Transactions
Transactions protect a shared resource (e.g.,
shared data) against simultaneous access by
several concurrent processes
Transactions can do much more…
They allow a process to access & modify multiple data
items as a single atomic operation
All-or-nothing property
3
4. Transactions in DBMS
A unit of work performed within a DBMS against
a database
Treated in a coherent & reliable way independent of
other transactions
2 main purposes
To provide reliable units of work that allow correct
recovery from failures & keep a database consistent
even in cases of (complete or partial) system failure
To provide isolation between programs accessing a
database concurrently
4
5. Goal of Transactions
To ensure objects managed by a server remain
in a consistent state when accessed by multiple
transactions in the presence of server failure
Transactions deal with process crash failures &
communication omission failures
Transactions don’t deal with Byzantine failure
5
6. Transaction Model
Recoverable objects
Objects that can be recovered after their server
crashes
Failure atomicity
Effects of the transaction are atomic even when
server crashes
Transactions require coordination among
A client program
Some recoverable objects
A coordinator
6
7. Transaction Model (Cont.)
Each transaction is created & managed by a
coordinator which implements the coordinator
interface
Coordinator gives each transaction an identifier (TID)
Clients invoke transaction coordinator interface
methods to perform a transaction
A transaction may be aborted by client or server
7
8. Transaction Coordinator Interface
openTransaction() → trans
Starts a new transaction & delivers a unique TID trans
trans will be used in other operations in the transaction
closeTransaction(trans) → (commit, abort)
Ends a transaction
commit – indicates that transaction has committed
abort – Indicates that it has aborted
abortTransaction(trans)
Aborts the transaction
8
9. Properties of a Transaction
Proposed by Harder & Reuter in 1983
ACID Property
Atomic
Consistent
Isolated
Durable
9
10. Atomic Property
Process all of a transaction or none of it
To outside world, transaction happens indivisibly
While a transaction happens, other processes
can’t see any intermediate states
Example
A file is 100 bytes long when a transaction starts to
append to it
If other processes read the file during transaction,
they should see & access only the 100 bytes
If transaction commits successfully, file grows
instantaneously to its new size
10
11. Consistent Property
Data on all systems reflects the same state
Transaction doesn’t violate system invariants
Example
In a banking system, a key invariant is the law of
conservation of money
After any internal transfer, amount of money in bank
must be same as it was before transfer
During internal transfer operations, this invariant may
be violated
But this violation isn’t visible outside the transaction
11
12. Isolated Property
Transactions don’t interact/interfere with one
another
Concurrent transactions don’t interfere with each
other
i.e., transactions are serializable
Example
If 2+ transactions are running at the same time, to
each of them & to other processes, final result
appears as though all transactions ran sequentially (in
some system-dependent order)
12
13. Durable Property
Effects of a completed transaction are persistent
Once a transaction commits, changes are
permanent
No failure after the commit can undo results or
cause them to be lost
13
14. Ensuring Transaction Properties
To support “failure atomicity” & “durability”,
objects must be recoverable
To support “isolation” & “consistency” server
needs to synchronize operations
Perform transactions serially (sequentially)
Generally, an unacceptable solution
Allow transactions to execute concurrently, if they
would have the same effect as a serial execution
Known as serializable or serially equivalent transactions
14
16. Flat Transactions
A series of operations
Don’t allow partial results to be committed or
aborted
e.g., booking multiple air lines from start to
destination
16
17. Nested Transactions
Constructed from a number of sub-transactions
Top-level transaction may fork off children that
run in parallel to one another
Incase of failure results of some sub-transaction
that committed must be undone
e.g., booking multiple air lines from start to
destination
17
18. Distributed Transactions
Logically, a flat, indivisible transaction that
operates on distributed data
In contrast, a nested transaction is logically
decomposed into a hierarchy of sub-transactions
18
Source: Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007
19. Achieving Transaction Properties in
Distributed Transactions
To achieve atomicity & durability
Distributed commit protocols
To achieve consistency & isolation
Concurrency control algorithms (distributed locking
protocols)
We need separate distributed algorithms to
handle “locking of data” & “committing entire
transaction”
19
20. Transaction Implementation
Consider transactions on a file system
If each process executing a transaction just updates
the file it uses in place, then transactions will not be
atomic
2 common implementation techniques
Private workspace
Write-ahead log
20
21. Private Workspace
Provide private (local) copies
All updates on private copies
Update the original (global) copy while
committing
Issues
Cost of copying everything
Possible solution
Keep a pointer to relevant file block
When writing copy file block, instead of entire file
Update original when closing file
Examples – Andrew file systems
21
22. Example – Private Workspace
Implementation
22
Source: Andrew S. Tanenbaum , Distributed Operating Systems
23. Write-Ahead Log
Files are modified in-place
Before any changes, a record is written in a log
indicating
Which transaction is making the change
Which file & block is being changed
What old & new values are
Change is made to file only after log is written
successfully
Example – log structured file system
If transaction succeeds & is committed, a commit
record is written in log
If transaction is aborted, log is used to revert file
back to its original state – rollback 23
25. Concurrency Control
Goal
Allow simultaneous execution of transactions
While enabling each transaction to be isolated & data
items being manipulated are left in a consistent state
Consistency
Achieved by giving transactions access to data items in
a specific order
Final result would be the same as if all transactions
were run sequentially
25
27. Achieving Concurrency Control
(Cont.)
Data manager
Performs actual read/write on data
Not concerned about transaction
Scheduler
Responsible for properly controlling concurrency
Determines which transaction is allowed to pass a read/write
operation to data manager & at which time
Schedules individual read/write s.t. transaction isolation &
consistency properties are met
Transaction manager
Guarantees transaction atomicity
Processes transaction primitives by transforming them into
scheduling requests
27
28. Each site has a scheduler & data manager
Each transaction is handled by a transaction manager
Transaction manager communicates with schedulers
at different sites
Scheduler may also communicate with remote data
managers
Distributed Concurrency Control
28
30. Schedules – Example
Schedule – system dependent order of actions
Final result looks as though all transactions ran sequentially
30
31. Concurrent Transaction Issues
Lost update problem
Transactions should not read a “stale” value & use it
in computing a new value
Inconsistent retrievals problem
Update transactions shouldn’t interfere with retrievals
Illustrated in the context of banking examples…
31
32. “Lost Update” Problem
3 bank accounts (A, B, C) with balances,
A = $100, B = $200, C = $300
Transaction T
Transfers from account A to B, for 10% increase in the B’s
balance
B = $200 + ($200*10%) = $220
A = $100 – ($200*10%) = $80
Transaction U
Transfers from C to B, for 10% increase in B’s balance
B = $220 + ($220*10%) = $242
C = $300 – ($220*10%) = $278
What might happen if transactions T & U are executed
concurrently?
32
34. “Inconsistent Retrievals” Problem
2 bank accounts (A, B) with balances
A = $200
B = $200
Transaction V
Transfers $100 of account A to B
A = $200 - $100 = $100
B = $200 + $100 = $300
Transaction W
Computes sum of all account balances (branch total)
Total = $100 + $300 +…… = $400 + ……
What might happen if transactions V & W are executed
concurrently? 34
36. Serial Equivalence
Serial equivalence prevents occurrence of lost
updates & inconsistent retrievals
Thus, serial equivalence is used as a criterion in
concurrency control algorithms
2 transactions are serially equivalent if
All pairs of conflicting operations of 2 transactions are
executed in same order at all of the objects that both
transactions access
36
37. Serial Equivalence (Cont.)
Consider 2 objects i & j, & 2 transactions T & U
Serially equivalent orderings require one of the
following 2 conditions
T accesses i before U, and T accesses j before U
U accesses i before T, and U accesses j before T
37
42. Issues in Aborting Transactions
Servers must record effects of all committed
transactions & none of aborted transactions
2 well-known problems
Dirty reads
Premature writes
Both of these can occur in the presence of
serially equivalent executions of transactions
42
44. Example – Premature Writes
Due to interaction between write operations on
the same object belonging to different
transactions
If U aborts & T commits, balance should be $105
44
45. Concurrency Control Algorithms
2-Phase Locking (2PL)
Get all locks, then release them
No more lock acquisition when a lock is release
Timestamp ordering
Lamport’s time stamp
Optimistic concurrency
Go & do the changes, if there is a problem let’s
handle them later
How to do these concurrently on a distributed
system? 45
Notes de l'éditeur
Both T and U read the balance of B (200)
T overwrites the update by U without seeing it
W computes the sum before V updates account B
Both T and U read the balance of B (200)
T overwrites the update by U without seeing it