Building Scalable Producer-Consumer Pools Based on Elimination-Diffraction Trees

Yehuda Afek, Guy Korland, Maria Natanzon, and Nir Shavit
The Pool
Producer-consumer pools, that is, collections of
unordered objects or tasks, are a fundamental
element of modern multiprocessor software and a
target of extensive research and development
[Figure: producers P1 ... Pn call Put(x), Put(y), Put(z) on the pool; consumers C1 ... Cn call Get( ).]
ED-Tree Pool
We present the ED-Tree, a distributed pool
structure based on a combination of the
elimination-tree and diffracting-tree
paradigms, allowing high degrees of
parallelism with reduced contention
Java JDK 6.0:
• SynchronousQueue/Stack (Lea, Scott, and Scherer): a pairing-up function without buffering. Producers and consumers wait for one another.
• LinkedBlockingQueue: producers put their value and leave; consumers wait for a value to become available.
• ConcurrentLinkedQueue: producers put their value and leave; consumers return null if the pool is empty.
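The three behaviours listed above can be observed directly on the JDK classes. The following is a minimal sketch, not taken from the slides; the class and method names are hypothetical, and a timed poll stands in for a blocking take so that the example terminates.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, for illustration only.
class JdkPoolSemantics {
    // SynchronousQueue has no buffer: a non-blocking offer fails
    // when no consumer is currently waiting to receive the item.
    static boolean synchronousOfferWithoutConsumer() {
        return new SynchronousQueue<String>().offer("x");
    }

    // LinkedBlockingQueue: a consumer waits for an item. A timed poll
    // is used here instead of take() so the call returns on an empty queue.
    static String timedPollOnEmptyBlockingQueue() {
        LinkedBlockingQueue<String> q = new LinkedBlockingQueue<String>();
        try {
            return q.poll(10, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    // ConcurrentLinkedQueue: poll() returns null at once when the queue is empty.
    static String pollOnEmptyConcurrentQueue() {
        return new ConcurrentLinkedQueue<String>().poll();
    }
}
```

All three calls come back empty-handed, but for different reasons: the first because nobody is waiting to pair up, the second after waiting, the third immediately.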
Drawback
All these structures are based on a centralized structure, such as a lock-free queue or stack, and are thus limited in their scalability: the head of the stack or queue is a sequential bottleneck and a source of contention.
Some Observations
• A pool does not have to obey either LIFO or FIFO semantics.
• Therefore, no centralized structure is needed to hold the items and to serve producers' and consumers' requests.
New approach
ED-Tree: a combined variant of
the diffracting-tree structure (Shavit and Zemach) and
the elimination-tree structure (Shavit and Touitou)
The basic idea:
• Use randomization to distribute the concurrent requests of threads onto many locations so that they collide with one another and can exchange values, thus avoiding a central place through which all threads pass.
The result:
• A pool that allows both parallelism and reduced contention.
A little history
• Both diffraction and elimination were presented years ago and were claimed, on the basis of simulation, to be effective.
• However, elimination trees and diffracting trees were never used to implement real-world structures.
• Elimination and diffraction were never combined in a single data structure.
Diffraction trees
A binary tree of objects called balancers [Aspnes-Herlihy-Shavit], each with a single input wire and two output wires.
[Figure: a balancer b with input tokens 1 to 5 split across its two output wires.]
Threads arrive at a balancer, which alternately sends them to its two output wires ("left and right"), so its top wire always carries at most one more thread than the bottom one.
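A single balancer can be sketched as one toggle bit that every arriving thread flips. This is an illustrative sketch rather than the authors' code; the class name `ToggleBalancer` and the 0/1 wire encoding are assumptions.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a balancer built from a single toggle bit.
class ToggleBalancer {
    private final AtomicBoolean toggle = new AtomicBoolean(false);

    // Returns 0 for the top wire and 1 for the bottom wire.
    // Each successful compare-and-set flips the bit, so consecutive
    // arrivals alternate between the two wires.
    int traverse() {
        while (true) {
            boolean old = toggle.get();
            if (toggle.compareAndSet(old, !old)) {
                return old ? 1 : 0;
            }
        }
    }
}
```

With five sequential arrivals the wires are chosen in the order 0, 1, 0, 1, 0, so the top wire ends up with exactly one more thread than the bottom one.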
Diffraction trees
[Figure: a tree of balancers b [Shavit-Zemach] distributing input tokens 1 to 10 across its output wires.]
In any quiescent state (when there are no threads in the tree), the tree preserves the step property: the outputs are balanced so that the top leaves have output at most one more element than the bottom ones, and there are no gaps.
Diffraction trees
Connect each output wire to a lock-free queue.
[Figure: a tree of balancers b with a queue attached to each output wire.]
To perform a push, threads traverse the balancers from the root to the leaves and then push the item onto the appropriate queue.
To perform a pop, threads traverse the balancers from the root to the leaves and then pop from the appropriate queue, or block if the queue is empty.
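The push and pop traversals can be sketched as follows. This is a simplified, non-blocking illustration under several assumptions: the class `TreePool`, the heap-style array layout of balancers, the use of separate toggle counters for pushes and pops, and returning null instead of blocking on an empty leaf are all choices made for this sketch only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a tree of toggle balancers with one queue per leaf.
class TreePool<T> {
    private final int depth;
    // One counter per balancer; its lowest bit acts as the toggle bit.
    private final AtomicInteger[] pushToggles;
    private final AtomicInteger[] popToggles;
    private final List<ConcurrentLinkedQueue<T>> leaves =
            new ArrayList<ConcurrentLinkedQueue<T>>();

    TreePool(int depth) {
        this.depth = depth;
        int balancers = (1 << depth) - 1;
        pushToggles = new AtomicInteger[balancers];
        popToggles = new AtomicInteger[balancers];
        for (int i = 0; i < balancers; i++) {
            pushToggles[i] = new AtomicInteger();
            popToggles[i] = new AtomicInteger();
        }
        for (int i = 0; i < (1 << depth); i++) {
            leaves.add(new ConcurrentLinkedQueue<T>());
        }
    }

    // Walks from the root to a leaf, flipping one toggle per level.
    // Balancers are stored heap-style: children of node i are 2i+1 and 2i+2.
    private int route(AtomicInteger[] toggles) {
        int node = 0;
        for (int level = 0; level < depth; level++) {
            int bit = toggles[node].getAndIncrement() & 1;
            node = 2 * node + 1 + bit;
        }
        return node - ((1 << depth) - 1); // index of the leaf queue
    }

    void push(T item) {
        leaves.get(route(pushToggles)).add(item);
    }

    // Non-blocking variant: returns null if the chosen leaf queue is empty.
    T pop() {
        return leaves.get(route(popToggles)).poll();
    }

    int leafSize(int leaf) {
        return leaves.get(leaf).size();
    }
}
```

In a sequential run, eight pushes into a tree of depth 3 land on eight different leaf queues, and the next eight pops visit the leaves in the same order and retrieve every item.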
Diffraction trees
Problem:
Each toggle bit is a hot spot.
[Figure: a tree of balancers b, each holding a 0/1 toggle bit, with tokens 1 to 3 passing through.]
Diffraction trees
Observation:
If an even number of threads pass through a balancer, the outputs are evenly balanced on the top and bottom wires, but the balancer's state remains unchanged.

The approach:
Add a diffraction array (prism array) in front of each toggle bit.
[Figure: a prism array placed in front of the 0/1 toggle bit.]
Elimination
• At any point while traversing the tree, if a producer and a consumer collide, there is no need for them to diffract and continue traversing the tree.
• The producer can hand its item to the consumer, and both can leave the tree.
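The hand-off between a producer and a consumer can be illustrated with a single shared slot. This is a deliberately simplified sketch, not the exchanger from the slides: the class `EliminationSlot` and its methods are hypothetical, only the producer waits in the slot, and every offered item is assumed to be a distinct object.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical one-cell elimination slot: a producer parks an item,
// a consumer that arrives meanwhile takes it, and both are done.
class EliminationSlot<T> {
    private final AtomicReference<T> slot = new AtomicReference<T>();

    // Producer side: park the item if the slot is free.
    boolean tryOffer(T item) {
        return slot.compareAndSet(null, item);
    }

    // Consumer side: take whatever is parked, or null if nothing is there.
    T tryTake() {
        return slot.getAndSet(null);
    }

    // Producer side, after a timeout: try to remove its own item.
    // Returns true if the item was still parked (no elimination happened),
    // false if a consumer had already taken it.
    boolean withdraw(T item) {
        return slot.compareAndSet(item, null);
    }
}
```

A producer that calls `tryOffer` and later finds that `withdraw` returns false knows its item was eliminated against a consumer, so it can leave without descending further.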
Adding elimination
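[Figure: in an array with slots 1 to k placed in front of the 0/1 toggle bits, a Put(x) meets a Get( ); the Get( ) returns x and the Put(x) returns ok.]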
Using elimination-diffraction balancers
Let the array at each balancer be a diffraction-elimination array:
• If two producer (or two consumer) threads meet in the array, they leave on opposite wires without touching the bit, since it would in any case remain in its original state.
• If a producer and a consumer meet, they eliminate, exchanging items.
• If a producer or consumer call does not manage to meet another in the array, it toggles the respective bit of the balancer and moves on.
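The three cases above amount to a small decision rule. The sketch below encodes only that rule; the class `CollisionRule`, the `Outcome` enum, and the convention that a null partner means "no collision" are assumptions made for illustration.

```java
// Hypothetical encoding of the three cases at an elimination-diffraction array.
class CollisionRule {
    enum Type { PRODUCER, CONSUMER }
    enum Outcome { ELIMINATION, DIFFRACTION, TOGGLE }

    // partner == null means the thread met nobody in the array.
    static Outcome decide(Type self, Type partner) {
        if (partner == null) {
            return Outcome.TOGGLE;        // fall back to the toggle bit
        }
        if (self == partner) {
            return Outcome.DIFFRACTION;   // same type: leave on opposite wires
        }
        return Outcome.ELIMINATION;       // producer meets consumer: exchange and leave
    }
}
```

Only the `TOGGLE` outcome touches shared balancer state, which is why successful collisions relieve the hot spot.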
ED-tree
What about low concurrency levels?
• We show that elimination and diffraction techniques can be combined to work well at both high and low loads.
• To ensure good performance at low loads, we use several techniques that make the algorithm adapt to the current contention level.
Adaptation mechanisms
Use backoff in space:
• Randomly choose a cell in a certain range of the array.
• If the cell is busy (already occupied by two threads), increase the range and repeat.
• Else spin and wait for a collision.
• If timed out (no collision), decrease the range and repeat.
• If a certain number of timeouts is reached, spin on the first cell of the array for a period, and then move on to the toggle bit and the next level.
• If a certain number of timeouts is reached, don't try to diffract on any of the next levels; go straight to the toggle bit.
Each thread remembers the last range it used at the current balancer and starts from this range next time.
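The range bookkeeping behind backoff in space can be sketched as a small per-thread object. The slides do not say by how much the range grows or shrinks, so the doubling and halving here, as well as the class name `AdaptiveRange` and the timeout threshold parameter, are assumptions.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical per-thread, per-balancer state for backoff in space.
class AdaptiveRange {
    private final int arraySize;
    private int range = 1;     // remembered between visits to this balancer
    private int timeouts = 0;

    AdaptiveRange(int arraySize) {
        this.arraySize = arraySize;
    }

    // Pick a random cell within the current range of the array.
    int pickCell() {
        return ThreadLocalRandom.current().nextInt(range);
    }

    // The chosen cell was busy: contention is high, spread out (assumed doubling).
    void onBusyCell() {
        range = Math.min(arraySize, range * 2);
    }

    // Nobody showed up in time: contention is low, close in (assumed halving).
    void onTimeout() {
        range = Math.max(1, range / 2);
        timeouts++;
    }

    // After enough timeouts, skip the array and go straight to the toggle bit.
    boolean shouldSkipArray(int maxTimeouts) {
        return timeouts >= maxTimeouts;
    }

    int range() {
        return range;
    }
}
```

Keeping one such object per thread and per balancer mirrors the remark that each thread remembers the last range it used.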
Starvation avoidance
• Threads that failed to eliminate and propagated all the way to the leaves can wait a long time for their requests to complete, while new threads entering the tree and eliminating finish faster.
• To avoid starvation, we limit the time a thread can be blocked in the queues before it retries the whole traversal.
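A bounded wait with retry might look like the sketch below. It is an illustration under assumptions: `BoundedWait`, the `traverse` supplier (standing in for a full root-to-leaf traversal), and the attempt limit are hypothetical, and only the consumer side is shown.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical consumer-side loop: wait at a leaf for a bounded time,
// then redo the whole traversal instead of blocking indefinitely.
class BoundedWait {
    static <T> T getWithRetry(Supplier<LinkedBlockingQueue<T>> traverse,
                              long waitMillis, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Each attempt re-traverses the tree and may reach a different leaf.
            LinkedBlockingQueue<T> leaf = traverse.get();
            try {
                T item = leaf.poll(waitMillis, TimeUnit.MILLISECONDS);
                if (item != null) {
                    return item;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null; // gave up after maxAttempts traversals
    }
}
```

Retrying from the root gives a waiting thread a fresh chance to eliminate against a newly arrived partner rather than staying parked at one leaf.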
Implementation
• Each balancer is composed of an elimination array, a pair of toggle bits, and two references, one to each of its child nodes.
public class Balancer
{
    ToggleBit producerToggle, consumerToggle; // one toggle bit per operation type
    Exchanger[] eliminationArray;             // the elimination-diffraction array
    Balancer leftChild, rightChild;           // next-level balancers
    ThreadLocal<Integer> lastSlotRange;       // range this thread last used at this balancer
}
Implementation
public class Exchanger
{
    AtomicReference<ExchangerPackage> slot;
}
public class ExchangerPackage
{
    Object value;
    State state; // WAITING/ELIMINATION/DIFFRACTION
    Type type;   // PRODUCER/CONSUMER
}
Implementation


Starting from the root of the tree:
• Enter the balancer.
• Choose a cell in the array and try to collide with another thread, using the backoff mechanism described earlier.
• If a collision with another thread occurred:
    • If both threads are of the same type, leave to the next-level balancer (each in a separate direction).
    • If the threads are of different types, exchange values and leave.
• Else (no collision), use the appropriate toggle bit and move to the next level.
• If one of the leaves is reached, go to the appropriate queue and insert or remove an item according to the thread type.
Performance evaluation
Sun UltraSPARC T2 Plus multi-core machine:
• 2 processors, each with 8 cores
• each core with 8 hardware threads
• 64-way parallelism on a processor and 128-way parallelism across the machine
Most of the tests were done on one processor, i.e., with at most 64 hardware threads.
Performance evaluation



• A tree with 3 levels and 8 queues.
• The queues are SynchronousBlocking/LinkedBlocking/ConcurrentLinked, according to the pool specification.
[Figure: a three-level tree of balancers b.]
Performance evaluation
Synchronous stack of Lea et al. vs. ED synchronous pool
Performance evaluation
Linked blocking queue vs ED blocking pool
Performance evaluation
Concurrent linked queue vs ED non blocking pool
Adding a delay between accesses to the pool (32 consumers, 32 producers)
Changing the percentage of consumers vs. total number of threads (64 threads)
25% producers, 75% consumers
Elimination rate
Elimination range

Building Scalable Producer-Consumer Pools based on Elimination-Diraction Trees

  • 1. Building Scalable Producer-Consumer Pools based on Elimination-Diraction Trees Yehuda Afek and Guy Korland and Maria Natanzon and Nir Shavit
  • 2. The Pool Producer-consumer pools, that is, collections of unordered objects or tasks, are a fundamental element of modern multiprocessor software and a target of extensive research and development Get( ) P1 Put(x) . . P2 C1 . . C2 Put(y) Get( ) Pn Put(z) Get( ) pool Cn
  • 3. ED-Tree Pool We present the ED-Tree, a distributed pool structure based on a combination of the elimination-tree and diffracting-tree paradigms, allowing high degrees of parallelism with reduced contention
  • 4. Java JDK6.0:  SynchronousQueue/Stack (Lea, Scott, and Shearer) - pairing up function without buffering. Producers and consumers wait for one another  LinkedBlockingQueue - Producers put their value and leave, Consumers wait for a value to become available.  ConcurrentLinkedQueue - Producers put their value and leave, Consumers return null if the pool is empty.
  • 5. Drawback All these structures are based on a centralized structures like a lock-free queue or a stack, and thus are limited in their scalability: the head of the stack or queue is a sequential bottleneck and source of contention.
  • 6. Some Observations A pool does not have to obey neither LIFO or FIFO semantics.  Therefore, no centralized structure needed, to hold the items and to serve producers and consumers requests.
  • 7. New approach ED-Tree: a combined variant of the diffracting-tree structure (Shavit and Zemach) and the elimination-tree structure (Shavit and Touitou) The basic idea:  Use randomization to distribute the concurrent requests of threads onto many locations so that they collide with one another and can exchange values, thus avoiding using a central place through which all threads pass. The result:  A pool that allows both parallelism and reduced contention.
  • 8. A little history  Both diffraction and elimination were presented years ago, and claimed to be effective through simulation  However, elimination trees and diffracting trees were never used to implement real world structures  Elimination and diffraction were never combined in a single data structure
  • 9. Diffraction trees: A binary tree of objects called balancers [Aspnes-Herlihy-Shavit], each with a single input wire and two output wires. [Figure: tokens 1 to 5 pass through a balancer b and are split alternately between its two output wires.] Threads arrive at a balancer and it sends them alternately to its top and bottom output wires, so the top wire has always carried at most one more thread than the bottom one.
  • 10. Diffraction trees [Shavit-Zemach]: [Figure: tokens 1 to 10 routed through a tree of seven balancers b to its leaves.] In any quiescent state (when there are no threads in the tree), the tree preserves the step property: the output items are balanced so that the top leaves have output at most one more element than the bottom ones, and there are no gaps.
  • 11. Diffraction trees: Connect each output wire to a lock-free queue. [Figure: tree of seven balancers b with a queue attached to each leaf.] To perform a push, a thread traverses the balancers from the root to a leaf and then pushes the item onto the queue at that leaf. To perform a pop, a thread traverses the balancers from the root to a leaf and then pops from the queue at that leaf, or blocks if the queue is empty.
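
The push and pop traversals described on this slide can be sketched as a tree of plain toggle bits with a queue at each leaf. This is a minimal sketch under several assumptions: the class name `ToggleTreePool` and its methods are hypothetical, each balancer uses separate producer and consumer toggles (as the editor's notes mention), the leaves are non-blocking `ConcurrentLinkedQueue`s, and no prism or elimination array is included.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a pool built from toggle-bit balancers only.
class ToggleTreePool<T> {
    private final int depth;
    // Balancers are stored in heap order: node i has children 2i+1 and 2i+2.
    private final AtomicInteger[] producerToggle;
    private final AtomicInteger[] consumerToggle;
    private final List<ConcurrentLinkedQueue<T>> leaves =
            new ArrayList<ConcurrentLinkedQueue<T>>();

    ToggleTreePool(int depth) {
        this.depth = depth;
        int balancers = (1 << depth) - 1;
        producerToggle = new AtomicInteger[balancers];
        consumerToggle = new AtomicInteger[balancers];
        for (int i = 0; i < balancers; i++) {
            producerToggle[i] = new AtomicInteger();
            consumerToggle[i] = new AtomicInteger();
        }
        for (int i = 0; i < (1 << depth); i++) {
            leaves.add(new ConcurrentLinkedQueue<T>());
        }
    }

    // Walk from the root to a leaf, flipping one toggle per level.
    private int route(AtomicInteger[] toggles) {
        int node = 0;
        for (int level = 0; level < depth; level++) {
            // The low bit of the counter acts as the toggle: 0 = left, 1 = right.
            int bit = toggles[node].getAndIncrement() & 1;
            node = 2 * node + 1 + bit;
        }
        return node - ((1 << depth) - 1); // index of the leaf that was reached
    }

    void put(T item) {
        leaves.get(route(producerToggle)).add(item);
    }

    // Non-blocking variant: returns null if the reached leaf is empty.
    T get() {
        return leaves.get(route(consumerToggle)).poll();
    }

    int leafSize(int leaf) {
        return leaves.get(leaf).size();
    }
}
```

With a depth of 3 this gives seven balancers and eight queues. Every request still flips the root toggle, which is the hot spot the following slide points out.
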
  • 12. Diffraction trees: Problem: each toggle bit is a hot spot. [Figure: a tree of balancers b, each holding a 0/1 toggle bit.]
  • 13. Diffraction trees: Observation: if an even number of threads pass through a balancer, the outputs are evenly balanced on the top and bottom wires, and the balancer's state remains unchanged. The approach: add a diffraction array in front of each toggle bit. [Figure: a prism array placed in front of a 0/1 toggle bit.]
  • 14. Elimination: At any point while traversing the tree, if a producer and a consumer collide, there is no need for them to diffract and continue traversing the tree. The producer can hand its item to the consumer, and both can leave the tree.
  • 16. Using elimination-diffraction balancers: Let the array at each balancer be a diffraction-elimination array:
      • If two producer threads (or two consumer threads) meet in the array, they leave on opposite wires without touching the toggle bit, since it would end up in its original state anyway.
      • If a producer and a consumer meet, they eliminate, exchanging items.
      • If a producer or consumer call does not manage to meet another call in the array, it toggles the respective bit of the balancer and moves on.
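
The three cases above amount to a small decision rule plus an atomic toggle. The following is a minimal sketch of that rule only; the names `EdBalancerRule`, `Outcome`, `resolve`, and `toggle` are hypothetical, and the collision detection itself (meeting in the array) is left out.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the per-balancer decision rule.
class EdBalancerRule {
    enum Type { PRODUCER, CONSUMER }
    enum Outcome { ELIMINATED, DIFFRACTED, TOGGLED }

    // 'partner' is the type of the thread met in the array, or null if none was met.
    static Outcome resolve(Type self, Type partner) {
        if (partner == null) {
            return Outcome.TOGGLED;      // no collision: fall back to the toggle bit
        }
        return self == partner
                ? Outcome.DIFFRACTED     // same type: leave on opposite wires
                : Outcome.ELIMINATED;    // producer meets consumer: exchange the item
    }

    // Atomically flip the toggle bit and return the exit wire (0 = left, 1 = right).
    static int toggle(AtomicBoolean bit) {
        while (true) {
            boolean old = bit.get();
            if (bit.compareAndSet(old, !old)) {
                return old ? 1 : 0;
            }
        }
    }
}
```

Only the `TOGGLED` outcome touches shared balancer state, so under high load most threads avoid the toggle bit entirely.
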
  • 18. What about low concurrency levels? We show that elimination and diffraction techniques can be combined to work well at both high and low loads. To ensure good performance at low loads we use several techniques that make the algorithm adapt to the current contention level.
  • 19. Adaptation mechanisms:
      • Use backoff in space:
          • Randomly choose a cell in a certain range of the array.
          • If the cell is busy (already occupied by two threads), increase the range and repeat.
          • Otherwise, spin and wait for a collision.
          • If the wait times out (no collision), decrease the range and repeat.
      • If a certain number of timeouts is reached, spin on the first cell of the array for a period, and then move on to the toggle bit and the next level.
      • If a certain number of timeouts was reached, don’t try to diffract on any of the next levels; go straight to the toggle bit.
      • Each thread remembers the last range it used at the current balancer and starts from that range the next time.
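
The "backoff in space" steps can be sketched as a small per-thread object that tracks the current range. This is a sketch under stated assumptions: the slide only says the range is increased or decreased, so the doubling and halving used here are an assumption, and the names `SlotRange`, `onBusy`, `onTimeout`, and `pickSlot` are hypothetical.

```java
import java.util.Random;

// Hypothetical sketch: the slice of the array a thread currently picks cells from.
// One instance would be kept per thread and per balancer.
class SlotRange {
    private final int maxRange; // length of the elimination array
    private int range;          // cells [0, range) are candidates

    SlotRange(int initialRange, int maxRange) {
        if (initialRange < 1 || initialRange > maxRange) {
            throw new IllegalArgumentException("initialRange must be in [1, maxRange]");
        }
        this.range = initialRange;
        this.maxRange = maxRange;
    }

    // Randomly choose a cell within the current range.
    int pickSlot(Random rnd) {
        return rnd.nextInt(range);
    }

    // The chosen cell was busy: contention is high, so spread out (assumed: double).
    void onBusy() {
        range = Math.min(maxRange, range * 2);
    }

    // Nobody showed up before the timeout: contention is low, so narrow (assumed: halve).
    void onTimeout() {
        range = Math.max(1, range / 2);
    }

    int current() {
        return range;
    }
}
```

At low load the range shrinks toward a single cell, so the few threads present are more likely to meet; at high load it grows so that threads spread over more cells.
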
  • 20. Starvation avoidance: Threads that fail to eliminate and propagate all the way to the leaves can wait a long time for their requests to complete, while new threads that enter the tree and eliminate finish faster. To avoid starvation, we limit the time a thread can be blocked in the queues before it retries the whole traversal.
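
The bounded wait can be sketched with the timed `poll` of a standard `BlockingQueue`. This is a minimal sketch, not the authors' code: the class `BoundedWaitGet`, the `traverse` supplier (standing in for one root-to-leaf traversal that yields a leaf queue), and the attempt limit are all hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: a consumer that never blocks on one leaf for too long.
class BoundedWaitGet {
    // 'traverse' performs one traversal and returns the leaf queue it reached.
    // Returns null if no item was obtained within 'maxTraversals' attempts.
    static <T> T get(Supplier<BlockingQueue<T>> traverse,
                     long timeoutMillis, int maxTraversals) {
        for (int attempt = 0; attempt < maxTraversals; attempt++) {
            BlockingQueue<T> leaf = traverse.get();
            try {
                // Wait at the leaf for a bounded time only.
                T item = leaf.poll(timeoutMillis, TimeUnit.MILLISECONDS);
                if (item != null) {
                    return item;
                }
                // Timed out: loop around and retry the whole traversal.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null;
    }
}
```

Retrying from the root gives the waiting thread another chance to eliminate with a newly arrived partner instead of waiting at an unlucky leaf.
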
  • 21. Implementation: Each balancer is composed of an elimination array, a pair of toggle bits, and two references, one to each of its child nodes.
        public class Balancer {
            ToggleBit producerToggle, consumerToggle; // one toggle per request type
            Exchanger[] eliminationArray;             // the Exchanger class shown on slide 22
            Balancer leftChild, rightChild;
            ThreadLocal<Integer> lastSlotRange;       // last range used by each thread
        }
  • 22. Implementation:
        public class Exchanger {
            AtomicReference<ExchangerPackage> slot;
        }
        public class ExchangerPackage {
            Object value;
            State state; // WAITING/ELIMINATION/DIFFRACTION
            Type type;   // PRODUCER/CONSUMER
        }
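
How a single `AtomicReference` can serve as a meeting cell is sketched below. This is an illustration under assumptions, not the slide's `Exchanger`: the names `SlotSketch`, `Offer`, `tryPublish`, and `tryClaim` are hypothetical, and the sketch omits the `state` field and the way a waiting thread learns that it was claimed.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: one cell of an elimination array.
class SlotSketch {
    enum Type { PRODUCER, CONSUMER }

    static final class Offer {
        final Type type;
        final Object value; // the item for producers, null for consumers

        Offer(Type type, Object value) {
            this.type = type;
            this.value = value;
        }
    }

    private final AtomicReference<Offer> slot = new AtomicReference<Offer>();

    // First arrival: publish an offer if the cell is empty.
    boolean tryPublish(Offer mine) {
        return slot.compareAndSet(null, mine);
    }

    // Second arrival: claim the waiting offer and empty the cell.
    // Returns null if the cell was empty or another thread won the race.
    Offer tryClaim() {
        Offer waiting = slot.get();
        if (waiting != null && slot.compareAndSet(waiting, null)) {
            return waiting;
        }
        return null;
    }
}
```

A claiming thread would then compare its own type with `waiting.type` to decide between elimination and diffraction.
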
  • 23. Implementation: Starting from the root of the tree:
      • Enter the balancer.
      • Choose a cell in the array and try to collide with another thread, using the backoff mechanism described earlier.
      • If a collision with another thread occurred:
          • If both threads are of the same type, each leaves to a next-level balancer, in separate directions.
          • If the threads are of different types, exchange values and leave.
      • Otherwise (no collision), use the appropriate toggle bit and move to the next level.
      • If one of the leaves is reached, go to the appropriate queue and insert or remove an item according to the thread type.
  • 24. Performance evaluation: Sun UltraSPARC T2 Plus multi-core machine.
      • 2 processors, each with 8 cores.
      • Each core has 8 hardware threads.
      • 64-way parallelism on a processor and 128-way parallelism across the machine.
      • Most of the tests were done on one processor, i.e. with at most 64 hardware threads.
  • 25. Performance evaluation: A tree with 3 levels and 8 queues. The queues are SynchronousQueue, LinkedBlockingQueue, or ConcurrentLinkedQueue instances, according to the pool specification. [Figure: three-level tree of seven balancers b.]
  • 26. Performance evaluation: Synchronous stack of Lea et al. vs. ED synchronous pool
  • 27. Performance evaluation: Linked blocking queue vs. ED blocking pool
  • 28. Performance evaluation: Concurrent linked queue vs. ED non-blocking pool
  • 29. Adding a delay between accesses to the pool (32 consumers, 32 producers)
  • 30. Changing the percentage of consumers out of the total number of threads (64 threads)

Editor's notes

  1. The theme is building a data structure that is used as a pool, making it scalable and usable under high loads, and no less usable than existing implementations under low loads.
  2. What is a pool? A collection of items, which may be objects or tasks: a resource pool (objects that are used and then returned to the pool), a pool of jobs to perform, and so on. The pool is accessed by producers and consumers, which perform Put/Get (Push/Pop, Enqueue/Dequeue) actions. These actions can implement different semantics and be blocking or non-blocking, depending on how the pool was defined (explanation of blocking vs. non-blocking).
  3. The data structure we present is called the ED-Tree. It is a highly scalable pool to be used in multithreaded applications. We reach high performance and scalability by combining two paradigms: elimination and diffraction. The ED-Tree is implemented in Java.
  4. If we look in the Java JDK for data structures that can be used as a pool, we find the following…
  5. All the mentioned data structures are problematic: they are based on centralized structures. The head or tail of the queue/stack becomes a hot spot, and with a large number of threads performance gets worse instead of improving.
  6. If we think about it, we don’t care about the order in which items are inserted into or removed from the pool. All we want is to avoid starvation (if an item is inserted into the pool, it will eventually be removed). Therefore we can avoid using a centralized structure and distribute the pool in memory.
  7. A single level of an elimination array was also used in implementing shared concurrent stacks. However, elimination trees and diffracting trees were never used to implement real-world structures. This is mostly due to the fact that there was no need for them: machines with a sufficient level of concurrency and low enough interconnect latency to benefit from them did not exist. Today, multi-core machines present the necessary combination of high levels of parallelism and low interconnection costs. Indeed, this paper is the first to show that ED-Tree based implementations of data structures from java.util.concurrent scale impressively on a real machine (a Sun Maramba multicore machine with 2x8 cores and 128 hardware threads), delivering throughput that, at high concurrency levels, is 10 times that of the newly proposed JDK 6.0 algorithms.
  8. A balancer is usually implemented as a toggle bit: a bit that holds a binary value. Each thread changes the value to the opposite one and picks a direction to exit according to the bit value, for example 0 for left and 1 for right.
  9. The diffraction tree is constructed from a set of balancers. You can say that the tree counts the elements, i.e. distributes them equally across the leaves.
  10. If we connect a lock-free queue/stack to each leaf and use two toggle bits in each balancer, we get a data structure that obeys pool semantics.
  11. We can see that we have just moved our contention source from a single queue/stack to the balancers, starting from the entrance to the tree.
  12. The problem is solved by diffraction. What we get eventually is that each thread that approaches the pool traverses the whole tree and eventually reaches one of the queues at the leaves.
  13. Actually, if at some point during the tree traversal a producer thread and a consumer thread meet each other, they don’t have to continue traversing the tree. The consumer can take the producer's value, and both can leave the tree.
  14. Under high load, according to our statistics, about 50% of the threads that reach a level are successfully eliminated there. With a 3-level tree, 50% of the requests are eliminated at the first level, another 25% at the second, and 12.5% at the third, which adds up to 87.5%. Only the remaining 12.5% of the requests, about one in eight, survive until reaching the leaves.
  15. We also use two toggle bits at each balancer, one for producers and one for consumers, to ensure fair distribution.
  16. In the described implementation, another problem we can encounter is starvation…
  17. Each balancer is composed of an EliminationArray, a pair of toggle bits, and two references, one to each of its child nodes.
  18. The implementation of an EliminationArray is based on an array of Exchangers. Each Exchanger contains a single AtomicReference, which is used as an atomic placeholder for exchanging an ExchangerPackage. The ExchangerPackage is an object that wraps the actual data and marks its state and type.
  19. At its peak, at 64 threads, the ED-Tree delivers more than 10 times the performance of the JDK. Beyond 64 threads the threads are no longer bound to a single CPU, and traffic across the interconnect causes a moderate performance decline for the ED-Tree version (the performance of the JDK is already very low).