Understanding Hardware Transactional Memory

Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/1MUYohc.

Gil Tene explores the underlying mechanics that power HTM on current platforms, focusing on things developers need to understand when contemplating the use of HTM in new and existing code. He talks about new speculative locking mechanisms enabled by HTM, and speculates about both the near term and long term future impact of HTM on concurrency choices and everyday programming choices. Filmed at qconlondon.com.

Gil Tene is CTO and co-founder of Azul Systems. He has been involved with virtual machine and runtime technologies for the past 25 years. His focus areas include system responsiveness and latency behavior. He pioneered the Continuously Concurrent Compacting Collector (C4) that powers Azul's continuously reactive Java platforms.

Published in: Technology

  1. Understanding Hardware Transactional Memory
      Gil Tene, CTO & co-Founder, Azul Systems, @giltene
      ©2015 Azul Systems, Inc.
  2. InfoQ.com: News & Community Site
      • 750,000 unique visitors/month
      • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese)
      • Post content from our QCon conferences
      • News 15-20 / week
      • Articles 3-4 / week
      • Presentations (videos) 12-15 / week
      • Interviews 2-3 / week
      • Books 1 / month
      Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/hardware-transactional-memory
  3. Purpose of QCon
      - to empower software development by facilitating the spread of knowledge and innovation
      Strategy
      - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams
      - speakers and topics driving the evolution and innovation
      - connecting and catalyzing the influencers and innovators
      Highlights
      - attended by more than 12,000 delegates since 2007
      - held in 9 cities worldwide
      Presented at QCon London, www.qconlondon.com
  4. Agenda
      • Brief introduction
      • What is Hardware Transactional Memory (HTM)?
      • Cache coherence basics & how HTM works
      • What it looks like from a runtime point of view
      • Interesting coding considerations
  5. About me: Gil Tene
      • co-founder, CTO @Azul Systems
      • Have been working on “think different” GC and runtime approaches since 2002
      • Built world’s 1st commercially shipping HTM system, along with JVM support for HTM
      • A Long history building Virtual & Physical Machines, Operating Systems, Enterprise apps, etc...
      • I also depress people by demonstrating how terribly wrong their latency measurements are…
      * working on real-world trash compaction issues, circa 2004
  6. As far as I am concerned, GC is a solved problem
  7. Why does HTM matter now? Because it is (finally) here!
      • HTM already available in the past
        ─ e.g. Azul Vega, since 2004
        ─ e.g. some later variants of Power architecture
        ─ e.g. some designs for SPARC
      • But it is now here in commodity server chips
        ─ Intel TSX, on modern Intel Xeons
        ─ Already in E7-V3 (4+ sockets), E3-V4, E3-V5 (1 socket)
        ─ Coming (1H2016) in E5-26xx V4 (“Broadwell”) chips
        ─ Most new servers will have HTM by 2H2016
  8. What is HTM?
  9. What is HTM? (For the kind of HTM I will be talking about…)
      • Can be thought of as “Speculative Multi-Address Atomicity”
      • Transaction starts and ends with explicit instructions
      • No special load or store instructions
      • All memory operations in a successfully completed transaction appear to execute atomically (to other threads)
      • Transactions may abort. All memory operations in an aborted transaction appear “to have never happened”
  10. Cache Coherence Protocols can be messy
  14. But conceptually it’s not that messy…
  15. Cache line state from an individual CPU’s point of view
      • I (Invalid): I don’t have it
      • S (Shared): I have a copy (and someone else may, too)
      • E (Exclusive): I have the only copy
      • M (Modified): I have the only copy, and I’ve changed it
  16. Cache line state from an individual CPU’s point of view
      • I (Invalid): I don’t have it
      • S (Shared): I have a copy (and someone else may, too)
      • E (Exclusive): I have the only copy
      • M (Modified): I have the only copy, and I’ve changed it
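
The four states above can be read as a small state machine. The following is a minimal sketch in Java, not a description of any real protocol implementation: the `Mesi` enum and its event methods are hypothetical names chosen for illustration, and real coherence protocols have more states, messages, and write-back steps than this model shows.

```java
/** Simplified MESI model for one cache line in one CPU's cache (illustrative only). */
enum Mesi {
    INVALID, SHARED, EXCLUSIVE, MODIFIED;

    /** This CPU reads the line; othersHaveCopy says whether another cache holds it. */
    Mesi onLocalRead(boolean othersHaveCopy) {
        if (this == INVALID) {
            return othersHaveCopy ? SHARED : EXCLUSIVE;
        }
        return this; // S, E and M can all satisfy a local read
    }

    /** This CPU writes the line; any other copies must be invalidated first. */
    Mesi onLocalWrite() {
        return MODIFIED;
    }

    /** Another CPU wants to read the line: at most a shared copy can stay here. */
    Mesi onRemoteRead() {
        return this == INVALID ? INVALID : SHARED;
    }

    /** Another CPU wants to write the line: the copy here must be invalidated. */
    Mesi onRemoteWrite() {
        return INVALID;
    }
}
```

For example, `Mesi.INVALID.onLocalRead(false)` yields `EXCLUSIVE`, and a later `onRemoteRead()` demotes that copy to `SHARED`.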
  17. HTM builds on existing cache coherence
  18. Conceptual cache line state additions for HTM
      • Line was accessed during speculation:
        ─ Line was read from during speculation
        ─ Line was modified during speculation
      • When transaction completes, clear all speculation tracking state
      • Losing track of a line that was accessed during speculation aborts the transaction
      • Aborts invalidate speculatively modified lines
  19. What can make the cache “lose track” of a line?
      • Another CPU wants to write to it
        ─ Other CPU would need it “exclusive”
        ─ It would first need to invalidate it in this cache
      • Another CPU wants to read from it
        ─ Cache line would need to be in “shared” state here
        ─ If it were in speculatively modified state: abort
      • Capacity related self-eviction
        ─ E.g. current Xeons: 32KB, 8 way set associative
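
The abort rules on the last two slides can be sketched as a toy model of a single cache line. This is only an illustration under simplifying assumptions: `SpeculativeLine` and its method names are hypothetical, a real transaction tracks many lines at once, and the hardware does this in the cache rather than in software.

```java
/** Toy model of one cache line with HTM speculation tracking bits (illustrative only). */
final class SpeculativeLine {
    private boolean specRead;    // line was read from during speculation
    private boolean specWritten; // line was modified during speculation
    private boolean aborted;

    void speculativeRead()  { specRead = true; }
    void speculativeWrite() { specWritten = true; }

    /** Another CPU wants to write: it needs the line exclusive, so this cache loses track of it. */
    void remoteWriteRequest() {
        if (specRead || specWritten) abort();
    }

    /** Another CPU wants to read: harmless for a speculatively read line, fatal for a modified one. */
    void remoteReadRequest() {
        if (specWritten) abort();
    }

    /** The cache has to evict the line for capacity reasons. */
    void evict() {
        if (specRead || specWritten) abort();
    }

    /** Ends the transaction and clears all tracking state; returns false if it had aborted. */
    boolean commit() {
        boolean succeeded = !aborted;
        aborted = false;
        specRead = false;
        specWritten = false;
        return succeeded;
    }

    boolean isAborted() { return aborted; }

    private void abort() {
        aborted = true;
        specRead = false;    // tracking state is dropped;
        specWritten = false; // speculative modifications are discarded
    }
}
```

A speculatively read line survives `remoteReadRequest()`, while a speculatively written line does not, which matches the asymmetry described on the slide.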
  20. That’s it… For memory.
  21. CPU state. E.g. Intel TSX
      • In addition to memory transactionality, CPU architectural state is maintained
      • On abort, PC moves to location provided in XBEGIN
      • EAX is changed to indicate abort information
      • All other architecture state remains the same as it was before XBEGIN was executed
  22. So what can you do with HTM? Well, transact on memory, of course…
  23. Speculative Lock Elision
      • 2001 PhD thesis by Ravi Rajwar
      ** An independent work on a somewhat similar lock serialization avoidance concept by Jose F. Martinez and Josep Torrellas, UIUC, also published in 2001
  25. Using HTM under the hood in a JVM: A trip down transactional memory lane
  26. Speculative Locking: Breaking the Scale Barrier
      Gil Tene, VP Technology, CTO
      Ivan Posva, Senior Staff Engineer
      Azul Systems
      © 2005 Azul Systems, Inc.
  27. New JVM capabilities improve multi-threaded application scalability. How can this affect the way you code?
      • Speculative locking reduces effects of Amdahl's law
      • Multi-threaded Java Apps can Scale
      www.azulsystems.com
  28. Agenda
      • Why do we care?
      • Lock contention vs. Data contention
      • Transactional execution of synchronized {…}
      • Measurements
      • Effects on how you code for contention
      • Summary
  29. Serialized portions of program limit scale
      Amdahl’s Law
      • efficiency = 1/(N*q + (1-q))
        ─ N = # of concurrent threads
        ─ q = fraction of serialized code
      [Chart: efficiency (0.0 to 1.0) vs. processors (N, 1 to 91), one curve per fraction of serialized code: 0.10%, 0.50%, 1.00%, 2.00%, 5.00%, 20.00%]
  30. Amdahl’s Law Effect on Throughput
      [Chart: throughput scale factor (0 to 100) vs. processors (1 to 91), one curve per fraction of serialized code: 0.10%, 0.50%, 1.00%, 2.00%, 5.00%, 20.00%, plus the ideal line]
  31. Amdahl’s Law Example
      • The theoretical limit is usually intuitive
        ─ Assume 10% serialization
        ─ At best you can do 10x the work of 1 CPU
      • Efficiency drops are dramatic and may be less intuitive
        ─ Assume 10% serialization
        ─ 10 CPUs will not scale past a speedup of 5.3x (Eff. 0.53)
        ─ 16 CPUs will not scale past a speedup of 6.4x (Eff. 0.40)
        ─ 64 CPUs will not scale past a speedup of 8.8x (Eff. 0.14)
        ─ 99 CPUs will not scale past a speedup of 9.2x (Eff. 0.09)
        ─ …
        ─ It will take a whole lot of inefficient CPUs to [never] reach a 10x
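
The figures in this example follow directly from the efficiency formula two slides earlier, with speedup taken as N times the per-CPU efficiency. A minimal sketch that recomputes them, assuming nothing beyond that formula (the class and method names are illustrative, not from the talk):

```java
/** Amdahl's Law as stated on the slides: efficiency = 1 / (N*q + (1 - q)). */
final class Amdahl {
    /** Per-CPU efficiency for n concurrent threads and serialized fraction q (0..1). */
    static double efficiency(int n, double q) {
        return 1.0 / (n * q + (1.0 - q));
    }

    /** Speedup over one CPU: n CPUs, each contributing at the given efficiency. */
    static double speedup(int n, double q) {
        return n * efficiency(n, q);
    }

    public static void main(String[] args) {
        double q = 0.10; // 10% serialization, as in the example
        for (int n : new int[] {10, 16, 64, 99}) {
            System.out.printf("%d CPUs: speedup %.1fx (eff. %.2f)%n",
                    n, speedup(n, q), efficiency(n, q));
        }
    }
}
```

As N grows, `speedup(n, q)` approaches but never reaches 1/q, which is the 10x ceiling the slide refers to.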
  32. Agenda
      • Why do we care?
      • Lock contention vs. Data contention
      • Transactional execution of synchronized {…}
      • Measurements
      • Effects on how you code
      • Summary
  33. Lock Contention vs. Data Contention
      • Lock contention: An attempt by one thread to acquire a lock when another thread is holding it
      • Data contention: An attempt by one thread to atomically access data when another thread expects to manipulate the same data atomically
  34. Data Contention in a Shared Data Structure
      • Readers do not contend
      • Readers and writers don’t always contend
      • Even writers may not contend with other writers
  35. Synchronization and Locking
      • Need synchronization for correct execution
        ─ Critical sections, shared data structures
      • Intent is to protect against data contention
      • Can’t easily tell in advance
        ─ That’s why we lock…
      • Lock contention >= Data contention
        ─ In reality: lock contention >>= Data contention
      Locks are typically very conservative
  36. The industry has already solved a similar problem: Database Transactions
      • Semantics of potential failure exposed to the application
      • Transactions: atomic group of DB commands
        ─ All or nothing
        ─ From “BEGIN TRANSACTION” to “COMMIT”
      • Data contention results in a rollback
        ─ Leaves no trace
      • Application can re-execute until successful
      • Optimistic concurrency does scale
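
The "re-execute until successful" pattern has a familiar single-variable counterpart in Java: a compare-and-set retry loop. The sketch below uses the standard `java.util.concurrent.atomic.AtomicLong`; the `OptimisticCounter` class is a hypothetical example, and it covers only one memory location, whereas a database transaction (or HTM) covers many.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

/** Optimistic update of a single value: compute without a lock, retry on interference. */
final class OptimisticCounter {
    private final AtomicLong value = new AtomicLong();

    /** Applies fn optimistically, re-executing until no other thread interfered. */
    long update(LongUnaryOperator fn) {
        while (true) {
            long seen = value.get();               // snapshot, like the start of a transaction
            long next = fn.applyAsLong(seen);      // do the work without holding a lock
            if (value.compareAndSet(seen, next)) { // "commit" succeeds only if nothing changed
                return next;
            }
            // Data contention: the failed attempt left no trace, so loop and re-execute.
        }
    }

    long get() {
        return value.get();
    }
}
```

Usage: `counter.update(v -> v + 1)` increments without ever blocking; a thread only repeats its work when another thread actually changed the same value in between.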
  37. Agenda
      • Why do we care?
      • Lock contention vs. Data contention
      • Transactional execution of synchronized {…}
      • Measurements
      • Effects on how you code for contention
      • Summary
  38. There is no spoon.
  39. What does synchronized mean?
      • It does not actually mean: grab lock, execute block, release lock
      • It does mean: execute block atomically in relation to other blocks synchronizing on the same object
      • It can be satisfied by the more conservative: execute block atomically in relation to all other threads
      • That looks a lot like a transaction
      “The Java Language Specification”, “The Java Virtual Machine Specification”, JSR133
  40. Transactional execution of synchronized {…}
      • Two basic requirements
        ─ Detect data contention within the block
        ─ Roll back synchronized block on data contention
      • synchronized can run concurrently
        ─ JVM can use hardware transactional memory to detect data contention
        ─ JVM must roll back synchronized blocks that encounter data contention
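
The two requirements (detect contention, then roll back and redo) can be seen in a software analog that plain Java already offers: the optimistic read mode of `java.util.concurrent.locks.StampedLock`. This is not how a JVM applies HTM to `synchronized`, and it only speculates on the read side; it is a sketch of the same detect-and-retry shape, with `SpeculativePoint` as a hypothetical example class.

```java
import java.util.concurrent.locks.StampedLock;

/** Readers speculate without locking and fall back to a real lock on contention. */
final class SpeculativePoint {
    private final StampedLock lock = new StampedLock();
    private double x;
    private double y;

    void move(double dx, double dy) {
        long stamp = lock.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        long stamp = lock.tryOptimisticRead(); // start speculating; no lock is taken
        double cx = x;
        double cy = y;
        if (!lock.validate(stamp)) {           // a writer interfered: data contention detected
            stamp = lock.readLock();           // discard the speculative values and redo under a lock
            try {
                cx = x;
                cy = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.hypot(cx, cy);
    }
}
```

When no writer is active, `distanceFromOrigin()` completes without acquiring anything, which is the software counterpart of a synchronized block that runs concurrently because no data contention occurred.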
  41. Transactional execution of synchronized {…}
      • The JVM maintains the semantic meaning of: execute block atomically in relation to all other threads
      • Uncontended synchronized blocks run just as fast as before
      • Data contended synchronized blocks still serialize execution
      • synchronized blocks without data contention can execute in parallel
  42. Transactional execution of synchronized {…}
      • It’s all transparent!
      • No changes to Java code
        ─ The VM handles everything
      • Nested synchronized blocks
        ─ Roll back to outermost transactional synchronized
      • Reduces serialization
      • Amdahl’s Law now only reflects data contention
        ─ Desire to reduce data contention
  43. How does it fit in the current locking schemes? Implementation in a JVM
      • Thin locks handle uncontended synchronized blocks
        ─ Most common case
        ─ Uses CAS, no OS interaction
      • Thick locks handle data contended synchronized blocks
        ─ Blocks in the OS
      • Transactional monitors handle contended synchronized blocks that have no data contention
        ─ Execute synchronized blocks in parallel
        ─ Uses HW transactional memory support
  44. Agenda
      • Why do we care?
      • Lock contention vs. Data contention
      • Transactional synchronized {…}
      • Measurements
      • Effects on how you code for contention
      • Summary
  45. Data Contention and Hashtables
      • Examples of no data contention in a Hashtable
        ─ 2 readers
        ─ 1 reader, 1 writer, different hash buckets
        ─ 2 writers, different hash buckets
      • Examples of data contention in a Hashtable
        ─ 1 reader, 1 writer in same hash bucket
        ─ 2 writers in same hash bucket
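
The cases listed for the Hashtable reduce to one rule: two accesses contend on data only when they touch the same bucket and at least one of them writes. A minimal sketch of that rule follows; the class, the `Access` type, and the modulo bucket mapping are all hypothetical, and the rule deliberately ignores table-wide fields such as a shared size counter.

```java
/** Decides whether two Hashtable accesses contend on data (illustrative only). */
final class HashtableContention {
    /** One access to the table: the key it touches and whether it writes. */
    static final class Access {
        final Object key;
        final boolean isWrite;

        Access(Object key, boolean isWrite) {
            this.key = key;
            this.isWrite = isWrite;
        }
    }

    /** Maps a key to a bucket index (an assumed, simple hashing scheme). */
    static int bucketOf(Object key, int buckets) {
        return Math.floorMod(key.hashCode(), buckets);
    }

    /** Same bucket and at least one writer means data contention. */
    static boolean dataContention(Access a, Access b, int buckets) {
        boolean sameBucket = bucketOf(a.key, buckets) == bucketOf(b.key, buckets);
        return sameBucket && (a.isWrite || b.isWrite);
    }
}
```

With 16 buckets and `Integer` keys, a reader of key 1 and a writer of key 17 contend (both map to bucket 1), while a reader of key 1 and a writer of key 2 do not.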
  46. Measurements: Hashtable (0% writes)
      [Chart: log-scale axis from 1 to 100000; Locking vs. Spec. Locking at 4, 16, and 64 threads]
  47. Measurements: Hashtable (5% writes)
      [Chart: log-scale axis from 10 to 100000; Locking vs. Spec. Locking at 4, 16, and 64 threads]
  48. Agenda
      • Why do we care?
      • Lock contention vs. Data contention
      • Transactional synchronized {…}
      • Measurements
      • Effects on how you code for contention
      • Summary
  49. Minimizing Data Contention 1
        private Object table[];
        private int size;
        public synchronized void put(Object key, Object val) {
          …
          // missed, insert into table
          table[idx] = new HashEntry(key, val, table[idx]);
          size++; // writer data contention
        }
        public synchronized int size() {
          return size;
        }
  50. Minimizing Data Contention 2
        private Object table[];
        private int sizes[];
        public synchronized void put(Object key, Object val) {
          …
          // missed, insert into table
          table[idx] = new HashEntry(key, val, table[idx]);
          sizes[idx]++; // reduced writer data contention
        }
        public synchronized int size() {
          int size = 0;
          for (int i=0; i<sizes.length; i++)
            size += sizes[i];
          return size;
        }
  51. Minimizing Data Contention 3
        private Object table[];
        private int sizes[];
        private int cachedSize;
        public synchronized void put(Object key, Object val) {
          …
          // missed, insert into table
          table[idx] = new HashEntry(key, val, table[idx]);
          sizes[idx]++;
          cachedSize = -1; // clear the cache
        }
        public synchronized int size() {
          if (cachedSize < 0) { // reduce size recalculation
            cachedSize = 0;
            for (int i=0; i<sizes.length; i++)
              cachedSize += sizes[i];
          }
          return cachedSize;
        }
  52. Minimizing Data Contention 4
        private Object table[];
        private int sizes[];
        private int cachedSize;
        public synchronized void put(Object key, Object val) {
          …
          // missed, insert into table
          table[idx] = new HashEntry(key, val, table[idx]);
          sizes[idx]++;
          if (cachedSize >= 0) cachedSize = -1; // avoid contention
        }
        public synchronized int size() {
          if (cachedSize < 0) {
            cachedSize = 0;
            for (int i=0; i<sizes.length; i++)
              cachedSize += sizes[i];
          }
          return cachedSize;
        }
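
The four slides above elide the lookup part of `put`. For readers who want something that compiles, here is one possible self-contained version of the final variant. Everything the slides do not show is an assumption made for illustration: the `StripedSizeTable` name, the `HashEntry` layout, the modulo bucket choice, the overwrite-on-hit behavior, and the `get` method.

```java
/** Runnable sketch of the per-bucket size counters with a cached total (variant 4). */
final class StripedSizeTable {
    /** Assumed entry layout: a singly linked chain per bucket. */
    private static final class HashEntry {
        final Object key;
        Object val;
        final HashEntry next;

        HashEntry(Object key, Object val, HashEntry next) {
            this.key = key;
            this.val = val;
            this.next = next;
        }
    }

    private final HashEntry[] table;
    private final int[] sizes;  // one counter per bucket, so writers to different buckets don't share it
    private int cachedSize;     // a negative value means "recalculate on next size()"

    StripedSizeTable(int buckets) {
        table = new HashEntry[buckets];
        sizes = new int[buckets];
    }

    public synchronized void put(Object key, Object val) {
        int idx = Math.floorMod(key.hashCode(), table.length);
        for (HashEntry e = table[idx]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                e.val = val; // hit: overwrite, size unchanged
                return;
            }
        }
        // missed, insert into table
        table[idx] = new HashEntry(key, val, table[idx]);
        sizes[idx]++;
        if (cachedSize >= 0) cachedSize = -1; // write only when needed, to avoid contention
    }

    public synchronized Object get(Object key) {
        int idx = Math.floorMod(key.hashCode(), table.length);
        for (HashEntry e = table[idx]; e != null; e = e.next) {
            if (e.key.equals(key)) return e.val;
        }
        return null;
    }

    public synchronized int size() {
        if (cachedSize < 0) {
            cachedSize = 0;
            for (int i = 0; i < sizes.length; i++) cachedSize += sizes[i];
        }
        return cachedSize;
    }
}
```

The conditional write to `cachedSize` is the point of the last slide: once the cache has been cleared, later inserts only read the field, so under speculative locking two writers to different buckets no longer modify a common memory location.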
  53. Agenda
      • Why do we care?
      • Lock contention vs. Data contention
      • Transactional synchronized {…}
      • Measurements
      • Effects on how you code
      • Summary
  54. Summary
      • Hardware Transactional Memory is here!
      • Expect speculative use for locking and synchronization in libraries and runtimes
        ─ JVMs, CLR, …
        ─ POSIX mutexes, semaphores, etc.
      • It may be useful for some other cool stuff…
  55. Q&A
  56. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/hardware-transactional-memory
