'volatile' is volatile
mark@veltzer.net
Memory...
● When programming we use memory all the
time
– Reading/Writing data structures on the heap, stack
or data segment.
– Reading/Writing from/to hardware
– By “Memory” I do not refer to registers in this
presentation (since every core has its own
registers)
What kinds of guarantees do we
want from memory operations?
● That the operation is not optimized away completely
● That the operation does not take place in registers (that are not
visible to other cores by definition)
● Visibility to other cores (bypass/flush/sync CPU caches)
● Visibility to hardware
● Atomicity
● Order between such two or more memory operations
● Any combination of the above (possibly none of them...)
● In regular C programming (that is, without using special features), the
compiler and the CPU provide none of the above guarantees
So what guarantees does volatile
provide?
● It's just not clear!
● The first two, yes
● The others, maybe. On many architectures, no.
● It depends on your compiler, its version, your compilation flags,
the astrological sign of the compiler author's best friend...
● To be specific: most volatile implementations do not imply
atomicity or ordering.
● And what does volatile mean for structures bigger than a word, an int or
a long? Pass me the joint as things are getting hazy...
● Don't use volatile (true in most cases!)
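To make the atomicity point concrete, here is a minimal sketch, assuming a C11 compiler with <stdatomic.h> and POSIX threads. The names worker, count_with_atomics and INCREMENTS_PER_THREAD are made up for this illustration. It uses atomic_int where one might be tempted to write volatile int, because only the atomic type guarantees that concurrent increments are not lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define INCREMENTS_PER_THREAD 100000

/* Shared counter. With `volatile int` instead, counter++ would still be a
   separate load, add and store, so concurrent increments could be lost. */
static atomic_int counter;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        atomic_fetch_add(&counter, 1);   /* one indivisible read-modify-write */
    }
    return NULL;
}

/* Runs two incrementing threads and returns the final counter value. */
int count_with_atomics(void)
{
    pthread_t t1, t2;
    int rc;

    atomic_store(&counter, 0);

    rc = pthread_create(&t1, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return atomic_load(&counter);
}
```

Calling count_with_atomics() should always yield 2 * INCREMENTS_PER_THREAD. A volatile counter offers no such promise.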
Memory reordering
● Imagine the following code (with no compiler
optimizations):
    X=5
    Y=6
    X=7
    Y=8
● What states do you expect other cores to see?
    X=5, Y=6
    X=7, Y=6
    X=7, Y=8
● Or maybe
    X=5, Y=6
    X=5, Y=8
    X=7, Y=8
● Yes! The CPU does this. (well, not Intel, but others)
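One way to rule out the surprising "X=5, Y=8" state is sketched here, assuming C11 atomics are available. The function names publish, observe and is_allowed and the struct snapshot are invented for this example. The writer stores Y with release ordering and the reader loads Y with acquire ordering, so a reader that sees Y=8 is guaranteed to also see X=7.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int X = 5;
static atomic_int Y = 6;

struct snapshot {
    int x;
    int y;
};

/* Writer side: the release store to Y may not become visible to an
   acquiring reader before the earlier store to X. */
void publish(void)
{
    atomic_store_explicit(&X, 7, memory_order_relaxed);
    atomic_store_explicit(&Y, 8, memory_order_release);
}

/* Reader side: read Y first with acquire, then X. If the load of Y
   returns 8, the load of X is guaranteed to return 7. */
struct snapshot observe(void)
{
    struct snapshot s;
    s.y = atomic_load_explicit(&Y, memory_order_acquire);
    s.x = atomic_load_explicit(&X, memory_order_relaxed);
    return s;
}

/* The only state the release/acquire pairing forbids is new Y with old X. */
bool is_allowed(struct snapshot s)
{
    return !(s.y == 8 && s.x == 5);
}
```

The design choice is to put the ordering on the accesses themselves instead of on a separate barrier. Only the last store and the first load pay for it.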
Is the Compiler/CPU allowed to do
that?
● Yes. Actually there are many types of reordering that the
Compiler/CPU is allowed to perform
● Common CPU reorderings include:
– Load reordered after load
– Load reordered after store
– Store reordered after load
– Store reordered after store
– Store reordered after atomics
– Load reordered after atomics
– Dependent loads reordered (YES! Alpha does this, they should all
be locked up...)
So what do the compiler/CPU
guarantee?
● They guarantee results in one thread.
● This means that they may alter your code, reorder it, discard parts of it, use
different operations than the ones you wrote, and more.
● But all of these transformations guarantee that the results you get will be the same
within the thread they are applied to.
● But sometimes you want your code to be left unaltered.
● This is especially true when other threads or hardware are involved.
● In these cases the order matters, the specific operations matter, etc.
Enter memory barrier/fence
● A machine memory barrier is a special machine instruction or a
special type of memory access instruction that guarantees order
of execution between memory instructions before it and after it.
● __sync_synchronize() in gcc (user space).
● asm volatile ("mfence" ::: "memory") (x86 only, gcc inline assembly)
● (smp_?)mb(),(smp_?)rmb(),(smp_?)wmb() in kernel development.
● In most cases atomic operations imply a memory barrier of some sort,
and C++11 has a nice API with a memory model included.
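As a portable counterpart to the barriers listed above, here is a sketch using the C11 atomic_thread_fence() call. The mailbox_send and mailbox_try_receive names and the payload/ready variables are hypothetical. A release fence keeps the payload write ahead of the flag store, and an acquire fence keeps the payload read behind the flag load. gcc's __sync_synchronize() is a full barrier and would also work in both places, at a higher cost on some CPUs.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;         /* ordinary data, guarded by the flag below */
static atomic_bool ready;   /* becomes true once payload is valid */

/* Producer: write the data, fence, then raise the flag. */
void mailbox_send(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);   /* earlier writes stay before the flag store */
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer: check the flag, fence, then read the data.
   Returns true and fills *out only if a value has been sent. */
bool mailbox_try_receive(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);   /* later reads stay after the flag load */
    *out = payload;
    return true;
}
```

Whether the fence turns into a machine instruction depends on the target. On x86 the release and acquire fences typically only constrain the compiler, while weaker architectures need a real barrier instruction.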
OK, prove it to me...
● Time for a demo.
● Two threads, when we start we have:
    X=0
    Y=0
● Thread 1:
    X=1
    R1=Y
● Thread 2:
    Y=1
    R2=X
● Could it be that R1==R2==0 at the end?
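The experiment can be sketched roughly as follows. This is not the demo code from the talk, only an approximation assuming C11 atomics and POSIX threads, and count_both_zero, thread1 and thread2 are names chosen here. Each thread stores to one variable, executes a full fence and then loads the other variable. With the fences in place the R1==R2==0 outcome is forbidden by the C11 memory model.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int X, Y;
static int r1, r2;

static void *thread1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&X, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* full barrier between the store and the load */
    r1 = atomic_load_explicit(&Y, memory_order_relaxed);
    return NULL;
}

static void *thread2(void *arg)
{
    (void)arg;
    atomic_store_explicit(&Y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* full barrier between the store and the load */
    r2 = atomic_load_explicit(&X, memory_order_relaxed);
    return NULL;
}

/* Repeats the experiment and returns how many runs ended with r1 == r2 == 0. */
int count_both_zero(int iterations)
{
    int hits = 0;
    for (int i = 0; i < iterations; i++) {
        pthread_t a, b;
        int rc;

        atomic_store(&X, 0);
        atomic_store(&Y, 0);

        rc = pthread_create(&a, NULL, thread1, NULL);
        assert(rc == 0);
        rc = pthread_create(&b, NULL, thread2, NULL);
        assert(rc == 0);
        (void)rc;

        pthread_join(a, NULL);
        pthread_join(b, NULL);

        if (r1 == 0 && r2 == 0)
            hits++;
    }
    return hits;
}
```

Removing the two fences makes the both-zero outcome legal, even on x86, which may let a load complete before an earlier store to a different address becomes visible. How often it actually shows up depends on the hardware and on timing, and starting fresh threads per run as done here makes it rare.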
Hey, but I need volatile to overcome
the compiler!
● No, you don't
● There is something called a “compiler barrier”
● Compiler barriers usually offer several features:
– They force the compiler to sync unsynchronized registers with memory so that memory writes
before the barrier will go to memory (no cache flush, no memory barrier)
– They force the compiler to read from memory after the barrier even if the compiler thinks it knows
the value of certain memory locations.
– They force an order of memory operations at the compiler level (not machine level) in relation to the
barrier location in the code
● A compiler barrier is not a machine instruction (as opposed to a memory barrier)
● It is a compiler directive, influencing how the compiler will generate machine code
after the directive is given.
● The compiler may emit machine instructions or it may not (depends on many factors)
● Time for another demo...
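A common way to write a compiler barrier with gcc or clang is an empty inline assembly statement with a "memory" clobber. The sketch below assumes one of those compilers, and compiler_barrier, publish_value and poll_value are names made up for the illustration. It shows the three effects listed above and nothing more: no machine barrier is emitted, so on its own this is not enough on weakly ordered CPUs. The C11 standard counterpart is atomic_signal_fence().

```c
/* Compiler-only barrier (gcc/clang extension): the asm body is empty, so it
   adds no barrier instruction, but the "memory" clobber tells the compiler
   that any memory may have been read or written here. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static int data;
static int flag;

/* The compiler must emit the store to data before the store to flag. */
void publish_value(int value)
{
    data = value;
    compiler_barrier();
    flag = 1;
}

/* Polls flag up to max_spins times. Returns 1 and copies data to *out
   once the flag is seen, or 0 if it never was. */
int poll_value(int max_spins, int *out)
{
    for (int i = 0; i < max_spins; i++) {
        if (flag) {
            compiler_barrier();   /* data is read after flag, at the compiler level */
            *out = data;
            return 1;
        }
        compiler_barrier();       /* forget the cached flag value, reload it next time */
    }
    return 0;
}
```

Note that plain int variables shared between threads are still a data race as far as the C11 standard is concerned. Real code would pair this with atomics or a machine barrier.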
References
● “What every programmer should know about memory” by
Ulrich Drepper
● “memory-barriers.txt” from the Linux kernel.
● The example for memory barriers shown is derived from
“Memory Reordering Caught in the Act” by Jeff Preshing
● “Volatile_variable” from Wikipedia
● “Memory_barrier” from Wikipedia
● All examples can be found in the linuxapi project on GitHub by
me.
Questions?
