How NOT To Write A
Microbenchmark

Dr. Cliff Click
Senior Staff Engineer
Sun Microsystems

How NOT To Write A Microbenchmark
or
Lies, Damn Lies, and Microbenchmarks

Microbenchmarks are a sharp knife

Microbenchmarks are like a microscope
Magnification is high,
but what the heck are you looking at?
Like "truths, half-truths, and statistics",
microbenchmarks can be very misleading

Session # 1816
Learning Objectives
• As a result of this presentation, you will
be able to:
– Recognize when a benchmark lies
– Recognize what a benchmark can tell you
(as opposed to what it purports to tell you)
– Understand how some popular benchmarks are
flawed
– Write a microbenchmark that doesn't lie (to you)

Speaker’s Qualifications
• Dr. Click is a Senior Staff Engineer at
Sun Microsystems
• Dr. Click wrote his first compiler at age 16
and has been writing:
– runtime compilers for 15 years, and
– optimizing compilers for 10 years

• Dr. Click architected the HotSpot™
Server Compiler

Your JVM is Lousy Because it
Doesn't Do... "X"

I routinely get handed a microbenchmark and
told "HotSpot doesn't do X"
49% chance it doesn't matter,
49% chance they got fooled,
1% chance HotSpot didn't do "Y" but should
1% chance HotSpot didn't do "X" but should

Agenda
• Recent real-world benchmark disaster
• Popular microbenchmarks and their flaws
• When to disbelieve a benchmark
• How to write your own

What is a Microbenchmark?
• Small program

– Datasets may be large

• All time spent in a few lines of code
• Performance depends on how those
few lines are compiled
• Goal: Discover some particular fact
• Remove all other variables

Why Run Microbenchmarks?
• Discover some targeted fact

– Such as the cost of 'turned off' Asserts, or
– Will this inline?
– Will another layer of abstraction hurt
performance?

• Fun

– My JIT is faster than your JIT
– My Java is faster than your C

• But dangerous!

How HotSpot Works
• HotSpot is a mixed-mode system
• Code first runs interpreted
– Profiles gathered

• Hot code gets compiled
• Same code "just runs faster" after a while
• Bail out to interpreter for rare events
– Code never executed before
– Class loading, initializing

Example from Magazine Editor
• What do (turned off) Asserts cost?
• Tiny 5-line function reads/writes a global variable
• Run in a loop 100,000,000 times
• With explicit check – 5 sec
• With Assert (off) – 0.2 sec
• With no test at all – 5 sec
• What did he really measure?

Assert Example

static int sval;                      // Global variable
static int test_assert(int val) {
  assert (val >= 0) : "should be positive";
  sval = val * 6;
  sval += 3;                          // Read/write global
  sval /= 2;
  return val + 2;
}
public static void main(String[] args) {
  int REPEAT = Integer.parseInt(args[0]);
  int v = 0;
  for (int i = 0; i < REPEAT; i++)
    v = test_assert(v);
}

Assert Example

static int sval;                      // Global variable
static int test_explicit(int val) {
  if (val < 0) throw ...;
  sval = val * 6;
  sval += 3;                          // Read/write global
  sval /= 2;
  return val + 2;
}
public static void main(String[] args) {
  int REPEAT = Integer.parseInt(args[0]);
  int v = 0;
  for (int i = 0; i < REPEAT; i++)
    v = test_explicit(v);
}

Assert Example (cont.)
• No synchronization ⇒ hoist static global into
register, only do final write
• Small hot static function ⇒ inline into loop
• Asserts turned off ⇒ no test in code
static int sval;                      // Global variable
public static void main(String[] args) {
  int REPEAT = Integer.parseInt(args[0]);
  int v = 0;
  for (int i = 0; i < REPEAT; i++) {
    sval = (v*6+3)/2;
    v = v+2;
  }
}

Assert Example (cont.)
• Small loop ⇒ unroll it a lot

– Need pre-loop to 'align' loop

• Again remove redundant writes to global static
static int sval;                      // Global variable
public static void main(String[] args) {
  int REPEAT = Integer.parseInt(args[0]);
  int v = 0;
  // pre-loop goes here...
  for (int i = 0; i < REPEAT; i += 16) {
    sval = ((v+2*14)*6+3)/2;
    v = v+2*16;
  }
}

Benchmark is 'spammed'
• Loop body is basically 'dead'
• Unrolling speeds up by arbitrary factor
• Note: 0.2 sec / 100 million iters * 450 MHz clock ≈ 0.9 clocks/iter
⇒ Loop runs too fast: < 1 clock/iter
• But yet...
• ... different loops run 20x faster
• What's really happening?
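
The clock arithmetic in the note above is a quick sanity check worth automating. The following is a minimal sketch, not part of the original talk; the class and method names are illustrative, and the 450 MHz figure is simply the one quoted on the slide.

```java
// Sketch: convert a benchmark time into clock cycles per loop iteration.
final class ClocksPerIteration {
    // seconds * (cycles / second) / iterations = cycles per iteration
    static double clocksPerIteration(double seconds, long iterations, double clockHz) {
        return seconds * clockHz / iterations;
    }

    public static void main(String[] args) {
        // Figures from the slide: 0.2 sec, 100 million iterations, 450 MHz clock.
        double cpi = clocksPerIteration(0.2, 100_000_000L, 450e6);
        System.out.printf("clocks per iteration: %.2f%n", cpi);
        if (cpi < 1.0) {
            // Less than one cycle per iteration suggests the loop body was optimized away.
            System.out.println("Suspicious: under 1 clock per iteration");
        }
    }
}
```

A result below one clock per iteration is a strong hint that the loop body was unrolled or removed rather than executed as written.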

A Tale of Four Loops
public static void main(String[] args) {
  int REPEAT = Integer.parseInt(args[0]);
  int v = 0;
  for (int i = 0; i < REPEAT; i++)
    v = test_no_check(v);             // hidden warmup loop: 0.2 sec
  long time1 = System.currentTimeMillis();
  for (int i = 0; i < REPEAT; i++)
    v = test_explicit(v);             // Loop 1: 5 sec
  long time2 = System.currentTimeMillis();
  System.out.println("explicit check time="+...);
  for (int i = 0; i < REPEAT; i++)
    v = test_assert(v);               // Loop 2: 0.2 sec
  long time3 = System.currentTimeMillis();
  System.out.println("assert time="+...);
  for (int i = 0; i < REPEAT; i++)
    v = test_no_check(v);             // Loop 3: 5 sec
  long time4 = System.currentTimeMillis();
  System.out.println("no check time="+...);
}

On-Stack Replacement
• HotSpot is mixed mode: interp + compiled
• Hidden warmup loop starts interpreted
• Gets OSR'd: generate code for middle of loop
• 'explicit check' loop never yet executed, so...
– Don't inline test_explicit(), no unroll
– 'println' causes class loading ⇒ drop down to interpreter
• Hidden loop runs fast
• 'explicit check' loop runs slow

On-Stack Replacement (cont.)
• Continue 'assert' loop in interpreter
• Gets OSR'd: generate code for middle of loop
• 'no check' loop never yet executed, so...
– Don't inline test_no_check(), no unroll

• Assert loop runs fast
• 'no check' loop runs slow

Dangers of Microbenchmarks
• Looking for cost of Asserts...
• But found OSR Policy Bug (fixed in 1.4.1)
• Why not found before?
– Only impacts microbenchmarks
– Not found in any major benchmark
– Not found in any major customer app

• Note: test_explicit and test_no_check: 5 sec
– Check is compare & predicted branch
– Too cheap, hidden in other math

Self-Calibrated Benchmarks
• Want a robust microbenchmark that runs on a
wide variety of machines
• How long should loop run?
– Depends on machine!

• Time a few iterations to get a sense of speed
– Then time enough runs to reach steady state

• Report iterations/sec
• Basic idea is sound, but devil is in the details

Self-Calibrated Benchmarks
• First timing loop runs interpreted
– Also pays compile-time cost

• Count needed to run reasonable time is low
• Main timed run uses fast compiled code
• Runs too fast to get result above clock jitter
• Divide small count by clock jitter ⇒
random result
• Report random score

jBYTEmark Methodology
• Count iterations to run for 10 msec

– First run takes 104 msec
– Way too short, always = 1 iteration

• Then use 1st run time to compute iterations
needed to run 100msec
– Way too short, always = 1 iteration

• Then average 5 runs

– Next 5 runs take 80, 60, 20, 20, 20 msec
– Clock jitter = 1 msec
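
Averaging the five run times quoted above shows how much warmup distorts the reported score. This is a minimal sketch using only the slide's numbers; the class and method names are illustrative and are not part of jBYTEmark.

```java
// Sketch: compare the average of all runs with the steady-state run time.
final class AverageVsSteadyState {
    static double mean(long[] values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    public static void main(String[] args) {
        long[] runMillis = {80, 60, 20, 20, 20};   // run times from the slide
        double average = mean(runMillis);
        long steadyState = runMillis[runMillis.length - 1];
        System.out.println("average of 5 runs (msec): " + average);
        System.out.println("steady-state run (msec):  " + steadyState);
        System.out.println("ratio: " + average / steadyState);
    }
}
```

With these numbers the average is twice the steady-state time, so the score mostly reflects warmup rather than compiled-code speed.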

Non-constant work per iter
• Non-sync'd String speed test
• Run with & without some synchronization
• Use concat, continuously add Strings
• Benchmark requires copy before add, so...
– Each iteration takes more work than before
• Self-timed loop has some jitter...
– 10000 main-loop iters with sync
– 10030 main-loop iters withOUT sync

Non-constant work per iter (cont.)
• Work for Loop1:
– 10000² copies, 10000 concats, 10000 syncs
• Work for Loop2:
– 10030² copies, 10000 concats, no syncs
• Loop2 avoids 10000 syncs
• But pays 600,900 more copies
• Benchmark compares times, BUT
• Extra work swamps sync cost!
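
The 600,900 figure follows from the copy count growing with the square of the iteration count. The sketch below assumes that simple quadratic model (n iterations cost n² copies, as the slide states); the names are illustrative.

```java
// Sketch: extra copy work caused by a 30-iteration calibration jitter.
final class CopyWork {
    // Assumed model from the slide: n iterations perform n * n copies in total.
    static long copies(long iterations) {
        return iterations * iterations;
    }

    public static void main(String[] args) {
        long withSync = copies(10_000);      // Loop1
        long withoutSync = copies(10_030);   // Loop2
        System.out.println("Loop1 copies: " + withSync);
        System.out.println("Loop2 copies: " + withoutSync);
        System.out.println("extra copies: " + (withoutSync - withSync));
    }
}
```

A 0.3% difference in iteration count adds roughly 0.6% more copy work, which is compared against the cost of only 10000 synchronizations.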

Beware 'hidden' Cache Blowout
• Work appears to be linear with time
• More iterations should report same work/sec
• Grow dataset to run longer
• Blow out caches!
• Self-calibrated loop jitter can make your data set fit in L1 cache or not!
• Jitter strongly affects apparent work/sec rate

Beware 'hidden' Cache Blowout
• jBYTEmark StringSort with time raised to 1 sec
• Each iteration adds a 4K array
• Needs 1 or 2 iterations to hit 100msec
• Scale by 10 to reach 1 sec
• Uses either 10 or 20 4K arrays
• Working set is either 40K or 80K
• 64K L1 cache
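
The working-set arithmetic above can be written out directly. This is a minimal sketch using the slide's sizes (4K arrays, 64K L1 cache, scale factor of 10); the class, constants and method names are illustrative.

```java
// Sketch: does the calibrated working set fit in the L1 cache?
final class WorkingSet {
    static final int ARRAY_BYTES = 4 * 1024;   // each iteration adds a 4K array
    static final int L1_BYTES = 64 * 1024;     // L1 cache size quoted on the slide
    static final int SCALE = 10;               // 100 msec calibration scaled to 1 sec

    static int workingSetBytes(int calibratedIterations) {
        return calibratedIterations * SCALE * ARRAY_BYTES;
    }

    static boolean fitsInL1(int bytes) {
        return bytes <= L1_BYTES;
    }

    public static void main(String[] args) {
        for (int iterations = 1; iterations <= 2; iterations++) {
            int bytes = workingSetBytes(iterations);
            System.out.println(iterations + " calibrated iteration(s): "
                    + (bytes / 1024) + "K working set, fits in L1: " + fitsInL1(bytes));
        }
    }
}
```

One extra calibration iteration moves the working set from 40K to 80K, which is the difference between running in cache and missing it.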

Explicitly Handle GC
• GC pauses occur at unpredictable times
• GC throughput is predictable
• Either don't allocate in your main loops
– So no GC pauses

• OR run long enough to reach GC steady state
– Each run spends roughly same time doing GC

CaffeineMark3 Logic Benchmark
• Purports to test logical operation speed
• With unrolling OR constant propagation,
whole function is dead (oops!)
• First timing loop: 900K iterations to run 1 sec
• Second loop runs 2.7M iterations in 0.685 sec
• Score reports as negative (overflow) or huge
• Overall score dominated by Logic score

Know what you test
• SpecJVM98 209_db

– 85% of time spent in shell_sort

• Scimark2 MonteCarlo simulation

– 80% of time spent in sync'd Random.next

• jBYTEmark StringSort

– 75% of time spent in byte copy

• CaffeineMark Logic test
– "He's dead, Jim"

How To Write a Microbenchmark
(general – not just Java)
• Pick suitable goals!

– Can't judge web server performance by
jBYTEmarks
– Only answer very narrow questions
– Make sure you actually care about the answer!

• Be aware of system load, clock granularity
• Self-calibrated loops always have jitter
– Sanity check the time on the main run!

• Run to steady state, at least 10 sec
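
One way to sanity check the main run of a self-calibrated loop is sketched below. It is an illustration under stated assumptions, not the talk's code: the kernel `work`, the factor-of-two tolerance in `plausible`, and the 0.2 sec target (shortened so the demo finishes quickly; the slide advises at least 10 sec) are all hypothetical choices.

```java
// Sketch: calibrate an iteration count, then verify the main run hit the target time.
final class CalibratedRun {
    // Results are stored here so the timed work cannot be discarded as dead code.
    static volatile long sink;

    // Fixed, non-trivial work per iteration (hypothetical stand-in kernel).
    static long work(long iterations) {
        long acc = 0;
        for (long i = 0; i < iterations; i++) {
            acc += (i * 6 + 3) / 2;
        }
        return acc;
    }

    static long timeNanos(long iterations) {
        long start = System.nanoTime();
        sink = work(iterations);
        return System.nanoTime() - start;
    }

    // True when the measured time is within a factor of two of the target.
    static boolean plausible(long targetNanos, long measuredNanos) {
        return measuredNanos >= targetNanos / 2 && measuredNanos <= targetNanos * 2;
    }

    public static void main(String[] args) {
        long targetNanos = 200_000_000L;   // demo target; real runs should be much longer
        long iterations = 1_000;
        long elapsed = timeNanos(iterations);
        // Calibrate: double the count until a run is long enough to rise above clock jitter.
        while (elapsed < targetNanos / 10) {
            iterations *= 2;
            elapsed = timeNanos(iterations);
        }
        long mainIterations = (long) (iterations * ((double) targetNanos / elapsed));
        long mainElapsed = timeNanos(mainIterations);
        System.out.printf("%d iterations in %.3f s%n", mainIterations, mainElapsed / 1e9);
        if (!plausible(targetNanos, mainElapsed)) {
            System.out.println("Main run missed the target time; do not trust the score.");
        }
    }
}
```

Calibrating by repeated doubling also means the later calibration runs execute compiled code, so the scaled count is less likely to be based on interpreted timing.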

How To Write a Microbenchmark
(general – not just Java)
• Fixed size datasets (cache effects)

– Large datasets have page coloring issues

• Constant amount of work per iteration
– Or results not comparable

• Avoid the 'eqntott Syndrome'

– Capture 1st run result (interpreted)
– Compare to last result

How To Write a Microbenchmark
(general – not just Java)
• Avoid 'dead' loops

– These can have infinite speedups!
– Print final answer
– Make computation non-trivial

• Or handle dead loops

– Report when speedup is unreasonable
– CaffeineMark3 & JavaGrande Section 1
mix Infinities and reasonable scores
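
A loop stays "live" when every iteration feeds a value that is eventually printed. The sketch below adapts the arithmetic from the Assert example into a checksum; the names, the iteration count, and the 0.1 ns threshold for "unreasonable" are assumptions made for illustration.

```java
// Sketch: keep the loop body live by accumulating and printing its result.
final class LiveLoop {
    static int checksum(int repeat) {
        int v = 0;
        int acc = 0;
        for (int i = 0; i < repeat; i++) {
            acc += (v * 6 + 3) / 2;   // every iteration contributes; int overflow simply wraps
            v += 2;
        }
        return acc;
    }

    public static void main(String[] args) {
        int repeat = 10_000_000;
        long start = System.nanoTime();
        int result = checksum(repeat);
        long elapsed = System.nanoTime() - start;
        double nanosPerIter = (double) elapsed / repeat;
        // Printing the final answer forces the computation to happen.
        System.out.println("checksum = " + result);
        System.out.printf("%.3f ns per iteration%n", nanosPerIter);
        if (nanosPerIter < 0.1) {   // hypothetical threshold for an unreasonable speedup
            System.out.println("Unreasonably fast: the loop may have been optimized away.");
        }
    }
}
```

This run is not warmed up, so the time it prints is only useful for the dead-loop check, not as a performance score.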

How To Write a Microbenchmark
(Java-specific)
• Be explicit about GC
• Thread scheduling is not deterministic
• JIT performance may change over time
• Warmup loop+test code before ANY timing
– (HotSpot specific)
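
The warmup advice can be sketched as follows: run both the kernel and the timing code several times untimed, then measure. The kernel, the warmup count of 20, and the five reported runs are illustrative assumptions, not values from the talk.

```java
// Sketch: warm up the kernel and the timing harness before recording any time.
final class WarmupThenTime {
    static int kernel(int repeat) {
        int v = 0;
        int acc = 0;
        for (int i = 0; i < repeat; i++) {
            acc += (v * 6 + 3) / 2;
            v += 2;
        }
        return acc;
    }

    // One timed run; the result is stored in resultOut so the work stays live.
    static long timedRun(int repeat, int[] resultOut) {
        long start = System.nanoTime();
        resultOut[0] = kernel(repeat);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int repeat = 1_000_000;
        int[] result = new int[1];
        // Warmup: exercises the loop AND the test code so both get compiled first.
        for (int i = 0; i < 20; i++) {
            timedRun(repeat, result);
        }
        // Timed runs: report each one so drift over time is visible.
        for (int run = 1; run <= 5; run++) {
            long nanos = timedRun(repeat, result);
            System.out.println("run " + run + ": " + nanos / 1e6 + " ms, result " + result[0]);
        }
    }
}
```

Reporting every run rather than a single average makes it easier to see whether the times have actually settled.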

OSR Bug Avoidance
• Fixed in 1.4.1
• Write 1 loop per method for Microbenchmarks
• Doesn't matter for 'real' apps

– Don't write 1-loop-per-method code to avoid bug
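
For a microbenchmark, "one loop per method" might look like the sketch below: each timed loop lives in its own method, so after warmup it is entered as ordinary compiled code rather than through OSR. The method names are illustrative, and the `IllegalArgumentException` is an assumption, since the slides elide what the explicit check throws.

```java
// Sketch: each benchmark loop in its own method, warmed up before timing.
final class OneLoopPerMethod {
    static int sval;   // global, as in the slides' example

    static int testNoCheck(int val) {
        sval = (val * 6 + 3) / 2;
        return val + 2;
    }

    static int testExplicit(int val) {
        // Assumed exception type; the original slide shows only "throw ...".
        if (val < 0) throw new IllegalArgumentException("should be positive");
        sval = (val * 6 + 3) / 2;
        return val + 2;
    }

    static int loopNoCheck(int repeat) {
        int v = 0;
        for (int i = 0; i < repeat; i++) v = testNoCheck(v);
        return v;
    }

    static int loopExplicit(int repeat) {
        int v = 0;
        for (int i = 0; i < repeat; i++) v = testExplicit(v);
        return v;
    }

    public static void main(String[] args) {
        int repeat = 1_000_000;
        for (int i = 0; i < 20; i++) {   // warm up both loop methods before timing
            loopNoCheck(repeat);
            loopExplicit(repeat);
        }
        long t0 = System.nanoTime();
        int a = loopNoCheck(repeat);
        long t1 = System.nanoTime();
        int b = loopExplicit(repeat);
        long t2 = System.nanoTime();
        System.out.println("no check: " + (t1 - t0) / 1e6 + " ms, result " + a);
        System.out.println("explicit: " + (t2 - t1) / 1e6 + " ms, result " + b + ", sval " + sval);
    }
}
```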

Summary
• Microbenchmarks can be easy, fun, informative
• They can be very misleading! Beware!
• Pick suitable goals
• Warmup code before timing
• Run reasonable length of time
• Fixed work / fixed datasets per iteration
• Sanity-check final times and results
• Handle 'dead' loops, GC issues

If You Only Remember One Thing…

Put Microtrust in a Microbenchmark

End

39

Session # 1816
Session #
Session #

Contenu connexe

Tendances

今日からできる!簡単 .NET 高速化 Tips
今日からできる!簡単 .NET 高速化 Tips今日からできる!簡単 .NET 高速化 Tips
今日からできる!簡単 .NET 高速化 TipsTakaaki Suzuki
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingViller Hsiao
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaJiangjie Qin
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBrendan Gregg
 
Taking advantage of Prometheus relabeling
Taking advantage of Prometheus relabelingTaking advantage of Prometheus relabeling
Taking advantage of Prometheus relabelingJulien Pivotto
 
Introduction to Rust language programming
Introduction to Rust language programmingIntroduction to Rust language programming
Introduction to Rust language programmingRodolfo Finochietti
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)Kirill Tsym
 
CEDEC2021 ダウンロード時間を大幅減!~大量のアセットをさばく高速な実装と運用事例の共有~
CEDEC2021 ダウンロード時間を大幅減!~大量のアセットをさばく高速な実装と運用事例の共有~ CEDEC2021 ダウンロード時間を大幅減!~大量のアセットをさばく高速な実装と運用事例の共有~
CEDEC2021 ダウンロード時間を大幅減!~大量のアセットをさばく高速な実装と運用事例の共有~ SEGADevTech
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
Getting up to speed with Kafka Connect: from the basics to the latest feature...
Getting up to speed with Kafka Connect: from the basics to the latest feature...Getting up to speed with Kafka Connect: from the basics to the latest feature...
Getting up to speed with Kafka Connect: from the basics to the latest feature...HostedbyConfluent
 
5分でわかるWebRTCの仕組み - html5minutes vol.01
5分でわかるWebRTCの仕組み - html5minutes vol.015分でわかるWebRTCの仕組み - html5minutes vol.01
5分でわかるWebRTCの仕組み - html5minutes vol.01西畑 一馬
 
[Outdated] Secrets of Performance Tuning Java on Kubernetes
[Outdated] Secrets of Performance Tuning Java on Kubernetes[Outdated] Secrets of Performance Tuning Java on Kubernetes
[Outdated] Secrets of Performance Tuning Java on KubernetesBruno Borges
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기흥배 최
 
Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -Treasure Data, Inc.
 
C++ GUI 라이브러리 소개: Qt & Nana
C++ GUI 라이브러리 소개: Qt & NanaC++ GUI 라이브러리 소개: Qt & Nana
C++ GUI 라이브러리 소개: Qt & NanaLazy Ahasil
 
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkCDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkTimo Walther
 

Tendances (20)

今日からできる!簡単 .NET 高速化 Tips
今日からできる!簡単 .NET 高速化 Tips今日からできる!簡単 .NET 高速化 Tips
今日からできる!簡単 .NET 高速化 Tips
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracing
 
C++ マルチスレッド 入門
C++ マルチスレッド 入門C++ マルチスレッド 入門
C++ マルチスレッド 入門
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
 
Why rust?
Why rust?Why rust?
Why rust?
 
Taking advantage of Prometheus relabeling
Taking advantage of Prometheus relabelingTaking advantage of Prometheus relabeling
Taking advantage of Prometheus relabeling
 
Introduction to Rust language programming
Introduction to Rust language programmingIntroduction to Rust language programming
Introduction to Rust language programming
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
 
CEDEC2021 ダウンロード時間を大幅減!~大量のアセットをさばく高速な実装と運用事例の共有~
CEDEC2021 ダウンロード時間を大幅減!~大量のアセットをさばく高速な実装と運用事例の共有~ CEDEC2021 ダウンロード時間を大幅減!~大量のアセットをさばく高速な実装と運用事例の共有~
CEDEC2021 ダウンロード時間を大幅減!~大量のアセットをさばく高速な実装と運用事例の共有~
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
Getting up to speed with Kafka Connect: from the basics to the latest feature...
Getting up to speed with Kafka Connect: from the basics to the latest feature...Getting up to speed with Kafka Connect: from the basics to the latest feature...
Getting up to speed with Kafka Connect: from the basics to the latest feature...
 
5分でわかるWebRTCの仕組み - html5minutes vol.01
5分でわかるWebRTCの仕組み - html5minutes vol.015分でわかるWebRTCの仕組み - html5minutes vol.01
5分でわかるWebRTCの仕組み - html5minutes vol.01
 
[Outdated] Secrets of Performance Tuning Java on Kubernetes
[Outdated] Secrets of Performance Tuning Java on Kubernetes[Outdated] Secrets of Performance Tuning Java on Kubernetes
[Outdated] Secrets of Performance Tuning Java on Kubernetes
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기
 
Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -
 
C++ GUI 라이브러리 소개: Qt & Nana
C++ GUI 라이브러리 소개: Qt & NanaC++ GUI 라이브러리 소개: Qt & Nana
C++ GUI 라이브러리 소개: Qt & Nana
 
Load Data Fast!
Load Data Fast!Load Data Fast!
Load Data Fast!
 
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkCDC Stream Processing with Apache Flink
CDC Stream Processing with Apache Flink
 

En vedette

Webinar: Zing Vision: Answering your toughest production Java performance que...
Webinar: Zing Vision: Answering your toughest production Java performance que...Webinar: Zing Vision: Answering your toughest production Java performance que...
Webinar: Zing Vision: Answering your toughest production Java performance que...Azul Systems Inc.
 
Experiences with Debugging Data Races
Experiences with Debugging Data RacesExperiences with Debugging Data Races
Experiences with Debugging Data RacesAzul Systems Inc.
 
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)Azul Systems Inc.
 
What's New in the JVM in Java 8?
What's New in the JVM in Java 8?What's New in the JVM in Java 8?
What's New in the JVM in Java 8?Azul Systems Inc.
 
ObjectLayout: Closing the (last?) inherent C vs. Java speed gap
ObjectLayout: Closing the (last?) inherent C vs. Java speed gapObjectLayout: Closing the (last?) inherent C vs. Java speed gap
ObjectLayout: Closing the (last?) inherent C vs. Java speed gapAzul Systems Inc.
 
Towards a Scalable Non-Blocking Coding Style
Towards a Scalable Non-Blocking Coding StyleTowards a Scalable Non-Blocking Coding Style
Towards a Scalable Non-Blocking Coding StyleAzul Systems Inc.
 
Priming Java for Speed at Market Open
Priming Java for Speed at Market OpenPriming Java for Speed at Market Open
Priming Java for Speed at Market OpenAzul Systems Inc.
 
The Java Evolution Mismatch - Why You Need a Better JVM
The Java Evolution Mismatch - Why You Need a Better JVMThe Java Evolution Mismatch - Why You Need a Better JVM
The Java Evolution Mismatch - Why You Need a Better JVMAzul Systems Inc.
 
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!Azul Systems Inc.
 
Lock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash TableLock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash TableAzul Systems Inc.
 
DotCMS Bootcamp: Enabling Java in Latency Sensitivie Environments
DotCMS Bootcamp: Enabling Java in Latency Sensitivie EnvironmentsDotCMS Bootcamp: Enabling Java in Latency Sensitivie Environments
DotCMS Bootcamp: Enabling Java in Latency Sensitivie EnvironmentsAzul Systems Inc.
 
Zulu Embedded Java Introduction
Zulu Embedded Java IntroductionZulu Embedded Java Introduction
Zulu Embedded Java IntroductionAzul Systems Inc.
 

En vedette (15)

Webinar: Zing Vision: Answering your toughest production Java performance que...
Webinar: Zing Vision: Answering your toughest production Java performance que...Webinar: Zing Vision: Answering your toughest production Java performance que...
Webinar: Zing Vision: Answering your toughest production Java performance que...
 
Experiences with Debugging Data Races
Experiences with Debugging Data RacesExperiences with Debugging Data Races
Experiences with Debugging Data Races
 
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
 
What's New in the JVM in Java 8?
What's New in the JVM in Java 8?What's New in the JVM in Java 8?
What's New in the JVM in Java 8?
 
ObjectLayout: Closing the (last?) inherent C vs. Java speed gap
ObjectLayout: Closing the (last?) inherent C vs. Java speed gapObjectLayout: Closing the (last?) inherent C vs. Java speed gap
ObjectLayout: Closing the (last?) inherent C vs. Java speed gap
 
Towards a Scalable Non-Blocking Coding Style
Towards a Scalable Non-Blocking Coding StyleTowards a Scalable Non-Blocking Coding Style
Towards a Scalable Non-Blocking Coding Style
 
Priming Java for Speed at Market Open
Priming Java for Speed at Market OpenPriming Java for Speed at Market Open
Priming Java for Speed at Market Open
 
The Java Evolution Mismatch - Why You Need a Better JVM
The Java Evolution Mismatch - Why You Need a Better JVMThe Java Evolution Mismatch - Why You Need a Better JVM
The Java Evolution Mismatch - Why You Need a Better JVM
 
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!
 
Lock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash TableLock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash Table
 
DotCMS Bootcamp: Enabling Java in Latency Sensitivie Environments
DotCMS Bootcamp: Enabling Java in Latency Sensitivie EnvironmentsDotCMS Bootcamp: Enabling Java in Latency Sensitivie Environments
DotCMS Bootcamp: Enabling Java in Latency Sensitivie Environments
 
What's Inside a JVM?
What's Inside a JVM?What's Inside a JVM?
What's Inside a JVM?
 
Zulu Embedded Java Introduction
Zulu Embedded Java IntroductionZulu Embedded Java Introduction
Zulu Embedded Java Introduction
 
Java vs. C/C++
Java vs. C/C++Java vs. C/C++
Java vs. C/C++
 
C++vs java
C++vs javaC++vs java
C++vs java
 

Similaire à How NOT to Write a Microbenchmark

Algorithm-RepetitionSentinellNestedLoop_Solution.pptx
Algorithm-RepetitionSentinellNestedLoop_Solution.pptxAlgorithm-RepetitionSentinellNestedLoop_Solution.pptx
Algorithm-RepetitionSentinellNestedLoop_Solution.pptxAliaaAqilah3
 
TCLSH and Macro Ping Test on Cisco Routers and Switches
TCLSH and Macro Ping Test on Cisco Routers and SwitchesTCLSH and Macro Ping Test on Cisco Routers and Switches
TCLSH and Macro Ping Test on Cisco Routers and SwitchesNetProtocol Xpert
 
SPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic librarySPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic libraryAdaCore
 
Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Béo Tú
 
Analyzing and Interpreting AWR
Analyzing and Interpreting AWRAnalyzing and Interpreting AWR
Analyzing and Interpreting AWRpasalapudi
 
Algorithm analysis.pptx
Algorithm analysis.pptxAlgorithm analysis.pptx
Algorithm analysis.pptxDrBashirMSaad
 
Operators loops conditional and statements
Operators loops conditional and statementsOperators loops conditional and statements
Operators loops conditional and statementsVladislav Hadzhiyski
 
Timing Diagram.pptx
Timing Diagram.pptxTiming Diagram.pptx
Timing Diagram.pptxISMT College
 
Actor Concurrency
Actor ConcurrencyActor Concurrency
Actor ConcurrencyAlex Miller
 
FPGA based 10G Performance Tester for HW OpenFlow Switch
FPGA based 10G Performance Tester for HW OpenFlow SwitchFPGA based 10G Performance Tester for HW OpenFlow Switch
FPGA based 10G Performance Tester for HW OpenFlow SwitchYutaka Yasuda
 
Gordon morrison temporalengineering-delphi-v3
Gordon morrison temporalengineering-delphi-v3Gordon morrison temporalengineering-delphi-v3
Gordon morrison temporalengineering-delphi-v3Gordon Morrison
 
JVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, WixJVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, WixCodemotion Tel Aviv
 

Similaire à How NOT to Write a Microbenchmark (20)

Algorithm-RepetitionSentinellNestedLoop_Solution.pptx
Algorithm-RepetitionSentinellNestedLoop_Solution.pptxAlgorithm-RepetitionSentinellNestedLoop_Solution.pptx
Algorithm-RepetitionSentinellNestedLoop_Solution.pptx
 
TCLSH and Macro Ping Test on Cisco Routers and Switches
TCLSH and Macro Ping Test on Cisco Routers and SwitchesTCLSH and Macro Ping Test on Cisco Routers and Switches
TCLSH and Macro Ping Test on Cisco Routers and Switches
 
SPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic librarySPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic library
 
Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014
 
DSA 103 Object Oriented Programming :: Week 3
DSA 103 Object Oriented Programming :: Week 3DSA 103 Object Oriented Programming :: Week 3
DSA 103 Object Oriented Programming :: Week 3
 
Lesson 5
Lesson 5Lesson 5
Lesson 5
 
06.Loops
06.Loops06.Loops
06.Loops
 
Analyzing and Interpreting AWR
Analyzing and Interpreting AWRAnalyzing and Interpreting AWR
Analyzing and Interpreting AWR
 
02 performance
02 performance02 performance
02 performance
 
Algorithm analysis.pptx
Algorithm analysis.pptxAlgorithm analysis.pptx
Algorithm analysis.pptx
 
Operators loops conditional and statements
Operators loops conditional and statementsOperators loops conditional and statements
Operators loops conditional and statements
 
Timing Diagram.pptx
Timing Diagram.pptxTiming Diagram.pptx
Timing Diagram.pptx
 
Jvm memory model
Jvm memory modelJvm memory model
Jvm memory model
 
Actor Concurrency
Actor ConcurrencyActor Concurrency
Actor Concurrency
 
Week2 ch4 part1edited 2020
Week2 ch4 part1edited 2020Week2 ch4 part1edited 2020
Week2 ch4 part1edited 2020
 
Week2 ch4 part1edited 2020
Week2 ch4 part1edited 2020Week2 ch4 part1edited 2020
Week2 ch4 part1edited 2020
 
Java Programming: Loops
Java Programming: LoopsJava Programming: Loops
Java Programming: Loops
 
FPGA based 10G Performance Tester for HW OpenFlow Switch
FPGA based 10G Performance Tester for HW OpenFlow SwitchFPGA based 10G Performance Tester for HW OpenFlow Switch
FPGA based 10G Performance Tester for HW OpenFlow Switch
 
Gordon morrison temporalengineering-delphi-v3
Gordon morrison temporalengineering-delphi-v3Gordon morrison temporalengineering-delphi-v3
Gordon morrison temporalengineering-delphi-v3
 
JVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, WixJVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, Wix
 

Plus de Azul Systems Inc.

Advancements ingc andc4overview_linkedin_oct2017
Advancements ingc andc4overview_linkedin_oct2017Advancements ingc andc4overview_linkedin_oct2017
Advancements ingc andc4overview_linkedin_oct2017Azul Systems Inc.
 
Understanding GC, JavaOne 2017
Understanding GC, JavaOne 2017Understanding GC, JavaOne 2017
Understanding GC, JavaOne 2017Azul Systems Inc.
 
Azul Systems open source guide
Azul Systems open source guideAzul Systems open source guide
Azul Systems open source guideAzul Systems Inc.
 
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and ToolsIntelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and ToolsAzul Systems Inc.
 
Understanding Java Garbage Collection
Understanding Java Garbage CollectionUnderstanding Java Garbage Collection
Understanding Java Garbage CollectionAzul Systems Inc.
 
How NOT to Write a Microbenchmark

  • 1. How NOT To Write A Microbenchmark
      Dr. Cliff Click, Senior Staff Engineer, Sun Microsystems
  • 2. How NOT To Write A Microbenchmark
      or: Lies, Damn Lies, and Microbenchmarks
  • 3. Microbenchmarks are a sharp knife
      Microbenchmarks are like a microscope: magnification is high, but what the heck are you looking at?
      Like "truths, half-truths, and statistics", microbenchmarks can be very misleading.
      Session # 1816
  • 4. Learning Objectives
      As a result of this presentation, you will be able to:
      – Recognize when a benchmark lies
      – Recognize what a benchmark can tell you (as opposed to what it purports to tell you)
      – Understand how some popular benchmarks are flawed
      – Write a microbenchmark that doesn't lie (to you)
  • 5. Speaker’s Qualifications
      • Dr. Click is a Senior Staff Engineer at Sun Microsystems
      • Dr. Click wrote his first compiler at age 16 and has been writing:
        – runtime compilers for 15 years, and
        – optimizing compilers for 10 years
      • Dr. Click architected the HotSpot™ Server Compiler
  • 6. Your JVM is Lousy Because it Doesn't Do... "X"
      I routinely get handed a microbenchmark and told "HotSpot doesn't do X":
      – 49% chance it doesn't matter
      – 49% chance they got fooled
      – 1% chance HotSpot didn't do "Y" but should
      – 1% chance HotSpot didn't do "X" but should
  • 7. Agenda
      • Recent real-world benchmark disaster
      • Popular microbenchmarks and their flaws
      • When to disbelieve a benchmark
      • How to write your own
  • 8. What is a Microbenchmark?
      • Small program
        – Datasets may be large
      • All time spent in a few lines of code
      • Performance depends on how those few lines are compiled
      • Goal: Discover some particular fact
      • Remove all other variables
  • 9. Why Run Microbenchmarks?
      • Discover some targeted fact
        – Such as the cost of 'turned off' Asserts, or
        – Will this inline?
        – Will another layer of abstraction hurt performance?
      • Fun
        – My JIT is faster than your JIT
        – My Java is faster than your C
      • But dangerous!
  • 10. How HotSpot Works
      • HotSpot is a mixed-mode system
      • Code first runs interpreted
        – Profiles gathered
      • Hot code gets compiled
      • Same code "just runs faster" after a while
      • Bail out to interpreter for rare events
        – Code paths never taken before
        – Class loading, initializing
  • 11. Example from Magazine Editor
      • What do (turned off) Asserts cost?
      • Tiny 5-line function reads and writes a global variable
      • Run in a loop 100,000,000 times
      • With explicit check: 5 sec
      • With Assert (off): 0.2 sec
      • With no test at all: 5 sec
      • What did he really measure?
  • 12. Assert Example
      static int sval;                 // Global variable
      static int test_assert(int val) {
        assert (val >= 0) : "should be positive";
        sval = val * 6;
        sval += 3;                     // Read/write global
        sval /= 2;
        return val+2;
      }
      public static void main(String[] args) {
        int REPEAT = Integer.parseInt(args[0]);
        int v=0;
        for( int i=0; i<REPEAT; i++ )
          v = test_assert(v);
      }
  • 15. Assert Example
      static int sval;                 // Global variable
      static int test_explicit(int val) {
        if( val < 0 ) throw ...;
        sval = val * 6;
        sval += 3;                     // Read/write global
        sval /= 2;
        return val+2;
      }
      public static void main(String[] args) {
        int REPEAT = Integer.parseInt(args[0]);
        int v=0;
        for( int i=0; i<REPEAT; i++ )
          v = test_explicit(v);
      }
  • 16. Assert Example (cont.)
      • No synchronization ⇒ hoist static global into register, only do final write
      • Small hot static function ⇒ inline into loop
      • Asserts turned off ⇒ no test in code
      Effective code after these optimizations:
      static int sval;                 // Global variable
      public static void main(String[] args) {
        int REPEAT = Integer.parseInt(args[0]);
        int v=0;
        for( int i=0; i<REPEAT; i++ ) {
          sval = (v*6+3)/2;
          v = v+2;
        }
      }
  • 17. Assert Example (cont.)
      • Small loop ⇒ unroll it a lot
        – Need pre-loop to 'align' loop
      • Again remove redundant writes to global static
      Effective code after unrolling 16 times (only the last of the 16 writes to sval survives):
      static int sval;                 // Global variable
      public static void main(String[] args) {
        int REPEAT = Integer.parseInt(args[0]);
        int v=0;
        // pre-loop goes here...
        for( int i=0; i<REPEAT; i+=16 ) {
          sval = ((v+2*15)*6+3)/2;
          v = v+2*16;
        }
      }
  • 18. Benchmark is 'spammed'
      • Loop body is basically 'dead'
      • Unrolling speeds up by arbitrary factor
      • Note: 0.2 sec × 450 MHz clock / 100 million iters ⇒ loop runs too fast: < 1 clock/iter
      • But yet...
      • ... different loops run 20x faster
      • What's really happening?
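The clock arithmetic on this slide can be checked in a few lines of Java. This is only a sketch: the class and method names are invented for illustration, and the figures are the ones quoted on the slide.

```java
// Hypothetical helper (not from the talk): checks the slide's clock arithmetic.
class ClocksPerIter {
    /** Average clock cycles available per loop iteration. */
    static double clocksPerIteration(double seconds, long iterations, double clockHz) {
        return seconds * clockHz / iterations;
    }

    public static void main(String[] args) {
        // 0.2 sec for 100 million iterations on a 450 MHz clock.
        double cpi = clocksPerIteration(0.2, 100_000_000L, 450e6);
        System.out.println("clocks per iteration: " + cpi);
    }
}
```

The result is about 0.9 clocks per iteration. That is less than one cycle for a body that multiplies, adds, divides, and stores, so the loop cannot really be doing that work on every iteration.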
  • 19. A Tale of Four Loops
      public static void main(String[] args) {
        int REPEAT = Integer.parseInt(args[0]);
        int v=0;
        for( int i=0; i<REPEAT; i++ )
          v = test_no_check(v);        // hidden warmup loop: 0.2 sec
        long time1 = System.currentTimeMillis();
        for( int i=0; i<REPEAT; i++ )
          v = test_explicit(v);        // Loop 1: 5 sec
        long time2 = System.currentTimeMillis();
        System.out.println("explicit check time="+...);
        for( int i=0; i<REPEAT; i++ )
          v = test_assert(v);          // Loop 2: 0.2 sec
        long time3 = System.currentTimeMillis();
        System.out.println("assert time="+...);
        for( int i=0; i<REPEAT; i++ )
          v = test_no_check(v);        // Loop 3: 5 sec
        long time4 = System.currentTimeMillis();
        System.out.println("no check time="+...);
      }
  • 20. On-Stack Replacement
      • HotSpot is mixed mode: interp + compiled
      • Hidden warmup loop starts interpreted
      • Gets OSR'd: generate code for middle of loop
      • 'explicit check' loop never yet executed, so...
        – Don't inline test_explicit(), no unroll
        – 'println' causes class loading ⇒ drop down to interpreter
      • Hidden loop runs fast
      • 'explicit check' loop runs slow
  • 21. On-Stack Replacement (cont.)
      • Continue 'assert' loop in interpreter
      • Gets OSR'd: generate code for middle of loop
      • 'no check' loop never yet executed, so...
        – Don't inline test_no_check(), no unroll
      • Assert loop runs fast
      • 'no check' loop runs slow
  • 22. Dangers of Microbenchmarks
      • Looking for cost of Asserts...
      • But found OSR Policy Bug (fixed in 1.4.1)
      • Why not found before?
        – Only impacts microbenchmarks
        – Not found in any major benchmark
        – Not found in any major customer app
      • Note: test_explicit and test_no_check: 5 sec
        – Check is compare & predicted branch
        – Too cheap, hidden in other math
  • 23. Self-Calibrated Benchmarks
      • Want a robust microbenchmark that runs on a wide variety of machines
      • How long should loop run?
        – Depends on machine!
      • Time a few iterations to get a sense of speed
        – Then time enough runs to reach steady state
      • Report iterations/sec
      • Basic idea is sound, but devil is in the details
  • 24. Self-Calibrated Benchmarks
      • First timing loop runs interpreted
        – Also pays compile-time cost
      • Count needed to run reasonable time is low
      • Main timed run uses fast compiled code
      • Runs too fast to get result above clock jitter
      • Divide small count by clock jitter ⇒ random result
      • Report random score
  • 25. jBYTEmark Methodology
      • Count iterations to run for 10 msec
        – First run takes 104 msec
        – Way too short, always = 1 iteration
      • Then use 1st run time to compute iterations needed to run 100 msec
        – Way too short, always = 1 iteration
      • Then average 5 runs
        – Next 5 runs take 80, 60, 20, 20, 20 msec
        – Clock jitter = 1 msec
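To see why this calibration goes wrong, the slide's numbers can be replayed in a small Java sketch. This is not jBYTEmark source code: the class, the method names, and the extrapolation formula are assumptions made for illustration, and only the millisecond figures come from the slide.

```java
// Hypothetical sketch (not jBYTEmark source): replays the numbers on the slide.
class CalibrationSketch {
    /** Iterations estimated to fill targetMs, extrapolated from one timed probe. */
    static long iterationsFor(long targetMs, long probeMs, long probeIters) {
        long n = targetMs * probeIters / Math.max(1, probeMs);
        return Math.max(1, n);  // never fewer than one iteration
    }

    /** Arithmetic mean of the measured run times. */
    static double meanMs(long[] runsMs) {
        long sum = 0;
        for (long r : runsMs) sum += r;
        return (double) sum / runsMs.length;
    }

    public static void main(String[] args) {
        // First (interpreted) run: 1 iteration took 104 msec; the target is 100 msec.
        long iters = iterationsFor(100, 104, 1);
        // The next five runs, in msec, as compiled code takes over.
        long[] runs = {80, 60, 20, 20, 20};
        System.out.println("iterations per run: " + iters);
        System.out.println("reported mean msec: " + meanMs(runs));
        System.out.println("steady-state msec:  " + runs[runs.length - 1]);
    }
}
```

The calibration settles on a single iteration, and the mean of the five runs is 40 msec, twice the 20 msec steady state. On top of that, 1 msec of clock jitter on a 20 msec run is already a 5% error.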
  • 26. Non-constant work per iter
      • Non-sync'd String speed test
      • Run with & without some synchronization
      • Use concat, continuously add Strings
      • Benchmark requires copy before add, so...
        – Each iteration takes more work than before
      • Self-timed loop has some jitter...
        – 10000 main-loop iters with sync
        – 10030 main-loop iters withOUT sync
  • 27. Non-constant work per iter (cont.)
      • Work for Loop1:
        – 10000² copies, 10000 concats, 10000 syncs
      • Work for Loop2:
        – 10030² copies, 10030 concats, no syncs
      • Loop2 avoids 10000 syncs
      • But pays 600,900 more copies
      • Benchmark compares times, BUT
      • Extra work swamps sync cost!
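Taking the copy counts as the squares of the iteration counts, the 600,900 figure can be checked directly. The sketch below is a hypothetical helper that encodes only that cost model; it does not measure real String concatenation.

```java
// Hypothetical sketch: uses the slide's cost model of n*n copies for n iterations.
class ConcatWork {
    /** Copies performed by n growing concatenations, per the slide's n^2 model. */
    static long copies(long iterations) {
        return iterations * iterations;
    }

    public static void main(String[] args) {
        long withSync = copies(10_000);     // Loop1: 10000 iterations
        long withoutSync = copies(10_030);  // Loop2: 10030 iterations
        System.out.println("extra copies in Loop2: " + (withoutSync - withSync));
    }
}
```

A 0.3% difference in iteration count turns into 600,900 extra copies, which is why the two loops are not comparable.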
  • 28. Beware 'hidden' Cache Blowout
      • Work appears to be linear with time
      • More iterations should report same work/sec
      • Grow dataset to run longer
      • Blow out caches!
      • Self-calibrated loop jitter can make your data set fit in L1 cache or not!
      • Jitter strongly affects apparent work/sec rate
  • 29. Beware 'hidden' Cache Blowout
      • jBYTEmark StringSort with time raised to 1 sec
      • Each iteration adds a 4K array
      • Needs 1 or 2 iterations to hit 100 msec
      • Scale by 10 to reach 1 sec
      • Uses either 10 or 20 4K arrays
      • Working set is either 40K or 80K
      • 64K L1 cache
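The working-set arithmetic above is small enough to spell out. The following sketch uses hypothetical names; the sizes (4K arrays, scale factor 10, 64K L1 cache) are the ones on the slide.

```java
// Hypothetical sketch: sizes are the ones quoted on the slide.
class WorkingSet {
    static final int ARRAY_BYTES = 4 * 1024;      // each iteration adds a 4K array
    static final int L1_CACHE_BYTES = 64 * 1024;  // 64K L1 cache

    /** Bytes touched when the calibrated iteration count is scaled up. */
    static int workingSetBytes(int calibratedIters, int scale) {
        return calibratedIters * scale * ARRAY_BYTES;
    }

    static boolean fitsInL1(int bytes) {
        return bytes <= L1_CACHE_BYTES;
    }

    public static void main(String[] args) {
        // Calibration jitter picks 1 or 2 iterations; both are scaled by 10.
        for (int iters = 1; iters <= 2; iters++) {
            int bytes = workingSetBytes(iters, 10);
            System.out.println(iters + " iteration(s): " + bytes / 1024
                    + "K, fits in L1: " + fitsInL1(bytes));
        }
    }
}
```

One calibrated iteration gives a 40K working set that fits in the cache; two give 80K, which does not, so the same benchmark reports two very different rates.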
  • 30. Explicitly Handle GC
      • GC pauses occur at unpredictable times
      • GC throughput is predictable
      • Either don't allocate in your main loops
        – So no GC pauses
      • OR run long enough to reach GC steady state
        – Each run spends roughly same time doing GC
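As an illustration of the first option (no allocation inside the main loop), here is a minimal sketch. The workload, names, and sizes are invented; the point is only that the buffer is created once, before timing starts.

```java
// Hypothetical sketch: the buffer is allocated once, outside the timed region.
class NoAllocLoopSketch {
    /** Fills and sums a caller-supplied buffer; allocates nothing itself. */
    static long fillAndSum(int[] buf, int seed) {
        long sum = 0;
        for (int i = 0; i < buf.length; i++) {
            buf[i] = seed + i;
            sum += buf[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] buf = new int[1024];  // allocated before timing starts
        long total = 0;
        long start = System.nanoTime();
        for (int round = 0; round < 100_000; round++) {
            total += fillAndSum(buf, round);  // no 'new' inside the timed loop
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("total=" + total + " time=" + elapsedMs + " ms");
    }
}
```

If the code under test has to allocate, the other option on the slide applies instead: run long enough that every run pays roughly the same GC cost.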
  • 31. CaffeineMark3 Logic Benchmark
      • Purports to test logical operation speed
      • With unrolling OR constant propagation, whole function is dead (oops!)
      • First timing loop: 900K iterations to run 1 sec
      • Second loop runs 2.7M iterations in 0.685 sec
      • Score reports as negative (overflow) or huge
      • Overall score dominated by Logic score
  • 32. Know what you test
      • SpecJVM98 209_db
        – 85% of time spent in shell_sort
      • Scimark2 MonteCarlo simulation
        – 80% of time spent in sync'd Random.next
      • jBYTEmark StringSort
        – 75% of time spent in byte copy
      • CaffeineMark Logic test
        – "He's dead, Jim"
  • 33. How To Write a Microbenchmark (general – not just Java)
      • Pick suitable goals!
        – Can't judge web server performance by jBYTEmarks
        – Only answer very narrow questions
        – Make sure you actually care about the answer!
      • Be aware of system load, clock granularity
      • Self-calibrated loops always have jitter
        – Sanity check the time on the main run!
      • Run to steady state, at least 10 sec
  • 34. How To Write a Microbenchmark (general – not just Java)
      • Fixed size datasets (cache effects)
        – Large datasets have page coloring issues
      • Constant amount of work per iteration
        – Or results not comparable
      • Avoid the 'eqntott Syndrome'
        – Capture 1st run result (interpreted)
        – Compare to last result
  • 35. How To Write a Microbenchmark (general – not just Java)
      • Avoid 'dead' loops
        – These can have infinite speedups!
        – Print final answer
        – Make computation non-trivial
      • Or handle dead loops
        – Report when speedup is unreasonable
        – CaffeineMark3 & JavaGrande Section 1 mix Infinities and reasonable scores
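One way to follow the "print final answer" advice is to fold each iteration's result into a value that is printed at the end. The sketch below reuses the body of the talk's test function; the checksum, the class name, and the default iteration count are assumptions added for illustration. It makes the loop harder to eliminate, but a sufficiently clever compiler may still simplify it, so the reported time should be sanity-checked all the same.

```java
// Hypothetical sketch: one way to keep a benchmark loop from being trivially dead.
class LiveLoopSketch {
    static int sval;  // global read and written by the test function

    static int work(int val) {
        sval = val * 6;
        sval += 3;
        sval /= 2;
        return val + 2;
    }

    /** Runs the loop and folds every intermediate result into a checksum. */
    static long runAndChecksum(int repeat) {
        int v = 0;
        long checksum = 0;
        for (int i = 0; i < repeat; i++) {
            v = work(v);
            checksum += sval;  // each iteration's result feeds the final answer
        }
        return checksum + v;
    }

    public static void main(String[] args) {
        int repeat = args.length > 0 ? Integer.parseInt(args[0]) : 100_000_000;
        long start = System.nanoTime();
        long answer = runAndChecksum(repeat);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Printing the answer gives the computation an observable effect.
        System.out.println("answer=" + answer + " time=" + elapsedMs + " ms");
    }
}
```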
  • 36. How To Write a Microbenchmark (Java-specific)
      • Be explicit about GC
      • Thread scheduling is not deterministic
      • JIT performance may change over time
      • Warmup loop+test code before ANY timing
        – (HotSpot specific)
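The warmup advice can be sketched as a tiny harness. Everything here is an assumption made for illustration (the names, the 20 warmup rounds, the five timed runs); it is not a recipe from the talk, and the right warmup count depends on the JVM and its settings.

```java
// Hypothetical harness sketch; names and iteration counts are illustrative only.
class WarmupSketch {
    static int sval;  // global read and written by the test function

    static int testNoCheck(int val) {
        sval = val * 6;
        sval += 3;
        sval /= 2;
        return val + 2;
    }

    /** The measured loop lives in its own method, so later calls can enter compiled code directly. */
    static int runLoop(int repeat) {
        int v = 0;
        for (int i = 0; i < repeat; i++) v = testNoCheck(v);
        return v;
    }

    /** Times one call of runLoop and returns the elapsed nanoseconds. */
    static long timeNanos(int repeat) {
        long start = System.nanoTime();
        int v = runLoop(repeat);
        long elapsed = System.nanoTime() - start;
        if (v != 2 * repeat) throw new IllegalStateException("unexpected result " + v);
        return elapsed;
    }

    public static void main(String[] args) {
        int repeat = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;
        // Warm up: run the exact code under test, and the printing path, before timing.
        for (int i = 0; i < 20; i++) runLoop(repeat);
        System.out.println("warmup done, sval=" + sval);
        // Several timed runs, so drift between runs is visible.
        for (int run = 0; run < 5; run++) {
            System.out.println("run " + run + ": " + timeNanos(repeat) + " ns");
        }
    }
}
```

Keeping the loop in its own method means the timed calls do not depend on on-stack replacement of a long-running `main`, and printing once during warmup loads the output classes before any timing begins.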
  • 37. OSR Bug Avoidance
      • Fixed in 1.4.1
      • Write 1 loop per method for Microbenchmarks
      • Doesn't matter for 'real' apps
        – Don't write 1-loop-per-method code to avoid bug
  • 38. Summary
      • Microbenchmarks can be easy, fun, informative
      • They can be very misleading! Beware!
      • Pick suitable goals
      • Warmup code before timing
      • Run reasonable length of time
      • Fixed work / fixed datasets per iteration
      • Sanity-check final times and results
      • Handle 'dead' loops, GC issues
  • 39. If You Only Remember One Thing…
      Put Microtrust in a Microbenchmark