Copyright © 2015, Oracle and/or its affiliates. All rights reserved.
Program Agenda
1. Why do we need to benchmark?
2. Do we benchmark correctly?
3. Know the optimization
4. How OpenJDK JMH works
5. Advanced topics
Why do we need to benchmark?
Trends in hardware technology
• Single-processor performance improvement has been slowing steadily since 2003
• Clock speeds have not increased over the last decade because of power consumption constraints
• There is also a concern that microprocessor manufacturing is hitting the 7 nm process barrier
• DRAM chip capacity has recently increased by about 25% to 40% per year, and bandwidth has increased tremendously, but latency is still a concern
• It is increasingly difficult to manufacture even smaller DRAM cells efficiently
• Bandwidth has outpaced latency across these technologies and will likely continue to do so
Why do you need to measure?
• Software Engineering is more like the movie Interstellar: it can be realistic and driven by math, but it is still a movie, not reality
• Performance Engineering is firmly placed in reality, where one has to deal with complex hardware interactions, compiler and hardware optimizations, and multithreading
• The aim of Performance Engineering is to build a performance model of the underlying system
• It can show where optimization is required and where there is too much over-engineering
• Having microbenchmark data is far better than writing code in the blind
Do we benchmark correctly?
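A wrong approach to benchmarking
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

interface Incrementer {
    void increment();
}

// Guards the counter with an explicit ReentrantLock.
class LockIncrementer implements Incrementer {
    private long counter = 0;
    private final Lock lock = new ReentrantLock();

    public void increment() {
        lock.lock();
        try {
            ++counter;
        } finally {
            lock.unlock();
        }
    }
}

// Guards the counter with the intrinsic lock (synchronized).
class SyncIncrementer implements Incrementer {
    private long counter = 0;

    public synchronized void increment() {
        ++counter;
    }
}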
// Naive timing loop: a single run with no warm-up, and both
// implementations are measured in the same JVM.
static long test(Incrementer incr) {
    long start = System.nanoTime();
    for (long i = 0; i < 10000000L; i++)
        incr.increment();
    return System.nanoTime() - start;
}

public static void main(String[] args) {
    long synchTime = test(new SyncIncrementer());
    long lockTime = test(new LockIncrementer());
    System.out.printf("synchronized: %1$10d%n", synchTime);
    System.out.printf("Lock: %1$10d%n", lockTime);
    System.out.printf("Lock/synchronized = %1$.3f", (double) lockTime /
            (double) synchTime);
}
What is missing here?
• No consideration of compiler and hardware optimizations, the number of cores, and so on
• No consideration of JVM optimizations
• No consideration of the number of threads involved
• No consideration of how the results vary with the number of inputs
• Conclusions based solely on raw numbers and Stack Overflow conversations
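Some of these gaps can be narrowed even without a harness. The sketch below is a hand-rolled illustration, not a substitute for JMH: it runs warm-up rounds before timing and repeats the measurement several times so that run-to-run variation becomes visible. The class name `RepeatedTimer`, its method names, and the round counts are assumptions made for this example. The sketch still ignores thread count, JVM forking, and dead-code elimination.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the original slides or of JMH.
class RepeatedTimer {

    // Runs `task` opsPerRound times per round. Warm-up rounds are executed
    // but not recorded, giving the JIT compiler a chance to compile the hot path.
    static long[] measure(Runnable task, int warmupRounds, int measuredRounds, int opsPerRound) {
        for (int r = 0; r < warmupRounds; r++) {
            runRound(task, opsPerRound);
        }
        long[] nanosPerRound = new long[measuredRounds];
        for (int r = 0; r < measuredRounds; r++) {
            nanosPerRound[r] = runRound(task, opsPerRound);
        }
        return nanosPerRound;
    }

    // Times a single round with System.nanoTime().
    private static long runRound(Runnable task, int opsPerRound) {
        long start = System.nanoTime();
        for (int i = 0; i < opsPerRound; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long[] counter = new long[1];
        Object monitor = new Object();
        Runnable syncIncrement = () -> {
            synchronized (monitor) {
                counter[0]++;
            }
        };
        long[] rounds = measure(syncIncrement, 5, 5, 1_000_000);
        // Print every round instead of a single number so the spread is visible.
        System.out.println("ns per round: " + Arrays.toString(rounds));
    }
}
```

Reporting each round separately is a deliberate choice: a single total hides exactly the variation that the list above warns about.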
OpenJDK JMH
• JMH is a Java harness for building, running, and analyzing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM
• Part of the Code Tools project of OpenJDK
• Used extensively within OpenJDK to test the internals
• Keeps pace with changes in the JVM
• Brings a scientific approach to benchmarking
A quick look at how JMH works
• A Maven-based project that bundles the benchmark code into an executable JAR
• A quick list of common annotations

Annotation      Function
@Benchmark      Marks the method as a benchmark
@BenchmarkMode  Defines the benchmark mode, such as average time or throughput
@Warmup         Defines the warm-up iterations
@Measurement    Defines the measurement iterations
@Fork           Number of forked JVMs
Know the optimization
VM and Hardware optimization
• Dead code elimination – removal of code whose results are never used
• Inlining – replacing a call to a method with the body of that method, which removes the call overhead and exposes further optimizations
• Loop unrolling – increases a program's speed by reducing (or eliminating) the instructions that control the loop
• Warmup – the VM starts by interpreting the code and, once it has identified the hot methods, compiles them and starts aggressive inlining
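Dead-code elimination can be illustrated without any benchmark framework. In the sketch below, `sumDiscarded` computes a value that nobody reads, so the JIT compiler is free to remove the loop, while `sumReturned` hands the value back to the caller, which stores it in a volatile field. The class name, the method names, and the `sink` field are assumptions made for this example. Whether and when the loop is actually eliminated depends on the JVM and its compiler.

```java
// Hypothetical illustration of dead-code elimination; names are not from the slides.
class DeadCodeSketch {

    // Writes to a volatile field cannot be dropped, so values stored here count as used.
    static volatile int sink;

    // The local sum is never read after the loop: a candidate for dead-code elimination.
    static void sumDiscarded() {
        int sum = 0;
        for (int i = 0; i < 50; i++) {
            sum += i;
        }
    }

    // The sum escapes through the return value, so its result has to be produced
    // (the compiler may still simplify how it is computed).
    static int sumReturned() {
        int sum = 0;
        for (int i = 0; i < 50; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Call both methods often enough for them to become hot.
        for (int i = 0; i < 100_000; i++) {
            sumDiscarded();
            sink = sumReturned();
        }
        System.out.println(sink); // prints 1225 (0 + 1 + ... + 49)
    }
}
```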
DCE and Inlining
// The result is never used, so the JIT compiler can eliminate the loop as dead code.
@Benchmark
public void testMethod() {
    int sum = 0;
    for (int i = 0; i < 50; i++) {
        sum += i;
    }
}

// Returning the result keeps it from being treated as dead code.
@Benchmark
public int testMethod_1() {
    int sum = 0;
    for (int i = 0; i < 50; i++) {
        sum += i;
    }
    return sum;
}

// Same body as testMethod_1, but the JIT compiler is told not to inline this method.
@CompilerControl(CompilerControl.Mode.DONT_INLINE)
@Benchmark
public int testMethod_2() {
    int sum = 0;
    for (int i = 0; i < 50; i++) {
        sum += i;
    }
    return sum;
}

Benchmark                        Mode  Cnt  Score   Error  Units
Benchmark_Inlining.testMethod    avgt    5  0.411 ± 0.199  ns/op
Benchmark_Inlining.testMethod_1  avgt    5  3.396 ± 0.138  ns/op
Benchmark_Inlining.testMethod_2  avgt    5  5.123 ± 0.993  ns/op
Loop Unrolling
private double[] A = new double[2048];

@Benchmark
public double test1() {
    double sum = 0.0;
    for (int i = 0; i < A.length; i++) {
        sum += A[i];
    }
    return sum;
}

// Processes four elements per iteration; assumes A.length is a multiple of 4.
@Benchmark
public double testManualUnroll() {
    double sum = 0.0;
    for (int i = 0; i < A.length; i += 4) {
        sum += A[i] + A[i + 1] + A[i + 2] + A[i + 3];
    }
    return sum;
}

Benchmark                              Mode  Cnt     Score     Error  Units
Benchmark_LoopUnroll.test1             avgt    5  1946.006 ±  73.579  ns/op
Benchmark_LoopUnroll.testManualUnroll  avgt    5   823.572 ± 183.984  ns/op
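The manual unrolling in `testManualUnroll` relies on the array length being a multiple of 4. A minimal sketch of a more general version, using a hypothetical class `UnrollSketch` that is not part of the slides, adds a remainder loop for the leftover elements. Unrolling also changes the order in which the doubles are added, so for arbitrary inputs the two sums can differ in the last bits. The usage example sticks to small whole numbers, for which both sums are exact.

```java
// Hypothetical sketch; class and method names are assumptions for illustration.
class UnrollSketch {

    // Straightforward loop: one addition per iteration.
    static double sumSimple(double[] a) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    // Manually unrolled by four, with a remainder loop so that any length works.
    static double sumUnrolled(double[] a) {
        double sum = 0.0;
        int limit = a.length - (a.length % 4); // largest multiple of 4 not above the length
        int i = 0;
        for (; i < limit; i += 4) {
            sum += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
        }
        for (; i < a.length; i++) { // at most three leftover elements
            sum += a[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] data = new double[2050]; // deliberately not a multiple of 4
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(sumSimple(data) == sumUnrolled(data)); // prints true
    }
}
```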
Unrolling and Inlining on Steroids
int x = 1;
int y = 2;

@Benchmark
public int measure() {
    return (x + y);
}

// Repeats the addition `reps` times inside a single method call.
private int reps(int reps) {
    int s = 0;
    for (int i = 0; i < reps; i++) {
        s += (x + y);
    }
    return s;
}

@Benchmark
@OperationsPerInvocation(1)
public int measure1() {
    return reps(1);
}

// Note: despite its name, this benchmark runs 10000 repetitions.
@Benchmark
@OperationsPerInvocation(10000)
public int measure1000() {
    return reps(10000);
}

@Benchmark
@OperationsPerInvocation(100000)
public int measure100000() {
    return reps(100000);
}

Benchmark                         Mode  Cnt  Score   Error  Units
JMHSample_11_Loops.measure        avgt    5  5.472 ± 2.783  ns/op
JMHSample_11_Loops.measure1       avgt    5  4.813 ± 0.352  ns/op
JMHSample_11_Loops.measure1000    avgt    5  0.039 ± 0.008  ns/op
JMHSample_11_Loops.measure100000  avgt    5  0.036 ± 0.006  ns/op
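`@OperationsPerInvocation` tells JMH how many logical operations one call of the benchmark method performs, and the reported score is divided by that count. The arithmetic can be sketched as follows. The class `PerOperationScore` and its method are hypothetical helpers, not JMH API. If the run used the 10000 repetitions that `measure1000` contains as written, the reported 0.039 ns/op corresponds to roughly 390 ns per invocation.

```java
// Hypothetical helper showing the normalization; this is not JMH API.
class PerOperationScore {

    // Average time of one benchmark call divided by the declared operation count.
    static double nanosPerOp(double nanosPerInvocation, int opsPerInvocation) {
        if (opsPerInvocation <= 0) {
            throw new IllegalArgumentException("opsPerInvocation must be positive");
        }
        return nanosPerInvocation / opsPerInvocation;
    }

    public static void main(String[] args) {
        // 390 ns for one call that loops 10000 times is 0.039 ns per logical operation.
        System.out.println(nanosPerOp(390.0, 10_000));
    }
}
```

A per-operation score this far below the single-addition score of `measure` suggests that the JIT compiler has optimized the loop itself, not that each addition became faster.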
Advanced Topics
What can be the cost of an atomic write?
private int plainV;
private AtomicInteger atomicInteger = new AtomicInteger(0);

// Baseline: cost of returning a constant from a benchmark method.
@Benchmark
public int baseline() {
    return 42;
}

// Plain, non-atomic increment of a field.
@Benchmark
public int incrPlain() {
    return plainV++;
}

// Atomic increment.
@Benchmark
public int incrAtomic() {
    return atomicInteger.incrementAndGet();
}

Benchmark                          Mode  Cnt  Score   Error  Units
BenchmarkAtomicInteger.baseline    avgt    5  4.109 ± 0.345  ns/op
BenchmarkAtomicInteger.incrPlain   avgt    5  3.320 ± 0.415  ns/op
BenchmarkAtomicInteger.incrAtomic  avgt    5  9.205 ± 1.031  ns/op
Let's amortize the cost
@Param({ "40" })
private int tokens;
private int plainV;
private AtomicInteger atomicInteger = new AtomicInteger(0);

// Each benchmark first burns `tokens` units of CPU work, then performs its operation.
@Benchmark
public int baseline() {
    Blackhole.consumeCPU(tokens);
    return 42;
}

@Benchmark
public int incrPlain() {
    Blackhole.consumeCPU(tokens);
    return plainV++;
}

@Benchmark
public int incrAtomic() {
    Blackhole.consumeCPU(tokens);
    return atomicInteger.incrementAndGet();
}

Benchmark                                   (tokens)  Mode  Cnt   Score   Error  Units
BenchmarkAmortizedAtomicInteger.baseline          40  avgt    5  84.652 ± 3.229  ns/op
BenchmarkAmortizedAtomicInteger.incrPlain         40  avgt    5  85.101 ± 2.916  ns/op
BenchmarkAmortizedAtomicInteger.incrAtomic        40  avgt    5  87.118 ± 3.891  ns/op
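The effect of amortization can be read off the two result tables by comparing the atomic increment with the plain increment. The sketch below does that arithmetic with the scores reported on the slides. The class `RelativeOverhead` and its method are hypothetical and exist only for this example, and the error margins are ignored.

```java
// Hypothetical helper; the scores passed in main are the ones reported on the slides.
class RelativeOverhead {

    // Extra cost of `candidate` over `reference`, as a fraction of `reference`.
    static double overhead(double referenceNanos, double candidateNanos) {
        return (candidateNanos - referenceNanos) / referenceNanos;
    }

    public static void main(String[] args) {
        // Bare increments: incrPlain 3.320 ns/op vs incrAtomic 9.205 ns/op.
        System.out.printf("bare:      %.1f%%%n", 100 * overhead(3.320, 9.205));
        // With Blackhole.consumeCPU(40): incrPlain 85.101 ns/op vs incrAtomic 87.118 ns/op.
        System.out.printf("amortized: %.1f%%%n", 100 * overhead(85.101, 87.118));
    }
}
```

The bare atomic increment costs about 177% more than the plain one, while in the amortized case the gap shrinks to about 2.4%, which is smaller than the reported error margins.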
Conclusion
• Benchmark to understand the performance model of the system, not to obtain numbers for fights on Stack Overflow
• Benchmarking can give insight into where performance tweaking is needed and where it is not required because the underlying systems can do the optimization for you
• Superficial conclusions about performance without accurate measurement can lead to over-engineering
References
• http://hg.openjdk.java.net/code-tools/jmh/
• http://openjdk.java.net/projects/code-tools/jmh/
• http://shipilev.net/
• Computer Architecture: A Quantitative Approach (5th edition), John L. Hennessy (Stanford University) and David A. Patterson (University of California, Berkeley)
• http://openjdk.java.net/projects/jdk8/
Questions?