I gave this talk at the 2013 Meeting on Algorithm Engineering and Experiments (ALENEX).
Find my other talks and the corresponding papers on my web page:
http://wwwagak.cs.uni-kl.de/sebastian-wild.html
Engineering Java 7's Dual Pivot Quicksort Using MaLiJAn
1. Engineering Java 7’s Dual Pivot Quicksort Using MaLiJAn
Sebastian Wild, Markus E. Nebel, Raphael Reitzig, Ulrich Laube
[wild, nebel, r_reitzi, laube] @cs.uni-kl.de
Computer Science Department
University of Kaiserslautern
January 7, 2013
Meeting on Algorithm Engineering & Experiments 2013
Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 1 / 23
2. Background
Since Java 7: new dual pivot Quicksort in JRE library
Basic algorithm by Vladimir Yaroslavskiy
Optimizations by Jon Bentley, Joshua Bloch and others
(see java.core-libs.devel mailing list)
Motivated by experience with classic Quicksort
Validated by running time benchmark
In this talk:
Can we exploit special properties of dual pivot Quicksort?
Can we get more insight than running time measurements?
. . . stay tuned
3. Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort
(used in Oracle’s Java 7 Arrays.sort(int[]))
Invariant: [ < p | ℓ→  p ≤ ◦ ≤ q | k→  ? | ←g  > q ]
Step-by-step partitioning (array state, then action):
  3 5 1 8 4 7 2 9 6   Select two elements as pivots: p = 3 (first), q = 6 (last).
  3 5 1 8 4 7 2 9 6   Only the value relative to the pivots counts: small (< p), medium, or large (> q).
  3 5 1 8 4 7 2 9 6   A[k] = 5 is medium: go on.
  3 5 1 8 4 7 2 9 6   A[k] = 1 is small: swap small element to left end.
  3 1 5 8 4 7 2 9 6   A[k] = 8 is large: find swap partner. g skips over large elements (9) and stops at 2.
  3 1 5 8 4 7 2 9 6   Swap A[k] and A[g].
  3 1 5 2 4 7 8 9 6   A[k] = 2 is the old A[g] and small: swap to left.
  3 1 2 5 4 7 8 9 6   A[k] = 4 is medium: go on.
  3 1 2 5 4 7 8 9 6   A[k] = 7 is large: find swap partner. g skips over large elements.
  3 1 2 5 4 7 8 9 6   g and k have crossed! Swap pivots in place.
  2 1 3 5 4 6 8 9 7   Partitioning done! Recursively sort three sublists.
  1 2 3 4 5 6 7 8 9   Done.
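The partitioning steps of the example can be written out as a short Java sketch. It is a plain reading of the slides, not the JRE7 source: the class and method names are illustrative, the first and last element serve as pivots, the initial swap that ensures p ≤ q is an assumption added so the sketch works on arbitrary input, and the test t ≥ q is an assumption about a comparison sign that is not legible in the slides.

```java
import java.util.Arrays;

/** Sketch of Yaroslavskiy-style dual pivot partitioning; all names are illustrative. */
class DualPivotSketch {

    /**
     * Partitions a[left..right] (inclusive, right > left) around
     * p = a[left] and q = a[right]. Returns the final positions of p and q.
     */
    static int[] partition(int[] a, int left, int right) {
        if (a[left] > a[right]) swap(a, left, right);  // assumption: ensure p <= q
        int p = a[left], q = a[right];
        int l = left + 1, k = l, g = right - 1;
        while (k <= g) {
            int t = a[k];
            if (t < p) {                        // A[k] is small: swap to left end
                a[k] = a[l]; a[l] = t; l++;
            } else if (t >= q) {                // A[k] is large: find swap partner
                while (a[g] > q && k < g) g--;  // g skips over large elements
                if (a[g] < p) {                 // partner is small: it goes to the left end
                    a[k] = a[l]; a[l] = a[g]; l++;
                } else {                        // partner is medium: it stays at position k
                    a[k] = a[g];
                }
                a[g] = t; g--;
            }
            k++;                                // A[k] is medium (or handled): go on
        }
        l--; g++;
        swap(a, left, l);                       // swap pivots in place
        swap(a, right, g);
        return new int[] {l, g};
    }

    /** Recursively sorts the three sublists left of p, between p and q, right of q. */
    static void sort(int[] a, int left, int right) {
        if (right <= left) return;
        int[] m = partition(a, left, right);
        sort(a, left, m[0] - 1);
        sort(a, m[0] + 1, m[1] - 1);
        sort(a, m[1] + 1, right);
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {3, 5, 1, 8, 4, 7, 2, 9, 6};
        partition(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a)); // [2, 1, 3, 5, 4, 6, 8, 9, 7], as in the example
        sort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a));
    }
}
```

The library version adds pivot sampling and further optimizations that this sketch leaves out.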
23. Control Flow Graph of Partitioning Loop
Basic blocks (bc = number of Bytecode instructions in the block):
  Block  bc  Statements / test                          Successors
  1       3  k ≤ g ?                                    yes: 2, no: exit
  2       7  t := A[k]; t < p ?                         yes: 3, no: 4
  3      12  A[k] := A[ℓ]; A[ℓ] := t; ℓ := ℓ + 1;       12
  4       3  t ≥ q ?                                    yes: 5, no: 12
  5       5  A[g] > q ?                                 yes: 6, no: 8
  6       3  k < g ?                                    yes: 7, no: 8
  7       2  g := g − 1;                                5
  8       5  A[g] < p ?                                 yes: 9, no: 10
  9      14  A[k] := A[ℓ]; A[ℓ] := A[g]; ℓ := ℓ + 1;    11
  10      6  A[k] := A[g]                               11
  11      5  A[g] := t; g := g − 1;                     12
  12      2  k := k + 1                                 1
Cycles through the loop:
  Cycle  A[k]    A[g]    ∆(g − k)  Bytecode Instructions
  1      small   n/a     1         24
  2      medium  n/a     1         15
  3      large   large   1         10
  4      large   small   2         44
  5      large   medium  2         36
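The instruction counts of the five cycles are sums of the bc values of the blocks each cycle visits. A minimal Java sketch of that arithmetic follows. The bc values are taken from the slides; the block sequences per cycle are read off the control flow graph and are an assumption of this sketch, as are all names.

```java
/** Sums per-block bytecode counts along each cycle; block paths are an assumed reading of the graph. */
class CycleCosts {

    // bc(i) for basic blocks 1..12 as given on the slides (index 0 is unused)
    static final int[] BC = {0, 3, 7, 12, 3, 5, 3, 2, 5, 14, 6, 5, 2};

    // blocks visited by Cycle 1..5 (assumption, see lead-in)
    static final int[][] CYCLES = {
        {1, 2, 3, 12},                  // Cycle 1: A[k] small
        {1, 2, 4, 12},                  // Cycle 2: A[k] medium
        {5, 6, 7},                      // Cycle 3: A[k] large, A[g] large (inner loop)
        {1, 2, 4, 5, 8, 9, 11, 12},     // Cycle 4: A[k] large, A[g] small
        {1, 2, 4, 5, 8, 10, 11, 12}     // Cycle 5: A[k] large, A[g] medium
    };

    /** Total number of bytecode instructions executed along one cycle. */
    static int cost(int[] cycle) {
        int sum = 0;
        for (int block : cycle) sum += BC[block];
        return sum;
    }

    public static void main(String[] args) {
        for (int c = 0; c < CYCLES.length; c++) {
            System.out.println("Cycle " + (c + 1) + ": " + cost(CYCLES[c]) + " bytecode instructions");
        }
    }
}
```

Under these assumed paths the sums come out as 24, 15, 10, 44 and 36, which agrees with the instruction counts stated for Cycles 1 to 5.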
29. Asymmetry
Algorithm is asymmetric:
Cycles have different cost
Would rather execute cheap ones often
Cycles chosen by classes small, medium or large
Probability for classes depends on pivot values
Maybe we can “influence pivot values accordingly”?
30. Pivot Sampling
Well-known optimization for classic Quicksort: median-of-three
pivot closer to median of whole list
In JRE7 Quicksort implementation: natural extension for 2 pivots:
tertiles-of-five
pivots closer to tertiles of whole list
9 other possibilities to pick p and q out of 5 elements:
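A small Java sketch of this choice space: it picks the i-th and j-th smallest of five sample elements as p and q and enumerates all ten pairs (i, j). Reading tertiles-of-five as the pair (2,4), and JRE7(1,3) as the pair (1,3), is an assumption based on the notation; how the five sample elements are taken from the array is not covered here, and all names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Pivot selection from a sample of five elements; names and 1-based ranks are illustrative. */
class PivotSampling {

    /** Returns {p, q}: the i-th and j-th smallest (1-based, i < j) of the five sample elements. */
    static int[] choosePivots(int[] sample, int i, int j) {
        if (sample.length != 5 || i < 1 || j > 5 || i >= j) {
            throw new IllegalArgumentException("need 5 elements and 1 <= i < j <= 5");
        }
        int[] sorted = sample.clone();
        Arrays.sort(sorted);
        return new int[] {sorted[i - 1], sorted[j - 1]};
    }

    /** All ten ways to pick two ranks out of five. */
    static List<int[]> allSchemes() {
        List<int[]> schemes = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            for (int j = i + 1; j <= 5; j++) {
                schemes.add(new int[] {i, j});
            }
        }
        return schemes;
    }

    public static void main(String[] args) {
        int[] sample = {3, 5, 1, 8, 4};
        for (int[] s : allSchemes()) {
            int[] pq = choosePivots(sample, s[0], s[1]);
            System.out.println("(" + s[0] + "," + s[1] + "): p = " + pq[0] + ", q = " + pq[1]);
        }
    }
}
```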
34. Optimizing Pivot Sampling
Which are “good” pivot selection schemes?
Is the symmetric choice best possible?
Need objective function to optimize
Typical approaches to judge efficiency:
A Count number of basic operations.
(Here: number of executed Java Bytecode instructions.)
B Measure total running time.
35. Optimizing Pivot Sampling
Relative performance of pivot sampling compared to tertiles-of-five:
  Pivot Selection Scheme          A [1]               B [2]
  JRE7 (tertiles-of-five)         (reference)         (reference)
  (scheme shown only as picture)  +5.14%              +0.80%
  JRE7(1,3)                       −1.85%              −0.44%
  (scheme shown only as picture)  +3.34%              −0.42%
  (scheme shown only as picture)  (stack overflow!)   +10.6%
  (scheme shown only as picture)  +2.48%              +2.73%
  (scheme shown only as picture)  +11.3%              +3.31%
  (scheme shown only as picture)  +12.7%              +3.29%
  (scheme shown only as picture)  +16.4%              +2.48%
  (scheme shown only as picture)  +39.0%              +5.87%
[1] Average number of executed bytecodes on almost sorted lists of length 10^5.
[2] Average running time on random permutations of length 10^6.
37. Model and Method
What made JRE7(1,3) faster than JRE7?
. . . hard to tell from total time/bytecodes.
Need a more detailed model of the program.
Idea: Decompose along control flow graph!
View program as Markov chain over blocks
Termination via absorbing state
Transition i → j has probability $p^{(n)}_{i \to j}$ depending on input size n
Visiting block i incurs constant costs c(i)
Total cost is sum of block costs
Expected costs of program = expected costs of run of Markov chain
Latter easy to compute
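One standard way to compute the expected cost of a run of such a chain is to solve the linear system x = c + P x, where x(i) is the expected remaining cost when the chain is in block i and the absorbing state contributes nothing. The Java sketch below does this with Gaussian elimination. It is not MaLiJAn's implementation; the class name, the two-block example chain and its probabilities are made up for illustration.

```java
/** Expected cost of a run of an absorbing Markov chain with per-visit block costs (illustrative sketch). */
class MarkovChainCost {

    /**
     * p[i][j]: transition probability between transient blocks i and j; whatever is
     * missing to 1 in a row leads to the absorbing state. c[i]: cost of one visit to block i.
     * Returns x with x[i] = expected total cost of a run started in block i,
     * obtained by solving (I - P) x = c with Gaussian elimination and partial pivoting.
     */
    static double[] expectedCosts(double[][] p, double[] c) {
        int n = c.length;
        double[][] m = new double[n][n + 1];          // augmented matrix [I - P | c]
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) m[i][j] = (i == j ? 1.0 : 0.0) - p[i][j];
            m[i][n] = c[i];
        }
        for (int col = 0; col < n; col++) {
            int best = col;                           // row with the largest pivot candidate
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[best][col])) best = r;
            }
            double[] tmp = m[col]; m[col] = m[best]; m[best] = tmp;
            for (int r = col + 1; r < n; r++) {       // eliminate entries below the pivot
                double f = m[r][col] / m[col][col];
                for (int j = col; j <= n; j++) m[r][j] -= f * m[col][j];
            }
        }
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {            // back substitution
            double s = m[i][n];
            for (int j = i + 1; j < n; j++) s -= m[i][j] * x[j];
            x[i] = s / m[i][i];
        }
        return x;
    }

    public static void main(String[] args) {
        // Made-up chain: block 0 is a loop test (cost 3) that enters the body with
        // probability 0.9 and exits otherwise; block 1 is the body (cost 7) and
        // always returns to the test.
        double[][] p = {{0.0, 0.9}, {1.0, 0.0}};
        double[] c = {3.0, 7.0};
        System.out.println(expectedCosts(p, c)[0]);   // approximately 93
    }
}
```

In the made-up example the test runs 10 times and the body 9 times in expectation, so the expected cost is 10 · 3 + 9 · 7 = 93.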
39. Maximum Likelihood Analysis
How to determine block costs and transition probabilities?
Transition Probabilities
Count transitions in executions on sample data
(Allows arbitrary input distributions!)
Take relative frequency as estimate for $p^{(n)}_{i \to j}$
Extrapolate $p^{(n)}_{i \to j}$ to a function $p_{i \to j}(n)$ in n
Block Costs
We consider two cost measures:
A bc(i) = number of Bytecode instructions in block i.
B t(i) = running time of block i
All steps are automated in our tool MaLiJAn [3]
[3] http://wwwagak.cs.uni-kl.de/malijan.html
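The counting step for one fixed input size can be sketched in Java as follows. Given a recorded trace of visited block ids, the relative frequency of each transition among all transitions leaving the same block serves as the estimate. The trace format and all names are assumptions for illustration; this is not MaLiJAn's interface, and the extrapolation to a function in n is not covered.

```java
/** Estimates transition probabilities from a trace of visited basic blocks (illustrative sketch). */
class TransitionEstimator {

    /**
     * trace: sequence of block ids in 0..numBlocks-1 in execution order.
     * Returns p with p[i][j] = (number of transitions i -> j) / (number of transitions leaving i);
     * rows of blocks that are never left stay all zero.
     */
    static double[][] estimate(int[] trace, int numBlocks) {
        long[][] count = new long[numBlocks][numBlocks];
        long[] leaving = new long[numBlocks];
        for (int s = 0; s + 1 < trace.length; s++) {
            count[trace[s]][trace[s + 1]]++;
            leaving[trace[s]]++;
        }
        double[][] p = new double[numBlocks][numBlocks];
        for (int i = 0; i < numBlocks; i++) {
            for (int j = 0; j < numBlocks; j++) {
                if (leaving[i] > 0) p[i][j] = (double) count[i][j] / leaving[i];
            }
        }
        return p;
    }

    public static void main(String[] args) {
        int[] trace = {0, 1, 0, 1, 0, 2};   // made-up trace: block 0 is left three times
        double[][] p = estimate(trace, 3);
        System.out.println("p(0->1) = " + p[0][1] + ", p(0->2) = " + p[0][2] + ", p(1->0) = " + p[1][0]);
    }
}
```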
40. Block Sampling
Running times t(i) in B are typically a few nanoseconds,
so direct measurement is not possible.
Idea: Sampling Based Approach
[Figure: timeline of executed basic blocks (ns scale) with samples taken at regular intervals (µs scale)]
In regular intervals, store current basic block (concurrently)
We observe only a small fraction of all executed blocks, so repeat the execution
Relative frequencies of observed samples approach
relative running time contribution of blocks.
Count in separate run how often block i gets executed in total
Together, this allows us to compute t(i)
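One way to combine the two measurements, sketched in Java: if block i accounts for s(i) of S samples in a run of total time T and is executed N(i) times, then t(i) ≈ (s(i) / S) · T / N(i). This formula is an interpretation of the steps above, not taken from the slides, and the names and numbers are made up.

```java
/** Per-block running times from sample counts and execution counts (illustrative sketch). */
class BlockSampling {

    /**
     * samples[i]: number of samples that observed block i.
     * executions[i]: how often block i was executed in total (counted in a separate run).
     * totalTimeNanos: total running time of the sampled run.
     * Returns the estimated time per execution of each block, in nanoseconds.
     */
    static double[] blockTimes(long[] samples, long[] executions, double totalTimeNanos) {
        long totalSamples = 0;
        for (long s : samples) totalSamples += s;
        double[] t = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            if (executions[i] > 0 && totalSamples > 0) {
                // the block's share of the running time, divided by its number of executions
                t[i] = samples[i] * totalTimeNanos / ((double) totalSamples * executions[i]);
            }
        }
        return t;
    }

    public static void main(String[] args) {
        long[] samples = {30, 70};       // made-up sample counts
        long[] executions = {10, 70};    // made-up execution counts
        double[] t = blockTimes(samples, executions, 1000.0);
        System.out.println("t(0) = " + t[0] + " ns, t(1) = " + t[1] + " ns");
    }
}
```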
41. A Decent Word of Caution
1 Determining current block adds a small systematic error.
2 Java Specialty: Just-in-time Compilation
Running time heavily influenced by HotSpot JIT compiler
JIT collects profiling information at beginning
First input determines which optimizations are found
. . . more details in the paper
42. Input Distributions
We consider 2 different input distributions:
1 Random Permutations
well-studied in literature
2 Almost Sorted Lists
Random model by Brodal et al. [4]:
A[i] chosen i.i.d. uniform in [i − d, i + d]
for constant d (here d = 100)
[4] G. Brodal, R. Fagerberg, G. Moruz: On the Adaptiveness of Quicksort,
J. Exp. Algorithmics 12 (2008), pp. 3.2:1–3.2:20
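A minimal Java generator for the almost sorted model might look as follows. It assumes integer keys, 0-based indices and java.util.Random as the source of randomness; the class and method names are illustrative.

```java
import java.util.Random;

/** Generates almost sorted inputs: A[i] uniform in [i - d, i + d] (illustrative sketch). */
class AlmostSortedInput {

    /** Each A[i] is drawn independently and uniformly from the integers i - d, ..., i + d. */
    static int[] generate(int n, int d, Random rnd) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i - d + rnd.nextInt(2 * d + 1);   // nextInt(bound) is uniform on 0..bound-1
        }
        return a;
    }

    public static void main(String[] args) {
        int[] a = generate(20, 100, new Random(1));
        StringBuilder sb = new StringBuilder();
        for (int x : a) sb.append(x).append(' ');
        System.out.println(sb.toString().trim());
    }
}
```

Every element ends up at most 2d positions away from its place in the sorted order, which is what makes these inputs "almost sorted" for constant d.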
44. Asymptotic Expected Costs
  Measure        Algorithm   Random Permutations    Almost Sorted Lists
  Bytecodes (A)  JRE7        19.40 n ln n + 51 n    15.10 n ln n + 68 n
                 JRE7(1,3)   18.73 n ln n + 62 n    13.52 n ln n + 85 n
[Plots: number of bytecodes normalized by n ln n against n from 10^5 to 10^8 (log. plot), for JRE7 and JRE7(1,3), one plot per input distribution]
model fits data well!
asymptotically, JRE7(1,3) executes fewer Bytecodes!
Can we explain why?
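"Asymptotically" matters here because JRE7(1,3) has the larger linear term. Taking the fitted bytecode formulas at face value, a short calculation shows from which input size on JRE7(1,3) is ahead. The results are rough estimates: the formulas are fits and ignore lower order terms.

```latex
% Random permutations: when does JRE7 execute more bytecodes than JRE7(1,3)?
\begin{align*}
19.40\, n \ln n + 51\, n &> 18.73\, n \ln n + 62\, n \\
0.67 \ln n &> 11 \\
\ln n &> 16.42 \\
n &> e^{16.42} \approx 1.35 \times 10^{7}
\end{align*}
% Almost sorted lists:
\begin{align*}
15.10\, n \ln n + 68\, n &> 13.52\, n \ln n + 85\, n \\
1.58 \ln n &> 17 \\
\ln n &> 10.76 \\
n &> e^{10.76} \approx 4.7 \times 10^{4}
\end{align*}
```

So under these fits the break-even point lies inside the plotted range for random permutations and below it for almost sorted lists.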
51. Asymptotic Cycle Frequencies
[Bar chart: frequencies of Cycles 1 to 5 as multiples of n ln n + O(n) (axis from 0 to 0.4), for JRE7 and JRE7(1,3), on random permutations and on almost sorted lists]
JRE7(1,3) executes cheap Cycle 3 more often
and expensive Cycle 1 less often than JRE7.
Asymptotically, fewer executed Bytecodes!
52. Running Time Results
How about running time?
HotSpot JIT compiler has two modes
-Xcomp: JIT compiler without profiling information
warmup: profiling JIT with warmup on fixed input to trigger JIT compilation
Do Block Sampling for both modes
Should we expect same block running times?
. . . stay tuned
55. Asymptotic Expected Costs
  Measure          Algorithm   Random Permutations    Almost Sorted Lists
  Bytecodes (A)    JRE7        19.40 n ln n + 51 n    15.10 n ln n + 68 n
                   JRE7(1,3)   18.73 n ln n + 62 n    13.52 n ln n + 85 n
  time -Xcomp (B)  JRE7        20.10 n ln n + 26 n    11.95 n ln n + 54 n
                   JRE7(1,3)   19.95 n ln n + 32 n    11.09 n ln n + 64 n
  time warmup (B)  JRE7        10.02 n ln n + 9 n     5.52 n ln n + 13 n
                   JRE7(1,3)   11.39 n ln n + 15 n    5.38 n ln n + 19 n
[Plots: running time against n from 10^5 to 10^8, one plot per input distribution, for -Xcomp and for warmup]
JIT without profiling: asymptotically, JRE7(1,3) faster!
JIT with profiling and warmup: asymptotically, JRE7(1,3) slower!
What changes with profiling enabled?
62. Cycle Costs
[Bar chart: costs of Cycles 1 to 5 as multiples of cost(Cycle 5) (axis from 0 to 1), for bc and for the running times t of JRE7 and JRE7(1,3), each under -Xcomp and with warmup]
measures agree qualitatively
except for JRE7(1,3) with profiling JIT!
For JRE7(1,3), the code created by profiling JIT
for Cycle 3 is much slower than for JRE7!
That’s the place to focus future research on.
64. Conclusion
Summary
Java 7’s dual pivot Quicksort is highly asymmetric.
JRE7(1,3) executes fewer Bytecodes than JRE7.
Almost sorted inputs amplify impact of pivot sampling.
Oracle’s profiling JIT compiler creates different code for JRE7(1,3) ,
which potentially overcompensates gains.
Control flow graph decomposition supported by MaLiJAn makes
difference in code efficiency directly visible.
Open Problems
? What causes different costs for Cycle 3?
? Are the differences idiosyncrasies of Java / Oracle’s JRE?
? Performance of JRE7(1,3) on other inputs, especially with equal keys?