SlideShare une entreprise Scribd logo
1  sur  41
Télécharger pour lire hors ligne
Let’s talk about microbenchmarking
Andrey Akinshin, JetBrains
DotNext 2016 Helsinki
1/29
BenchmarkDotNet
2/29
Today’s environment
• BenchmarkDotNet v0.10.1
• Haswell Core i7-4702MQ CPU 2.20GHz, Windows 10
• .NET Framework 4.6.2 + clrjit/compatjit-v4.6.1586.0
• Source code: https://git.io/v1RRX
3/29
Today’s environment
• BenchmarkDotNet v0.10.1
• Haswell Core i7-4702MQ CPU 2.20GHz, Windows 10
• .NET Framework 4.6.2 + clrjit/compatjit-v4.6.1586.0
• Source code: https://git.io/v1RRX
Other environments:
C# compiler old csc / Roslyn
CLR CLR2 / CLR4 / CoreCLR / Mono
OS Windows / Linux / MacOS / FreeBSD
JIT LegacyJIT-x86 / LegacyJIT-x64 / RyuJIT-x64
GC MS (different CLRs) / Mono (Boehm/Sgen)
Toolchain JIT / NGen / .NET Native
Hardware ∞ different configurations
. . . . . .
And don’t forget about multiple versions
3/29
Count of iterations
A bad benchmark
// Resolution (Stopwatch) = 466 ns
// Latency (Stopwatch) = 18 ns
var sw = Stopwatch.StartNew();
Foo(); // 100 ns
sw.Stop();
WriteLine(sw.ElapsedMilliseconds);
4/29
Count of iterations
A bad benchmark
// Resolution (Stopwatch) = 466 ns
// Latency (Stopwatch) = 18 ns
var sw = Stopwatch.StartNew();
Foo(); // 100 ns
sw.Stop();
WriteLine(sw.ElapsedMilliseconds);
A better benchmark
var sw = Stopwatch.StartNew();
for (int i = 0; i < N; i++) // (N * 100 + eps) ns
Foo(); // 100 ns
sw.Stop();
var total = sw.ElapsedTicks / Stopwatch.Frequency;
WriteLine(total / N);
4/29
Several launches
Run 01 : 529.8674 ns/op
Run 02 : 532.7541 ns/op
Run 03 : 558.7448 ns/op
Run 04 : 555.6647 ns/op
Run 05 : 539.6401 ns/op
Run 06 : 539.3494 ns/op
Run 07 : 564.3222 ns/op
Run 08 : 551.9544 ns/op
Run 09 : 550.1608 ns/op
Run 10 : 533.0634 ns/op
5/29
Several launches
6/29
A simple case
Central limit theorem to the rescue!
7/29
Complicated cases
8/29
Latencies
Event Latency
1 CPU cycle 0.3 ns
Level 1 cache access 0.9 ns
Level 2 cache access 2.8 ns
Level 3 cache access 12.9 ns
Main memory access 120 ns
Solid-state disk I/O 50-150 µs
Rotational disk I/O 1-10 ms
Internet: SF to NYC 40 ms
Internet: SF to UK 81 ms
Internet: SF to Australia 183 ms
OS virtualization reboot 4 sec
Hardware virtualization reboot 40 sec
Physical system reboot 5 min
© Systems Performance: Enterprise and the Cloud
9/29
Sum of elements
const int N = 1024;
int[,] a = new int[N, N];
[Benchmark]
public double SumIJ()
{
var sum = 0;
for (int i = 0; i < N; i++)
for (int j = 0; j < N; j++)
sum += a[i, j];
return sum;
}
[Benchmark]
public double SumJI()
{
var sum = 0;
for (int j = 0; j < N; j++)
for (int i = 0; i < N; i++)
sum += a[i, j];
return sum;
}
10/29
Sum of elements
const int N = 1024;
int[,] a = new int[N, N];
[Benchmark]
public double SumIJ()
{
var sum = 0;
for (int i = 0; i < N; i++)
for (int j = 0; j < N; j++)
sum += a[i, j];
return sum;
}
[Benchmark]
public double SumJI()
{
var sum = 0;
for (int j = 0; j < N; j++)
for (int i = 0; i < N; i++)
sum += a[i, j];
return sum;
}
CPU cache effect:
SumIJ SumJI
LegacyJIT-x86 ≈1.3ms ≈4.0ms
10/29
Cache-sensitive benchmarks
Let’s run a benchmark several times:
int[] x = new int[128 * 1024 * 1024];
for (int iter = 0; iter < 5; iter++)
{
var sw = Stopwatch.StartNew();
for (int i = 0; i < x.Length; i += 16)
x[i]++;
sw.Stop();
WriteLine(sw.ElapsedMilliseconds);
}
11/29
Cache-sensitive benchmarks
Let’s run a benchmark several times:
int[] x = new int[128 * 1024 * 1024];
for (int iter = 0; iter < 5; iter++)
{
var sw = Stopwatch.StartNew();
for (int i = 0; i < x.Length; i += 16)
x[i]++;
sw.Stop();
WriteLine(sw.ElapsedMilliseconds);
}
176 // not warmed
81 // still not warmed
62 // the steady state
62 // the steady state
62 // the steady state
Warmup is not only about .NET
11/29
Branch prediction
const int N = 32767;
int[] sorted, unsorted; // random numbers [0..255]
private static int Sum(int[] data)
{
int sum = 0;
for (int i = 0; i < N; i++)
if (data[i] >= 128)
sum += data[i];
return sum;
}
[Benchmark]
public int Sorted()
{
return Sum(sorted);
}
[Benchmark]
public int Unsorted()
{
return Sum(unsorted);
}
12/29
Branch prediction
const int N = 32767;
int[] sorted, unsorted; // random numbers [0..255]
private static int Sum(int[] data)
{
int sum = 0;
for (int i = 0; i < N; i++)
if (data[i] >= 128)
sum += data[i];
return sum;
}
[Benchmark]
public int Sorted()
{
return Sum(sorted);
}
[Benchmark]
public int Unsorted()
{
return Sum(unsorted);
}
Sorted Unsorted
LegacyJIT-x86 ≈20µs ≈139µs
12/29
Isolation
A bad benchmark
var sw1 = Stopwatch.StartNew();
Foo();
sw1.Stop();
var sw2 = Stopwatch.StartNew();
Bar();
sw2.Stop();
13/29
Isolation
A bad benchmark
var sw1 = Stopwatch.StartNew();
Foo();
sw1.Stop();
var sw2 = Stopwatch.StartNew();
Bar();
sw2.Stop();
In general case, you should run each benchmark in
his own process. Remember about:
• Interface method dispatch
• Garbage collector and autotuning
• Conditional jitting
13/29
Interface method dispatch
private interface IInc {
double Inc(double x);
}
private class Foo : IInc {
double Inc(double x) => x + 1;
}
private class Bar : IInc {
double Inc(double x) => x + 1;
}
private double Run(IInc inc) {
double sum = 0;
for (int i = 0; i < 1001; i++)
sum += inc.Inc(0);
return sum;
}
// Which method is faster?
[Benchmark]
public double FooFoo() {
var foo1 = new Foo();
var foo2 = new Foo();
return Run(foo1) + Run(foo2);
}
[Benchmark]
public double FooBar() {
var foo = new Foo();
var bar = new Bar();
return Run(foo) + Run(bar);
}
14/29
Interface method dispatch
private interface IInc {
double Inc(double x);
}
private class Foo : IInc {
double Inc(double x) => x + 1;
}
private class Bar : IInc {
double Inc(double x) => x + 1;
}
private double Run(IInc inc) {
double sum = 0;
for (int i = 0; i < 1001; i++)
sum += inc.Inc(0);
return sum;
}
// Which method is faster?
[Benchmark]
public double FooFoo() {
var foo1 = new Foo();
var foo2 = new Foo();
return Run(foo1) + Run(foo2);
}
[Benchmark]
public double FooBar() {
var foo = new Foo();
var bar = new Bar();
return Run(foo) + Run(bar);
}
FooFoo FooBar
LegacyJIT-x64 ≈5.4µs ≈7.1µs
14/29
Tricky inlining
[Benchmark]
int Calc() => WithoutStarg(0x11) + WithStarg(0x12);
int WithoutStarg(int value) => value;
int WithStarg(int value) {
if (value < 0)
value = -value;
return value;
}
15/29
Tricky inlining
[Benchmark]
int Calc() => WithoutStarg(0x11) + WithStarg(0x12);
int WithoutStarg(int value) => value;
int WithStarg(int value) {
if (value < 0)
value = -value;
return value;
}
LegacyJIT-x86 LegacyJIT-x64 RyuJIT-x64
≈1.7ns 0 ≈1.7ns
15/29
Tricky inlining
[Benchmark]
int Calc() => WithoutStarg(0x11) + WithStarg(0x12);
int WithoutStarg(int value) => value;
int WithStarg(int value) {
if (value < 0)
value = -value;
return value;
}
LegacyJIT-x86 LegacyJIT-x64 RyuJIT-x64
≈1.7ns 0 ≈1.7ns
; LegacyJIT-x64 : Inlining succeeded
mov ecx,23h
ret
15/29
Tricky inlining
[Benchmark]
int Calc() => WithoutStarg(0x11) + WithStarg(0x12);
int WithoutStarg(int value) => value;
int WithStarg(int value) {
if (value < 0)
value = -value;
return value;
}
LegacyJIT-x86 LegacyJIT-x64 RyuJIT-x64
≈1.7ns 0 ≈1.7ns
; LegacyJIT-x64 : Inlining succeeded
mov ecx,23h
ret
// RyuJIT-x64 : Inlining failed
// Inline expansion aborted due to opcode
// [06] OP_starg.s in method
// Program:WithStarg(int):int:this
15/29
SIMD
struct MyVector // Copy-pasted from System.Numerics.Vector4
{
public float X, Y, Z, W;
public MyVector(float x, float y, float z, float w)
{
X = x; Y = y; Z = z; W = w;
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static MyVector operator *(MyVector left, MyVector right)
{
return new MyVector(left.X * right.X, left.Y * right.Y,
left.Z * right.Z, left.W * right.W);
}
}
Vector4 vector1, vector2, vector3;
MyVector myVector1, myVector2, myVector3;
[Benchmark] void MyMul() => myVector3 = myVector1 * myVector2;
[Benchmark] void BclMul() => vector3 = vector1 * vector2;
16/29
SIMD
struct MyVector // Copy-pasted from System.Numerics.Vector4
{
public float X, Y, Z, W;
public MyVector(float x, float y, float z, float w)
{
X = x; Y = y; Z = z; W = w;
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static MyVector operator *(MyVector left, MyVector right)
{
return new MyVector(left.X * right.X, left.Y * right.Y,
left.Z * right.Z, left.W * right.W);
}
}
Vector4 vector1, vector2, vector3;
MyVector myVector1, myVector2, myVector3;
[Benchmark] void MyMul() => myVector3 = myVector1 * myVector2;
[Benchmark] void BclMul() => vector3 = vector1 * vector2;
LegacyJIT-x64 RyuJIT-x64
MyMul ≈12.9ns ≈2.5ns
BclMul ≈12.9ns ≈0.2ns
16/29
How so?
LegacyJIT-x64 RyuJIT-x64
MyMul ≈12.9ns ≈2.5ns
BclMul ≈12.9ns ≈0.2ns
; LegacyJIT-x64
; MyMul, BclMul: Na¨ıve SSE
; ...
movss xmm3,dword ptr [rsp+40h]
mulss xmm3,dword ptr [rsp+30h]
movss xmm2,dword ptr [rsp+44h]
mulss xmm2,dword ptr [rsp+34h]
movss xmm1,dword ptr [rsp+48h]
mulss xmm1,dword ptr [rsp+38h]
movss xmm0,dword ptr [rsp+4Ch]
mulss xmm0,dword ptr [rsp+3Ch]
xor eax,eax
mov qword ptr [rsp],rax
mov qword ptr [rsp+8],rax
lea rax,[rsp]
movss dword ptr [rax],xmm3
movss dword ptr [rax+4],xmm2
; ...
; RyuJIT-x64
; MyMul: Na¨ıve AVX
; ...
vmulss xmm0,xmm0,xmm4
vmulss xmm1,xmm1,xmm5
vmulss xmm2,xmm2,xmm6
vmulss xmm3,xmm3,xmm7
; ...
; BclMul: Smart AVX intrinsic
vmovupd xmm0,xmmword ptr [rcx+8]
vmovupd xmm1,xmmword ptr [rcx+18h]
vmulps xmm0,xmm0,xmm1
vmovupd xmmword ptr [rcx+28h],xmm0
17/29
Let’s calculate some square roots
[Benchmark]
double Sqrt13() =>
Math.Sqrt(1) + Math.Sqrt(2) + Math.Sqrt(3) + /* ... */
+ Math.Sqrt(13);
VS
[Benchmark]
double Sqrt14() =>
Math.Sqrt(1) + Math.Sqrt(2) + Math.Sqrt(3) + /* ... */
+ Math.Sqrt(13) + Math.Sqrt(14);
18/29
Let’s calculate some square roots
[Benchmark]
double Sqrt13() =>
Math.Sqrt(1) + Math.Sqrt(2) + Math.Sqrt(3) + /* ... */
+ Math.Sqrt(13);
VS
[Benchmark]
double Sqrt14() =>
Math.Sqrt(1) + Math.Sqrt(2) + Math.Sqrt(3) + /* ... */
+ Math.Sqrt(13) + Math.Sqrt(14);
RyuJIT-x64∗
Sqrt13 ≈91ns
Sqrt14 0 ns
∗Can be changed in future versions, see github.com/dotnet/coreclr/issues/987
18/29
How so?
RyuJIT-x64, Sqrt13
vsqrtsd xmm0,xmm0,mmword ptr [7FF94F9E4D28h]
vsqrtsd xmm1,xmm0,mmword ptr [7FF94F9E4D30h]
vaddsd xmm0,xmm0,xmm1
vsqrtsd xmm1,xmm0,mmword ptr [7FF94F9E4D38h]
vaddsd xmm0,xmm0,xmm1
vsqrtsd xmm1,xmm0,mmword ptr [7FF94F9E4D40h]
vaddsd xmm0,xmm0,xmm1
; A lot of vqrtsd and vaddsd instructions
; ...
vsqrtsd xmm1,xmm0,mmword ptr [7FF94F9E4D88h]
vaddsd xmm0,xmm0,xmm1
ret
RyuJIT-x64, Sqrt14
vmovsd xmm0,qword ptr [7FF94F9C4C80h] ; Const
ret
19/29
How so?
Big expression tree
* stmtExpr void (top level) (IL 0x000... ???)
| /--* mathFN double sqrt
| | --* dconst double 13.000000000000000
| /--* + double
| | | /--* mathFN double sqrt
| | | | --* dconst double 12.000000000000000
| | --* + double
| | | /--* mathFN double sqrt
| | | | --* dconst double 11.000000000000000
| | --* + double
| | | /--* mathFN double sqrt
| | | | --* dconst double 10.000000000000000
| | --* + double
| | | /--* mathFN double sqrt
| | | | --* dconst double 9.0000000000000000
| | --* + double
| | | /--* mathFN double sqrt
| | | | --* dconst double 8.0000000000000000
| | --* + double
| | | /--* mathFN double sqrt
| | | | --* dconst double 7.0000000000000000
| | --* + double
| | | /--* mathFN double sqrt
| | | | --* dconst double 6.0000000000000000
| | --* + double
| | | /--* mathFN double sqrt
| | | | --* dconst double 5.0000000000000000
// ...
20/29
How so?
Constant folding in action
N001 [000001] dconst 1.0000000000000000 => $c0 {DblCns[1.000000]}
N002 [000002] mathFN => $c0 {DblCns[1.000000]}
N003 [000003] dconst 2.0000000000000000 => $c1 {DblCns[2.000000]}
N004 [000004] mathFN => $c2 {DblCns[1.414214]}
N005 [000005] + => $c3 {DblCns[2.414214]}
N006 [000006] dconst 3.0000000000000000 => $c4 {DblCns[3.000000]}
N007 [000007] mathFN => $c5 {DblCns[1.732051]}
N008 [000008] + => $c6 {DblCns[4.146264]}
N009 [000009] dconst 4.0000000000000000 => $c7 {DblCns[4.000000]}
N010 [000010] mathFN => $c1 {DblCns[2.000000]}
N011 [000011] + => $c8 {DblCns[6.146264]}
N012 [000012] dconst 5.0000000000000000 => $c9 {DblCns[5.000000]}
N013 [000013] mathFN => $ca {DblCns[2.236068]}
N014 [000014] + => $cb {DblCns[8.382332]}
N015 [000015] dconst 6.0000000000000000 => $cc {DblCns[6.000000]}
N016 [000016] mathFN => $cd {DblCns[2.449490]}
N017 [000017] + => $ce {DblCns[10.831822]}
N018 [000018] dconst 7.0000000000000000 => $cf {DblCns[7.000000]}
N019 [000019] mathFN => $d0 {DblCns[2.645751]}
N020 [000020] + => $d1 {DblCns[13.477573]}
...
21/29
Concurrency
22/29
True sharing
23/29
False sharing
24/29
False sharing in action
// It's an extremely na¨ıve benchmark
// Don't try this at home
int[] x = new int[1024];
void Inc(int p) {
for (int i = 0; i < 10000001; i++)
x[p]++;
}
void Run(int step) {
var sw = Stopwatch.StartNew();
Task.WaitAll(
Task.Factory.StartNew(() => Inc(0 * step)),
Task.Factory.StartNew(() => Inc(1 * step)),
Task.Factory.StartNew(() => Inc(2 * step)),
Task.Factory.StartNew(() => Inc(3 * step)));
WriteLine(sw.ElapsedMilliseconds);
}
25/29
False sharing in action
// It's an extremely na¨ıve benchmark
// Don't try this at home
int[] x = new int[1024];
void Inc(int p) {
for (int i = 0; i < 10000001; i++)
x[p]++;
}
void Run(int step) {
var sw = Stopwatch.StartNew();
Task.WaitAll(
Task.Factory.StartNew(() => Inc(0 * step)),
Task.Factory.StartNew(() => Inc(1 * step)),
Task.Factory.StartNew(() => Inc(2 * step)),
Task.Factory.StartNew(() => Inc(3 * step)));
WriteLine(sw.ElapsedMilliseconds);
}
Run(1) Run(256)
≈400ms ≈150ms
25/29
Conclusion: Benchmarking is hard
Anon et al., “A Measure of Transaction Processing Power”
There are lies, damn lies and then there are performance
measures.
26/29
Some good books
27/29
Questions?
Andrey Akinshin
http://aakinshin.net
https://github.com/AndreyAkinshin
https://twitter.com/andrey_akinshin
andrey.akinshin@gmail.com
28/29

Contenu connexe

Tendances

How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITEgor Bogatov
 
20140531 serebryany lecture02_find_scary_cpp_bugs
20140531 serebryany lecture02_find_scary_cpp_bugs20140531 serebryany lecture02_find_scary_cpp_bugs
20140531 serebryany lecture02_find_scary_cpp_bugsComputer Science Club
 
20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugsComputer Science Club
 
Advance C++notes
Advance C++notesAdvance C++notes
Advance C++notesRajiv Gupta
 
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...Abstracting Vector Architectures in Library Generators: Case Study Convolutio...
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...ETH Zurich
 
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...Alex Pruden
 
ZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their ApplicationsZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their ApplicationsAlex Pruden
 
To Swift 2...and Beyond!
To Swift 2...and Beyond!To Swift 2...and Beyond!
To Swift 2...and Beyond!Scott Gardner
 
zkStudy Club: Subquadratic SNARGs in the Random Oracle Model
zkStudy Club: Subquadratic SNARGs in the Random Oracle ModelzkStudy Club: Subquadratic SNARGs in the Random Oracle Model
zkStudy Club: Subquadratic SNARGs in the Random Oracle ModelAlex Pruden
 
Microsoft Word Hw#1
Microsoft Word   Hw#1Microsoft Word   Hw#1
Microsoft Word Hw#1kkkseld
 
Some examples of the 64-bit code errors
Some examples of the 64-bit code errorsSome examples of the 64-bit code errors
Some examples of the 64-bit code errorsPVS-Studio
 
Advance features of C++
Advance features of C++Advance features of C++
Advance features of C++vidyamittal
 
PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...Andrey Karpov
 
How To Crack RSA Netrek Binary Verification System
How To Crack RSA Netrek Binary Verification SystemHow To Crack RSA Netrek Binary Verification System
How To Crack RSA Netrek Binary Verification SystemJay Corrales
 
ExperiencesSharingOnEmbeddedSystemDevelopment_20160321
ExperiencesSharingOnEmbeddedSystemDevelopment_20160321ExperiencesSharingOnEmbeddedSystemDevelopment_20160321
ExperiencesSharingOnEmbeddedSystemDevelopment_20160321Teddy Hsiung
 
Конверсия управляемых языков в неуправляемые
Конверсия управляемых языков в неуправляемыеКонверсия управляемых языков в неуправляемые
Конверсия управляемых языков в неуправляемыеPlatonov Sergey
 
Homomorphic Encryption
Homomorphic EncryptionHomomorphic Encryption
Homomorphic EncryptionVictor Pereira
 

Tendances (20)

How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJIT
 
20140531 serebryany lecture02_find_scary_cpp_bugs
20140531 serebryany lecture02_find_scary_cpp_bugs20140531 serebryany lecture02_find_scary_cpp_bugs
20140531 serebryany lecture02_find_scary_cpp_bugs
 
20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs20140531 serebryany lecture01_fantastic_cpp_bugs
20140531 serebryany lecture01_fantastic_cpp_bugs
 
Advance C++notes
Advance C++notesAdvance C++notes
Advance C++notes
 
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...Abstracting Vector Architectures in Library Generators: Case Study Convolutio...
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...
 
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
 
ZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their ApplicationsZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their Applications
 
To Swift 2...and Beyond!
To Swift 2...and Beyond!To Swift 2...and Beyond!
To Swift 2...and Beyond!
 
Computing on Encrypted Data
Computing on Encrypted DataComputing on Encrypted Data
Computing on Encrypted Data
 
zkStudy Club: Subquadratic SNARGs in the Random Oracle Model
zkStudy Club: Subquadratic SNARGs in the Random Oracle ModelzkStudy Club: Subquadratic SNARGs in the Random Oracle Model
zkStudy Club: Subquadratic SNARGs in the Random Oracle Model
 
Microsoft Word Hw#1
Microsoft Word   Hw#1Microsoft Word   Hw#1
Microsoft Word Hw#1
 
Some examples of the 64-bit code errors
Some examples of the 64-bit code errorsSome examples of the 64-bit code errors
Some examples of the 64-bit code errors
 
Advance features of C++
Advance features of C++Advance features of C++
Advance features of C++
 
PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...
 
C++ TUTORIAL 6
C++ TUTORIAL 6C++ TUTORIAL 6
C++ TUTORIAL 6
 
Computer Programming- Lecture 4
Computer Programming- Lecture 4Computer Programming- Lecture 4
Computer Programming- Lecture 4
 
How To Crack RSA Netrek Binary Verification System
How To Crack RSA Netrek Binary Verification SystemHow To Crack RSA Netrek Binary Verification System
How To Crack RSA Netrek Binary Verification System
 
ExperiencesSharingOnEmbeddedSystemDevelopment_20160321
ExperiencesSharingOnEmbeddedSystemDevelopment_20160321ExperiencesSharingOnEmbeddedSystemDevelopment_20160321
ExperiencesSharingOnEmbeddedSystemDevelopment_20160321
 
Конверсия управляемых языков в неуправляемые
Конверсия управляемых языков в неуправляемыеКонверсия управляемых языков в неуправляемые
Конверсия управляемых языков в неуправляемые
 
Homomorphic Encryption
Homomorphic EncryptionHomomorphic Encryption
Homomorphic Encryption
 

Similaire à Let’s talk about microbenchmarking

Tema3_Introduction_to_CUDA_C.pdf
Tema3_Introduction_to_CUDA_C.pdfTema3_Introduction_to_CUDA_C.pdf
Tema3_Introduction_to_CUDA_C.pdfpepe464163
 
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...Andrey Karpov
 
I dont know what is wrong with this roulette program I cant seem.pdf
I dont know what is wrong with this roulette program I cant seem.pdfI dont know what is wrong with this roulette program I cant seem.pdf
I dont know what is wrong with this roulette program I cant seem.pdfarchanaemporium
 
Whats new in_csharp4
Whats new in_csharp4Whats new in_csharp4
Whats new in_csharp4Abed Bukhari
 
JVM code reading -- C2
JVM code reading -- C2JVM code reading -- C2
JVM code reading -- C2ytoshima
 
PVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications developmentPVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications developmentOOO "Program Verification Systems"
 
#include stdafx.h using namespace std; #include stdlib.h.docx
#include stdafx.h using namespace std; #include stdlib.h.docx#include stdafx.h using namespace std; #include stdlib.h.docx
#include stdafx.h using namespace std; #include stdlib.h.docxajoy21
 
public class TrequeT extends AbstractListT { .pdf
  public class TrequeT extends AbstractListT {  .pdf  public class TrequeT extends AbstractListT {  .pdf
public class TrequeT extends AbstractListT { .pdfinfo30292
 
ぐだ生 Java入門第三回(文字コードの話)(Keynote版)
ぐだ生 Java入門第三回(文字コードの話)(Keynote版)ぐだ生 Java入門第三回(文字コードの話)(Keynote版)
ぐだ生 Java入門第三回(文字コードの話)(Keynote版)Makoto Yamazaki
 
Covering a function using a Dynamic Symbolic Execution approach
Covering a function using a Dynamic Symbolic Execution approach Covering a function using a Dynamic Symbolic Execution approach
Covering a function using a Dynamic Symbolic Execution approach Jonathan Salwan
 
LSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityBrendan Gregg
 
DATASTRUCTURES PPTS PREPARED BY M V BRAHMANANDA REDDY
DATASTRUCTURES PPTS PREPARED BY M V BRAHMANANDA REDDYDATASTRUCTURES PPTS PREPARED BY M V BRAHMANANDA REDDY
DATASTRUCTURES PPTS PREPARED BY M V BRAHMANANDA REDDYMalikireddy Bramhananda Reddy
 
COSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdfCOSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdfYodalee
 
54602399 c-examples-51-to-108-programe-ee01083101
54602399 c-examples-51-to-108-programe-ee0108310154602399 c-examples-51-to-108-programe-ee01083101
54602399 c-examples-51-to-108-programe-ee01083101premrings
 
Score (smart contract for icon)
Score (smart contract for icon) Score (smart contract for icon)
Score (smart contract for icon) Doyun Hwang
 

Similaire à Let’s talk about microbenchmarking (20)

Tema3_Introduction_to_CUDA_C.pdf
Tema3_Introduction_to_CUDA_C.pdfTema3_Introduction_to_CUDA_C.pdf
Tema3_Introduction_to_CUDA_C.pdf
 
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
 
I dont know what is wrong with this roulette program I cant seem.pdf
I dont know what is wrong with this roulette program I cant seem.pdfI dont know what is wrong with this roulette program I cant seem.pdf
I dont know what is wrong with this roulette program I cant seem.pdf
 
Whats new in_csharp4
Whats new in_csharp4Whats new in_csharp4
Whats new in_csharp4
 
JVM code reading -- C2
JVM code reading -- C2JVM code reading -- C2
JVM code reading -- C2
 
PVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications developmentPVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications development
 
#include stdafx.h using namespace std; #include stdlib.h.docx
#include stdafx.h using namespace std; #include stdlib.h.docx#include stdafx.h using namespace std; #include stdlib.h.docx
#include stdafx.h using namespace std; #include stdlib.h.docx
 
.net progrmming part2
.net progrmming part2.net progrmming part2
.net progrmming part2
 
public class TrequeT extends AbstractListT { .pdf
  public class TrequeT extends AbstractListT {  .pdf  public class TrequeT extends AbstractListT {  .pdf
public class TrequeT extends AbstractListT { .pdf
 
ぐだ生 Java入門第三回(文字コードの話)(Keynote版)
ぐだ生 Java入門第三回(文字コードの話)(Keynote版)ぐだ生 Java入門第三回(文字コードの話)(Keynote版)
ぐだ生 Java入門第三回(文字コードの話)(Keynote版)
 
Insertion Sort Code
Insertion Sort CodeInsertion Sort Code
Insertion Sort Code
 
Ac2
Ac2Ac2
Ac2
 
Covering a function using a Dynamic Symbolic Execution approach
Covering a function using a Dynamic Symbolic Execution approach Covering a function using a Dynamic Symbolic Execution approach
Covering a function using a Dynamic Symbolic Execution approach
 
LSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF Observability
 
DATASTRUCTURES PPTS PREPARED BY M V BRAHMANANDA REDDY
DATASTRUCTURES PPTS PREPARED BY M V BRAHMANANDA REDDYDATASTRUCTURES PPTS PREPARED BY M V BRAHMANANDA REDDY
DATASTRUCTURES PPTS PREPARED BY M V BRAHMANANDA REDDY
 
Exploiting vectorization with ISPC
Exploiting vectorization with ISPCExploiting vectorization with ISPC
Exploiting vectorization with ISPC
 
COSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdfCOSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdf
 
PRACTICAL COMPUTING
PRACTICAL COMPUTINGPRACTICAL COMPUTING
PRACTICAL COMPUTING
 
54602399 c-examples-51-to-108-programe-ee01083101
54602399 c-examples-51-to-108-programe-ee0108310154602399 c-examples-51-to-108-programe-ee01083101
54602399 c-examples-51-to-108-programe-ee01083101
 
Score (smart contract for icon)
Score (smart contract for icon) Score (smart contract for icon)
Score (smart contract for icon)
 

Plus de Andrey Akinshin

Поговорим про performance-тестирование
Поговорим про performance-тестированиеПоговорим про performance-тестирование
Поговорим про performance-тестированиеAndrey Akinshin
 
Сложности performance-тестирования
Сложности performance-тестированияСложности performance-тестирования
Сложности performance-тестированияAndrey Akinshin
 
Сложности микробенчмаркинга
Сложности микробенчмаркингаСложности микробенчмаркинга
Сложности микробенчмаркингаAndrey Akinshin
 
Поговорим про память
Поговорим про памятьПоговорим про память
Поговорим про памятьAndrey Akinshin
 
Кроссплатформенный .NET и как там дела с Mono и CoreCLR
Кроссплатформенный .NET и как там дела с Mono и CoreCLRКроссплатформенный .NET и как там дела с Mono и CoreCLR
Кроссплатформенный .NET и как там дела с Mono и CoreCLRAndrey Akinshin
 
Теория и практика .NET-бенчмаркинга (25.01.2017, Москва)
 Теория и практика .NET-бенчмаркинга (25.01.2017, Москва) Теория и практика .NET-бенчмаркинга (25.01.2017, Москва)
Теория и практика .NET-бенчмаркинга (25.01.2017, Москва)Andrey Akinshin
 
Продолжаем говорить про арифметику
Продолжаем говорить про арифметикуПродолжаем говорить про арифметику
Продолжаем говорить про арифметикуAndrey Akinshin
 
Теория и практика .NET-бенчмаркинга (02.11.2016, Екатеринбург)
Теория и практика .NET-бенчмаркинга (02.11.2016, Екатеринбург)Теория и практика .NET-бенчмаркинга (02.11.2016, Екатеринбург)
Теория и практика .NET-бенчмаркинга (02.11.2016, Екатеринбург)Andrey Akinshin
 
Поговорим про арифметику
Поговорим про арифметикуПоговорим про арифметику
Поговорим про арифметикуAndrey Akinshin
 
Подружили CLR и JVM в Project Rider
Подружили CLR и JVM в Project RiderПодружили CLR и JVM в Project Rider
Подружили CLR и JVM в Project RiderAndrey Akinshin
 
Что нам готовит грядущий C#7?
Что нам готовит грядущий C#7?Что нам готовит грядущий C#7?
Что нам готовит грядущий C#7?Andrey Akinshin
 
Продолжаем говорить о микрооптимизациях .NET-приложений
Продолжаем говорить о микрооптимизациях .NET-приложенийПродолжаем говорить о микрооптимизациях .NET-приложений
Продолжаем говорить о микрооптимизациях .NET-приложенийAndrey Akinshin
 
Распространённые ошибки оценки производительности .NET-приложений
Распространённые ошибки оценки производительности .NET-приложенийРаспространённые ошибки оценки производительности .NET-приложений
Распространённые ошибки оценки производительности .NET-приложенийAndrey Akinshin
 
Поговорим о микрооптимизациях .NET-приложений
Поговорим о микрооптимизациях .NET-приложенийПоговорим о микрооптимизациях .NET-приложений
Поговорим о микрооптимизациях .NET-приложенийAndrey Akinshin
 
Практические приёмы оптимизации .NET-приложений
Практические приёмы оптимизации .NET-приложенийПрактические приёмы оптимизации .NET-приложений
Практические приёмы оптимизации .NET-приложенийAndrey Akinshin
 
Поговорим о различных версиях .NET
Поговорим о различных версиях .NETПоговорим о различных версиях .NET
Поговорим о различных версиях .NETAndrey Akinshin
 
Низкоуровневые оптимизации .NET-приложений
Низкоуровневые оптимизации .NET-приложенийНизкоуровневые оптимизации .NET-приложений
Низкоуровневые оптимизации .NET-приложенийAndrey Akinshin
 
Основы работы с Git
Основы работы с GitОсновы работы с Git
Основы работы с GitAndrey Akinshin
 
Сборка мусора в .NET
Сборка мусора в .NETСборка мусора в .NET
Сборка мусора в .NETAndrey Akinshin
 
Об особенностях использования значимых типов в .NET
Об особенностях использования значимых типов в .NETОб особенностях использования значимых типов в .NET
Об особенностях использования значимых типов в .NETAndrey Akinshin
 

Plus de Andrey Akinshin (20)

Поговорим про performance-тестирование
Поговорим про performance-тестированиеПоговорим про performance-тестирование
Поговорим про performance-тестирование
 
Сложности performance-тестирования
Сложности performance-тестированияСложности performance-тестирования
Сложности performance-тестирования
 
Сложности микробенчмаркинга
Сложности микробенчмаркингаСложности микробенчмаркинга
Сложности микробенчмаркинга
 
Поговорим про память
Поговорим про памятьПоговорим про память
Поговорим про память
 
Кроссплатформенный .NET и как там дела с Mono и CoreCLR
Кроссплатформенный .NET и как там дела с Mono и CoreCLRКроссплатформенный .NET и как там дела с Mono и CoreCLR
Кроссплатформенный .NET и как там дела с Mono и CoreCLR
 
Теория и практика .NET-бенчмаркинга (25.01.2017, Москва)
 Теория и практика .NET-бенчмаркинга (25.01.2017, Москва) Теория и практика .NET-бенчмаркинга (25.01.2017, Москва)
Теория и практика .NET-бенчмаркинга (25.01.2017, Москва)
 
Продолжаем говорить про арифметику
Продолжаем говорить про арифметикуПродолжаем говорить про арифметику
Продолжаем говорить про арифметику
 
Теория и практика .NET-бенчмаркинга (02.11.2016, Екатеринбург)
Теория и практика .NET-бенчмаркинга (02.11.2016, Екатеринбург)Теория и практика .NET-бенчмаркинга (02.11.2016, Екатеринбург)
Теория и практика .NET-бенчмаркинга (02.11.2016, Екатеринбург)
 
Поговорим про арифметику
Поговорим про арифметикуПоговорим про арифметику
Поговорим про арифметику
 
Подружили CLR и JVM в Project Rider
Подружили CLR и JVM в Project RiderПодружили CLR и JVM в Project Rider
Подружили CLR и JVM в Project Rider
 
Что нам готовит грядущий C#7?
Что нам готовит грядущий C#7?Что нам готовит грядущий C#7?
Что нам готовит грядущий C#7?
 
Продолжаем говорить о микрооптимизациях .NET-приложений
Продолжаем говорить о микрооптимизациях .NET-приложенийПродолжаем говорить о микрооптимизациях .NET-приложений
Продолжаем говорить о микрооптимизациях .NET-приложений
 
Распространённые ошибки оценки производительности .NET-приложений
Распространённые ошибки оценки производительности .NET-приложенийРаспространённые ошибки оценки производительности .NET-приложений
Распространённые ошибки оценки производительности .NET-приложений
 
Поговорим о микрооптимизациях .NET-приложений
Поговорим о микрооптимизациях .NET-приложенийПоговорим о микрооптимизациях .NET-приложений
Поговорим о микрооптимизациях .NET-приложений
 
Практические приёмы оптимизации .NET-приложений
Практические приёмы оптимизации .NET-приложенийПрактические приёмы оптимизации .NET-приложений
Практические приёмы оптимизации .NET-приложений
 
Поговорим о различных версиях .NET
Поговорим о различных версиях .NETПоговорим о различных версиях .NET
Поговорим о различных версиях .NET
 
Низкоуровневые оптимизации .NET-приложений
Низкоуровневые оптимизации .NET-приложенийНизкоуровневые оптимизации .NET-приложений
Низкоуровневые оптимизации .NET-приложений
 
Основы работы с Git
Основы работы с GitОсновы работы с Git
Основы работы с Git
 
Сборка мусора в .NET
Сборка мусора в .NETСборка мусора в .NET
Сборка мусора в .NET
 
Об особенностях использования значимых типов в .NET
Об особенностях использования значимых типов в .NETОб особенностях использования значимых типов в .NET
Об особенностях использования значимых типов в .NET
 

Dernier

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Dernier (20)

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Let’s talk about microbenchmarking

  • 1. Let’s talk about microbenchmarking Andrey Akinshin, JetBrains DotNext 2016 Helsinki 1/29
  • 3. Today’s environment • BenchmarkDotNet v0.10.1 • Haswell Core i7-4702MQ CPU 2.20GHz, Windows 10 • .NET Framework 4.6.2 + clrjit/compatjit-v4.6.1586.0 • Source code: https://git.io/v1RRX 3/29
  • 4. Today’s environment • BenchmarkDotNet v0.10.1 • Haswell Core i7-4702MQ CPU 2.20GHz, Windows 10 • .NET Framework 4.6.2 + clrjit/compatjit-v4.6.1586.0 • Source code: https://git.io/v1RRX Other environments: C# compiler old csc / Roslyn CLR CLR2 / CLR4 / CoreCLR / Mono OS Windows / Linux / MacOS / FreeBSD JIT LegacyJIT-x86 / LegacyJIT-x64 / RyuJIT-x64 GC MS (different CLRs) / Mono (Boehm/Sgen) Toolchain JIT / NGen / .NET Native Hardware ∞ different configurations . . . . . . And don’t forget about multiple versions 3/29
  • 5. Count of iterations A bad benchmark // Resolution (Stopwatch) = 466 ns // Latency (Stopwatch) = 18 ns var sw = Stopwatch.StartNew(); Foo(); // 100 ns sw.Stop(); WriteLine(sw.ElapsedMilliseconds); 4/29
  • 6. Count of iterations A bad benchmark // Resolution (Stopwatch) = 466 ns // Latency (Stopwatch) = 18 ns var sw = Stopwatch.StartNew(); Foo(); // 100 ns sw.Stop(); WriteLine(sw.ElapsedMilliseconds); A better benchmark var sw = Stopwatch.StartNew(); for (int i = 0; i < N; i++) // (N * 100 + eps) ns Foo(); // 100 ns sw.Stop(); var total = sw.ElapsedTicks / Stopwatch.Frequency; WriteLine(total / N); 4/29
  • 7. Several launches Run 01 : 529.8674 ns/op Run 02 : 532.7541 ns/op Run 03 : 558.7448 ns/op Run 04 : 555.6647 ns/op Run 05 : 539.6401 ns/op Run 06 : 539.3494 ns/op Run 07 : 564.3222 ns/op Run 08 : 551.9544 ns/op Run 09 : 550.1608 ns/op Run 10 : 533.0634 ns/op 5/29
  • 9. A simple case Central limit theorem to the rescue! 7/29
  • 11. Latencies Event Latency 1 CPU cycle 0.3 ns Level 1 cache access 0.9 ns Level 2 cache access 2.8 ns Level 3 cache access 12.9 ns Main memory access 120 ns Solid-state disk I/O 50-150 µs Rotational disk I/O 1-10 ms Internet: SF to NYC 40 ms Internet: SF to UK 81 ms Internet: SF to Australia 183 ms OS virtualization reboot 4 sec Hardware virtualization reboot 40 sec Physical system reboot 5 min © Systems Performance: Enterprise and the Cloud 9/29
  • 12. Sum of elements const int N = 1024; int[,] a = new int[N, N]; [Benchmark] public double SumIJ() { var sum = 0; for (int i = 0; i < N; i++) for (int j = 0; j < N; j++) sum += a[i, j]; return sum; } [Benchmark] public double SumJI() { var sum = 0; for (int j = 0; j < N; j++) for (int i = 0; i < N; i++) sum += a[i, j]; return sum; } 10/29
  • 13. Sum of elements const int N = 1024; int[,] a = new int[N, N]; [Benchmark] public double SumIJ() { var sum = 0; for (int i = 0; i < N; i++) for (int j = 0; j < N; j++) sum += a[i, j]; return sum; } [Benchmark] public double SumJI() { var sum = 0; for (int j = 0; j < N; j++) for (int i = 0; i < N; i++) sum += a[i, j]; return sum; } CPU cache effect: SumIJ SumJI LegacyJIT-x86 ≈1.3ms ≈4.0ms 10/29
  • 14. Cache-sensitive benchmarks Let’s run a benchmark several times: int[] x = new int[128 * 1024 * 1024]; for (int iter = 0; iter < 5; iter++) { var sw = Stopwatch.StartNew(); for (int i = 0; i < x.Length; i += 16) x[i]++; sw.Stop(); WriteLine(sw.ElapsedMilliseconds); } 11/29
  • 15. Cache-sensitive benchmarks Let’s run a benchmark several times: int[] x = new int[128 * 1024 * 1024]; for (int iter = 0; iter < 5; iter++) { var sw = Stopwatch.StartNew(); for (int i = 0; i < x.Length; i += 16) x[i]++; sw.Stop(); WriteLine(sw.ElapsedMilliseconds); } 176 // not warmed 81 // still not warmed 62 // the steady state 62 // the steady state 62 // the steady state Warmup is not only about .NET 11/29
  • 16. Branch prediction const int N = 32767; int[] sorted, unsorted; // random numbers [0..255] private static int Sum(int[] data) { int sum = 0; for (int i = 0; i < N; i++) if (data[i] >= 128) sum += data[i]; return sum; } [Benchmark] public int Sorted() { return Sum(sorted); } [Benchmark] public int Unsorted() { return Sum(unsorted); } 12/29
  • 17. Branch prediction const int N = 32767; int[] sorted, unsorted; // random numbers [0..255] private static int Sum(int[] data) { int sum = 0; for (int i = 0; i < N; i++) if (data[i] >= 128) sum += data[i]; return sum; } [Benchmark] public int Sorted() { return Sum(sorted); } [Benchmark] public int Unsorted() { return Sum(unsorted); } Sorted Unsorted LegacyJIT-x86 ≈20µs ≈139µs 12/29
  • 18. Isolation A bad benchmark var sw1 = Stopwatch.StartNew(); Foo(); sw1.Stop(); var sw2 = Stopwatch.StartNew(); Bar(); sw2.Stop(); 13/29
  • 19. Isolation A bad benchmark var sw1 = Stopwatch.StartNew(); Foo(); sw1.Stop(); var sw2 = Stopwatch.StartNew(); Bar(); sw2.Stop(); In general case, you should run each benchmark in his own process. Remember about: • Interface method dispatch • Garbage collector and autotuning • Conditional jitting 13/29
  • 20. Interface method dispatch private interface IInc { double Inc(double x); } private class Foo : IInc { double Inc(double x) => x + 1; } private class Bar : IInc { double Inc(double x) => x + 1; } private double Run(IInc inc) { double sum = 0; for (int i = 0; i < 1001; i++) sum += inc.Inc(0); return sum; } // Which method is faster? [Benchmark] public double FooFoo() { var foo1 = new Foo(); var foo2 = new Foo(); return Run(foo1) + Run(foo2); } [Benchmark] public double FooBar() { var foo = new Foo(); var bar = new Bar(); return Run(foo) + Run(bar); } 14/29
  • 21. Interface method dispatch private interface IInc { double Inc(double x); } private class Foo : IInc { double Inc(double x) => x + 1; } private class Bar : IInc { double Inc(double x) => x + 1; } private double Run(IInc inc) { double sum = 0; for (int i = 0; i < 1001; i++) sum += inc.Inc(0); return sum; } // Which method is faster? [Benchmark] public double FooFoo() { var foo1 = new Foo(); var foo2 = new Foo(); return Run(foo1) + Run(foo2); } [Benchmark] public double FooBar() { var foo = new Foo(); var bar = new Bar(); return Run(foo) + Run(bar); } FooFoo FooBar LegacyJIT-x64 ≈5.4µs ≈7.1µs 14/29
  • 22. Tricky inlining [Benchmark] int Calc() => WithoutStarg(0x11) + WithStarg(0x12); int WithoutStarg(int value) => value; int WithStarg(int value) { if (value < 0) value = -value; return value; } 15/29
  • 23. Tricky inlining [Benchmark] int Calc() => WithoutStarg(0x11) + WithStarg(0x12); int WithoutStarg(int value) => value; int WithStarg(int value) { if (value < 0) value = -value; return value; } LegacyJIT-x86 LegacyJIT-x64 RyuJIT-x64 ≈1.7ns 0 ≈1.7ns 15/29
  • 24. Tricky inlining [Benchmark] int Calc() => WithoutStarg(0x11) + WithStarg(0x12); int WithoutStarg(int value) => value; int WithStarg(int value) { if (value < 0) value = -value; return value; } LegacyJIT-x86 LegacyJIT-x64 RyuJIT-x64 ≈1.7ns 0 ≈1.7ns ; LegacyJIT-x64 : Inlining succeeded mov ecx,23h ret 15/29
  • 25. Tricky inlining [Benchmark] int Calc() => WithoutStarg(0x11) + WithStarg(0x12); int WithoutStarg(int value) => value; int WithStarg(int value) { if (value < 0) value = -value; return value; } LegacyJIT-x86 LegacyJIT-x64 RyuJIT-x64 ≈1.7ns 0 ≈1.7ns ; LegacyJIT-x64 : Inlining succeeded mov ecx,23h ret // RyuJIT-x64 : Inlining failed // Inline expansion aborted due to opcode // [06] OP_starg.s in method // Program:WithStarg(int):int:this 15/29
  • 26. SIMD struct MyVector // Copy-pasted from System.Numerics.Vector4 { public float X, Y, Z, W; public MyVector(float x, float y, float z, float w) { X = x; Y = y; Z = z; W = w; } [MethodImpl(MethodImplOptions.AggressiveInlining)] public static MyVector operator *(MyVector left, MyVector right) { return new MyVector(left.X * right.X, left.Y * right.Y, left.Z * right.Z, left.W * right.W); } } Vector4 vector1, vector2, vector3; MyVector myVector1, myVector2, myVector3; [Benchmark] void MyMul() => myVector3 = myVector1 * myVector2; [Benchmark] void BclMul() => vector3 = vector1 * vector2; 16/29
  • 27. SIMD struct MyVector // Copy-pasted from System.Numerics.Vector4 { public float X, Y, Z, W; public MyVector(float x, float y, float z, float w) { X = x; Y = y; Z = z; W = w; } [MethodImpl(MethodImplOptions.AggressiveInlining)] public static MyVector operator *(MyVector left, MyVector right) { return new MyVector(left.X * right.X, left.Y * right.Y, left.Z * right.Z, left.W * right.W); } } Vector4 vector1, vector2, vector3; MyVector myVector1, myVector2, myVector3; [Benchmark] void MyMul() => myVector3 = myVector1 * myVector2; [Benchmark] void BclMul() => vector3 = vector1 * vector2; LegacyJIT-x64 RyuJIT-x64 MyMul ≈12.9ns ≈2.5ns BclMul ≈12.9ns ≈0.2ns 16/29
  • 28. How so? LegacyJIT-x64 RyuJIT-x64 MyMul ≈12.9ns ≈2.5ns BclMul ≈12.9ns ≈0.2ns ; LegacyJIT-x64 ; MyMul, BclMul: Na¨ıve SSE ; ... movss xmm3,dword ptr [rsp+40h] mulss xmm3,dword ptr [rsp+30h] movss xmm2,dword ptr [rsp+44h] mulss xmm2,dword ptr [rsp+34h] movss xmm1,dword ptr [rsp+48h] mulss xmm1,dword ptr [rsp+38h] movss xmm0,dword ptr [rsp+4Ch] mulss xmm0,dword ptr [rsp+3Ch] xor eax,eax mov qword ptr [rsp],rax mov qword ptr [rsp+8],rax lea rax,[rsp] movss dword ptr [rax],xmm3 movss dword ptr [rax+4],xmm2 ; ... ; RyuJIT-x64 ; MyMul: Na¨ıve AVX ; ... vmulss xmm0,xmm0,xmm4 vmulss xmm1,xmm1,xmm5 vmulss xmm2,xmm2,xmm6 vmulss xmm3,xmm3,xmm7 ; ... ; BclMul: Smart AVX intrinsic vmovupd xmm0,xmmword ptr [rcx+8] vmovupd xmm1,xmmword ptr [rcx+18h] vmulps xmm0,xmm0,xmm1 vmovupd xmmword ptr [rcx+28h],xmm0 17/29
  • 29. Let’s calculate some square roots [Benchmark] double Sqrt13() => Math.Sqrt(1) + Math.Sqrt(2) + Math.Sqrt(3) + /* ... */ + Math.Sqrt(13); VS [Benchmark] double Sqrt14() => Math.Sqrt(1) + Math.Sqrt(2) + Math.Sqrt(3) + /* ... */ + Math.Sqrt(13) + Math.Sqrt(14); 18/29
  • 30. Let’s calculate some square roots [Benchmark] double Sqrt13() => Math.Sqrt(1) + Math.Sqrt(2) + Math.Sqrt(3) + /* ... */ + Math.Sqrt(13); VS [Benchmark] double Sqrt14() => Math.Sqrt(1) + Math.Sqrt(2) + Math.Sqrt(3) + /* ... */ + Math.Sqrt(13) + Math.Sqrt(14); RyuJIT-x64∗ Sqrt13 ≈91ns Sqrt14 0 ns ∗Can be changed in future versions, see github.com/dotnet/coreclr/issues/987 18/29
  • 31. How so? RyuJIT-x64, Sqrt13 vsqrtsd xmm0,xmm0,mmword ptr [7FF94F9E4D28h] vsqrtsd xmm1,xmm0,mmword ptr [7FF94F9E4D30h] vaddsd xmm0,xmm0,xmm1 vsqrtsd xmm1,xmm0,mmword ptr [7FF94F9E4D38h] vaddsd xmm0,xmm0,xmm1 vsqrtsd xmm1,xmm0,mmword ptr [7FF94F9E4D40h] vaddsd xmm0,xmm0,xmm1 ; A lot of vqrtsd and vaddsd instructions ; ... vsqrtsd xmm1,xmm0,mmword ptr [7FF94F9E4D88h] vaddsd xmm0,xmm0,xmm1 ret RyuJIT-x64, Sqrt14 vmovsd xmm0,qword ptr [7FF94F9C4C80h] ; Const ret 19/29
  • 32. How so? Big expression tree * stmtExpr void (top level) (IL 0x000... ???) | /--* mathFN double sqrt | | --* dconst double 13.000000000000000 | /--* + double | | | /--* mathFN double sqrt | | | | --* dconst double 12.000000000000000 | | --* + double | | | /--* mathFN double sqrt | | | | --* dconst double 11.000000000000000 | | --* + double | | | /--* mathFN double sqrt | | | | --* dconst double 10.000000000000000 | | --* + double | | | /--* mathFN double sqrt | | | | --* dconst double 9.0000000000000000 | | --* + double | | | /--* mathFN double sqrt | | | | --* dconst double 8.0000000000000000 | | --* + double | | | /--* mathFN double sqrt | | | | --* dconst double 7.0000000000000000 | | --* + double | | | /--* mathFN double sqrt | | | | --* dconst double 6.0000000000000000 | | --* + double | | | /--* mathFN double sqrt | | | | --* dconst double 5.0000000000000000 // ... 20/29
  • 33. How so? Constant folding in action N001 [000001] dconst 1.0000000000000000 => $c0 {DblCns[1.000000]} N002 [000002] mathFN => $c0 {DblCns[1.000000]} N003 [000003] dconst 2.0000000000000000 => $c1 {DblCns[2.000000]} N004 [000004] mathFN => $c2 {DblCns[1.414214]} N005 [000005] + => $c3 {DblCns[2.414214]} N006 [000006] dconst 3.0000000000000000 => $c4 {DblCns[3.000000]} N007 [000007] mathFN => $c5 {DblCns[1.732051]} N008 [000008] + => $c6 {DblCns[4.146264]} N009 [000009] dconst 4.0000000000000000 => $c7 {DblCns[4.000000]} N010 [000010] mathFN => $c1 {DblCns[2.000000]} N011 [000011] + => $c8 {DblCns[6.146264]} N012 [000012] dconst 5.0000000000000000 => $c9 {DblCns[5.000000]} N013 [000013] mathFN => $ca {DblCns[2.236068]} N014 [000014] + => $cb {DblCns[8.382332]} N015 [000015] dconst 6.0000000000000000 => $cc {DblCns[6.000000]} N016 [000016] mathFN => $cd {DblCns[2.449490]} N017 [000017] + => $ce {DblCns[10.831822]} N018 [000018] dconst 7.0000000000000000 => $cf {DblCns[7.000000]} N019 [000019] mathFN => $d0 {DblCns[2.645751]} N020 [000020] + => $d1 {DblCns[13.477573]} ... 21/29
  • 37. False sharing in action // It's an extremely na¨ıve benchmark // Don't try this at home int[] x = new int[1024]; void Inc(int p) { for (int i = 0; i < 10000001; i++) x[p]++; } void Run(int step) { var sw = Stopwatch.StartNew(); Task.WaitAll( Task.Factory.StartNew(() => Inc(0 * step)), Task.Factory.StartNew(() => Inc(1 * step)), Task.Factory.StartNew(() => Inc(2 * step)), Task.Factory.StartNew(() => Inc(3 * step))); WriteLine(sw.ElapsedMilliseconds); } 25/29
  • 38. False sharing in action // It's an extremely na¨ıve benchmark // Don't try this at home int[] x = new int[1024]; void Inc(int p) { for (int i = 0; i < 10000001; i++) x[p]++; } void Run(int step) { var sw = Stopwatch.StartNew(); Task.WaitAll( Task.Factory.StartNew(() => Inc(0 * step)), Task.Factory.StartNew(() => Inc(1 * step)), Task.Factory.StartNew(() => Inc(2 * step)), Task.Factory.StartNew(() => Inc(3 * step))); WriteLine(sw.ElapsedMilliseconds); } Run(1) Run(256) ≈400ms ≈150ms 25/29
  • 39. Conclusion: Benchmarking is hard Anon et al., “A Measure of Transaction Processing Power” There are lies, damn lies and then there are performance measures. 26/29