Benchmarking: You’re Doing It Wrong
Aysylu Greenberg
@aysylu22

To Write Good Benchmarks…
Need to be Full Stack

Benchmark = How Fast?
your process vs Goal
your process vs Best Practices

Today
• How Not to Write Benchmarks
• Benchmark Setup & Results:
  - You’re wrong about machines
  - You’re wrong about stats
  - You’re wrong about what matters
• Becoming Less Wrong
• Having Fun with Riak

HOW NOT TO WRITE BENCHMARKS

Website Serving Images
• Access 1 image 1000 times
• Latency measured for each access
• Start measuring immediately
• 3 runs
• Find mean
• Dev environment
[Diagram: Web Request, Server, Cache, S3]

WHAT’S WRONG WITH THIS BENCHMARK?

YOU’RE WRONG ABOUT THE MACHINE

Wrong About the Machine
• Cache, cache, cache, cache!

It’s Caches All The Way Down
[Diagram: Web Request, Server, Cache, S3]

Caches in Benchmarks
[Figures shown across five slides. Credit: Prof. Saman Amarasinghe, MIT 2009]

Wrong About the Machine
• Cache, cache, cache, cache!
• Warmup & timing
• Periodic interference
• Test != Prod
• Power mode changes
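
A minimal sketch of the warmup and timing point, written in Python (the deck names no language, so that choice is an assumption). The helper `measure_latencies` and its `warmup` and `samples` parameters are hypothetical names. The only idea shown is that the first calls run unrecorded, so caches and lazy initialisation can settle before any latency is kept.

```python
import time
from typing import Callable, List


def measure_latencies(op: Callable[[], object],
                      warmup: int = 100,
                      samples: int = 1000) -> List[float]:
    """Return per-call latencies in seconds, measured after a warmup phase."""
    # Warmup: run the operation without recording anything.
    for _ in range(warmup):
        op()

    latencies = []
    for _ in range(samples):
        start = time.perf_counter()  # monotonic, high-resolution clock
        op()
        latencies.append(time.perf_counter() - start)
    return latencies


if __name__ == "__main__":
    # Stand-in workload (hypothetical); a real run would fetch the image.
    results = measure_latencies(lambda: sum(range(1000)), warmup=50, samples=200)
    print(f"collected {len(results)} samples")
```

The warmup count is itself a guess that needs checking: if the first recorded latencies are still trending downward, the warmup was too short.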

YOU’RE WRONG ABOUT THE STATS

Wrong About Stats
• Too few samples

Wrong About Stats
[Figure: Convergence of Median on Samples. Axis labels: Latency, Time; tick ranges 0 to 120 and 0 to 60. Series: Stable Samples, Stable Median, Decaying Samples, Decaying Median]
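
One way to read the "Convergence of Median on Samples" slide is as a running median that either settles or keeps drifting. The sketch below is an illustration under that assumption, not the author's method: `running_medians` and `is_stable` are hypothetical helpers, the `window` and `tolerance` defaults are arbitrary, and both sample series are made up.

```python
from statistics import median
from typing import List, Sequence


def running_medians(samples: Sequence[float]) -> List[float]:
    """Median of the first 1, 2, ..., n samples."""
    return [median(samples[: i + 1]) for i in range(len(samples))]


def is_stable(medians: Sequence[float],
              window: int = 10,
              tolerance: float = 0.05) -> bool:
    """True if the last `window` running medians all lie within a relative
    `tolerance` of the final running median."""
    if len(medians) < window:
        return False
    final = medians[-1]
    return all(abs(m - final) <= tolerance * abs(final)
               for m in medians[-window:])


if __name__ == "__main__":
    # Made-up data: one series hovers around 100, the other keeps decaying.
    stable = [100 + (i % 3 - 1) for i in range(60)]
    decaying = [120 - 2 * i for i in range(60)]
    print("stable series converged:  ", is_stable(running_medians(stable)))
    print("decaying series converged:", is_stable(running_medians(decaying)))
```

If the running median is still moving when sampling stops, more samples (or a longer warmup) are probably needed before the number means anything.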

Multimodal Distribution
[Figure: # occurrences plotted against Latency, with markers at 50% and 99% and latency labels at 5 ms and 10 ms]

Wrong About Stats
• Too few samples
• Gaussian (not)
• Multimodal distribution
• Outliers
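
A small sketch of why a single mean can mislead for non-Gaussian, multimodal latencies with outliers. The data is invented to echo the 5 ms and 10 ms modes on the slide, and `summarize` is a hypothetical helper. It uses `statistics.quantiles` from the Python standard library (3.8+) with the `inclusive` method.

```python
from statistics import mean, median, quantiles
from typing import Dict, Sequence


def summarize(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Report several summary statistics instead of the mean alone."""
    # quantiles(..., n=100) returns 99 cut points; index 98 is the 99th percentile.
    cuts = quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "mean": mean(latencies_ms),
        "median": median(latencies_ms),
        "p99": cuts[98],
        "max": max(latencies_ms),
    }


if __name__ == "__main__":
    # Invented bimodal sample: 60 requests at 5 ms, 40 at 10 ms.
    bimodal = [5.0] * 60 + [10.0] * 40
    print(summarize(bimodal))
    # The same sample with one slow outlier added.
    print(summarize(bimodal + [500.0]))
```

On the invented sample the mean is 7 ms, a latency that no request actually had, and a single 500 ms outlier moves the mean while leaving the median at 5 ms. Reporting the median together with high percentiles and the maximum keeps both modes and the tail visible.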

YOU’RE WRONG ABOUT WHAT MATTERS

Wrong About What Matters
• Premature optimization

“Programmers waste enormous amounts of time thinking about … the speed of noncritical parts of their programs ... Forget about small efficiencies … 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”
-- Donald Knuth

Wrong About What Matters
• Premature optimization
• Unrepresentative workloads
• Memory pressure
• Load balancing
• Reproducibility of measurements

BECOMING LESS WRONG

User Actions Matter
X > Y for workload Z with trade offs A, B, and C
- http://www.toomuchcode.org/

Profiling
Code instrumentation
Aggregate over logs
Traces
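
As a rough illustration of code instrumentation followed by aggregation, the sketch below wraps sections of code in a timing context manager and sums the results per label. The names `timed`, `timings`, and `report` are hypothetical, and a real system would more likely emit log lines or traces and aggregate them offline.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, List

# Label -> list of elapsed times in seconds (an in-memory stand-in for logs).
timings: Dict[str, List[float]] = defaultdict(list)


@contextmanager
def timed(label: str):
    """Record how long the enclosed block takes under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label].append(time.perf_counter() - start)


def report() -> Dict[str, Dict[str, float]]:
    """Aggregate the recorded timings per label."""
    return {label: {"count": len(values), "total_s": sum(values)}
            for label, values in timings.items()}


if __name__ == "__main__":
    for _ in range(3):
        with timed("parse"):
            sum(range(10_000))  # stand-in workload (hypothetical)
    with timed("render"):
        sorted(range(10_000), reverse=True)
    print(report())
```

Aggregating per label shows where time actually goes before any microbenchmark is written, which is the point of profiling first.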

Microbenchmarking: Blessing & Curse
+ Quick & cheap
+ Answers narrow questions well
- Often misleading results
- Not representative of the program

Microbenchmarking: Blessing & Curse
• Choose your N wisely

Choose Your N Wisely
[Figure. Credit: Prof. Saman Amarasinghe, MIT 2009]

Microbenchmarking: Blessing & Curse
• Choose your N wisely
• Measure side effects
• Beware of clock resolution
• Dead Code Elimination
• Constant work per iteration
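
A hedged sketch of the clock resolution point: if one call finishes faster than the clock can resolve, timing it alone yields mostly noise, so the call is repeated in a batch and the total is divided. `time.get_clock_info` and `time.perf_counter` are standard-library calls. `choose_batch`, `elapsed_for`, and the `min_elapsed` threshold are hypothetical, and the threshold is an arbitrary choice.

```python
import time
from typing import Callable


def clock_resolution(clock: str = "perf_counter") -> float:
    """Resolution, in seconds, that the platform reports for the clock."""
    return time.get_clock_info(clock).resolution


def elapsed_for(op: Callable[[], object], batch: int) -> float:
    """Total wall time, in seconds, for `batch` back-to-back calls of `op`."""
    start = time.perf_counter()
    for _ in range(batch):
        op()
    return time.perf_counter() - start


def choose_batch(op: Callable[[], object],
                 min_elapsed: float = 0.001,
                 max_batch: int = 1 << 20) -> int:
    """Double the batch size until one batch takes at least `min_elapsed`."""
    batch = 1
    while batch < max_batch and elapsed_for(op, batch) < min_elapsed:
        batch *= 2
    return batch


if __name__ == "__main__":
    op = lambda: sum(range(100))  # stand-in workload (hypothetical)
    batch = choose_batch(op)
    per_call = elapsed_for(op, batch) / batch
    print(f"resolution={clock_resolution():.1e}s batch={batch} per_call={per_call:.2e}s")
```

The batch average includes loop overhead and hides per-call variation, so it answers "how long on average" rather than "what does the tail look like".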

Non-Constant Work Per Iteration
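
Only the title of this slide survives in the transcript, so the following is an assumed illustration of the problem rather than the example the talk used. In `growing_work`, state carries over between iterations and each one processes more data than the last, so averaging the iteration times mixes different amounts of work. `constant_work` rebuilds an input of the same size each time. Both function names are hypothetical.

```python
from typing import List


def growing_work(iterations: int) -> List[int]:
    """Anti-pattern: the list grows, so iteration i sorts i + 1 items."""
    data: List[int] = []
    items_sorted = []
    for i in range(iterations):
        data.append(i)
        items_sorted.append(len(sorted(data)))
    return items_sorted


def constant_work(iterations: int, size: int = 100) -> List[int]:
    """Each iteration sorts a fresh input of the same size."""
    items_sorted = []
    for _ in range(iterations):
        data = list(range(size, 0, -1))  # rebuilt every time
        items_sorted.append(len(sorted(data)))
    return items_sorted


if __name__ == "__main__":
    print("growing: ", growing_work(5))
    print("constant:", constant_work(5, size=5))
```

Counting the items handled per iteration, as done here, is a cheap way to confirm that the work is constant before trusting any timing taken around the loop body.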

Follow-up Material
• How NOT to Measure Latency by Gil Tene
  - http://www.infoq.com/presentations/latency-pitfalls
• Taming the Long Latency Tail on highscalability.com
  - http://highscalability.com/blog/2012/3/12/google-taming-the-long-latency-tail-when-more-machines-equal.html
• Performance Analysis Methodology by Brendan Gregg
  - http://www.brendangregg.com/methodology.html
• Silverman’s Mode Detection Method by Matt Adereth
  - http://adereth.github.io/blog/2014/10/12/silvermans-mode-detection-method-explained/

HAVING FUN WITH RIAK

Setup
• SSD 30 GB
• M3 large
• Riak version 1.4.2-0-g61ac9d8
• Ubuntu 12.04.5 LTS
• 4 byte keys, 10 KB values

Get Latency
[Figure: Latency (usec), ticks from 1850 to 2350, against Number of Keys; annotation: L3]

Takeaway #1: Cache
Takeaway #2: Outliers
Takeaway #3: Workload

Benchmarking (RICON 2014)
