SlideShare une entreprise Scribd logo
1  sur  36
Télécharger pour lire hors ligne
Use of a Levy Distribution for Modeling 
Best Case Execution Time Variation 
Jonathan Beard, Roger Chamberlain 
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Work also supported by: 
1
Outline 
• Motivation! 
• Stream Processing! 
• Optimization Goals! 
• Methodology! 
• Distributions! 
• Results 
2
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Streaming Computing 
3
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Streaming Computing 
Kernel 
3
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Streaming Computing 
Kernel 1 
Kernel 2 
Kernel 3 
Kernel 2 
Stream 
Stream 
Stream 
Stream 
4
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Streaming Languages 
StreamIt, Auto-Pipe, Brook, Cg, S-Net, 
Scala-Pipe, Streams-C and 
many others 
5
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Optimization 
Slow 
Fast Kernel 
Super Fast 
Medium 
6
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Optimization 
Kernel 1 
Kernel 2 
Kernel 3 
Kernel 2 
multi-core A 
1 2 
3 4 
multi-core B 
1 2 
3 4 
More allocation choices, 
NUMA node A or B to 
allocate stream. 
7
1 2 
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Optimization 
Kernel 1 
Kernel 2 
Kernel 3 
Kernel 2 
multi-core A 
1 2 
3 4 
multi-core B 
1 2 
3 4 
More allocation choices, 
NUMA node A or B to 
allocate stream. 
7
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Optimization 
Kernel 1 
Kernel 2 
Kernel 3 
Kernel 2 
multi-core A 
1 2 
3 4 
multi-core B 
1 2 
3 4 
More allocation choices, 
NUMA node A or B to 
allocate stream. 
1 2 
7
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Optimization 
A B C 
“Stream” is modeled as a Queue 
A Q1 B Q2 C 
8
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Optimization 
A B C 
“Stream” is modeled as a Queue 
A Q1 B Q2 C 
8
We want good models for streaming systems 
on shared multi-core systems (i.e., a cluster) 
Problem: Accurate measurement is very difficult. Is there 
a way to decide on a model without it. 
• Commodity multi-core timer availability and latency 
• Frequency scaling and core migration 
• Measuring modifies the application behavior 
SBS 
Streaming on Multi-core Systems 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
9
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Derived Information 
Expected Observed 
10
SBS 
Is there a pattern of minimal variation within the 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Derived Information 
Expected Observed 
systems we’re running on? 
Avg. Service Time = E[ X ] + Error 
10
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Goal 
Find a distribution that characterizes 
the minimum expected variation of a 
hardware and software system 
Use this characterization to 
accept or reject models 
11
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Process 
12 
• Measurement! 
• Workload definition! 
• Find a distribution! 
• Utilize the distribution to aid model selection
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Timer Mechanism 
Timer Thread Code 
13 
Ask for Time 
Receive Time
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Timer Mechanism 
Timer Thread 
rdtsc clock_gettime 
14 
• x86 assembly 
• varying methods 
to serialize 
• relatively fast 
• multiple drift 
issues 
• POSIX standard 
• relatively accurate 
• portable 
• slower than rdtsc
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Two Timing Choices 
15
SBS 
NUMA Node Variations 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
16
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Minimize Variation 
• Restricting timer to single core 
! 
• Use the x86 rdtsc instruction with processor 
recommended serializers for each processor 
type 
! 
• Keeping processes under test on the same 
NUMA node as timer 
! 
• Run timer thread with altered priority to 
minimize core context swaps 
17
SBS 
Best Case Execution Time Variation 
• no-op instruction implemented in most processors 
! 
• usually takes exactly 1 cycle 
! 
• no real functional units are involved, so least 
taxing 
! 
• variation observed in execution time should be 
external to process 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
18
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Data Collection 
• no-op loops calibrated for various nominal 
times, tied to a single core and run 
thousands of times 
! 
• Execution time measured end to end for 
each run, environment collected 
! 
• Parameters include: 
Number of processes executing on core 
Number of context swaps (voluntary, 
involuntary) 
Many others 
19
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Levy Distribution 
20 
Execution Time Error 
( obs - mean )
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Levy Distribution 
21 
Normal Distribution
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Levy Distribution 
22 
Gumbel Distribution
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Levy Distribution 
23 
Levy Distribution
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Levy Distribution 
23 
Levy Distribution
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Levy Distribution 
• Truncation enables mean calculation, but 
requires fitting to each dataset to find where 
to truncate 
! 
• The truncation parameters are correlated to 
both the number of processes per core and 
the expected execution time 
! 
• Roughly linear relationship gives an 
approximate solution to truncation 
parameters without refitting 
24
1 - 5 processes 6 - 10 processes 
!0.000014 
!0.0000145 
!0.000015 
!0.0000155 
SBS 
11 - 15 processes 16 - 20 processes 
!0.00002 
!0.000025 
!0.00003 
!0.000035 
!0.00004 
!0.000045 
!0.00005 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Levy Fit 
!0.0000125 
!0.000013 
!0.0000135 
!0.000014 
!0.00001 
!0.000015 
!0.00002 
!0.000025 
!0.00003 
25 
!0.000025 !0.00001 
!0.000025 !0.00001 0 
!0.00006 !0.00003 0 
!0.00005 !0.00002 0
Hypothesis: Lower Kullback-Leibler (KL) divergence 
SBS 
Question: Can we use an M/M/1 queueing model to 
estimate the mean queue occupancy of this system? 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Test Setup 
A Q1 B 
! 
between expected and realized distribution is 
associated with higher model accuracy. 
26
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Test Setup 
A Q1 B 
1. Dedicated thread of execution monitors 
27 
queue occupancy 
2. Calculate the estimated mean queue 
occupancy using the M/M/1 model 
3. Calculate KL Divergence for the arrival 
process distribution using the truncated 
Levy distribution noise model
SBS 
Convolution with Exponential 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
28
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Conclusions 
• The truncated Levy distribution can be used to 
approximate BCETV 
! 
• The distribution of BCETV can be used as a tool 
to accept or reject a stochastic queueing model 
based on distributional assumptions 
! 
• KL divergence between the expected and 
convolved distribution highly correlates with 
queue model accuracy 
29
SBS 
Stream Based 
Supercomputing Lab 
http://sbs.wustl.edu 
Parting Notes 
Slides available here: 
sbs.wust.edu 
! 
Timer C++ template code: 
http://goo.gl/ItJ3jP 
! 
Test harness used to collect data: 
http://goo.gl/U1VG6N 
30

Contenu connexe

Tendances

A Performance Comparison of Container-based Virtualization Systems for MapRed...
A Performance Comparison of Container-based Virtualization Systems for MapRed...A Performance Comparison of Container-based Virtualization Systems for MapRed...
A Performance Comparison of Container-based Virtualization Systems for MapRed...
Marcelo Veiga Neves
 
Peformance Evaluation of Container-based Vi
Peformance Evaluation of Container-based ViPeformance Evaluation of Container-based Vi
Peformance Evaluation of Container-based Vi
Miguel Xavier
 

Tendances (20)

Ceph for Big Science - Dan van der Ster
Ceph for Big Science - Dan van der SterCeph for Big Science - Dan van der Ster
Ceph for Big Science - Dan van der Ster
 
Automatic Operation Bot for Ceph - You Ji
Automatic Operation Bot for Ceph - You JiAutomatic Operation Bot for Ceph - You Ji
Automatic Operation Bot for Ceph - You Ji
 
Performance Tuning - Understanding Garbage Collection
Performance Tuning - Understanding Garbage CollectionPerformance Tuning - Understanding Garbage Collection
Performance Tuning - Understanding Garbage Collection
 
Ceph Goes on Online at Qihoo 360 - Xuehan Xu
Ceph Goes on Online at Qihoo 360 - Xuehan XuCeph Goes on Online at Qihoo 360 - Xuehan Xu
Ceph Goes on Online at Qihoo 360 - Xuehan Xu
 
A Performance Comparison of Container-based Virtualization Systems for MapRed...
A Performance Comparison of Container-based Virtualization Systems for MapRed...A Performance Comparison of Container-based Virtualization Systems for MapRed...
A Performance Comparison of Container-based Virtualization Systems for MapRed...
 
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex LauDoing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
 
An example transition to 1687-based mixed-signal DFT
An example transition to  1687-based mixed-signal DFTAn example transition to  1687-based mixed-signal DFT
An example transition to 1687-based mixed-signal DFT
 
A fun cup of joe with open liberty
A fun cup of joe with open libertyA fun cup of joe with open liberty
A fun cup of joe with open liberty
 
Ceph QoS: How to support QoS in distributed storage system - Taewoong Kim
Ceph QoS: How to support QoS in distributed storage system - Taewoong KimCeph QoS: How to support QoS in distributed storage system - Taewoong Kim
Ceph QoS: How to support QoS in distributed storage system - Taewoong Kim
 
Global deduplication for Ceph - Myoungwon Oh
Global deduplication for Ceph - Myoungwon OhGlobal deduplication for Ceph - Myoungwon Oh
Global deduplication for Ceph - Myoungwon Oh
 
Hadoop at Bloomberg:Medium data for the financial industry
Hadoop at Bloomberg:Medium data for the financial industryHadoop at Bloomberg:Medium data for the financial industry
Hadoop at Bloomberg:Medium data for the financial industry
 
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014
 
Bin repacking scheduling in virtualized datacenters
Bin repacking scheduling in virtualized datacentersBin repacking scheduling in virtualized datacenters
Bin repacking scheduling in virtualized datacenters
 
Peformance Evaluation of Container-based Vi
Peformance Evaluation of Container-based ViPeformance Evaluation of Container-based Vi
Peformance Evaluation of Container-based Vi
 
MySQL Head to Head Performance
MySQL Head to Head PerformanceMySQL Head to Head Performance
MySQL Head to Head Performance
 
Accelerating Ceph Performance with High Speed Networks and Protocols - Qingch...
Accelerating Ceph Performance with High Speed Networks and Protocols - Qingch...Accelerating Ceph Performance with High Speed Networks and Protocols - Qingch...
Accelerating Ceph Performance with High Speed Networks and Protocols - Qingch...
 
Kafka reliability velocity 17
Kafka reliability   velocity 17Kafka reliability   velocity 17
Kafka reliability velocity 17
 
Combining Cloud Native & PaaS: Building a Fully Managed Application Platform ...
Combining Cloud Native & PaaS: Building a Fully Managed Application Platform ...Combining Cloud Native & PaaS: Building a Fully Managed Application Platform ...
Combining Cloud Native & PaaS: Building a Fully Managed Application Platform ...
 
MySQL on Ceph
MySQL on CephMySQL on Ceph
MySQL on Ceph
 
Apache hadoop 3.x state of the union and upgrade guidance - Strata 2019 NY
Apache hadoop 3.x state of the union and upgrade guidance - Strata 2019 NYApache hadoop 3.x state of the union and upgrade guidance - Strata 2019 NY
Apache hadoop 3.x state of the union and upgrade guidance - Strata 2019 NY
 

Similaire à Use of a Levy Distribution for Modeling Best Case Execution Time Variation

CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
Farley Lai
 
load-balancing-method-for-embedded-rt-system-20120711-0940
load-balancing-method-for-embedded-rt-system-20120711-0940load-balancing-method-for-embedded-rt-system-20120711-0940
load-balancing-method-for-embedded-rt-system-20120711-0940
Samsung Electronics
 
SOC-CH3.pptSOC ProcessorsSOC Processors Used in SOC Used in SOC
SOC-CH3.pptSOC ProcessorsSOC Processors Used in SOC Used in SOCSOC-CH3.pptSOC ProcessorsSOC Processors Used in SOC Used in SOC
SOC-CH3.pptSOC ProcessorsSOC Processors Used in SOC Used in SOC
SnehaLatha68
 

Similaire à Use of a Levy Distribution for Modeling Best Case Execution Time Variation (20)

Real-Time Inverted Search NYC ASLUG Oct 2014
Real-Time Inverted Search NYC ASLUG Oct 2014Real-Time Inverted Search NYC ASLUG Oct 2014
Real-Time Inverted Search NYC ASLUG Oct 2014
 
Cloud Computing: Safe Haven from the Data Deluge? AGBT 2011
Cloud Computing: Safe Haven from the Data Deluge? AGBT 2011Cloud Computing: Safe Haven from the Data Deluge? AGBT 2011
Cloud Computing: Safe Haven from the Data Deluge? AGBT 2011
 
DevoFlow - Scaling Flow Management for High-Performance Networks
DevoFlow - Scaling Flow Management for High-Performance NetworksDevoFlow - Scaling Flow Management for High-Performance Networks
DevoFlow - Scaling Flow Management for High-Performance Networks
 
Simulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud InfrastructuresSimulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud Infrastructures
 
A Performance Comparison of Container-based Virtualization Systems for MapRed...
A Performance Comparison of Container-based Virtualization Systems for MapRed...A Performance Comparison of Container-based Virtualization Systems for MapRed...
A Performance Comparison of Container-based Virtualization Systems for MapRed...
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
 
Elasticsearch Sharding Strategy at Tubular Labs
Elasticsearch Sharding Strategy at Tubular LabsElasticsearch Sharding Strategy at Tubular Labs
Elasticsearch Sharding Strategy at Tubular Labs
 
AMBA 2.0 REPORT
AMBA 2.0 REPORTAMBA 2.0 REPORT
AMBA 2.0 REPORT
 
Distributed automation selcamp2016
Distributed automation selcamp2016Distributed automation selcamp2016
Distributed automation selcamp2016
 
Training Slides: Intermediate 202: Performing Cluster Maintenance with Zero-D...
Training Slides: Intermediate 202: Performing Cluster Maintenance with Zero-D...Training Slides: Intermediate 202: Performing Cluster Maintenance with Zero-D...
Training Slides: Intermediate 202: Performing Cluster Maintenance with Zero-D...
 
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
 
[AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵
 [AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵 [AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵
[AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵
 
load-balancing-method-for-embedded-rt-system-20120711-0940
load-balancing-method-for-embedded-rt-system-20120711-0940load-balancing-method-for-embedded-rt-system-20120711-0940
load-balancing-method-for-embedded-rt-system-20120711-0940
 
High Speed Design Closure Techniques-Balachander Krishnamurthy
High Speed Design Closure Techniques-Balachander KrishnamurthyHigh Speed Design Closure Techniques-Balachander Krishnamurthy
High Speed Design Closure Techniques-Balachander Krishnamurthy
 
ChIP-seq - Data processing
ChIP-seq - Data processingChIP-seq - Data processing
ChIP-seq - Data processing
 
Pathogen phylogenetics using BEAST
Pathogen phylogenetics using BEASTPathogen phylogenetics using BEAST
Pathogen phylogenetics using BEAST
 
Scan insertion
Scan insertionScan insertion
Scan insertion
 
Springone2gx 2014 Reactive Streams and Reactor
Springone2gx 2014 Reactive Streams and ReactorSpringone2gx 2014 Reactive Streams and Reactor
Springone2gx 2014 Reactive Streams and Reactor
 
Cloud Architecture & Distributed Systems Trivia
Cloud Architecture & Distributed Systems TriviaCloud Architecture & Distributed Systems Trivia
Cloud Architecture & Distributed Systems Trivia
 
SOC-CH3.pptSOC ProcessorsSOC Processors Used in SOC Used in SOC
SOC-CH3.pptSOC ProcessorsSOC Processors Used in SOC Used in SOCSOC-CH3.pptSOC ProcessorsSOC Processors Used in SOC Used in SOC
SOC-CH3.pptSOC ProcessorsSOC Processors Used in SOC Used in SOC
 

Dernier

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Dernier (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 

Use of a Levy Distribution for Modeling Best Case Execution Time Variation

  • 1. Use of a Levy Distribution for Modeling Best Case Execution Time Variation Jonathan Beard, Roger Chamberlain SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Work also supported by: 1
  • 2. Outline • Motivation! • Stream Processing! • Optimization Goals! • Methodology! • Distributions! • Results 2
  • 3. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Streaming Computing 3
  • 4. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Streaming Computing Kernel 3
  • 5. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Streaming Computing Kernel 1 Kernel 2 Kernel 3 Kernel 2 Stream Stream Stream Stream 4
  • 6. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Streaming Languages StreamIt, Auto-Pipe, Brook, Cg, S-Net, Scala-Pipe, Streams-C and many others 5
  • 7. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Optimization Slow Fast Kernel Super Fast Medium 6
  • 8. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Optimization Kernel 1 Kernel 2 Kernel 3 Kernel 2 multi-core A 1 2 3 4 multi-core B 1 2 3 4 More allocation choices, NUMA node A or B to allocate stream. 7
  • 9. 1 2 SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Optimization Kernel 1 Kernel 2 Kernel 3 Kernel 2 multi-core A 1 2 3 4 multi-core B 1 2 3 4 More allocation choices, NUMA node A or B to allocate stream. 7
  • 10. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Optimization Kernel 1 Kernel 2 Kernel 3 Kernel 2 multi-core A 1 2 3 4 multi-core B 1 2 3 4 More allocation choices, NUMA node A or B to allocate stream. 1 2 7
  • 11. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Optimization A B C “Stream” is modeled as a Queue A Q1 B Q2 C 8
  • 12. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Optimization A B C “Stream” is modeled as a Queue A Q1 B Q2 C 8
  • 13. We want good models for streaming systems on shared multi-core systems (i.e., a cluster) Problem: Accurate measurement is very difficult. Is there a way to decide on a model without it. • Commodity multi-core timer availability and latency • Frequency scaling and core migration • Measuring modifies the application behavior SBS Streaming on Multi-core Systems Stream Based Supercomputing Lab http://sbs.wustl.edu 9
  • 14. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Derived Information Expected Observed 10
  • 15. SBS Is there a pattern of minimal variation within the Stream Based Supercomputing Lab http://sbs.wustl.edu Derived Information Expected Observed systems we’re running on? Avg. Service Time = E[ X ] + Error 10
  • 16. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Goal Find a distribution that characterizes the minimum expected variation of a hardware and software system Use this characterization to accept or reject models 11
  • 17. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Process 12 • Measurement! • Workload definition! • Find a distribution! • Utilize the distribution to aid model selection
  • 18. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Timer Mechanism Timer Thread Code 13 Ask for Time Receive Time
  • 19. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Timer Mechanism Timer Thread rdtsc clock_gettime 14 • x86 assembly • varying methods to serialize • relatively fast • multiple drift issues • POSIX standard • relatively accurate • portable • slower than rdtsc
  • 20. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Two Timing Choices 15
  • 21. SBS NUMA Node Variations Stream Based Supercomputing Lab http://sbs.wustl.edu 16
  • 22. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Minimize Variation • Restricting timer to single core ! • Use the x86 rdtsc instruction with processor recommended serializers for each processor type ! • Keeping processes under test on the same NUMA node as timer ! • Run timer thread with altered priority to minimize core context swaps 17
  • 23. SBS Best Case Execution Time Variation • no-op instruction implemented in most processors ! • usually takes exactly 1 cycle ! • no real functional units are involved, so least taxing ! • variation observed in execution time should be external to process Stream Based Supercomputing Lab http://sbs.wustl.edu 18
  • 24. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Data Collection • no-op loops calibrated for various nominal times, tied to a single core and run thousands of times ! • Execution time measured end to end for each run, environment collected ! • Parameters include: Number of processes executing on core Number of context swaps (voluntary, involuntary) Many others 19
  • 25. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Levy Distribution 20 Execution Time Error ( obs - mean )
  • 26. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Levy Distribution 21 Normal Distribution
  • 27. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Levy Distribution 22 Gumbel Distribution
  • 28. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Levy Distribution 23 Levy Distribution
  • 29. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Levy Distribution 23 Levy Distribution
  • 30. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Levy Distribution • Truncation enables mean calculation, but requires fitting to each dataset to find where to truncate ! • The truncation parameters are correlated to both the number of processes per core and the expected execution time ! • Roughly linear relationship gives an approximate solution to truncation parameters without refitting 24
  • 31. 1 - 5 processes 6 - 10 processes !0.000014 !0.0000145 !0.000015 !0.0000155 SBS 11 - 15 processes 16 - 20 processes !0.00002 !0.000025 !0.00003 !0.000035 !0.00004 !0.000045 !0.00005 Stream Based Supercomputing Lab http://sbs.wustl.edu Levy Fit !0.0000125 !0.000013 !0.0000135 !0.000014 !0.00001 !0.000015 !0.00002 !0.000025 !0.00003 25 !0.000025 !0.00001 !0.000025 !0.00001 0 !0.00006 !0.00003 0 !0.00005 !0.00002 0
  • 32. Hypothesis: Lower Kullback-Leibler (KL) divergence SBS Question: Can we use an M/M/1 queueing model to estimate the mean queue occupancy of this system? Stream Based Supercomputing Lab http://sbs.wustl.edu Test Setup A Q1 B ! between expected and realized distribution is associated with higher model accuracy. 26
  • 33. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Test Setup A Q1 B 1. Dedicated thread of execution monitors 27 queue occupancy 2. Calculate the estimated mean queue occupancy using the M/M/1 model 3. Calculate KL Divergence for the arrival process distribution using the truncated Levy distribution noise model
  • 34. SBS Convolution with Exponential Stream Based Supercomputing Lab http://sbs.wustl.edu 28
  • 35. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Conclusions • The truncated Levy distribution can be used to approximate BCETV ! • The distribution of BCETV can be used as a tool to accept or reject a stochastic queueing model based on distributional assumptions ! • KL divergence between the expected and convolved distribution highly correlates with queue model accuracy 29
  • 36. SBS Stream Based Supercomputing Lab http://sbs.wustl.edu Parting Notes Slides available here: sbs.wust.edu ! Timer C++ template code: http://goo.gl/ItJ3jP ! Test harness used to collect data: http://goo.gl/U1VG6N 30