The Adaptor framework automates experimentation, data collection, and analysis in the field of program performance analysis and tuning.
It can be used, for example, to estimate computer system performance during its design, or to search for optimal compiler settings by methods of iterative compilation and machine learning-driven techniques.
Contact information: Michael K. Pankov
• mkpankov@gmail.com
• michaelpankov.com
Source on GitHub: https://github.com/constantius9/adaptor
This is an extended and edited version of my diploma defense keynote, given on June 19, 2013
Introduction
Methodology
Implementation (general info)
Implementation (client)
Evaluation of implementation
Programs Performance Analysis Toolkit Adaptor
Michael K. Pankov
Advisor: Anatoly P. Karpenko
Bauman Moscow State Technical University
October 11, 2013
Michael K. Pankov. Advisor: Anatoly P. Karpenko. Programs Performance Analysis Toolkit Adaptor
Goal, tasks, and importance of the work
Goal
Develop a method and software toolkit for modeling the performance of programs on general-purpose computers
Tasks
1 Develop a method of program performance modeling
2 Implement the performance analysis & modeling toolkit
3 Study the efficiency of the toolkit on a set of benchmarks
Importance
1 Estimation of computer performance during its design
2 Search for optimal compiler settings by methods of iterative compilation and machine learning-driven techniques
Overview
A lot of recent research: see C. Dubach, G. Fursin, B. C. Lee, W. Wu
In particular, there is the public cTuning repository for research, and the corresponding program Collective Mind, run by G. Fursin
This work is about modeling the performance of general-purpose computer programs, with feature ranking by means of Earth Importance and regression by means of k-Nearest Neighbors and Earth Regression. We try to achieve automatic detection of relevant features.
The Velocitas method of statistical program performance analysis
1 Perform a series of experiments measuring program execution time, forming a set U:
U = {(X_i, y_i)},  X_i = (x_ij),  i ∈ [1; m],  j ∈ [1; n]
X_i — feature vector (CPU frequency, number of rows of the processed matrix, etc.), y_i — response (execution time), m — number of experiments, n — number of features
2 Split the set U into a training sample D and a test sample C by randomly assigning 70% of the experiments to D:
D = {d_i | f_rand(d_i) > 0.3},  (1)
d_i = (X_i, y_i),  (2)
f_rand(d) ∈ [0; 1],  (3)
i ∈ [1; m],  (4)
C = U \ D  (5)
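A minimal Python sketch of this split (illustrative only; the function name, seed, and the representation of U as a list of (features, response) pairs are assumptions, not Adaptor's actual API):

```python
import random

def split(U, train_fraction=0.7, seed=42):
    """Randomly assign ~70% of the experiments to the training
    sample D; the remainder form the test sample C = U \\ D."""
    rng = random.Random(seed)
    D, C = [], []
    for d in U:
        # f_rand(d) < 0.7 puts the experiment into D, else into C
        (D if rng.random() < train_fraction else C).append(d)
    return D, C
```

Because the assignment is independent per experiment, the realized split only approximates 70/30 for small samples.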
3 Extract additional features x_ik:
x_ik = f(X_i),  (6)
X_i = (x_ij),  (7)
D = {(X_i, y_i)},  (8)
i ∈ [1; m],  (9)
j ∈ [1; n + r],  (10)
k ∈ [n + 1; n + r]  (11)
r — number of additional features (e.g. "size of input data")
4 Filter the training set D to remove noise and incorrect measurements:
D′ = D \ {(X_i, y_i) | P(X_i, y_i)}
P — experiment selection predicate (we remove all experiments where the measured execution time is less than t_min)
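This filtering step can be sketched as a hypothetical helper (the representation of D as a list of (features, response) pairs and the name t_min follow the text; the function name and threshold value are assumed):

```python
def filter_experiments(D, t_min):
    """Keep only experiments whose measured execution time y is
    at least t_min; shorter timings are dominated by measurement
    noise and would distort the fitted model."""
    return [(X, y) for (X, y) in D if y >= t_min]
```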
5 Rank the features and select only those with non-zero importance:
s_j = f_rank(D′),  (12)
j ∈ [1; n + r],  (13)
D′ = {(X_i, y_i) | s_j > 0}  (14)
s_j — scalar importance value of a particular feature, f_rank — feature ranking function (we used MSE, Relief F, Earth Importance)
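The selection step can be sketched as follows (hypothetical helper names; assumes D is a list of (feature_vector, response) pairs and scores[j] holds the importance value s_j produced by the ranking function):

```python
def select_features(D, scores):
    """Drop the feature columns whose importance score is zero,
    keeping the response untouched."""
    keep = [j for j, s in enumerate(scores) if s > 0]
    return [([X[j] for j in keep], y) for (X, y) in D]
```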
6 Fit a regression model of one of 4 kinds (linear, random forest, Earth, k nearest neighbors):
M_p = {f_pred, B}  (15)
B = f_fit(D′),  p ∈ [1; 4]  (16)
B — vector of model parameters, f_fit — learning function, f_pred — prediction function (they are defined for each model separately)
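As an illustration of one of the four model kinds, here is a self-contained k Nearest Neighbors regressor in plain Python (a sketch under assumed names, not Adaptor's implementation; Adaptor uses the Orange statistical framework):

```python
def knn_predict(D, X, k=3):
    """k Nearest Neighbors regression: predict the response for
    feature vector X as the mean response of the k training
    points closest to X in Euclidean distance."""
    def dist(A, B):
        return sum((a - b) ** 2 for a, b in zip(A, B)) ** 0.5
    # D is the training sample: a list of (feature_vector, response)
    nearest = sorted(D, key=lambda d: dist(d[0], X))[:k]
    return sum(y for _, y in nearest) / len(nearest)
```

For k-NN the "model parameters" B are simply the stored training sample itself, which is why the method adapts well to the piecewise behavior of execution times.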
7 Test the model by the RRSE metric:
C = U \ D  (17)
  = {(X_i, y_i)},  (18)
i ∈ [1; m],  (19)
X_i = (x_ik),  (20)
k ∈ [1; n + r]  (21)
Ỹ = f_pred(X, B),  (22)
RRSE = √( Σ_{i=1}^{m} (ỹ_i − y_i)² / Σ_{i=1}^{m} (y_i − ȳ)² )  (23)
ỹ_i — predicted value of the response, ȳ — average value of the response in the test sample
Architecture of Adaptor Framework
Database server
  Data views
Client
  Database interaction module
  Program building module
  Experimentation module
  Information retrieval module
  Information analysis module
Technology stack
Database server
  Distributed client-server document-oriented storage CouchDB
  Cloud platform Cloudant
Client
  Python
  Statistical framework Orange
  GNU/Linux on the x86 platform
Database interaction module
Provides a high-level API for storing Python objects as database documents
Uses a local CouchDB server as a fall-back if the remote one isn't available
Program building module
Manages paths to the source files of experimental programs
  Sources are kept in a hierarchical directory structure
  Specifying only the name of the program to build is enough for its sources to be found
Manages build tools and their settings
Experimentation module
Calibrates the program execution time measurement before every series of runs
  Subtracts the execution time of the "simplest" program to avoid systematic error
Runs the program being studied until the relative dispersion of the time measurement becomes sufficiently low (d_rel < 5%)
Passes experiment data to the database interaction module
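The measurement loop can be sketched like this (illustrative only; the names, default values, and the stdev/mean definition of relative dispersion are assumptions, and Adaptor's actual calibration details may differ):

```python
import statistics
import time

def measure(run, baseline=0.0, rel_tol=0.05, min_runs=5, max_runs=200):
    """Time run() repeatedly until the relative dispersion
    (stdev / mean) of the timings drops below rel_tol, or
    max_runs is reached; the calibration baseline (time of the
    "simplest" program) is subtracted from every measurement."""
    times = []
    for _ in range(max_runs):
        start = time.perf_counter()
        run()
        times.append(time.perf_counter() - start - baseline)
        if len(times) >= min_runs:
            mean = statistics.mean(times)
            if mean > 0 and statistics.stdev(times) / mean < rel_tol:
                break
    return statistics.mean(times)
```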
Information retrieval module
Collects information on the platform used and the experiment being carried out
  CPU
    Frequency
    Cache size
    Instruction set extensions
    etc.
  Compiler
  Experiment
    Studied program
    Size of input data
Data analysis module
Receives data from the database and saves it to CSV files for input to the Orange statistical analysis system
Graphs results using the Python library matplotlib
Two groups of program performance models
  Simplest (1 feature)
  More complex (3-5 features)
Four regression models in both groups
  Linear
  k Nearest Neighbors
  Multivariate Adaptive Regression Splines
  Random Forest
Data analysis module (cont.)
A workflow of 40 data analysis components in the Orange system:
  Reading in
  Preprocessing
  Filtering
  Feature extraction
  Feature ranking
  Predictor fitting
  Prediction results evaluation
  Saving predictions to a CSV file
Platform
Intel CPUs
  Core 2 Quad Q8200: 2.33 GHz, 2 MB cache
  Core i5 M460: 2.53 GHz, 3 MB cache
  Xeon E5430: 2.66 GHz, 6 MB cache
Ubuntu 12.04, gcc and llvm compilers
Polybench/C 3.2 benchmark set, 28 programs in total
  Linear algebra, solution of systems of linear algebraic equations and ordinary differential equations
  Input data is generated by deterministic algorithms
Performance of chosen programs from the benchmark set is modeled using the Adaptor framework
  symm: multiplication of symmetric matrices; square matrices of dimensionality 2^i, i = f_rand(1, 10)
  ludcmp: LU decomposition; square matrices of dimensionality f_rand(2, 1024)
1000 experiments per CPU
Feature ranking. symm program
Attribute Relief F Mean Square Error Earth Importance
size 0.268 0.573 4.9
cpu mhz 0.000 0.006 3.3
width 0.130 0.573 0.7
cpu cache 0.000 0.006 0.5
height 0.130 0.573 0.0
Earth Importance selected only relevant features
Feature ranking. symm program (cont.)
428 experiments
1 feature: matrix dimensionality
RMSE RRSE R2
k Nearest Neighbors 5.761 0.051 0.997
Random Forest 5.961 0.052 0.997
Linear Regression 15.869 0.139 0.981
Root Relative Squared Error of k Nearest Neighbors — approx. 5%
Resulting model of performance
k Nearest Neighbors model of performance of the symm program on the Intel Core 2 Quad Q8200 CPU
Resulting model of performance
Comparison of models of performance of ludcmp program
468 experiments
2 features: width of matrix, CPU frequency
RMSE RRSE R2
k Nearest Neighbors 1.093 0.048 0.998
Linear Regression 9.067 0.394 0.845
Where models fail
Amazon throttles its micro servers: the data is split into two "curves"
Earth Regression at least tries to follow the "main curve"
k Nearest Neighbors is much worse in this situation
Results of evaluation
Most suitable Feature Ranking method — Earth Importance
Most suitable Regression method — k Nearest Neighbors
Further work
The Velocitas method is promising and scales to larger feature sets
Data filtering to reduce noise can make it even better
Orange is a decent statistical framework, but its interactive workflow limits batch processing
For larger data sets and increased automation of the Adaptor framework, either its API or other libraries (e.g. sklearn) should be used
Support for custom research scenarios is required
It would be interesting to perform experiments on GPUs to study the effects of massively parallel execution
Thank you!
Contact information: Michael K. Pankov
mkpankov@gmail.com
michaelpankov.com
This is an extended and edited version of my diploma defense keynote, given on June 19, 2013