With large-scale and complex configurable systems, it is hard for users to choose the right combination of options (i.e., configurations) in order to obtain the desired trade-off between functionality and performance goals such as speed or size. Machine learning can help in relating these goals to the configurable system options and thus predict the effect of options on the outcome, typically after a costly training step. However, many configurable systems evolve at such a rapid pace that it is impractical to retrain a new model from scratch for each new version. In this paper, we propose a new method to enable transfer learning of binary size predictions among versions of the same configurable system. Taking the extreme case of the Linux kernel with its ≈14,500 configuration options, we first investigate how predictions of kernel binary size degrade over successive versions. We show that the direct reuse of an accurate prediction model from 2017 quickly becomes inaccurate as Linux evolves, up to a 32% mean error by August 2020. We thus propose a new approach for transfer evolution-aware model shifting (TEAMS). It leverages the structure of a configurable system to transfer an initial predictive model towards its future versions with a minimal amount of extra processing for each version. We show that TEAMS vastly outperforms state-of-the-art approaches over the 3-year history of Linux kernels, from 4.13 to 5.8.
Transfer Learning Across Variants and Versions: The Case of Linux Kernel Size
1. Transfer Learning Across Variants and Versions: The Case of Linux Kernel Size
Hugo Martin, Mathieu Acher, Juliana Alves Pereira, Luc Lesoil,
Jean-Marc Jézéquel, and Djamel Eddine Khelladi
Published at IEEE Transactions on Software Engineering (TSE) in 2021
Preprint: https://hal.inria.fr/hal-03358817
6. Challenge 1: you cannot build ≈10^6000 configurations; sampling and learning to the rescue, but still (very) costly!
[Figure: two kernels built from different configurations (7.1 MB vs 176.8 MB), with many unknown sizes in between; variability in space]
7. Challenge 2: Linux evolves; (heterogeneous) transfer learning!
[Figure: variability in space (kernels of 7.1 MB vs 176.8 MB) meets variability in time (v4.13 to v5.8, 3 years later); sizes under the new version are unknown]
8. Instead of learning from scratch or reusing a model “as is”, we propose to transfer (adapt) a prediction model. A problem overlooked in the literature is that feature spaces differ among versions, since options are added and removed.
We propose TEAMS: transfer evolution-aware model shifting.
Key results (more details in the rest of the talk):
● Rapid degradation of prediction models due to evolution (up to 32% prediction errors)
● Effective transfer learning with TEAMS (reaching the same accuracy 3 years later and outperforming state-of-the-art approaches)
● It is possible to learn colossal configuration spaces, such as the Linux kernel's, across variants and versions
12. A challenging case
● Targeted non-functional, quantitative property: binary size
○ of interest for maintainers/users of the Linux kernel (embedded systems, cloud, etc.)
○ challenging to predict (cross-cutting options, interplay with compilers/build systems, etc.)
● Dataset: version 4.13.3, x86_64 arch, measurements of 95K+ random configurations
○ paranoid about deep variability since 2017: Docker to control the build environment and scale
○ build: 8 minutes on average
○ diversity: from 7 MB to 1.9 GB
● Do existing techniques work?
○ most of the work on performance prediction considers a relatively low number of options (<50)
○ Linux has 9K+ options for x86_64
15. TUXML: Sampling, Measuring, Learning
Docker for a reproducible environment, with the needed tools/packages and Python procedures inside
Easy to launch a campaign:
"python3 kernel_generator.py 10"
builds/measures 10 random configurations (information is sent to a database)
https://github.com/TuxML/
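To make the campaign step concrete, here is a simplified, hypothetical sketch of what such a driver does (the actual procedures live in the TuxML repository above; the script name and the database step are assumptions, not the real code):

import subprocess
import sys

def build_and_measure(n_configs: int) -> None:
    # Build and measure n random kernel configurations.
    for _ in range(n_configs):
        # "make randconfig" draws a random (valid) configuration; running
        # inside the TuxML Docker image keeps the toolchain reproducible.
        subprocess.run(["make", "randconfig"], check=True)
        subprocess.run(["make", "-j8"], check=True)
        # Measure the size (in bytes) of the resulting kernel image.
        result = subprocess.run(["stat", "-c", "%s", "vmlinux"],
                                capture_output=True, text=True, check=True)
        print(f"vmlinux size: {int(result.stdout)} bytes")  # TuxML sends this to a database

if __name__ == "__main__":
    build_and_measure(int(sys.argv[1]))  # e.g., python3 campaign.py 10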
16. Data: versions 4.13.3 and 4.15 (x86_64)
74K+ configurations for Linux 4.15
95K+ configurations for Linux 4.13.3
(and 15K hours of computation on a computing grid)
17. A challenging case
● Linear-based algorithms: high error rate (the problem is not additive!)
● Polynomial regression & performance-influence models: out of memory (too many interactions, and not designed for 9K+ options)
● Tree-based algorithms & neural networks: low error rate
Mean Absolute Percentage Error (MAPE): the lower the better
N: percentage of the dataset used for training, for Linux 4.13.3
Mathieu Acher, Hugo Martin, Juliana Alves Pereira, Arnaud Blouin, Jean-Marc Jézéquel, Djamel Eddine Khelladi, Luc Lesoil, and Olivier Barais. Learning Very Large Configuration Spaces: What Matters for Linux Kernel Sizes (2019) https://hal.inria.fr/hal-02314830
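As a concrete illustration, a minimal sketch of training a tree-based regressor on option values and scoring it with MAPE (not the paper's exact pipeline; the CSV layout and file name are assumptions):

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Assumed layout: one column per option (0/1/2 for tristate values),
# plus the measured binary size in bytes.
df = pd.read_csv("linux-4.13.3-sizes.csv")  # hypothetical file
X, y = df.drop(columns=["vmlinux_size"]), df["vmlinux_size"]
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8)

model = GradientBoostingRegressor().fit(X_train, y_train)
pred = model.predict(X_test)
mape = 100 * np.mean(np.abs((y_test - pred) / y_test))  # the lower the better
print(f"MAPE: {mape:.1f}%")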
18. Challenge 2: Linux evolves; (heterogeneous) transfer learning!
[Figure: same illustration as slide 7: variability in space (7.1 MB vs 176.8 MB kernels) meets variability in time (v4.13 to v5.8, 3 years later)]
19. Can we reuse a prediction model among versions?
No. Prediction error quickly increases:
○ 4.13: 6%
○ 4.15: 20%
○ 5.7: 35%
Insights about the evolution of (important) options in the TSE article!
20. Transfer learning to the rescue
“Inductive transfer refers to any algorithmic process by which structure or knowledge derived from a learning problem is used to enhance learning on a related problem.” - Jeremy West, in A theoretical foundation for inductive transfer
● 95K+ configuration measurements, 15,000 hours of computation
● Mission Impossible: Saving Private Model 4.13
● Heterogeneous transfer learning: the feature space is different!
21. Heterogeneity between versions
● Feature changes
○ Deleted features
○ New features
● The model is compatible only with 4.13 features (only them, all of them)
○ Delete new features
○ Create dummy values for deleted features
■ Set all of them to 1
[Figure: a 4.15 configuration is made compatible with Model 4.13 by dropping new features and filling in deleted ones]
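A minimal sketch of this alignment step, assuming configurations are held in pandas DataFrames of option values (the function name is mine, not the paper's):

import pandas as pd

def align_to_source(target_df: pd.DataFrame, source_features: list) -> pd.DataFrame:
    # Delete features introduced after the source version:
    # the 4.13 model has never seen them.
    aligned = target_df[[c for c in target_df.columns if c in source_features]].copy()
    # Re-create features deleted since the source version with a dummy value.
    for feature in source_features:
        if feature not in aligned.columns:
            aligned[feature] = 1  # dummy value, as on the slide
    # Same columns, same order as the source model expects.
    return aligned[source_features]

Predicting 4.15 sizes with the unchanged 4.13 model through this alignment is exactly the "reuse as is" baseline whose error grows from 6% to 35% (slide 19).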
22. Transfer learning to the rescue
● Heterogeneous transfer learning: the feature space is different
● TEAMS: transfer evolution-aware model shifting
○ train a source model (i.e., reuse the prediction model)
○ “align” the feature space
○ learn the transfer function through model shifting
■ train (1) over the targeted version and (2) using the predicted value of the source
■ a linear model shift is not sufficient (intuitively, the relation “source_size * k = target_size” does not hold; the relation is much more complex!)
● Bonus: TEAMS is compositional, so you can combine transfers across successive versions
Insights about composing TEAMS in the TSE article!
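My reading of the model-shifting step as a minimal sketch (not the authors' code; the learner choice and argument names are assumptions): the shifting model is trained on the target version, taking the aligned source prediction as an extra input feature, with a non-linear learner since a linear shift is not enough.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def teams_shift(source_model, X_target, X_target_aligned, y_target):
    # (2) predicted value of the source model on aligned target configurations
    source_pred = source_model.predict(X_target_aligned).reshape(-1, 1)
    # (1) train over the targeted version, with the source prediction as an
    # extra feature; the same feature construction must be used at prediction time
    X_shift = np.hstack([np.asarray(X_target), source_pred])
    return GradientBoostingRegressor().fit(X_shift, y_target)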
24. TEAMS: transfer evolution-aware model shifting
Budget: 1,000 configurations
● Model shifting: from 6.7% to 10.6% error rate
● Scratch: from 14.9% to 16.7% error rate
Budget: 5,000 configurations
● Model shifting: from 5.6% to 7.1% error rate
● Scratch: from 8.2% to 9.2% error rate
Budget: 10,000 configurations
● Model shifting: from 5.2% to 6.1% error rate
● Scratch: from 7.1% to 7.7% error rate
(5K configurations for training)
25. Kpredict
Python module for Python 3.8+ ( https://github.com/HugoJPMartin/kpredict )
Works for many kernel versions and any x86_64 configuration
Error: ≃6.3%
97% of the predictions are below 20% error
H. Martin, M. Acher, J. A. Pereira, L. Lesoil, J.-M. Jézéquel and D. E. Khelladi, “Transfer Learning Across Variants and Versions: The Case of Linux Kernel Size”, IEEE Transactions on Software Engineering (TSE), 2021
26. Conclusion
Learning across variants and versions at the scale of Linux
Direct perspectives for Linux:
● Considering more recent versions (negative transfer?)
● Deep variability: interplay with compilers (gcc/clang versions) or architectures (e.g., x86_32)
● Beyond binary size, targeting other quantitative properties (performance, energy, security, build time, etc.)
Beyond Linux: mobile/web apps, web browsers, compilers, database systems, cloud services, image processing, distributed streaming platforms, and data transfer tools are also:
● highly configurable (variability in space)
● continuously evolving, with the addition or removal of options plus maintenance of the code base (variability in time)
Does evolution degrade performance models?
Is (heterogeneous) transfer learning (e.g., TEAMS) effective?
27. Published at IEEE Transactions on Software Engineering (TSE) in 2021
Preprint: https://hal.inria.fr/hal-03358817
29. Transfer learning
“Inductive transfer refers to any algorithmic process by which structure or knowledge derived from a learning problem is used to enhance learning on a related problem.” - Jeremy West, in A theoretical foundation for inductive transfer
● 100,000 configuration measurements, 15,000 hours of computation
● Mission Impossible: Saving Private Model 4.13
○ Budget: 5,000 configuration measurements (one night's worth of ISTIC computing power)
52. Incremental Model Shifting
Simple Model Shifting (always shift from the 4.13 source):
Model 4.13 + Shifting Model 4.15 = Model 4.15
Model 4.13 + Shifting Model 4.20 = Model 4.20
Model 4.13 + Shifting Model 5.0 = Model 5.0
Model 4.13 + Shifting Model 5.4 = Model 5.4
Model 4.13 + Shifting Model 5.7 = Model 5.7
Model 4.13 + Shifting Model 5.8 = Model 5.8
Incremental Model Shifting (shift from the previous full model):
Model 4.13 + Shifting Model 4.15 = Model 4.15
Model 4.15 + Shifting Model 4.20 = Model 4.20
Model 4.20 + Shifting Model 5.0 = Model 5.0
Model 5.0 + Shifting Model 5.4 = Model 5.4
Model 5.4 + Shifting Model 5.7 = Model 5.7
Model 5.7 + Shifting Model 5.8 = Model 5.8
In both cases: Source + Shifting Model = Full Model
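A minimal sketch contrasting the two strategies (the train_shift helper is hypothetical: it stands for training a shifting model on a version's measurements and returning the composed full model):

def simple_shifting(source_model, versions, train_shift):
    # One shifting model per target version, always from the same source.
    return {v: train_shift(source_model, v) for v in versions}

def incremental_shifting(source_model, versions, train_shift):
    # Each version's shifting model starts from the previous full model.
    full_models, current = {}, source_model
    for v in versions:  # e.g., ["4.15", "4.20", "5.0", "5.4", "5.7", "5.8"]
        current = train_shift(current, v)
        full_models[v] = current
    return full_models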
61. Results
Budget = transfer budget, i.e., configurations measured on the target version. Each cell gives the range of error rates across the successive target versions (4.15 to 5.8).

Source model 4.13 trained with 85,000 configurations:
Transfer budget   Model shifting   Scratch          Incremental shifting
1,000             6.7% to 10.6%    14.9% to 16.7%   6.7% to 13.3%
5,000             5.6% to 7.1%     8.2% to 9.2%     5.6% to 7.5%
10,000            5.2% to 6.1%     7.1% to 7.7%     5.2% to 6.5%

Source model 4.13 trained with 20,000 configurations:
Transfer budget   Model shifting   Scratch          Incremental shifting
1,000             8.5% to 11.6%    14.9% to 16.7%   8.5% to 13.8%
5,000             6.7% to 7.9%     8.2% to 9.2%     6.7% to 7.9%
10,000            6.2% to 6.7%     7.1% to 7.7%     6.1% to 6.7%
62. Summary
● Model 4.13 is saved
○ The old model can be effectively reused on new versions at a lower cost
○ Better than learning from scratch, even years later
● Incremental Shifting
○ More sensitive to the errors of previous models
○ Makes better use of a larger transfer budget