28.03.19 DFKI – DIMA – TU Berlin
Database Systems and Information Management Group, TU Berlin
https://www.dima.tu-berlin.de/
Intelligent Analytics for Massive Data, German Research Center for Artificial Intelligence
https://www.dfki.de/
Continuous Deployment of Machine Learning Pipelines
Behrouz Derakhshan behrouz.derakhshan@dfki.de
Alireza Rezaei Mahdiraji alireza.rm@dfki.de
Tilmann Rabl rabl@tu-berlin.de
Volker Markl volker.markl@tu-berlin.de

Life Cycle of Machine Learning Applications
■ The life cycle of an ML application does not end with training
■ Models and pipelines must be deployed to answer prediction queries
■ Deployed models and pipelines should be monitored and trained further
Focus of this talk

Improvement By Online Learning
[Figure: deployment workflow. 1. Training (Training Data, Data Preprocessing, Model Training, Model), 2. Deployment (pipeline placed on the Deployment Platform), 3. Query Answering (Prediction Query, Prediction), 4. Online Learning (incoming Training Data)]
4. Online Learning
• Efficient and fast
• Cannot guarantee high-quality models

Improvement By Retraining
[Figure: deployment workflow. 1. Training, 2. Deployment, 3. Query Answering, 4. Online Learning, 5. Gather Training Data (New Training Data joins the Historical Training Data), 6. Retraining]
6. Retraining
• Provides high-quality models
• Time-consuming and resource-intensive
• Out-of-core

Ideal Deployment Platform
Can a platform provide the same level of quality as Retraining and perform (almost) as efficiently as Online Learning?

Continuous Deployment Platform
• Train the model inside the platform
• Compute features and cache them
• Update data preprocessing statistics
• Replace Retraining with Proactive Training
[Figure: the Continuous Deployment Platform holds the pipeline (Data Preprocessing, Model Training, Model), answers Prediction Queries, applies Online Learning to incoming Training Data, caches computed Features and Stats, and runs Proactive Training on the Historical Training Data]

Continuous Deployment Platform
[Figure: platform workflow under Scheduled Execution. Raw Data, Discretize, Raw Data Chunks, Preprocess, Preprocessed Features, Historical Training Data, Sample, Materialize, Update, Model]
Data Preparation Phase: Discretize, Preprocess
Proactive Training Phase: Sample, Materialize, Update

Data Preparation Phase
[Figure: Raw Data is discretized into Raw Data Chunks with timestamps t0, t1, t2, t3, …, tn-3, tn-2, tn-1, tn, which are kept in the Historical Training Data. The Preprocess step (Parser, Missing Value Imputer, Standard Scaler, Feature Hasher) turns them into Feature Chunks held in the Cache Layer; pipeline components keep Stats.]
Update and Store Statistics
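The "Update and Store Statistics" step can be illustrated for the Standard Scaler. This is a minimal sketch, assuming the scaler keeps a per-feature count, mean, and sum of squared deviations and merges each new chunk into them with the standard pairwise variance-merge formula. The class name `RunningStats` and its methods are hypothetical and do not come from the talk.

```python
import numpy as np


class RunningStats:
    """Per-feature running mean and variance, updated one chunk at a time.

    Hypothetical helper: the slides only state that statistics are updated
    and stored; the merge formula used here is an assumption.
    """

    def __init__(self, n_features):
        self.count = 0
        self.mean = np.zeros(n_features)
        self.m2 = np.zeros(n_features)  # sum of squared deviations from the mean

    def update(self, chunk):
        """Merge the statistics of a new chunk (rows are instances)."""
        chunk = np.asarray(chunk, dtype=float)
        n_new = chunk.shape[0]
        if n_new == 0:
            return
        new_mean = chunk.mean(axis=0)
        new_m2 = ((chunk - new_mean) ** 2).sum(axis=0)
        delta = new_mean - self.mean
        total = self.count + n_new
        self.mean = self.mean + delta * n_new / total
        self.m2 = self.m2 + new_m2 + delta ** 2 * self.count * n_new / total
        self.count = total

    def std(self):
        """Population standard deviation of everything seen so far."""
        return np.sqrt(self.m2 / max(self.count, 1))

    def transform(self, chunk):
        """Standardize a chunk with the current statistics."""
        std = self.std()
        scale = np.where(std > 0, std, 1.0)  # leave constant features unscaled
        return (np.asarray(chunk, dtype=float) - self.mean) / scale
```

Because the statistics are merged chunk by chunk, no pass over the full historical data is needed when a new chunk arrives.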

Storage Requirement
[Figure: the Historical Training Data holds chunks t0, t1, t2, t3, …, tn-3, tn-2, tn-1, tn; in the Cache Layer, the cached features of some chunks have been removed (Removed Cached Features)]

Proactive Training Phase
[Figure: under Scheduled Execution, data from the Historical Training Data and the Cache Layer passes through Sample, Materialize, and Update Model]
mini-batch SGD Algorithm
Input: $D$ = training dataset
Output: $m$ = trained model
1: initialize $m_0$
2: for $i = 1 \dots n$ do
3:     $s_i$ = sample from $D$
4:     $g = \nabla J(s_i, m_{i-1})$
5:     $m_i = m_{i-1} - \eta_{i-1} g$
6: end for
7: return $m_n$
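The mini-batch SGD algorithm on this slide can be sketched in a few lines. This is an illustrative sketch only: it assumes a linear model with mean squared error as the loss J and a constant learning rate in place of the per-step rate η. The function name `minibatch_sgd` and its parameters are hypothetical.

```python
import numpy as np


def minibatch_sgd(X, y, n_iters=500, batch_size=32, lr=0.05, seed=0):
    """Mini-batch SGD for linear least squares (J is an illustrative choice)."""
    rng = np.random.default_rng(seed)
    m = np.zeros(X.shape[1])                          # 1: initialize m_0
    for _ in range(n_iters):                          # 2: for i = 1..n
        # 3: s_i = sample from D (a batch of rows without replacement)
        idx = rng.choice(X.shape[0], size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        # 4: g = gradient of the mean squared error on the batch
        g = 2.0 * Xb.T @ (Xb @ m - yb) / batch_size
        m = m - lr * g                                # 5: m_i = m_{i-1} - eta * g
    return m                                          # 7: return m_n
```

Each iteration depends only on the previous model and one sample, so the loop can be stopped and resumed between iterations.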

Data Materializing
[Figure: the Materialize step, between Sample and Update Model, applies the pipeline components (Parser, Missing Value Imputer, Standard Scaler, Feature Hasher) with their stored Stats to data from the Historical Training Data and the Cache Layer]
Statistics precomputed during the Data Preparation Phase
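One way to picture materialization with a bounded cache is sketched below. It rests on assumptions the slides do not state: feature chunks are cached by timestamp, the oldest cached chunk is evicted first, and an evicted chunk is recomputed from its raw chunk. The names `FeatureCache`, `raw_store`, and `preprocess` are hypothetical.

```python
from collections import OrderedDict


class FeatureCache:
    """Bounded cache of preprocessed feature chunks, keyed by timestamp."""

    def __init__(self, capacity, raw_store, preprocess):
        self.capacity = capacity
        self.raw_store = raw_store      # timestamp -> raw chunk (always kept)
        self.preprocess = preprocess    # raw chunk -> feature chunk
        self.features = OrderedDict()   # timestamp -> cached feature chunk
        self.hits = 0
        self.misses = 0

    def put(self, timestamp, feature_chunk):
        """Cache a chunk and evict the oldest cached chunk when over capacity."""
        self.features[timestamp] = feature_chunk
        while len(self.features) > self.capacity:
            self.features.popitem(last=False)

    def materialize(self, timestamp):
        """Return features for a sampled chunk, recomputing them if evicted."""
        if timestamp in self.features:
            self.hits += 1
            return self.features[timestamp]
        self.misses += 1
        return self.preprocess(self.raw_store[timestamp])
```

In this sketch, `hits / (hits + misses)` is the share of sampled chunks that skipped recomputation.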

Evaluation
Dataset   Size     #Instances   Initial   Deployment
URL       2.1 GB   2.4 M        Day 0     Day 1-120
Taxi      42 GB    280 M        Jan 15    Feb 15 – Jun 16
URL Pipeline: Parser, Missing Value Imputer, Standard Scaler, Feature Hasher, SVM Model
Taxi Pipeline: Parser, Feature Extractor, Anomaly Detector, Standard Scaler, Linear Regression Model

Quality
Can Proactive Training provide the same level of quality as Retraining?
[Figure: Cumulative Prequential Prediction Error Rate for the URL Pipeline During the Deployment. x-axis: day 1 to day 120; y-axis: Misclassification % (ticks 2.1 to 2.3); series: Proactive, Retraining, Online]
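A cumulative prequential error rate is computed by testing the deployed model on each arriving instance before training on it. The sketch below is a minimal illustration of that test-then-train loop. The `predict`/`update` interface and the toy `MajorityClass` model are assumptions made for the example, not part of the evaluated platform.

```python
def prequential_error(model, stream):
    """Yield the cumulative misclassification rate after each instance.

    Each (x, y) pair is first used to test the model, then to train it.
    """
    errors = 0
    for seen, (x, y) in enumerate(stream, start=1):
        if model.predict(x) != y:
            errors += 1
        model.update(x, y)
        yield errors / seen


class MajorityClass:
    """Toy online model (hypothetical): predicts the most frequent label so far."""

    def __init__(self):
        self.counts = {}

    def predict(self, x):
        if not self.counts:
            return None  # no label seen yet, so the first prediction is wrong
        return max(self.counts, key=self.counts.get)

    def update(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1
```

Since every instance is scored before the model sees its label, no separate held-out test set is needed during deployment.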

Training Time
Can Proactive Training perform (almost) as efficiently as Online Learning?
[Figure: Cumulative Training Time for the URL Pipeline During the Deployment. x-axis: day 1 to day 120; y-axis: Time (m) (ticks 0 to 750); series: Proactive, Retraining, Online]
Proactive Training provides the same level of model accuracy as Retraining, while matching the speed of Online Learning

Feature Caching and Statistics Computation
What are the effects of Statistics Computation and Feature Caching?
[Figure: Total Training Time in Presence of Statistics Computation and Feature Caching. Bar chart of Time (m): No Optimization 111, 0% Cached 91, 100% Cached 55]
Statistics Computation and Feature Caching improve the performance of Proactive Training by a factor of 2 (111 m with no optimization vs. 55 m with 100% of features cached)

Summary
Continuous Deployment Platform
• Proactive Training, instead of Offline Retraining
• Feature Caching
• Online Statistics Computation
• Reduces the total training time
• Achieves high quality
[Figure: platform workflow under Scheduled Execution. Raw Data, Discretize, Raw Data Chunks, Preprocess, Preprocessed Features, Historical Training Data, Sample, Materialize, Update, Model]
[Figure: Misclassification % (ticks 2.23 to 2.27) against Time (m) (0 to 800); series: Proactive, Retraining, Online]

References
1. D. Crankshaw, X. Wang, G. Zhou, M. Franklin, et al. 2016. Clipper: A Low-Latency Online Prediction Serving System. arXiv preprint arXiv:1612.03079 (2016).
2. D. Crankshaw, P. Bailis, J. Gonzalez, H. Li, et al. 2014. The missing piece in complex analytics: Low latency, scalable model management and serving with Velox.
3. D. Baylor, E. Breck, H. Cheng, N. Fiedel, et al. 2017. TFX: A TensorFlow-Based Production-Scale Machine Learning Platform. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1387–1395.
4. L. Bottou. 2010. Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT'2010. Springer, 177–186.
5. M. Zaharia, M. Chowdhury, M. Franklin, S. Shenker, and I. Stoica. 2010. Spark: cluster computing with working sets. HotCloud 10 (2010), 10–10.
6. O. Chapelle. [n. d.]. NYC Taxi & Limousine Commission Trip Record Data. http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml. [Online; accessed 10-April-2018].
7. J. Ma, L. Saul, S. Savage, and G. Voelker. 2009. Identifying suspicious URLs: an application of large-scale online learning. In Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 681–688.
8. D. Kingma and J. Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
9. M. Zeiler. 2012. ADADELTA: an adaptive learning rate method. arXiv preprint arXiv:1212.5701 (2012).
10. T. Tieleman and G. Hinton. 2012. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural networks for machine learning 4, 2 (2012), 26–31.
Backup Slides

Data Manager
• Data Discretizing
• Data Sampling
• Historical Data Management

Pipeline Manager
• Data Preprocessing
• Data Materialization
• Pipeline Component Management

Model Updater
• Online Training
• Proactive Training

Scheduler
• Schedule Proactive Training
[Figure on each of these four slides: platform workflow under Scheduled Execution. Raw Data, Discretize, Raw Data Chunks, Preprocess, Preprocessed Features, Historical Training Data, Sample, Materialize, Update, Model]

Related Work
■ TFX
□ Manual Retraining
□ No Online Learning
■ Velox
□ Automatic Retraining
□ Online Learning
■ Clipper
□ No Retraining
□ No Online Learning
□ Ensemble of Models

Proactive Training vs Periodical Retraining (Taxi)
[Figure: Cumulative Prequential Prediction Error Rate for the Taxi Pipeline During the Deployment. x-axis: Feb 15 to June 16; y-axis: RMSLE (ticks 0.0960 to 0.0980); series: Proactive, Retraining, Online]
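The Taxi plots report RMSLE, which the slides do not spell out. Assuming it denotes the usual root mean squared logarithmic error, it can be computed as in the sketch below. The function name `rmsle` is chosen for illustration.

```python
import numpy as np


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error; all values must be greater than -1."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # log1p(v) = log(1 + v), so targets equal to zero are handled safely
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))
```

Because the error is taken on log-transformed values, it penalizes relative rather than absolute deviations.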

Proactive Training vs Periodical Retraining (Taxi)
[Figure: Cumulative Training Time for the Taxi Pipeline During the Deployment. x-axis: Feb 15 to June 16; y-axis: Time (m) (ticks 0 to 1500); series: Proactive, Retraining, Online]

Materialization and Statistics Computation (Taxi)
[Figure: Time (m) (ticks 300 to 800) against Cached Features Ratio (0.0, 0.2, 0.6, 1.0); series: TimeBased, Uniform, WindowBased, NoOptimization]
Materialization Utilization Rate for different ratios of Cached Features (Taxi); m = Ratio of Cached Features
Sampling       m = 0.2   m = 0.6
Uniform        0.51      0.90
Window-based   0.57      1.0
Time-based     0.65      0.97
Materialization Utilization Rate: ratio of preprocessed features that skipped the materialization step

Materialization and Statistics Computation (URL)
[Figure: Time (m) (ticks 60 to 110) against Cached Features Ratio (0.0, 0.2, 0.6, 1.0); series: TimeBased, Uniform, WindowBased, NoOptimization]
Materialization Utilization Rate for different ratios of Cached Features (URL); m = Ratio of Cached Features
Sampling       m = 0.2   m = 0.6
Uniform        0.52      0.91
Window-based   0.58      1.0
Time-based     0.68      0.97
Materialization Utilization Rate: ratio of preprocessed features that skipped the materialization step

Time vs Quality (Taxi)
[Figure: RMSLE (ticks 0.0974 to 0.0976) against Time (m) (0 to 2000); series: Proactive, Retraining, Online]

Tuning
Effect of Learning Rate Adaptation and Regularization Parameter on Initial Training
             URL                             Taxi
Adaptation   r = 1E-2  r = 1E-3  r = 1E-4    r = 1E-2  r = 1E-3  r = 1E-4
Adam         0.030     0.026     0.035       0.09553   0.09551   0.09551
RMSProp      0.030     0.027     0.034       0.09552   0.09552   0.09550
Adadelta     0.029     0.028     0.034       0.09609   0.09610   0.09619
Effect of Learning Rate Adaptation during the Deployment
[Figure: URL panel: Misclassification % (ticks 2.10 to 2.35), day 1 to day 30. Taxi panel: RMSLE (ticks 0.085 to 0.095), Feb 15 to Apr 15. Series: Adam, Rmsprop, Adadelta]
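The tuning slide compares the learning-rate adaptation methods Adam, RMSProp, and Adadelta. As a reminder of how such a method changes the plain SGD update, here is a minimal sketch of a single Adam step following Kingma and Ba. The function name `adam_step`, the `state` dictionary, and the default hyperparameters are illustrative and not taken from the talk.

```python
import numpy as np


def adam_step(m, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to model m and return the new model.

    state holds the step counter "t" and the running moments "mom" and "var";
    it is modified in place.
    """
    state["t"] += 1
    # Exponential moving averages of the gradient and the squared gradient
    state["mom"] = beta1 * state["mom"] + (1 - beta1) * grad
    state["var"] = beta2 * state["var"] + (1 - beta2) * grad ** 2
    # Bias correction compensates for the zero initialization of the moments
    mom_hat = state["mom"] / (1 - beta1 ** state["t"])
    var_hat = state["var"] / (1 - beta2 ** state["t"])
    return m - lr * mom_hat / (np.sqrt(var_hat) + eps)
```

The per-coordinate division by the root of the second moment is what makes the effective step size adapt to the scale of each feature's gradient.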

Effect of Sampling on model quality
[Figure: Effect of Sampling Method on the Error Rate. URL panel: Misclassification % (ticks 2.10 to 2.30), day 1 to day 30. Taxi panel: RMSLE (ticks 0.085 to 0.095), Feb 15 to Apr 15. Series: Timebased, Windowbased, Uniform]
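The slides name three sampling methods without defining them. The sketch below shows one plausible reading, stated here as an assumption: uniform sampling treats all historical chunks alike, window-based sampling draws only from the most recent `window` chunks, and time-based sampling weights chunks linearly by recency. The function `sample_chunks` and its parameters are hypothetical.

```python
import numpy as np


def sample_chunks(n_chunks, k, method, window=None, seed=0):
    """Pick k distinct chunk indices (0 = oldest) out of n_chunks chunks."""
    rng = np.random.default_rng(seed)
    indices = np.arange(n_chunks)
    if method == "uniform":
        weights = np.ones(n_chunks)                 # every chunk equally likely
    elif method == "window":
        # only the last `window` chunks are eligible (window must be >= k)
        weights = (indices >= n_chunks - window).astype(float)
    elif method == "time":
        weights = indices + 1.0                     # newer chunks weigh more
    else:
        raise ValueError(f"unknown sampling method: {method}")
    probs = weights / weights.sum()
    return rng.choice(indices, size=k, replace=False, p=probs)
```

Under this reading, window-based and time-based sampling favor recent chunks, which are also the ones most likely to still be in the feature cache.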

Materialization Process
Ads CTR Use Case Figure

 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in college
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 

Continuous Deployment of Machine Learning Pipelines

  • 1. 28.03.19 DFKI – DIMA – TU Berlin Database Systems and Information Management Group TU Berlin https://www.dima.tu-berlin.de/ Intelligent Analytics for Massive Data German Research Center for Artificial Intelligence https://www.dfki.de/ 1 Continuous Deployment of Machine Learning Pipelines Behrouz Derakhshan behrouz.derakhshan@dfki.de Alireza Rezaei Mahdiraji alireza.rm@dfki.de Tilmann Rabl rabl@tu-berlin.de Volker Markl volker.markl@tu-berlin.de
  • 2-6. Life Cycle of Machine Learning Applications
      ■ Life cycle of ML application does not end with training
      ■ Models and Pipelines must be deployed to answer prediction queries
      ■ Deployed models and pipelines should be monitored and trained further
      Focus of this talk
  • 7-11. Improvement By Online Learning
      [Diagram: 1. Training (Training Data → Data Preprocessing → Model Training → Model); 2. Deployment of the pipeline and model to the Deployment Platform; 3. Query Answering (Prediction Query → Prediction); 4. Online Learning on incoming Training Data]
      • Efficient and Fast
      • Cannot guarantee high-quality models
  • 12-17. Improvement By Retraining
      [Diagram: 1. Training (Training Data → Data Preprocessing → Model Training → Model); 2. Deployment to the Deployment Platform; 3. Query Answering (Prediction Query → Prediction); 4. Online Learning; 5. Gather Training Data (New Training Data); 6. Retraining on the Historical Training Data]
      • Provides high-quality models
      • Time-consuming and resource-intensive
      • Out-of-core
  • 18. Ideal Deployment Platform: Can a platform provide the same level of quality as Retraining and perform (almost) as efficiently as Online Learning?
  • 19. Continuous Deployment Platform
      • Train the model inside the platform
      • Compute features and cache them
      • Update data preprocessing statistics
      • Replace Retraining with Proactive Training
      [Diagram: Prediction Query → Prediction; Training Data → Data Preprocessing → Model Training → Model; Historical Training Data; Online Learning; Proactive Training; Cache computed Features; Stats]
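The four bullets on the Continuous Deployment Platform slide can be read as one event loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: the class name `ContinuousDeployment`, the `sgd_step` callback, and the fixed `period` are hypothetical names chosen for illustration. Every incoming chunk triggers an online update, and every `period` chunks a proactive training step runs on a sample of the historical data.

```python
import random

class ContinuousDeployment:
    """Toy deployment loop: online updates plus scheduled proactive training."""

    def __init__(self, sgd_step, period, sample_size, seed=0):
        self.sgd_step = sgd_step        # hypothetical callback: (model, batch) -> model
        self.period = period            # assumption: proactive step every `period` chunks
        self.sample_size = sample_size  # number of historical chunks sampled per step
        self.history = []               # historical training data, one entry per chunk
        self.model = 0.0
        self.proactive_runs = 0
        self._rng = random.Random(seed)

    def on_new_chunk(self, chunk):
        # Online learning: update the model with the newly arrived chunk.
        self.model = self.sgd_step(self.model, chunk)
        self.history.append(chunk)
        # Scheduled execution: proactive training on sampled historical chunks.
        if len(self.history) % self.period == 0:
            k = min(self.sample_size, len(self.history))
            sampled = self._rng.sample(self.history, k)
            batch = [row for c in sampled for row in c]
            self.model = self.sgd_step(self.model, batch)
            self.proactive_runs += 1

# Toy model: a single number that moves halfway toward the batch mean.
step = lambda model, batch: model - 0.5 * (model - sum(batch) / len(batch))
platform = ContinuousDeployment(step, period=2, sample_size=2)
for chunk in [[1.0], [1.0], [1.0], [1.0]]:
    platform.on_new_chunk(chunk)
```

The point of the design is that no step ever retrains from scratch: the deployed model is only ever nudged, either by fresh data or by a sample of old data.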
  • 20-22. Continuous Deployment Platform (workflow)
      Data Preparation Phase: Raw Data → Discretize → Raw Data Chunks → Preprocess → Preprocessed Features, kept with the Historical Training Data
      Proactive Training Phase (Scheduled Execution): Sample → Materialize → Update → Model
  • 23-26. Data Preparation Phase
      Discretize: Raw Data is split into Raw Data Chunks with timestamps t0, t1, t2, t3, ..., tn-3, tn-2, tn-1, tn
      Preprocess: Parser → Missing Value Imputer → Standard Scaler → Feature Hasher
      Feature Chunks: preprocessed chunks are stored in the Cache Layer alongside the Historical Training Data
      Update and Store Statistics: the Stats kept by the pipeline components are updated with every chunk
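To illustrate "Update and Store Statistics", here is a small sketch of how a Standard Scaler could keep its mean and variance current chunk by chunk without rescanning the historical data. It uses Welford's running-variance method, which is a swapped-in standard technique and not something the slides name. `RunningStats`, `raw_chunks`, and `feature_cache` are hypothetical names.

```python
from dataclasses import dataclass
from math import sqrt

@dataclass
class RunningStats:
    """Running mean and variance of one numeric feature."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, chunk):
        """Fold one chunk of raw values into the statistics (Welford's method)."""
        for x in chunk:
            self.count += 1
            delta = x - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        """Population standard deviation of all values seen so far."""
        return sqrt(self.m2 / self.count) if self.count else 0.0

    def scale(self, chunk):
        """Standardize a chunk with the statistics known at this point."""
        s = self.std or 1.0  # avoid division by zero for constant features
        return [(x - self.mean) / s for x in chunk]

stats = RunningStats()
feature_cache = {}  # chunk timestamp -> preprocessed feature chunk

raw_chunks = {0: [1.0, 2.0, 3.0], 1: [4.0, 5.0]}
for t, chunk in raw_chunks.items():
    stats.update(chunk)                      # update and store statistics
    feature_cache[t] = stats.scale(chunk)    # cache the computed features
```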
  • 27-28. Storage Requirement
      [Diagram: Historical Training Data chunks t0, t1, t2, t3, ..., tn-3, tn-2, tn-1, tn and the Cache Layer; some Cached Features are removed from the Cache Layer]
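One way to bound the storage used by the Cache Layer is sketched below. The slide only states that some cached features are removed; evicting the oldest chunk first is an assumption made here, and `FeatureCache`, `preprocess`, and `raw_store` are hypothetical names. An evicted chunk is not lost, because it can be recomputed from the raw historical chunk.

```python
from collections import OrderedDict

class FeatureCache:
    """Bounded cache of preprocessed feature chunks, keyed by chunk timestamp."""

    def __init__(self, capacity, preprocess):
        self.capacity = capacity      # maximum number of feature chunks kept
        self.preprocess = preprocess  # hypothetical: raw chunk -> feature chunk
        self._chunks = OrderedDict()

    def put(self, t, features):
        self._chunks[t] = features
        while len(self._chunks) > self.capacity:
            self._chunks.popitem(last=False)  # assumption: evict the oldest chunk

    def get(self, t, raw_store):
        """Return features for chunk t, recomputing them if they were evicted."""
        if t in self._chunks:
            return self._chunks[t]
        return self.preprocess(raw_store[t])

    def __contains__(self, t):
        return t in self._chunks

raw_store = {t: [t, t + 1] for t in range(4)}
double = lambda chunk: [2 * x for x in chunk]  # stand-in for the real pipeline

cache = FeatureCache(capacity=2, preprocess=double)
for t, chunk in raw_store.items():
    cache.put(t, double(chunk))

recent = cache.get(3, raw_store)   # served from the cache
evicted = cache.get(0, raw_store)  # recomputed from the raw chunk
```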
  • 29-34. Proactive Training Phase
      Historical Training Data and Cache Layer → Sample → Materialize → Update Model, triggered by Scheduled Execution
      mini-batch SGD Algorithm
      Input: $D$ = training dataset
      Output: $m$ = trained model
      1: initialize $m_0$
      2: for $i = 1 \ldots n$ do
      3:     $s_i$ = sample from $D$
      4:     $g = \nabla J(s_i, m_{i-1})$
      5:     $m_i = m_{i-1} - \eta_{i-1}\, g$
      6: end for
      7: return $m_n$
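The mini-batch SGD listing on this slide translates almost line for line into Python. This is a minimal sketch, assuming a one-parameter linear model with squared loss; the toy dataset, the constant learning rate, and the names `minibatch_sgd` and `squared_loss_grad` are illustrative choices, not taken from the talk.

```python
import random

def minibatch_sgd(dataset, grad, m0, n, lr, batch_size, seed=0):
    """Run n iterations of mini-batch SGD and return the final model m_n."""
    rng = random.Random(seed)
    m = m0                                    # step 1: initialize m_0
    for i in range(1, n + 1):                 # step 2: for i = 1 ... n
        s = rng.sample(dataset, batch_size)   # step 3: s_i = sample from D
        g = grad(s, m)                        # step 4: g = gradient of J(s_i, m_{i-1})
        m = m - lr(i - 1) * g                 # step 5: m_i = m_{i-1} - eta_{i-1} * g
    return m                                  # step 7: return m_n

def squared_loss_grad(batch, m):
    """Gradient of the mean of 0.5 * (m * x - y)**2 over the batch, w.r.t. m."""
    return sum((m * x - y) * x for x, y in batch) / len(batch)

# Toy data generated from y = 2x, so the model should approach 2.0.
data = [(i / 10, 2 * i / 10) for i in range(10)]
model = minibatch_sgd(data, squared_loss_grad, m0=0.0, n=500,
                      lr=lambda i: 0.5, batch_size=4)
```

Because each iteration only needs a sample and the previous model, a single iteration can be run in isolation, which is what makes it possible to schedule iterations one at a time during deployment.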
  • 35-36. Data Materializing
      [Diagram: Historical Training Data and Cache Layer → Sample → Materialize → Update Model; Materialize applies Parser → Missing Value Imputer → Standard Scaler → Feature Hasher using the stored Stats]
      Statistics precomputed during the Data Preparation Phase
  • 37. Evaluation
      Datasets   Size     #Instances   Initial   Deployment
      URL        2.1 GB   2.4 M        Day 0     Day 1-120
      Taxi       42 GB    280 M        Jan 15    Feb 15 – Jun 16
      URL Pipeline: Parser → Missing Value Imputer → Standard Scaler → Feature Hasher → SVM Model
      Taxi Pipeline: Parser → Feature Extractor → Anomaly Detector → Standard Scaler → Linear Regression Model
  • 38-39. Quality: Can Proactive Training provide the same level of quality as Retraining?
      [Figure: Cumulative Prequential Prediction Error Rate for the URL Pipeline During the Deployment. x-axis: day 1 to day 120; y-axis: Misclassification % (ticks 2.1 to 2.3); series: Proactive, Retraining, Online]
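The figure reports a cumulative prequential error rate. A minimal sketch of prequential (test-then-train) evaluation follows, assuming the usual definition in which each instance is first used for prediction and only then for training. The majority-label model and all names are hypothetical stand-ins for the deployed pipeline.

```python
from collections import Counter

def prequential_error(stream, predict, update):
    """Return the cumulative error rate after each instance of the stream."""
    errors = 0
    curve = []
    for seen, (x, y) in enumerate(stream, start=1):
        if predict(x) != y:   # test on the instance first
            errors += 1
        update(x, y)          # then train on it
        curve.append(errors / seen)
    return curve

# Toy model: always predict the most frequent label seen so far (0 if none).
label_counts = Counter()

def predict(x):
    return label_counts.most_common(1)[0][0] if label_counts else 0

def update(x, y):
    label_counts[y] += 1

stream = [(None, 1), (None, 1), (None, 0), (None, 1)]
curve = prequential_error(stream, predict, update)
```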
  • 40-42. Training Time: Can Proactive Training perform (almost) as efficiently as Online Learning?
      [Figure: Cumulative Training Time for the URL Pipeline During the Deployment. x-axis: day 1 to day 120; y-axis: Time (m), ticks 0 to 750; series: Proactive, Retraining, Online]
      Proactive Training provides the same level of model accuracy as Retraining, while matching the speed of Online Learning
  • 43-44. Feature Caching and Statistics Computation: What are the effects of Statistics Computation and Feature Caching?
      [Figure: Total Training Time in Presence of Statistics Computation and Feature Caching. Time (m): No Optimization 111, 0% Cached 91, 100% Cached 55]
      Statistics Computation and Feature Caching improve the performance of Proactive Training by a factor of 2
  • 45. Summary: Continuous Deployment Platform
      • Proactive Training, instead of Offline Retraining
      • Feature Caching
      • Online Statistics Computation
      • Reduces the total training time
      • Achieves high quality
      [Figures: platform workflow (Raw Data → Discretize → Raw Data Chunks → Preprocess → Preprocessed Features; Historical Training Data → Sample → Materialize → Update → Model; Scheduled Execution); Misclassification % (2.23 to 2.27) versus Time (m) (0 to 800) for Proactive, Retraining, Online]
  • 46. References
      1. D. Crankshaw, X. Wang, G. Zhou, M. Franklin, et al. 2016. Clipper: A Low-Latency Online Prediction Serving System. arXiv preprint arXiv:1612.03079 (2016).
      2. D. Crankshaw, P. Bailis, J. Gonzalez, H. Li, et al. 2014. The missing piece in complex analytics: Low latency, scalable model management and serving with Velox.
      3. D. Baylor, E. Breck, H. Cheng, N. Fiedel, et al. 2017. TFX: A TensorFlow-Based Production-Scale Machine Learning Platform. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1387–1395.
      4. L. Bottou. 2010. Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT'2010. Springer, 177–186.
      5. M. Zaharia, M. Chowdhury, M. Franklin, S. Shenker, and I. Stoica. 2010. Spark: cluster computing with working sets. HotCloud 10 (2010), 10–10.
      6. O. Chapelle. [n. d.]. NYC Taxi & Limousine Commission Trip Record Data. http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml. [Online; accessed 10-April-2018].
      7. J. Ma, L. Saul, S. Savage, and G. Voelker. 2009. Identifying suspicious URLs: an application of large-scale online learning. In Proceedings of the 26th annual international conference on machine learning. ACM, 681–688.
      8. D. Kingma and J. Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
      9. M. Zeiler. 2012. ADADELTA: an adaptive learning rate method. arXiv preprint arXiv:1212.5701 (2012).
      10. T. Tieleman and G. Hinton. 2012. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural networks for machine learning 4, 2 (2012), 26–31.
  • 47. Backup Slides
  • 48. Data Manager [Diagram: platform workflow]
      • Data Discretizing
      • Data Sampling
      • Historical Data Management
  • 49. Pipeline Manager [Diagram: platform workflow]
      • Data Preprocessing
      • Data Materialization
      • Pipeline Component Management
  • 50. Model Updater [Diagram: platform workflow]
      • Online Training
      • Proactive Training
  • 51. Scheduler [Diagram: platform workflow]
      • Schedule Proactive Training
  • 52. Related Work
      ■ TFX
          □ Manual Retraining
          □ No Online Learning
      ■ Velox
          □ Automatic Retraining
          □ Online Learning
      ■ Clipper
          □ No Retraining
          □ No Online Learning
          □ Ensemble of Models
  • 53. Proactive Training vs Periodical Retraining (Taxi)
      [Figure: Cumulative Prequential Prediction Error Rate for the Taxi Pipeline During the Deployment. x-axis: Feb 15 to June 16; y-axis: RMSLE (ticks 0.0960 to 0.0980); series: Proactive, Retraining, Online]
  • 54. Proactive Training vs Periodical Retraining (Taxi)
      [Figure: Cumulative Training Time for the Taxi Pipeline During the Deployment. x-axis: Feb 15 to June 16; y-axis: Time (m), ticks 0 to 1500; series: Proactive, Retraining, Online]
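The Taxi figures report RMSLE. The slides do not spell the metric out, so the sketch below assumes the standard definition of root mean squared logarithmic error, the square root of the mean of (log(1 + prediction) - log(1 + target))^2. The function name `rmsle` is an illustrative choice.

```python
from math import log1p, sqrt, e

def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error (standard definition, assumed here)."""
    squared = [(log1p(p) - log1p(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))

perfect = rmsle([10.0, 20.0], [10.0, 20.0])  # identical predictions
one_log_unit = rmsle([0.0], [e - 1.0])       # log(1 + p) - log(1 + t) = 1
```

Working on the log scale penalizes relative rather than absolute error, which suits targets such as trip durations that span several orders of magnitude.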
  • 55. Materialization and Statistics Computation (Taxi)
      [Figure: Time (m), 300 to 800, versus Cached Features Ratio (0.0, 0.2, 0.6, 1.0); series: TimeBased, Uniform, WindowBased, NoOptimization]
      Materialization Utilization Rate for different ratio of Cached Features (Taxi)
                       Ratio of Cached Features
      Sampling         m = 0.2    m = 0.6
      Uniform          0.51       0.90
      Window-based     0.57       1.0
      Time-based       0.65       0.97
      Materialization Utilization Rate: Ratio of preprocessed features that skipped the materialization step
  • 56. Materialization and Statistics Computation (URL)
      [Figure: Time (m), 60 to 110, versus Cached Features Ratio (0.0, 0.2, 0.6, 1.0); series: TimeBased, Uniform, WindowBased, NoOptimization]
      Materialization Utilization Rate for different ratio of Cached Features (URL)
                       Ratio of Cached Features
      Sampling         m = 0.2    m = 0.6
      Uniform          0.52       0.91
      Window-based     0.58       1.0
      Time-based       0.68       0.97
      Materialization Utilization Rate: Ratio of preprocessed features that skipped the materialization step
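The Materialization Utilization Rate defined on these slides can be computed as a simple hit ratio. This is a minimal sketch, assuming chunks are identified by their timestamps; `utilization_rate`, `sampled_ids`, and `cached_ids` are hypothetical names.

```python
def utilization_rate(sampled_ids, cached_ids):
    """Fraction of sampled chunks whose features were already in the cache,
    i.e. the chunks that skipped the materialization step."""
    cached = set(cached_ids)
    hits = sum(1 for t in sampled_ids if t in cached)
    return hits / len(sampled_ids)

# Four chunks were sampled; only the two most recent ones are cached.
rate = utilization_rate(sampled_ids=[1, 2, 3, 4], cached_ids=[3, 4])
```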
  • 57. Time vs Quality (Taxi)
      [Figure: RMSLE (0.0974 to 0.0976) versus Time (m) (0 to 2000); series: Proactive, Retraining, Online]
  • 58. Adaptation
      Effect of Learning Rate Adaptation and Regularization Parameter on Initial Training (Tuning)
                   URL                                Taxi
                   r = 1E-2   r = 1E-3   r = 1E-4     r = 1E-2   r = 1E-3   r = 1E-4
      Adam         0.030      0.026      0.035        0.09553    0.09551    0.09551
      RMSProp      0.030      0.027      0.034        0.09552    0.09552    0.09550
      Adadelta     0.029      0.028      0.034        0.09609    0.09610    0.09619
      [Figure: Effect of Learning Rate Adaptation during the Deployment. Panels: URL (Misclassification %, 2.10 to 2.35, day 1 to day 30) and Taxi (RMSLE, 0.085 to 0.095, Feb 15 to Apr 15); series: Adam, Rmsprop, Adadelta]
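For readers unfamiliar with the learning rate adaptation methods compared here, the following is a sketch of a single Adam update for one scalar parameter, following the update rule and default hyperparameters of Kingma and Ba. How the platform wires the optimizer into the pipeline is not shown on the slides; `adam_step` and the `state` dictionary are hypothetical names.

```python
from math import sqrt

def adam_step(theta, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter and return the new value."""
    state["t"] += 1
    # Exponential moving averages of the gradient and of its square.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialized averages.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return theta - lr * m_hat / (sqrt(v_hat) + eps)

state = {"t": 0, "m": 0.0, "v": 0.0}
theta = adam_step(1.0, grad=4.0, state=state, lr=0.1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly `lr` regardless of the gradient's magnitude.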
  • 59. Effect of Sampling on model quality
      [Figure: Effect of Sampling Method on the Error Rate. Panels: URL (Misclassification %, 2.10 to 2.30, day 1 to day 30) and Taxi (RMSLE, 0.085 to 0.095, Feb 15 to Apr 15); series: Timebased, Windowbased, Uniform]
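The slides name three sampling strategies without defining them. The sketch below shows one plausible reading, stated as an assumption: uniform sampling draws from all historical chunks, window-based sampling draws only from the most recent `window` chunks, and time-based sampling weights chunks by recency. The linear recency weights and all function names are illustrative choices.

```python
import random

def uniform_sample(chunk_ids, k, rng):
    """Every historical chunk is equally likely (without replacement)."""
    return rng.sample(chunk_ids, k)

def window_sample(chunk_ids, k, window, rng):
    """Sample only from the `window` most recent chunks (without replacement)."""
    return rng.sample(chunk_ids[-window:], k)

def time_based_sample(chunk_ids, k, rng):
    """Favor recent chunks: weight grows linearly with position (assumption).
    Uses sampling with replacement, so a chunk may appear more than once."""
    weights = list(range(1, len(chunk_ids) + 1))
    return rng.choices(chunk_ids, weights=weights, k=k)

rng = random.Random(0)
chunk_ids = list(range(10))  # ordered from oldest (0) to newest (9)

uniform = uniform_sample(chunk_ids, 3, rng)
windowed = window_sample(chunk_ids, 3, window=4, rng=rng)
recent_biased = time_based_sample(chunk_ids, 3, rng)
```

Strategies that favor recent chunks are also more likely to hit the feature cache, which is consistent with the higher utilization rates reported for window-based and time-based sampling.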
  • 60. Materialization Process
  • 61. Ads CTR Use Case Figure