Today, machine learning is entering many business and scientific applications. The life cycle of a machine learning application consists of preprocessing the raw data into features, training a model on those features, and deploying the model to answer prediction queries. To guarantee accurate predictions, the deployed model and pipeline must be continuously monitored and updated. Current deployment platforms update the model using online learning methods; when online learning alone cannot guarantee prediction accuracy, some platforms provide a mechanism for automatic or manual retraining of the model. While online training is fast, retraining the model is time-consuming and adds extra overhead and complexity to the deployment process.
We propose a novel continuous deployment approach that updates the deployed model using a combination of the incoming real-time data and the historical data. We utilize sampling techniques to include the historical data in the training process, eliminating the need to retrain the deployed model. We also offer online statistics computation and dynamic materialization of the preprocessed features, which further reduce the total training and data preprocessing time. In our experiments, we design and deploy two pipelines and models to process two real-world datasets. The experiments show that continuous deployment reduces the total training cost by up to a factor of 15 while providing the same level of quality as state-of-the-art deployment approaches.
Link to the Paper: https://openproceedings.org/2019/conf/edbt/EDBT19_paper_23.pdf
Continuous Deployment of Machine Learning Pipelines

Behrouz Derakhshan behrouz.derakhshan@dfki.de
Alireza Rezaei Mahdiraji alireza.rm@dfki.de
Tilmann Rabl rabl@tu-berlin.de
Volker Markl volker.markl@tu-berlin.de

Database Systems and Information Management (DIMA) Group, TU Berlin
https://www.dima.tu-berlin.de/
Intelligent Analytics for Massive Data, German Research Center for Artificial Intelligence (DFKI)
https://www.dfki.de/
Life Cycle of Machine Learning Applications

■ The life cycle of an ML application does not end with training
■ Models and pipelines must be deployed to answer prediction queries
■ Deployed models and pipelines should be monitored and trained further (the focus of this talk)
Improvement by Online Learning

[Diagram: (1) Training: data preprocessing and model training over the training data; (2) Deployment: the trained pipeline and model are moved to the deployment platform; (3) Query Answering: the deployed model answers prediction queries; (4) Online Learning: incoming training data updates the deployed model in place.]

• Efficient and fast
• Cannot guarantee high-quality models
Improvement by Retraining

[Diagram: the same loop as before, (1) Training, (2) Deployment, (3) Query Answering, (4) Online Learning, extended with (5) Gather Training Data, where new training data is appended to the historical training data, and (6) Retraining, where the pipeline and model are periodically retrained offline on the full historical data.]

• Provides high-quality models
• Time-consuming and resource-intensive
• Out-of-core
Ideal Deployment Platform

Can a platform provide the same level of quality as retraining and perform (almost) as efficiently as online learning?
Continuous Deployment Platform

• Train the model inside the platform
• Compute features and cache them
• Update data preprocessing statistics
• Replace retraining with proactive training

[Diagram: the continuous deployment platform hosts the pipeline (data preprocessing and model), a cache of computed features together with their statistics, and the historical training data; both online learning and proactive training update the deployed model while it answers prediction queries. A sketch of the data-arrival path follows.]
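A minimal sketch of the data-arrival path under this design. The `pipeline`, `model`, `cache`, and `stats` objects and their methods are hypothetical placeholders, not the paper's API:

```python
# Sketch of how a new chunk of training data flows through the platform
# (hypothetical names; the real implementation differs).

def on_new_data(raw_chunk, pipeline, model, cache, stats):
    stats.update(raw_chunk)                     # update preprocessing statistics online
    features = pipeline.transform(raw_chunk, stats)
    cache.put(raw_chunk.timestamp, features)    # cache the computed features for reuse
    model.partial_fit(features)                 # online learning step

def answer_query(query, pipeline, model, stats):
    # Prediction queries are preprocessed with the current statistics.
    return model.predict(pipeline.transform(query, stats))
```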
Continuous Deployment Platform: Scheduled Execution

[Diagram: raw data is discretized into raw data chunks and preprocessed into feature chunks (the preprocessed features), which are stored alongside the historical training data; on each scheduled execution, the platform samples chunks, materializes their features, and updates the model. The flow splits into a Data Preparation Phase (Discretize → Preprocess) and a Proactive Training Phase (Sample → Materialize → Update).]
Data Preparation Phase

[Diagram: incoming raw data is discretized into time-based raw data chunks (t0, t1, t2, t3, …, tn). Each chunk is preprocessed by the pipeline components (Parser → Missing Value Imputer → Standard Scaler → Feature Hasher) into feature chunks, which are stored in the cache layer. As each chunk passes through, the stateful components (e.g., the Standard Scaler) update and store their statistics; a sketch of one way to maintain such statistics online follows.]
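The slides do not spell out how the statistics are maintained. A standard choice for running per-feature mean and variance is Welford's online algorithm, sketched below as an assumption rather than the paper's exact method:

```python
import numpy as np

class OnlineScalerStats:
    """Running per-feature mean/variance (Welford's algorithm), so a
    Standard Scaler never has to rescan historical chunks."""
    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros(dim)   # running sum of squared deviations

    def update(self, chunk):
        # chunk: array of shape (rows, dim) holding one raw data chunk
        for x in chunk:
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)

    def std(self):
        return np.sqrt(self.m2 / max(self.n - 1, 1))
```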
Storage Requirement

[Diagram: the cache layer cannot hold feature chunks for the entire history (t0 … tn), so older cached features are removed; the raw historical training data is kept and can be re-preprocessed on demand. A sketch of such a bounded cache follows.]
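A sketch of a bounded cache layer that evicts the oldest feature chunks first. The oldest-first policy is illustrative, not necessarily the paper's choice; evicted chunks can always be re-preprocessed from raw data:

```python
from collections import OrderedDict

class FeatureCache:
    """Bounded cache of materialized feature chunks, keyed by chunk time."""
    def __init__(self, max_chunks):
        self.max_chunks = max_chunks
        self.chunks = OrderedDict()

    def put(self, t, features):
        self.chunks[t] = features
        if len(self.chunks) > self.max_chunks:
            self.chunks.popitem(last=False)   # drop the oldest cached features

    def get(self, t):
        return self.chunks.get(t)             # None if evicted
```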
Proactive Training Phase

[Diagram: on each scheduled execution, the platform samples chunks from the historical training data, materializes the corresponding features via the cache layer, and updates the model.]

Mini-batch SGD algorithm (a Python rendering follows):
    Input: $D$ = training dataset
    Output: $m$ = trained model
    1: initialize $m_0$
    2: for $i = 1 \dots n$ do
    3:     $s_i = \text{sample from } D$
    4:     $g = \nabla J(s_i, m_{i-1})$
    5:     $m_i = m_{i-1} - \eta_{i-1}\, g$
    6: end for
    7: return $m_n$
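A direct Python rendering of the pseudocode above. The `sample_chunks`, `materialize`, `grad_J`, and `lr` callables are placeholders for the platform's own sampling, materialization, gradient, and learning-rate steps:

```python
def proactive_training(m0, sample_chunks, materialize, grad_J, lr, n_iters):
    """One proactive-training run: mini-batch SGD over sampled chunks."""
    m = m0                                   # 1: initialize m_0
    for i in range(1, n_iters + 1):          # 2: for i = 1 ... n
        s_i = materialize(sample_chunks())   # 3: s_i = sample from D
        g = grad_J(s_i, m)                   # 4: g = grad J(s_i, m_{i-1})
        m = m - lr(i) * g                    # 5: m_i = m_{i-1} - eta_{i-1} g
    return m                                 # 7: return m_n
```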
Data Materializing

[Diagram: during proactive training, sampled chunks whose features are still in the cache layer are used directly; evicted chunks are re-preprocessed (Parser → Missing Value Imputer → Standard Scaler → Feature Hasher) using the statistics precomputed during the Data Preparation Phase, so no component has to rescan the full history. A sketch follows.]
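A sketch of the materialization step, reusing the cache and stats objects sketched earlier (all names are placeholders):

```python
def materialize(sampled_times, cache, raw_store, pipeline, stats):
    """Assemble features for the sampled chunks: reuse cached feature
    chunks where possible, otherwise re-preprocess the raw chunk with
    the statistics precomputed during the Data Preparation Phase."""
    features = []
    for t in sampled_times:
        cached = cache.get(t)
        if cached is not None:
            features.append(cached)                     # skipped materialization
        else:
            raw = raw_store.load(t)
            features.append(pipeline.transform(raw, stats))
    return features
```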
Evaluation

Dataset | Size   | #Instances | Initial Training | Deployment
URL     | 2.1 GB | 2.4 M      | Day 0            | Day 1–120
Taxi    | 42 GB  | 280 M      | Jan 15           | Feb 15 – Jun 16

URL pipeline:  Parser → Missing Value Imputer → Standard Scaler → Feature Hasher → SVM model
Taxi pipeline: Parser → Feature Extractor → Anomaly Detector → Standard Scaler → Linear Regression model
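For intuition only, a rough scikit-learn analogue of the numeric part of the URL pipeline; the paper uses its own pipeline components, and the Parser and Feature Hasher stages (which map raw URL logs to fixed-width vectors) are assumed to run before this:

```python
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDClassifier

# Hinge loss makes SGDClassifier a linear SVM trained with SGD, which also
# supports partial_fit and hence online updates.
url_pipeline = make_pipeline(
    SimpleImputer(strategy="mean"),   # Missing Value Imputer
    StandardScaler(),                 # Standard Scaler
    SGDClassifier(loss="hinge"),      # SVM model
)
```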
Quality

Can proactive training provide the same level of quality as retraining?

[Figure: cumulative prequential misclassification rate (%) for the URL pipeline during deployment, day 1 to day 120 (y-axis roughly 2.1–2.3%), comparing Proactive, Retraining, and Online.]
Training Time

Can proactive training perform (almost) as efficiently as online learning?

[Figure: cumulative training time (minutes, 0–750) for the URL pipeline during deployment, day 1 to day 120, comparing Proactive, Retraining, and Online.]

Proactive training provides the same level of model accuracy as retraining, while matching the speed of online learning.
Feature Caching and Statistics Computation

What are the effects of statistics computation and feature caching?

[Figure: total training time in the presence of statistics computation and feature caching: 111 minutes with no optimization, 91 minutes with 0% of features cached, 55 minutes with 100% cached.]

Statistics computation and feature caching improve the performance of proactive training by a factor of 2.
Summary

Continuous Deployment Platform
• Proactive training instead of offline retraining
• Feature caching
• Online statistics computation
• Reduces the total training time
• Achieves high quality

[Figure: the platform architecture (Discretize → Preprocess → Materialize → Sample → Update under scheduled execution) and a misclassification % vs. time plot (roughly 2.23–2.27% over 0–800 minutes) comparing Proactive, Retraining, and Online.]
References

1. D. Crankshaw, X. Wang, G. Zhou, M. Franklin, et al. 2016. Clipper: A Low-Latency Online Prediction Serving System. arXiv preprint arXiv:1612.03079 (2016).
2. D. Crankshaw, P. Bailis, J. Gonzalez, H. Li, et al. 2014. The missing piece in complex analytics: Low latency, scalable model management and serving with Velox.
3. D. Baylor, E. Breck, H. Cheng, N. Fiedel, et al. 2017. TFX: A TensorFlow-Based Production-Scale Machine Learning Platform. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1387–1395.
4. L. Bottou. 2010. Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT'2010. Springer, 177–186.
5. M. Zaharia, M. Chowdhury, M. Franklin, S. Shenker, and I. Stoica. 2010. Spark: cluster computing with working sets. HotCloud 10 (2010), 10–10.
6. O. Chapelle. [n. d.]. NYC Taxi & Limousine Commission Trip Record Data. http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml. [Online; accessed 10-April-2018].
7. J. Ma, L. Saul, S. Savage, and G. Voelker. 2009. Identifying suspicious URLs: an application of large-scale online learning. In Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 681–688.
8. D. Kingma and J. Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
9. M. Zeiler. 2012. ADADELTA: an adaptive learning rate method. arXiv preprint arXiv:1212.5701 (2012).
10. T. Tieleman and G. Hinton. 2012. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning 4, 2 (2012), 26–31.
Data Manager
• Data discretizing
• Data sampling
• Historical data management

Pipeline Manager
• Data preprocessing
• Data materialization
• Pipeline component management

Model Updater
• Online training
• Proactive training

Scheduler
• Schedules proactive training (a sketch of the scheduler loop follows)
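A minimal sketch of a fixed-interval scheduler, assuming a hypothetical `model_updater` object exposing a `proactive_train` method; a real scheduler could also adapt the interval:

```python
import time

def run_scheduler(interval_s, model_updater):
    """Trigger proactive training at a fixed interval (illustrative)."""
    while True:
        time.sleep(interval_s)
        model_updater.proactive_train()   # sample -> materialize -> update
```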
Related Work

■ TFX
□ Manual retraining
□ No online learning
■ Velox
□ Automatic retraining
□ Online learning
■ Clipper
□ No retraining
□ No online learning
□ Ensemble of models
Materialization and Statistics Computation (Taxi)

[Figure: total time (minutes, roughly 300–800) vs. ratio of cached features (0.0, 0.2, 0.6, 1.0) for TimeBased, Uniform, and WindowBased sampling, plus NoOptimization.]

Materialization utilization rate for different ratios of cached features (Taxi):

Sampling     | m = 0.2 | m = 0.6
Uniform      |  0.51   |  0.90
Window-based |  0.57   |  1.00
Time-based   |  0.65   |  0.97

Materialization utilization rate: the ratio of preprocessed features that skipped the materialization step.
Materialization and Statistics Computation (URL)

[Figure: total time (minutes, roughly 60–110) vs. ratio of cached features (0.0, 0.2, 0.6, 1.0) for TimeBased, Uniform, and WindowBased sampling, plus NoOptimization.]

Materialization utilization rate for different ratios of cached features (URL):

Sampling     | m = 0.2 | m = 0.6
Uniform      |  0.52   |  0.91
Window-based |  0.58   |  1.00
Time-based   |  0.68   |  0.97
Time vs Quality (Taxi)

[Figure: RMSLE (roughly 0.0974–0.0976) vs. time (0–2000 minutes) comparing Proactive, Retraining, and Online.]
Adaptation

Effect of learning-rate adaptation and regularization parameter on initial training (tuning):

         |           URL                |           Taxi
         | r = 1E-2  r = 1E-3  r = 1E-4 | r = 1E-2  r = 1E-3  r = 1E-4
Adam     |  0.030     0.026     0.035   | 0.09553   0.09551   0.09551
RMSProp  |  0.030     0.027     0.034   | 0.09552   0.09552   0.09550
Adadelta |  0.029     0.028     0.034   | 0.09609   0.09610   0.09619

[Figure: effect of learning-rate adaptation during deployment: URL misclassification % (roughly 2.10–2.35, day 1 to day 30) and Taxi RMSLE (roughly 0.085–0.095, Feb 15 to Apr 15) for Adam, RMSProp, and Adadelta.]