Machine Learning in Production

Tutorial for Machine Learning 101 (an all-day tutorial at Strata + Hadoop World, New York City, 2015)

The course is designed to introduce machine learning via real applications, such as building a recommender and image analysis using deep learning.

In this talk we cover deployment of machine learning models.

1. Machine Learning in Production
   Krishna Sridhar (@krishna_srd)
   Data Scientist, Dato Inc.
2. About Me
   • Background
     - Machine Learning (ML) research.
     - Ph.D. in numerical optimization @Wisconsin.
   • Now
     - Build ML tools for data scientists & developers @Dato.
     - Help deploy ML algorithms.
   @krishna_srd, @DatoInc
3. Overview
   • Lots of fundamental problems to tackle.
   • A blend of statistics, applied ML, and software engineering.
   • The space is new, so there is lots of room for innovation!
   • Understanding production helps you make better modeling decisions.
4. What is an ML app?
5. Why production?
6. Why production?
   • Share: make your predictions available to everyone.
   • Review: measure the quality of the predictions over time.
   • React: improve prediction quality with feedback.
7. ML in Production - 101
   Creation: Historical Data → Trained Model.
   Production: Deployed Model + Live Data → Predictions.
8. What is Production?
   • Deployment: making model predictions easily available.
   • Evaluation: measuring the quality of deployed models.
   • Monitoring: tracking model quality over time.
   • Management: improving deployed models with feedback.
9. What is Production? [Quadrant diagram: Deployment, Evaluation, Monitoring, Management.]
10. Deployment
11. What is Deployment? [Quadrant diagram: Deployment, Evaluation, Monitoring, Management.]
12. ML in Production - 101
    Creation: Historical Data → Trained Model.
    Production: Deployed Model + Live Data → Predictions.
13. What are we deploying?

        def predict(data):
            data['is_good'] = data['rating'] > 3
            return model.predict(data)

    Advantages
    • Flexibility: no need for complicated abstractions.
    • Software deployment is a very mature field.
    • Rapid model updating with continuous deployments.
    Treat model deployment the same way as code deployment!
14. What are we deploying? The same prediction logic, in Python, Scala, and R:

        # Python
        def predict(data):
            data['is_good'] = data['rating'] > 3
            return model.predict(data)

        // Scala
        def predict(data): Double = {
          data("is_good") = data("rating") > 3
          model.predict(data)
        }

        # R
        predict <- function(data) {
          data$is_good <- data$rating > 3
          predict(model, data)
        }
15. What's the challenge? The "wall of confusion":
    • Data scientists: "Beat baseline by 15%. Time to deploy!"
    • Deployment engineers: "What the **** are alpha and beta?"
16. What's the solution? Both sides share one goal:
    • Data scientists: "Beat baseline by 15%. Time to deploy!"
    • Deployment engineers: "Beat baseline by 15%!"
17. Deployment - Demo
18. Deploying ML: Requirements
    1. Ease of integration: any code, any language.
    2. Low-latency predictions: cache frequent predictions.
    3. Fault tolerant: replicate models, run on many machines.
    4. Scalable: elastically scale nodes up or down.
    5. Maintainable: easily update with newer models.
19. Deploying ML
    [Architecture diagram: a client talks to a load balancer, which routes to nodes 1-3; each node runs a web service with a model and a prediction cache. A sketch of one such node follows.]
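A minimal sketch of one serving node from the slide-19 diagram, assuming Flask and a pickled scikit-learn-style model; the file name, endpoint, and payload shape are illustrative, not from the talk.

    # One node behind the load balancer: a web service that answers
    # prediction requests and caches frequent ones (requirement 2).
    import pickle
    from functools import lru_cache
    from flask import Flask, request, jsonify

    app = Flask(__name__)
    with open("model.pkl", "rb") as f:   # hypothetical model artifact
        model = pickle.load(f)

    @lru_cache(maxsize=10_000)           # cache frequent predictions
    def cached_predict(features):        # features: hashable tuple
        return float(model.predict([list(features)])[0])

    @app.route("/predict", methods=["POST"])
    def predict():
        features = tuple(request.get_json()["features"])
        return jsonify({"prediction": cached_predict(features)})

    if __name__ == "__main__":
        app.run(port=5000)

Requirements 3-5 live outside the node: the load balancer replicates identical nodes for fault tolerance and scale, and shipping a new model.pkl through the usual code-deployment pipeline updates the fleet.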
20. Evaluation
21. What is Evaluation? [Quadrant diagram: Deployment, Evaluation, Monitoring, Management.]
22. What is Evaluation? Evaluation = Predictions + Metric.
23. Which metric? Model evaluation metric != business metric.
    • Model metrics: precision-recall, DCG, NDCG.
    • Business metrics: user engagement, click-through rate.
    Track both ML and business metrics to see if they correlate (see the sketch below)!
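A minimal sketch of the slide-23 tip: given per-week values of an offline ML metric and an online business metric, check how strongly they move together. The numbers are invented for illustration.

    import numpy as np

    ndcg_by_week = np.array([0.61, 0.63, 0.60, 0.67, 0.70, 0.72])       # offline
    ctr_by_week = np.array([0.031, 0.033, 0.030, 0.036, 0.039, 0.041])  # online

    r = np.corrcoef(ndcg_by_week, ctr_by_week)[0, 1]
    print(f"Pearson correlation, NDCG vs. CTR: {r:.2f}")
    # If r stays high, improving the offline metric likely improves the
    # business metric; if it drifts toward 0, rethink what you optimize.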
24. Evaluating Models
    • Offline evaluation: Historical Data → Trained Model.
    • Online evaluation: Live Data → Deployed Model → Predictions.
25. Monitoring & Management
26. Monitoring & Management? [Quadrant diagram: Deployment, Evaluation, Monitoring, Management.]
27. Monitoring & Management? Monitoring: tracking metrics over time. Management: reacting to feedback from deployed models.
28. Monitoring & Management
    [Diagram: Historical Data → Trained Model → Deployed Model; Live Data → Predictions, with a Feedback loop back into the model.]
29. Monitoring & Management
    Important for software engineering:
    - Versioning.
    - Logging.
    - Provenance.
    - Dashboards.
    - Reports.
    Interesting for applied-ML researchers:
    - Updating models.
30. Updating models
    When to update?
    • Trends and user tastes change over time.
      - I liked R in the past, but now I like Python!
      - Tip: track statistics about the data over time (see the sketch below).
    • Model performance drops.
      - CTR was down 20% last month.
      - Tip: monitor both offline and online metrics, and track their correlation!
    How to update?
    • A/B testing.
    • Multi-armed bandits.
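A minimal sketch of the "track statistics about the data over time" tip: compare a recent window of one feature against a reference window and flag drift. The data and alert threshold are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)
    reference = rng.normal(loc=0.0, scale=1.0, size=10_000)  # e.g. last month
    this_week = rng.normal(loc=0.4, scale=1.0, size=2_000)   # incoming data

    # Standardized mean shift: a crude but serviceable drift signal.
    shift = abs(this_week.mean() - reference.mean()) / reference.std()
    if shift > 0.25:  # hypothetical alert threshold
        print(f"Feature drift detected (shift = {shift:.2f}); consider retraining.")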
31. A/B testing: is model V2 significantly better than model V1?
    • A: Model V1, 2000 visits, 10% CTR.
    • B: Model V2, 2000 visits, 30% CTR.
    The winner (V2) is what the world gets. Be really careful with A/B testing.
32. Multi-armed Bandits
    Same setup (V1: 2000 visits, 10% CTR; V2: 2000 visits, 30% CTR), but traffic shifts adaptively:
    • Exploration: 10% of the time, try the candidate models.
    • Exploitation: 90% of the time, the world gets V2, the current best.
    Result: 36k visits at 30% CTR.
33. Multi-armed Bandits
34. MAB vs. A/B Testing
    Why MAB?
    • A "set and forget" approach for continuous optimization.
    • Minimize your losses.
    • Good MAB algorithms converge very quickly!
    Why A/B testing?
    • Easy and quick to set up!
    • Answers relevant business questions.
    • Sometimes it can take a while before you observe results.
    (A toy bandit sketch follows.)
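A toy epsilon-greedy bandit matching the 10% exploration / 90% exploitation split on slide 32. Epsilon-greedy is one simple MAB strategy; the talk does not name a specific algorithm, and the click probabilities below are simulated.

    import random

    true_ctr = {"V1": 0.10, "V2": 0.30}   # unknown to the algorithm
    clicks = {m: 0 for m in true_ctr}
    serves = {m: 0 for m in true_ctr}

    def choose(epsilon=0.10):
        if random.random() < epsilon or not all(serves.values()):
            return random.choice(list(true_ctr))                  # explore
        return max(serves, key=lambda m: clicks[m] / serves[m])   # exploit

    for _ in range(40_000):
        model = choose()
        serves[model] += 1
        clicks[model] += random.random() < true_ctr[model]  # simulated click

    for m in true_ctr:
        print(m, serves[m], "visits,", f"{clicks[m] / serves[m]:.1%} CTR")
    # Most traffic flows to V2, so overall CTR approaches 30% while losses
    # on V1 are capped at roughly epsilon of the traffic.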
35. Conclusion (@krishna_srd, @DatoInc)
    • ML in production can be fun! Lots of new challenges in deployment, evaluation, monitoring, and management.
    • Summary of tips:
      - Try to run the same code in modeling & deployment mode.
      - Business metric != model metric.
      - Monitor offline and online behavior; track their correlation.
      - Be really careful with A/B testing.
      - Minimize your losses with multi-armed bandits!
36. Thanks!
    Download: pip install graphlab-create
    Docs: https://dato.com/learn/
    Source: https://github.com/dato-code/tutorials
37. Thank you!
38. Backup
39. When/how to evaluate ML
    • Offline evaluation
      - Evaluate on historical labeled data.
      - Make sure you evaluate on a test set (see the sketch below)!
    • Online evaluation
      - A/B testing: split off a portion of incoming requests (B) to evaluate the new deployment; use the rest as the control group (A).
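A minimal sketch of the offline-evaluation reminder, using scikit-learn on synthetic data (an assumption; the talk's demos use GraphLab Create): hold out a test set and score only on it.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=5_000, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)
    # Score on held-out data only; training-set scores are optimistically biased.
    print("test AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))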
40. ML Deployment - 2
    [Diagram: Historical data feeds a Prototype model; new requests flow through a Deployed model to Predictions; an Online adaptive model updates from live traffic.]
41. Online Learning
    • Benefits
      - Computationally faster and more efficient.
      - Deployment and training are the same (see the sketch below)!
    • Key challenges
      - How do we maintain distributed state?
      - Do standard algorithms need to change in order to be more deployment friendly?
      - How much should the model "forget"?
      - Tricky to evaluate.
    • Simple ideas that work
      - Split the model space so that the state of each model fits on a single machine.
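A minimal sketch of "deployment and training are the same": the service predicts on each request and immediately learns from any observed label. scikit-learn's partial_fit is one concrete option for the incremental update; the talk does not prescribe a library.

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    model = SGDClassifier()
    model.partial_fit(np.zeros((1, 3)), [0], classes=[0, 1])  # initialize

    def handle_request(features, observed_label=None):
        x = np.asarray(features).reshape(1, -1)
        prediction = int(model.predict(x)[0])       # serve the prediction
        if observed_label is not None:              # feedback arrives later
            model.partial_fit(x, [observed_label])  # one-sample online update
        return prediction

    print(handle_request([0.5, -1.2, 3.0], observed_label=1))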
42. A/B testing
    [Figure: two "happy Gaussian" distributions of click-through rate, one per group, with variances A and B.]
43. Running an A/B test
    As easy as alpha, beta, gamma, delta.
    • Procedure (see the sketch below)
      - Pick a significance level α.
      - Compute the test statistic.
      - Compute the p-value (the probability of the test statistic under the null hypothesis).
      - Reject the null hypothesis if the p-value is less than α.
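A minimal sketch of that procedure as a two-proportion z-test on the slide-31 numbers (2000 visits per arm, 10% vs. 30% CTR):

    from math import sqrt
    from scipy.stats import norm

    clicks_a, n_a = 200, 2000   # model V1: 10% CTR
    clicks_b, n_b = 600, 2000   # model V2: 30% CTR
    alpha = 0.05                # significance level

    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * norm.sf(abs(z))   # two-sided

    print(f"z = {z:.1f}, p-value = {p_value:.2g}")
    if p_value < alpha:
        print("Reject the null hypothesis: V2's CTR differs from V1's.")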
44. How long to run the test?
    • Run the test until you see a significant difference?
      - Wrong! Don't do this.
    • Statistical tests directly control the false positive rate (significance).
      - With probability 1 - α, Population 1 is different from Population 0.
    • The statistical power of a test controls the false negative rate.
      - How many observations do I need to discern a difference of δ between the means with power 0.8 and significance 0.05?
    • Determine how many observations you need before you start the test (see the sketch below).
      - Pick the power β, significance α, and magnitude of difference δ.
      - Calculate n, the number of observations needed.
      - Don't stop the test until you've made this many observations.
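A minimal sketch of the sample-size step, using statsmodels (an assumption; any power calculator works): how many observations per group are needed to detect a hypothetical CTR lift from 10% to 13% with power 0.8 at α = 0.05?

    from statsmodels.stats.power import NormalIndPower
    from statsmodels.stats.proportion import proportion_effectsize

    effect = proportion_effectsize(0.13, 0.10)  # Cohen's h for the two CTRs
    n = NormalIndPower().solve_power(effect_size=effect, power=0.8, alpha=0.05)
    print(f"Need about {n:.0f} observations per group before stopping the test.")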
45. Separation of experiences
    How well did you split off group B?
    [Figure: group A sees the Homepage, group B the New homepage; both flows lead to the same Second page with the same Buttons.]
46. Separation of experiences
    How well did you split off group B?
    [Figure: the same flow as slide 45, annotated "Unclean separation of experiences!"]
47. Shock of newness
    • People hate change.
    • "Why is my button now blue??"
    • Wait until the "shock of newness" wears off, then measure.
    • Some population of users is forever wedded to the old ways.
    • Consider obtaining a fresh population.
    [Figure: click-through rate dips after the change at t0, then recovers: the shock of newness.]
48. Deploying ML: Requirements (same as slide 18).
49. What are we deploying? (same Python, Scala, and R code as slide 14).
50. Deploying ML (same architecture diagram as slide 19).