Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.
Models in Minutes not Months:
Data Science as Microservices
QCon 2017
saerni@salesforce.com
@itweetsarah
​Sarah Aerni, Ein...
InfoQ.com: News & Community Site
• Over 1,000,000 software developers, architects and CTOs read the site world-
wide every...
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practi...
LIVE DEMO
Agenda
​BUILDING AI APPS: Perspective Of A Data Scientist
• Journey to building your first model
• Barriers to production ...
ENABLING DATA SCIENCE
​A DATA SCIENTISTS VIEW OF BUILDING MODELS
Access and
Explore Data
Engineer
Features and
Build Models
Interpret Model
Results and
Accuracy
A data scientist’s view of...
Access and
Explore Data
Engineer
Features and
Build Models
Interpret Model
Results and
Accuracy
A data scientist’s view of...
Access and
Explore Data
Engineer
Features and
Build Models
Interpret Model
Results and
Accuracy
#Leads
Created Date
Access and
Explore Data
Engineer
Features and
Build Models
Interpret Model
Results and
Accuracy
Engineer Features
Empty fi...
Access and
Explore Data
Engineer
Features and
Build Models
Interpret Model
Results and
Accuracy​>>> from sklearn import sv...
Access and
Explore Data
Engineer
Features and
Build Models
Interpret Model
Results and
Accuracy
Geographies
#Leads
Access and
Explore Data
Engineer
Features and
Build Models
Interpret Model
Results and
Accuracy
Access and
Explore Data
Engineer
Features and
Build Models
Interpret Model
Results and
Accuracy
Fresh Data Input
Delivery ...
Bringing a Model to Production Requires a Team
​Applications deliver predictions for
customer consumption
​Predictions are...
Data Engineers
Provide data access and management
capabilities for data scientists
Set up and monitor data pipelines
Impro...
Supporting a Model in Production is Complex
Only a small fraction of real-world ML systems is a composed of ML code, as
sh...
MODELS IN PRODUDCTION
​WHAT IT TAKES TO DEPLOY AN AI-POWERED
APPLICATION
Supporting Models in Production is Mostly NOT AI
Only a small fraction of real-world ML systems
is a composed of ML code, ...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Why Data Services are Critical
…
DataConnector
Data Hub
Object Store in
Blob Storage
Catalog
AccessControl
Data Registry
A...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Monitoring your AI’s health like any other app
​Pipelines, Model Performance, Scores – Invest your time where it is needed...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
Why Data Services are Critical
…
DataConnector
Data Hub
Object Store in
Blob Storage
Catalog
AccessControl
Data Registry
A...
Configuration
Data Collection
Data Verification
Machine Resource
Management
Monitoring
Serving
Infrastructure
Feature
Extr...
c
Executor Service
Monitoring Tool
Data Puller
Data
Preparator
Data
Pusher
Scheduler
Model
Management
Service
Control Syst...
Why Stop at Microservices for Supporting Your ML Code?
Why stop here?
Your ML code can also be just a
collection of micros...
Auto Machine Learning
​Building reusable ML code
Leveraging Platform Services to Easily Deploy 1000s of Apps
Data Scientists on App #1
Leveraging Platform Services to Easily Deploy 1000s of Apps
Data Scientists on App #2Data Scientists on App #1
Let’s Add a Third App
Data Scientists on App #2 Data Scientists on App #3Data Scientists on App #1
How This Process Would Look in Salesforce
150,000 customers
​LeadIQ App ​Activity App ​Predictive
Journeys App
​App #5 ​Ap...
Einstein’sNewApproachtoAI
Democratizing AI for Everyone
Classical
Approach
Data
Sampling
Feature
Selection
Model
Selection...
Numerical BucketsCategorical Variables Text Fields
​AutoML for feature engineering
Repeatable Elements in Machine Learning...
Categorical Variables
​AutoML for feature engineering
Repeatable Elements in Machine Learning Pipelines
1 0 0
1 0 0
0 0 1
...
Text Fields
​AutoML for feature engineering
Repeatable Elements in Machine Learning Pipelines
Word Count
Word Count (no
st...
Numerical Buckets
​AutoML for feature engineering
Repeatable Elements in Machine Learning Pipelines
What Now? How autoML can choose your model
​>>> from sklearn import svm
>>> from numpy import loadtxt as l, random as r
>>...
Model 1
83% accuracy
Model 3
73% accuracy
Model 4
89% accuracy
…
Model 1 Model 2
Model 3 Model 4
…
MODEL GENERATION MODEL ...
A tournament of models!
…
Model 1 Model 2
Model 3 Model 4
…
CUSTOMER B
Customer ID
Age
Age Group
Gender
Valid Address
1 2 ...
Deploy Monitors, Monitor, Repeat!
Sample Dashboard on Simulated Data
​134
Models in
Production
​215
Models Trained
(curr.m...
Deploy Monitors, Monitor, Repeat!
​Pipelines, Model Performance, Scores – Invest your time where it is needed!
Sample Dash...
Deploy Monitors, Monitor, Repeat!
Sample Dashboard on Simulated Data
​134
Models in
Production
​215
Models Trained
(curr.m...
Key Takeaways
• Deploying machine learning in production is hard
• Platforms are critical for enabling data scientist prod...
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
einstein-platform-ai
Getting Data Science to Production
Prochain SlideShare
Chargement dans…5
×

Getting Data Science to Production

1 361 vues

Publié le

Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/2D86eWM.

Sarah Aerni covers the nuts and bolts of the Einstein Platform, a system that enables the automation and scaling of Artificial Intelligence to 1000s of customers, each with multiple models. The data ingestion, automated machine learning, instrumentation and intelligent monitoring and alerting make it possible to serve the varied needs of many different businesses. Filmed at qconsf.com.

Sarah Aerni is a Senior Manager of Data Science at Salesforce, where she leads teams building AI-powered applications across the Salesforce platform. Prior to Salesforce she led the healthcare & life science and Federal teams at Pivotal. She also co-founded a company offering expert services in informatics to both academia and industry.

Publié dans : Technologie
  • Soyez le premier à commenter

Getting Data Science to Production

  1. 1. Models in Minutes not Months: Data Science as Microservices QCon 2017 saerni@salesforce.com @itweetsarah ​Sarah Aerni, Einstein Platform
  2. 2. InfoQ.com: News & Community Site • Over 1,000,000 software developers, architects and CTOs read the site world- wide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ einstein-platform-ai
  3. 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  4. 4. LIVE DEMO
  5. 5. Agenda ​BUILDING AI APPS: Perspective Of A Data Scientist • Journey to building your first model • Barriers to production along the way ​DEPLOYING MODELS IN PRODUCTION: Built For Reuse • Where engineering and applications meet AI • DevOps in Data Science – monitoring, alerting and iterating ​AUTO MACHINE LEARNING: Machine Learning Pipelines as a Collection of Microservices • Create reusable ML pipeline code for multiple applications customers • Data Scientists focus on exploration, validation and adding new apps and models
  6. 6. ENABLING DATA SCIENCE ​A DATA SCIENTISTS VIEW OF BUILDING MODELS
  7. 7. Access and Explore Data Engineer Features and Build Models Interpret Model Results and Accuracy A data scientist’s view of the journey to building models
  8. 8. Access and Explore Data Engineer Features and Build Models Interpret Model Results and Accuracy A data scientist’s view of the journey to building models
  9. 9. Access and Explore Data Engineer Features and Build Models Interpret Model Results and Accuracy #Leads Created Date
  10. 10. Access and Explore Data Engineer Features and Build Models Interpret Model Results and Accuracy Engineer Features Empty fields One-hot encoding (pivoting) Email domain of a user Business titles of a user Historical spend Email-Company Name Similarity
  11. 11. Access and Explore Data Engineer Features and Build Models Interpret Model Results and Accuracy​>>> from sklearn import svm >>> from numpy import loadtxt as l, random as r >>> pls = numpy.loadtxt("leadFeatures.data", delimiter=",") >>> testSet = r.choice(len(pls), int(len(pls)*.7), replace=False) >>> X, y = pls[-testSet,:-1], pls[-testSet:,-1] >>> clf = svm.SVC() >>> clf.fit(X,y) SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,decision_function_shape=None, degree=3, gamma='auto', kernel='rbf', max_iter=-1, tol=0.001, verbose=False) >>> clf.score(pls[testSet,:-1],pls[testSet,-1]) 0.88571428571428568
  12. 12. Access and Explore Data Engineer Features and Build Models Interpret Model Results and Accuracy Geographies #Leads
  13. 13. Access and Explore Data Engineer Features and Build Models Interpret Model Results and Accuracy
  14. 14. Access and Explore Data Engineer Features and Build Models Interpret Model Results and Accuracy Fresh Data Input Delivery of Predictions
  15. 15. Bringing a Model to Production Requires a Team ​Applications deliver predictions for customer consumption ​Predictions are produced by the models live in production ​Pipelines deliver the data for modeling and scoring at an appropriate latency ​Monitoring systems allow us to check the health of the models, data, pipelines and app Source: Salesforce Customer Relationship Survey conducted 2014-2016 among 10,500+ customers randomly selected. Response sizes per question vary.
  16. 16. Data Engineers Provide data access and management capabilities for data scientists Set up and monitor data pipelines Improve performance of data processing pipelines Front-End Developers Build customer-facing UI Application instrumentation and logging ​Data Scientists ​Continue evaluating models ​Monitor for anomalies and degradation ​Iteratively improve models in production Bringing a Model to Production Requires a Team Platform Engineers Machine resource management Alerting and monitoring Product Managers Gather requirements & feedback Provide business context
  17. 17. Supporting a Model in Production is Complex Only a small fraction of real-world ML systems is a composed of ML code, as shown by the small black box in the middle. The required surrounding infrastructure is fast and complex. D. Sculley, et al. Hidden technical debt in machine learning systems. In Neural Information Processing Systems (NIPS). 2015
  18. 18. MODELS IN PRODUDCTION ​WHAT IT TAKES TO DEPLOY AN AI-POWERED APPLICATION
  19. 19. Supporting Models in Production is Mostly NOT AI Only a small fraction of real-world ML systems is a composed of ML code, as shown by the small black box in the middle. The required surrounding infrastructure is fast and complex. Adapted from D. Sculley, et al. Hidden technical debt in machine learning systems. In Neural Information Processing Systems (NIPS). 2015 Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code
  20. 20. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location Data Sources …
  21. 21. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location Data Sources … Web UICLI
  22. 22. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location Admin / Authentication Service Data Sources … Web UICLI
  23. 23. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location c Executor ServiceDatastore Service Admin / Authentication Service Data Sources … Web UICLI
  24. 24. Why Data Services are Critical … DataConnector Data Hub Object Store in Blob Storage Catalog AccessControl Data Registry Applications App Data Prov
  25. 25. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location c Executor Service Data Puller Data Pusher Datastore Service Admin / Authentication Service Data Sources … Web UICLI
  26. 26. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location c Executor Service Data Puller Data Pusher Datastore Service Admin / Authentication Service Data Sources … Web UICLI Exploration Tool
  27. 27. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location c Executor Service Data Puller Data Pusher Datastore Service Admin / Authentication Service Data Sources … Modeling / Scoring Web UICLI Exploration Tool
  28. 28. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location c Executor Service Data Puller Data Preparator Data Pusher Datastore Service Admin / Authentication Service Data Sources … Modeling / Scoring Web UICLI Exploration Tool
  29. 29. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location c Executor Service Data Puller Data Preparator Data Pusher CI Service Auxiliary Services Datastore Service Admin / Authentication Service Data Sources … Modeling / Scoring Web UICLI Exploration Tool
  30. 30. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location c Executor Service Monitoring Tool Data Puller Data Preparator Data Pusher CI Service Monitoring Service Auxiliary Services Datastore Service Admin / Authentication Service Data Sources … Modeling / Scoring Web UICLI Exploration Tool
  31. 31. Monitoring your AI’s health like any other app ​Pipelines, Model Performance, Scores – Invest your time where it is needed! Sample Dashboard on Simulated Data ​Model Performance at Evaluation​Distribution of Scores at Evaluation ​105,874 Scores Written Per Hour(1 day moving avg) ​0.86 Evaluation auROC ​Total Number of Scores Written Per Hour ​150,000 ​ 100,000 ​ 50,000 ​ 0 ​Total Number of Scores Written Per Week ​10,000,000 ​ 100,000 ​ 100 ​ 0 ​PercentofTotalPredictionsin Bucket ​Prediction Probability Bucket ​TotalPredictionCount
  32. 32. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location c Executor Service Monitoring Tool Data Puller Data Preparator Data Pusher CI Service Monitoring Service Auxiliary Services Datastore Service Admin / Authentication Service Data Sources … Modeling / Scoring Provisioning Service Web UICLI Exploration Tool
  33. 33. Why Data Services are Critical … DataConnector Data Hub Object Store in Blob Storage Catalog AccessControl Data Registry Applications App Data Prov Applications App 1 App 2 App 3 Prov Prov Prov
  34. 34. Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location c Executor Service Monitoring Tool Data Puller Data Preparator Data Pusher Scheduler Model Management Service Control System CI Service Monitoring Service Auxiliary Services Datastore Service Admin / Authentication Service Data Sources … Modeling / Scoring Provisioning Service Web UICLI Exploration Tool
  35. 35. c Executor Service Monitoring Tool Data Puller Data Preparator Data Pusher Scheduler Model Management Service Control System CI Service Monitoring Service Auxiliary Services Datastore Service Admin / Authentication Service Data Sources … Modeling / Scoring Provisioning Service Web UICLI Exploration ToolMicroservice architecture Customizable model-evaluation & monitoring dashboards Scheduling and workflow management In-platform secured experimentation and exploration Data Scientists focus their efforts on modeling and evaluating results How the Salesforce Einstein Platform Enables Data Scientists ​Deploy, monitor and iterate on models in one location
  36. 36. Why Stop at Microservices for Supporting Your ML Code? Why stop here? Your ML code can also be just a collection of microservices! Configuration Data Collection Data Verification Machine Resource Management Monitoring Serving Infrastructure Feature Extraction Analysis Tools Process Management Tools ML Code
  37. 37. Auto Machine Learning ​Building reusable ML code
  38. 38. Leveraging Platform Services to Easily Deploy 1000s of Apps Data Scientists on App #1
  39. 39. Leveraging Platform Services to Easily Deploy 1000s of Apps Data Scientists on App #2Data Scientists on App #1
  40. 40. Let’s Add a Third App Data Scientists on App #2 Data Scientists on App #3Data Scientists on App #1
  41. 41. How This Process Would Look in Salesforce 150,000 customers ​LeadIQ App ​Activity App ​Predictive Journeys App ​App #5 ​App #6 ​App #7 ​App #8 ​Opportunity App ​App #9 ​App #10 ​App #11 ​App #12 ​LeadIQ App ​Activity App ​Predictive Journeys App ​App #5 ​App #6 ​App #7 ​App #8 ​Opportunity App ​App #9 ​App #10 ​App #11 ​App #12 ​LeadIQ App ​Activity App ​Predictive Journeys App ​App #5 ​App #6 ​App #7 ​App #8 ​Opportunity App ​App #9 ​App #10 ​App #11 ​App #12
  42. 42. Einstein’sNewApproachtoAI Democratizing AI for Everyone Classical Approach Data Sampling Feature Selection Model Selection Score Calibration Integrate to Application Artificial Intelligence Einstein Auto-ML Data already prepped Models automatically built Predictions delivered in context AI for CRM Discover Predict Recommend Automate
  43. 43. Numerical BucketsCategorical Variables Text Fields ​AutoML for feature engineering Repeatable Elements in Machine Learning Pipelines
  44. 44. Categorical Variables ​AutoML for feature engineering Repeatable Elements in Machine Learning Pipelines 1 0 0 1 0 0 0 0 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 0
  45. 45. Text Fields ​AutoML for feature engineering Repeatable Elements in Machine Learning Pipelines Word Count Word Count (no stop words) Is English Sentiment 4 2 1 1 6 3 1 1 9 4 0 0 6 4 0 -1 7 3 1 0 5 1 1 0 7 3 1 0
  46. 46. Numerical Buckets ​AutoML for feature engineering Repeatable Elements in Machine Learning Pipelines
  47. 47. What Now? How autoML can choose your model ​>>> from sklearn import svm >>> from numpy import loadtxt as l, random as r >>> clf = svm.SVC() >>> pls = numpy.loadtxt("leadFeatures.data", delimiter=",") >>> testSet = r.choice(len(pls), int(len(pls)*.7), replace=False) >>> X, y = pls[-testSet,:-1], pls[-testSet:,-1] >>> clf.fit(X,y) SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,decision_function_shape=None, degree=3, gamma='auto', kernel='rbf', max_iter=-1, probability=False, random_state=None, shrinking=True, tol=0.001, verbose=False) >>> clf.score(pls[testSet,:-1],pls[testSet,-1]) 0.88571428571428568 Should we try other model forms?Should we try other model forms? Features? Should we try other model forms? Features? Kernels or hyperparameters? Each use case will have its own model and features to use. We enable building separate models and features with 1 code base using OP
  48. 48. Model 1 83% accuracy Model 3 73% accuracy Model 4 89% accuracy … Model 1 Model 2 Model 3 Model 4 … MODEL GENERATION MODEL TESTINGCUSTOMER A Model 2 91% accuracy Customer ID Age Age Group Gender Valid Address 1 2 3 4 5 22 30 45 23 60 A A B A C M F F M M Y N N Y Y A tournament of models!
  49. 49. A tournament of models! … Model 1 Model 2 Model 3 Model 4 … CUSTOMER B Customer ID Age Age Group Gender Valid Address 1 2 3 4 5 34 22 66 58 41 B A C C B M M F M F Y Y N N Y Model 1 Model 3 Model 4 Customer B Model 2 Customer A MODEL GENERATION MODEL TESTINGCUSTOMER A Customer ID Age Age Group Gender Valid Address 1 2 3 4 5 22 30 45 23 60 A A B A C M F F M M Y N N Y Y
  50. 50. Deploy Monitors, Monitor, Repeat! Sample Dashboard on Simulated Data ​134 Models in Production ​215 Models Trained (curr.month) ​98.51% Models with Above Chance Performance ​35,573,664 Predictions Written Per Day (7 day avg) ​8 Experiments Run this Week
  51. 51. Deploy Monitors, Monitor, Repeat! ​Pipelines, Model Performance, Scores – Invest your time where it is needed! Sample Dashboard on Simulated Data ​Model Performance at Evaluation​Distribution of Scores at Evaluation ​105,874 Scores Written Per Hour(1 day moving avg) ​0.86 Evaluation auROC ​Total Number of Scores Written Per Hour ​150,000 ​ 100,000 ​ 50,000 ​ 0 ​Total Number of Scores Written Per Week ​10,000,000 ​ 100,000 ​ 100 ​ 0 ​PercentofTotalPredictionsin Bucket ​Prediction Probability Bucket ​TotalPredictionCount
  52. 52. Deploy Monitors, Monitor, Repeat! Sample Dashboard on Simulated Data ​134 Models in Production ​215 Models Trained (curr.month) ​98.51% Models with Above Chance Performance ​216 Models Trained (curr.month) ​99.25% Models with Above Chance Performance ​35,573,664 Predictions Written Per Day (7 day avg) ​12 Experiments Run this Week
  53. 53. Key Takeaways • Deploying machine learning in production is hard • Platforms are critical for enabling data scientist productivity • Plan for multiple apps… always • To ensure enabling rapid identification of areas of improvement and efficacy of new approaches provide • Monitoring services • Experimentation frameworks • Identify opportunities for reusability in all aspects, even your machine learning pipelines • Help simplify the process of experimenting, deploying, and iterating
  54. 54. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ einstein-platform-ai

×