Foundation for ML
Your data +
Microsoft data
Breakthrough
advancements
Data Cloud Models
Power of Azure
Speech Vision Language
2016 2017 2018 2018
Microsoft ML breakthroughs
Microsoft 365
ML at Microsoft
| Research
ML at scale
Monthly active
Office 365 users
using AI
180
million
Questions Asked
of Cortana
18
Billion
Number of Signals
Analyzed to Block
Emerging Threats
DAILY
6.5
Trillion
But ML is HARD!
Building a model
Building
a model
Data ingestion Data analysis
Data
transformation
Data validation Data splitting
Trainer
Model
validation
Training
at scale
Logging Roll-out Serving Monitoring
Ok, but, like, I’m
a data scientist. IDGAF
I don’t care
about all that.
Yes You Do!
Cowboys and Ranchers Can Be Friends!
SRE/ML Engineers
Data Scientist
• Quick iteration
• Frameworks they
understand
• Best of breed tools
• No management
headaches
• Unlimited scale
• Reuse of tooling and
platforms
• Corporate compliance
• Observability
• Uptime
Haven’t I Heard This Before?
GitOps = Git + Dev + Ops
GitOps
== VELOCITY and SECURITY
MLOps!
MLOps = ML + DEV + OPS
Experiment
Data Acquisition
Business Understanding
Initial Modeling
Develop
Modeling
Operate
Continuous Delivery
Data Feedback Loop
System + Model Monitoring
ML
+ Testing
Continuous Integration
Continuous Deployment
MLOps Benefits
• Code drives generation
and deployments
• Pipelines are
reproducible and
verifiable
• All artifacts can be
tagged and audited
• SWE best practices for
quality control
• Offline comparisons of
model quality
• Minimize bias and
enable explainability
• Controlled rollout
capabilities
• Live comparison of
predicted vs. expected
performance
• Results fed back to
watch for drift and
improve model
Automation / Observability
Validation
Reproducibility / Auditability
== VELOCITY and SECURITY (For ML)
Internal MLOps Platforms
FBLearner Flow
TensorFlow Extended
Uber’s Michelangelo
Microsoft Aether
But I Don’t Work at a
Big Company With
Thousands of
ML Engineers!
Build Your Own MLOps Platform
And many MANY more…
Cloud Provider
MLOps Platforms
Real World Multi-Cloud
CI/CD Pipeline
Process Train Stage Serve
Data
Distributed Cloud
SRE/ML Engineers
Data Scientist
ENV
#1
ENV
#2
Azure DevOps Pipelines
Cloud-hosted pipelines for Linux, Windows and macOS.
Any language, any platform, any cloud
Build, test, and deploy Node.js, Python, Java, PHP,
Ruby, C/C++, .NET, Android, and iOS apps. Run in
parallel on Linux, macOS, and Windows. Deploy to
Azure, AWS, GCP or on-premises
Extensible
Explore and implement a wide range of community-
built build, test, and deployment tasks, along with
hundreds of extensions from Slack to SonarCloud.
Support for YAML, reporting and more
Containers and Kubernetes
Easily build and push images to container registries
like Docker Hub and Azure Container Registry.
Deploy containers to individual hosts or Kubernetes.
Azure DevOps + Azure ML
First Class Model Training Tasks
CI pipeline captures:
1. Create sandbox
2. Run unit tests and code quality checks
3. Attach to compute
4. Run training pipeline
5. Evaluate model
6. Register model
Automated Deployment
CD pipeline captures:
1. Package model into container
image
2. Validate and profile model
3. Deploy model to DevTest (ACI)
4. If all is well, proceed to rollout
to AKS
Everything is done via the CLI
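The CI and CD steps above amount to a series of gates: a model is registered only if it passes tests and evaluates at least as well as the current one, and it moves past DevTest only if that deployment looks healthy. A minimal Python sketch of that gating logic follows. The names `Candidate`, `should_register`, and `next_stage`, and the use of a single accuracy metric, are assumptions made for illustration; this is not the Azure ML or Azure DevOps API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary of one training run."""
    version: str
    accuracy: float            # offline evaluation metric (assumed)
    unit_tests_passed: bool


def should_register(candidate: Candidate, current: Optional[Candidate],
                    min_gain: float = 0.0) -> bool:
    """CI gate: register only if tests pass and the model is no worse."""
    if not candidate.unit_tests_passed:
        return False
    if current is None:        # nothing registered yet
        return True
    return candidate.accuracy >= current.accuracy + min_gain


def next_stage(devtest_healthy: bool) -> str:
    """CD gate: promote to production only after DevTest looks good."""
    return "production" if devtest_healthy else "rollback"
```

In a real pipeline each gate would be a pipeline task driven from the CLI; the point of the sketch is only that the decision is code, so it can be reviewed and repeated.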
Model Versioning & Storage
• Which data?
• Which experiment / previous model(s)?
• Where’s the code / notebook?
• Was it converted / quantized?
• Private / compliant data
Model Validation
• Data (changes to shape / profile)
• Model in isolation (offline A/B)
• Model + app (functional testing)
• Only deploy after initial validation passes
• Ramp up traffic to new model using A/B
experimentations
• Functional behavior
• Performance characteristics
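Ramping traffic to a new model can be sketched as an exponential schedule plus a deterministic assignment of users to the new model. The Python below is an illustrative sketch only: `ramp_schedule`, `routes_to_new_model`, the 1% starting share, and the doubling factor are all assumptions, not part of any platform named in these slides.

```python
import hashlib


def ramp_schedule(start: float = 0.01, factor: float = 2.0) -> list:
    """Exponential ramp of the traffic share sent to the new model."""
    shares = []
    share = start
    while share < 1.0:
        shares.append(share)
        share *= factor
    shares.append(1.0)         # final step: all traffic
    return shares


def routes_to_new_model(user_id: str, share: float) -> bool:
    """Deterministically assign a user to the new model for a given share."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32   # value in [0, 1)
    return bucket < share
```

Hashing the user ID keeps each user on the same side of the experiment between requests, and a user who is routed to the new model at one step stays there at every later, larger share.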
Model Profiling
Model Deployment
• Focus on ML, not DevOps
• Get telemetry for service health and model behavior
• code-generation
• API specifications / interfaces
• Cloud Services
• Mobile / Embedded Applications
• Edge Devices
• Quantize / optimize models for target platform
• Compliant + Safe
Seems Like a Lot of Work…
MLOps Gets You to Production
• End-to-end ownership by data science teams
using SWE best practices
• Continuous delivery of value to end users
• Enables lineage, auditability and regulatory
compliance through consistency
Ok… but WHY?
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
Does My Model
Actually Work?
Does My Model Actually Work?
SRE/ML Engineers
Data Scientist
Time to test out
my model…
Laptop The Cloud
Does My Model Actually Work?
SRE/ML Engineers
Data Scientist
Laptop The Cloud
Does My Model Actually Work?
SRE/ML Engineers
Data Scientist
Looks good to
me! To Production!
Laptop The Cloud
Does My Model Actually Work?
SRE/ML Engineers
Data Scientist
Laptop The Cloud
Wait, what?
Oh… oh no…
Does My Model Actually Work?
SRE/ML Engineers
Data Scientist
Laptop The Cloud
WOAH there.
Does My Model Actually Work?
SRE/ML Engineers
Data Scientist
Laptop The Cloud
WOAH there.
Source Control
What is
happening…
Source Control
Does My Model Actually Work?
SRE/ML Engineers
Data Scientist
Laptop The Cloud
A Small Example of Issues You Can Have…
• Inappropriate HW/SW stack
• Mismatched driver versions
• Crash looping deployment
• Data/model versioning [Nick Walsh]
• Non-standard images/OS version
• Pre-processing code doesn’t match
production pre-processing
• Production data doesn’t match
training/test data
• Output of the model doesn’t match
application expectations
• Hand-coded heuristics better than model
[Adam Laiacano]
• Model freshness (train on out-of-date
data/input shape changed)
• Test/production statistics/population
shape skew
• Overfitting on training/test data
• Bias introduction (or not tested)
• Over/under HW provisioning
• Latency issues
• Permissions/certs
• Failure to obey health checks
• Killed production model before roll out
of new/in wrong order
• Thundering herd for new model
• Logging to the wrong location
• Storage for model not allocated
properly/accessible by deployment
tooling
• Route to artifacts not available for
download
• API signature changes not
propagated/expected
• Cross-data center latency
• Expected benefit doesn’t materialize (e.g.
multiple components in the app change
simultaneously)
• Get wrong/no traffic because A/B config
didn’t roll out
• Get too much traffic too soon (expected
to canary/exponential roll out)
• Lack of visibility into real-time model
behavior (detecting data drift, live data
distribution vs train data, etc) [Nick
Walsh]
• Outliers not predicted [MikeBSilverman]
• Change was a good change, but didn’t
communicate with the rest of the team
(so you must roll back)
• No dates! (date to measure
impact/improvement against a pre-
agreed measure; date scheduled to
assess data changes) [Mary Branscombe]
• No CI/CD; manual changes untracked
[Jon Peck]
• LACK OF DOCUMENTATION!! (the
problem, the testing, the solution, lots
more) [Terry Christiani]
• Successful model causes pain elsewhere
in the organization (e.g. detecting faults
previously missed) [Mark Round]
Or It Just Doesn’t Work!
At All!
Does My Model Actually Work?
SRE/ML Engineers
Data Scientist
Laptop The Cloud
Source Control
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Nice. Nice.
✓
But I Can Do All
These Manually…
No.
MLOps is a Platform and a Philosophy
Even if:
o Every data scientist trained...
o And you had all the tools necessary...
o And they all worked together...
o And your SREs understood ML modeling...
o And and and and ...
You’d still need a permanent, repeatable
record of what you did
That’s MLOps!
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
What Did My
Customers See?
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
I’d Like a loan,
please.
Source Control
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
No.
Source Control
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Ok, but why?
Source Control
Source Control
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Uh oh.
Lawyer (label repeated 18 times)
It’s Not Just About Explainability!
• Yes, models are complicated
• But, that’s not enough:
o What data did you train on?
o How did you transform/exclude outliers?
o What are the data statistics?
o Did anything change between code and production?
o What model did you actually serve (to this person)?
• MLOps can help!
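One way to answer these questions later is to write an immutable, content-addressed metadata record for every model that gets served, and to log its ID with each prediction. The Python below is a minimal sketch under stated assumptions: `record_id`, `MetadataStore`, the field names, and the 13-character ID length are hypothetical choices for illustration, not a description of any particular product.

```python
import hashlib
import json


def record_id(metadata: dict) -> str:
    """Content hash of a metadata record; any change yields a new ID."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:13]


class MetadataStore:
    """Append-only store: records can be added and read, never modified."""

    def __init__(self):
        self._records = {}

    def put(self, metadata: dict) -> str:
        rid = record_id(metadata)
        # Keep a serialized copy so later changes to `metadata` cannot leak in.
        self._records.setdefault(rid, json.dumps(metadata, sort_keys=True))
        return rid

    def get(self, rid: str) -> dict:
        return json.loads(self._records[rid])


store = MetadataStore()
served = store.put({
    "training_data": "loans-2019-03",      # hypothetical dataset name
    "code_commit": "9ce88802f0759",        # commit the model was built from
    "outlier_rule": "drop income > p99",   # hypothetical transform note
})
```

Because the ID is derived from the record's content, the same inputs always map to the same ID, and a record that differs in any field gets a different one.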
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Source Control
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
32c04681d7573
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Source Control
Immutable
Metadata Store
b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759
9ce88802f0759
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Source Control
Immutable
Metadata Store
b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759
32c04681d7573
Why didn’t I get
a loan?
9ce88802f0759
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Source Control
Immutable
Metadata Store
b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759
32c04681d7573
32c04681d7573
9ce88802f0759
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
Is My Model
Still Good?
Is My Model Still Good?
SRE/ML Engineers
The Cloud
There is a
blue or
orange
DUCK inside
this barn.
What color
is the duck?
Let’s Use Machine
Learning!!
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
There is a
blue or
orange
DUCK inside
this barn.
What color
is the duck?
But wait...
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
5 Blue Ducks
995 Yellow Ducks
Accuracy = 99%
False Positive = 1%
???????????????????
Thomas
Bayes
$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$$
Bayes’ Theorem
Accuracy depends on
the population
distribution!
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
995 Yellow Ducks
5 Blue Ducks
WRONG 2/3 of the Time!
Accuracy = 99%
False Positive = 1%
???????????????????
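The duck numbers can be checked directly with Bayes' theorem. The Python sketch below assumes that "Accuracy = 99%" means the model says BLUE for 99% of blue ducks, and that "False Positive = 1%" means it says BLUE for 1% of yellow ducks; the function name `posterior_blue` is made up for this example.

```python
def posterior_blue(n_blue: int, n_yellow: int,
                   true_positive_rate: float,
                   false_positive_rate: float) -> float:
    """P(duck is blue | model says BLUE), via Bayes' theorem."""
    total = n_blue + n_yellow
    p_blue = n_blue / total
    p_yellow = n_yellow / total
    # P(model says BLUE), summed over both kinds of duck.
    p_says_blue = (true_positive_rate * p_blue
                   + false_positive_rate * p_yellow)
    return true_positive_rate * p_blue / p_says_blue


skewed = posterior_blue(5, 995, 0.99, 0.01)      # about 0.33
balanced = posterior_blue(500, 500, 0.99, 0.01)  # 0.99
```

With 5 blue and 995 yellow ducks, a BLUE answer is right only about one time in three, even though the model itself has not changed. With 500 of each, the same model is right 99% of the time.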
Who cares…
This Can Be
Addressed!
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
995 Yellow Ducks
5 Blue Ducks
Model Server
d4093cc84b267
But…
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
995 Yellow Ducks
5 Blue Ducks
d4093cc84b267
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server 500 Yellow Ducks
500 Blue Ducks
d4093cc84b267
Is My Model Still Good?
• Models != Code – they can go stale... QUICKLY.
• IMPORTANT:
o Watch your model & data for drift from training
o Regularly (if not continuously) retrain, even before
performance begins to fail
o Rollbacks across multiple versions are not uncommon!
• Without an e2e MLOps pipeline, many of the
above are O(really really hard)!
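Watching for drift can start with something as simple as comparing the class mix seen in training with the mix seen in production. The Python below is a minimal sketch, assuming labels (or predicted labels) are available for both populations; the function names and the 0.1 threshold are illustrative assumptions, and total variation distance stands in for whatever drift measure a team actually prefers.

```python
from collections import Counter


def class_shares(labels) -> dict:
    """Fraction of examples per class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation(train_labels, live_labels) -> float:
    """Total variation distance between two label distributions (0 to 1)."""
    p, q = class_shares(train_labels), class_shares(live_labels)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0))
                     for k in set(p) | set(q))


def has_drifted(train_labels, live_labels, threshold: float = 0.1) -> bool:
    """Flag drift when the distributions differ by more than the threshold."""
    return total_variation(train_labels, live_labels) > threshold


train = ["blue"] * 5 + ["yellow"] * 995
live = ["blue"] * 500 + ["yellow"] * 500
drift = total_variation(train, live)
```

For the duck populations above, the distance between 5/995 and 500/500 is 0.495, far above the example threshold, which is the kind of signal that would trigger retraining or a rollback.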
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
Next for MLOps
MLOps Gives* You…
• Software best practices for building machine
learning solutions
• Repeatable workflow for training a model and
rolling it out to production
• An immutable record of what’s actually running
• Lineage of model creation including data sources
• Acceleration from code to customer benefits
* Requires some human and software work
What’s Next for MLOps
• Simplify monitoring and retraining
• Extend MLOps to data, including prep and profiling
• Enterprise features
o Test cases
o Auditing
o Security
o Resource management (bin packing / resource optimization)
o Network isolation
• Metadata and API standards
Or, better yet, you tell us!
It’s a whole new world
• Data science will touch
EVERY industry.
• We can’t ask everyone to
get a PhD in statistics,
though.
• How do WE help everyone
take advantage of this
transformation?
me: David Aronchick (david.aronchick@microsoft.com)
twitter: @aronchick
github:
• https://github.com/aronchick/kubeflow-and-mlops
• https://aka.ms/mlops
THANK YOU!
Using MLOps to Bring ML to Production/The Promise of MLOps

More Related Content

What's hot

MLOps for production-level machine learning
MLOps for production-level machine learningMLOps for production-level machine learning
MLOps for production-level machine learning
cnvrg.io AI OS - Hands-on ML Workshops
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
Databricks
 
Managing the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowManaging the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflow
Databricks
 
Intro to Vertex AI, unified MLOps platform for Data Scientists & ML Engineers
Intro to Vertex AI, unified MLOps platform for Data Scientists & ML EngineersIntro to Vertex AI, unified MLOps platform for Data Scientists & ML Engineers
Intro to Vertex AI, unified MLOps platform for Data Scientists & ML Engineers
Daniel Zivkovic
 

What's hot (20)

MLOps for production-level machine learning
MLOps for production-level machine learningMLOps for production-level machine learning
MLOps for production-level machine learning
 
Ml ops past_present_future
Ml ops past_present_futureMl ops past_present_future
Ml ops past_present_future
 
MLOps with Azure DevOps
MLOps with Azure DevOpsMLOps with Azure DevOps
MLOps with Azure DevOps
 
MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.
 
Apply MLOps at Scale by H&M
Apply MLOps at Scale by H&MApply MLOps at Scale by H&M
Apply MLOps at Scale by H&M
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
 
Databricks Overview for MLOps
Databricks Overview for MLOpsDatabricks Overview for MLOps
Databricks Overview for MLOps
 
Introduction to MLflow
Introduction to MLflowIntroduction to MLflow
Introduction to MLflow
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOps
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 
ML-Ops how to bring your data science to production
ML-Ops  how to bring your data science to productionML-Ops  how to bring your data science to production
ML-Ops how to bring your data science to production
 
Managing the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowManaging the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflow
 
Managing the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsManaging the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOps
 
MLOps with Kubeflow
MLOps with Kubeflow MLOps with Kubeflow
MLOps with Kubeflow
 
Simplifying Model Management with MLflow
Simplifying Model Management with MLflowSimplifying Model Management with MLflow
Simplifying Model Management with MLflow
 
Intro to Vertex AI, unified MLOps platform for Data Scientists & ML Engineers
Intro to Vertex AI, unified MLOps platform for Data Scientists & ML EngineersIntro to Vertex AI, unified MLOps platform for Data Scientists & ML Engineers
Intro to Vertex AI, unified MLOps platform for Data Scientists & ML Engineers
 
Machine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to ImplementationMachine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to Implementation
 
End to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
End to end Machine Learning using Kubeflow - Build, Train, Deploy and ManageEnd to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
End to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
 
Seamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflowSeamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflow
 

Similar to Using MLOps to Bring ML to Production/The Promise of MLOps

Building Powerful and Intelligent Applications with Azure Machine Learning
Building Powerful and Intelligent Applications with Azure Machine LearningBuilding Powerful and Intelligent Applications with Azure Machine Learning
Building Powerful and Intelligent Applications with Azure Machine Learning
David Walker, CSM,CSD,MCP,MCAD,MCSD,MVP
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
Provectus
 

Similar to Using MLOps to Bring ML to Production/The Promise of MLOps (20)

Rsqrd AI: How to Design a Reliable and Reproducible Pipeline
Rsqrd AI: How to Design a Reliable and Reproducible PipelineRsqrd AI: How to Design a Reliable and Reproducible Pipeline
Rsqrd AI: How to Design a Reliable and Reproducible Pipeline
 
DevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-usDevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-us
 
Innovate Better Through Machine data Analytics
Innovate Better Through Machine data AnalyticsInnovate Better Through Machine data Analytics
Innovate Better Through Machine data Analytics
 
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
 
Consolidating MLOps at One of Europe’s Biggest Airports
Consolidating MLOps at One of Europe’s Biggest AirportsConsolidating MLOps at One of Europe’s Biggest Airports
Consolidating MLOps at One of Europe’s Biggest Airports
 
Making Data Science Scalable - 5 Lessons Learned
Making Data Science Scalable - 5 Lessons LearnedMaking Data Science Scalable - 5 Lessons Learned
Making Data Science Scalable - 5 Lessons Learned
 
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...
 
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
 
Managing the Machine Learning Lifecycle with MLflow
Managing the Machine Learning Lifecycle with MLflowManaging the Machine Learning Lifecycle with MLflow
Managing the Machine Learning Lifecycle with MLflow
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps
 
Orchestration, the conductor's score
Orchestration, the conductor's scoreOrchestration, the conductor's score
Orchestration, the conductor's score
 
Building Powerful and Intelligent Applications with Azure Machine Learning
Building Powerful and Intelligent Applications with Azure Machine LearningBuilding Powerful and Intelligent Applications with Azure Machine Learning
Building Powerful and Intelligent Applications with Azure Machine Learning
 
5 Key Metrics to Release Better Software Faster
5 Key Metrics to Release Better Software Faster5 Key Metrics to Release Better Software Faster
5 Key Metrics to Release Better Software Faster
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
 
Maintainable Machine Learning Products
Maintainable Machine Learning ProductsMaintainable Machine Learning Products
Maintainable Machine Learning Products
 
Continuous Intelligence Workshop
Continuous Intelligence WorkshopContinuous Intelligence Workshop
Continuous Intelligence Workshop
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
 
From Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender systemFrom Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender system
 
A Beard, An App, A Blender
A Beard, An App, A BlenderA Beard, An App, A Blender
A Beard, An App, A Blender
 
[AI] ML Operationalization with Microsoft Azure
[AI] ML Operationalization with Microsoft Azure[AI] ML Operationalization with Microsoft Azure
[AI] ML Operationalization with Microsoft Azure
 

More from Weaveworks

SRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdfSRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
Weaveworks
 
How to Avoid Kubernetes Multi-tenancy Catastrophes
How to Avoid Kubernetes Multi-tenancy CatastrophesHow to Avoid Kubernetes Multi-tenancy Catastrophes
How to Avoid Kubernetes Multi-tenancy Catastrophes
Weaveworks
 

More from Weaveworks (20)

Weave AI Controllers (Weave GitOps Office Hours)
Weave AI Controllers (Weave GitOps Office Hours)Weave AI Controllers (Weave GitOps Office Hours)
Weave AI Controllers (Weave GitOps Office Hours)
 
Flamingo: Expand ArgoCD with Flux (Office Hours)
Flamingo: Expand ArgoCD with Flux (Office Hours)Flamingo: Expand ArgoCD with Flux (Office Hours)
Flamingo: Expand ArgoCD with Flux (Office Hours)
 
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
Webinar: Capabilities, Confidence and Community – What Flux GA Means for YouWebinar: Capabilities, Confidence and Community – What Flux GA Means for You
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
 
Six Signs You Need Platform Engineering
Six Signs You Need Platform EngineeringSix Signs You Need Platform Engineering
Six Signs You Need Platform Engineering
 
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdfSRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
 
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
Webinar: End to End Security & Operations with Chainguard and Weave GitOpsWebinar: End to End Security & Operations with Chainguard and Weave GitOps
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
 
Flux Beyond Git Harnessing the Power of OCI
Flux Beyond Git Harnessing the Power of OCIFlux Beyond Git Harnessing the Power of OCI
Flux Beyond Git Harnessing the Power of OCI
 
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
Automated Provisioning, Management & Cost Control for Kubernetes ClustersAutomated Provisioning, Management & Cost Control for Kubernetes Clusters
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
 
How to Avoid Kubernetes Multi-tenancy Catastrophes
How to Avoid Kubernetes Multi-tenancy CatastrophesHow to Avoid Kubernetes Multi-tenancy Catastrophes
How to Avoid Kubernetes Multi-tenancy Catastrophes
 
Building internal developer platform with EKS and GitOps
Building internal developer platform with EKS and GitOpsBuilding internal developer platform with EKS and GitOps
Building internal developer platform with EKS and GitOps
 
GitOps Testing in Kubernetes with Flux and Testkube.pdf
GitOps Testing in Kubernetes with Flux and Testkube.pdfGitOps Testing in Kubernetes with Flux and Testkube.pdf
GitOps Testing in Kubernetes with Flux and Testkube.pdf
 
Intro to GitOps with Weave GitOps, Flagger and Linkerd
Intro to GitOps with Weave GitOps, Flagger and LinkerdIntro to GitOps with Weave GitOps, Flagger and Linkerd
Intro to GitOps with Weave GitOps, Flagger and Linkerd
 
Implementing Flux for Scale with Soft Multi-tenancy
Implementing Flux for Scale with Soft Multi-tenancyImplementing Flux for Scale with Soft Multi-tenancy
Implementing Flux for Scale with Soft Multi-tenancy
 
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKSAccelerating Hybrid Multistage Delivery with Weave GitOps on EKS
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
 
The Story of Flux Reaching Graduation in the CNCF
The Story of Flux Reaching Graduation in the CNCFThe Story of Flux Reaching Graduation in the CNCF
The Story of Flux Reaching Graduation in the CNCF
 
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
 
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
 
Flux’s Security & Scalability with OCI & Helm Slides.pdf
Flux’s Security & Scalability with OCI & Helm Slides.pdfFlux’s Security & Scalability with OCI & Helm Slides.pdf
Flux’s Security & Scalability with OCI & Helm Slides.pdf
 
Flux Security & Scalability using VS Code GitOps Extension
Flux Security & Scalability using VS Code GitOps Extension Flux Security & Scalability using VS Code GitOps Extension
Flux Security & Scalability using VS Code GitOps Extension
 
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOpsDeploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
 

Recently uploaded

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Recently uploaded (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...

Using MLOps to Bring ML to Production/The Promise of MLOps

  • 2. Foundation for ML Your data + Microsoft data Breakthrough advancements Data Cloud Models Power of Azure
  • 3. Speech Vision Language 2016 2017 2018 2018 Microsoft ML breakthroughs
  • 4. Microsoft 365 ML at Microsoft | Research
  • 5. ML at scale • 180 million monthly active Office 365 users using AI • 18 billion questions asked of Cortana • 6.5 trillion signals analyzed DAILY to block emerging threats
  • 6. But ML is HARD!
  • 8. Building a model Data ingestion Data analysis Data transformation Data validation Data splitting Trainer Model validation Training at scale Logging Roll-out Serving Monitoring
  • 9. Ok, but, like, I’m a data scientist. IDGAF I don’t care about all that.
  • 12. Cowboys and Ranchers Can Be Friends! Data Scientist: • Quick iteration • Frameworks they understand • Best of breed tools • No management headaches • Unlimited scale SRE/ML Engineers: • Reuse of tooling and platforms • Corporate compliance • Observability • Uptime
  • 13. Haven’t I Heard This Before?
  • 14. GitOps = Git + Dev + Ops
  • 17. MLOps = ML + DEV + OPS Experiment Data Acquisition Business Understanding Initial Modeling Develop Modeling Operate Continuous Delivery Data Feedback Loop System + Model Monitoring ML + Testing Continuous Integration Continuous Deployment
  • 18. MLOps Benefits • Code drives generation and deployments • Pipelines are reproducible and verifiable • All artifacts can be tagged and audited • SWE best practices for quality control • Offline comparisons of model quality • Minimize bias and enable explainability • Controlled rollout capabilities • Live comparison of predicted vs. expected performance • Results fed back to watch for drift and improve model Automation / Observability Validation Reproducibility /Auditability == VELOCITY and SECURITY (For ML)
  • 19. Internal MLOps Platforms: FBLearner Flow, TensorFlow Extended, Uber’s Michelangelo, Microsoft Aether
  • 20. But I Don’t Work at a Big Company With Thousands of ML Engineers!
  • 21. Build Your Own MLOps Platform And many MANY more…
  • 23. Real World Multi-Cloud CI/CD Pipeline Process Train Stage Serve Data Distributed Cloud SRE/ML Engineers Data Scientist ENV #1 ENV #2
  • 24. Azure DevOps Pipelines Cloud-hosted pipelines for Linux, Windows and macOS. Any language, any platform, any cloud: build, test, and deploy Node.js, Python, Java, PHP, Ruby, C/C++, .NET, Android, and iOS apps. Run in parallel on Linux, macOS, and Windows. Deploy to Azure, AWS, GCP or on-premises. Extensible: explore and implement a wide range of community-built build, test, and deployment tasks, along with hundreds of extensions from Slack to SonarCloud. Support for YAML, reporting and more. Containers and Kubernetes: easily build and push images to container registries like Docker Hub and Azure Container Registry. Deploy containers to individual hosts or Kubernetes.
  • 25. Azure DevOps + Azure ML
  • 26. First Class Model Training Tasks CI pipeline captures: 1. Create sandbox 2. Run unit tests and code quality checks 3. Attach to compute 4. Run training pipeline 5. Evaluate model 6. Register model
  • 27. Automated Deployment CD pipeline captures: 1. Package model into container image 2. Validate and profile model 3. Deploy model to DevTest (ACI) 4. If all is well, proceed to rollout to AKS Everything is done via the CLI
  • 28. Model Versioning & Storage • Which data • Which experiment / previous model(s) • Where’s the code / notebook • Was it converted / quantized? • Private / compliant data
  • 29. Model Validation • Data (changes to shape / profile) • Model in isolation (offline A/B) • Model + app (functional testing) • Only deploy after initial validation passes • Ramp up traffic to new model using A/B experimentations • Functional behavior • Performance characteristics
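The "only deploy after initial validation passes" and "ramp up traffic" points on the Model Validation slide can be sketched as a simple gate followed by a traffic schedule. This is a minimal illustration, not the talk's tooling: the check names, `validation_gate`, and `ramp_schedule` are hypothetical, and the doubling schedule starting at 1% is an assumed policy.

```python
def validation_gate(checks):
    """Return (ok, failed_names) for a mapping of check name -> bool."""
    failed = [name for name, passed in checks.items() if not passed]
    return (not failed, failed)


def ramp_schedule(start=0.01, factor=2.0, cap=1.0):
    """Yield the traffic share for the new model at each ramp step.

    Assumed policy: start small and multiply by `factor` until the cap.
    """
    share = start
    while share < cap:
        yield share
        share *= factor
    yield cap


# Hypothetical results for the three validation layers on the slide.
checks = {
    "data_shape_unchanged": True,    # data (changes to shape / profile)
    "offline_ab_not_worse": True,    # model in isolation (offline A/B)
    "functional_tests_pass": True,   # model + app (functional testing)
}

ok, failed = validation_gate(checks)
# Only plan a rollout when every check passed.
steps = list(ramp_schedule()) if ok else []
print("deploy" if ok else f"blocked by: {failed}", steps)
```

If any check is `False`, `steps` stays empty and nothing is rolled out, which is the behavior the slide asks for.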
  • 31. Model Deployment • Focus on ML, not DevOps • Get telemetry for service health and model behavior • code-generation • API specifications / interfaces • Cloud Services • Mobile / Embedded Applications • Edge Devices • Quantize / optimize models for target platform • Compliant + Safe
  • 32. Seems Like a Lot of Work…
  • 34. MLOps Gets You to Production • End-to-end ownership by data science teams using SWE best practices • Continuous delivery of value to end users • Enables lineage, auditability and regulatory compliance through consistency
  • 36. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 39. Does My Model Actually Work? SRE/ML Engineers Data Scientist Time to test out my model… Laptop The Cloud
  • 40. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud
  • 41. Does My Model Actually Work? SRE/ML Engineers Data Scientist Looks good to me! To Production! Laptop The Cloud
  • 42. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud Wait, what? Oh… oh no…
  • 43. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud WOAH there.
  • 44. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud WOAH there. Source Control
  • 45. What is happening… Source Control Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud
  • 46. A Small Example of Issues You Can Have… • Inappropriate HW/SW stack • Mismatched driver versions • Crash looping deployment • Data/model versioning [Nick Walsh] • Non-standard images/OS version • Pre-processing code doesn’t match production pre-processing • Production data doesn’t match training/test data • Output of the model doesn’t match application expectations • Hand-coded heuristics better than model [Adam Laiacano] • Model freshness (train on out-of-date data/input shape changed) • Test/production statistics/population shape skew • Overfitting on training/test data • Bias introduction (or not tested) • Over/under HW provisioning • Latency issues • Permissions/certs • Failure to obey health checks • Killed production model before roll out of new/in wrong order • Thundering herd for new model • Logging to the wrong location • Storage for model not allocated properly/accessible by deployment tooling • Route to artifacts not available for download • API signature changes not propagated/expected • Cross-data center latency • Expected benefit doesn’t materialize (e.g. multiple components in the app change simultaneously) • Get wrong/no traffic because A/B config didn’t roll out • Get too much traffic too soon (expected to canary/exponential roll out) • Lack of visibility into real-time model behavior (detecting data drift, live data distribution vs train data, etc) [Nick Walsh] • Outliers not predicted [MikeBSilverman] • Change was a good change, but didn’t communicate with the rest of the team (so you must roll back) • No dates! (date to measure impact/improvement against a pre-agreed measure; date scheduled to assess data changes) [Mary Branscombe] • No CI/CD; manual changes untracked [Jon Peck] • LACK OF DOCUMENTATION!! (the problem, the testing, the solution, lots more) [Terry Christiani] • Successful model causes pain elsewhere in the organization (e.g. detecting faults previously missed) [Mark Round] Or It Just Doesn’t Work! At All!
  • 47. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud Source Control Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Nice. Nice. ✓
  • 48. But I Can Do All These Manually…
  • 49. No.
  • 50. MLOps is a Platform and a Philosophy Even if: o Every data scientist trained... o And you had all the tools necessary... o And they all worked together... o And your SREs understood ML modeling... o And and and and ... You’d still need a permanent, repeatable record of what you did
  • 52. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 55. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer I’d Like a loan, please. Source Control
  • 56. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer No. Source Control
  • 57. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Ok, but why? Source Control
  • 58. Source Control What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Uh oh. Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer
  • 59. It’s Not Just About Explainability! • Yes, models are complicated • But, that’s not enough: o What data did you train on? o How did you transform/exclude outliers? o What are the data statistics? o Did anything change between code and production? o What model did you actually serve (to this person)? • MLOps can help!
  • 60. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Source Control Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment
  • 61. 32c04681d7573 What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Source Control Immutable Metadata Store b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759 9ce88802f0759
  • 62. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Source Control Immutable Metadata Store b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759 32c04681d7573 Why didn’t I get a loan? 9ce88802f0759
  • 63. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Source Control Immutable Metadata Store b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759 32c04681d7573 32c04681d7573 9ce88802f0759
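The immutable metadata store in these slides maps short hash IDs to a record of what was trained and served, so that "Why didn’t I get a loan?" can be traced to one exact model. A minimal sketch of that idea follows, assuming a content-addressed, append-only store. The `MetadataStore` class, the record fields, the file name, and the request ID are all hypothetical. The 13-character truncation only mirrors the short IDs on the slides, and reusing the slide's `9ce88802f0759` as a code commit is an assumption. A real store would keep the full digest.

```python
import hashlib
import json


class MetadataStore:
    """Append-only store keyed by the content hash of each record."""

    def __init__(self):
        self._records = {}

    def put(self, record):
        # Serialize deterministically so equal records hash to the same key.
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(payload).hexdigest()[:13]
        # Never overwrite: an existing key already holds identical content.
        self._records.setdefault(key, payload)
        return key

    def get(self, key):
        # Decode a fresh copy so callers cannot mutate the stored record.
        return json.loads(self._records[key])


store = MetadataStore()

# Hypothetical lineage record for one served model.
model_id = store.put({
    "training_data": "loans-2019-q1.parquet",  # assumed file name
    "preprocessing": "drop_outliers_v2",       # assumed transform label
    "code_commit": "9ce88802f0759",            # ID borrowed from the slide
})

# The model server logs which model answered each request.
request_log = {"request-42": model_id}

# Later: trace the customer's request back to the exact model lineage.
served = store.get(request_log["request-42"])
print(model_id, served["training_data"], served["code_commit"])
```

Because the key is derived from the content, changing any field (data, transform, or code) produces a different ID, which is what makes the record usable as evidence of what was actually served.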
  • 64. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 67. Is My Model Still Good? SRE/ML Engineers The Cloud There is a blue or orange DUCK inside this barn. What color is the duck?
  • 69. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE There is a blue or orange DUCK inside this barn. What color is the duck?
  • 71. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE 5 Blue Ducks 995 Yellow Ducks Accuracy = 99% False Positive = 1% ???????????????????
  • 73. Bayes’ Theorem: $P(A \mid B) = \dfrac{P(B \mid A) \cdot P(A)}{P(B)}$
  • 74. Accuracy depends on the population distribution!
  • 75. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE 995 Yellow Ducks 5 Blue Ducks WRONG 2/3 of the Time! Accuracy = 99% False Positive = 1% ???????????????????
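The "wrong about two thirds of the time" figure can be checked directly with Bayes’ theorem. A minimal sketch, assuming "Accuracy = 99%" means the model labels a blue duck BLUE 99% of the time and "False Positive = 1%" means it labels a yellow duck BLUE 1% of the time. The function name `posterior_blue` is illustrative and not from the talk.

```python
def posterior_blue(n_blue, n_yellow, true_positive_rate, false_positive_rate):
    """P(duck is blue | model says BLUE), via Bayes' theorem."""
    total = n_blue + n_yellow
    p_blue = n_blue / total        # prior P(A)
    p_yellow = n_yellow / total
    # P(B): overall probability that the model answers BLUE.
    p_says_blue = true_positive_rate * p_blue + false_positive_rate * p_yellow
    # P(A|B) = P(B|A) * P(A) / P(B)
    return true_positive_rate * p_blue / p_says_blue


# Same model, two different barn populations.
skewed = posterior_blue(5, 995, 0.99, 0.01)
balanced = posterior_blue(500, 500, 0.99, 0.01)

print(f"5 blue / 995 yellow: BLUE is right {skewed:.0%} of the time")
print(f"500 blue / 500 yellow: BLUE is right {balanced:.0%} of the time")
```

With 5 blue ducks in 1,000, a BLUE answer is correct only about a third of the time. With a 500/500 split the same model is correct 99% of the time, which is the sense in which accuracy depends on the population distribution.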
  • 78. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE 995 Yellow Ducks 5 Blue Ducks Model Server d4093cc84b267
  • 80. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server 995 Yellow Ducks 5 Blue Ducks d4093cc84b267
  • 81. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server 500 Yellow Ducks 500 Blue Ducks d4093cc84b267
  • 82. Is My Model Still Good? • Models != Code – they can go stale... QUICKLY. • IMPORTANT: o Watch your model & data for drift from training o Regularly (if not continuously) retrain, even before performance begins to fail o Multi-version rollbacks are not uncommon! • Without an e2e MLOps pipeline, many of the above are O(really really hard)!
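Watching for drift from training can start as simply as comparing label proportions between the data a model was built for and the data it now sees. The sketch below uses total variation distance on the duck populations from the earlier slides. The metric choice, the `DRIFT_THRESHOLD` value, and the function names are assumptions for illustration, not something the talk prescribes.

```python
from collections import Counter


def label_distribution(labels):
    """Map each label to its share of the sample."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation_distance(p, q):
    """Half the sum of absolute differences between two distributions."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in labels)


DRIFT_THRESHOLD = 0.1  # hypothetical alerting threshold

# Population the model was built for vs. what live traffic looks like now.
training = ["yellow"] * 995 + ["blue"] * 5
live = ["yellow"] * 500 + ["blue"] * 500

drift = total_variation_distance(
    label_distribution(training), label_distribution(live)
)
needs_retraining = drift > DRIFT_THRESHOLD
print(f"drift={drift:.3f} retrain={needs_retraining}")
```

In a pipeline, a check like this would run on a schedule against logged production inputs and trigger retraining (or a rollback) before accuracy visibly degrades.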
  • 83. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 85. MLOps Gives* You… • Software best practices for building machine learning solutions • Repeatable workflow for training a model and rolling it out to production • An immutable record of what’s actually running • Lineage of model creation including data sources • Acceleration from code to customer benefits * Requires some human and software work
  • 86. What’s Next for MLOps • Simplify monitoring and retraining • Extend MLOps for data incl prep and profiling • Enterprise features o Test cases o Auditing o Security o Resource management (bin packing / resource optimization) o Network isolation • Metadata and API standards Or, better yet, you tell us!
  • 87. It’s a whole new world • Data science will touch EVERY industry. • We can’t ask everyone to get a PhD in statistics, though. • How do WE help everyone take advantage of this transformation?
  • 88. me: David Aronchick (david.aronchick@microsoft.com) twitter: @aronchick github: • https://github.com/aronchick/kubeflow-and-mlops • https://aka.ms/mlops THANK YOU!