SlideShare une entreprise Scribd logo
1  sur  33
A platform for the
Complete Machine
Learning Lifecycle
Corey Zumar
June 24th, 2019
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
mlflow-databricks/
Presented at QCon New York
www.qconnewyork.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
2
Outline
• Overview of ML development challenges
• How MLflow tackles these challenges
• MLflow components
• Demo
• How to get started
3
Machine Learning
Development is Complex
4
ML Lifecycle
Data Prep
Training
Deploy
Raw Data
μ
λ θ Tunin
g
Scal
e
μ
λ θ Tunin
g
Scal
e
Scal
e
Scal
e
Model
Exchang
e
Governanc
e
Delta
5
Custom ML Platforms
Facebook FBLearner, Uber Michelangelo, Google TFX
+Standardize the data prep / training / deploy loop:
if you work with the platform, you get these!
–Limited to a few algorithms or frameworks
–Tied to one company’s infrastructure
Can we provide similar benefits in an open
manner?
6
Introducing
Open machine learning platform
• Works with any ML library & language
• Runs the same way anywhere (e.g. any cloud)
• Designed to be useful for 1 or 1000+ person orgs
7
MLflow Components
Tracking
Record and query
experiments: code,
configs, results,
…etc
Project
s
Packaging format
for reproducible
runs on any
platform
Model
s
General model
format
that supports diverse
deployment tools
8
Key Concepts in Tracking
Parameters: key-value inputs to your code
Metrics: numeric values (can update over time)
Source: training code that ran
Version: version of the training code
Artifacts: files, including data and models
Tags and Notes: any additional information
9
MLflow Tracking
Tracking Server
UI
API
Tracking APIs
(REST, Python, Java, R)
10
MLflow Tracking
Tracking
Record and query
experiments: code,
configs, results,
…etc
import mlflow
with mlflow.start_run():
mlflow.log_param("layers", layers)
mlflow.log_param("alpha", alpha)
# train model
mlflow.log_metric("mse", model.mse())
mlflow.log_artifact("plot", model.plot(test_df))
mlflow.tensorflow.log_model(model)
11
Demo
Goal: Classify hand-drawn digits
1. Instrument Keras training code with MLflow tracking APIs
2. Run training code as an MLflow Project
3. Deploy an MLflow Model for real-time serving
12
MLflow backend stores
1. Entity (Metadata) Store
• FileStore (local filesystem)
• SQLStore (via SQLAlchemy)
• REST Store
2. Artifact Store
• S3 backed store
• Azure Blob storage
• Google Cloud storage
• DBFS artifact repo
13
MLflow Components
Tracking
Record and query
experiments: code,
configs, results,
…etc
Project
s
Packaging format
for reproducible
runs on any
platform
Model
s
General model
format
that supports diverse
deployment tools
14
MLflow Projects Motivation
Diverse set of training
tools
Diverse set of
environments
Challenge:
ML results are
difficult to
reproduce.
15
Project Spec
Code
Data
Config
Local Execution
Remote Execution
MLflow Projects
Dependencie
s
16
MLflow Projects
Packaging format for reproducible ML runs
• Any code folder or GitHub repository
• Optional MLproject file with project configuration
Defines dependencies for reproducibility
• Conda (+ R, Docker, …) dependencies can be specified in MLproject
• Reproducible in (almost) any environment
Execution API for running projects
• CLI / Python / R / Java
• Supports local and remote execution
17
Example MLflow Project
my_project/
├── MLproject
│
│
│
│
│
├── conda.yaml
├── main.py
└── model.py
...
conda_env: conda.yaml
entry_points:
main:
parameters:
training_data: path
lambda: {type: float, default: 0.1}
command: python main.py {training_data}
{lambda}
$ mlflow run git://<my_project>
18
Demo
Goal: Classify hand-drawn digits
1. Instrument Keras training code with MLflow tracking APIs
2. Run training code as an MLflow Project
3. Deploy an MLflow Model for real-time serving
19
MLflow Components
Tracking
Record and query
experiments: code,
configs, results,
…etc
Project
s
Packaging format
for reproducible
runs on any
platform
Model
s
General model
format
that supports diverse
deployment tools
20
Inference
Code
Batch & Stream
Scoring
Serving Tools
Mlflow Models Motivation
ML
Frameworks
21
Model Format
Flavor 2Flavor 1
ML
Frameworks
Inference Code
Batch & Stream
Scoring
Serving Tools
Standard for ML
models
MLflow Models
22
MLflow Models
Packaging format for ML Models
• Any directory with MLmodel file
Defines dependencies for reproducibility
• Conda environment can be specified in MLmodel configuration
Model creation utilities
• Save models from any framework in MLflow format
Deployment APIs
• CLI / Python / R / Java
23
Example MLflow Model
my_model/
├── MLmodel
│
│
│
│
│
└ estimator/
├─ saved_model.pb
└─ variables/
...
Usable with Tensorflow
tools / APIs
Usable with any Python
tool
run_id: 769915006efd4c4bbd662461
time_created: 2018-06-28T12:34
flavors:
tensorflow:
saved_model_dir: estimator
signature_def_key: predict
python_function:
loader_module:
mlflow.tensorflow
mlflow.tensorflow.log_model(...)
24
Model Flavors Example
Train a model
mlflow.keras.log_model()
Model
Format
Flavor 1:
Pyfunc
Flavor 2:
Keras
predict = mlflow.pyfunc.load_pyfunc(…)
predict(input_dataframe)
model = mlflow.keras.load_model(…)
model.predict(keras.Input(…))
25
Model Flavors Example
predict = mlflow.pyfunc.load_pyfunc(…)
predict(input_dataframe)
26
Demo
Goal: Classify hand-drawn digits
1. Instrument Keras training code with MLflow tracking APIs
2. Run training code as an MLflow Project
3. Deploy an MLflow Model for real-time serving
27
1.0 Release
MLflow 1.0 was released recently! Major features:
•New metrics UI
•“Step” axis for metrics
•Improved search capabilities
•Package MLflow Models as Docker containers
•Support for ONNX models
27
28
Ongoing MLflow Roadmap
• New component: Model Registry for model management
• Multi-step project workflows
• Fluent Tracking API for Java and Scala
• Packaging projects with build steps
• Better environment isolation when loading models
• Improved model input/output schemas
29
Get started with MLflow
pip install mlflow to get started
Find docs & examples at mlflow.org
tinyurl.com/mlflow-slack
29
30
Thank you!
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
mlflow-databricks/

Contenu connexe

Plus de C4Media

Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsC4Media
 
High Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechHigh Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechC4Media
 
Rust's Journey to Async/await
Rust's Journey to Async/awaitRust's Journey to Async/await
Rust's Journey to Async/awaitC4Media
 
Opportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaOpportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaC4Media
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayC4Media
 
Are We Really Cloud-Native?
Are We Really Cloud-Native?Are We Really Cloud-Native?
Are We Really Cloud-Native?C4Media
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseC4Media
 
A Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinA Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinC4Media
 

Plus de C4Media (20)

Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery Teams
 
High Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechHigh Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in Adtech
 
Rust's Journey to Async/await
Rust's Journey to Async/awaitRust's Journey to Async/await
Rust's Journey to Async/await
 
Opportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaOpportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven Utopia
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 
Are We Really Cloud-Native?
Are We Really Cloud-Native?Are We Really Cloud-Native?
Are We Really Cloud-Native?
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL Database
 
A Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinA Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with Brooklin
 

Dernier

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 

Dernier (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 

MLflow: An Open Platform to Simplify the Machine Learning Lifecycle

  • 1. A platform for the Complete Machine Learning Lifecycle Corey Zumar June 24th, 2019
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ mlflow-databricks/
  • 3. Presented at QCon New York www.qconnewyork.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  • 4. 2 Outline • Overview of ML development challenges • How MLflow tackles these challenges • MLflow components • Demo • How to get started
  • 6. 4 ML Lifecycle Data Prep Training Deploy Raw Data μ λ θ Tunin g Scal e μ λ θ Tunin g Scal e Scal e Scal e Model Exchang e Governanc e Delta
  • 7. 5 Custom ML Platforms Facebook FBLearner, Uber Michelangelo, Google TFX +Standardize the data prep / training / deploy loop: if you work with the platform, you get these! –Limited to a few algorithms or frameworks –Tied to one company’s infrastructure Can we provide similar benefits in an open manner?
  • 8. 6 Introducing Open machine learning platform • Works with any ML library & language • Runs the same way anywhere (e.g. any cloud) • Designed to be useful for 1 or 1000+ person orgs
  • 9. 7 MLflow Components Tracking Record and query experiments: code, configs, results, …etc Project s Packaging format for reproducible runs on any platform Model s General model format that supports diverse deployment tools
  • 10. 8 Key Concepts in Tracking Parameters: key-value inputs to your code Metrics: numeric values (can update over time) Source: training code that ran Version: version of the training code Artifacts: files, including data and models Tags and Notes: any additional information
  • 12. 10 MLflow Tracking Tracking Record and query experiments: code, configs, results, …etc import mlflow with mlflow.start_run(): mlflow.log_param("layers", layers) mlflow.log_param("alpha", alpha) # train model mlflow.log_metric("mse", model.mse()) mlflow.log_artifact("plot", model.plot(test_df)) mlflow.tensorflow.log_model(model)
  • 13. 11 Demo Goal: Classify hand-drawn digits 1. Instrument Keras training code with MLflow tracking APIs 2. Run training code as an MLflow Project 3. Deploy an MLflow Model for real-time serving
  • 14. 12 MLflow backend stores 1. Entity (Metadata) Store • FileStore (local filesystem) • SQLStore (via SQLAlchemy) • REST Store 2. Artifact Store • S3 backed store • Azure Blob storage • Google Cloud storage • DBFS artifact repo
  • 15. 13 MLflow Components Tracking Record and query experiments: code, configs, results, …etc Project s Packaging format for reproducible runs on any platform Model s General model format that supports diverse deployment tools
  • 16. 14 MLflow Projects Motivation Diverse set of training tools Diverse set of environments Challenge: ML results are difficult to reproduce.
  • 17. 15 Project Spec Code Data Config Local Execution Remote Execution MLflow Projects Dependencie s
  • 18. 16 MLflow Projects Packaging format for reproducible ML runs • Any code folder or GitHub repository • Optional MLproject file with project configuration Defines dependencies for reproducibility • Conda (+ R, Docker, …) dependencies can be specified in MLproject • Reproducible in (almost) any environment Execution API for running projects • CLI / Python / R / Java • Supports local and remote execution
  • 19. 17 Example MLflow Project my_project/ ├── MLproject │ │ │ │ │ ├── conda.yaml ├── main.py └── model.py ... conda_env: conda.yaml entry_points: main: parameters: training_data: path lambda: {type: float, default: 0.1} command: python main.py {training_data} {lambda} $ mlflow run git://<my_project>
  • 20. 18 Demo Goal: Classify hand-drawn digits 1. Instrument Keras training code with MLflow tracking APIs 2. Run training code as an MLflow Project 3. Deploy an MLflow Model for real-time serving
  • 21. 19 MLflow Components Tracking Record and query experiments: code, configs, results, …etc Project s Packaging format for reproducible runs on any platform Model s General model format that supports diverse deployment tools
  • 22. 20 Inference Code Batch & Stream Scoring Serving Tools Mlflow Models Motivation ML Frameworks
  • 23. 21 Model Format Flavor 2Flavor 1 ML Frameworks Inference Code Batch & Stream Scoring Serving Tools Standard for ML models MLflow Models
  • 24. 22 MLflow Models Packaging format for ML Models • Any directory with MLmodel file Defines dependencies for reproducibility • Conda environment can be specified in MLmodel configuration Model creation utilities • Save models from any framework in MLflow format Deployment APIs • CLI / Python / R / Java
  • 25. 23 Example MLflow Model my_model/ ├── MLmodel │ │ │ │ │ └ estimator/ ├─ saved_model.pb └─ variables/ ... Usable with Tensorflow tools / APIs Usable with any Python tool run_id: 769915006efd4c4bbd662461 time_created: 2018-06-28T12:34 flavors: tensorflow: saved_model_dir: estimator signature_def_key: predict python_function: loader_module: mlflow.tensorflow mlflow.tensorflow.log_model(...)
  • 26. 24 Model Flavors Example Train a model mlflow.keras.log_model() Model Format Flavor 1: Pyfunc Flavor 2: Keras predict = mlflow.pyfunc.load_pyfunc(…) predict(input_dataframe) model = mlflow.keras.load_model(…) model.predict(keras.Input(…))
  • 27. 25 Model Flavors Example predict = mlflow.pyfunc.load_pyfunc(…) predict(input_dataframe)
  • 28. 26 Demo Goal: Classify hand-drawn digits 1. Instrument Keras training code with MLflow tracking APIs 2. Run training code as an MLflow Project 3. Deploy an MLflow Model for real-time serving
  • 29. 27 1.0 Release MLflow 1.0 was released recently! Major features: •New metrics UI •“Step” axis for metrics •Improved search capabilities •Package MLflow Models as Docker containers •Support for ONNX models 27
  • 30. 28 Ongoing MLflow Roadmap • New component: Model Registry for model management • Multi-step project workflows • Fluent Tracking API for Java and Scala • Packaging projects with build steps • Better environment isolation when loading models • Improved model input/output schemas
  • 31. 29 Get started with MLflow pip install mlflow to get started Find docs & examples at mlflow.org tinyurl.com/mlflow-slack 29
  • 33. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ mlflow-databricks/