ML development adds complexity beyond the traditional software development lifecycle. Unlike conventional software projects, ML projects cannot be left alone once they are delivered and deployed: the model must be monitored continuously to verify that its performance still satisfies all requirements.
Continuous Delivery of ML-Enabled Pipelines on Databricks using MLflow
2. Continuous delivery of ML
pipelines on Databricks using
MLflow and CICD Templates
Michael Shtelma, Databricks
Thunder Shiviah, Databricks
3. Agenda
The Challenges of implementing
CICD for ML (CD4ML) pipelines
The CICD challenges forcing ML teams to choose
between Databricks notebooks or local IDEs
Introducing DatabricksLabs
CICD Templates
How CICD Templates solves ML team production
challenges
Demo and Next Steps
4. Sato, Wider, and Windheuser, 2019
Continuous Delivery for Machine Learning (CD4ML) is a software
engineering approach in which a cross-functional team produces
machine learning applications based on code, data, and models in
small and safe increments that can be reproduced and reliably
released at any time, in short adaptation cycles.
6. ML teams struggle to combine traditional CICD
tools with Databricks notebooks
1. Benefits of Databricks notebooks
▪ Easy to use
▪ Scalable
▪ Provide access to ML tools such as MLflow for model logging and serving
2. Challenges
▪ Non-trivial to hook into traditional software development tools such as CI tools or local IDEs.
3. Result
▪ Teams find themselves choosing between
▪ using traditional IDE-based workflows but struggling to test and deploy at scale or
▪ using Databricks notebooks or other cloud notebooks but then struggling to ensure testing
and deployment reliability via CICD pipelines.
8. CICD Templates gives you the benefits of
traditional CICD workflows and the scale of
Databricks clusters
CICD Templates allows you to
● create a production pipeline from a template in a few steps
● that automatically hooks into GitHub Actions,
● runs tests and deployments on Databricks upon git commit or
whatever trigger you define, and
● gives you a test success status directly in GitHub so you know if your
commit broke the build
9. A scalable CICD pipeline in 5 easy steps
1. Install and customize with a single command
2. Create a new GitHub repo containing your Databricks host and
token secrets
3. Initialize git in your repo and commit the code.
4. Push your new CICD Templates project to the repo. Your tests will
start running automatically on Databricks. Depending on whether your tests
succeed or fail, you will get a green checkmark or a red x next to your commit
status.
5. You’re done! You now have a fully scalable CICD pipeline.
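To make the role of the host and token secrets from step 2 more concrete, here is a minimal sketch of how a CI job could submit a test script to Databricks as a one-time run. It is not taken from CICD Templates itself: the environment variable names `DATABRICKS_HOST` and `DATABRICKS_TOKEN`, the helper names, the run and task names, and the use of the Jobs API 2.1 `runs/submit` endpoint are all assumptions made for illustration.

```python
import json
import os
import urllib.request


def build_submit_request(host, token, python_file, cluster_spec):
    """Build (but do not send) a one-time run submission for a test script.

    All names here are hypothetical; the payload follows the general shape
    of a Databricks Jobs API 2.1 runs/submit request.
    """
    payload = {
        "run_name": "ci-tests",  # hypothetical run name
        "tasks": [
            {
                "task_key": "run_tests",  # hypothetical task key
                "new_cluster": cluster_spec,
                "spark_python_task": {"python_file": python_file},
            }
        ],
    }
    return urllib.request.Request(
        url=host.rstrip("/") + "/api/2.1/jobs/runs/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_tests(python_file, cluster_spec):
    """Read the secrets from the environment and submit the run.

    Assumes the CI system exposes the repo secrets as DATABRICKS_HOST and
    DATABRICKS_TOKEN (assumed names). Returns the run_id from the response.
    """
    host = os.environ["DATABRICKS_HOST"]
    token = os.environ["DATABRICKS_TOKEN"]
    request = build_submit_request(host, token, python_file, cluster_spec)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["run_id"]
```

Keeping the request construction separate from the network call means the payload can be checked in a plain unit test without a workspace. The cluster specification is passed in by the caller because runtime versions and node types depend on the workspace and cloud provider.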
10. CICD Templates executes tests and deployments
directly on Databricks while storing packages, model
logs and other artifacts in MLflow
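As an illustration of what storing test outcomes and artifacts in MLflow can look like, the sketch below records the results of a CI test run as an MLflow run. It is a hedged example, not the mechanism CICD Templates uses internally: the helper names, the metric names, and the `ci-<sha>` run naming are hypothetical. Only standard MLflow tracking calls (`start_run`, `log_param`, `log_metrics`, `log_artifact`) are used.

```python
def summarize_test_results(results):
    """Turn a {test_name: passed} mapping into numeric metrics.

    Metric names are hypothetical and chosen for illustration only.
    """
    total = len(results)
    passed = sum(1 for ok in results.values() if ok)
    return {
        "tests_total": float(total),
        "tests_passed": float(passed),
        "pass_rate": passed / total if total else 0.0,
    }


def log_ci_run(results, commit_sha, report_path=None):
    """Log one CI test run to the active MLflow tracking server."""
    # Imported lazily so the summary helper stays usable (and testable)
    # on machines where MLflow is not installed.
    import mlflow

    with mlflow.start_run(run_name="ci-" + commit_sha[:7]):
        mlflow.log_param("commit_sha", commit_sha)
        mlflow.log_metrics(summarize_test_results(results))
        if report_path is not None:
            # Attach a local file, for example a test report, to the run.
            mlflow.log_artifact(report_path)
```

Calling `log_ci_run({"test_train": True, "test_score": False}, commit_sha)` inside a pipeline would create one run per commit, which makes it possible to compare test outcomes across commits in the MLflow UI.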
15. Summary
The Challenges of implementing
CD4ML
The CICD challenges forcing ML teams to choose
between Databricks notebooks or local IDEs
Introducing DatabricksLabs
CICD Templates
How CICD Templates solves ML team production
challenges
Next Steps
Search DatabricksLabs cicd-templates or go
directly to https://github.com/databrickslabs/cicd-templates to get
started
michael.shtelma@databricks.com
thunder.shiviah@databricks.com