Train and deliver machine learning models to production with a single command
STEPAN PUSHKAREV
ILNUR GARIFULLIN
Today’s webinar overview
1. Machine Learning Workflow
2. Tools overview
a. Kubeflow
b. Hydrosphere.io
3. Deep Dive into Automation
a. Steps definition
b. Steps automation
Machine Learning Workflow
ML Workflow
1. Research
2. Data Preparation
3. Model Training
4. Model Cataloguing
5. Model Deployment
6. Model Integration Testing
7. Production Inferencing
8. Model Performance Monitoring
9. Model Maintenance
Step 1: Research
● Defining an objective
● Defining requirements
● Defining methods
● Defining data sources
Step 2: Data Preparation
● Collecting data
● Preparing data
○ Cleaning
○ Feature engineering
○ Transformation
● Important: the same preparation steps must be reused at inference time.
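One way to honor the reuse requirement above is to keep the preparation logic in a single function that both the training path and the inference path call. The following is a minimal sketch under that assumption; the function names and the 0-255 pixel scaling are illustrative and are not taken from the slides.

```python
from typing import List, Sequence


def prepare_pixels(raw: Sequence[int]) -> List[float]:
    """Scale 8-bit grayscale pixel values (0-255) to floats in [0.0, 1.0]."""
    if any(p < 0 or p > 255 for p in raw):
        raise ValueError("pixel values must be in the range 0-255")
    return [p / 255.0 for p in raw]


def build_training_row(raw: Sequence[int], label: int) -> dict:
    # Training path: prepare the features, then attach the label.
    return {"features": prepare_pixels(raw), "label": label}


def build_inference_request(raw: Sequence[int]) -> dict:
    # Inference path: exactly the same preparation, without a label.
    return {"features": prepare_pixels(raw)}
```

Because both paths go through `prepare_pixels`, a change to the scaling cannot silently apply to training only.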
Step 3: Model Training
● Building the model
● Training the model
● Evaluating the model
● Tuning hyper-parameters
● Versioning training data
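For the last bullet, versioning training data, one lightweight option is to derive a version identifier from the data itself. This is a sketch of that idea, not the approach used in the webinar; the function name and the 12-character truncation are assumptions.

```python
import hashlib
from pathlib import Path


def dataset_version(data_dir: str) -> str:
    """Return a short, deterministic fingerprint of every file under data_dir."""
    root = Path(data_dir)
    digest = hashlib.sha256()
    # Sort so the result does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Include the relative path so renames change the version too.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()[:12]
```

The returned string can be stored next to the trained model so that a model can later be traced back to the exact data it saw.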
Step 4: Model Cataloguing
● Metadata extraction
○ Graph definition
○ Weights
○ Training data version / stats
○ Other dependencies (lookup vocabulary, etc.)
● Indexing model’s binaries
● Versioning a model artifact
● Storing a model in Repository
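The cataloguing items above can be pictured as one metadata record stored alongside the model binaries. The sketch below assumes a hypothetical record layout (`ModelRecord` and its field names are not from the slides) and uses a checksum of the weights as the index key for the binary.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ModelRecord:
    name: str
    version: int
    weights_sha256: str          # identifies the model binary
    training_data_version: str   # ties the model to its training data
    dependencies: List[str] = field(default_factory=list)
    metrics: Dict[str, float] = field(default_factory=dict)


def catalogue(name: str, version: int, weights: bytes,
              training_data_version: str,
              dependencies=(), metrics=None) -> str:
    """Build a JSON metadata document for one model version."""
    record = ModelRecord(
        name=name,
        version=version,
        weights_sha256=hashlib.sha256(weights).hexdigest(),
        training_data_version=training_data_version,
        dependencies=list(dependencies),
        metrics=dict(metrics or {}),
    )
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

The JSON string can then be written to whichever repository holds the model artifact.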
Step 5: Model Deployment
● Preparing infrastructure for the model
● Preparing runtime for the model
● Deploying the model server
● Exposing API endpoints to the model
● Model Integration
Step 6: Model Integration Testing
● Performing integration tests
● Replaying a golden data set
● Replaying edge cases
● Replaying recent traffic
● Asserting results
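Replaying a data set and asserting results, as listed above, can be reduced to a small loop. In this sketch `predict` stands for any callable that reaches the deployed model; here it is replaced by a toy `argmax_model` so the example is self-contained. All names are hypothetical.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, expected label)


def replay(predict: Callable[[Sequence[float]], int],
           examples: Iterable[Example]) -> List[Tuple[int, int, int]]:
    """Send every example to predict; return (index, expected, actual) per mismatch."""
    failures = []
    for index, (features, expected) in enumerate(examples):
        actual = predict(features)
        if actual != expected:
            failures.append((index, expected, actual))
    return failures


def assert_replay(predict, examples, max_failures: int = 0) -> None:
    """Raise AssertionError if more than max_failures examples are mispredicted."""
    examples = list(examples)
    failures = replay(predict, examples)
    if len(failures) > max_failures:
        raise AssertionError(
            f"{len(failures)} of {len(examples)} replayed examples failed: {failures[:5]}"
        )


def argmax_model(features: Sequence[float]) -> int:
    # Toy stand-in for a deployed model: index of the largest feature.
    return max(range(len(features)), key=lambda i: features[i])
```

The same loop works for a golden data set, a list of edge cases, or a sample of recent traffic; only the source of `examples` changes.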
Step 7: Production Inferencing
● A/B & Canary deployment
● Model scaling
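A canary deployment sends a small share of requests to a new model version while the rest stay on the current one. The sketch below shows only the routing decision, with weights chosen by the operator; it is an illustration and not how any particular serving platform implements it. The version name `mnist:2` is hypothetical.

```python
import random
from typing import Dict, Optional


def pick_version(weights: Dict[str, float],
                 rng: Optional[random.Random] = None) -> str:
    """Choose a model version with probability proportional to its weight."""
    if not weights:
        raise ValueError("at least one version is required")
    if any(w < 0 for w in weights.values()) or sum(weights.values()) <= 0:
        raise ValueError("weights must be non-negative and sum to a positive value")
    if rng is None:
        rng = random.Random()
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


# Roughly 10% of requests go to the canary version.
canary_split = {"mnist:1": 0.9, "mnist:2": 0.1}
```

Setting both weights to the same value turns the same function into a simple A/B split.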
Step 8: Model Performance Monitoring
● System metrics monitoring
● Model metrics tracking
● Model comparison
● Concept drift monitoring
● Anomaly detection
● Data profiling
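As a rough illustration of the monitoring items above, the sketch below compares the mean of one input feature in recent traffic against a reference sample from training. This is a deliberately simple input-distribution check, assumed here as a cheap proxy; it is not a full concept drift detector, and the threshold of 3 standard errors is an arbitrary choice.

```python
import math
import statistics
from typing import Sequence


def mean_shift_score(reference: Sequence[float], recent: Sequence[float]) -> float:
    """Distance between recent and reference means, in standard errors."""
    if len(reference) < 2 or not recent:
        raise ValueError("need at least 2 reference values and 1 recent value")
    ref_mean = statistics.fmean(reference)
    ref_std = statistics.stdev(reference)
    recent_mean = statistics.fmean(recent)
    if ref_std == 0:
        # A constant reference: any change at all counts as an infinite shift.
        return 0.0 if recent_mean == ref_mean else math.inf
    standard_error = ref_std / math.sqrt(len(recent))
    return abs(recent_mean - ref_mean) / standard_error


def drifted(reference: Sequence[float], recent: Sequence[float],
            threshold: float = 3.0) -> bool:
    """Flag the feature when the shift exceeds the chosen threshold."""
    return mean_shift_score(reference, recent) > threshold
```

A check like this would run per feature on a schedule, and a positive result would feed the alerting described in the maintenance step.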
Step 9: Model Maintenance
● Alerts & Troubleshooting
● Root Cause Analysis
● Edge Case Exploration
● Retraining Dataset Subsampling
● Retraining
The Toolset
Hydrosphere.io: The Machine Learning Model Management Platform
Kubeflow: The Machine Learning Toolkit for Kubernetes
What is Kubeflow?
● Began as a Kubernetes template / blueprint for running TensorFlow
● Evolved into a “toolkit”: loosely coupled tools and blueprints for ML on Kubernetes
What is Hydrosphere.io?
Hydrosphere.io is a platform for ML model management.
- A specific value-add tool that forms one part of the toolkit
- Open source
- Augments Cataloguing, Deployment, Inferencing, Monitoring and Maintenance
Tools Landscape
[Figure: tools mapped onto the workflow stages Research, Data Prep, Training, Cataloguing, Deployment, Integration Testing, Production Inferencing, Performance Monitoring, and Model Maintenance. Labels recoverable from the slide: Orchestrate, ModelDB.]
Deep Dive into Workflow Automation
Part 1: Creating executables
Step 1: Research
● Objective: given an image of a handwritten digit, predict which digit it is (MNIST)
● Requirements: easy model export
● Tools and methods: TensorFlow Estimator API
● Data: MNIST dataset
Step 2: Data Preparation - Building Container
Dockerfile:
FROM python:3.6-slim
RUN pip install numpy==1.14.3 Pillow==5.2.0
ADD ./download.py /src/
WORKDIR /src/
ENTRYPOINT [ "python", "download.py" ]
Build and push the image:
$ docker build -t {username}/mnist-pipeline-download .
$ docker push {username}/mnist-pipeline-download
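The slides do not show download.py itself, so the following is not the authors' script. As a rough illustration of what an MNIST preparation step tends to involve, this sketch parses the IDX binary format in which the MNIST files are distributed (big-endian header, magic number 2051 for images and 2049 for labels). It assumes the caller has already downloaded and gunzipped the files; the function names are hypothetical.

```python
import struct
from typing import List, Tuple

IMAGE_MAGIC = 2051  # IDX magic number for image files
LABEL_MAGIC = 2049  # IDX magic number for label files


def parse_idx_images(blob: bytes) -> Tuple[int, int, List[bytes]]:
    """Return (rows, cols, images); each image is rows*cols raw pixel bytes."""
    magic, count, rows, cols = struct.unpack(">IIII", blob[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic} for an image file")
    size = rows * cols
    body = blob[16:]
    if len(body) != count * size:
        raise ValueError("image file length does not match its header")
    return rows, cols, [body[i * size:(i + 1) * size] for i in range(count)]


def parse_idx_labels(blob: bytes) -> List[int]:
    """Return the labels as a list of integers."""
    magic, count = struct.unpack(">II", blob[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected magic number {magic} for a label file")
    body = blob[8:]
    if len(body) != count:
        raise ValueError("label file length does not match its header")
    return list(body)
```

In a container like the one above, a script of this kind would write the parsed arrays to a mounted volume so that the training stage can read them.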
Step 3: Model Training — Building a model
Step 3: Model Training
Step 3.5: Model Training and Saving
Step 4: Model Cataloguing
DIY:
● Instrument the training pipeline
● Store metadata
● Zip the model and metadata
● Store in S3, or push to Artifactory, or push to git
ModelDB:
● Python DSL: sync model, sync test data, sync metrics
● Nice UI
Hydrosphere.io:
$ hs upload /models/mnist/
$ hs profile push /data/mnist/
Step 4: Model Cataloguing - Hydrosphere.io
$ hs upload /models/mnist/
$ hs profile push /data/mnist/
● Versions the model
● Extracts metadata
● Builds a model Docker image
● Stores the image in a Docker registry
Step 5: Model Deployment
DIY:
● Implement a model server (Flask app)
● Look up the model
● Dockerize
● Add Kube configs, tags
● Expose API (HTTP, gRPC, batch, streaming)
Niche tools:
● TensorFlow Serving
● PyTorch Serving
● Nvidia TensorRT Serving
Hydrosphere.io:
$ hs apply -f - << EOF
kind: Application
name: "MyPredictionApp"
singular:
  model: mnist:1
  runtime: "serving-runtime-python:1.7.0-latest"
EOF
Step 5: Model Deployment - Hydrosphere.io
[Diagram labels: metadata, runtime, model]
● Model launched on Kube
● Exposed through HTTP, gRPC, and Kafka APIs
Step 6: Model Integration Testing
DIY:
● Implement a testing script
● Dockerize, add to Kube
● Replay a golden data set
● Replay edge cases
● Replay recent traffic
● Assert results
Hydrosphere Serving (Q2 2019):
$ hs test -f /test/dataset
$ hs test replay anomalies
$ hs test replay <from_date>
Step 7: Production Inferencing
Step 8: Model Performance Monitoring
Step 9: Model Maintenance
[Diagram labels: alert, accuracy drops, data changed, what exactly?]
Step 9: Model Maintenance - explainability of monitoring alerts
Deep Dive into Workflow Automation
Part 2: Defining Kubeflow Pipeline
Defining Kubeflow Pipeline
Parametrizing function
Stage 1: Defining Downloading Container
Stage 1: Mounting Volumes
Stage 2: Defining Training Container
Stage 3: Defining Uploading Container
Stage 4: Defining Deploying Container
Stage 5: Defining Testing Container
Stage 6: Defining Cleaning Container
Compiling Pipeline
$ python pipeline.py pipeline.tar.gz
$ tar -xvf pipeline.tar.gz # produces pipeline.yaml
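The pipeline code itself was shown only as slide images, so it is not reproduced here. Since the resulting pipeline.yaml is submitted with argo, the sketch below hand-builds a workflow of the same general shape as an Argo Workflow manifest: one container template per stage, executed in sequence. It does not use the Kubeflow Pipelines SDK. Only the download image name follows the source; the other image names are assumed to follow the same pattern, and volumes and arguments are omitted.

```python
import json
from typing import Dict, List

# Stage names taken from the slide titles (Stage 1 to Stage 6).
STAGES = ["download", "train", "upload", "deploy", "test", "clean"]


def build_stages(username: str) -> List[Dict]:
    """One container template per stage; image names follow an assumed pattern."""
    return [
        {"name": stage, "container": {"image": f"{username}/mnist-pipeline-{stage}"}}
        for stage in STAGES
    ]


def sequential_workflow(prefix: str, stages: List[Dict]) -> Dict:
    """Build an Argo Workflow dict that runs the stages one after another."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": prefix},
        "spec": {
            "entrypoint": "main",
            "templates": [
                {
                    "name": "main",
                    # Each inner list is one step group; groups run sequentially.
                    "steps": [
                        [{"name": s["name"], "template": s["name"]}] for s in stages
                    ],
                },
                *stages,
            ],
        },
    }


if __name__ == "__main__":
    # JSON is also valid YAML, so this prints a manifest-shaped document.
    print(json.dumps(sequential_workflow("mnist-", build_stages("username")), indent=2))
```

A real pipeline would also need shared volumes between stages, as the "Mounting Volumes" slide indicates, so that the downloaded data and the trained model are visible to later containers.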
Executing Pipeline with a single command
$ argo submit pipeline.yaml --watch
Executing Pipeline with UI
Source code
https://github.com/Hydrospheredata/hydro-serving-kubeflow-demo
Contact Us
GENERAL INQUIRIES
hydrosphere.io
info@hydrosphere.io
linkedin.com/company/hydrospherebigdata
twitter.com/hydrospheredata
facebook.com/hydrosphere.io
ADDRESS
125 University Avenue, Suite 290
Palo Alto, CA, 94301
tel: 650-521-7875
BUSINESS AND TECHNICAL
Stepan Pushkarev
spushkarev@hydrosphere.io
Ilnur Garifullin
igarifullin@provectus.com

Hydrosphere.io for ODSC: Webinar on Kubeflow
