1. AI Pipeline Optimization
… using Kubeflow
© 2019 NetApp, Inc. All rights reserved. — NETAPP CONFIDENTIAL —
Muneer Ahmad (muneer.ahmad@netapp.com), AI Solutions Architect
Steve Guhr (steve.guhr@netapp.com), Solutions Engineer
2. Agenda
1) AI Pipeline Optimization & Architecture
2) Demo
3) Q&A
What’s it all about?
3. What is it and why did we do it?
AI Pipeline Optimization
4. AI Pipeline in general
… workflow across different sites
[Figure: AI pipeline stages distributed across Site 1 to Site 6: Data Ingestion, Data Analysis, Data Transformation, Data Validation, Data Splitting, Training, Model Validation, Training at Scale, Roll-Out, Serving, Monitoring, Logging]
5. AI Pipeline Portability
... working with multiple premises
[Figure: the same stack (Model, UX, Tooling, Framework, Storage, Runtime, Drivers, OS, Accelerator, HW) shown for three environments: Laptop, Training Rig, Cloud]
6. AI Pipeline Optimization
… using Kubernetes & Kubeflow
[Figure: the same stack (Model, UX, Tooling, Framework, Storage, Runtime, Drivers, OS, Accelerator, HW) shown for three environments: Laptop, Training Rig, Cloud]
7. Architectural Overview
What did we do (in a nutshell)?
[Figure: architecture diagram with the components Kubernetes, Trident, Kubeflow, JupyterHub, Pipeline, Katib, …]
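As a rough illustration of how a workload in this architecture could request Trident-backed storage, the following sketch builds a Kubernetes PersistentVolumeClaim manifest as a plain Python dictionary. The claim name `cifar10-data`, the size, and the storage class name `trident-nas` are hypothetical placeholders, not values from the demo; such a storage class would have to exist in the cluster and be served by Trident.

```python
import json


def build_pvc_manifest(name, size, storage_class, access_mode="ReadWriteMany"):
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},
            # Assumed name of a storage class backed by Trident.
            "storageClassName": storage_class,
        },
    }


# Hypothetical values for illustration only.
manifest = build_pvc_manifest("cifar10-data", "10Gi", "trident-nas")

# kubectl accepts JSON as well as YAML, so this output could be
# written to a file and applied with `kubectl apply -f <file>`.
print(json.dumps(manifest, indent=2))
```

A claim like this could then be mounted by notebook or pipeline pods so that data and trained models persist beyond the life of a single container.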
8. How did we do it?
Everyone loves demos, right?!
9. Explaining the pipeline
Pre-processing
Training (classification, CIFAR10 dataset)
Deploying and serving trained models
TensorRT Inference engine
Web-application
… using actual data and training
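The steps listed on this slide can be sketched as a small dependency graph. This is a minimal plain-Python sketch, not the Kubeflow Pipelines SDK and not the code used in the demo; the step names and the dependencies between them are assumptions read off the bullet list, and the handlers are placeholders.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each step maps to the set of steps it depends on (assumed ordering).
PIPELINE = {
    "pre-processing": set(),
    "training": {"pre-processing"},
    "serving": {"training"},
    "tensorrt-inference": {"serving"},
    "web-application": {"tensorrt-inference"},
}


def execution_order(graph):
    """Return the steps in an order that respects their dependencies."""
    return list(TopologicalSorter(graph).static_order())


def run_pipeline(graph, handlers):
    """Call one handler per step, passing along the results so far."""
    results = {}
    for step in execution_order(graph):
        results[step] = handlers[step](results)
    return results


# Placeholder handlers that only record that a step ran.
handlers = {step: (lambda results, s=step: f"{s} done") for step in PIPELINE}

order = execution_order(PIPELINE)
results = run_pipeline(PIPELINE, handlers)
print(" -> ".join(order))
```

In a real Kubeflow pipeline each handler would correspond to a containerized step; the sketch only shows how declaring dependencies determines the execution order.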
10. Closing Thoughts – What's next?
What about "Machine Learning Version Control"?
How do you "lift and shift" the whole AI application stack across hybrid clouds?
How do you manage (c)old trained models and data?
[…]
11. Resources
“Trident” for persistent volumes inside of containers:
https://github.com/NetApp/trident
https://netapp-trident.readthedocs.io/en/latest/
Kubernetes for container orchestration:
https://kubernetes.io/de/
Kubeflow as a "data science tool chest":
https://www.kubeflow.org/
Articles about "Data Science as a Service" and "Machine Learning Version Control":
https://www.linkedin.com/pulse/simplify-machine-learning-version-control-muneer-ahmad-dedmari/
https://www.linkedin.com/pulse/part-2-simplifying-dataops-datascience-service-jupyter-steve-guhr/
12. May the Data be with you!