MLOps with Kubeflow

  1. 1. MLOps with Saurabh Kaushik @saurabhkaushik
  2. 2. About Me: Saurabh Kaushik @saurabhkaushik • Director, Product Engineering Management @Eureka.AI • Helps telcos monetize their data using AI and data products • Engineered and deployed 20+ AI product solutions • Experience: 20+ years in various roles (Consultant/Lead/Architect/Manager/Director) • Domain: FinTech, AdTech, MarTech • Industry: Telco, Banking, Financial, Retail, CPG • Tech: Data Science, ML, DL, NLP, Big Data, Java, Python, Full Stack • Org: Products, Enterprise, Service, Tech Startups • Speaker at: Product School, NASSCOM, World Startup Expo, Institute of Product Leadership, IIMB, TechGigs • Hobbies: Tennis, Piano, Building Bots (Botreload.com)
  3. 3. Why do we need MLOps?
  4. 4. Most Data Scientists' View of the World… Love to be in this zone!
  5. 5. How can someone take care of all these… • How do I autoscale each stage independently? • How can I choose a different tool for each stage of the pipeline? • How can I run my workload seamlessly across environments? • How can I deploy my model without worrying too much about containerization or cluster management? • Each stage of the pipeline has different needs: Training is compute-heavy and needs a lot of memory; Serving needs fast compute and little memory. • How do I manage this compute and memory allocation dynamically? MLOps!
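To make the "different needs per stage" point concrete, here is a minimal sketch of how per-stage compute and memory could be expressed as Kubernetes container resource requests and limits. The stage names, image names and all numbers are hypothetical and chosen only for illustration; they are not taken from the talk.

```python
import json

# Hypothetical per-stage resource profiles (illustrative numbers only).
# Training: compute-heavy, high memory. Serving: small and fast.
STAGE_RESOURCES = {
    "training": {
        "requests": {"cpu": "8", "memory": "32Gi"},
        "limits": {"cpu": "16", "memory": "64Gi"},
    },
    "serving": {
        "requests": {"cpu": "500m", "memory": "512Mi"},
        "limits": {"cpu": "1", "memory": "1Gi"},
    },
}


def container_spec(stage: str, image: str) -> dict:
    """Return a Kubernetes container spec fragment for one pipeline stage."""
    return {"name": stage, "image": image, "resources": STAGE_RESOURCES[stage]}


# The image names below are placeholders, not real images.
train = container_spec("training", "example.com/train:latest")
serve = container_spec("serving", "example.com/serve:latest")
print(json.dumps([train, serve], indent=2))
```

Keeping the profiles in one table means each stage can be tuned without touching the others, which is the independence the questions above ask for.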
  6. 6. What is Kubeflow?
  7. 7. Kubeflow is the solution… • Kubeflow is an open-source artificial intelligence/machine learning (AI/ML) toolkit that helps improve the deployment, portability and management of AI/ML models. • Kubeflow lets users quickly create, train and tune neural networks on Kubernetes, with dynamic resource provisioning. • Kubeflow works well with TensorFlow and other modern AI/ML frameworks such as PyTorch, MXNet and Chainer, so users can build on their existing code and setup. • Machine Learning Toolkit on Kubernetes
  8. 8. Kubeflow – Origin • Kubeflow was originally released in March 2018 by Google as an open-source initiative for developing machine learning applications with TensorFlow on top of Kubernetes while minimising MLOps effort. • Google had been using TFX-based pipelines to deploy ML models in production on Kubernetes-based infrastructure, and offered Kubeflow as a way to combine the power of TensorFlow and Kubernetes.
  9. 9. Kubeflow – Building Principles • Composability • Scalability • Portability
  10. 10. Composability • Lets you choose what is right for the project at each stage of the pipeline, e.g. frameworks, tools, libraries and versions.
  11. 11. Portability • Lets ML workloads run seamlessly anywhere, on any platform, e.g. laptop, cloud, on-prem, any OS.
  13. 13. Scalability • Lets each stage autoscale on the available resources, with an independent configuration per stage.
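One way to picture "independent configuration per stage" is a separate Kubernetes HorizontalPodAutoscaler for each long-running stage. The sketch below builds such manifests as plain Python dicts using the standard `autoscaling/v2` schema. The Deployment names and thresholds are hypothetical, and the talk itself does not say which autoscaling mechanism is used.

```python
import json


def hpa_manifest(deployment: str, min_replicas: int, max_replicas: int,
                 cpu_percent: int) -> dict:
    """Build an autoscaling/v2 HorizontalPodAutoscaler for one Deployment."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization",
                               "averageUtilization": cpu_percent},
                },
            }],
        },
    }


# Hypothetical stages, each with its own scaling bounds and CPU target.
manifests = [
    hpa_manifest("feature-service", min_replicas=1, max_replicas=4, cpu_percent=80),
    hpa_manifest("model-server", min_replicas=2, max_replicas=20, cpu_percent=60),
]
print(json.dumps(manifests, indent=2))
```

Kubernetes accepts JSON manifests as well as YAML, so the printed output could be saved to a file and applied with `kubectl apply -f`.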
  15. 15. What are the key MLOps capabilities in Kubeflow?
  16. 16. Jupyter - notebooks • Kubeflow comes with support for managing Jupyter notebooks, an open-source application that lets users blend code, equation-style notation, free text and dynamic visualisations, giving data scientists a single point of access to their experimental setup and notes.
  17. 17. Katib - hyper-parameter tuning • Hyperparameters are set before training begins, rather than learned from the data. These parameters (e.g. the topology or number of layers of a neural network) can be tuned with Katib.
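As a rough illustration of what a tuner automates, here is a minimal random-search sketch in plain Python. It does not use Katib. The `objective` function is a toy stand-in for "train a model and return a validation score", and the parameter names and ranges are assumptions made for this example.

```python
import random


def objective(learning_rate: float, num_layers: int) -> float:
    """Toy stand-in for a validation score; a real trial would train a model."""
    return -((learning_rate - 0.01) ** 2) * 1e4 - abs(num_layers - 3) * 0.1


def random_search(n_trials: int, seed: int = 0):
    """Sample hyperparameters at random and keep the best-scoring trial."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {
            # Sample the learning rate on a log scale between 1e-4 and 1e-1.
            "learning_rate": 10 ** rng.uniform(-4, -1),
            "num_layers": rng.randint(1, 6),
        }
        score = objective(**params)
        if best is None or score > best[0]:
            best = (score, params)
    return best


best_score, best_params = random_search(n_trials=50)
print(best_params, best_score)
```

Katib runs the same kind of loop, but each trial is a Kubernetes workload and the search algorithm is pluggable.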
  18. 18. Katib - hyper-parameter tuning • Katib supports various ML frameworks such as TensorFlow, PyTorch and MXNet, making it easy to reuse results from previous experiments.
  19. 19. Katib - hyper-parameter tuning • Hyperparameter Tuning – Pipeline
  20. 20. Katib - hyper-parameter tuning • YAML for Katib
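The YAML shown on the slide is not preserved in this transcript. As a stand-in, the sketch below builds a Katib `Experiment` manifest as a Python dict, assuming the `kubeflow.org/v1beta1` API. The experiment name, container image, script path, metric name and search range are all hypothetical.

```python
import json

# Hypothetical experiment: random search over one learning-rate parameter.
experiment = {
    "apiVersion": "kubeflow.org/v1beta1",
    "kind": "Experiment",
    "metadata": {"name": "random-search-demo", "namespace": "kubeflow"},
    "spec": {
        "objective": {
            "type": "maximize",
            "goal": 0.99,
            "objectiveMetricName": "accuracy",  # assumed metric name
        },
        "algorithm": {"algorithmName": "random"},
        "parallelTrialCount": 3,
        "maxTrialCount": 12,
        "maxFailedTrialCount": 3,
        "parameters": [{
            "name": "lr",
            "parameterType": "double",
            "feasibleSpace": {"min": "0.01", "max": "0.05"},
        }],
        "trialTemplate": {
            "primaryContainerName": "training-container",
            "trialParameters": [{
                "name": "learningRate",
                "description": "Learning rate for the training job",
                "reference": "lr",  # must match a name under spec.parameters
            }],
            "trialSpec": {
                "apiVersion": "batch/v1",
                "kind": "Job",
                "spec": {"template": {"spec": {
                    "containers": [{
                        "name": "training-container",
                        "image": "example.com/train:latest",  # placeholder
                        "command": [
                            "python3", "/opt/train.py",  # placeholder script
                            "--lr=${trialParameters.learningRate}",
                        ],
                    }],
                    "restartPolicy": "Never",
                }}},
            },
        },
    },
}

manifest_json = json.dumps(experiment, indent=2)
print(manifest_json)
```

Each trial is an ordinary Kubernetes Job. Katib substitutes the sampled value into the `${trialParameters.learningRate}` placeholder before launching it.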
  21. 21. Pipelines • Kubeflow Pipelines facilitates end-to-end orchestration of ML workflows, management of multiple experiments and approaches, and easier reuse of previously successful solutions in new workflows. This saves developers and data scientists time and effort.
  22. 22. Pipelines • Building a pipeline requires some technical skill.
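To show the idea behind pipeline orchestration without depending on the Kubeflow Pipelines SDK, here is a small stdlib-only sketch: steps are plain functions, dependencies form a DAG, and the runner executes them in topological order. This is a conceptual toy, not the Kubeflow API. The step names and the trivial "model" are invented for illustration, and it needs Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def preprocess(ctx: dict) -> None:
    """Produce a tiny toy dataset."""
    ctx["data"] = [x / 10 for x in range(10)]


def train(ctx: dict) -> None:
    """'Train' a trivial model: the mean of the data."""
    ctx["model"] = sum(ctx["data"]) / len(ctx["data"])


def evaluate(ctx: dict) -> None:
    """Record the worst-case absolute error of the mean model."""
    ctx["error"] = max(abs(x - ctx["model"]) for x in ctx["data"])


STEPS = {"preprocess": preprocess, "train": train, "evaluate": evaluate}

# Each step maps to the set of steps it depends on.
DEPENDS_ON = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
}


def run_pipeline():
    """Run every step once, after all of its dependencies have run."""
    ctx: dict = {}
    order = list(TopologicalSorter(DEPENDS_ON).static_order())
    for name in order:
        STEPS[name](ctx)
    return order, ctx


order, ctx = run_pipeline()
print(order, ctx["model"], ctx["error"])
```

In Kubeflow Pipelines each step would run in its own container and pass artifacts between steps, but the dependency-ordered execution is the same idea.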
  23. 23. Serving • Kubeflow makes two serving systems available, KFServing and Seldon Core. Both support multi-framework model serving; the choice between them should be based on the needs of each project.
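For a feel of what a KFServing deployment looks like, the sketch below builds an `InferenceService` manifest as a Python dict, assuming the `serving.kubeflow.org/v1beta1` API group that KFServing used (the project was later renamed KServe, with a different API group). The service name, storage URI and replica counts are hypothetical.

```python
import json


def inference_service(name: str, framework: str, storage_uri: str,
                      min_replicas: int = 1, max_replicas: int = 3) -> dict:
    """Build a KFServing InferenceService manifest for a single predictor."""
    return {
        "apiVersion": "serving.kubeflow.org/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                "minReplicas": min_replicas,
                "maxReplicas": max_replicas,
                # The framework key (e.g. "tensorflow", "sklearn", "pytorch")
                # selects which model server loads the stored model.
                framework: {"storageUri": storage_uri},
            },
        },
    }


# Placeholder bucket path, not a real model location.
svc = inference_service("demo-model", "tensorflow", "gs://example-bucket/models/demo")
print(json.dumps(svc, indent=2))
```

Switching frameworks only changes the predictor key, which is what makes the serving layer multi-framework.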
  24. 24. How does an MLOps pipeline operate with Kubeflow?
  25. 25. Kubeflow – Architecture
  26. 26. Typical ML Process
  27. 27. Kubeflow – Experimental Phase
  28. 28. Kubeflow – Production Phase
  29. 29. Spotify – Case Study
  30. 30. Spotify – Kubeflow Transition • Standing on the shoulders of giants
  31. 31. How easy is it to do MLOps with Kubeflow?
  33. 33. Thank you • Twitter: @saurabhkaushik • LinkedIn: @saurabhkaushik