This deck provides an overview of containers and Kubernetes, and how these technologies can help solve the challenges faced by data scientists, ML engineers, and application developers. Next, it showcases the key capabilities required in a container and Kubernetes platform to help data scientists easily use technologies like Jupyter Notebooks, ML frameworks, and programming languages to innovate faster. Finally, it discusses the available platform options (e.g. Kubeflow, Open Data Hub, etc.) and some examples of how data scientists are accelerating their ML initiatives with a container and Kubernetes platform.
ODSC East 2020 Accelerate ML Lifecycle with Kubernetes and Containerized Data Science Tools
1. Accelerate ML Lifecycle with
Kubernetes and Containerized
Data Science Tools
April 16th, 2020
Abhinav Joshi & Tushar Katarki
Red Hat
2.
Abhinav Joshi
Senior Manager, Red Hat OpenShift Product Marketing
19+ yrs IT experience, 2 yrs at Red Hat, ex VMware, NetApp, Cisco
Email: abhjoshi@redhat.com
LinkedIn: https://www.linkedin.com/in/abhinavjoshi/
Tushar Katarki
Senior Manager, Red Hat OpenShift Product Management
20 yrs IT experience, 8 yrs at Red Hat, ex Oracle/Sun, Polycom, etc.
Email: tkatarki@redhat.com
LinkedIn: https://www.linkedin.com/in/katarki/
3. What we’ll discuss today
● Desired AI/ML architecture & execution challenges
● Why containers, Kubernetes, and DevOps for AI/ML
● Enterprise Kubernetes Platform examples
● Real world deployment use cases
5. AI/ML lifecycle and key personas
Lifecycle stages: Set goals → Gather and prepare data → Develop ML model → Deploy ML models in app dev process → Implement apps & inference ML models → Monitoring & management
Key personas: Business leadership, Data engineer, Data scientists, ML engineer, App developer, IT operations
6. Desired Conceptual Architecture
Lifecycle stages: Set goals → Gather and prepare data → Develop ML model → Deploy ML models in app dev process → Implement apps & inference → ML models monitoring & management
Platform layers (top to bottom):
● ML/DL and DevOps tools (e.g. TensorFlow, Jupyter Notebooks, Python, Seldon, etc.)
● ML/DL data pipeline and sources (databases, data lake, etc.)
● Compute acceleration (GPU, FPGA, TPU)
● Hybrid, multi-cloud platform with self-service capabilities
● Infrastructure: Physical, Virtual, Private, Public, Hybrid, Edge
7. AI/ML execution challenges
● Readily usable data lacking: lots of data is collected, but finding and preparing the right data is difficult.
● Talent shortage: a lack of key skills makes it difficult to find and secure the talent needed to maintain operations.
● Unavailability of infrastructure & software: slow access to infrastructure and software tools holds back data scientists and developers.
● Lack of collaboration across teams: unable to implement quickly due to slow, manual, and siloed operations.
Containers, Kubernetes, and DevOps can help!
8. What does a Data Scientist care about?
“As a Data Scientist, I want a ‘self-service, cloud-like’ experience for my Machine Learning projects, where I can access a rich set of modelling tools, data, and computational resources; share and collaborate with colleagues; and deliver my work into production with speed, agility, and repeatability to drive business value!”
● Self-service portal to access ML tools and data sources
● ML modelling with hardware acceleration
● ML model deployment in the app dev process
● Inferencing with hardware acceleration
Data scientists care less about the infrastructure platform unless it integrates with their ML tooling and provides agility, flexibility, portability, and scalability.
9. Containers, Kubernetes, and DevOps as part of the Hybrid, Multi-Cloud Platform
The conceptual architecture from slide 6, with containers, Kubernetes, and DevOps forming the hybrid, multi-cloud platform layer:
● ML/DL and DevOps tools (e.g. TensorFlow, Jupyter Notebooks, Python, Seldon, etc.)
● ML/DL data sources: databases (SQL, NoSQL, etc.), data lake, etc.
● Compute acceleration (GPU, FPGA, TPU)
● Hybrid, multi-cloud platform with self-service capabilities, built on containers, Kubernetes, and DevOps
● Infrastructure: Physical, Virtual, Private, Public, Hybrid, Edge
11. Containers
Basic units that make AI/ML programs shareable and portable across the hybrid cloud:
● Choice: containers include all your ML frameworks and tools
● Sharing: container images can be shared and iterated in flexible ways
● Immutable & portable: containerize once and run anywhere with integrity
● Versioning: incremental changes are tracked
● Fast & efficient: containers are Linux processes!
● Security: process isolation and resource control
[Diagram: multiple containers, each packaging an app with its supporting files and runtime, running on a shared container host operating system]
12. Kubernetes
Kubernetes is the de facto container management platform for the hybrid, multi-cloud: the foundation for a hybrid, multi-cloud platform with self-service capabilities for data scientists, developers, etc.
▸ Centralizes compute resources and provides a consistent experience across the data center, cloud, and edge
▸ Resource management for compute resources (including GPUs and FPGAs)
▸ Workload scheduling and management
▸ Multi-tenancy and quota enforcement
▸ Networking and storage abstractions
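As a sketch of the resource-management point above, a Kubernetes Deployment can request GPUs declaratively, and the scheduler places the pod on a node that has them. This is illustrative only: the image name and labels are hypothetical, and the `nvidia.com/gpu` resource assumes the NVIDIA device plugin is installed on the cluster.

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: notebook-gpu                 # hypothetical name for illustration
spec:
  replicas: 1
  selector:
    matchLabels:
      app: notebook-gpu
  template:
    metadata:
      labels:
        app: notebook-gpu
    spec:
      containers:
      - name: notebook
        image: example.registry/ml/jupyter-tf:latest   # placeholder image
        resources:
          limits:
            nvidia.com/gpu: 1        # extended resource exposed by the device plugin
            memory: 8Gi
            cpu: "2"
```

The same `resources` block is how multi-tenancy quotas are enforced: a namespace ResourceQuota can cap the total GPUs, CPU, and memory a team may claim.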
13. Self-service, Automation, CI/CD
Boosts speed, efficiency, and productivity:
▸ Jupyter Notebooks running on Kubernetes form the basis for self-service
▸ Source-to-Image (S2I) automatically converts a notebook into a container image that is ready to be deployed
▸ Kubernetes Operators provide automation and lifecycle management for the containers
▸ CI/CD makes rapid, incremental, and iterative change possible; open source technologies such as Argo, Tekton, Jenkins, and Spinnaker in conjunction with Kubernetes make this happen
▸ ‘Serverless’ technologies such as Knative will enable AI/ML users to spend more time developing their models
Image source: https://www.brainvire.com/devops/
14. Data Engineering
Easy, self-service, and repeatable:
Data sources: Kubernetes Persistent Volumes and S3 object storage make access to storage easy and standardized
Data pipes: Kubernetes networking and service mesh provide secure, high-bandwidth, low-latency data connectivity
Data streaming and manipulation: tools such as Apache Spark, Kafka, Presto, etc. can run natively and can be accessed as a service
Data governance: with open source technologies like Open Policy Agent (OPA)
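The "data sources" point can be made concrete with a PersistentVolumeClaim: the data scientist asks for storage by size and access mode, and Kubernetes binds the claim to whatever storage backend the cluster provides. Names, sizes, and the image below are illustrative, not from the deck.

```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: training-data          # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi            # size is illustrative
---
# Mounting the claim makes the dataset appear as an ordinary local path
apiVersion: v1
kind: Pod
metadata:
  name: train-job
spec:
  containers:
  - name: train
    image: example.registry/ml/train:latest   # placeholder image
    volumeMounts:
    - mountPath: /data
      name: dataset
  volumes:
  - name: dataset
    persistentVolumeClaim:
      claimName: training-data
```

The value of the abstraction is that the same claim works unchanged whether the backing store is Ceph, NFS, or a cloud block device.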
15. Deploying into production
To deliver business value and redeem the promise of AI in the enterprise:
Containerize models and expose the service with a REST API using the microservices pattern; a service mesh (such as Istio) makes this easy!
Models are incorporated into data pipeline jobs (batch or real-time) with tools such as Spark, Kafka, and Argo
Models are delivered into existing application workflows as binaries: PMML, ONNX, Pickle
Monitoring model performance and drift with open source tools native to Kubernetes, i.e. Prometheus and Grafana
CI/CD to drive continuous change and improvement in production
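The "containerize a model and expose it with a REST API" pattern can be sketched with nothing but the Python standard library. Everything here is illustrative: the "model" is a pickled dict of linear coefficients standing in for a real artifact (the PMML/ONNX/Pickle binaries mentioned above), and a production service would typically use a serving framework such as Seldon rather than `http.server`.

```python
import json
import pickle
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

# Stand-in for a model artifact loaded from storage: a pickled dict of
# linear-model coefficients (hypothetical values, for illustration only).
model_bytes = pickle.dumps({"weights": [0.5, -1.2], "bias": 0.1})
model = pickle.loads(model_bytes)

def predict(features):
    """Linear scoring: dot(weights, features) + bias."""
    return sum(w * x for w, x in zip(model["weights"], features)) + model["bias"]

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Expect {"features": [...]} and answer with {"score": ...}
        body = self.rfile.read(int(self.headers["Content-Length"]))
        features = json.loads(body)["features"]
        payload = json.dumps({"score": predict(features)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # keep the demo quiet
        pass

# Serve on a free local port in a background thread, then call it once
# the way a client application would.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

req = Request(
    f"http://127.0.0.1:{server.server_port}/predict",
    data=json.dumps({"features": [2.0, 1.0]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    result = json.load(resp)
server.shutdown()
```

Packaged into a container image, this process is exactly what Kubernetes would run behind a Service for load balancing, with a service mesh layered on for routing and mutual TLS.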
16. Stitching all of this together into an…
“Enterprise Kubernetes and Container Platform for AI/ML”
● Runs on any infrastructure: laptop, datacenter, cloud
● Enterprise container host(s)
● Container orchestration and management (Kubernetes)
● AI/ML/DL & DevOps tool chain: machine learning modeling, data pipeline
● Open source community ML toolkit for Kubernetes
● Delivers intelligent applications
18. Red Hat OpenShift Kubernetes Platform
[Diagram: the OpenShift stack — Kubernetes on Red Hat Enterprise Linux & RHEL CoreOS; cluster services (automated ops, over-the-air updates, monitoring, logging, registry, networking, router, KubeVirt, OLM, Helm); platform, application, and developer services (service mesh, serverless, builds, CI/CD pipelines, full stack logging, chargeback, databases, languages, runtimes, integration, business automation, 100+ ISV services, developer CLI, VS Code extensions, IDE plugins, CodeReady Workspaces, CodeReady Containers); multi-cluster management (discovery, policy, compliance, configuration, workloads); integrating with SCM (Git), CI/CD, and existing automation toolsets; running on physical, virtual, private cloud, public cloud, managed cloud (Azure, AWS, IBM, Red Hat), and Windows Server nodes]
For the data scientist, the platform provides:
● Deploy ML on any cloud
● Expose ML as services, load balanced and scalable
● Compute resources on-demand
● Best of SDLC
● ML in production
19. Open Data Hub - “Data and AI Platform for the Hybrid Cloud”
An open source community project:
● Open source AI/ML tooling
● Open source Red Hat technologies, e.g. OpenShift
● Automated deployment of open source AI/ML tooling with Kubernetes Operators
● https://www.opendatahub.io
20. Relationship between Kubeflow and the Open Data Hub project
● ML-as-a-service platform based on OpenShift, Ceph storage, Kafka, JupyterHub, and Spark
● Home for the k8s community to share operators for various apps/tools
21. Examples of containerized data science on Red Hat OpenShift
Across healthcare and public sector, automotive, financial, and oil and gas:
● Connected drive & autonomous driving
● Data-driven diagnosis
● Democratize data science for oil and gas exploration
● Containerized Apache Spark
● Discover Financial Services: Jupyter notebooks as a service
● Ministry of Defence (Israel)
● RBC Bank
22. ML/DL with Jupyter Notebook on an Enterprise Kubernetes Platform
[Diagram: a data scientist tests and iterates on an ML/DL model, with access to multiple data sources and integrated GPU access on the Kubernetes platform; the model is then deployed into production via an inference server, also running on the Kubernetes platform with integrated GPU access]
23. Example Data Science Delivery Model on OpenShift
Source: https://assets.openshift.com/hubfs/OpenShift-Commons-SF-Agile-Data-Science-ExxonMobil.pdf
25. Summary
● AI/ML benefits businesses: AI-powered intelligent applications help achieve key business goals, but execution challenges exist
● Containers, Kubernetes, & DevOps can help: agility, self-service, hybrid cloud portability, scalability, flexibility, automation
● Enterprise Kubernetes & container platforms make it real: they let you leverage the benefits of containers, Kubernetes, and DevOps, and accelerate delivery of AI-powered apps