Open Source AI - News and examples

Open Source @ IBM
Open source
AI/Machine Learning
2018 / © 2018 IBM Corporation
Luciano Resende
Data Science Platform Architect

About me - Luciano Resende
Data Science Platform Architect – IBM – CODAIT
• Have been contributing to open source at the ASF for over 10 years
• Currently contributing to: Jupyter Notebook ecosystem, Apache Bahir, Apache Toree, and Apache Spark, among other projects related to AI/ML platforms
lresende@apache.org
https://www.linkedin.com/in/lresende
@lresende1975
https://github.com/lresende

Agenda
• Open Source @ IBM
• Center for Open Source Data & AI Technologies (CODAIT)
• Model Asset eXchange (MAX)
• Fabric for Deep Learning (FfDL)
• Jupyter Enterprise Gateway
• Q&A

Open Source @ IBM
Open source participation and usage is simpler than ever
Learn
• Program touches 78,000 IBMers annually
Consume
• Virtually all IBM products contain some open source
• 40,363 packages per year
Contribute
• >62K OS certs per year
• ~10K IBM commits per month
Connect
• >1,000 active IBM contributors working in key OS projects

Open Source is essential to Developer Advocacy
IBM generated open source innovation
• 137 Code Open (dWO) projects w/1000+ GitHub projects
• 4 graduates to full open governance in the last year: Node-RED, OpenWhisk, SystemML, Blockchain Fabric
• developer.ibm.com/code/open/code/
Community
• IBM focused on 18 strategic communities
• Drive open governance in “Centers of Gravity”
• IBM Leaders drive key technologies and assure freedom
of action
The IBM OS Way is now open sourced
• Training, Recognition, Tooling
• Organization, Consuming, Contributing

Center for Open Source
Data and AI Technologies
(CODAIT)

IBM’s history of strong AI leadership
1968: “2001: A Space Odyssey”
• IBM was a technical advisor
• HAL is “the latest in machine intelligence”
1997: Deep Blue
• Deep Blue became the first machine to beat a world chess champion in tournament play
2011: Jeopardy!
• Watson beat two top Jeopardy! champions
2018: Open Tech, AI & emerging standards
• New IBM centers of gravity for AI
• OS projects increasing exponentially
• Emerging global standards in AI

CODAIT
codait.org
codait (French)
= coder/coded
https://m.interglot.com/fr/en/codait
CODAIT aims to make AI solutions
dramatically easier to create, deploy,
and manage in the enterprise
Relaunch of the Spark Technology
Center (STC) to reflect expanded
mission

CODAIT by the numb3rs
• The team contributes to over 10 open source projects, including Spark, TensorFlow, Keras, SystemML, Arrow, Bahir, Toree, Livy, Zeppelin, R4ML, Stocator, and Jupyter Enterprise Gateway
• 17 committers and many contributors in Apache projects: Spark, Arrow, SystemML, Bahir, Toree, Livy
• Over 980 JIRAs and 50,000 lines of code committed to Apache Spark itself, and over 65,000 lines of code into SystemML
• Established IBM as the number 1 contributor to Spark Machine Learning in the Spark 2.0 release
• Over 25 product lines within IBM leverage Apache Spark in some form or another. CODAIT engineers have interacted and interlocked with many of them.
• Speakers at over 100 conferences, meetups, un-conferences, etc.
[Chart: Spark code contribution growth by week]

Code – Build and improve practical frameworks to enable more developers to realize immediate value (e.g. FfDL, TensorFlow, Jupyter, Spark)
Content – Showcase solutions to complex and real world AI problems
Community – Bring developers and data scientists to engage with IBM (e.g. MAX)
Improving Enterprise AI lifecycle in Open Source
[Figure: the enterprise AI lifecycle (Gather Data, Analyze Data, Machine Learning, Deep Learning, Deploy Model, Maintain Model) annotated with open source projects: Python Data Science Stack, Pandas, Scikit-Learn, Apache Spark, Jupyter, Keras + TensorFlow, Fabric for Deep Learning (FfDL), MLeap + PFA, Model Asset eXchange]

Model Asset eXchange
Enabling domain experts to
use deep learning in the
enterprise

CODAIT: Enabling End-to-End AI in the Enterprise
[Figure: enterprise AI lifecycle, from Gather Data through Maintain Model, with the open source projects that cover each stage]

Making AI as Ubiquitous
as the Telephone

Q: What is deep learning?
A: Machine learning using
deep neural networks.
InceptionV3 Convolutional Neural Net
(A “medium-sized” deep learning model)
Image Source: https://github.com/tensorflow/models/blob/master/research/inception/g3doc/inception_v3_architecture.png

Characteristics of Deep Learning (1)
State-of-the-Art prediction
quality in many domains
– Image classification
– Machine translation
– Facial recognition
– Time series prediction
– Many more

Characteristics of Deep Learning (2)
Large, complex models
– Model size generally determined by “how big a
model can you fit on your device?”
Each box ≈ between
32 and 768 linear
regression models

Characteristics of Deep Learning (3)
Poorly understood today
…even by experts
– Why do the models converge?
– Why do the models converge with low loss?
– Why do the models generalize?

Focus of this Talk
Incorporating well-understood deep learning models into enterprise applications.

Sounds easy!

The Components of a Deep Learning Model
• Neural Network Graph: Input (3), Dense (3×8), Dense (8×6), Dense (6×4), Dense (4×2), Output (2)
• Weights (not to scale)
• Driver Program
[Figure: an image passes through the model and comes out labeled “cat”]
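
The three components can be illustrated with a minimal pure-Python sketch. It uses the layer shapes from the slide (3×8, 8×6, 6×4, 4×2); everything else is an assumption made for illustration: the weights are random stand-ins rather than trained values, the layers have no biases, ReLU is used as the activation, and the "dog" label is hypothetical.

```python
import random

# Layer shapes from the slide: 3 inputs -> 8 -> 6 -> 4 -> 2 outputs.
LAYER_SHAPES = [(3, 8), (8, 6), (6, 4), (4, 2)]


def make_weights(shapes, seed=0):
    """Weights: one (inputs x outputs) matrix per dense layer.
    Random values stand in for pre-trained weights (assumption)."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]
        for n_in, n_out in shapes
    ]


def dense(x, w):
    """One dense layer: weighted sum per output unit, then ReLU (assumed activation)."""
    n_out = len(w[0])
    return [max(0.0, sum(x[i] * w[i][j] for i in range(len(x)))) for j in range(n_out)]


def forward(x, weights):
    """Neural network graph: apply the dense layers in order."""
    for w in weights:
        x = dense(x, w)
    return x


def predict(x, weights, labels=("cat", "dog")):
    """Driver program: run the graph and map the highest score to a label."""
    scores = forward(x, weights)
    return labels[scores.index(max(scores))]


weights = make_weights(LAYER_SHAPES)
n_params = sum(n_in * n_out for n_in, n_out in LAYER_SHAPES)
print(n_params, predict([0.1, 0.2, 0.3], weights))
```

Even this toy graph carries 104 weights; the graph and the driver are a few lines of code, while the weights are the part that has to come from training.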

Example: Get an Image Classifier
Step 1: Find a suitable neural network graph.
– Need to read some papers
Step 2: Find code to generate the neural network graph
– TensorFlow code to build ResNet50 neural network graph
Step 3: Find some pre-trained weights for your graph
– Caffe2 ResNet50 model weights
Step 4: Find example code that performs model inference
– TensorFlow code for training and batch inference on ResNet50
Step 5: Write your own code to perform model inference on one image at a time
Step 6: Package your inference code, graph creation code, and pre-trained weights together
Step 7: Deploy your package

Model Marketplaces
Collections of well-understood deep learning models
Provide a central place to find known-good implementations of these models

IBM Model Asset eXchange
MAX is a one-stop open source ecosystem where data scientists and AI developers share and consume models built on machine learning engines such as TensorFlow, PyTorch, and Caffe2.
It also provides a standard approach to classify, annotate, and deploy these models for prediction and inferencing.
MAX
https://developer.ibm.com/code/exchanges/models/

Demo!
https://developer.ibm.com/code/exchanges/models/

Summary
Free, open-source models.
Wide variety of domains.
Multiple deep learning frameworks.
Vetted and tested code and IP.
Build and deploy a web service in 30 seconds.
Start training on Watson Studio in minutes.

MAX: Future Plans
Many more models
– Train with Watson Studio/DLaaS
– Run inference on IBM infrastructure
Revamped website
Integration with Watson Catalog
IBMer-uploaded models
More IBM Code code patterns showing usage
https://developer.ibm.com/code/exchanges/models/

But if you can’t wait
MAX Models at DockerHub
MAX models are exposed as Docker containers and published to DockerHub under the CODAIT organization.
https://hub.docker.com/u/codait/dashboard/

MAX and Container Services
K8 Deployment Descriptor
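apiVersion: v1
kind: Pod
metadata:
  name: image-caption-generator
  namespace: default
  labels:
    app: image-caption-generator
spec:
  restartPolicy: Always
  containers:
    # Container name is an assumption; the slide leaves it out.
    - name: image-caption-generator
      image: codait/max-image-caption-generator
---
apiVersion: v1
kind: Service
metadata:
  # Name taken from the "kubectl describe service" command used later in the deck.
  name: image-caption-generator
  labels:
    component: image-caption-generator
spec:
  ports:
    - name: http
      port: 5000
      targetPort: 5000
  # Selector is an assumption (left empty on the slide); it matches the Pod's app label.
  selector:
    app: image-caption-generator
  sessionAffinity: None
  type: NodePort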
MAX models require a Kubernetes deployment descriptor to enable easy deployment in IBM Cloud Container Service.
kubectl apply -f image-caption-generator.yaml

IBM Cloud Container Service
IBM Cloud CLI
IBM Cloud Container Service plug-in
Kubernetes CLI (kubectl)
Useful Commands
Pointing kubectl to IBM Cloud Container Service
export KUBECONFIG=/Users/lresende/.bluemix/plugins/container-service/clusters/lresende-kubernetes/kube-config-hou02-lresende-kubernetes.yml
Accessing Kubernetes dashboard via kubectl proxy
kubectl config view -o jsonpath='{.users[0].user.auth-provider.config.id-token}'
kubectl proxy
http://localhost:8001/ui
Deploying application
kubectl apply -f image-caption-generator.yml
Accessing application
bx cs workers lresende-kubernetes
kubectl describe service image-caption-generator
curl -F "image=@assets/surfing.jpg" -X POST http://184.172.242.55:32229/model/predict
curl -F "image=@/Users/lresende/Pictures/375337.jpg" -X POST
curl -F "image=@/Users/lresende/Pictures/362809.jpg" -X POST
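
For callers that prefer code over curl, the same request can be sketched with the Python standard library. This is a minimal sketch under stated assumptions: the `image` form field and the `/model/predict` path come from the curl commands above, while the helper names, the `http://localhost:5000` base URL, and the placeholder image bytes are hypothetical. The request is only built, not sent.

```python
import uuid
import urllib.request


def build_multipart(field, filename, payload, content_type="image/jpeg"):
    """Encode one file as a multipart/form-data body, like `curl -F field=@file`."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_predict_request(base_url, image_bytes, filename="image.jpg"):
    """Build (but do not send) the POST request for the /model/predict endpoint."""
    body, content_type = build_multipart("image", filename, image_bytes)
    return urllib.request.Request(
        base_url.rstrip("/") + "/model/predict",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


# Placeholder bytes stand in for a real JPEG read from disk (assumption).
request = build_predict_request("http://localhost:5000", b"\xff\xd8placeholder")
print(request.get_method(), request.full_url)
```

To actually call a deployed model, pass the request to `urllib.request.urlopen()` and read the response body; the response format is not shown on the slides.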

FfDL
Fabric for Deep Learning
FfDL provides a scalable,
resilient, and fault tolerant
deep-learning framework

FfDL provides a scalable, resilient, and fault-tolerant deep-learning framework
FfDL GitHub Page
https://github.com/IBM/FfDL
FfDL dwOpen Page
https://developer.ibm.com/code/open/projects/fabric-for-deep-learning-ffdl/
FfDL Announcement Blog
http://developer.ibm.com/code/2018/03/20/fabric-for-deep-learning
FfDL Technical Architecture Blog
http://developer.ibm.com/code/2018/03/20/democratize-ai-with-fabric-for-deep-learning
Deep Learning as a Service within Watson Studio
https://www.ibm.com/cloud/deep-learning
Research paper: “Scalable Multi-Framework Management of Deep Learning Training Jobs”
http://learningsys.org/nips17/assets/papers/paper_29.pdf
• Fabric for Deep Learning, or FfDL (pronounced “fiddle”), is an open source project that aims to make deep learning easily accessible to the people it matters to most, i.e. data scientists and AI developers.
• FfDL provides a consistent way to deploy, train, and visualize deep learning jobs across multiple frameworks such as TensorFlow, Caffe, PyTorch, and Keras.
• FfDL is being developed in close collaboration with IBM Research and IBM Watson. It forms the core of Watson’s Deep Learning service in open source.

FfDL is built using Microservices architecture
on Kubernetes
• FfDL platform uses a microservices architecture to offer
resilience, scalability, multi-tenancy, and security without
modifying the deep learning frameworks, and with no or minimal
changes to model code.
• FfDL control plane microservices are deployed as pods on
Kubernetes to manage this cluster of GPU- and CPU-enabled
machines effectively
• Tested platforms: Minikube, IBM Cloud Public, IBM Cloud Private, and GPUs using both the Kubernetes Accelerators feature gate and NVIDIA device plugins
Try FfDL/DLaaS
https://ibm.biz/BdZtab

Access to elastic compute leveraging Kubernetes
• Auto-allocation means infrastructure is used only when needed
• Training assets are managed and tracked.
[Figure labels: source code, training definition, Kubernetes container, training artifacts, compute cluster (NVIDIA Tesla K80, P100, V100), Cloud Object Storage]

Model training distributed across containers
[Figure labels: dataset, Cloud Object Storage, training runs, containers, Kubernetes container orchestration, server cluster, NVIDIA GPUs]

FfDL: Architecture

FfDL: Research Papers
https://arxiv.org/abs/1709.05871

Jupyter
Enterprise
Gateway
Provides multi-tenant,
scalable and secure remote
Jupyter Notebook kernels

Jupyter Notebooks
Notebooks are interactive computational environments, in which you can combine code execution, rich text, mathematics, plots and rich media.

Jupyter Notebooks
• Notebook UI runs in the browser
• The Notebook Server serves the “Notebooks”
• Kernels interpret/execute cell contents
– Are responsible for code execution
– Abstract different languages
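
The kernel's role can be illustrated with a toy sketch. It is not how Jupyter implements kernels: real kernels are separate processes that exchange messages with the notebook server, whereas the hypothetical `ToyKernel` class below runs in-process and only shows the core idea that a kernel executes cell source and keeps state between cells.

```python
import contextlib
import io


class ToyKernel:
    """Hypothetical in-process stand-in for a kernel: runs cells, keeps state."""

    def __init__(self):
        # Variables defined in one cell stay visible to later cells.
        self.namespace = {}

    def execute(self, cell_source):
        """Run one cell and return whatever it printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(cell_source, self.namespace)
        return buffer.getvalue()


kernel = ToyKernel()
kernel.execute("x = 2")                    # first cell defines a variable
output = kernel.execute("print(x * 21)")   # second cell reuses it
print(output, end="")
```

Because the notebook UI only sends source text and receives output, the same front end can work with kernels for different languages.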

Building a
Data Science
Analytical Platform

Building a Data Science Platform
Large pool of shared computing resources
• Enterprise Cloud, Public Cloud or Hybrid
• Data in the cloud (Data Lakes/Object Storage)
Distributed Consumers
• Notebooks running locally (user's laptop) or as a service (e.g. JupyterHub)
Different Resource Utilization Patterns
• High number of idle resources

Vanilla Jupyter Notebooks
[Chart: Maximum number of simultaneous kernels (4 GB heap) by cluster size (32 GB nodes). 4 nodes: 8; 8 nodes: 8; 12 nodes: 8; 16 nodes: 8]
Limitations of Jupyter Notebook Stack
• Security limitations
– A single user ID is shared, so everyone has the same privileges
– Users can see and control each other's processes using Jupyter administrative utilities
• Scalability limitations
– Jupyter kernels run as local processes
– Resources are limited by what is available on the single node that runs all kernels and associated Spark drivers

Jupyter Enterprise
Gateway
Jupyter Enterprise Gateway at IBM Code
https://developer.ibm.com/code/openprojects/jupyter-enterprise-gateway/
Jupyter Enterprise Gateway source code at GitHub
https://github.com/jupyter-incubator/enterprise_gateway
Jupyter Enterprise Gateway Documentation
http://jupyter-enterprise-gateway.readthedocs.io/en/latest/
A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across an Apache Spark or Kubernetes cluster for Enterprise/Cloud use cases
[Figure: logos of supported kernels and supported platforms, including Spectrum Conductor]

[Chart: Maximum number of simultaneous kernels (4 GB heap) by cluster size (32 GB nodes). 4 nodes: 16; 8 nodes: 32; 12 nodes: 48; 16 nodes: 64]
Optimized Resource Allocation
– Utilize resources on all cluster nodes by running kernels as Spark applications in YARN Cluster Mode.
– Pluggable architecture to enable support for additional Resource Managers
Enhanced Security
– End-to-End secure communications
• Secure socket communications
• Encrypted HTTP communication using SSL
Multiuser support with user impersonation
– Enhance security and sandboxing by enabling user impersonation when running kernels (using Kerberos).
– Individual HDFS home folder for each notebook user.
– Use the same user ID for notebook and batch jobs.
Jupyter Enterprise Gateway – YARN
• Multitenancy
• Remote kernel lifecycle management via process proxies
• Impersonation: Alice’s kernel runs under Alice’s user ID.
[Figure labels: Alice, Bob, nb2kg, Security Layer, Gateway Node, YARN Cluster, YARN Workers, YARN Container (Jupyter Kernel, Spark Driver), Spark Executors]

Enterprise Gateway & Kubernetes
Before Jupyter Enterprise Gateway …
• Scalability limitations
– Resources are limited, and the amount required for all kernels must be allocated when the Notebook Server pod is created.
– Resources are limited by what is available on the single node that runs all kernels and associated Spark drivers

Jupyter Enterprise Gateway - Kubernetes
• Container images defined in kernelspec
[Figure labels: nb2kg, Gateway, community image, vanilla kernels, Spark-based kernels (Spark on K8s), distributed file system]

Summary
• Model Asset eXchange
– Curated set of models ready to use or embed in your application or solution
• Fabric for Deep Learning
– Provides a consistent way for AI developers and data scientists to train their models
• Jupyter Enterprise Gateway
– Enables your Jupyter Notebook stack to scale so that you can build machine learning and AI models with more efficient use of resources
MAX
https://developer.ibm.com/code/exchanges/models/
