Future of AI & Fabric for Deep Learning (FfDL)
1. Future of AI & FfDL
Jim Spohrer (IBM) and Animesh Singh (IBM)
http://slideshare.net/spohrer/intel_20180608_v2
June 8, 2018 - Intel Skype Presentation
Hosts: John Miranda and Michael Jacobson
6/8/2018 IBM #OpenTechAI 1
2. IBM Contacts
Jim Spohrer <spohrer@us.ibm.com>
IBM Research – Almaden
San Jose, CA
Animesh Singh <singhan@us.ibm.com>
IBM Silicon Valley Lab
San Jose, CA
Vijay Bommireddipalli
<vijayrb@us.ibm.com>
Center for Open-Source Data & AI Technologies (CODAIT), San Francisco, CA
4. Future of AI
6/8/2018 (c) IBM 2017, Cognitive Opentech Group 4
… when will your smartphone be able to take and pass any online course? And then be your coach, so you can pass too?
7. Every 20 years, compute costs drop by 1000x
• Cost of Digital Workers
• Moore’s Law can be thought of as lowering costs by a factor of a…
  • Thousand times lower in 20 years
  • Million times lower in 40 years
  • Billion times lower in 60 years
• Smarter Tools (Terascale)
  • Terascale (2017) = $3K
  • Terascale (2020) = ~$1K
• Narrow Worker (Petascale)
  • Recognition (Fast)
  • Petascale (2040) = ~$1K
• Broad Worker (Exascale)
  • Reasoning (Slow)
  • Exascale (2060) = ~$1K
[Figure: compute cost per system (log scale, $1 to $1T) vs. year (1960-2080, +/- 10 years), showing mainframe, supercomputer, and smartphone costs falling past the terascale (T), petascale (P), and exascale (E) thresholds toward an average person's annual salary (living income).]
AI Progress on Open Leaderboards
Benchmark Roadmap to solve AI/IA
8. GDP/Employee
Lower compute costs translate into increasing productivity and GDP per employee for nations.
Increasing productivity and GDP per employee should translate into wealthier citizens.
9. Leaderboards Framework
AI Progress on Open Leaderboards - Benchmark Roadmap
Stages: Perceive World | Develop Cognition | Build Relationships | Fill Roles
Capabilities: Pattern recognition | Video understanding | Memory | Reasoning | Social interactions | Fluent conversation | Assistant & Collaborator | Coach & Mediator
Row 1 aspects: Speech | Actions | Declarative | Deduction | Scripts | Speech Acts | Tasks | Institutions
Row 1 benchmarks: Chime | Thumos | SQuAD | SAT | ROC Story | ConvAI
Row 2 aspects: Images | Context | Episodic | Induction | Plans | Intentions | Summarization | Values
Row 2 benchmarks: ImageNet | VQA | DSTC | RALI | General-AI
Row 3 aspects: Translation | Narration | Dynamic | Abductive | Goals | Cultures | Debate | Negotiation
Row 3 benchmarks: WMT | DeepVideo | Alexa Prize | ICCMA | AT
Learning regimes: Learning from Labeled Training Data and Searching (Optimization); Learning by Watching and Reading (Education); Learning by Doing and Being Responsible (Exploration)
Approximate year of human-level performance (x-axis): 2015, 2018, 2021, 2024, 2027, 2030, 2033, 2036
Which experts would be really surprised if it takes less time… and which experts would be really surprised if it takes longer?
15. “The best way to predict the future is to inspire the next generation of students to build it better”
Digital Natives | Transportation | Water | Manufacturing | Energy | Construction | ICT | Retail | Finance | Healthcare | Education | Government
17. Step | Comment
GitHub | Get an account and read the guide
Learn the 3 R's - Read, Redo, Report | Read (Medium/arXiv), Redo (GitHub), Report (Jupyter Notebook)
Kaggle | Compete in a Kaggle competition
Leaderboards | Compete to advance AI progress
Design New Challenges | Build an AI system that can take and pass any online course, then switch to tutor-mode and help you pass
Open Source Guide | Establish open source culture in your organization
18. Fabric for Deep Learning (FfDL)
FfDL GitHub Page: https://github.com/IBM/FfDL
FfDL dwOpen Page: https://developer.ibm.com/code/open/projects/fabric-for-deep-learning-ffdl/
FfDL Announcement Blog: http://developer.ibm.com/code/2018/03/20/fabric-for-deep-learning
FfDL Technical Architecture Blog: http://developer.ibm.com/code/2018/03/20/democratize-ai-with-fabric-for-deep-learning
Deep Learning as a Service within Watson Studio: https://www.ibm.com/cloud/deep-learning
Research paper, “Scalable Multi-Framework Management of Deep Learning Training Jobs”: http://learningsys.org/nips17/assets/papers/paper_29.pdf
19. The Enterprise AI Process
Use data… to build models… that automate decisions.
Gather Data → Analyze Data → Machine Learning → Deep Learning → Deploy Model → Maintain Model
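The six stages above can be sketched as a minimal pipeline. The stage names are from the slide; the toy function bodies (a "model" that is just a mean, a threshold predictor) are placeholder assumptions, not FfDL APIs:

```python
# Minimal sketch of the Enterprise AI Process stages named on the slide.
# The implementations are deliberately trivial placeholders.
def gather_data():        return [1.0, 2.0, 3.0, 4.0]          # collect raw data
def analyze_data(data):   return [x for x in data if x > 0]    # clean / filter
def train_model(data):    return sum(data) / len(data)         # "model" = mean
def deploy_model(model):  return lambda x: x > model           # serve a predictor

def maintain_model(predict, new_data):
    # Monitor the deployed predictor on fresh data; a real system would
    # use this signal to decide when to retrain.
    return sum(predict(x) for x in new_data) / len(new_data)

data = analyze_data(gather_data())
model = train_model(data)              # machine/deep learning step
predict = deploy_model(model)
drift = maintain_model(predict, [0.5, 3.5, 5.0])
print(model, drift)
```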
21. Fabric for Deep Learning
https://github.com/IBM/FfDL
FfDL provides a scalable, resilient, and fault-tolerant deep-learning framework.
• Fabric for Deep Learning, or FfDL (pronounced “fiddle”), is an open source project that aims to make deep learning easily accessible to the people to whom it matters most: data scientists and AI developers.
• FfDL provides a consistent way to deploy, train, and visualize deep-learning jobs across multiple frameworks such as TensorFlow, Caffe, PyTorch, and Keras.
• FfDL is being developed in close collaboration with IBM Research and IBM Watson. It forms the core of Watson's Deep Learning service in open source.
22. Fabric for Deep Learning
https://github.com/IBM/FfDL
FfDL is built using a microservices architecture on Kubernetes.
• The FfDL platform uses a microservices architecture to offer resilience, scalability, multi-tenancy, and security without modifying the deep-learning frameworks, and with no or minimal changes to model code.
• FfDL control-plane microservices are deployed as pods on Kubernetes to manage the cluster of GPU- and CPU-enabled machines effectively.
• Tested platforms: Minikube, IBM Cloud Public, IBM Cloud Private; GPUs using both the Kubernetes Accelerators feature gate and NVIDIA device plugins.
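To illustrate the "microservices as pods on Kubernetes" point, here is what a minimal Deployment manifest for one hypothetical control-plane service could look like, built as a Python dict. The service name, image, and port are invented for illustration; they are not FfDL's actual manifests (those live in the FfDL repo's Helm charts):

```python
import json

# Hypothetical Kubernetes Deployment for one control-plane microservice.
# Name, image, and port are invented for illustration only.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "ffdl-trainer", "labels": {"app": "ffdl-trainer"}},
    "spec": {
        "replicas": 2,  # scale the stateless control-plane service
        "selector": {"matchLabels": {"app": "ffdl-trainer"}},
        "template": {
            "metadata": {"labels": {"app": "ffdl-trainer"}},
            "spec": {
                "containers": [{
                    "name": "trainer",
                    "image": "example/ffdl-trainer:latest",
                    "ports": [{"containerPort": 8080}],
                }]
            },
        },
    },
}

print(json.dumps(deployment, indent=2))
```

Deploying each service this way is what lets Kubernetes restart crashed pods and scale replicas independently, which is where the resilience and scalability claims come from.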
28. And we offer more
Model Asset eXchange (MAX) and Adversarial Robustness Toolbox (ART)
29. IBM Model Asset eXchange (MAX)
MAX is a one-stop exchange to find ML/DL models created using popular machine-learning engines, and it provides a standardized approach to consume these models for training and inferencing.
developer.ibm.com/code/exchanges/models/
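Models deployed from MAX are typically packaged as small REST services. The sketch below builds (but does not send) a prediction request with the standard library; the localhost host/port, the `/model/predict` path, and the JSON payload shape are assumptions based on MAX's usual packaging and will vary by model:

```python
import json
import urllib.request

# Build (but do not send) a prediction request against a locally deployed
# MAX model. Host, port, path, and payload shape are assumptions.
payload = json.dumps({"text": ["Hello, world"]}).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:5000/model/predict",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)

# To actually call a running model:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```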
30. IBM Adversarial Robustness Toolbox (ART)
ART is a library dedicated to adversarial machine learning. Its purpose is to allow rapid crafting and analysis of attacks and defense methods for machine-learning models. The Adversarial Robustness Toolbox provides implementations of many state-of-the-art methods for attacking and defending classifiers.
https://developer.ibm.com/code/open/projects/adversarial-robustness-toolbox/
The Adversarial Robustness Toolbox contains implementations of the following attacks:
• Deep Fool (Moosavi-Dezfooli et al., 2015)
• Fast Gradient Method (Goodfellow et al., 2014)
• Jacobian Saliency Map (Papernot et al., 2016)
• Universal Perturbation (Moosavi-Dezfooli et al., 2016)
• Virtual Adversarial Method (Miyato et al., 2015)
• C&W Attack (Carlini and Wagner, 2016)
• NewtonFool (Jang et al., 2017)
The following defense methods are also supported:
• Feature squeezing (Xu et al., 2017)
• Spatial smoothing (Xu et al., 2017)
• Label smoothing (Warde-Farley and Goodfellow, 2016)
• Adversarial training (Szegedy et al., 2013)
• Virtual adversarial training (Miyato et al., 2017)
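To make the Fast Gradient Method concrete, here is a minimal pure-Python sketch of the idea on a toy one-dimensional logistic classifier. The model, weights, and epsilon are invented for illustration; ART's actual implementation wraps real framework models rather than this hand-rolled gradient:

```python
import math

# Toy logistic classifier: p(y=1|x) = sigmoid(w*x + b)
w, b = 2.0, -1.0
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))

def fgsm(x, y, eps):
    """Fast Gradient (Sign) Method for a 1-D logistic model.

    The gradient of the cross-entropy loss w.r.t. the input x is
    (p - y) * w; the attack steps by eps in the sign of that gradient.
    """
    p = sigmoid(w * x + b)
    grad_x = (p - y) * w
    return x + eps * math.copysign(1.0, grad_x)

x, y = 1.0, 1                      # correctly classified: sigmoid(1.0) > 0.5
x_adv = fgsm(x, y, eps=0.6)        # small perturbation flips the prediction
print(sigmoid(w * x + b) > 0.5, sigmoid(w * x_adv + b) > 0.5)
```

Defenses like adversarial training work by generating such perturbed examples during training and including them, correctly labeled, in the training set.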
32. Model Lifecycle Management
Watson Studio: tools for supporting the end-to-end AI workflow
Machine Learning Runtimes / Deep Learning Runtimes
• Most popular open source frameworks
• IBM best-in-class frameworks
Authoring Tools
• Create, collaborate, deploy, and monitor
• Best-of-breed open source & IBM tools
• Code (R, Python, or Scala) and no-code/visual modeling tools
Cloud Infrastructure as a Service
• Fully managed service
• Container-based resource management
• Elastic pay-as-you-go CPU/GPU power
33. Deep Learning as a Service within Watson Studio, using FfDL as its core
• Train neural networks in parallel across NVIDIA GPUs.
• Pay only for what you use: auto-deallocation means no more remembering to shut down your cloud training instances.
• Monitor batch training experiments, then compare cross-model performance without worrying about log transfers and scripts to visualize results. You focus on designing your neural networks; we’ll manage and track your assets.
• Python client, command line interface (CLI), or UI? You choose the tooling that best fits your existing workflows.
• Training history and assets are tracked, then automatically transferred to the customer’s Object Storage for quick access.
• Deploy models into production, then monitor them to evaluate performance. Capture new data for continuous learning and retrain models so they continually adapt to changing conditions.
34. Neural Network Modeller within Watson Studio
An intuitive drag-and-drop, no-code interface for designing neural network structure
1950: Nathaniel Rochester (IBM) designed the IBM 701, the first commercial computer that routinely did super-human levels of numeric calculation. He had worked at MIT on the arithmetic unit of the Whirlwind I programmable computer.
Dota 2 is the most recent example: on August 11, 2017, a bot played at super-human level in Valve's Dota 2 competition – Elon Musk's OpenAI result.
Miles Brundage tracks gaming progress: http://www.milesbrundage.com/blog-posts/my-ai-forecasts-past-present-and-future-main-post
DOTA2: https://blog.openai.com/more-on-dota-2/
What is beyond exascale? Zetta (10^21), Yotta (10^24)
Time dimension (x-axis) is plus or minus 10 years….
Daniel Pakkala (VTT)
URL: https://aiimpacts.org/preliminary-prices-for-human-level-hardware/
Dan Gruhl:
https://www.washingtonpost.com/archive/business/1983/11/06/in-pursuit-of-the-10-gigaflop-machine/012c995a-2b16-470b-96df-d823c245306e/?utm_term=.d4bde5652826
In 1983, 10 GF cost ~$10 million.
That's $24.55 million in today's dollars,
or ~$2.4 billion for 1 TF in 1983.
Today 1 TF is about $3K: http://www.popsci.com/intel-teraflop-chip
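The arithmetic in the note above can be checked directly. The input figures ($10M for 10 GF in 1983, the $24.55M inflation adjustment, ~$3K per TF today) are from the note; the final cost-reduction ratio is derived here, not stated in the deck:

```python
# Check the cost-per-teraflop arithmetic from the note above.
gf_cost_1983 = 10_000_000 / 10          # $ per GF in 1983 dollars ($10M for 10 GF)
inflation = 24.55 / 10                  # $10M in 1983 ≈ $24.55M today
tf_cost_1983_today = gf_cost_1983 * 1000 * inflation  # $ per TF, today's dollars
tf_cost_2017 = 3000.0                   # ~$3K per TF today (per the note)

print(round(tf_cost_1983_today / 1e9, 3))        # billions of $ per TF in 1983
print(round(tf_cost_1983_today / tf_cost_2017))  # cost reduction factor since 1983
```

This confirms the note's ~$2.4 billion per TF figure and implies roughly an 800,000x cost reduction in ~34 years, broadly consistent with the 1000x-per-20-years rule of thumb from slide 7.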
The weakest link is what needs to be improved – according to system scientists. Accessing help, service, experts is the weakest link in most systems.
By 2035 the phone may have the power of one human brain – by 2055 the phone may have the power of all human brains.
Before trying to answer the question about which types of sciences are more important – the ones that try to explain the external world or the ones that try to explain the internal world – consider this slide, which shows the different telephones I have used in my life. I grew up in rural Maine, where we had a party-line telephone because we were somewhat remote on our farm in Newburgh, Maine.
However, over the years phones got much better…. So in 2035 or 2055, who are you going to call when you need help?
Where is the variety? Hardware and even software are standardizing into modules and algorithms…. Data will standardize next into categories and types…. Experience is where the uniqueness is, and variety and variability, and identity.
By 2036, there will be an accumulation of knowledge as well as a distribution of knowledge in service systems globally. As knowledge accumulates, we need to ensure that service systems at all scales become more resilient, leading to the capability of rapidly rebuilding service systems across scales by T-shaped people who understand how to rebuild rapidly: knowledge has been chunked, modularized, and put into networks that support rapid rebuilding.
Source: Vijay Bommireddipalli (CODAIT Director) and Fred Reiss (CODAIT Chief Architect)
URL Amazon: https://www.amazon.com/Knowledge-Rebuild-Civilization-Aftermath-Cataclysm-ebook/dp/B00DMCV5YS/
URL TED Talk: https://www.youtube.com/watch?v=CdTzsbqQyhY
Citation: Dartnell L (2012) The Knowledge: How to Rebuild Civilization in the Aftermath of a Cataclysm. Westminster London: Penguin Books.
Jim Spohrer Blogs:
Grand Challenge: http://service-science.info/archives/2189
Re-readings: http://service-science.info/archives/4416