3. SOCIAL RECOMMENDATIONS
ENHANCING EXISTING PRODUCTS
4. SOCIAL RECOMMENDATIONS
MACHINE TRANSLATIONS
ENHANCING EXISTING PRODUCTS
5. ACCESSIBILITY
SOCIAL RECOMMENDATIONS
MACHINE TRANSLATIONS
Diana and Hanna on green bicycles
Riding bikes in Santo Domingo
Biking near a red building
IMAGE CONTAINS
ENHANCING EXISTING PRODUCTS
6. BOTS & ASSISTANTS
GENERATED CONTENT
AR EFFECTS
VR HARDWARE
POWERING NEW EXPERIENCES
7. SHARE BAITING
SUICIDE PREVENTION
HELPING PROTECT COMMUNITY
8. AI IS ADVANCING QUICKLY
REINFORCEMENT LEARNING
COMPUTER VISION
NATURAL LANGUAGE PROCESSING
13. What is PyTorch?
• Ndarray library with GPU support (a NumPy alternative)
• Automatic differentiation engine
• Gradient-based optimization package (deep learning, reinforcement learning)
• Utilities (data loading, etc.)
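Concretely, those pieces fit together in a few lines. A minimal sketch (the values are illustrative):

```python
import torch

# Ndarray-style tensor; move it to a GPU with .cuda() if one is available
x = torch.tensor([3.0], requires_grad=True)

# The autograd engine records operations and differentiates through them
y = (x * x).sum()
y.backward()
print(x.grad)  # tensor([6.]), since d(x^2)/dx = 2x at x = 3

# The optimization package applies gradient-based update rules
opt = torch.optim.SGD([x], lr=0.1)
opt.step()     # x <- x - 0.1 * grad, i.e. 3.0 - 0.6 = 2.4
```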
29. Distributed training
SCALABILITY
Challenges:
• Scaling to hundreds of GPUs
• Heterogeneous clusters, Ethernet/InfiniBand
• Potentially unreliable nodes
In PyTorch 1.0:
• Fully revamped distributed backend - c10d
30. Distributed PyTorch
• MPI-style distributed communication
• Broadcast tensors to other nodes
• Reduce tensors among nodes
- for example: sum gradients across all nodes
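As a sketch of that API, here is a single-process illustration using the torch.distributed module with the gloo backend (the address, port, rank, and world size are placeholders; a real job launches one process per GPU and gets rank/world size from its launcher):

```python
import os

import torch
import torch.distributed as dist

# Placeholder rendezvous settings for a single-process demonstration
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group(backend="gloo", rank=0, world_size=1)

grad = torch.ones(4)
dist.broadcast(grad, src=0)                  # send a tensor from rank 0 to all nodes
dist.all_reduce(grad, op=dist.ReduceOp.SUM)  # e.g. sum gradients across all nodes

dist.destroy_process_group()
```

With more than one process, each rank would contribute its own `grad` and every rank would end up holding the sum.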
34. Work items in practice
• Writing dataset loaders
• Building models
• Implementing the training loop
• Checkpointing models
• Interfacing with environments
• Dealing with GPUs
• Building optimizers
• Building baselines
Python + PyTorch - an environment to do all of this
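Most of those work items map to a few lines each. A toy sketch (the dataset, sizes, and checkpoint path are illustrative):

```python
import os
import tempfile

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Dataset loader: a toy regression dataset
data = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
loader = DataLoader(data, batch_size=16, shuffle=True)

# Model and optimizer
model = nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

# Training loop
for x, y in loader:
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()

# Checkpointing
ckpt_path = os.path.join(tempfile.gettempdir(), "checkpoint.pt")
torch.save(model.state_dict(), ckpt_path)
```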
36. TIMELINE SINCE RELEASE
0.1.12: first public release
0.2.0: distributed, higher-order gradients, broadcasting, advanced indexing
0.3.0: framework overhead reduced by 10x, ONNX support
0.4.0: Windows support, Tensor <-> Variable merge, distributions
1.0.0: torch.jit
38. Code == Model == Data
1.0.0
torch.jit
def myfun(x):
    y = x * x
    z = y.tanh()
    return z
@torch.jit.script
def myfun(x):
    y = x * x
    z = y.tanh()
    return z
39. Code == Model == Data
1.0.0
torch.jit
def myfun(x):
    y = x * x
    z = y.tanh()
    return z
@torch.jit.script
def myfun(x):
    y = x * x
    z = y.tanh()
    return z
graph(%x : Dynamic) {
  %y : Dynamic = aten::mul(%x, %x)
  %z : Dynamic = aten::tanh(%y)
  return (%z)
}
40. Code == Model == Data
1.0.0
torch.jit
def myfun(x):
    y = x * x
    z = y.tanh()
    return z
@torch.jit.script
def myfun(x):
    y = x * x
    z = y.tanh()
    return z
graph(%x : Dynamic) {
  %y : Dynamic = aten::mul(%x, %x)
  %z : Dynamic = aten::tanh(%y)
  return (%z)
}
Export and run anywhere
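One way to see the "export" half: a scripted function carries its compiled graph and can be serialized and loaded back without the original Python source. A sketch (the file path is arbitrary):

```python
import os
import tempfile

import torch

@torch.jit.script
def myfun(x):
    y = x * x
    z = y.tanh()
    return z

print(myfun.graph)             # the aten::mul / aten::tanh graph shown above

path = os.path.join(tempfile.gettempdir(), "myfun.pt")
myfun.save(path)               # serialize the compiled code and graph
loaded = torch.jit.load(path)  # no Python source needed from here on
print(loaded(torch.tensor([0.5])))
```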
41. 1.0.0
torch.jit
def myfun(x):
    y = x * x
    z = y.tanh()
    return z
@torch.jit.script
def myfun(x):
    y = x * x
    z = y.tanh()
    return z
Eager execution
Execution with ahead-of-time analysis
46. 1.0.0
torch.jit
def myfun(x):
    y = x * x
    z = y.tanh()
    return z
@torch.jit.script
def myfun(x):
    return tanh_mul(x)
Eager execution
Execution with ahead-of-time analysis
saturate faster hardware
whole program optimizations
48. PyTorch
Eager Mode
Models are Python programs, autograd for derivatives
+ Simple
+ Debuggable — print and pdb
+ Hackable — use any Python library
– Needs Python to run
– Difficult to optimize and parallelize
Script Mode
Models are programs written in an optimizable subset of Python
+ Production deployment
+ No Python dependency
+ Optimizable
49. Tools to transition eager code into script mode
PYTORCH JIT
@torch.jit.script
torch.jit.trace
EAGER MODE: for prototyping, training, and experiments
SCRIPT MODE: for use at scale in production
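The two tools differ in how they capture the program; a small sketch using the slides' myfun:

```python
import torch

def myfun(x):
    y = x * x
    z = y.tanh()
    return z

# torch.jit.trace runs the function on an example input and records
# the operations it executes
traced = torch.jit.trace(myfun, torch.randn(3))

# torch.jit.script compiles the function from its Python source instead,
# preserving control flow such as if statements and loops
scripted = torch.jit.script(myfun)
```

Both produce a script-mode callable that computes the same result as the eager function.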
50. VISION
Build tools for transitioning from research to production in the same environment
VALUES
• Fully loaded: expose building blocks for performance and scaling
• Opt-in: trade off flexibility only when beneficial
• Same environment: continuously refactor your model
51. Deployment in C++
SCALABILITY & CROSS-PLATFORM
Often Python is not an option:
• High overhead on small models
• Multithreaded services bottleneck on the GIL
• The deployment service might be C++ only
In PyTorch 1.0:
• Convert the inference part of the model to Torch Script
• Link with libtorch.so in your C++ application
Torch Script + state_dict
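The Python side of that recipe might look like the sketch below (the model and path are illustrative); the C++ application then loads the saved archive with torch::jit::load after linking against libtorch.so:

```python
import os
import tempfile

import torch
from torch import nn

# A stand-in model; any nn.Module works the same way
model = nn.Sequential(nn.Linear(10, 5), nn.Tanh())
model.eval()

# Convert the inference part to Torch Script by tracing with an example input
example = torch.randn(1, 10)
traced = torch.jit.trace(model, example)

path = os.path.join(tempfile.gettempdir(), "model.pt")
traced.save(path)  # this archive is what the C++ service loads
```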
52. PyTorch
Models are Python programs,
autograd for derivatives
+ Simple
+ Debuggable — print and pdb
+ Hackable — use any Python library
AI makes existing products better – e.g.:
Social recommendations: use NLP to make suggestions from friends more useful
Machine translations
Accessibility: use CV to provide descriptions of photos for the visually impaired
AI drives new experiences, e.g.:
Bots & assistants
Generated content – automatic thumbnails for videos on Watch tab
AR & VR
AI also helps us keep our communities safe, at global scale, e.g.:
Helps proactively fight spam, engagement bait, and other inappropriate content
Systems also proactively identify when individuals may be expressing thoughts of self-harm and suicide, and escalate these cases
Field of AI advancing quickly - just years ago, I wouldn't have predicted what's possible today
Not yet at AGI, but we are seeing extraordinary value from RL, CV, NLP, etc.
AI is fundamental to FB – central to our business (e.g. used in News Feed ranking & ads), but also makes existing products better, drives new experiences, helps keep our community safe - all at global scale.
From customizing the News Feed for billions of individuals, to providing detailed descriptions of photos for the visually impaired, to identifying and escalating cases where people may be expressing thoughts of self-harm - tools driven by AI are having a positive impact on the lives of the billions of people who use FB today
imperativeness & debuggability
Scaling to production-level datasets with billions of images or hundreds of billions of rows often brings challenges. Scaling to hundreds of GPUs is not easy and requires high reliability. Production clusters also often have heterogeneous nodes or a mix of networking technologies.
That’s why in PyTorch 1.0 we’re releasing a fully revamped distributed backend.
[IF RELEVANT] APEX, a PyTorch extension with NVIDIA-maintained utilities, is available for streamlining mixed-precision and distributed training on NVIDIA GPUs.
The models are closely coupled with Python.
The Python interpreter needs to be present to run the model.
This isn't always convenient or possible, especially on mobile.
It can also hamper multi-threaded performance.
Furthermore, all the dynamic features of the Python language make it difficult to optimize the program, for instance by performing operator fusion or algebraic simplification.
To address these uses of PyTorch, we built another way to express PyTorch programs that allows them to run independently of Python.
PyTorch as you use it today will exist unchanged; we refer to this mode as eager mode.
In addition to eager mode, we have added a "script" mode.
In this mode, models are expressed in an optimizable subset of Python that we can run without the Python interpreter.
This subset contains all the building blocks necessary to build models:
Tensors and fundamental ops like matrix multiply and convolution
Traditional language features like if statements and loops.
But it restricts the dynamic behaviors in Python that make it difficult to optimize.
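For example, a script function can mix tensor ops with ordinary loops and conditionals, and they all compile into the graph (a small illustrative sketch; the function name is made up):

```python
import torch

@torch.jit.script
def count_positive(x):
    # Plain loops and if statements compile into the Torch Script graph
    count = 0
    for i in range(x.size(0)):
        if bool(x[i] > 0):
            count += 1
    return count

print(count_positive(torch.tensor([1.0, -2.0, 3.0])))  # 2
```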
Because we believe in the flexibility of eager mode for prototyping, we expect most users will first write programs in that form.
For the small subset of successful models that you actually want to put into production, the PyTorch JIT provides tools for taking code originally written in eager mode and annotating it so it can run in script mode.
The tools we provide allow this transition to happen gradually, function by function and module by module. This 'hybrid' use of the frontends lets you make changes incrementally and check that nothing has broken before continuing.
In our current preview release, Torch Script supports a small but usable subset of Python.
A few important features are not yet implemented but will be added for the stable release.
This includes doing in-place updates to tensors or lists, and using existing nn.Modules like Conv2d directly from script.
In the latter case, we suggest you trace modules like Conv2d instead.
We also don’t allow calling gradient functions from inside script; it is perfectly fine to return the loss from script and call backward from eager code, though.
We expect to have these features implemented for the coming 1.0 stable release, and a more detailed guide to what is possible is available in our documentation.
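That recommended pattern, returning the loss from script and calling backward in eager code, might look like this (the loss function and values are illustrative):

```python
import torch

@torch.jit.script
def mse_loss(pred, target):
    return ((pred - target) ** 2).mean()

w = torch.tensor([1.0, 2.0], requires_grad=True)
loss = mse_loss(w * 2.0, torch.zeros(2))  # loss computed inside script
loss.backward()                           # backward called from eager code
print(w.grad)                             # d/dw mean((2w)^2) = 4w
```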
As I mentioned, our vision for PyTorch 1.0 is to make it as easy as possible to take an idea that’s working in research and quickly scale it to a larger dataset or tricky deployment environment.
We’re doing this by exposing all of the high performance building blocks necessary for optimizing models within the same familiar programming environment you’re used to with PyTorch.
And we want to make sure it’s opt-in: you’re free to use regular PyTorch for experimentation and then follow the necessary tweaking and refactoring steps to leverage the performance improvements if you need them.
Python is a great environment for prototyping and often works just fine for large-scale training when the individual operations are beefy enough. But the overhead of the interpreter, and in particular the GIL, becomes a bottleneck on small models or when the target service needs to run multiple models on different threads, creating contention. Moreover, many environments, like robotics or many production clusters, just don’t permit running a Python interpreter at all.
So with PyTorch 1.0 and the JIT, we’re building tools to extract the inference part of the model out of the Python script and run it on a small embeddable PyTorch runtime within the target application; more on this in a minute.
You’re probably already familiar with how much people like PyTorch's eager execution model.
Your model is just native Python code.
Autograd lets you take a derivative of whatever code runs.
Because everything is just native Python, all your favorite debugging tools, including printing and pdb, just work.
If you want to use an obscure Python library in the middle of your model, there is absolutely no friction in doing so.
In practice, many fundamental new models start this way, with hacky prototype code that glues different Python libraries together.
This workflow has been great for prototyping, but for the small number of models that you do want to widely deploy, it has some issues.