This document summarizes a talk on ChainerX, a NumPy-like ndarray library with autograd support for deep learning in Chainer. ChainerX is implemented in C++ for speed and to allow deployment without a Python runtime; it provides a Python binding on top and supports pluggable backends. The talk explains ChainerX internals such as the Array, ArrayNode, and OpNode types, shows how ChainerX integrates with Chainer, and encourages contributions on GitHub to expand the supported operations and backends.
3. • Speed
  • Fast trial-and-error
  • Fast training and inference
• Environment Support
  • Quick adoption of new hardware/environments
• Quick Deployment
  • Quick application of research outcomes
Chainer
4. • Speed
  • Fast trial-and-error
  • Fast training and inference
• Environment Support
  • Quick adoption of new hardware/environments
• Quick Deployment
  • Quick application of research outcomes
Chainer
ChainerX
5. This talk is about ChainerX and...
• how it makes Chainer a modern deep learning framework
• how it started and where it is heading
• how to contribute to it
6. Hopefully, after this talk you...
• understand ChainerX and some of its internals
• are ready to try ChainerX
• are curious to modify it to your needs
7. What is ChainerX?
A NumPy-like ndarray library with autograd,
built from scratch, drawing on the experience gained from Chainer
8. How it started
• A subproject of Chainer, started in late 2017
• Developed by both internal and external Chainer developers
• Merged into master as of v6.0.0b1 and will be included in v6
https://github.com/chainer/chainer/tree/master/chainerx
https://github.com/chainer/chainer/tree/master/chainerx_cc
@beam2d @niboshi @asi1024 @hvy @sonots @takagi
9. import chainerx as chx
# Array creation, chx.ndarray, similar to NumPy
x = chx.ones((2, 3), dtype=chx.float32, device='native')
# Flag to record computational graph
x.require_grad()
# Define-by-run/eager forward pass, again similar to NumPy
y = chx.exp(x + 1).sum()
# Backpropagation
chx.backward(y)
# Computed gradient is also a chx.ndarray
gx = x.grad
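A quick sanity check (a sketch, not from the slides): the gradient of sum(exp(x + 1)) with respect to x is exp(x + 1) elementwise, which can be verified against NumPy via chainerx.to_numpy:
import numpy as np
# d/dx sum(exp(x + 1)) = exp(x + 1); here x is all ones
expected = np.exp(np.ones((2, 3), dtype=np.float32) + 1)
np.testing.assert_allclose(chx.to_numpy(gx), expected, rtol=1e-5)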
12. • Written in C++
  • Speed
  • No Python runtime required for deployment
• Python binding on top
  • Lightweight
  • 1-to-1 C++ mappings
• Pluggable backends
  • Extensible to new hardware/environments
[Architecture diagram: a Python binding on top of the autograd layer and the backpropable ndarray, which sit on a Backend/Device interface with Native, CUDA, and custom Backend/Device implementations underneath.]
13. C++ API
#include "chainerx.h"
namespace chx = chainerx;
chx::Array x = chx::Ones(
    {2, 3}, chx::Dtype::kFloat32,
    chx::GetDevice("native"));
x.RequireGrad();
chx::Array y = chx::Exp(x + 1).Sum();
chx::Backward(y);
chx::Array gx = *x.GetGrad();

Python API
import chainerx as chx
x = chx.ones(
    (2, 3), dtype=chx.float32,
    device='native')
x.require_grad()
y = chx.exp(x + 1).sum()
chx.backward(y)
gx = x.grad
15. // Create input ndarrays
chx::Array x = ...
chx::Array w = ...
chx::Array b = ...
// Flag to record computational graph
x.RequireGrad();
w.RequireGrad();
b.RequireGrad();
// Call a routine to create a graph.
// Internally uses chx::BackwardBuilder to do so.
chx::Array y = chx::Conv(x, w, b, {1, 1}, {1, 1});
(chainerx namespace omitted for clarity)
[Graph diagram: Arrays x, w, and b each hold an ArrayBody with an ArrayNode; the ArrayNodes are connected through an OpNode, Conv, to the ArrayNode of the output Array y and its ArrayBody.]
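The same define-by-run recording from Python (a minimal sketch, not from the slides; dot stands in for Conv to keep it short):
import chainerx as chx
# Requiring grad attaches an ArrayNode to each array body;
# subsequent operations are then recorded as OpNodes.
x = chx.ones((2, 3), dtype=chx.float32)
w = chx.ones((3, 4), dtype=chx.float32)
x.require_grad()
w.require_grad()
y = chx.dot(x, w).sum()  # forward pass builds the graph on the fly
chx.backward(y)          # traverses OpNodes/ArrayNodes in reverse
print(x.grad.shape, w.grad.shape)  # (2, 3) (3, 4)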
16. chainerx::Array (chainerx::ArrayBody)
• Core data type in ChainerX, an ndarray with autograd
• Has ndarray properties such as
  • pointer to allocated data, shape, dtype, strides
• Associated with a single device
  • Data resides on e.g. "native" or "cuda:2"
• Holds references to its
  • gradients, also chainerx::Arrays
  • nodes in the computational graphs
[Diagram: Array x holds an ArrayBody containing the device, the data pointer, an ArrayNode, and the gradient Array gx, itself an ArrayBody.]
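From Python, these properties are exposed directly on chainerx.ndarray (a quick sketch using the public Python API):
import chainerx as chx
x = chx.ones((2, 3), dtype=chx.float32, device='native')
print(x.shape, x.dtype, x.strides)  # ndarray properties
print(x.device)                     # the single associated device
x.require_grad()
chx.backward((x * 2).sum())
print(type(x.grad))                 # the gradient is also a chainerx.ndarray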
17. chainerx::ArrayNode
• A node representing an array in the computational graph
• Owned by chainerx::ArrayBody
[Same graph diagram as slide 15, highlighting the ArrayNodes of x, w, b and y.]
18. chainerx::OpNode
• A node representing an operation in the computational graph
• Referenced by chainerx::ArrayNode
[Same graph diagram as slide 15, highlighting the OpNode, Conv.]
19. chainerx::Device (1/2)
• An array is constructed by specifying the allocating device
chainerx::Device& gpu = chainerx::GetDevice("cuda:0");
chainerx::Array x =
    chainerx::Ones({2, 3}, chainerx::Dtype::kFloat32, gpu);
• A device defines
  • how memory is allocated and freed
    • chainerx::Device::Allocate
  • operations on data
    • chainerx::Device::{Fill,Arange,Add,Subtract,Multiply,Divide,Sum,Dot,...}
20. chainerx::Device (2/2)
• chainerx::Device is an interface
• Concrete implementations provided by ChainerX
  • chainerx::native::NativeDevice
  • chainerx::cuda::CudaDevice
• Can be implemented for other devices and dynamically loaded as shared libraries
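From Python, devices are addressed by name and arrays can be transferred between them (a sketch; the last line assumes a CUDA-enabled build):
import chainerx as chx
device = chx.get_device('native:0')  # backend name plus device index
x = chx.ones((2, 3), dtype=chx.float32, device=device)
print(x.device)  # native:0
# With a CUDA build, data can be moved across devices explicitly:
# y = x.to_device('cuda:0')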
24. Architecture
• Various APIs in Chainer v6 work with and utilize ChainerX
• Variable and FunctionNode delegate autograd computations to ChainerX
[Architecture diagram: Chainer's training/model APIs and Variable/functions APIs run either on NumPy/CuPy with Chainer's own autograd, or on the ChainerX stack: Python binding, autograd, backpropable ndarray, and the Backend/Device interface with Native, CUDA and custom implementations.]
25. Chainer
import chainer as ch
import cupy as cp
class ResNet50(ch.Chain):
    …
model = ResNet50()
model.to_device(0)
arr = cp.array(...)
x = ch.Variable(arr)
y = model(x)
loss = …
loss.backward()
[Architecture diagram: this path runs Chainer's training/model and Variable/functions APIs on NumPy/CuPy with Chainer's Python autograd.]
26. Chainer on ChainerX
import chainer as ch
import chainerx as chx
class ResNet50(ch.Chain):
    …
model = ResNet50()
model.to_device('cuda:0')
arr = chx.array(...)
x = ch.Variable(arr)
y = model(x)
loss = …
loss.backward()
[Architecture diagram: the same Chainer APIs now run on the ChainerX stack, with autograd delegated to ChainerX.]
27. How to take part in developing ChainerX
Contribution guide explained
28. It’s all documented
• A section in the Chainer documentation
https://docs.chainer.org/en/latest/chainerx/index.html
• On GitHub
  • Look for issues/PRs labeled ChainerX and contribution-welcome
• ChainerX needs to support more routines
  • A list of unimplemented routines
https://github.com/chainer/chainer/issues/6423
30. Future roadmap
• Integrate into Chainer
• Wider range of supported routines
• Dynamic device operation registration
• Concrete third party backends
• Stable C++ interface
• Wider coverage of “compiled models”
31. Summary
ChainerX is implemented in C++ with far less host-side overhead.
It can be deployed without a Python runtime, and it allows third
parties to implement backends and devices for new hardware and
environments.
Taking Chainer to the next level
by being accessible from Python and used by Chainer
32. ...and you can take part in ChainerX development on GitHub
Contributions, ideas and discussions are welcome
We are hiring
• Follow @ChainerOfficial on Twitter
• Join chainer on Slack
• Apply at https://www.preferred-networks.jp/en/jobs