Urs Köster and Yinyin Liu present at ODSC West. Deep learning has had a major impact in the last three years. Interactions with machines that used to be unreliable, such as speech, natural language, and image processing, have been made robust by deep learning, and deep learning holds promise for finding usable structure in large datasets. The training process is lengthy and has proven difficult to scale because of the constraints of existing compute architectures, and there is a need for standardized tools for building and scaling deep learning solutions. Urs will outline some of these challenges and show how fundamental changes to the organization of computation and communication can lead to large advances in capabilities. He will dive deep into the field of deep learning, focusing on convolutional and recurrent neural networks. The talk will be followed by a workshop highlighting neon™, an open-source, Python-based deep learning framework that has been built from the ground up for speed and ease of use. This session is targeted at data scientists and researchers interested in taking deep learning to the next level of speed and scalability. The tutorial covers how to use neon™ to build and train recurrent neural networks to generate text and convolutional networks to perform image classification.
1. Proprietary and confidential. Do not distribute.
Diving deep into Deep Learning:
Convolutional and Recurrent Neural Networks
Urs Köster, PhD, Yinyin Liu, PhD
MAKING MACHINES SMARTER.™
now part of
Nervana Systems Proprietary
• What is Deep Learning and What Can It Do Today? (2:00 pm – 2:25 pm)
• Convolution Neural Networks (2:25 pm – 2:50 pm)
• BREAK: 2:50 – 3:00
• Using Convolution Neural Networks to identify Whales (3:00 pm – 3:25 pm)
• Segnet for Object Segmentation (3:25 pm – 3:50 pm)
• BREAK: 3:50 – 4:00
• Recurrent Neural Networks (4:00 pm – 4:25 pm)
• Using Recurrent Neural Networks for Whale Call Classification (4:25 pm – 4:50 pm)
• BREAK: 4:50 – 5:00
• Neural Machine Translation (5:00 pm – 5:25 pm)
• Finding the Right Deep Learning Framework For You (5:25 pm – 5:50 pm)
download neon!
https://github.com/NervanaSystems/neon
git clone git@github.com:NervanaSystems/neon.git
Nervana’s deep learning tutorials:
https://www.nervanasys.com/deep-learning-tutorials/
We are hiring!
https://www.nervanasys.com/careers/
Back-propagation
End-to-end
Resnet
ImageNet
Word2Vec
Regularization
Convolution
Unrolling
RNN
Generalization
hyperparameters
Video recognition
dropout
Pooling
LSTM
AlexNet
Speech recognition
https://www.nervanasys.com/industry-focus-serving-the-automotive-industry-with-the-nervana-platform/
http://www.nervanasys.com/deep-reinforcement-learning-with-neon/
https://youtu.be/KkIf0Ok5GCE
End-to-end learning: raw image input → network with ~60 million parameters → output (positive/negative)
(Zeiler and Fergus, 2013)
Historical perspective:
• Input → designed features → output
• Input → designed features → SVM → output
• Input → learned features → SVM → output
• Input → levels of learned features → output
A method for extracting features at multiple levels of abstraction
• Features are discovered from data
• Performance improves with more data
• Network can express complex transformations
• High degree of representational power
No free lunch:
• lots of data
• flexible models
• powerful priors
Chart: ImageNet top-5 error rate by year, 2010 to 2015, compared with human performance (axis from 0% to 30%). Source: ImageNet
• No free lunch
• lots of data
• flexible and fast frameworks
• powerful computing resources
• Healthcare: Tumor detection
• Automotive: Speech interfaces
• Finance: Time-series search engine
• Agricultural Robotics
• Oil & Gas
• Proteomics: Sequence analysis
input      filter     output
0 1 2      0 1        19 25
3 4 5      2 3        37 43
6 7 8
(0, 1, 3, 4) · (0, 1, 2, 3) = 19
• Each element in the output is the result of a dot product between two vectors
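The dot-product view of convolution can be checked with a few lines of NumPy. This is a minimal sketch, not neon's implementation: the helper name `conv2d_valid` is made up for illustration, and it assumes stride 1, no padding, and no filter flip (the convention deep learning frameworks usually call convolution).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and take dot products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for r in range(out_h):
        for c in range(out_w):
            window = image[r:r + kh, c:c + kw]
            # One output element = dot product of the flattened window and filter.
            out[r, c] = np.dot(window.ravel(), kernel.ravel())
    return out

image = np.arange(9).reshape(3, 3)    # the 3x3 input 0..8 from the slide
kernel = np.arange(4).reshape(2, 2)   # the 2x2 filter 0..3 from the slide
result = conv2d_valid(image, kernel)
print(result)  # [[19 25] [37 43]]
```

The top-left output is (0, 1, 3, 4) · (0, 1, 2, 3) = 19, matching the slide.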
B0 B1 B2
B3 B4 B5
B6 B7 B8
G0 G1 G2
G3 G4 G5
G6 G7 G8
R0 R1 R2
R3 R4 R5
R6 R7 R8
Figure: the same convolution laid out element by element. The input values 0 to 8 and the filter weights 0 to 3, repeated once per window position, produce the outputs 19, 25, 37, 43.
Detected the pattern!
input      output
0 1 2      4 5
3 4 5      7 8
6 7 8
Max(0, 1, 3, 4) = 4
• Each element in the output is the maximum value within the pooling window
• Precise location becomes less relevant
• The layer becomes tolerant to local perturbations in the input, which builds in invariance
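Max pooling on the same 3x3 input can be sketched the same way. The helper `max_pool2d` is hypothetical, and the 2x2 window with stride 1 is an assumption inferred from the 3x3 input and 2x2 output on the slide.

```python
import numpy as np

def max_pool2d(image, size=2, stride=1):
    """Return the maximum of every size x size window of `image`."""
    out_h = (image.shape[0] - size) // stride + 1
    out_w = (image.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for r in range(out_h):
        for c in range(out_w):
            top, left = r * stride, c * stride
            out[r, c] = image[top:top + size, left:left + size].max()
    return out

image = np.arange(9).reshape(3, 3)
pooled = max_pool2d(image)
print(pooled)  # [[4 5] [7 8]]

# Swapping two neighbouring input values leaves the pooled output unchanged,
# which is the tolerance to local perturbations mentioned above.
perturbed = image.copy()
perturbed[0, 0], perturbed[0, 1] = 1, 0
pooled_perturbed = max_pool2d(perturbed)
```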
input      filter     output
0 1        0 1        0  0  1
2 3        2 3        0  4  6
                      4 12  9
• Opposite transformation of convolution
• Represents the bases to reconstruct the shape of an input
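A small sketch of the deconvolution (transposed convolution) numbers on this slide: each input value scales a copy of the filter, and overlapping copies are summed. The function name `deconv2d` and the stride-1, no-padding setting are assumptions for illustration, not neon's API.

```python
import numpy as np

def deconv2d(inputs, kernel):
    """Transposed convolution with stride 1: scatter scaled copies of `kernel`."""
    kh, kw = kernel.shape
    out_shape = (inputs.shape[0] + kh - 1, inputs.shape[1] + kw - 1)
    out = np.zeros(out_shape, dtype=inputs.dtype)
    for r in range(inputs.shape[0]):
        for c in range(inputs.shape[1]):
            # Each input element contributes input * kernel at its position.
            out[r:r + kh, c:c + kw] += inputs[r, c] * kernel
    return out

inputs = np.arange(4).reshape(2, 2)   # 2x2 input 0..3
kernel = np.arange(4).reshape(2, 2)   # 2x2 filter 0..3
upsampled = deconv2d(inputs, kernel)
print(upsampled)  # [[0 0 1] [0 4 6] [4 12 9]]
```

The 2x2 input grows to a 3x3 output, the reverse of the shape change in the convolution example.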
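Figure: the deconvolution laid out element by element. Each input value (0 to 3) multiplies the filter weights (0 to 3), and the overlapping products are added to give the outputs 0, 0, 1, 0, 4, 6, 4, 12, 9.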
• AlexNet (ILSVRC 2012 winner)
• ZF Net (2013 winner)
• GoogLeNet (2014 winner)
• VGG (2014 runner-up)
• ResNet (2015 winner)
conv1
pool1
conv2
pool2
conv3
conv4
conv5
pool5
fc6
fc7
When you construct a deep network and train it with a lot of data:
• all the parameters are optimized for the problem, giving an optimized feature extractor
• it discovers the intrinsic structures of the data on its own
• different layers of filters discover different levels of features
• Filters can be visualized by the weights
• The weights reflect what patterns a filter is looking for
• Low-level filters represent lower-level features, edges, color blobs
11x11x3 conv filters learned by the first layer
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
• High-level filters represent abstract features
• Inputs that activate the filters (neurons) pass through to the upper layers
• But the pattern can be hard to interpret
http://eblearn.sourceforge.net/lib/exe/mnist_fprop1.png
• A conv-deconv network to project filters to the pixel level
• For high-level filters:
• Each tile shows a feature map activation projected to pixel space
• Strong grouping within each feature map
• Greater invariance at higher layers
• Exaggeration of discriminative parts of the image: eyes, wheels…
(Zeiler and Fergus, 2013)
1. Each layer has a different scope of things it can look for.
Lower layers develop general features, so they do not have a wide variety to look for
Higher layers have a larger variety of things to look for
→ Number of features increases
2. Combine simple features into complex features
Choose convolution strides / padding to retain feature map (FM) size
Use pooling to reduce FM size
→ (H, W) decrease
Layer                   Output shape
Input                   (224, 224, 3)
CONV (3x3x64)           (224, 224, 64)
CONV (3x3x64)           (224, 224, 64)
POOL (2x2)              (112, 112, 64)
CONV (3x3x128)          (112, 112, 128)
CONV (3x3x128)          (112, 112, 128)
POOL (2x2)              (56, 56, 128)
CONV (3x3x256)          (56, 56, 256)
CONV (3x3x256)          (56, 56, 256)
CONV (3x3x256)          (56, 56, 256)
POOL (2x2)              (28, 28, 256)
CONV (3x3x512)          (28, 28, 512)
CONV (3x3x512)          (28, 28, 512)
CONV (3x3x512)          (28, 28, 512)
POOL (2x2)              (14, 14, 512)
CONV (3x3x512)          (14, 14, 512)
CONV (3x3x512)          (14, 14, 512)
CONV (3x3x512)          (14, 14, 512)
POOL (2x2)              (7, 7, 512)
AFFINE (4096 units)     (4096, 1)
AFFINE (4096 units)     (4096, 1)
AFFINE (1000 units)     (1000, 1)
https://www.cs.toronto.edu/~frossard/post/vgg16/
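The output shapes of the convolution and pooling layers in this table follow from one formula, out = (in + 2·padding − kernel) / stride + 1. The sketch below traces them; it assumes 3x3 convolutions with stride 1 and padding 1 and 2x2 pooling with stride 2, which the table implies but does not state, and the names `conv_output_size`, `VGG16_LAYERS`, and `trace_shapes` are illustrative only.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

# Numbers are conv output channels; "P" marks a 2x2 pooling layer.
VGG16_LAYERS = [64, 64, "P", 128, 128, "P", 256, 256, 256, "P",
                512, 512, 512, "P", 512, 512, 512, "P"]

def trace_shapes(height=224, width=224, channels=3):
    """Return the (H, W, C) output shape after every conv/pool layer."""
    shapes = []
    for layer in VGG16_LAYERS:
        if layer == "P":
            # 2x2 pooling with stride 2 (assumed) halves H and W.
            height = conv_output_size(height, kernel=2, stride=2)
            width = conv_output_size(width, kernel=2, stride=2)
        else:
            # 3x3 conv with stride 1 and padding 1 (assumed) keeps H and W.
            height = conv_output_size(height, kernel=3, padding=1)
            width = conv_output_size(width, kernel=3, padding=1)
            channels = layer
        shapes.append((height, width, channels))
    return shapes

shapes = trace_shapes()
for shape in shapes:
    print(shape)
```

The last shape, (7, 7, 512), flattens to 25,088 values before the first affine layer.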
Input Conv1 Conv2 Conv3 Deconv1 Deconv2 Deconv3
• Can be trained to reconstruct meaningful variations
• Have been used for image generation and object localization
http://arxiv.org/abs/1411.5928
http://arxiv.org/abs/1412.6583
http://arxiv.org/abs/1505.04366
Image classification
Image segmentation
Object localization
Video classification
• Neon supports optimized convolution kernels for Maxwell-based GPUs
• All components for constructing example CNNs
• https://www.kaggle.com/c/noaa-right-whale-recognition
• Right whales being photographed and tracked for over 10 years
• ~4500 labeled images, ~450 whales
• ~7000 test images
• They all look quite the same
• Small objects to identify against the background
• Whales in the pictures have different orientations, and it is challenging to build in tolerance to this much variance.
• How to go from the original photo to an up-close, orientation-aligned crop?
• Estimate the heading (angle) of the whale using a CNN?
• The training set can be labeled manually to train a segmentation CNN
• Apply the segmentation CNN to process and auto-align the test images
• Apply a classification CNN to the pre-processed images
Figure: input image and network output at epoch 0, epoch 2, epoch 4, and epoch 6, with target and prediction marked.
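# Localizer: conv layers followed by deconv layers that output a 1-channel map.
# `train`, `val`, and `args` (dataset iterators and parsed arguments) are defined elsewhere.
from neon.callbacks.callbacks import Callbacks
from neon.initializers import Gaussian
from neon.layers import Conv, Deconv, GeneralizedCost
from neon.models import Model
from neon.optimizers import Adadelta
from neon.transforms import Rectlin, SumSquared

init = Gaussian(scale=0.1)
opt = Adadelta(decay=0.9)
common = dict(init=init, batch_norm=True, activation=Rectlin())

layers = []
nchan = 128
layers.append(Conv((2, 2, nchan), strides=2, **common))
for idx in range(16):
    layers.append(Conv((3, 3, nchan), **common))
    if nchan > 16:
        nchan //= 2  # integer division keeps the channel count an int
for idx in range(15):
    layers.append(Deconv((3, 3, nchan), **common))
layers.append(Deconv((4, 4, nchan), strides=2, **common))
layers.append(Deconv((3, 3, 1), init=init))

cost = GeneralizedCost(costfunc=SumSquared())
mlp = Model(layers=layers)
callbacks = Callbacks(mlp, train, eval_set=val, **args.callback_args)
mlp.fit(train, optimizer=opt, num_epochs=args.epochs, cost=cost, callbacks=callbacks)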
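# Classifier: conv/pool blocks with doubling channel counts, then a 447-way softmax.
# `train`, `val`, and `args` are defined elsewhere. DropoutBinary is kept as named on
# the slide; its import path is an assumption (some neon releases call the layer Dropout).
from neon.callbacks.callbacks import Callbacks
from neon.initializers import Gaussian
from neon.layers import Affine, Conv, DropoutBinary, GeneralizedCost, Pooling
from neon.models import Model
from neon.optimizers import Adadelta
from neon.transforms import CrossEntropyMulti, Rectlin, Softmax

init = Gaussian(scale=0.01)
opt = Adadelta(decay=0.9)
common = dict(init=init, batch_norm=True, activation=Rectlin())

layers = []
nchan = 64
layers.append(Conv((2, 2, nchan), strides=2, **common))
for idx in range(6):
    if nchan > 1024:
        nchan = 1024  # cap the number of feature maps
    layers.append(Conv((3, 3, nchan), strides=1, **common))
    layers.append(Pooling(2, strides=2))
    nchan *= 2
layers.append(DropoutBinary(keep=0.5))
layers.append(Affine(nout=447, init=init, activation=Softmax()))

cost = GeneralizedCost(costfunc=CrossEntropyMulti())
mlp = Model(layers=layers)
callbacks = Callbacks(mlp, train, eval_set=val, **args.callback_args)
mlp.fit(train, optimizer=opt, num_epochs=args.epochs, cost=cost, callbacks=callbacks)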
https://github.com/anlthms/whale-2015
Source: http://mi.eng.cam.ac.uk/projects/segnet/
• Uses the CamVid dataset: https://github.com/alexgkendall/SegNet-Tutorial/tree/master/CamVid
• Converts the 1-channel target class images, which hold the ground-truth value for each pixel, into 12-channel images using a one-hot representation of each pixel's class
• Takes about 12 GB of GPU memory
• After 650 epochs of training, the network should reach a training cost of ~9000 and ~80% pixel classification accuracy.
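The one-hot target conversion described above can be sketched in NumPy. This is an illustration under assumptions, not the code from the neon example: the function name `one_hot_labels`, the channel-first layout, and the tiny 2x2 label image are all made up.

```python
import numpy as np

def one_hot_labels(label_image, num_classes=12):
    """Turn an (H, W) array of class ids into a (num_classes, H, W) one-hot array."""
    classes = np.arange(num_classes).reshape(num_classes, 1, 1)
    # Broadcasting compares every pixel against every class id.
    return (label_image[np.newaxis, :, :] == classes).astype(np.uint8)

# Tiny made-up label image with class ids in 0..11.
labels = np.array([[0, 3],
                   [11, 3]])
encoded = one_hot_labels(labels)
print(encoded.shape)  # (12, 2, 2)
print(encoded[3])     # channel 3 is 1 wherever the pixel class is 3
```

Taking `encoded.argmax(axis=0)` recovers the original 1-channel class image.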
Sky
Building
Sidewalk
Tree
Car
Pedestrian
Pole
Road
Sign
Fence
Bicyclist
Unlabeled
Source at https://github.com/NervanaSystems/neon/tree/master/examples
Example 1:
./conv_autoencoder.py
Conv-deconv network to reconstruct input images
Example 2:
./cifar10_conv.py
ConvNet for Cifar10 dataset
Back Propagation through time
1. Unroll the network across time steps.
2. Follow the back-propagated gradients.
3. Update weights with average gradients.
Figure: an encoder RNN with input xt and hidden layers ht, jt, kt, connected by recurrent and feed-forward weights, shown next to the unrolled network (x1 … xn, h1 … hn, j1 … jn, k1 … kn) through which the gradients flow.
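The three steps can be sketched for a single-layer tanh RNN in NumPy. This is a minimal illustration under assumptions, not neon's implementation: the squared-error loss on the last hidden state, the function names, and the random data are all made up for the example.

```python
import numpy as np

def forward(xs, w_x, w_h):
    """Step 1: unroll the RNN over all time steps and keep every hidden state."""
    hs = [np.zeros(w_h.shape[0])]              # hs[0] is the initial state
    for x in xs:
        hs.append(np.tanh(w_x @ x + w_h @ hs[-1]))
    return hs

def loss_fn(xs, target, w_x, w_h):
    """Assumed loss: squared error between the last hidden state and a target."""
    h_last = forward(xs, w_x, w_h)[-1]
    return 0.5 * np.sum((h_last - target) ** 2)

def bptt(xs, target, w_x, w_h):
    """Steps 2 and 3: follow gradients back in time, then average over steps."""
    hs = forward(xs, w_x, w_h)
    grad_x = np.zeros_like(w_x)
    grad_h = np.zeros_like(w_h)
    dh = hs[-1] - target                        # dLoss/dh at the last step
    for t in reversed(range(len(xs))):
        dz = dh * (1.0 - hs[t + 1] ** 2)        # back through tanh
        grad_x += np.outer(dz, xs[t])           # shared weights: accumulate
        grad_h += np.outer(dz, hs[t])
        dh = w_h.T @ dz                         # pass gradient to step t - 1
    steps = len(xs)
    return grad_x / steps, grad_h / steps

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))                    # 5 time steps, 3 inputs each
target = rng.normal(size=4)
w_x = rng.normal(scale=0.5, size=(4, 3))        # feed-forward weights
w_h = rng.normal(scale=0.5, size=(4, 4))        # recurrent weights
grad_x, grad_h = bptt(xs, target, w_x, w_h)

learning_rate = 0.1
new_w_x = w_x - learning_rate * grad_x
new_w_h = w_h - learning_rate * grad_h
```

Because the same weights are reused at every time step, the per-step gradients are accumulated before the update; dividing by the number of steps gives the average gradient named in step 3.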
Network activations determine states of input, forget, output gate:
• Open input, open output, closed forget: LSTM network acts like a standard RNN
• Closing input, opening forget: Memory cell recalls previous state, new input is ignored
• Closing output: Internal state is stored for the next time step without producing any output
Figure: LSTM cells unrolled over time steps, each with gates f, g, i, o, memory cell c, and hidden output ht, connected to the input by feed-forward (FF) weights and to the previous hidden state by recurrent weights.
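The three gate regimes can be made concrete with the standard LSTM cell update, c = f·c_prev + i·g and h = o·tanh(c). In this sketch the gate values are passed in by hand (1.0 = open, 0.0 = closed) instead of being computed from the network activations, purely to show the effect of each setting; `lstm_cell` is a hypothetical helper, not a neon layer.

```python
import numpy as np

def lstm_cell(candidate, c_prev, i_gate, f_gate, o_gate):
    """One memory-cell update with explicit gate values (1.0 open, 0.0 closed)."""
    c_new = f_gate * c_prev + i_gate * candidate   # what the cell stores
    h_new = o_gate * np.tanh(c_new)                # what the cell emits
    return c_new, h_new

candidate = np.array([0.5, -1.0])   # g: new input after its nonlinearity
c_prev = np.array([2.0, 0.3])       # memory from the previous time step

# Open input, open output, closed forget: previous memory is dropped,
# so the cell depends only on the new input, like a standard RNN unit.
rnn_c, rnn_h = lstm_cell(candidate, c_prev, i_gate=1.0, f_gate=0.0, o_gate=1.0)

# Closed input, open forget: the cell recalls its previous state unchanged.
keep_c, keep_h = lstm_cell(candidate, c_prev, i_gate=0.0, f_gate=1.0, o_gate=1.0)

# Closed output: the state is stored for the next step but nothing is emitted.
silent_c, silent_h = lstm_cell(candidate, c_prev, i_gate=1.0, f_gate=1.0, o_gate=0.0)
```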
• Neon supports a wide range of recurrent layers
• Connectivity between recurrent and feed-forward layers
• Deep and bi-directional RNNs
• Containers for Encoders, Decoders, Sequence to Sequence models.
Recurrent output layers
Standard Recurrent layers
Bidirectional Recurrent layers
• Simple RNN example: neon/examples/char_rnn.py
• Penn Tree Bank text dataset: Learn to predict text, one letter at a time.
• Small enough to run on your laptop right now (should take about 4 minutes per epoch of training on a laptop CPU).
• LSTM example: text_generation_lstm.py
• Generate Shakespeare-style text
Annotated parts of the script: Backend, Hyper-parameters, Network Layers, Dataset, Cost Function, Optimizer, Fitting the model
Other RNN examples you can try:
Source at https://github.com/anlthms/meetup2/blob/master
Example 1:
./rnn1.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v
(Does not work well! This example demonstrates challenges of training RNNs)
Example 2:
./rnn2.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v
(Uses Glorot init, gradient clipping, Adagrad and LSTM)
Example 3:
./rnn3.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v
(Uses multiple bi-rnn layers)
Warning: Large dataset, please do not download over ODSC WiFi.
Image Captioning
Speech recognition
Machine Translation
Time Series Analysis
• Whale Detection challenge from Kaggle: https://www.kaggle.com/c/whale-detection-challenge
• Identify calls by Right Whales based on their signature chirp sound
• 30,000 training clips, each 2 s long, sampled at 2 kHz
• Processing the data: Work in the spectrogram domain: 81 frequencies, 49 time steps
• neon dataloader has built-in audio processing tools.
• Essentially transforms the sound into an “image” we can apply ConvNet tools to.
Figure: Whale Call Spectrogram
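A short-time Fourier transform shows how a 2 s clip at 2 kHz can become an 81 x 49 "image". The 160-sample window and 80-sample stride below are assumptions chosen because they reproduce those dimensions; the settings used by the neon dataloader may differ. The chirp is synthetic and only stands in for a whale call.

```python
import numpy as np

def spectrogram(signal, window=160, stride=80):
    """Magnitude spectrogram with shape (frequencies, time steps)."""
    taper = np.hanning(window)
    starts = range(0, len(signal) - window + 1, stride)
    frames = np.stack([signal[s:s + window] * taper for s in starts])
    # rfft of a 160-sample frame yields 160 // 2 + 1 = 81 frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T

sample_rate = 2000                                 # 2 kHz, as on the slide
t = np.arange(2 * sample_rate) / sample_rate       # one 2 s clip, 4000 samples
# Synthetic upward chirp from about 100 Hz to 300 Hz (not real whale data).
clip = np.sin(2 * np.pi * (100 * t + 50 * t ** 2))
spec = spectrogram(clip)
print(spec.shape)  # (81, 49)
```

The rising chirp shows up as a diagonal ridge in `spec`, the kind of visual pattern convolutional layers can pick up.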
Network
• Spectrograms are 81 x 49 “pixel” images.
• Apply convolutional layers to obtain a 37x10 feature map of depth 512
• RNN layers are applied to 10 time steps of the 37*512-D stack
• Feed the last time step into a binary classifier with Softmax
• Conv layers have ReLU activations and use Batch Normalization
Training
• Optimization with AdaDelta
• Initialization with Gaussian noise
• This model is not very deep, so it is not challenging to train
Network Architecture
Main Python script
Full source at https://github.com/NervanaSystems/neon/blob/master/examples/whale_calls.py
Figure labels: Spectrogram, Class Label, MAX
Image: Kyunghyun Cho
• Data is tokenized and mapped to a dictionary
• One-hot encoding: Each word is a category
• No fixed mapping between words
Input sentence: De nouvelles règles sur les transferts de données pour une coopération policière plus efficace
One-hot input:
00001000000000
00000000001000
00100000000000
00000000100000
…
Output sentence: New rules on data transfers to ensure smoother police cooperation
One-hot output:
0000000010
0100000000
0001000000
0000001000
…
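Tokenizing a sentence, mapping tokens to a dictionary, and one-hot encoding them can be sketched as follows. Whitespace tokenization, lowercasing, and the helper names `build_vocab` and `one_hot_encode` are simplifying assumptions; a real translation pipeline would use a proper tokenizer and a fixed-size vocabulary.

```python
import numpy as np

def build_vocab(sentences):
    """Map every distinct lowercase token to an integer id, in order of appearance."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def one_hot_encode(sentence, vocab):
    """Return a (num_tokens, vocab_size) array with a single 1 per row."""
    ids = [vocab[token] for token in sentence.lower().split()]
    encoded = np.zeros((len(ids), len(vocab)), dtype=np.uint8)
    encoded[np.arange(len(ids)), ids] = 1
    return encoded

source = ("De nouvelles règles sur les transferts de données "
          "pour une coopération policière plus efficace")
vocab = build_vocab([source])
encoded = one_hot_encode(source, vocab)
print(encoded.shape)  # (14, 13): 14 tokens, 13 distinct words ("de" occurs twice)
```

Each row is one word and each column one dictionary entry, so the two occurrences of "de" get identical rows.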
Sequence to Sequence Model
Figure: an ENCODER RNN (inputs x1 … xn, hidden layers h, j, k) produces an encoding that is passed to a DECODER RNN (hidden layers h~, j~, k~, outputs y1 … yn). Example: the encoder reads “the cat is”; the decoder is fed “le chat” and produces “le chat est”. The legend distinguishes recurrent weights, feed-forward weights, and embedding weights.
neon layer configuration
Encoder and Decoder are layer containers in neon
Stack of GRU-type recurrent layers in the container
Seq2Seq container to train Encoder and Decoder
The (correct) previous word is fed as input to the decoder LookupTable
• Toy model with a vocabulary of 16,384 words
• Word embedding of 1024 dimensions
• 2 hidden layers with 512 GRU units
Network Layers:
Seq2Seq
LookupTable Layer : 20 inputs, (512, 20) outputs size
Recurrent Layer 'GRU1Enc': 512 inputs, 512 outputs, 20 steps
Recurrent Layer 'GRU1Enc': 512 inputs, 512 outputs, 20 steps
LookupTable Layer : 20 inputs, (512, 20) outputs size
Recurrent Layer 'GRU1Dec': 512 inputs, 512 outputs, 20 steps
Recurrent Layer 'GRU1Dec': 512 inputs, 512 outputs, 20 steps
Linear Layer 'Affine': 512 inputs, 16384 outputs
Bias Layer 'Affine_bias': size 16384
Activation Layer 'Affine_Softmax': Softmax
Model trained with Cross-Entropy cost using RMSProp
BLEU Score (bilingual evaluation understudy)
• Compares n-grams with one or multiple references
• Modified form of precision, with additional penalties.
Candidate: on the mat there is a cat
Reference 1: the cat is on the mat
Reference 2: there is a cat on the mat
Beam Search
• Greedy algorithm to obtain output sequences
• Not perfect, so NMT systems are often used for rescoring
Figure: beam search tree starting from “on the mat”, with candidate next words and probabilities at each step (is, there, a, … with 0.1, 0.5, 0.03; is, cat, a, … with 0.3, 0.07, 0.05; is, the, a, … with 0.01, 0.2, 0.02).
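The modified n-gram precision at the core of BLEU can be worked out for the candidate and the two references on this slide. The sketch covers only the clipped precision for one n-gram order; the brevity penalty and the geometric mean over n-gram orders that make up the full score are left out, and the function names are illustrative.

```python
from collections import Counter

def ngrams(tokens, n):
    """All consecutive n-token tuples in `tokens`."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram counts at most as
    often as it appears in the reference where it is most frequent."""
    cand_counts = Counter(ngrams(candidate.split(), n))
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref.split(), n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

candidate = "on the mat there is a cat"
references = ["the cat is on the mat", "there is a cat on the mat"]
p1 = modified_precision(candidate, references, 1)
p2 = modified_precision(candidate, references, 2)
print(p1, p2)
```

All seven candidate words occur in a reference, so the unigram precision is 7/7; the bigram "mat there" occurs in neither reference, so the bigram precision is 5/6.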
Krizhevsky, 2012
Kendall et al, 2016
Amodei et al, 2015
Layers: Linear, Convolution, Pooling, Deconvolution, Dropout, Recurrent, Long Short-Term Memory, Gated Recurrent Unit, BatchNorm, LookupTable, Local Response Normalization, Bidirectional-RNN, Bidirectional-LSTM
Backend: NervanaGPU, NervanaCPU, NervanaMGPU
Datasets: MNIST, CIFAR-10, Imagenet 1K, PASCAL VOC, Mini-Places2, IMDB, Penn Treebank, Shakespeare Text, bAbI, Hutter-prize, UCF101, flickr8k, flickr30k, COCO
Initializers: Constant, Uniform, Gaussian, Glorot Uniform, Xavier, Kaiming, IdentityInit, Orthonormal
Optimizers: Gradient Descent with Momentum, RMSProp, AdaDelta, Adam, Adagrad, MultiOptimizer
Activations: Rectified Linear, Softmax, Tanh, Logistic, Identity, ExpLin
Costs: Binary Cross Entropy, Multiclass Cross Entropy, Sum of Squares Error
Metrics: Misclassification (Top1, TopK), LogLoss, Accuracy, PrecisionRecall, ObjectDetection
Framework comparison table: neon, Theano, Caffe, Torch, TensorFlow
Criteria: Academic Research, Bleeding-edge, Curated models, Iteration Time, Inference speed, Package ecosystem, Support
Third-party (Facebook) benchmarking
• github.com/NervanaSystems/ModelZoo
• model files, parameters
Nervana’s deep learning tutorials:
https://www.nervanasys.com/deep-learning-tutorials/
Github page:
https://github.com/NervanaSystems/neon
For more information, contact:
info@nervanasys.com
THANK YOU!
QUESTIONS?