Proprietary and confidential. Do not distribute.
Diving deep into Deep Learning:
Convolutional and Recurrent Neural Networks
Urs Köster, PhD, Yinyin Liu, PhD
MAKING MACHINES SMARTER.™
now part of
Nervana Systems Proprietary
• What is Deep Learning and What Can It Do Today? (2:00 pm – 2:25 pm)
• Convolutional Neural Networks (2:25 pm – 2:50 pm)
• BREAK: 2:50 – 3:00
• Using Convolutional Neural Networks to Identify Whales (3:00 pm – 3:25 pm)
• SegNet for Object Segmentation (3:25 pm – 3:50 pm)
• BREAK: 3:50 – 4:00
• Recurrent Neural Networks (4:00 pm – 4:25 pm)
• Using Recurrent Neural Networks for Whale Call Classification (4:25 pm – 4:50 pm)
• BREAK: 4:50 – 5:00
• Neural Machine Translation (5:00 pm – 5:25 pm)
• Finding the Right Deep Learning Framework For You (5:25 pm – 5:50 pm)
download neon!
https://github.com/NervanaSystems/neon
git clone git@github.com:NervanaSystems/neon.git
Nervana’s deep learning tutorials:
https://www.nervanasys.com/deep-learning-tutorials/
We are hiring!
https://www.nervanasys.com/careers/
Back-propagation
End-to-end
Resnet
ImageNet
Word2Vec
Regularization
Convolution
Unrolling
RNN
Generalization
hyperparameters
Video recognition
dropout
Pooling
LSTM
AlexNet
Speech recognition
https://www.nervanasys.com/industry-focus-serving-the-automotive-industry-with-the-nervana-platform/
http://www.nervanasys.com/deep-reinforcement-learning-with-neon/
https://youtu.be/KkIf0Ok5GCE
End-to-end learning: raw image input → network with ~60 million parameters → output (positive/negative)
(Zeiler and Fergus, 2013)
Historical perspective:
• Input → designed features → output
• Input → designed features → SVM → output
• Input → learned features → SVM → output
• Input → levels of learned features → output
A method for extracting features at multiple levels of abstraction
• Features are discovered from data
• Performance improves with more data
• Network can express complex transformations
• High degree of representational power
No free lunch:
• lots of data
• flexible models
• powerful priors
[Figure: ImageNet top-5 error rate by year, 2010 to 2015 (y-axis 0% to 30%), with human performance marked. Source: ImageNet]
• No free lunch
• lots of data
• flexible and fast frameworks
• powerful computing resources
• Healthcare: Tumor detection
• Automotive: Speech interfaces
• Finance: Time-series search engine
• Agricultural Robotics
• Oil & Gas
• Proteomics: Sequence analysis
• Convolutional Neural Networks (2:25 pm – 2:50 pm)
input (3x3):
0 1 2
3 4 5
6 7 8
filter (2x2):
0 1
2 3
output (2x2):
19 25
37 43
(0, 1, 3, 4) · (0, 1, 2, 3) = 19
• Each element in the output is the result of a dot product between two vectors
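The dot-product view of convolution can be checked with a few lines of NumPy. This is a minimal sketch, not neon's implementation: the helper name `conv2d_valid` is hypothetical, and it assumes no padding, a stride of 1, and no filter flip (the cross-correlation form commonly used in deep learning frameworks).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and take dot products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for r in range(out_h):
        for c in range(out_w):
            patch = image[r:r + kh, c:c + kw]
            # Flatten the patch and the filter, then take their dot product.
            out[r, c] = np.dot(patch.ravel(), kernel.ravel())
    return out

image = np.arange(9).reshape(3, 3)    # the 3x3 input from the slide
kernel = np.arange(4).reshape(2, 2)   # the 2x2 filter from the slide
result = conv2d_valid(image, kernel)
```

With the slide's input and filter, `result` holds the 2x2 output shown above; its top-left entry is the dot product of (0, 1, 3, 4) and (0, 1, 2, 3).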
B0 B1 B2
B3 B4 B5
B6 B7 B8
G0 G1 G2
G3 G4 G5
G6 G7 G8
R0 R1 R2
R3 R4 R5
R6 R7 R8
[Figure: the same convolution written as a matrix-vector product. The 3x3 input is unrolled into the vector (0, 1, …, 8), the filter weights (0, 1, 2, 3) fill one row of the matrix per output element, and the product is (19, 25, 37, 43).]
Detected the pattern!
input (3x3):
0 1 2
3 4 5
6 7 8
output (2x2):
4 5
7 8
Max(0, 1, 3, 4) = 4
• Each element in the output is the maximum value within the pooling window
• Precise location becomes less relevant
• The layer becomes tolerant to local perturbations in the input, which builds in invariance
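The pooling example above can be reproduced with a short NumPy sketch. The function name `max_pool2d` is hypothetical, and the 2x2 window with stride 1 is inferred from the 3x3 input and 2x2 output on the slide.

```python
import numpy as np

def max_pool2d(image, window=2, stride=1):
    """Return the maximum of each window x window region of `image`."""
    out_h = (image.shape[0] - window) // stride + 1
    out_w = (image.shape[1] - window) // stride + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for r in range(out_h):
        for c in range(out_w):
            r0, c0 = r * stride, c * stride
            out[r, c] = image[r0:r0 + window, c0:c0 + window].max()
    return out

image = np.arange(9).reshape(3, 3)
pooled = max_pool2d(image)

# A small local perturbation: the top-left pixel changes from 0 to 2.
perturbed = image.copy()
perturbed[0, 0] = 2
pooled_perturbed = max_pool2d(perturbed)
```

Here `pooled` and `pooled_perturbed` are identical, which illustrates the tolerance to local perturbations: the changed pixel never becomes the maximum of its window.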
input (2x2):
0 1
2 3
filter (2x2):
0 1
2 3
output (3x3):
0 0 1
0 4 6
4 12 9
• Opposite transformation of convolution
• Represents the bases to reconstruct the shape of an input
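One way to read the deconvolution example is that every input element stamps a scaled copy of the filter onto the output, and overlapping stamps are summed. The sketch below assumes stride 1 and no padding; `deconv2d` is a hypothetical helper, not a neon function.

```python
import numpy as np

def deconv2d(inputs, kernel):
    """Transposed convolution: add inputs[r, c] * kernel into the output at (r, c)."""
    kh, kw = kernel.shape
    out_shape = (inputs.shape[0] + kh - 1, inputs.shape[1] + kw - 1)
    out = np.zeros(out_shape, dtype=inputs.dtype)
    for r in range(inputs.shape[0]):
        for c in range(inputs.shape[1]):
            out[r:r + kh, c:c + kw] += inputs[r, c] * kernel
    return out

inputs = np.arange(4).reshape(2, 2)   # the 2x2 input from the slide
kernel = np.arange(4).reshape(2, 2)   # the 2x2 filter from the slide
upsampled = deconv2d(inputs, kernel)
```

The 2x2 input grows to the 3x3 output shown above, the reverse of the shape change produced by the 2x2 convolution.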
[Figure: the deconvolution written as a matrix-vector product; for example, one output element is 3 x 1 + 1 x 3 = 6.]
• AlexNet (ILSVRC 2012 winner)
• ZF Net (2013 winner)
• GoogLeNet (2014 winner)
• VGG (2014 runner-up)
• ResNet (2015 winner)
conv1 → pool1 → conv2 → pool2 → conv3 → conv4 → conv5 → pool5 → fc6 → fc7
When you construct a deep network and train it with a lot of data:
• all the parameters are optimized for the problem, giving an optimized feature extractor
• it discovers the intrinsic structure of the data on its own
• different layers of filters discover different levels of features
• Filters can be visualized by the weights
• The weights reflect what patterns a filter is looking for
• Low-level filters represent low-level features such as edges and color blobs
11x11x3 conv filters learned by the first layer
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
• High-level filters represent abstract features
• Inputs that activate the filters (neurons) pass through to the upper layers
• But the pattern can be hard to interpret
http://eblearn.sourceforge.net/lib/exe/mnist_fprop1.png
• A conv-deconv network to project filters to the pixel level
• For high-level filters:
• Each tile shows a feature map activation projected to pixel space
• Strong grouping within each feature map
• Greater invariance at higher layers
• Exaggeration of discriminative parts of the image (eyes, wheels, …)
(Zeiler and Fergus, 2013)
1. Each layer has a different scope of things it can look for.
Lower layers develop general features, so they do not have a wide variety to look for
Higher layers have a larger variety of things to look for
→ Number of features increases
2. Combine simple features into complex features
Choose convolution strides / padding to retain feature map (FM) size
Use pooling to reduce FM size
→ (H, W) decrease
Layer Output shape
Input (224, 224, 3)
CONV (3x3x64) (224, 224, 64)
CONV (3x3x64) (224, 224, 64)
POOL (2x2) (112, 112, 64)
CONV (3x3x128) (112, 112, 128)
CONV (3x3x128) (112, 112, 128)
POOL (2x2) (56, 56, 128)
CONV (3x3x256) (56, 56, 256)
CONV (3x3x256) (56, 56, 256)
CONV (3x3x256) (56, 56, 256)
POOL (2x2) (28, 28, 256)
CONV (3x3x512) (28, 28, 512)
CONV (3x3x512) (28, 28, 512)
CONV (3x3x512) (28, 28, 512)
POOL (2x2) (14, 14, 512)
CONV (3x3x512) (14, 14, 512)
CONV (3x3x512) (14, 14, 512)
CONV (3x3x512) (14, 14, 512)
POOL (2x2) (7, 7, 512)
AFFINE (4096 units) (4096, 1)
AFFINE (4096 units) (4096, 1)
AFFINE (1000 units) (1000, 1)
https://www.cs.toronto.edu/~frossard/post/vgg16/
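The output shapes in the table follow from the usual size formula, out = (in + 2·pad − kernel) / stride + 1. The sketch below walks the convolution and pooling layers of the table; it assumes 3x3 convolutions with stride 1 and padding 1 (which keeps the feature map size) and 2x2 pooling with stride 2. The names `conv_out`, `VGG16_BLOCKS` and `vgg16_shapes` are hypothetical.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output size along one dimension of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

# (number of conv layers, number of filters) for each block; a pool follows each block.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def vgg16_shapes(height=224, width=224, channels=3):
    shapes = [("Input", (height, width, channels))]
    for n_convs, n_filters in VGG16_BLOCKS:
        for _ in range(n_convs):
            # Assumed: 3x3 kernel, stride 1, padding 1, so (H, W) is retained.
            height = conv_out(height, kernel=3, stride=1, pad=1)
            width = conv_out(width, kernel=3, stride=1, pad=1)
            channels = n_filters
            shapes.append(("CONV (3x3x%d)" % n_filters, (height, width, channels)))
        # Assumed: 2x2 pooling with stride 2, so (H, W) is halved.
        height = conv_out(height, kernel=2, stride=2)
        width = conv_out(width, kernel=2, stride=2)
        shapes.append(("POOL (2x2)", (height, width, channels)))
    return shapes

shapes = vgg16_shapes()
flattened = shapes[-1][1][0] * shapes[-1][1][1] * shapes[-1][1][2]
```

The last pooling layer produces a (7, 7, 512) volume, so the first affine layer sees `flattened` = 25088 inputs. The number of features grows while (H, W) shrinks, as described on the previous slide.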
Input → Conv1 → Conv2 → Conv3 → Deconv1 → Deconv2 → Deconv3
• Can be trained to reconstruct meaningful variations
• Have been used for image generation and object localization
http://arxiv.org/abs/1411.5928
http://arxiv.org/abs/1412.6583
http://arxiv.org/abs/1505.04366
Image classification
Image segmentation
Object localization
Video classification
• neon provides optimized convolution kernels for Maxwell-based GPUs
• It includes all the components needed to construct the example CNNs
• Using Convolutional Neural Networks to Identify Whales (3:00 pm – 3:25 pm)
• https://www.kaggle.com/c/noaa-right-whale-recognition
• Right whales being photographed and tracked for over 10 years
• ~4500 labeled images, ~450 whales
• ~7000 test images
• The whales all look very similar
• The object to identify is small relative to the background
• Whales in the pictures have different orientations, so it is challenging to build in invariance to this much variation
• How to go from the original photograph to an up-close, orientation-aligned crop?
• Estimate the heading (angle) of the whale using a CNN?
• Training set can be manually labeled to train a segmentation CNN
• Apply the segmentation CNN to process and auto-align the test images
• Apply the classification CNN to the pre-processed images
[Figure: input image and predictions after epochs 0, 2, 4, and 6, with the target and the prediction marked]
from neon.callbacks.callbacks import Callbacks
from neon.initializers import Gaussian
from neon.layers import Conv, Deconv, GeneralizedCost
from neon.models import Model
from neon.optimizers import Adadelta
from neon.transforms import Rectlin, SumSquared

# train, val and args (data iterators and parsed command-line arguments) are set up elsewhere.
init = Gaussian(scale=0.1)
opt = Adadelta(decay=0.9)
common = dict(init=init, batch_norm=True, activation=Rectlin())

# Encoder: one strided convolution followed by 16 3x3 convolutions.
layers = []
nchan = 128
layers.append(Conv((2, 2, nchan), strides=2, **common))
for idx in range(16):
    layers.append(Conv((3, 3, nchan), **common))
    if nchan > 16:
        nchan //= 2  # integer division keeps the channel count an int
# Decoder: deconvolutions back up to a single-channel output.
for idx in range(15):
    layers.append(Deconv((3, 3, nchan), **common))
layers.append(Deconv((4, 4, nchan), strides=2, **common))
layers.append(Deconv((3, 3, 1), init=init))

cost = GeneralizedCost(costfunc=SumSquared())
mlp = Model(layers=layers)
callbacks = Callbacks(mlp, train, eval_set=val, **args.callback_args)
mlp.fit(train, optimizer=opt, num_epochs=args.epochs, cost=cost, callbacks=callbacks)
from neon.callbacks.callbacks import Callbacks
from neon.initializers import Gaussian
from neon.layers import Affine, Conv, GeneralizedCost, Pooling
from neon.models import Model
from neon.optimizers import Adadelta
from neon.transforms import CrossEntropyMulti, Rectlin, Softmax

# train, val and args are set up elsewhere; DropoutBinary is used as on the slide and is not imported here.
init = Gaussian(scale=0.01)
opt = Adadelta(decay=0.9)
common = dict(init=init, batch_norm=True, activation=Rectlin())

layers = []
nchan = 64
layers.append(Conv((2, 2, nchan), strides=2, **common))
# Six conv + pool stages; the number of feature maps doubles each time, capped at 1024.
for idx in range(6):
    if nchan > 1024:
        nchan = 1024
    layers.append(Conv((3, 3, nchan), strides=1, **common))
    layers.append(Pooling(2, strides=2))
    nchan *= 2
layers.append(DropoutBinary(keep=0.5))
layers.append(Affine(nout=447, init=init, activation=Softmax()))  # 447 output classes

cost = GeneralizedCost(costfunc=CrossEntropyMulti())
mlp = Model(layers=layers)
callbacks = Callbacks(mlp, train, eval_set=val, **args.callback_args)
mlp.fit(train, optimizer=opt, num_epochs=args.epochs, cost=cost, callbacks=callbacks)
https://github.com/anlthms/whale-2015
• SegNet for Object Segmentation (3:25 pm – 3:50 pm)
Source: http://mi.eng.cam.ac.uk/projects/segnet/
• Uses the CamVid dataset: https://github.com/alexgkendall/SegNet-Tutorial/tree/master/CamVid
• Converts the 1-channel target class images, which hold the ground-truth class of each pixel, into 12-channel images using a one-hot representation of each pixel's class
• Takes about 12 GB of GPU memory
• After 650 epochs of training, the network should reach a training cost of ~9000 and ~80% pixel classification accuracy.
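The target conversion described above can be sketched in NumPy. This is an illustration only, not the code used in the neon example: the function name `to_one_hot` and the channel-first layout (classes, height, width) are assumptions.

```python
import numpy as np

def to_one_hot(labels, num_classes=12):
    """Turn an (H, W) array of class ids into a (num_classes, H, W) one-hot array."""
    one_hot = np.zeros((num_classes,) + labels.shape, dtype=np.uint8)
    rows, cols = np.indices(labels.shape)
    # For every pixel, set the channel that matches its class id to 1.
    one_hot[labels, rows, cols] = 1
    return one_hot

# A tiny 2x2 "image" of class ids standing in for a CamVid target.
labels = np.array([[0, 3],
                   [11, 3]])
target = to_one_hot(labels)
```

Each pixel has exactly one active channel, and taking the argmax over the channel axis recovers the original 1-channel class image.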
Sky
Building
Sidewalk
Tree
Car
Pedestrian
Pole
Road
Sign
Fence
Bicyclist
Unlabeled
Source at https://github.com/NervanaSystems/neon/tree/master/examples
Example 1:
./conv_autoencoder.py
Conv-deconv network to reconstruct input images
Example 2:
./cifar10_conv.py
ConvNet for the CIFAR-10 dataset
• Recurrent Neural Networks (4:00 pm – 4:25 pm)
Back-propagation through time
1. Unroll the network across time steps.
2. Follow the back-propagated gradients.
3. Update weights with average gradients.
[Figure: an RNN with input x_t and hidden layers h_t, j_t, k_t, shown next to the same network unrolled over time steps 1 to n. Recurrent weights connect consecutive time steps, feed-forward weights connect the layers, and gradients flow backward through the unrolled network.]
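The three steps of back-propagation through time can be sketched for a single-layer tanh RNN in NumPy. This is a simplified illustration under stated assumptions, not neon's implementation: the network has one recurrent layer, the loss is a squared error on the last hidden state, and the function names are hypothetical. The gradients from all time steps are summed here; dividing by the number of steps gives the average.

```python
import numpy as np

def forward(W_x, W_h, xs):
    """Step 1: unroll the RNN over all time steps and keep every hidden state."""
    hs = [np.zeros(W_h.shape[0])]
    for x in xs:
        hs.append(np.tanh(W_x @ x + W_h @ hs[-1]))
    return hs

def loss(W_x, W_h, xs, target):
    h_last = forward(W_x, W_h, xs)[-1]
    return 0.5 * np.sum((h_last - target) ** 2)

def bptt(W_x, W_h, xs, target):
    """Steps 2 and 3: follow the gradients backward and accumulate them per weight."""
    hs = forward(W_x, W_h, xs)
    dW_x, dW_h = np.zeros_like(W_x), np.zeros_like(W_h)
    dh = hs[-1] - target                      # gradient of the loss w.r.t. the last state
    for t in reversed(range(len(xs))):
        dz = dh * (1.0 - hs[t + 1] ** 2)      # back through the tanh non-linearity
        dW_x += np.outer(dz, xs[t])           # the same weights are shared by every step
        dW_h += np.outer(dz, hs[t])
        dh = W_h.T @ dz                       # pass the gradient to the previous time step
    return dW_x, dW_h

rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.5, size=(3, 2))      # feed-forward weights
W_h = rng.normal(scale=0.5, size=(3, 3))      # recurrent weights
xs = [rng.normal(size=2) for _ in range(5)]   # five time steps of input
target = np.ones(3)
dW_x, dW_h = bptt(W_x, W_h, xs, target)
```

Because the weights are shared across time steps, each unrolled copy contributes to the same gradient arrays, which is why the contributions are accumulated before the update.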
Network activations determine the states of the input, forget, and output gates:
• Open input, open output, closed forget: LSTM network acts like a standard RNN
• Closing input, opening forget: Memory cell recalls previous state, new input is ignored
• Closing output: Internal state is stored for the next time step without producing any output
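The three gate configurations above can be tried out on a single LSTM cell update. In this sketch the gate values are set by hand rather than computed from network activations, and `g` is taken to be the candidate input to the memory cell; both are assumptions made for illustration, and `lstm_cell` is a hypothetical helper.

```python
import numpy as np

def lstm_cell(c_prev, g, i, f, o):
    """One LSTM cell update with explicit gate values between 0 (closed) and 1 (open)."""
    c = f * c_prev + i * g     # forget gate keeps old memory, input gate admits new input
    h = o * np.tanh(c)         # output gate controls what leaves the cell
    return c, h

c_prev = np.array([0.5, -1.0])   # memory from the previous time step
g = np.array([0.9, 0.2])         # candidate input at this time step

# Open input, open output, closed forget: behaves like a standard RNN unit.
c_rnn, h_rnn = lstm_cell(c_prev, g, i=1.0, f=0.0, o=1.0)
# Closed input, open forget: the previous state is recalled, the new input is ignored.
c_keep, h_keep = lstm_cell(c_prev, g, i=0.0, f=1.0, o=1.0)
# Closed output: the state is stored but nothing is emitted.
c_silent, h_silent = lstm_cell(c_prev, g, i=1.0, f=1.0, o=0.0)
```

In a trained network the gates take intermediate values, so the cell blends these behaviors at every time step.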
[Figure: a chain of LSTM cells, each with gates f, g, i, o, a memory cell c, and an output h_t. Feed-forward (FF) weights connect the input to the hidden layer and recurrent weights connect consecutive time steps.]
• neon supports a wide range of recurrent layers
• Connectivity between recurrent and feed-forward layers
• Deep and bi-directional RNNs
• Containers for Encoders, Decoders, and Sequence to Sequence models
Layer types: recurrent output layers, standard recurrent layers, bidirectional recurrent layers
• Simple RNN example: neon/examples/char_rnn.py
• Penn Tree Bank text dataset: learn to predict text, one letter at a time.
• Small enough to run on your laptop right now (should take about 4 minutes per epoch of training on a laptop CPU).
Other RNN examples you can try:
• LSTM example: text_generation_lstm.py
• Generate Shakespeare-style text
Parts of the example script: backend, hyper-parameters, network layers, dataset, cost function, optimizer, fitting the model
Source at https://github.com/anlthms/meetup2/blob/master
Example 1:
./rnn1.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v
(Does not work well! This example demonstrates challenges of training RNNs)
Example 2:
./rnn2.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v
(Uses Glorot init, gradient clipping, Adagrad and LSTM)
Example 3:
./rnn3.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v
(Uses multiple bi-rnn layers)
Warning: Large dataset, please do not download over ODSC WiFi.
Image Captioning
Speech recognition
Machine Translation
Time Series Analysis
• Using Recurrent Neural Networks for Whale Call Classification (4:25 pm – 4:50 pm)
• Whale Detection challenge from Kaggle: https://www.kaggle.com/c/whale-detection-challenge
• Identify calls by right whales based on their signature chirp sound
• 30,000 training clips, each 2 s long, sampled at 2 kHz
• Processing the data: work in the spectrogram domain (81 frequencies, 49 time steps)
• The neon dataloader has built-in audio processing tools.
• Essentially transforms the sound into an “image” we can apply ConvNet tools to.
[Figure: whale call spectrogram]
Network
• Spectrograms are 81 x 49 “pixel” images.
• Apply convolutional layers to obtain a 37x10 feature map of depth 512
• RNN layers applied to 10 time steps of the 37*512-D stack
• Feed the last time step into a binary classifier with Softmax
• Conv layers have ReLU activations and use Batch Normalization
Training
• Optimization with AdaDelta
• Initialization with Gaussian noise
• This model is not very deep, so it is not challenging to train
Network Architecture
Spectrogram → Conv0 → Conv1 (BN) → Pool → Conv2 (BN) → BiRNN → BiRNN → BiRNN → Linear (BN) → Class Label
Inspired by DeepSpeech 2 (Baidu): convolution + recurrent layers
Main Python script
Full source at https://github.com/NervanaSystems/neon/blob/master/examples/whale_calls.py
Running the example
Command line:
./whale_calls.py -e 16 -r 0 -s whales.pkl -v -w /home/ubuntu/nervana/wdc
Convolution Layer 'Convolution_0': 1 x (81x49) inputs, 128 x (79x24) outputs
Activation Layer 'Convolution_0_Rectlin': Rectlin
Convolution Layer 'Convolution_1': 128 x (79x24) inputs, 256 x (77x22) outputs
BatchNorm Layer 'Convolution_1_bnorm': 433664 inputs, 1 steps, 256 feature maps
Activation Layer 'Convolution_1_Rectlin': Rectlin
Pooling Layer 'Pooling_0': 256 x (77x22) inputs, 256 x (38x11) outputs
Convolution Layer 'Convolution_2': 256 x (38x11) inputs, 512 x (37x10) outputs
BatchNorm Layer 'Convolution_2_bnorm': 189440 inputs, 1 steps, 512 feature maps
Activation Layer 'Convolution_2_Rectlin': Rectlin
BiRNN Layer 'BiRNN_0': 18944 inputs, (256 outputs) * 2, 10 steps
BiRNN Layer 'BiRNN_1': (256 inputs) * 2, (256 outputs) * 2, 10 steps
BiRNN Layer 'BiRNN_2': (256 inputs) * 2, (256 outputs) * 2, 10 steps
RecurrentOutput choice RecurrentLast : (512, 10) inputs, 512 outputs
Linear Layer 'Linear_0': 512 inputs, 32 outputs
BatchNorm Layer 'Linear_0_bnorm': 32 inputs, 1 steps, 32 feature maps
Activation Layer 'Linear_0_Rectlin': Rectlin
Linear Layer 'Linear_1': 32 inputs, 2 outputs
Activation Layer 'Linear_1_Softmax': Softmax
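The sizes in the layer log above can be reproduced with simple shape arithmetic. The kernel sizes and strides below are assumptions chosen so that the computed shapes match the log; they are not read from whale_calls.py, and other settings could give the same shapes. The helper names are hypothetical.

```python
def out_size(size, kernel, stride=1):
    """Output size along one dimension for an unpadded convolution or pooling layer."""
    return (size - kernel) // stride + 1

# (name, kernel (freq, time), stride (freq, time), feature maps): assumed values
LAYERS = [
    ("Convolution_0", (3, 3), (1, 2), 128),
    ("Convolution_1", (3, 3), (1, 1), 256),
    ("Pooling_0",     (2, 2), (2, 2), 256),
    ("Convolution_2", (2, 2), (1, 1), 512),
]

def trace_shapes(freq=81, steps=49):
    """Follow the 81x49 spectrogram through the convolution and pooling layers."""
    shapes = []
    for name, kernel, stride, maps in LAYERS:
        freq = out_size(freq, kernel[0], stride[0])
        steps = out_size(steps, kernel[1], stride[1])
        shapes.append((name, maps, freq, steps))
    return shapes

shapes = trace_shapes()
_, maps, freq, steps = shapes[-1]
rnn_input_size = maps * freq   # features per time step fed to the first BiRNN layer
rnn_steps = steps              # number of time steps seen by the recurrent layers
```

The final 512 x (37x10) feature map is read as 10 time steps of 37 * 512 = 18944 features each, which matches the input size reported for BiRNN_0.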
• Neural Machine Translation (5:00 pm – 5:25 pm)
Image: Kyunghyun Cho
• Data is tokenized and mapped to a dictionary
• One-hot encoding: Each word is a category
• No fixed mapping between words
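The tokenize, map, and one-hot steps above can be sketched in plain Python. Whitespace tokenization, lower-casing, and the helper names are assumptions for illustration; a real translation pipeline would use a fixed vocabulary and a proper tokenizer.

```python
def build_vocab(sentences):
    """Map every distinct token to an integer index, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def one_hot(index, size):
    """A vector of zeros with a single 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]

def encode(sentence, vocab):
    """Tokenize a sentence and turn each token into its one-hot vector."""
    return [one_hot(vocab[token], len(vocab)) for token in sentence.lower().split()]

source = "De nouvelles règles sur les transferts de données"
vocab = build_vocab([source])
encoded = encode(source, vocab)
```

The two occurrences of "de" map to the same one-hot vector, and every other word gets its own category, so the encoding carries no notion of similarity between words.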
Input sentence: De nouvelles règles sur les transferts de données pour une coopération policière plus efficace
00001000000000
00000000001000
00100000000000
00000000100000
…
Output sentence: New rules on data transfers to ensure smoother police cooperation
0000000010
0100000000
0001000000
0000001000
…
Sequence to Sequence Model
[Figure: an encoder RNN (layers h, j, k over inputs x1 … xn) produces an encoding that a decoder RNN turns into outputs y1 … yn. In the example the encoder reads "the cat is" and the decoder emits "le chat est", with the previous words "le chat" fed back in as decoder input. Legend: recurrent weights, feed-forward weights, embedding weights.]
neon layer configuration
Encoder and Decoder are layer containers in neon
Stack of GRU layers (an LSTM-like gated recurrent unit) in each container
Seq2Seq container to train Encoder and Decoder
The (correct) previous word is fed as input to the decoder LookupTable
• Toy model with a vocabulary of 16,384 words
• Word embedding of 1024 dimensions
• 2 hidden layers with 512 GRU units
Network Layers:
Seq2Seq
LookupTable Layer : 20 inputs, (512, 20) outputs size
Recurrent Layer 'GRU1Enc': 512 inputs, 512 outputs, 20 steps
Recurrent Layer 'GRU1Enc': 512 inputs, 512 outputs, 20 steps
LookupTable Layer : 20 inputs, (512, 20) outputs size
Recurrent Layer 'GRU1Dec': 512 inputs, 512 outputs, 20 steps
Recurrent Layer 'GRU1Dec': 512 inputs, 512 outputs, 20 steps
Linear Layer 'Affine': 512 inputs, 16384 outputs
Bias Layer 'Affine_bias': size 16384
Activation Layer 'Affine_Softmax': Softmax
Model trained with Cross-Entropy cost using RMSProp
BLEU Score (bilingual evaluation understudy)
• Compare n-grams with one or multiple references
• Modified form of precision, additional penalties.
Beam Search
• Greedy algorithm to obtain output sequences
• Not perfect, so NMT systems are often used for rescoring
Candidate on the mat there is a cat
Reference 1 the cat is on the mat
Reference 2 there is a cat on the mat
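The modified n-gram precision at the core of BLEU can be computed for the candidate and references above. This sketch covers only the clipped precision for one n-gram order; the full BLEU score also combines several orders and applies a brevity penalty, which are omitted here. The function names are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Candidate n-gram counts are clipped by their maximum count in any reference."""
    cand_counts = Counter(ngrams(candidate.split(), n))
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref.split(), n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

candidate = "on the mat there is a cat"
references = ["the cat is on the mat", "there is a cat on the mat"]
p1 = modified_precision(candidate, references, 1)
p2 = modified_precision(candidate, references, 2)
```

Every candidate word appears in a reference, so the unigram precision `p1` is 1; only the bigram "mat there" is unmatched, so the bigram precision `p2` is 5/6.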
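[Figure: beam search example starting from "on the mat", with candidate next words and their probabilities at successive steps: is 0.1, there 0.5, a 0.03, …; is 0.3, cat 0.07, a 0.05, …; is 0.01, the 0.2, a 0.02, …]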
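Beam search can be illustrated on a toy next-word model. The probability table below is entirely hypothetical and depends only on the previous word, whereas an NMT decoder conditions on the whole source sentence and the full prefix. With a beam width of 1 the procedure reduces to greedy decoding.

```python
import math

# Hypothetical next-word probabilities, keyed by the previous word.
NEXT_WORD = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.55, "mat": 0.45},
    "a":   {"cat": 0.9, "mat": 0.1},
    "cat": {"</s>": 1.0},
    "mat": {"</s>": 1.0},
}

def beam_search(beam_width, max_len=5):
    """Keep the `beam_width` most probable partial sentences at every step."""
    beams = [(0.0, ["<s>"])]                      # (log-probability, tokens)
    for _ in range(max_len):
        candidates = []
        for score, tokens in beams:
            if tokens[-1] == "</s>":
                candidates.append((score, tokens))  # finished hypothesis, keep as is
                continue
            for word, prob in NEXT_WORD[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [word]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    return beams

greedy = beam_search(beam_width=1)[0][1]
best_score, best = beam_search(beam_width=2)[0]
```

Greedy decoding commits to "the" because it is the most likely first word, while a beam of width 2 keeps "a" alive and ends with the more probable sentence "a cat". A wider beam explores more hypotheses at a higher cost and still gives no guarantee of finding the best sequence.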
• Finding the Right Deep Learning Framework For You (5:25 pm – 5:50 pm)
Krizhevsky, 2012
Kendall et al, 2016
Amodei et al, 2015
Layers: Linear, Convolution, Pooling, Deconvolution, Dropout, Recurrent, Long Short-Term Memory, Gated Recurrent Unit, BatchNorm, LookupTable, Local Response Normalization, Bidirectional-RNN, Bidirectional-LSTM
Backend: NervanaGPU, NervanaCPU, NervanaMGPU
Datasets: MNIST, CIFAR-10, Imagenet 1K, PASCAL VOC, Mini-Places2, IMDB, Penn Treebank, Shakespeare Text, bAbI, Hutter-prize, UCF101, flickr8k, flickr30k, COCO
Initializers: Constant, Uniform, Gaussian, Glorot Uniform, Xavier, Kaiming, IdentityInit, Orthonormal
Optimizers: Gradient Descent with Momentum, RMSProp, AdaDelta, Adam, Adagrad, MultiOptimizer
Activations: Rectified Linear, Softmax, Tanh, Logistic, Identity, ExpLin
Costs: Binary Cross Entropy, Multiclass Cross Entropy, Sum of Squares Error
Metrics: Misclassification (Top1, TopK), LogLoss, Accuracy, PrecisionRecall, ObjectDetection
[Table: neon, Theano, Caffe, Torch, and TensorFlow compared on academic research, bleeding-edge features, curated models, iteration time, inference speed, package ecosystem, and support]
Third-party (Facebook) benchmarking
• github.com/NervanaSystems/ModelZoo
• model files, parameters
Nervana’s deep learning tutorials:
https://www.nervanasys.com/deep-learning-tutorials/
Github page:
https://github.com/NervanaSystems/neon
For more information, contact:
info@nervanasys.com
THANK YOU!
QUESTIONS?

Contenu connexe

Tendances

Introduction to deep learning @ Startup.ML by Andres Rodriguez
Introduction to deep learning @ Startup.ML by Andres RodriguezIntroduction to deep learning @ Startup.ML by Andres Rodriguez
Introduction to deep learning @ Startup.ML by Andres RodriguezIntel Nervana
 
Rethinking computation: A processor architecture for machine intelligence
Rethinking computation: A processor architecture for machine intelligenceRethinking computation: A processor architecture for machine intelligence
Rethinking computation: A processor architecture for machine intelligenceIntel Nervana
 
ODSC West

  • 1. Proprietary and confidential. Do not distribute. Diving deep into Deep Learning: Convolutional and Recurrent Neural Networks Urs Köster, PhD, Yinyin Liu, PhD MAKING MACHINES SMARTER.™ now part of
  • 2. • What is Deep Learning and What Can It Do Today? (2:00 pm – 2:25 pm) • Convolutional Neural Networks (2:25 pm – 2:50 pm) • BREAK: 2:50 – 3:00 • Using Convolutional Neural Networks to Identify Whales (3:00 pm – 3:25 pm) • SegNet for Object Segmentation (3:25 pm – 3:50 pm) • BREAK: 3:50 – 4:00 • Recurrent Neural Networks (4:00 pm – 4:25 pm) • Using Recurrent Neural Networks for Whale Call Classification (4:25 pm – 4:50 pm) • BREAK: 4:50 – 5:00 • Neural Machine Translation (5:00 pm – 5:25 pm) • Finding the Right Deep Learning Framework For You (5:25 pm – 5:50 pm) • download neon! https://github.com/NervanaSystems/neon • git clone git@github.com:NervanaSystems/neon.git • Nervana’s deep learning tutorials: https://www.nervanasys.com/deep-learning-tutorials/ • We are hiring! https://www.nervanasys.com/careers/
  • 3. Back-propagation, End-to-end, ResNet, ImageNet, Word2Vec, Regularization, Convolution, Unrolling, RNN, Generalization, Hyperparameters, Video recognition, Dropout, Pooling, LSTM, AlexNet, Speech recognition
  • 6. https://www.nervanasys.com/industry-focus-serving-the-automotive-industry-with-the-nervana-platform/
  • 8. http://www.nervanasys.com/deep-reinforcement-learning-with-neon/ • https://youtu.be/KkIf0Ok5GCE
  • 9. End-to-end learning: raw image input → ~60 million parameters → output (positive/negative)
  • 10. (Zeiler and Fergus, 2013)
  • 11. Historical perspective: • Input → designed features → output • Input → designed features → SVM → output • Input → learned features → SVM → output • Input → levels of learned features → output
  • 12. A method for extracting features at multiple levels of abstraction: • Features are discovered from data • Performance improves with more data • Network can express complex transformations • High degree of representational power
  • 13. No free lunch: • lots of data • flexible models • powerful priors
  • 14. [Figure: ImageNet top-5 error rate by year, 2010 to 2015, on a 0% to 30% axis, with human performance marked. Source: ImageNet] • No free lunch: • lots of data • flexible and fast frameworks • powerful computing resources
  • 15. Healthcare: tumor detection • Automotive: speech interfaces • Finance: time-series search engine • Agricultural robotics • Oil & gas • Proteomics: sequence analysis
  • 16. • What is Deep Learning and What Can It Do Today? (2:00 pm – 2:25 pm) • Convolutional Neural Networks (2:25 pm – 2:50 pm) • BREAK: 2:50 – 3:00 • Using Convolutional Neural Networks to Identify Whales (3:00 pm – 3:25 pm) • SegNet for Object Segmentation (3:25 pm – 3:50 pm) • BREAK: 3:50 – 4:00 • Recurrent Neural Networks (4:00 pm – 4:25 pm) • Using Recurrent Neural Networks for Whale Call Classification (4:25 pm – 4:50 pm) • BREAK: 4:50 – 5:00 • Neural Machine Translation (5:00 pm – 5:25 pm) • Finding the Right Deep Learning Framework For You (5:25 pm – 5:50 pm)
  • 17. Input (3×3) = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]; filter (2×2) = [[0, 1], [2, 3]]; output (2×2) = [[19, 25], [37, 43]]. First output element: [0, 1, 3, 4] · [0, 1, 2, 3] = 19 • Each element in the output is the result of a dot product between two vectors
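The numbers on slide 17 can be reproduced with a short sketch. This is illustrative plain Python, not code from the deck or from neon; the function name `conv2d_valid` is hypothetical, and it assumes stride 1, no padding, and no filter flip (cross-correlation, as is common in deep learning frameworks).

```python
def conv2d_valid(image, kernel):
    """Slide a kernel over a 2-D image with stride 1 and no padding.

    Hypothetical helper for illustration; the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Each output element is the dot product of the flattened
            # input patch and the flattened kernel.
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        output.append(row)
    return output


image = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]
kernel = [[0, 1],
          [2, 3]]
result = conv2d_valid(image, kernel)
print(result)  # [[19, 25], [37, 43]]
```

The top-left element is 0·0 + 1·1 + 3·2 + 4·3 = 19, matching the slide.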
  • 18–21. Multi-channel (RGB) input: three 3×3 channels with elements B0–B8, G0–G8, and R0–R8
  • 22. [Figure: the convolution from slide 17 written out in full: input 0 to 8, the four filter weights repeated once per output element, and outputs 19, 25, 37, 43]
  • 23. Detected the pattern!
  • 24. Max pooling: input [0 1 2; 3 4 5; 6 7 8], output [4 5; 7 8], e.g. Max(0, 1, 3, 4) = 4 • Each element in the output is the maximum value within the pooling window • Precise location becomes less relevant • The layer becomes tolerant to local perturbations in the input, which builds in invariance
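The pooling example on slide 24 can be reproduced with a short sketch. The helper name `max_pool2d` is hypothetical, and the 2x2 window with stride 1 is an assumption read off the 3x3 input and 2x2 output on the slide.

```python
def max_pool2d(image, window=2, stride=1):
    """Return the maximum of each window x window patch."""
    out_h = (len(image) - window) // stride + 1
    out_w = (len(image[0]) - window) // stride + 1
    return [
        [
            max(image[i * stride + a][j * stride + b]
                for a in range(window) for b in range(window))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
pooled = max_pool2d(image)
print(pooled)  # [[4, 5], [7, 8]]

# A small local perturbation (swapping the top-left two pixels)
# leaves the pooled output unchanged.
perturbed = [[1, 0, 2], [3, 4, 5], [6, 7, 8]]
print(max_pool2d(perturbed) == pooled)  # True
```

The second call illustrates the tolerance to local perturbations mentioned on the slide.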
  • 25. Deconvolution: input [0 1; 2 3], filter [0 1; 2 3], output [0 0 1; 0 4 6; 4 12 9] • The opposite transformation of convolution • The filters represent the bases used to reconstruct the shape of an input
  • 26. Deconvolution, one output element: with input [0 1; 2 3] and filter [0 1; 2 3], overlapping contributions are summed, e.g. 3 x 1 + 1 x 3 = 6 in the output [0 0 1; 0 4 6; 4 12 9]
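The deconvolution numbers on slides 25 and 26 can be checked with a small sketch. It assumes the usual "transposed convolution" reading: every input value scales a copy of the filter, and overlapping copies are added. The function name `deconv2d` is hypothetical and this is not neon's `Deconv` code.

```python
def deconv2d(inputs, kernel, stride=1):
    """Scatter each input value times the kernel into a larger output."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(inputs) - 1) * stride + kh
    out_w = (len(inputs[0]) - 1) * stride + kw
    output = [[0] * out_w for _ in range(out_h)]
    for i, row in enumerate(inputs):
        for j, value in enumerate(row):
            for a in range(kh):
                for b in range(kw):
                    # Overlapping contributions accumulate.
                    output[i * stride + a][j * stride + b] += value * kernel[a][b]
    return output


small = [[0, 1], [2, 3]]    # input from the slide
kernel = [[0, 1], [2, 3]]   # filter from the slide
upsampled = deconv2d(small, kernel)
print(upsampled)  # [[0, 0, 1], [0, 4, 6], [4, 12, 9]]
```

The element 6 in the middle row is the sum 1 x 3 + 3 x 1 worked out on slide 26.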
  • 27. • AlexNet (ILSVRC 2012 winner) • ZF Net (2013 winner) • GoogLeNet (2014 winner) • VGG (2014 runner-up) • ResNet (2015 winner) [Figure: layer stack conv1, pool1, conv2, pool2, conv3, conv4, conv5, pool5, fc6, fc7]
  • 28. When you construct a deep network and train it with a lot of data: • all the parameters are optimized for the problem, giving an optimized feature extractor • it discovers the intrinsic structure of the data on its own • different layers of filters discover different levels of features
  • 29. • Filters can be visualized by plotting their weights • The weights reflect what patterns a filter is looking for • Low-level filters represent low-level features such as edges and color blobs [Figure: 11x11x3 conv filters learned by the first layer, from http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf]
  • 30. • High-level filters represent abstract features • Inputs that activate the filters (neurons) pass through to the upper layers • But the pattern can be hard to interpret http://eblearn.sourceforge.net/lib/exe/mnist_fprop1.png
  • 31. • A conv-deconv network to project filters to the pixel level • For high level filters: • Each tile shows a feature map activation projected to pixel space • Strong grouping within each feature map • Greater invariance at higher layers • Exaggeration of discriminative parts of the image, eyes, wheels… (Zeiler and Fergus, 2013)
  • 32. 1. Each layer has a different scope of things it can look for. Lower layers develop general features, so they do not have a wide variety to look for. Higher layers have a larger variety of things to look for → the number of features increases. 2. Combine simple features into complex features. Choose convolution strides / padding to retain feature map (FM) size. Use pooling to reduce FM size → (H, W) decrease.
  • 33. Layer and output shape:
      Layer                 Output shape
      Input                 (224, 224, 3)
      CONV (3x3x64)         (224, 224, 64)
      CONV (3x3x64)         (224, 224, 64)
      POOL (2x2)            (112, 112, 64)
      CONV (3x3x128)        (112, 112, 128)
      CONV (3x3x128)        (112, 112, 128)
      POOL (2x2)            (56, 56, 128)
      CONV (3x3x256)        (56, 56, 256)
      CONV (3x3x256)        (56, 56, 256)
      CONV (3x3x256)        (56, 56, 256)
      POOL (2x2)            (28, 28, 256)
      CONV (3x3x512)        (28, 28, 512)
      CONV (3x3x512)        (28, 28, 512)
      CONV (3x3x512)        (28, 28, 512)
      POOL (2x2)            (14, 14, 512)
      CONV (3x3x512)        (14, 14, 512)
      CONV (3x3x512)        (14, 14, 512)
      CONV (3x3x512)        (14, 14, 512)
      POOL (2x2)            (7, 7, 512)
      AFFINE (4096 units)   (4096, 1)
      AFFINE (4096 units)   (4096, 1)
      AFFINE (100 units)    (100, 1)
    Source: https://www.cs.toronto.edu/~frossard/post/vgg16/
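The (H, W) columns of the table on slide 33 follow from the standard output-size formula, out = (in + 2 * pad - kernel) // stride + 1. The sketch below assumes 3x3 convolutions with stride 1 and padding 1 (which keeps the feature map size) and 2x2 pooling with stride 2; the padding and strides are assumptions, since the table lists only kernel sizes and output shapes. `conv_out` and `vgg_shapes` are hypothetical helper names.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output length along one axis for a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1


def vgg_shapes(height=224, width=224):
    # (number of 3x3 convolutions, output channels) per block, read off the table.
    blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
    shapes = []
    h, w = height, width
    for n_convs, channels in blocks:
        for _ in range(n_convs):
            # Assumed: stride 1, padding 1, so (H, W) is retained.
            h, w = conv_out(h, 3, pad=1), conv_out(w, 3, pad=1)
            shapes.append(("CONV", (h, w, channels)))
        # Assumed: 2x2 pooling with stride 2 halves (H, W).
        h, w = conv_out(h, 2, stride=2), conv_out(w, 2, stride=2)
        shapes.append(("POOL", (h, w, channels)))
    return shapes


shapes = vgg_shapes()
for name, shape in shapes:
    print(name, shape)
```

The spatial size only shrinks at the pooling layers, while the number of feature maps grows from block to block, as described on slide 32.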
  • 34. [Figure: Input → Conv1 → Conv2 → Conv3 → Deconv1 → Deconv2 → Deconv3] • Can be trained to reconstruct meaningful variations • Have been used for image generation and object localization http://arxiv.org/abs/1411.5928 http://arxiv.org/abs/1412.6583 http://arxiv.org/abs/1505.04366
  • 35. Image classification • Image segmentation • Object localization • Video classification
  • 36. • neon provides optimized convolution kernels for Maxwell-based GPUs • It includes all the components needed to construct the example CNNs
  • 37. Using Convolutional Neural Networks to Identify Whales (3:00 pm – 3:25 pm)
  • 38. • https://www.kaggle.com/c/noaa-right-whale-recognition • Right whales have been photographed and tracked for over 10 years • ~4500 labeled images, ~450 whales • ~7000 test images
  • 39. • The whales all look very similar • The objects to identify are small against the background • Whales in the pictures have different orientations, and it is challenging to build in invariance to this much variation
  • 40. • How to go from the original photo to an image that is up close and orientation aligned? • Estimate the heading (angle) of the whale using a CNN? • The training set can be manually labeled to train a segmentation CNN • Apply the segmentation CNN to process and auto-align the test images • Apply the classification CNN to the pre-processed images
  • 42. [Figure: an input image with the network's prediction at epochs 0, 2, 4, and 6, shown next to the target]
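  • 43.
      from neon.callbacks.callbacks import Callbacks
      from neon.initializers import Gaussian
      from neon.layers import Conv, Deconv, GeneralizedCost
      from neon.models import Model
      from neon.optimizers import Adadelta
      from neon.transforms import Rectlin, SumSquared

      # train, val and args (data iterators and parsed arguments) are set up elsewhere.
      init = Gaussian(scale=0.1)
      opt = Adadelta(decay=0.9)
      common = dict(init=init, batch_norm=True, activation=Rectlin())

      layers = []
      nchan = 128
      # Strided convolution halves the spatial size.
      layers.append(Conv((2, 2, nchan), strides=2, **common))
      for idx in range(16):
          layers.append(Conv((3, 3, nchan), **common))
          if nchan > 16:
              nchan //= 2  # integer division keeps the channel count an int
      for idx in range(15):
          layers.append(Deconv((3, 3, nchan), **common))
      # Strided deconvolution restores the spatial size; the last layer outputs 1 channel.
      layers.append(Deconv((4, 4, nchan), strides=2, **common))
      layers.append(Deconv((3, 3, 1), init=init))

      cost = GeneralizedCost(costfunc=SumSquared())
      mlp = Model(layers=layers)
      callbacks = Callbacks(mlp, train, eval_set=val, **args.callback_args)
      mlp.fit(train, optimizer=opt, num_epochs=args.epochs, cost=cost, callbacks=callbacks)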
  • 45.
      from neon.callbacks.callbacks import Callbacks
      from neon.initializers import Gaussian
      from neon.layers import Affine, Conv, GeneralizedCost, Pooling
      from neon.models import Model
      from neon.optimizers import Adadelta
      from neon.transforms import CrossEntropyMulti, Rectlin, Softmax

      # train, val and args are set up elsewhere; DropoutBinary is used as on the
      # slide, which does not show where it is imported from.
      init = Gaussian(scale=0.01)
      opt = Adadelta(decay=0.9)
      common = dict(init=init, batch_norm=True, activation=Rectlin())

      layers = []
      nchan = 64
      layers.append(Conv((2, 2, nchan), strides=2, **common))
      for idx in range(6):
          if nchan > 1024:
              nchan = 1024  # cap the number of feature maps
          layers.append(Conv((3, 3, nchan), strides=1, **common))
          layers.append(Pooling(2, strides=2))
          nchan *= 2
      layers.append(DropoutBinary(keep=0.5))
      # One output per class, 447 in this script.
      layers.append(Affine(nout=447, init=init, activation=Softmax()))

      cost = GeneralizedCost(costfunc=CrossEntropyMulti())
      mlp = Model(layers=layers)
      callbacks = Callbacks(mlp, train, eval_set=val, **args.callback_args)
      mlp.fit(train, optimizer=opt, num_epochs=args.epochs, cost=cost, callbacks=callbacks)
  • 46. https://github.com/anlthms/whale-2015
  • 47. SegNet for Object Segmentation (3:25 pm – 3:50 pm)
  • 48. Source: http://mi.eng.cam.ac.uk/projects/segnet/
  • 49. • Uses the CamVid dataset: https://github.com/alexgkendall/SegNet-Tutorial/tree/master/CamVid • Converts the 1-channel target class images, which hold the ground-truth class of each pixel, into 12-channel images using a one-hot representation of each pixel's class • Takes about 12 GB of GPU memory • After 650 epochs of training, the network should reach ~9000 training cost and ~80% pixel classification accuracy.
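The target conversion described on slide 49 can be sketched as follows. This is an illustration only: the helper name `one_hot_image`, the channels-first layout, and the tiny 2x3 label map are assumptions, not the preprocessing code used in the example.

```python
def one_hot_image(class_map, num_classes=12):
    """Expand an H x W map of class indices into num_classes binary H x W planes."""
    height, width = len(class_map), len(class_map[0])
    channels = [[[0] * width for _ in range(height)] for _ in range(num_classes)]
    for y in range(height):
        for x in range(width):
            # Exactly one channel is switched on per pixel.
            channels[class_map[y][x]][y][x] = 1
    return channels


labels = [[0, 0, 3],
          [8, 3, 11]]          # hypothetical 2x3 ground-truth class indices
target = one_hot_image(labels)
print(len(target), len(target[0]), len(target[0][0]))  # 12 2 3
```

Each pixel then has a 12-element target vector with a single 1, which is the form a per-pixel classification cost expects.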
  • 52. [Figure: segmentation example with class legend: Sky, Building, Pole, Road, Sidewalk, Tree, Sign, Fence, Car, Pedestrian, Bicyclist, Unlabeled]
  • 53. [Figure: segmentation example with class legend: Sky, Building, Pole, Road, Sidewalk, Tree, Sign, Fence, Car, Pedestrian, Bicyclist, Unlabeled]
  • 54. Source at https://github.com/NervanaSystems/neon/tree/master/examples • Example 1: ./conv_autoencoder.py (conv-deconv network to reconstruct input images) • Example 2: ./cifar10_conv.py (ConvNet for the CIFAR-10 dataset)
  • 55. Recurrent Neural Networks (4:00 pm – 4:25 pm)
  • 57. Backpropagation through time: 1. Unroll the network across time steps. 2. Follow the back-propagated gradients. 3. Update weights with average gradients. [Figure: an encoder RNN with input x_t and hidden layers h_t, j_t, k_t, joined by feed-forward and recurrent weights, and the same network unrolled over time steps 1 to n with gradients flowing back]
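The three steps on slide 57 can be sketched for the smallest possible case: a scalar RNN with a single hidden unit and a loss on the last time step. Everything here is an assumption made for illustration (the tanh update, the squared-error loss, the names `forward`, `loss`, `bptt`); it is not the network from the slide. The analytic gradients are compared against finite differences.

```python
import math


def forward(w, u, xs, h0=0.0):
    """Step 1: unroll h_t = tanh(w * h_{t-1} + u * x_t); returns [h_0, ..., h_n]."""
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs


def loss(w, u, xs, target):
    return 0.5 * (forward(w, u, xs)[-1] - target) ** 2


def bptt(w, u, xs, target):
    """Step 2: follow the gradient backwards through the unrolled steps."""
    hs = forward(w, u, xs)
    grad_w, grad_u = 0.0, 0.0
    dh = hs[-1] - target                   # dLoss/dh_n
    for t in range(len(xs), 0, -1):
        da = dh * (1.0 - hs[t] ** 2)       # back through tanh
        grad_w += da * hs[t - 1]           # shared recurrent weight
        grad_u += da * xs[t - 1]           # shared feed-forward weight
        dh = da * w                        # pass the gradient to step t - 1
    return grad_w, grad_u


xs = [0.5, -1.0, 0.25, 0.8]
w, u, target = 0.7, -0.3, 0.2
grad_w, grad_u = bptt(w, u, xs, target)

# Finite-difference check of both gradients.
eps = 1e-6
numeric_w = (loss(w + eps, u, xs, target) - loss(w - eps, u, xs, target)) / (2 * eps)
numeric_u = (loss(w, u + eps, xs, target) - loss(w, u - eps, xs, target)) / (2 * eps)
print(abs(grad_w - numeric_w) < 1e-6, abs(grad_u - numeric_u) < 1e-6)

# Step 3: a gradient-descent update, here using the gradient averaged over time steps.
lr = 0.1
w_new = w - lr * grad_w / len(xs)
```

Because the same weights are used at every time step, their gradient is accumulated over all steps before the update.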
  • 58. Network activations determine the states of the input, forget, and output gates: • Open input, open output, closed forget: LSTM network acts like a standard RNN • Closing input, opening forget: Memory cell recalls previous state, new input is ignored • Closing output: Internal state is stored for the next time step without producing any output [Figure: LSTM units with f, g, i, o, memory cell c and output h_t between the input and hidden layers, connected by feed-forward (FF) and recurrent weights]
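The three gate settings on slide 58 can be tried out on a single LSTM memory cell. In this sketch the gate values are set by hand to 0 or 1, whereas in a real LSTM they are sigmoid outputs computed from the network activations; the function name `lstm_step` and the numbers are hypothetical. Following the slide's wording, a "closed" forget gate is taken to mean that the old cell state is not carried over (f = 0).

```python
import math


def lstm_step(c_prev, g, i, f, o):
    """One memory-cell update: g is the candidate input, i/f/o the gate values."""
    c = f * c_prev + i * g      # keep part of the old state, add gated new input
    h = o * math.tanh(c)        # gated output
    return c, h


c_prev, g = 0.9, 0.4

# Open input, open output, closed forget: behaves like a standard RNN unit.
c_rnn, h_rnn = lstm_step(c_prev, g, i=1.0, f=0.0, o=1.0)

# Closed input, open forget: the cell recalls its previous state, new input is ignored.
c_keep, h_keep = lstm_step(c_prev, g, i=0.0, f=1.0, o=1.0)

# Closed output: the state is stored for the next step, but nothing is emitted.
c_silent, h_silent = lstm_step(c_prev, g, i=1.0, f=1.0, o=0.0)

print(c_rnn, c_keep, h_silent)
```

In the first case the cell state depends only on the new input, in the second it equals the previous state, and in the third the output is zero while the state is still updated.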
  • 59. • neon supports a wide range of recurrent layers • Connectivity between recurrent and feed-forward layers • Deep and bi-directional RNNs • Containers for Encoders, Decoders, Sequence to Sequence models [Figure labels: recurrent output layers, standard recurrent layers, bidirectional recurrent layers]
  • 60. • Simple RNN example: neon/examples/char_rnn.py • Penn Tree Bank text dataset: Learn to predict text, one letter at a time. • Small enough to run on your laptop, right now (should take about 4 minutes per epoch of training on a laptop CPU). • LSTM example: text_generation_lstm.py • Generate Shakespeare-style text [Figure: example script annotated with its sections: Backend, Hyper-parameters, Network Layers, Dataset, Cost Function, Optimizer, Fitting the model]
  • 61. Other RNN examples you can try (source at https://github.com/anlthms/meetup2/blob/master): • Example 1: ./rnn1.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v (Does not work well! This example demonstrates the challenges of training RNNs) • Example 2: ./rnn2.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v (Uses Glorot init, gradient clipping, Adagrad and LSTM) • Example 3: ./rnn3.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v (Uses multiple bi-RNN layers) • Warning: Large dataset, please do not download over ODSC WiFi.
  • 62. Image Captioning • Speech Recognition • Machine Translation • Time Series Analysis
  • 63. Using Recurrent Neural Networks for Whale Call Classification (4:25 pm – 4:50 pm)
  • 64. • Whale Detection challenge from Kaggle: https://www.kaggle.com/c/whale-detection-challenge • Identify calls by Right Whales based on their signature chirp sound • 30,000 training clips, each 2 s long, sampled at 2 kHz
  • 65. • Processing the data: work in the spectrogram domain, with 81 frequencies and 49 time steps • The neon dataloader has built-in audio processing tools • This essentially transforms the sound into an “image” we can apply ConvNet tools to [Figure: whale call spectrogram]
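To make the 81 x 49 spectrogram shape on slide 65 concrete, here is a plain-Python short-time Fourier transform sketch. The window length (160 samples) and hop (80 samples) are assumptions: they are not stated on the slides, but for a 2 s clip at 2 kHz they happen to give 81 frequency bins and 49 frames. The Hann window, the synthetic 200 Hz test tone, and the function name `spectrogram` are also illustrative choices, and this is not the neon dataloader's audio pipeline.

```python
import cmath
import math


def spectrogram(samples, nfft=160, hop=80):
    """Magnitude spectrogram as a list of n_freqs rows, each with n_frames values."""
    n_frames = (len(samples) - nfft) // hop + 1
    n_freqs = nfft // 2 + 1
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / nfft) for n in range(nfft)]
    # Precomputed complex exponentials for the DFT.
    twiddle = [cmath.exp(-2j * math.pi * k / nfft) for k in range(nfft)]
    spec = [[0.0] * n_frames for _ in range(n_freqs)]
    for t in range(n_frames):
        start = t * hop
        frame = [s * w for s, w in zip(samples[start:start + nfft], window)]
        for k in range(n_freqs):
            acc = sum(frame[n] * twiddle[(k * n) % nfft] for n in range(nfft))
            spec[k][t] = abs(acc)
    return spec


rate = 2000                                   # 2 kHz sampling rate, as on the slide
clip = [math.sin(2 * math.pi * 200 * n / rate) for n in range(2 * rate)]  # 2 s test tone
spec = spectrogram(clip)
print(len(spec), len(spec[0]))  # 81 49

# With these settings each bin is 12.5 Hz wide, so the 200 Hz tone peaks in bin 16.
peak_bin = max(range(len(spec)), key=lambda k: spec[k][0])
```

The resulting grid of frequencies by time steps is what the convolutional layers treat as an image.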
  • 66. Network: • Spectrograms are 81 x 49 “pixel” images • Apply convolutional layers to obtain a 37x10 feature map of depth 512 • RNN layers are applied to 10 time steps of the 37*512-D stack • Feed the last time step into a binary classifier with Softmax • Conv layers have ReLU activations and use Batch Normalization. Training: • Optimization with AdaDelta • Initialization with Gaussian noise • This model is not very deep, so it is not challenging to train
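The hand-off from convolutional to recurrent layers on slide 66 amounts to a reshape: the 512 feature maps of size 37 x 10 become 10 time steps, each a vector of 37 * 512 = 18944 features. The sketch below shows that regrouping on nested lists; the (channels, frequency, time) layout and the helper name `to_time_major` are assumptions, not neon's internal tensor layout.

```python
def to_time_major(feature_map):
    """Regroup a (channels, freq, time) feature map into one vector per time step."""
    channels = len(feature_map)
    freqs = len(feature_map[0])
    steps = len(feature_map[0][0])
    return [
        [feature_map[c][f][t] for c in range(channels) for f in range(freqs)]
        for t in range(steps)
    ]


# Dummy feature map with the shape from the slide: depth 512, 37 x 10.
feature_map = [[[0.0] * 10 for _ in range(37)] for _ in range(512)]
sequence = to_time_major(feature_map)
print(len(sequence), len(sequence[0]))  # 10 18944
```

The recurrent layers then read this sequence one time step at a time.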
  • 67. Network Architecture. Inspired by DeepSpeech 2 (Baidu): convolution + recurrent layers. [Figure, bottom to top: Spectrogram, Conv0, Conv1 (BN), Pool, Conv2 (BN), BiRNN, BiRNN, BiRNN, Linear (BN), Class Label; a block labeled MAX also appears]
  • 68. Network Architecture: main Python script. Full source at https://github.com/NervanaSystems/neon/blob/master/examples/whale_calls.py
  • 69. Running the example. Command line:
      ./whale_calls.py -e 16 -r 0 -s whales.pkl -v -w /home/ubuntu/nervana/wdc
    Output:
      Convolution Layer 'Convolution_0': 1 x (81x49) inputs, 128 x (79x24) outputs
      Activation Layer 'Convolution_0_Rectlin': Rectlin
      Convolution Layer 'Convolution_1': 128 x (79x24) inputs, 256 x (77x22) outputs
      BatchNorm Layer 'Convolution_1_bnorm': 433664 inputs, 1 steps, 256 feature maps
      Activation Layer 'Convolution_1_Rectlin': Rectlin
      Pooling Layer 'Pooling_0': 256 x (77x22) inputs, 256 x (38x11) outputs
      Convolution Layer 'Convolution_2': 256 x (38x11) inputs, 512 x (37x10) outputs
      BatchNorm Layer 'Convolution_2_bnorm': 189440 inputs, 1 steps, 512 feature maps
      Activation Layer 'Convolution_2_Rectlin': Rectlin
      BiRNN Layer 'BiRNN_0': 18944 inputs, (256 outputs) * 2, 10 steps
      BiRNN Layer 'BiRNN_1': (256 inputs) * 2, (256 outputs) * 2, 10 steps
      BiRNN Layer 'BiRNN_2': (256 inputs) * 2, (256 outputs) * 2, 10 steps
      RecurrentOutput choice RecurrentLast : (512, 10) inputs, 512 outputs
      Linear Layer 'Linear_0': 512 inputs, 32 outputs
      BatchNorm Layer 'Linear_0_bnorm': 32 inputs, 1 steps, 32 feature maps
      Activation Layer 'Linear_0_Rectlin': Rectlin
      Linear Layer 'Linear_1': 32 inputs, 2 outputs
      Activation Layer 'Linear_1_Softmax': Softmax
  • 70. Neural Machine Translation (5:00 pm – 5:25 pm)
  • 71. Image: Kyunghyun Cho
  • 72. • Data is tokenized and mapped to a dictionary • One-hot encoding: Each word is a category • No fixed mapping between words. Input sentence: “De nouvelles règles sur les transferts de données pour une coopération policière plus efficace” → 00001000000000, 00000000001000, 00100000000000, 00000000100000, … Output sentence: “New rules on data transfers to ensure smoother police cooperation” → 0000000010, 0100000000, 0001000000, 0000001000, …
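The tokenize, map to a dictionary, and one-hot encode steps on slide 72 can be sketched as below. This is a toy version under stated assumptions: whitespace tokenization, lower-casing, and a dictionary built from the one example sentence, so the index assigned to each word differs from the bit patterns on the slide. The helper names are hypothetical.

```python
def build_vocab(sentences):
    """Assign each distinct token the next free integer index."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def one_hot(index, size):
    return [1 if i == index else 0 for i in range(size)]


def encode(sentence, vocab):
    """Turn a sentence into one one-hot vector per token."""
    return [one_hot(vocab[token], len(vocab)) for token in sentence.lower().split()]


target_sentence = "New rules on data transfers to ensure smoother police cooperation"
vocab = build_vocab([target_sentence])
encoded = encode(target_sentence, vocab)

for token, vector in zip(target_sentence.split(), encoded):
    print("".join(str(bit) for bit in vector), token)
```

Each word becomes its own category, so the vectors carry no information about which words are similar.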
  • 73. Sequence to Sequence Model [Figure: an ENCODER RNN (layers h, j, k unrolled over inputs x1 to xn) reads “the cat is” and produces an encoding; a DECODER RNN, fed “le chat”, outputs “le chat est” (y1 to yn). Legend: recurrent weights, feed-forward weights, embedding weights]
  • 74. neon layer configuration: • Encoder and Decoder are layer containers in neon • Each container holds a stack of GRU-type recurrent layers • A Seq2Seq container trains the Encoder and Decoder together • The (correct) previous word is fed as input to the decoder • LookupTable
  • 75. • Toy model with a vocabulary of 16,384 words • Word embedding of 1024 dimensions • 2 hidden layers with 512 GRU units • Model trained with Cross-Entropy cost using RMSProp
    Network Layers:
      Seq2Seq
        LookupTable Layer : 20 inputs, (512, 20) outputs size
        Recurrent Layer 'GRU1Enc': 512 inputs, 512 outputs, 20 steps
        Recurrent Layer 'GRU1Enc': 512 inputs, 512 outputs, 20 steps
        LookupTable Layer : 20 inputs, (512, 20) outputs size
        Recurrent Layer 'GRU1Dec': 512 inputs, 512 outputs, 20 steps
        Recurrent Layer 'GRU1Dec': 512 inputs, 512 outputs, 20 steps
        Linear Layer 'Affine': 512 inputs, 16384 outputs
        Bias Layer 'Affine_bias': size 16384
        Activation Layer 'Affine_Softmax': Softmax
  • 76. BLEU Score (bilingual evaluation understudy): • Compare n-grams with one or multiple references • Modified form of precision, with additional penalties. Example: Candidate: “on the mat there is a cat” • Reference 1: “the cat is on the mat” • Reference 2: “there is a cat on the mat”. Beam Search: • Greedy-style heuristic search to obtain output sequences • Not perfect, so NMT systems are often used for rescoring [Figure: beam search starting from “on the mat”, with candidate next words and scores at successive steps: is, there, a, … (0.1, 0.5, 0.03); is, cat, a, … (0.3, 0.07, 0.05); is, the, a, … (0.01, 0.2, 0.02)]
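The "modified form of precision" behind BLEU can be illustrated with the candidate and references from slide 76. The sketch computes only the clipped n-gram precision for one n; the full BLEU score additionally combines several n-gram orders and applies a brevity penalty, which are left out here. The function names are hypothetical and tokenization is a plain whitespace split.

```python
from collections import Counter


def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram counts at most as often
    as it appears in the reference that contains it most often."""
    cand_counts = Counter(ngrams(candidate.split(), n))
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref.split(), n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


candidate = "on the mat there is a cat"
references = ["the cat is on the mat", "there is a cat on the mat"]

p1 = modified_precision(candidate, references, 1)
p2 = modified_precision(candidate, references, 2)
print(p1, p2)  # 1.0 0.8333333333333334
```

Every word of the candidate appears in a reference, so the unigram precision is 1.0, but the bigram “mat there” appears in neither reference, which lowers the bigram precision to 5/6.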
  • 77. Finding the Right Deep Learning Framework For You (5:25 pm – 5:50 pm)
  • 78. [Figure credits: Krizhevsky, 2012; Kendall et al., 2016; Amodei et al., 2015]
  • 79.
      Layers: Linear, Convolution, Pooling, Deconvolution, Dropout, Recurrent, Long Short-Term Memory, Gated Recurrent Unit, BatchNorm, LookupTable, Local Response Normalization, Bidirectional-RNN, Bidirectional-LSTM
      Backend: NervanaGPU, NervanaCPU, NervanaMGPU
      Datasets: MNIST, CIFAR-10, Imagenet 1K, PASCAL VOC, Mini-Places2, IMDB, Penn Treebank, Shakespeare Text, bAbI, Hutter-prize, UCF101, flickr8k, flickr30k, COCO
      Initializers: Constant, Uniform, Gaussian, Glorot Uniform, Xavier, Kaiming, IdentityInit, Orthonormal
      Optimizers: Gradient Descent with Momentum, RMSProp, AdaDelta, Adam, Adagrad, MultiOptimizer
      Activations: Rectified Linear, Softmax, Tanh, Logistic, Identity, ExpLin
      Costs: Binary Cross Entropy, Multiclass Cross Entropy, Sum of Squares Error
      Metrics: Misclassification (Top1, TopK), LogLoss, Accuracy, PrecisionRecall, ObjectDetection
  • 80. [Table: neon, Theano, Caffe, Torch, and TensorFlow compared on academic research, bleeding-edge, curated models, iteration time, inference speed, package ecosystem, and support]
  • 81. Third-party (Facebook) benchmarking
  • 83. • github.com/NervanaSystems/ModelZoo • model files, parameters
  • 84. Nervana’s deep learning tutorials: https://www.nervanasys.com/deep-learning-tutorials/ • GitHub page: https://github.com/NervanaSystems/neon • For more information, contact: info@nervanasys.com
  • 85. THANK YOU! QUESTIONS?