5. CONVOLUTIONAL NETWORKS
• LeNet - Yann LeCun, 1998
• Required very high computational resources by 1998 standards
27.07.16 5
9. CS231n Convolutional Neural Networks for Visual Recognition
POOLING
• The pooling layer downsamples the volume spatially, independently in each depth slice of the
input volume. Left: in this example, an input volume of size [224x224x64] is pooled with
filter size 2, stride 2 into an output volume of size [112x112x64]. Notice that the volume
depth is preserved. Right: the most common downsampling operation is max, giving rise
to max pooling, here shown with a stride of 2. That is, each max is taken over 4 numbers
(a little 2x2 square).
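The max pooling operation described above can be sketched directly in NumPy. This is a minimal illustrative implementation (the function name and loop structure are my own, not from the slides): it takes the max over each little 2x2 square, independently per depth slice, so an [224x224x64] input becomes [112x112x64].

```python
import numpy as np

def max_pool_2x2(volume):
    """Max-pool an (H, W, D) volume with a 2x2 filter and stride 2.
    Each depth slice is pooled independently; depth is preserved."""
    h, w, d = volume.shape
    out = np.zeros((h // 2, w // 2, d))
    for i in range(0, h - 1, 2):
        for j in range(0, w - 1, 2):
            # each output value is the max over a little 2x2 square
            out[i // 2, j // 2] = volume[i:i + 2, j:j + 2].max(axis=(0, 1))
    return out

x = np.random.rand(224, 224, 64)
print(max_pool_2x2(x).shape)  # (112, 112, 64)
```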
11. Zeiler and Fergus 2013, "Visualizing and Understanding Convolutional Networks".
CNN FEATURES
• Learned CNN features.
• The features become more extended and complex deeper in the network.
12. Zeiler and Fergus 2013, "Visualizing and Understanding Convolutional Networks".
CNN FEATURES
• Learned CNN features.
13. Zeiler and Fergus 2013, "Visualizing and Understanding Convolutional Networks".
CNN FEATURES
15. NATURAL IMAGE CLASSIFICATION - ImageNet
• Alex Krizhevsky, 2012
• 1.2M training images, 1000 classes
• Scored a 15.3% top-5 error rate on the classification task, versus 26.2% for the second-best entry
• CNNs trained with GPUs
• Demo
16. Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks
[Oquab et al. CVPR 2014]
TRANSFER LEARNING, IMAGE SEARCH
17. Deep Neural Networks for Object Detection, 2013
OBJECT DETECTION
18. Deep Visual-Semantic Alignments for Generating Image Descriptions [Andrej Karpathy, Li Fei-Fei]
IMAGE CAPTIONING
• A large CNN for object detection in the photographs
• A recurrent neural network such as an LSTM to turn the labels into a coherent sentence
• Web demo
19. Image taken from Richard Zhang, Phillip Isola and Alexei A. Efros.
IMAGE COLORIZATION
• DEMO, VIDEO
20. RECURRENT NEURAL NETWORKS
• Sequence processing, memory
• A chunk of the network, A, looks at some input x_t and outputs a value h_t
• A loop allows information to be passed from one step of the network to the next
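The recurrent loop can be sketched as a few lines of NumPy. This is an illustrative toy (the weight shapes, initialization, and sequence length are invented for the example): at each step the same cell reads the input x_t together with the previous state, and the value h_t it outputs is passed on to the next step.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b):
    # the cell looks at input x_t and the previous state h_prev,
    # and outputs the new value h_t
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b)

rng = np.random.default_rng(0)
W_xh = rng.normal(size=(3, 4)) * 0.1   # input-to-hidden weights
W_hh = rng.normal(size=(4, 4)) * 0.1   # hidden-to-hidden weights (the loop)
b = np.zeros(4)

h = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):    # a sequence of 5 inputs
    h = rnn_step(x_t, h, W_xh, W_hh, b)  # information flows step to step
print(h.shape)  # (4,)
```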
21. RNN
• Problem: long-term dependencies
• "I grew up in France… I speak fluent French."
28. t-SNE visualizations of word embeddings. Left: Number Region; Right: Jobs Region. From Turian et al. (2010).
WORD EMBEDDINGS
• A word embedding is a parameterized function mapping words in some language
to high-dimensional vectors (perhaps 200 to 500 dimensions).
• W("cat") = (0.2, -0.4, 0.7, ...)
• W("table") = (0.0, 0.6, -0.1, ...)
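In code, the parameterized function W is just a lookup into a learned table of vectors. A minimal sketch, assuming a toy two-word vocabulary and a random table (real embeddings are learned from data; the 300-dimension size here is illustrative, within the 200–500 range mentioned above):

```python
import numpy as np

# toy vocabulary; a real model would cover the whole language
vocab = {"cat": 0, "table": 1}

# embedding table: one 300-dimensional vector per word
# (random here -- in practice these parameters are learned)
E = np.random.default_rng(0).normal(size=(len(vocab), 300))

def W(word):
    """Map a word to its high-dimensional vector."""
    return E[vocab[word]]

print(W("cat").shape)  # (300,)
```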
29. What words have embeddings closest to a given word? From Collobert et al. (2011)
WORD EMBEDDINGS
30. From Mikolov et al. (2013a)
WORD EMBEDDINGS
• Relationship
• W("woman") − W("man") ≃ W("aunt") − W("uncle")
• W("woman") − W("man") ≃ W("queen") − W("king")
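The relationships above mean the difference vectors are nearly parallel. A toy sketch with hand-picked 2-D vectors (invented for illustration; real embeddings learn this structure from data, and the offsets are only approximately equal):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# toy embeddings constructed so the "gender offset" is shared
W = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
    "king":  np.array([3.0, 0.1]),
    "queen": np.array([3.0, 1.1]),
}

# W("woman") - W("man") ~= W("queen") - W("king")
offset_gender = W["woman"] - W["man"]
offset_royal = W["queen"] - W["king"]
print(cosine(offset_gender, offset_royal))  # close to 1.0
```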
31. Relationship pairs in a word embedding. From Mikolov et al. (2013b).
WORD EMBEDDINGS
32. Introduction to Neural Machine Translation with GPUs
NEURAL MACHINE TRANSLATION
34. NEURAL MACHINE TRANSLATION (ENCODER)
• Step 2: A one-hot vector to a continuous-space representation.
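Step 2 can be shown in a couple of lines: multiplying a one-hot vector by an embedding matrix simply selects one row of it, giving the continuous-space representation of the word (vocabulary size and dimensions below are toy values for illustration):

```python
import numpy as np

vocab_size, emb_dim = 10, 4  # illustrative sizes
E = np.random.default_rng(0).normal(size=(vocab_size, emb_dim))

word_id = 3
one_hot = np.zeros(vocab_size)
one_hot[word_id] = 1.0

# one-hot x embedding matrix == picking row `word_id` of E
continuous = one_hot @ E
print(np.allclose(continuous, E[word_id]))  # True
```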
35. NEURAL MACHINE TRANSLATION (ENCODER)
• Step 3: Sequence summarization by a recurrent neural network.
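A rough sketch of step 3, assuming toy sizes and random weights (a real encoder would use a trained gated RNN such as a GRU or LSTM): the recurrent network reads the continuous word vectors one by one, and its final hidden state serves as the summary vector for the whole sentence.

```python
import numpy as np

rng = np.random.default_rng(1)
emb_dim, hidden = 4, 6                       # illustrative sizes
W_xh = rng.normal(size=(emb_dim, hidden)) * 0.1
W_hh = rng.normal(size=(hidden, hidden)) * 0.1

def summarize(sentence_vectors):
    """Read the sequence of word vectors; return the final hidden state."""
    h = np.zeros(hidden)
    for x_t in sentence_vectors:
        h = np.tanh(x_t @ W_xh + h @ W_hh)
    return h  # the sentence's summary vector

sentence = rng.normal(size=(7, emb_dim))     # 7 embedded words
print(summarize(sentence).shape)  # (6,)
```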
36. [Sutskever et al., 2014].
NEURAL MACHINE TRANSLATION
• 2-D Visualization of Sentence Representations. Similar sentences are close
together in summary-vector space.
37. NEURAL MACHINE TRANSLATION (DECODER)
• Step 1: Computing the internal hidden state of the decoder.
39. NEURAL MACHINE TRANSLATION (DECODER)
• Step 3: Sampling the next word. Repeat until the end-of-sentence word (<eos>) is selected.
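The sampling loop of step 3 can be sketched as follows. The vocabulary, weights, and hidden-state update below are toy stand-ins (a real decoder recomputes its hidden state from the previously sampled word, per step 1): a softmax turns the hidden state into word probabilities, one word is sampled, and the loop repeats until <eos> is drawn.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["<eos>", "the", "cat", "sat"]       # toy vocabulary
W_out = rng.normal(size=(5, len(vocab)))     # hidden-to-vocab weights

def softmax(z):
    e = np.exp(z - z.max())                  # shift for numerical stability
    return e / e.sum()

def sample_sentence(max_len=20):
    words = []
    h = rng.normal(size=5)                   # stand-in for the decoder state
    for _ in range(max_len):
        p = softmax(h @ W_out)               # word probabilities
        word = rng.choice(vocab, p=p)        # sample the next word
        if word == "<eos>":                  # stop at end-of-sentence
            break
        words.append(str(word))
        h = rng.normal(size=5)               # a real decoder updates h here
    return words

print(sample_sentence())
```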
40. REINFORCEMENT LEARNING
• a set of environment states S;
• a set of actions A;
• rules of transitioning between states;
• rules that determine the scalar immediate reward of a transition;
• rules that describe what the agent observes.
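The five ingredients above can be sketched as a toy fully-observable environment (every state, action, rule, and reward value here is invented for illustration):

```python
# a set of environment states S and a set of actions A
S = ["start", "goal"]
A = ["stay", "move"]

def transition(state, action):
    """Rules of transitioning between states."""
    return "goal" if action == "move" else state

def reward(state, action, next_state):
    """Rules determining the scalar immediate reward of a transition."""
    return 1.0 if next_state == "goal" and state != "goal" else 0.0

def observe(state):
    """Rules describing what the agent observes (fully observable here)."""
    return state

# one short episode
state = "start"
total = 0.0
for action in ["stay", "move"]:
    nxt = transition(state, action)
    total += reward(state, action, nxt)
    state = nxt
print(observe(state), total)  # goal 1.0
```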