SlideShare une entreprise Scribd logo
1  sur  100
Télécharger pour lire hors ligne
[course site]
Xavier Giro-i-Nieto
xavier.giro@upc.edu
Associate Professor
Universitat Politecnica de Catalunya
Technical University of Catalonia
Unsupervised Learning
#DLUPC
2
Acknowledegments
Kevin McGuinness, “Unsupervised Learning” Deep Learning for Computer Vision.
[Slides 2016] [Slides 2017]
3
Outline
1. Motivation
2. Unsupervised Learning
3. Predictive Learning
4. Self-supervised Learning
4
Outline
1. Motivation
2. Unsupervised Learning
3. Predictive Learning
4. Self-supervised Learning
5
Motivation
6
Motivation
7
Greff, Klaus, Antti Rasmus, Mathias Berglund, Tele Hao, Harri Valpola, and Juergen Schmidhuber. "Tagger: Deep unsupervised
perceptual grouping." NIPS 2016 [video] [code]
Yann Lecun’s Black Forest cake
8
Motivation
1.
= ƒ( )
ƒ( )
= ƒ( )
9
Predict label y corresponding to
observation x
Estimate the distribution of
observation x
Predict action y based on
observation x, to maximize a future
reward z
Motivation
1.
= ƒ( )
ƒ( )
= ƒ( )
10
Motivation
Unsupervised Learning
11
Why Unsupervised Learning?
● It is the nature of how intelligent beings percept
the world.
● It can save us tons of efforts to build a
human-alike intelligent agent compared to a
totally supervised fashion.
● Vast amounts of unlabelled data.
12
Outline
1. Motivation
2. Unsupervised Learning
3. Predictive Learning
4. Self-supervised Learning
Max margin decision boundary
13
How data distribution P(x) influences decisions (1D)
Slide credit: Kevin McGuinness (DLCV UPC 2017)
Semi supervised decision
boundary
14
How data distribution P(x) influences decisions (1D)
Slide credit: Kevin McGuinness (DLCV UPC 2017)
15
How data distribution P(x) influences decisions (2D)
Slide credit: Kevin McGuinness (DLCV UPC 2017)
16
How data distribution P(x) influences decisions (2D)
Slide credit: Kevin McGuinness (DLCV UPC 2017)
How P(x) is valuable for naive Bayesian classifier
●
●
●
17
X: Data
Y: Labels
Slide credit: Kevin McGuinness (DLCV UPC 2017)
x1
x2
Not linearly separable in
this 2D space :(
18
How clustering is valueble for linear classifiers
Slide credit: Kevin McGuinness (DLCV UPC 2017)
x1
x2
Cluster 1 Cluster 2
Cluster 3
Cluster 4
1 2 3 4
1 2 3 4
4D BoW representation
Separable with a
linear classifiers
in a 4D space !
19
How clustering is valueble for linear classifiers
Slide credit: Kevin McGuinness (DLCV UPC 2017)
Example from Kevin McGuinness Juyter notebook.
20
How clustering is valueble for linear classifiers
21
Assumptions for unsupervised learning
Slide credit: Kevin McGuinness (DLCV UPC 2017)
●
○
●
○
●
○
22
Assumptions for unsupervised learning
Slide credit: Kevin McGuinness (DLCV UPC 2017)
●
○
○
●
○
○
○
○
●
○
●
○
○
○
○
●
○
○
○
○
23
Assumptions for unsupervised learning
Slide credit: Kevin McGuinness (DLCV UPC 2017)
●
○
○
●
○
○
○
○
●
○
●
○
○
○
○
●
○
○
○
○
24
Slide credit: Kevin McGuinness (DLCV UPC 2017)
●
●
○
●
○
The manifold hypothesis
25
The manifold hypothesis
Slide credit: Kevin McGuinness (DLCV UPC 2017)
x1
x2
Linear manifold
wT
x + b
x1
x2
Non-linear
manifold
26
The manifold hypothesis: Energy-based models
●
●
●
●
●
LeCun et al, A Tutorial on Energy-Based Learning, Predicting Structured Data, 2006 http://yann.lecun.com/exdb/publis/pdf/lecun-06.pdf
27
Autoencoder (AE)
F. Van Veen, “The Neural Network Zoo” (2016)
28
Autoencoder (AE)
“Deep Learning Tutorial”, Dept. Computer Science, Stanford
Autoencoders:
● Predict at the output the
same input data.
● Do not need labels:
29
Autoencoder (AE)
“Deep Learning Tutorial”, Dept. Computer Science, Stanford
Dimensionality reduction:
● Use hidden layer as
a feature extractor of
any desired size.
30
Autoencoder (AE)
Encoder
W1
Decoder
W2
hdata reconstruction
Loss
(reconstruction error)
Latent variables
(representation/features)
Figure: Kevin McGuinness (DLCV UPC 2017)
Pretraining:
1. Initialize a NN solving an
autoencoding problem.
31
Autoencoder (AE)
Figure: Kevin McGuinness (DLCV UPC 2017)
Encoder
W1
hdata Classifier
WC
Latent variables
(representation/features)
prediction
y Loss
(cross entropy)
Pretraining:
1. Initialize a NN solving an
autoencoding problem.
2. Train for final task with “few” labels.
32
Autoencoder (AE)
Deep Learning TV, “Autoencoders - Ep. 10”
33F. Van Veen, “The Neural Network Zoo” (2016)
Restricted Boltzmann Machine (RBM)
34
Restricted Boltzmann Machine (RBM)
Figure: Geoffrey Hinton (2013)
Salakhutdinov, Ruslan, Andriy Mnih, and Geoffrey Hinton. "Restricted Boltzmann machines for
collaborative filtering." Proceedings of the 24th international conference on Machine learning. ACM, 2007.
● Shallow two-layer net.
● Restricted = No two nodes in a layer share a
connection
●
● Bipartite graph.
● Bidirectional graph
○ Shared weights.
○ Different biases.
35
Restricted Boltzmann Machine (RBM)
DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”.
Forward
pass
36
Restricted Boltzmann Machine (RBM)
DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”.
Forward
pass
37
Restricted Boltzmann Machine (RBM)
DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”.
Forward
pass
38
Restricted Boltzmann Machine (RBM)
DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”.
Backward
pass
39
Restricted Boltzmann Machine (RBM)
DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”.
Backward
pass
40
Restricted Boltzmann Machine (RBM)
DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”.
Backward
pass
The reconstructed values at
the visible layer are
compared with the actual
ones with the KL
Divergence.
41
Restricted Boltzmann Machine (RBM)
Figure: Geoffrey Hinton (2013)
Salakhutdinov, Ruslan, Andriy Mnih, and Geoffrey Hinton. "Restricted Boltzmann machines for
collaborative filtering." Proceedings of the 24th international conference on Machine learning. ACM, 2007.
42
Restricted Boltzmann Machine (RBM)
DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”.
RBMs are a specific type of
autoencoder.
Unsupervised
learning
43
Restricted Boltzmann Machine (RBM)
Deep Learning TV, “Restricted Boltzmann Machines”.
44
Deep Belief Networks (DBN)
F. Van Veen, “The Neural Network Zoo” (2016)
45
Deep Belief Networks (DBN)
Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief
nets." Neural computation 18, no. 7 (2006): 1527-1554.
● Architecture like an MLP.
● Training as a stack of
RBMs.
46
Deep Belief Networks (DBN)
● Architecture like an MLP.
● Training as a stack of
RBMs.
Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief
nets." Neural computation 18, no. 7 (2006): 1527-1554.
47
Deep Belief Networks (DBN)
Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief
nets." Neural computation 18, no. 7 (2006): 1527-1554.
● Architecture like an MLP.
● Training as a stack of
RBMs.
48
Deep Belief Networks (DBN)
Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief
nets." Neural computation 18, no. 7 (2006): 1527-1554.
● Architecture like an MLP.
● Training as a stack of
RBMs.
49
Deep Belief Networks (DBN)
Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief
nets." Neural computation 18, no. 7 (2006): 1527-1554.
● Architecture like an MLP.
● Training as a stack of
RBMs…
● ...so they do not need
labels:
Unsupervised
learning
50
Deep Belief Networks (DBN)
Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief
nets." Neural computation 18, no. 7 (2006): 1527-1554.
After the DBN is trained, it can
be fine-tuned with a reduced
amount of labels to solve a
supervised task with superior
performance.
Supervised
learning
Softmax
51
Deep Belief Networks (DBN)
Deep Learning TV, “Deep Belief Nets”
52
Deep Belief Networks (DBN)
Geoffrey Hinton, "Introduction to Deep Learning & Deep Belief Nets” (2012)
Geoorey Hinton, “Tutorial on Deep Belief Networks”. NIPS 2007.
53
Deep Belief Networks (DBN)
Geoffrey Hinton
54
Outline
1. Motivation
2. Unsupervised Learning
3. Predictive Learning
4. Self-supervised Learning
Acknowledgments
55
Junting Pan Xunyu Lin
Unsupervised Feature Learning
56
Slide credit:
Yann LeCun
Unsupervised Feature Learning
57
Slide credit:
Yann LeCun
Slide credit: Junting Pan
Frame Reconstruction & Prediction
59
Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov. "Unsupervised Learning of Video
Representations using LSTMs." In ICML 2015. [Github]
Unsupervised feature learning (no labels) for...
Frame Reconstruction & Prediction
60
Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov. "Unsupervised Learning of Video
Representations using LSTMs." In ICML 2015. [Github]
Unsupervised feature learning (no labels) for...
...frame prediction.
Frame Reconstruction & Prediction
61
Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov. "Unsupervised Learning of Video
Representations using LSTMs." In ICML 2015. [Github]
Unsupervised feature learning (no labels) for...
...frame prediction.
62
Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov. "Unsupervised Learning of Video
Representations using LSTMs." In ICML 2015. [Github]
Unsupervised learned features (lots of data) are
fine-tuned for activity recognition (little data).
Frame Reconstruction & Prediction
Frame Prediction
63
Ranzato, MarcAurelio, Arthur Szlam, Joan Bruna, Michael Mathieu, Ronan Collobert, and Sumit Chopra. "Video (language)
modeling: a baseline for generative models of natural videos." arXiv preprint arXiv:1412.6604 (2014).
64
Mathieu, Michael, Camille Couprie, and Yann LeCun. "Deep multi-scale video prediction beyond mean square error."
ICLR 2016 [project] [code]
Video frame prediction with a ConvNet.
Frame Prediction
65
Mathieu, Michael, Camille Couprie, and Yann LeCun. "Deep multi-scale video prediction beyond mean square error."
ICLR 2016 [project] [code]
The blurry predictions from MSE are improved with multi-scale architecture,
adversarial learning and an image gradient difference loss function.
Frame Prediction
66
Mathieu, Michael, Camille Couprie, and Yann LeCun. "Deep multi-scale video prediction beyond mean square error."
ICLR 2016 [project] [code]
The blurry predictions from MSE (l1) are improved with multi-scale architecture,
adversarial training and an image gradient difference loss (GDL) function.
Frame Prediction
67
Mathieu, Michael, Camille Couprie, and Yann LeCun. "Deep multi-scale video prediction beyond mean square error."
ICLR 2016 [project] [code]
Frame Prediction
68Vondrick, Carl, Hamed Pirsiavash, and Antonio Torralba. "Generating videos with scene dynamics." NIPS 2016.
Frame Prediction
69
Vondrick, Carl, Hamed Pirsiavash, and Antonio Torralba. "Generating videos with scene dynamics." NIPS 2016.
Frame Prediction
70
Xue, Tianfan, Jiajun Wu, Katherine Bouman, and Bill Freeman. "Visual dynamics: Probabilistic future frame
synthesis via cross convolutional networks." NIPS 2016 [video]
71
Frame Prediction
Xue, Tianfan, Jiajun Wu, Katherine Bouman, and Bill Freeman. "Visual dynamics: Probabilistic future
frame synthesis via cross convolutional networks." NIPS 2016 [video]
Given an input image, probabilistic generation of future frames with a Variational
AutoEncoder (VAE).
72
Frame Prediction
Xue, Tianfan, Jiajun Wu, Katherine Bouman, and Bill Freeman. "Visual dynamics: Probabilistic future
frame synthesis via cross convolutional networks." NIPS 2016 [video]
Encodes image as feature maps, and motion as and cross-convolutional kernels.
73
Outline
1. Motivation
2. Unsupervised Learning
3. Predictive Learning
4. Self-supervised Learning
First steps in video feature learning
74
Le, Quoc V., Will Y. Zou, Serena Y. Yeung, and Andrew Y. Ng. "Learning hierarchical invariant
spatio-temporal features for action recognition with independent subspace analysis." CVPR 2011
Temporal Weak Labels
75
Goroshin, Ross, Joan Bruna, Jonathan Tompson, David Eigen, and Yann LeCun. "Unsupervised learning
of spatiotemporally coherent metrics." ICCV 2015.
Assumption: adjacent video frames contain semantically similar information.
Autoencoder trained with regularizations by slowliness and sparisty.
76
Jayaraman, Dinesh, and Kristen Grauman. "Slow and steady feature analysis: higher order temporal
coherence in video." CVPR 2016. [video]
●
●
Temporal Weak Labels
77
Jayaraman, Dinesh, and Kristen Grauman. "Slow and steady feature analysis: higher order temporal
coherence in video." CVPR 2016. [video]
Temporal Weak Labels
78
(Slides by Xunyu Lin): Misra, Ishan, C. Lawrence Zitnick, and Martial Hebert. "Shuffle and learn: unsupervised learning using
temporal order verification." ECCV 2016. [code]
Temporal order of frames is
exploited as the supervisory
signal for learning.
Temporal Weak Labels
79
(Slides by Xunyu Lin): Misra, Ishan, C. Lawrence Zitnick, and Martial Hebert. "Shuffle and learn: unsupervised learning using
temporal order verification." ECCV 2016. [code]
Take temporal order as the supervisory signals for learning
Shuffled
sequences
Binary classification
In order
Not in order
Temporal Weak Labels
80
Fernando, Basura, Hakan Bilen, Efstratios Gavves, and Stephen Gould. "Self-supervised video representation learning with
odd-one-out networks." CVPR 2017
Temporal Weak Labels
Train a network to detect which of the video sequences contains frames wrong order.
81
Spatio-Temporal Weak Labels
X. Lin, Campos, V., Giró-i-Nieto, X., Torres, J., and Canton-Ferrer, C., “Disentangling Motion, Foreground
and Background Features in Videos”, in CVPR 2017 Workshop Brave New Motion Representations
kernel dec
C3D
Foreground
Motion
First
Foreground
Background
Fg
Dec
Bg
Dec
Fg
Dec
Reconstruction
of foreground in
last frame
Reconstruction
of foreground in
first frame
Reconstruction
of background
in first frame
uNLC
Mask
Block
gradients
Last
foreground
Kernels
share
weights
82
Spatio-Temporal Weak Labels
Pathak, Deepak, Ross Girshick, Piotr Dollár, Trevor Darrell, and Bharath Hariharan. "Learning features by watching
objects move." CVPR 2017
83
Greff, Klaus, Antti Rasmus, Mathias Berglund, Tele Hao, Harri Valpola, and Juergen Schmidhuber. "Tagger: Deep unsupervised
perceptual grouping." NIPS 2016 [video] [code]
Spatio-Temporal Weak Labels
84
Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from unlabeled
video." NIPS 2016.
85
Audio Features from Visual weak labels
Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from
unlabeled video." NIPS 2016.
Object & Scenes recognition in videos by analysing the audio track (only).
86
Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from
unlabeled video." NIPS 2016.
Visualization of the 1D filters over raw audio in conv1.
Audio Features from Visual weak labels
87
Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from
unlabeled video." NIPS 2016.
Visualization of the 1D filters over raw audio in conv1.
Audio Features from Visual weak labels
88
Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from
unlabeled video." NIPS 2016.
Visualization of the 1D filters over raw audio in conv1.
Audio Features from Visual weak labels
89
Visualization of the video frames associated to the sounds that activate some of the
last hidden units of Soundnet (conv7):
Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from
unlabeled video." NIPS 2016.
Audio Features from Visual weak labels
90
Audio & Visual features from alignement
90Arandjelović, Relja, and Andrew Zisserman. "Look, Listen and Learn." ICCV 2017.
Audio and visual features learned by assessing alignement.
91
L. Chen, S. Srivastava, Z. Duan and C. Xu. Deep Cross-Modal Audio-Visual Generation. ACM
International Conference on Multimedia Thematic Workshops, 2017.
Audio & Visual features from alignement
92Ephrat, Ariel, and Shmuel Peleg. "Vid2speech: Speech Reconstruction from Silent Video." ICASSP 2017
93Ephrat, Ariel, and Shmuel Peleg. "Vid2speech: Speech Reconstruction from Silent Video." ICASSP 2017
Video to Speech Representations
CNN
(VGG)
Frame from a
silent video
Audio feature
Post-hoc
synthesis
94Chung, Joon Son, Amir Jamaludin, and Andrew Zisserman. "You said that?." BMVC 2017.
95Chung, Joon Son, Amir Jamaludin, and Andrew Zisserman. "You said that?." BMVC 2017.
Speech to Video Synthesis (mouth)
96
Karras, Tero, Timo Aila, Samuli Laine, Antti Herva, and Jaakko Lehtinen. "Audio-driven facial animation by
joint end-to-end learning of pose and emotion." SIGGRAPH 2017
97
Karras, Tero, Timo Aila, Samuli Laine, Antti Herva, and Jaakko Lehtinen. "Audio-driven facial animation by
joint end-to-end learning of pose and emotion." SIGGRAPH 2017
Speech to Video Synthesis (pose & emotion)
98
Karras, Tero, Timo Aila, Samuli Laine, Antti Herva, and Jaakko Lehtinen. "Audio-driven facial animation by
joint end-to-end learning of pose and emotion." SIGGRAPH 2017
99
Suwajanakorn, Supasorn, Steven M. Seitz, and Ira Kemelmacher-Shlizerman. "Synthesizing Obama: learning lip sync from
audio." SIGGRAPH 2017.
Speech to Video Synthesis (mouth)
100
Outline
1. Motivation
2. Unsupervised Learning
3. Predictive Learning
4. Self-supervised Learning

Contenu connexe

Tendances

Deep Language and Vision (DLSL D2L4 2018 UPC Deep Learning for Speech and Lan...
Deep Language and Vision (DLSL D2L4 2018 UPC Deep Learning for Speech and Lan...Deep Language and Vision (DLSL D2L4 2018 UPC Deep Learning for Speech and Lan...
Deep Language and Vision (DLSL D2L4 2018 UPC Deep Learning for Speech and Lan...Universitat Politècnica de Catalunya
 
Interpretability of Convolutional Neural Networks - Xavier Giro - UPC Barcelo...
Interpretability of Convolutional Neural Networks - Xavier Giro - UPC Barcelo...Interpretability of Convolutional Neural Networks - Xavier Giro - UPC Barcelo...
Interpretability of Convolutional Neural Networks - Xavier Giro - UPC Barcelo...Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural NetworksTemporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural NetworksUniversitat Politècnica de Catalunya
 
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 
Language and Vision (D2L11 Insight@DCU Machine Learning Workshop 2017)
Language and Vision (D2L11 Insight@DCU Machine Learning Workshop 2017)Language and Vision (D2L11 Insight@DCU Machine Learning Workshop 2017)
Language and Vision (D2L11 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Universitat Politècnica de Catalunya
 
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...Universitat Politècnica de Catalunya
 
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...Universitat Politècnica de Catalunya
 

Tendances (20)

Deep Language and Vision (DLSL D2L4 2018 UPC Deep Learning for Speech and Lan...
Deep Language and Vision (DLSL D2L4 2018 UPC Deep Learning for Speech and Lan...Deep Language and Vision (DLSL D2L4 2018 UPC Deep Learning for Speech and Lan...
Deep Language and Vision (DLSL D2L4 2018 UPC Deep Learning for Speech and Lan...
 
Interpretability of Convolutional Neural Networks - Xavier Giro - UPC Barcelo...
Interpretability of Convolutional Neural Networks - Xavier Giro - UPC Barcelo...Interpretability of Convolutional Neural Networks - Xavier Giro - UPC Barcelo...
Interpretability of Convolutional Neural Networks - Xavier Giro - UPC Barcelo...
 
Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)
 
Neural Architectures for Video Encoding
Neural Architectures for Video EncodingNeural Architectures for Video Encoding
Neural Architectures for Video Encoding
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)
Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)
Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
 
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural NetworksTemporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
 
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Deep Learning for Video: Object Tracking (UPC 2018)
Deep Learning for Video: Object Tracking (UPC 2018)Deep Learning for Video: Object Tracking (UPC 2018)
Deep Learning for Video: Object Tracking (UPC 2018)
 
Language and Vision (D2L11 Insight@DCU Machine Learning Workshop 2017)
Language and Vision (D2L11 Insight@DCU Machine Learning Workshop 2017)Language and Vision (D2L11 Insight@DCU Machine Learning Workshop 2017)
Language and Vision (D2L11 Insight@DCU Machine Learning Workshop 2017)
 
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
 
One Perceptron to Rule Them All: Language and Vision
One Perceptron to Rule Them All: Language and VisionOne Perceptron to Rule Them All: Language and Vision
One Perceptron to Rule Them All: Language and Vision
 
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...
 
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
 
Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)
Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)
Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)
 
Deep Learning for Video: Language (UPC 2018)
Deep Learning for Video: Language (UPC 2018)Deep Learning for Video: Language (UPC 2018)
Deep Learning for Video: Language (UPC 2018)
 

Similaire à Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intelligence)

Deep Learning without Annotations - Xavier Giro - UPC Barcelona 2018
Deep Learning without Annotations - Xavier Giro - UPC Barcelona 2018Deep Learning without Annotations - Xavier Giro - UPC Barcelona 2018
Deep Learning without Annotations - Xavier Giro - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Learning with Videos (D4L4 2017 UPC Deep Learning for Computer Vision)
Learning with Videos  (D4L4 2017 UPC Deep Learning for Computer Vision)Learning with Videos  (D4L4 2017 UPC Deep Learning for Computer Vision)
Learning with Videos (D4L4 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)Universitat Politècnica de Catalunya
 
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...Universitat Politècnica de Catalunya
 
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaDeep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaUniversitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
DeepFix: a fully convolutional neural network for predicting human fixations...
DeepFix:  a fully convolutional neural network for predicting human fixations...DeepFix:  a fully convolutional neural network for predicting human fixations...
DeepFix: a fully convolutional neural network for predicting human fixations...Universitat Politècnica de Catalunya
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Roelof Pieters
 
AI&BigData Lab. Артем Чернодуб "Распознавание изображений методом Lazy Deep ...
AI&BigData Lab. Артем Чернодуб  "Распознавание изображений методом Lazy Deep ...AI&BigData Lab. Артем Чернодуб  "Распознавание изображений методом Lazy Deep ...
AI&BigData Lab. Артем Чернодуб "Распознавание изображений методом Lazy Deep ...GeeksLab Odessa
 
Introduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolutionIntroduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolutionDarian Frajberg
 
Paper Reviews on Visual Attention
Paper Reviews on Visual AttentionPaper Reviews on Visual Attention
Paper Reviews on Visual Attention민기 정
 
Details of Lazy Deep Learning for Images Recognition in ZZ Photo app
Details of Lazy Deep Learning for Images Recognition in ZZ Photo appDetails of Lazy Deep Learning for Images Recognition in ZZ Photo app
Details of Lazy Deep Learning for Images Recognition in ZZ Photo appPAY2 YOU
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Interactive Video Search: Where is the User in the Age of Deep Learning?
Interactive Video Search: Where is the User in the Age of Deep Learning?Interactive Video Search: Where is the User in the Age of Deep Learning?
Interactive Video Search: Where is the User in the Age of Deep Learning?klschoef
 
Evolution of Deep Learning and new advancements
Evolution of Deep Learning and new advancementsEvolution of Deep Learning and new advancements
Evolution of Deep Learning and new advancementsChitta Ranjan
 
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...GeeksLab Odessa
 

Similaire à Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intelligence) (20)

Deep Learning without Annotations - Xavier Giro - UPC Barcelona 2018
Deep Learning without Annotations - Xavier Giro - UPC Barcelona 2018Deep Learning without Annotations - Xavier Giro - UPC Barcelona 2018
Deep Learning without Annotations - Xavier Giro - UPC Barcelona 2018
 
Learning with Videos (D4L4 2017 UPC Deep Learning for Computer Vision)
Learning with Videos  (D4L4 2017 UPC Deep Learning for Computer Vision)Learning with Videos  (D4L4 2017 UPC Deep Learning for Computer Vision)
Learning with Videos (D4L4 2017 UPC Deep Learning for Computer Vision)
 
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)
 
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
 
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaDeep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
 
Deep Video Object Tracking - Xavier Giro - UPC Barcelona 2019
Deep Video Object Tracking - Xavier Giro - UPC Barcelona 2019Deep Video Object Tracking - Xavier Giro - UPC Barcelona 2019
Deep Video Object Tracking - Xavier Giro - UPC Barcelona 2019
 
Deep Video Object Segmentation - Xavier Giro - UPC Barcelona 2019
Deep Video Object Segmentation - Xavier Giro - UPC Barcelona 2019Deep Video Object Segmentation - Xavier Giro - UPC Barcelona 2019
Deep Video Object Segmentation - Xavier Giro - UPC Barcelona 2019
 
The Importance of Time in Visual Attention Models
The Importance of Time in Visual Attention ModelsThe Importance of Time in Visual Attention Models
The Importance of Time in Visual Attention Models
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
DeepFix: a fully convolutional neural network for predicting human fixations...
DeepFix:  a fully convolutional neural network for predicting human fixations...DeepFix:  a fully convolutional neural network for predicting human fixations...
DeepFix: a fully convolutional neural network for predicting human fixations...
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
 
Tutorial inns2019 full
Tutorial inns2019 fullTutorial inns2019 full
Tutorial inns2019 full
 
AI&BigData Lab. Артем Чернодуб "Распознавание изображений методом Lazy Deep ...
AI&BigData Lab. Артем Чернодуб  "Распознавание изображений методом Lazy Deep ...AI&BigData Lab. Артем Чернодуб  "Распознавание изображений методом Lazy Deep ...
AI&BigData Lab. Артем Чернодуб "Распознавание изображений методом Lazy Deep ...
 
Introduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolutionIntroduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolution
 
Paper Reviews on Visual Attention
Paper Reviews on Visual AttentionPaper Reviews on Visual Attention
Paper Reviews on Visual Attention
 
Details of Lazy Deep Learning for Images Recognition in ZZ Photo app
Details of Lazy Deep Learning for Images Recognition in ZZ Photo appDetails of Lazy Deep Learning for Images Recognition in ZZ Photo app
Details of Lazy Deep Learning for Images Recognition in ZZ Photo app
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
 
Interactive Video Search: Where is the User in the Age of Deep Learning?
Interactive Video Search: Where is the User in the Age of Deep Learning?Interactive Video Search: Where is the User in the Age of Deep Learning?
Interactive Video Search: Where is the User in the Age of Deep Learning?
 
Evolution of Deep Learning and new advancements
Evolution of Deep Learning and new advancementsEvolution of Deep Learning and new advancements
Evolution of Deep Learning and new advancements
 
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...
 

Plus de Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Universitat Politècnica de Catalunya
 
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Universitat Politècnica de Catalunya
 
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic ModerationHate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic ModerationUniversitat Politècnica de Catalunya
 

Plus de Universitat Politècnica de Catalunya (19)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
 
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
 
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Backpropagation for Deep Learning
Backpropagation for Deep LearningBackpropagation for Deep Learning
Backpropagation for Deep Learning
 
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic ModerationHate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
 

Dernier

꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 

Dernier (20)

꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 

Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intelligence)

  • 1. [course site] Xavier Giro-i-Nieto xavier.giro@upc.edu Associate Professor Universitat Politecnica de Catalunya Technical University of Catalonia Unsupervised Learning #DLUPC
  • 2. 2 Acknowledegments Kevin McGuinness, “Unsupervised Learning” Deep Learning for Computer Vision. [Slides 2016] [Slides 2017]
  • 3. 3 Outline 1. Motivation 2. Unsupervised Learning 3. Predictive Learning 4. Self-supervised Learning
  • 4. 4 Outline 1. Motivation 2. Unsupervised Learning 3. Predictive Learning 4. Self-supervised Learning
  • 7. 7 Greff, Klaus, Antti Rasmus, Mathias Berglund, Tele Hao, Harri Valpola, and Juergen Schmidhuber. "Tagger: Deep unsupervised perceptual grouping." NIPS 2016 [video] [code]
  • 8. Yann Lecun’s Black Forest cake 8 Motivation
  • 9. 1. = ƒ( ) ƒ( ) = ƒ( ) 9 Predict label y corresponding to observation x Estimate the distribution of observation x Predict action y based on observation x, to maximize a future reward z Motivation
  • 10. 1. = ƒ( ) ƒ( ) = ƒ( ) 10 Motivation
  • 11. Unsupervised Learning 11 Why Unsupervised Learning? ● It is the nature of how intelligent beings percept the world. ● It can save us tons of efforts to build a human-alike intelligent agent compared to a totally supervised fashion. ● Vast amounts of unlabelled data.
  • 12. 12 Outline 1. Motivation 2. Unsupervised Learning 3. Predictive Learning 4. Self-supervised Learning
  • 13. Max margin decision boundary 13 How data distribution P(x) influences decisions (1D) Slide credit: Kevin McGuinness (DLCV UPC 2017)
  • 14. Semi supervised decision boundary 14 How data distribution P(x) influences decisions (1D) Slide credit: Kevin McGuinness (DLCV UPC 2017)
  • 15. 15 How data distribution P(x) influences decisions (2D) Slide credit: Kevin McGuinness (DLCV UPC 2017)
  • 16. 16 How data distribution P(x) influences decisions (2D) Slide credit: Kevin McGuinness (DLCV UPC 2017)
  • 17. How P(x) is valuable for naive Bayesian classifier ● ● ● 17 X: Data Y: Labels Slide credit: Kevin McGuinness (DLCV UPC 2017)
  • 18. x1 x2 Not linearly separable in this 2D space :( 18 How clustering is valueble for linear classifiers Slide credit: Kevin McGuinness (DLCV UPC 2017)
  • 19. x1 x2 Cluster 1 Cluster 2 Cluster 3 Cluster 4 1 2 3 4 1 2 3 4 4D BoW representation Separable with a linear classifiers in a 4D space ! 19 How clustering is valueble for linear classifiers Slide credit: Kevin McGuinness (DLCV UPC 2017)
  • 20. Example from Kevin McGuinness Juyter notebook. 20 How clustering is valueble for linear classifiers
  • 21. 21 Assumptions for unsupervised learning Slide credit: Kevin McGuinness (DLCV UPC 2017) ● ○ ● ○ ● ○
  • 22. 22 Assumptions for unsupervised learning Slide credit: Kevin McGuinness (DLCV UPC 2017) ● ○ ○ ● ○ ○ ○ ○ ● ○ ● ○ ○ ○ ○ ● ○ ○ ○ ○
  • 23. 23 Assumptions for unsupervised learning Slide credit: Kevin McGuinness (DLCV UPC 2017) ● ○ ○ ● ○ ○ ○ ○ ● ○ ● ○ ○ ○ ○ ● ○ ○ ○ ○
  • 24. 24 Slide credit: Kevin McGuinness (DLCV UPC 2017) ● ● ○ ● ○ The manifold hypothesis
  • 25. 25 The manifold hypothesis Slide credit: Kevin McGuinness (DLCV UPC 2017) x1 x2 Linear manifold wT x + b x1 x2 Non-linear manifold
  • 26. 26 The manifold hypothesis: Energy-based models ● ● ● ● ● LeCun et al, A Tutorial on Energy-Based Learning, Predicting Structured Data, 2006 http://yann.lecun.com/exdb/publis/pdf/lecun-06.pdf
  • 27. 27 Autoencoder (AE) F. Van Veen, “The Neural Network Zoo” (2016)
  • 28. 28 Autoencoder (AE) “Deep Learning Tutorial”, Dept. Computer Science, Stanford Autoencoders: ● Predict at the output the same input data. ● Do not need labels:
  • 29. 29 Autoencoder (AE) “Deep Learning Tutorial”, Dept. Computer Science, Stanford Dimensionality reduction: ● Use hidden layer as a feature extractor of any desired size.
  • 30. 30 Autoencoder (AE) Encoder W1 Decoder W2 hdata reconstruction Loss (reconstruction error) Latent variables (representation/features) Figure: Kevin McGuinness (DLCV UPC 2017) Pretraining: 1. Initialize a NN solving an autoencoding problem.
  • 31. 31 Autoencoder (AE) Figure: Kevin McGuinness (DLCV UPC 2017) Encoder W1 hdata Classifier WC Latent variables (representation/features) prediction y Loss (cross entropy) Pretraining: 1. Initialize a NN solving an autoencoding problem. 2. Train for final task with “few” labels.
  • 32. 32 Autoencoder (AE) Deep Learning TV, “Autoencoders - Ep. 10”
  • 33. 33F. Van Veen, “The Neural Network Zoo” (2016) Restricted Boltzmann Machine (RBM)
  • 34. 34 Restricted Boltzmann Machine (RBM) Figure: Geoffrey Hinton (2013) Salakhutdinov, Ruslan, Andriy Mnih, and Geoffrey Hinton. "Restricted Boltzmann machines for collaborative filtering." Proceedings of the 24th international conference on Machine learning. ACM, 2007. ● Shallow two-layer net. ● Restricted = No two nodes in a layer share a connection ● ● Bipartite graph. ● Bidirectional graph ○ Shared weights. ○ Different biases.
  • 35. 35 Restricted Boltzmann Machine (RBM) DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”. Forward pass
  • 36. 36 Restricted Boltzmann Machine (RBM) DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”. Forward pass
  • 37. 37 Restricted Boltzmann Machine (RBM) DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”. Forward pass
  • 38. 38 Restricted Boltzmann Machine (RBM) DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”. Backward pass
  • 39. 39 Restricted Boltzmann Machine (RBM) DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”. Backward pass
  • 40. 40 Restricted Boltzmann Machine (RBM) DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”. Backward pass The reconstructed values at the visible layer are compared with the actual ones with the KL Divergence.
  • 41. 41 Restricted Boltzmann Machine (RBM) Figure: Geoffrey Hinton (2013) Salakhutdinov, Ruslan, Andriy Mnih, and Geoffrey Hinton. "Restricted Boltzmann machines for collaborative filtering." Proceedings of the 24th international conference on Machine learning. ACM, 2007.
  • 42. 42 Restricted Boltzmann Machine (RBM) DeepLearning4j, “A Beginner’s Tutorial for Restricted Boltzmann Machines”. RBMs are a specific type of autoencoder. Unsupervised learning
  • 43. 43 Restricted Boltzmann Machine (RBM) Deep Learning TV, “Restricted Boltzmann Machines”.
  • 44. 44 Deep Belief Networks (DBN) F. Van Veen, “The Neural Network Zoo” (2016)
  • 45. 45 Deep Belief Networks (DBN) Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18, no. 7 (2006): 1527-1554. ● Architecture like an MLP. ● Training as a stack of RBMs.
  • 46. 46 Deep Belief Networks (DBN) ● Architecture like an MLP. ● Training as a stack of RBMs. Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18, no. 7 (2006): 1527-1554.
  • 47. 47 Deep Belief Networks (DBN) Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18, no. 7 (2006): 1527-1554. ● Architecture like an MLP. ● Training as a stack of RBMs.
  • 48. 48 Deep Belief Networks (DBN) Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18, no. 7 (2006): 1527-1554. ● Architecture like an MLP. ● Training as a stack of RBMs.
  • 49. 49 Deep Belief Networks (DBN) Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18, no. 7 (2006): 1527-1554. ● Architecture like an MLP. ● Training as a stack of RBMs… ● ...so they do not need labels: Unsupervised learning
  • 50. 50 Deep Belief Networks (DBN) Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18, no. 7 (2006): 1527-1554. After the DBN is trained, it can be fine-tuned with a reduced amount of labels to solve a supervised task with superior performance. Supervised learning Softmax
  • 51. 51 Deep Belief Networks (DBN) Deep Learning TV, “Deep Belief Nets”
  • 52. 52 Deep Belief Networks (DBN) Geoffrey Hinton, "Introduction to Deep Learning & Deep Belief Nets” (2012) Geoorey Hinton, “Tutorial on Deep Belief Networks”. NIPS 2007.
  • 53. 53 Deep Belief Networks (DBN) Geoffrey Hinton
  • 54. 54 Outline 1. Motivation 2. Unsupervised Learning 3. Predictive Learning 4. Self-supervised Learning
  • 59. Frame Reconstruction & Prediction 59 Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov. "Unsupervised Learning of Video Representations using LSTMs." In ICML 2015. [Github] Unsupervised feature learning (no labels) for...
  • 60. Frame Reconstruction & Prediction 60 Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov. "Unsupervised Learning of Video Representations using LSTMs." In ICML 2015. [Github] Unsupervised feature learning (no labels) for... ...frame prediction.
  • 61. Frame Reconstruction & Prediction 61 Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov. "Unsupervised Learning of Video Representations using LSTMs." In ICML 2015. [Github] Unsupervised feature learning (no labels) for... ...frame prediction.
  • 62. 62 Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov. "Unsupervised Learning of Video Representations using LSTMs." In ICML 2015. [Github] Unsupervised learned features (lots of data) are fine-tuned for activity recognition (little data). Frame Reconstruction & Prediction
  • 63. Frame Prediction 63 Ranzato, MarcAurelio, Arthur Szlam, Joan Bruna, Michael Mathieu, Ronan Collobert, and Sumit Chopra. "Video (language) modeling: a baseline for generative models of natural videos." arXiv preprint arXiv:1412.6604 (2014).
  • 64. 64 Mathieu, Michael, Camille Couprie, and Yann LeCun. "Deep multi-scale video prediction beyond mean square error." ICLR 2016 [project] [code] Video frame prediction with a ConvNet. Frame Prediction
  • 65. 65 Mathieu, Michael, Camille Couprie, and Yann LeCun. "Deep multi-scale video prediction beyond mean square error." ICLR 2016 [project] [code] The blurry predictions from MSE are improved with multi-scale architecture, adversarial learning and an image gradient difference loss function. Frame Prediction
  • 66. 66 Mathieu, Michael, Camille Couprie, and Yann LeCun. "Deep multi-scale video prediction beyond mean square error." ICLR 2016 [project] [code] The blurry predictions from MSE (l1) are improved with multi-scale architecture, adversarial training and an image gradient difference loss (GDL) function. Frame Prediction
  • 67. 67 Mathieu, Michael, Camille Couprie, and Yann LeCun. "Deep multi-scale video prediction beyond mean square error." ICLR 2016 [project] [code] Frame Prediction
  • 68. 68Vondrick, Carl, Hamed Pirsiavash, and Antonio Torralba. "Generating videos with scene dynamics." NIPS 2016. Frame Prediction
  • 69. 69 Vondrick, Carl, Hamed Pirsiavash, and Antonio Torralba. "Generating videos with scene dynamics." NIPS 2016. Frame Prediction
  • 70. 70 Xue, Tianfan, Jiajun Wu, Katherine Bouman, and Bill Freeman. "Visual dynamics: Probabilistic future frame synthesis via cross convolutional networks." NIPS 2016 [video]
  • 71. 71 Frame Prediction Xue, Tianfan, Jiajun Wu, Katherine Bouman, and Bill Freeman. "Visual dynamics: Probabilistic future frame synthesis via cross convolutional networks." NIPS 2016 [video] Given an input image, probabilistic generation of future frames with a Variational AutoEncoder (VAE).
  • 72. 72 Frame Prediction Xue, Tianfan, Jiajun Wu, Katherine Bouman, and Bill Freeman. "Visual dynamics: Probabilistic future frame synthesis via cross convolutional networks." NIPS 2016 [video] Encodes image as feature maps, and motion as and cross-convolutional kernels.
  • 73. 73 Outline 1. Motivation 2. Unsupervised Learning 3. Predictive Learning 4. Self-supervised Learning
  • 74. First steps in video feature learning 74 Le, Quoc V., Will Y. Zou, Serena Y. Yeung, and Andrew Y. Ng. "Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis." CVPR 2011
  • 75. Temporal Weak Labels 75 Goroshin, Ross, Joan Bruna, Jonathan Tompson, David Eigen, and Yann LeCun. "Unsupervised learning of spatiotemporally coherent metrics." ICCV 2015. Assumption: adjacent video frames contain semantically similar information. Autoencoder trained with regularizations by slowliness and sparisty.
  • 76. 76 Jayaraman, Dinesh, and Kristen Grauman. "Slow and steady feature analysis: higher order temporal coherence in video." CVPR 2016. [video] ● ● Temporal Weak Labels
  • 77. 77 Jayaraman, Dinesh, and Kristen Grauman. "Slow and steady feature analysis: higher order temporal coherence in video." CVPR 2016. [video] Temporal Weak Labels
  • 78. 78 (Slides by Xunyu Lin): Misra, Ishan, C. Lawrence Zitnick, and Martial Hebert. "Shuffle and learn: unsupervised learning using temporal order verification." ECCV 2016. [code] Temporal order of frames is exploited as the supervisory signal for learning. Temporal Weak Labels
  • 79. 79 (Slides by Xunyu Lin): Misra, Ishan, C. Lawrence Zitnick, and Martial Hebert. "Shuffle and learn: unsupervised learning using temporal order verification." ECCV 2016. [code] Take temporal order as the supervisory signals for learning Shuffled sequences Binary classification In order Not in order Temporal Weak Labels
  • 80. 80 Fernando, Basura, Hakan Bilen, Efstratios Gavves, and Stephen Gould. "Self-supervised video representation learning with odd-one-out networks." CVPR 2017 Temporal Weak Labels Train a network to detect which of the video sequences contains frames wrong order.
  • 81. 81 Spatio-Temporal Weak Labels X. Lin, Campos, V., Giró-i-Nieto, X., Torres, J., and Canton-Ferrer, C., “Disentangling Motion, Foreground and Background Features in Videos”, in CVPR 2017 Workshop Brave New Motion Representations kernel dec C3D Foreground Motion First Foreground Background Fg Dec Bg Dec Fg Dec Reconstruction of foreground in last frame Reconstruction of foreground in first frame Reconstruction of background in first frame uNLC Mask Block gradients Last foreground Kernels share weights
  • 82. 82 Spatio-Temporal Weak Labels Pathak, Deepak, Ross Girshick, Piotr Dollár, Trevor Darrell, and Bharath Hariharan. "Learning features by watching objects move." CVPR 2017
  • 83. 83 Greff, Klaus, Antti Rasmus, Mathias Berglund, Tele Hao, Harri Valpola, and Juergen Schmidhuber. "Tagger: Deep unsupervised perceptual grouping." NIPS 2016 [video] [code] Spatio-Temporal Weak Labels
  • 84. 84 Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from unlabeled video." NIPS 2016.
  • 85. 85 Audio Features from Visual weak labels Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from unlabeled video." NIPS 2016. Object & Scenes recognition in videos by analysing the audio track (only).
  • 86. 86 Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from unlabeled video." NIPS 2016. Visualization of the 1D filters over raw audio in conv1. Audio Features from Visual weak labels
  • 87. 87 Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from unlabeled video." NIPS 2016. Visualization of the 1D filters over raw audio in conv1. Audio Features from Visual weak labels
  • 88. 88 Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from unlabeled video." NIPS 2016. Visualization of the 1D filters over raw audio in conv1. Audio Features from Visual weak labels
  • 89. 89 Visualization of the video frames associated to the sounds that activate some of the last hidden units of Soundnet (conv7): Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from unlabeled video." NIPS 2016. Audio Features from Visual weak labels
  • 90. 90 Audio & Visual features from alignement 90Arandjelović, Relja, and Andrew Zisserman. "Look, Listen and Learn." ICCV 2017. Audio and visual features learned by assessing alignement.
  • 91. 91 L. Chen, S. Srivastava, Z. Duan and C. Xu. Deep Cross-Modal Audio-Visual Generation. ACM International Conference on Multimedia Thematic Workshops, 2017. Audio & Visual features from alignement
  • 92. 92Ephrat, Ariel, and Shmuel Peleg. "Vid2speech: Speech Reconstruction from Silent Video." ICASSP 2017
  • 93. 93Ephrat, Ariel, and Shmuel Peleg. "Vid2speech: Speech Reconstruction from Silent Video." ICASSP 2017 Video to Speech Representations CNN (VGG) Frame from a silent video Audio feature Post-hoc synthesis
  • 94. 94Chung, Joon Son, Amir Jamaludin, and Andrew Zisserman. "You said that?." BMVC 2017.
  • 95. 95Chung, Joon Son, Amir Jamaludin, and Andrew Zisserman. "You said that?." BMVC 2017. Speech to Video Synthesis (mouth)
  • 96. 96 Karras, Tero, Timo Aila, Samuli Laine, Antti Herva, and Jaakko Lehtinen. "Audio-driven facial animation by joint end-to-end learning of pose and emotion." SIGGRAPH 2017
  • 97. 97 Karras, Tero, Timo Aila, Samuli Laine, Antti Herva, and Jaakko Lehtinen. "Audio-driven facial animation by joint end-to-end learning of pose and emotion." SIGGRAPH 2017 Speech to Video Synthesis (pose & emotion)
  • 98. 98 Karras, Tero, Timo Aila, Samuli Laine, Antti Herva, and Jaakko Lehtinen. "Audio-driven facial animation by joint end-to-end learning of pose and emotion." SIGGRAPH 2017
  • 99. 99 Suwajanakorn, Supasorn, Steven M. Seitz, and Ira Kemelmacher-Shlizerman. "Synthesizing Obama: learning lip sync from audio." SIGGRAPH 2017. Speech to Video Synthesis (mouth)
  • 100. 100 Outline 1. Motivation 2. Unsupervised Learning 3. Predictive Learning 4. Self-supervised Learning