https://telecombcn-dl.github.io/2018-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
6. bit.ly/DLCV2018
#DLUPC
Recurrent Neural Network (RNN)
Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”
The hidden layers and the output depend on previous states of the hidden layers.
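The recurrence described above can be sketched as a simple forward pass. This is a minimal illustration, assuming an Elman-style RNN with a tanh activation and a zero initial state; the function name `rnn_forward`, the weight names, and all dimensions are hypothetical and not taken from the slides.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y):
    """Run a simple RNN over a sequence xs of shape (T, input_dim)."""
    h = np.zeros(W_hh.shape[0])  # initial hidden state (assumed to be zero)
    hs, ys = [], []
    for x in xs:
        # The new hidden state mixes the current input with the previous hidden state.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        # The output is read from the current hidden state.
        y = W_hy @ h + b_y
        hs.append(h)
        ys.append(y)
    return np.stack(hs), np.stack(ys)

# Toy dimensions and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
T, input_dim, hidden_dim, output_dim = 5, 3, 4, 2
W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
W_hy = rng.normal(scale=0.5, size=(output_dim, hidden_dim))
b_h = np.zeros(hidden_dim)
b_y = np.zeros(output_dim)

xs = rng.normal(size=(T, input_dim))
hs, ys = rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y)
print(hs.shape, ys.shape)
```

Setting `W_hh` to zero removes the dependence on earlier time steps, which is one way to see that the recurrent weights are what carry information across the sequence.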
16.
Skip Connections
Van Den Oord, Aaron, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. "Wavenet: A generative model for raw audio." ISCA 2016.
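A rough sketch of the idea follows, loosely inspired by the residual and parameterised skip connections used in WaveNet. It is heavily simplified: a plain tanh layer stands in for WaveNet's gated dilated causal convolutions, and `stack_with_skips`, the weight lists, and the dimensions are hypothetical names and values chosen for illustration.

```python
import numpy as np

def stack_with_skips(x, res_weights, skip_weights):
    """Apply a stack of layers with residual and skip connections.

    x: input vector of shape (dim,)
    res_weights: list of (dim, dim) matrices, one per layer
    skip_weights: list of (skip_dim, dim) matrices, one per layer
    """
    skip_sum = np.zeros(skip_weights[0].shape[0])
    for W_res, W_skip in zip(res_weights, skip_weights):
        z = np.tanh(W_res @ x)   # layer transformation (stand-in for a gated convolution)
        skip_sum += W_skip @ z   # skip connection: each layer contributes directly to the output
        x = x + z                # residual connection: the layer input is added to its output
    return x, skip_sum

# Toy dimensions and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
dim, skip_dim, depth = 4, 3, 3
x0 = rng.normal(size=dim)
res_weights = [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(depth)]
skip_weights = [rng.normal(scale=0.1, size=(skip_dim, dim)) for _ in range(depth)]

x_out, skip_sum = stack_with_skips(x0, res_weights, skip_weights)
print(x_out.shape, skip_sum.shape)
```

Because every layer adds to `skip_sum` directly, the output receives a contribution from each depth without that signal having to pass through all the later layers.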
17.
Dense Connections
Huang, Gao, Zhuang Liu, Kilian Q. Weinberger, and Laurens van der Maaten. "Densely connected convolutional networks." CVPR 2017.
Figure: dense block of 5 layers with a growth rate of k=4.
Connect every layer to every other layer with the same feature-map size.
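The connectivity pattern can be sketched with channel-wise concatenation. This is a minimal illustration under several assumptions: a 1x1 convolution followed by ReLU stands in for the layers used in the paper, the block is taken to have 5 layers that each add k=4 feature maps, and the input channel count `k0 = 6` as well as the name `dense_block` are hypothetical.

```python
import numpy as np

def dense_block(x, weights):
    """x: feature maps of shape (channels, H, W).

    Each entry of weights has shape (k, in_channels) and acts as a 1x1 convolution.
    """
    features = [x]
    for W in weights:
        # Each layer sees the concatenation of all previous feature maps.
        inp = np.concatenate(features, axis=0)
        # 1x1 convolution over channels, followed by ReLU.
        out = np.maximum(0.0, np.einsum("oc,chw->ohw", W, inp))
        features.append(out)
    return np.concatenate(features, axis=0)

rng = np.random.default_rng(0)
k0, k, num_layers = 6, 4, 5  # k0 is an assumed input width; k is the growth rate
x = rng.normal(size=(k0, 8, 8))
# Layer i receives k0 + i * k input channels and produces k new ones.
weights = [rng.normal(size=(k, k0 + i * k)) for i in range(num_layers)]

out = dense_block(x, weights)
print(out.shape)  # (26, 8, 8): k0 + num_layers * k channels
```

Concatenation only works when the spatial dimensions agree, which is why the dense connections are restricted to layers with matching feature-map sizes.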
24.
Autoencoder (AE)
Pretraining:
1. Initialize a NN solving an autoencoding
problem.
2. Train for final task with “few” labels.
Figure: Kevin McGuinness (DLCV UPC 2017)
[Figure: data → Encoder (W1) → h, latent variables (representation/features) → Classifier (WC) → prediction y → Loss (cross entropy)]
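The two-step pretraining recipe can be sketched as follows. This is only an illustration under stated assumptions: the data is synthetic, the autoencoder has a single tanh encoder layer with a linear decoder trained on mean squared error, and the encoder is kept frozen while the classifier is trained (fine-tuning W1 as well would be an equally plausible reading of the slide). The function names and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(X, W1):
    # Encoder: maps data to latent variables h.
    return np.tanh(X @ W1)

def pretrain_autoencoder(X, hidden_dim, lr=0.1, steps=500):
    """Step 1: learn W1 by reconstructing the unlabeled data."""
    d = X.shape[1]
    W1 = rng.normal(scale=0.1, size=(d, hidden_dim))
    W2 = rng.normal(scale=0.1, size=(hidden_dim, d))  # decoder, discarded afterwards
    losses = []
    for _ in range(steps):
        H = encode(X, W1)
        err = H @ W2 - X                       # reconstruction error
        losses.append(np.mean(err ** 2))
        grad_out = 2.0 * err / err.size        # gradient of the MSE w.r.t. the reconstruction
        grad_W2 = H.T @ grad_out
        grad_H = grad_out @ W2.T
        grad_W1 = X.T @ (grad_H * (1.0 - H ** 2))  # backprop through tanh
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2
    return W1, losses

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_classifier(X, y, W1, num_classes, lr=0.5, steps=300):
    """Step 2: train classifier weights WC on few labels, on top of the encoder."""
    WC = np.zeros((W1.shape[1], num_classes))
    Y = np.eye(num_classes)[y]                 # one-hot targets
    H = encode(X, W1)                          # encoder kept frozen in this sketch
    losses = []
    for _ in range(steps):
        P = softmax(H @ WC)
        losses.append(-np.mean(np.sum(Y * np.log(P + 1e-12), axis=1)))  # cross entropy
        WC -= lr * H.T @ (P - Y) / len(X)
    return WC, losses

# Synthetic data: two clusters in 8 dimensions (purely illustrative).
centers = rng.normal(size=(2, 8))
labels_all = rng.integers(0, 2, size=200)
X_all = centers[labels_all] + 0.1 * rng.normal(size=(200, 8))

W1, ae_losses = pretrain_autoencoder(X_all, hidden_dim=3)   # uses no labels
X_few, y_few = X_all[:10], labels_all[:10]                  # "few" labeled examples
WC, clf_losses = train_classifier(X_few, y_few, W1, num_classes=2)

print("reconstruction loss:", ae_losses[0], "->", ae_losses[-1])
print("cross-entropy loss:", clf_losses[0], "->", clf_losses[-1])
```

Only the second step touches labels, so the encoder can be trained on as much unlabeled data as is available before the small labeled set is used.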
Application #2