This is an introduction to deep learning presented to Plymouth University students. The introduction explains how a neural network works. The practical section shows how to use TensorFlow to build simple models. Finally, the case studies show how deep learning is used in real-world applications.
1. Deep Learning
Introduction, Design and Case Studies
Massimiliano Patacchiola, PhD Student in Cognitive Robotics
Centre for Robotics and Neural Systems, Plymouth University
3. What is Deep Learning?
“Deep Learning is a subfield of machine learning which uses
supervised learning methods to train large artificial neural
networks.”
Machine Learning: discipline based on pattern recognition and
learning theory which explores algorithms that can learn to
make predictions.
Supervised Learning: inferring a function from labeled training
data (examples).
4. Artificial Neural Networks
An Artificial Neural Network (ANN) is a network made of
units (often called neurons) grouped in layers. Most of the
time each unit in a layer is connected to all the units in the
previous layer. Each connection is represented by a real value
called a weight or parameter.
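The fully connected layer described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the bias terms, the sigmoid activation and all the numeric values are assumptions chosen for the example.

```python
import math

def sigmoid(x):
    """Squash a real value into the range (0, 1). The choice of sigmoid is an assumption."""
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases):
    """Compute the outputs of one fully connected layer.

    weights[j][i] is the weight of the connection from input unit i
    to unit j of this layer; biases[j] is an assumed extra offset.
    """
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs

# Hypothetical values: 3 input units feeding a layer of 2 units.
inputs = [1.0, 0.0, -1.0]
weights = [[0.5, -0.2, 0.1],
           [0.0, 0.3, 0.3]]
biases = [0.0, 0.1]
activations = dense_layer(inputs, weights, biases)
```

Each unit computes a weighted sum of every value in the previous layer, which is what "connected to all the units" means in practice.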
5. Perceptron
It is the simplest ANN, invented by Frank Rosenblatt in 1957.
It has an input and an output layer, and it can learn to
discriminate only linearly separable classes.
http://playground.tensorflow.org/
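As a rough sketch of the perceptron, the code below trains a single output unit with the classic perceptron learning rule. The function names, the learning rate and the number of epochs are illustrative assumptions. AND is linearly separable and is learned; XOR is not, so the same procedure cannot get all four cases right.

```python
def predict(weights, bias, x):
    """Output 1 if the weighted sum of the inputs is positive, else 0."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=10, learning_rate=1.0):
    """Perceptron rule: nudge the weights only when a prediction is wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# AND is linearly separable, so training converges.
and_samples = list(zip(inputs, [0, 0, 0, 1]))
w_and, b_and = train_perceptron(and_samples)
and_outputs = [predict(w_and, b_and, x) for x in inputs]

# XOR is not linearly separable, so at least one case stays wrong.
xor_samples = list(zip(inputs, [0, 1, 1, 0]))
w_xor, b_xor = train_perceptron(xor_samples)
xor_outputs = [predict(w_xor, b_xor, x) for x in inputs]
```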
6. Multilayer Perceptron
The Multilayer Perceptron adds one more layer to the
perceptron. The intermediate layer is called the hidden layer.
The additional layer allows the network to solve the XOR
problem.
http://playground.tensorflow.org/
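To see why one hidden layer is enough for XOR, here is a small sketch with hand-picked weights (an assumption made for clarity; in practice the weights would be learned). One hidden unit behaves like OR, the other like AND, and the output unit fires when OR is true but AND is not.

```python
def step(x):
    """Threshold activation: 1 for positive input, 0 otherwise."""
    return 1 if x > 0 else 0

def layer(inputs, weights, biases):
    """Apply one fully connected layer with a step activation."""
    return [step(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
            for unit_weights, bias in zip(weights, biases)]

# Hand-picked (not learned) parameters, chosen for illustration.
HIDDEN_WEIGHTS = [[1.0, 1.0],   # hidden unit 0 acts like OR
                  [1.0, 1.0]]   # hidden unit 1 acts like AND
HIDDEN_BIASES = [-0.5, -1.5]
OUTPUT_WEIGHTS = [[1.0, -1.0]]  # OR and not AND
OUTPUT_BIASES = [-0.5]

def xor_network(x1, x2):
    hidden = layer([x1, x2], HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]

xor_table = [xor_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```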
7. Deep Neural Networks
If the ANN has more than one hidden layer then it is a deep
neural network (DNN). A DNN can model more complex
spaces but is harder to train. Only recently, thanks to
Graphics Processing Units (GPUs), large datasets and new
training techniques, has it become possible to use deep networks.
8. Convolutional Neural Networks
Nowadays the most widely used deep architecture is the
Convolutional Neural Network (CNN). The CNN takes as
input a three-dimensional image matrix, which is processed
with kernels and subsampling operations. Training searches
for the kernels that give the best prediction.
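The two operations named above can be sketched in plain Python. This is a simplified illustration under stated assumptions: a single-channel image instead of a full three-dimensional one, a hand-written edge-detecting kernel instead of a learned one, and 2x2 max pooling as the subsampling step.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most CNN libraries, this is technically a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool2x2(feature_map):
    """Subsample by keeping the maximum of each 2x2 block."""
    return [[max(feature_map[r][c], feature_map[r][c + 1],
                 feature_map[r + 1][c], feature_map[r + 1][c + 1])
             for c in range(0, len(feature_map[0]) - 1, 2)]
            for r in range(0, len(feature_map) - 1, 2)]

# Hypothetical 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-written kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = convolve2d(image, kernel)   # 4x4 map, high where the edge is
pooled = max_pool2x2(feature_map)         # 2x2 map after subsampling
```

The feature map is largest exactly where the edge sits, and pooling keeps that response while shrinking the map.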
10. Training: Backpropagation
Training a Deep Neural Network is complicated and can require a long time
(days) to reach convergence.
The standard approach for training a DNN is backpropagation, which treats
learning as an optimisation problem. At each epoch the weights are
adjusted in order to obtain an output which is more similar to the target.
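A minimal sketch of this idea, assuming a tiny network with one tanh hidden unit, one linear output unit, a squared-error loss and plain gradient descent (all of these are choices made for the example). The backward pass applies the chain rule from the output back to the first weight, and each epoch moves the output closer to the target.

```python
import math

def forward(w1, w2, x):
    hidden = math.tanh(w1 * x)   # hidden unit activation
    output = w2 * hidden         # linear output unit
    return hidden, output

def train(x, target, w1=0.5, w2=0.5, learning_rate=0.1, epochs=200):
    """Train on a single example; one loop iteration plays the role of an epoch."""
    losses = []
    for _ in range(epochs):
        hidden, output = forward(w1, w2, x)
        error = output - target
        losses.append(0.5 * error ** 2)

        # Backward pass: chain rule from the loss back to each weight.
        grad_w2 = error * hidden
        grad_hidden = error * w2
        grad_w1 = grad_hidden * (1.0 - hidden ** 2) * x

        # Gradient descent: move each weight against its gradient.
        w1 -= learning_rate * grad_w1
        w2 -= learning_rate * grad_w2
    return w1, w2, losses

# Hypothetical training example: input 1.0 should map to 0.5.
w1, w2, losses = train(x=1.0, target=0.5)
final_output = forward(w1, w2, 1.0)[1]
```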
12. Tools for Deep Learning
Library      Language      Platforms
TensorFlow   C++, Python   Linux, Mac, Windows (next release)
Keras        Python        Linux, Mac, Windows
Caffe        C++           Linux, Mac, Windows, Android
Theano       Python        Linux, Mac, Windows
CNTK         C++           Linux, Windows
Torch        C, Lua        Linux, Mac, Windows, Android, iOS
13. TensorFlow
TensorFlow is an open source software library for numerical computation
using data flow graphs.
Nodes in the graph represent mathematical operations, while the graph
edges represent the multidimensional data arrays (tensors) communicated
between them.
It deploys computation to one or more CPUs or GPUs in a desktop, server,
or mobile device with a single API.
It was developed by researchers and engineers working on the Google Brain Team.
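The data flow graph idea can be illustrated with a toy evaluator in plain Python. This is not TensorFlow's API: the `Node` class and the `constant`, `matmul`, `add` and `run` helpers are hypothetical names invented for this sketch. It only shows that building the graph and running it are two separate steps, with tensors (here nested lists) flowing along the edges.

```python
class Node:
    """A graph node: an operation plus the nodes whose outputs feed it."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

def constant(value):
    return Node("const", value=value)

def matmul(a, b):
    return Node("matmul", (a, b))

def add(a, b):
    return Node("add", (a, b))

def run(node):
    """Evaluate a node by first evaluating the nodes it depends on."""
    if node.op == "const":
        return node.value
    left, right = (run(n) for n in node.inputs)
    if node.op == "matmul":
        return [[sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
                for row in left]
    if node.op == "add":
        return [[l + r for l, r in zip(left_row, right_row)]
                for left_row, right_row in zip(left, right)]
    raise ValueError("unknown op: " + node.op)

# Build the graph for y = x * w + b; nothing is computed yet.
x = constant([[1.0, 2.0]])
w = constant([[3.0], [4.0]])
b = constant([[0.5]])
y = add(matmul(x, w), b)

# Only now is the computation carried out.
result = run(y)
```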
17. Case Study: Image classification
Given an input image the CNN returns the probability that
the object present in that image belongs to a specific
category.
CIFAR-10 is an established computer-vision dataset used
for object recognition. It is a subset of the 80 million tiny
images dataset and consists of 60,000 32x32 color images,
each containing one of 10 object classes.
The best results have been achieved using CNNs. In
particular, a recent fractional max-pooling method achieved
an accuracy of 96.53%.
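One common way to turn the raw outputs of a network into per-category probabilities is a softmax. The source does not say which function is used, so treat this as an assumption. The class names below are the ten CIFAR-10 categories, while the scores are made-up values standing in for the output of a trained CNN.

```python
import math

CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]

def softmax(scores):
    """Convert arbitrary real scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum avoids overflow without changing the result.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores, one per class, for a single input image.
scores = [0.1, 0.2, 0.0, 2.5, 0.3, 1.8, 0.1, 0.2, 0.0, 0.1]
probabilities = softmax(scores)

best_index = probabilities.index(max(probabilities))
predicted_class = CIFAR10_CLASSES[best_index]
```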
19. Case Study: Deep Reinforcement Learning
Reinforcement Learning: Given an environment with a set of states and an
agent, the agent can move from state to state by performing actions. Executing an
action in a specific state provides the agent with a reward. Using RL it is possible to
find the policy that maximises the long-term reward.
Deep Reinforcement Learning: Google DeepMind developed the first artificial
agents to achieve human-level performance across many challenging domains. The
key idea was to use deep neural networks to predict the total reward. The approach achieved
human-level performance in almost half of the 50 games to which it was applied,
far beyond any previous method.
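The states, actions, rewards and policy described above can be made concrete with a small tabular Q-learning sketch. The environment is a hypothetical five-state corridor with a reward at the right end, and the discount factor and learning rate are assumed values. To keep the result deterministic, the sketch sweeps over every state-action pair instead of letting the agent explore at random. A deep RL agent would replace the table with a neural network that predicts the same quantities.

```python
N_STATES = 5          # states 0..4; state 4 is the goal (terminal)
ACTIONS = (-1, +1)    # move left or move right
GAMMA = 0.9           # discount factor for future rewards (assumed)
ALPHA = 0.5           # learning rate (assumed)

def move(state, action):
    """Return (next_state, reward) for a deterministic corridor world."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

# Q-table: estimated long-term reward of each action in each state.
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(100):
    for s in range(N_STATES - 1):        # the terminal state is never updated
        for a in ACTIONS:
            next_state, reward = move(s, a)
            best_next = max(q[(next_state, b)] for b in ACTIONS)
            target = reward + GAMMA * best_next
            q[(s, a)] += ALPHA * (target - q[(s, a)])

# The policy picks, in each state, the action with the highest estimate.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

In this corridor the learned policy is to move right from every state, since that is the shortest path to the reward.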
20. Case Study: Head Pose Estimation in the Wild
“Head Pose Estimation in the Wild Using Convolutional Neural Networks and Adaptive Gradient Methods”
Patacchiola, M., Cangelosi, A. (under review), 2016
We investigated the use of CNNs for head pose estimation. It is
a hard classification task. We used three different CNNs to
estimate roll, pitch and yaw separately. We obtained the
best results ever reported on three datasets.
21. Case Study: Head Pose Estimation in the Wild
“Head Pose Estimation in the Wild Using Convolutional Neural Networks and Adaptive Gradient Methods”
Patacchiola, M., Cangelosi, A. (under review), 2016
Deepgaze is a Python library for people detection and
tracking which uses Convolutional Neural Networks (CNNs) to
estimate the Focus of Attention (FOA) of users.
The FOA can be approximately estimated by finding the head
orientation. This is particularly useful when the eyes are
covered, or when the user is too far from the camera to capture
the eye region at a good resolution.
When the eye region is visible it is possible to estimate the
gaze direction, which is much more informative and can give
a good indication of the FOA.
https://github.com/mpatacchiola/deepgaze