Louis Monier
@louis_monier
https://www.linkedin.com/in/louismonier
Deep Learning Basics
Go Deep or Go Home
Gregory Renard
@redo
https://www.linkedin.com/in/gregoryrenard
Class 1 - Q1 - 2016
Supervised Learning
[Figure: a training set of labeled images — cat, hat, dog, cat, mug, … — and a new, unlabeled image whose prediction is "dog (89%)".]
Training takes days; testing (prediction) takes milliseconds.
Single Neuron to Neural Networks
Single Neuron
[Diagram: a single neuron. The inputs (xpos, ypos) are each multiplied by a weight (w1, w2) and combined with w3 — the linear combination — then passed through a nonlinearity to produce a score and a choice. In the example, 3.4906 becomes 0.9981, read as "red". Sample input values shown: 3.45, 1.21, -0.314, 1.59, 2.65.]
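A minimal sketch of that single neuron in Python/NumPy. The weights and inputs below are illustrative (they are not read off the slide), but the slide's numbers are consistent with tanh as the nonlinearity: tanh(3.4906) ≈ 0.9981.

```python
import numpy as np

def neuron(inputs, weights, bias):
    """One neuron: a linear combination of the inputs followed by a nonlinearity."""
    z = np.dot(weights, inputs) + bias   # linear combination
    return np.tanh(z)                    # nonlinearity

# Illustrative values (not from the slide): two inputs, xpos and ypos.
xpos, ypos = 3.45, 1.21
weights = np.array([0.9, 0.4])
bias = -0.1

score = neuron(np.array([xpos, ypos]), weights, bias)
print(score)                 # a score between -1 and 1, used to make the choice

print(np.tanh(3.4906))       # ≈ 0.9981 — the slide's example output
```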
More Neurons, More Layers: Deep Learning
[Diagram: a deep network. Three inputs (input1, input2, input3) feed the input layer, followed by hidden layers and an output layer; successive layers detect increasingly abstract features — circle, iris, eye, face — and the output reads "89% likely to be a face".]
How to find the best values for the weights?
[Plot: the error as a function of a parameter w, with two points a and b marked on the curve.]
Define error = |expected − computed|².
Find the parameters that minimize the average error.
The gradient of the error with respect to w, ∂error/∂w, is the slope of this curve.
Update: w ← w − (learning rate) × ∂error/∂w — take a step downhill.
[Plot: the error surface of a real network is an egg carton in 1 million dimensions; we want to get down to one of its low points.]
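As a sketch of what one update looks like (the one-parameter model, data, and learning rate below are my own illustration, not the slides'):

```python
# Gradient descent on a single parameter w.
# Model: computed = w * x; error = (expected - computed)**2.
x, expected = 2.0, 3.0
w = 0.5                    # initial guess
learning_rate = 0.1

for step in range(5):
    computed = w * x
    error = (expected - computed) ** 2
    grad = 2 * (computed - expected) * x    # d(error)/dw
    w -= learning_rate * grad               # take a step downhill
    print(step, round(error, 4), round(w, 4))
```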
Backpropagation: Assigning Blame
[Diagram: a small network with inputs input1, input2, input3; weights w1, w2, w3 with bias b, and w1’, w2’, w3’ with bias b’; an intermediate unit p’ = 0.89 and the prediction p = 0.3; the label is 1, so the loss is 0.49.]
1) For the loss to go down by 0.1, p must go up by 0.05.
   For p to go up by 0.05, w1 must go up by 0.09.
   For p to go up by 0.05, w2 must go down by 0.12.
   ...
2) For the loss to go down by 0.1, p’ must go down by 0.01.
3) For p’ to go down by 0.01, w2’ must go up by 0.04.
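The same blame-assignment, sketched as code for a single sigmoid neuron with squared-error loss (a made-up example, not the network on the slide): the chain rule tells each weight how much the loss reacts to it.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 0.8])   # input1, input2, input3 (illustrative values)
w = np.array([0.1, 0.4, -0.3])   # w1, w2, w3
b = 0.2
label = 1.0

# Forward pass
z = np.dot(w, x) + b
p = sigmoid(z)                   # prediction
loss = (p - label) ** 2

# Backward pass (backpropagation): chain the sensitivities together.
dloss_dp = 2 * (p - label)       # how the loss reacts to the prediction
dp_dz = p * (1 - p)              # how the prediction reacts to the weighted sum
grad_w = dloss_dp * dp_dz * x    # how the loss reacts to each weight
grad_b = dloss_dp * dp_dz        # ...and to the bias

print("loss:", loss, "gradients:", grad_w, grad_b)
```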
Stochastic Gradient Descent (SGD) and Minibatch
It’s too expensive to compute the gradient on all inputs to take a step.
We prefer a quick approximation.
Use a small random sample of inputs (minibatch) to compute the gradient.
Apply the gradient to all the weights.
Welcome to SGD, much more efficient than regular GD!
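A minimal minibatch-SGD sketch (the linear model, synthetic data, batch size, and learning rate are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=1000)
y = 3 * X + 1 + 0.1 * rng.normal(size=1000)     # synthetic data: y ≈ 3x + 1

w, b = 0.0, 0.0
learning_rate, batch_size = 0.1, 32

for epoch in range(5):
    order = rng.permutation(len(X))             # shuffle once per epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]   # small random sample: a minibatch
        err = w * X[idx] + b - y[idx]
        grad_w = 2 * np.mean(err * X[idx])      # gradient estimated on the minibatch
        grad_b = 2 * np.mean(err)
        w -= learning_rate * grad_w             # apply the update to the weights
        b -= learning_rate * grad_b
    print(epoch, round(w, 3), round(b, 3))
```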
Usually things would get intense at this point...
Chain rule
Partial derivatives
Hessian
Sum over paths
Jacobian matrix
Local gradient
Credit: Richard Socher, Stanford class CS224d
Learning a new representation
Credit: http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/
Training Data
Train. Validate. Test.
Training set
Validation set
Cross-validation
Testing set
[Diagram: 4-fold cross-validation — the data is split into folds 1, 2, 3, 4, and each round holds a different fold out for validation.]
Shuffle the data; one full pass through the training set is one epoch.
Never ever, ever, ever use the testing set for training!
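A quick sketch of the three-way split (the 80/10/10 proportions are a common choice, not something the slides prescribe):

```python
import numpy as np

def split_dataset(X, y, seed=0):
    """Shuffle, then split into training / validation / testing sets (80/10/10)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))                 # shuffle!
    n_train, n_val = int(0.8 * len(X)), int(0.1 * len(X))
    train, val = idx[:n_train], idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]                  # never used for training
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])
```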
Tensors
Tensors are not scary
Number: 0D
Vector: 1D
Matrix: 2D
Tensor: 3D, 4D, … array of numbers
This image is a 3264 × 2448 × 3 tensor; each pixel holds three numbers, e.g. [r=0.45, g=0.84, b=0.76].
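In NumPy terms this is just the number of dimensions of an array (the arrays below are illustrative):

```python
import numpy as np

number = np.array(7.0)                    # 0D tensor (a scalar)
vector = np.array([3.45, 1.21, -0.314])   # 1D tensor
matrix = np.zeros((3, 4))                 # 2D tensor
image  = np.zeros((3264, 2448, 3))        # 3D tensor: height x width x RGB

print(number.ndim, vector.ndim, matrix.ndim, image.ndim)   # 0 1 2 3
image[0, 0] = [0.45, 0.84, 0.76]          # one pixel: [r, g, b]
```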
Different types of networks / layers
Fully-Connected Networks
Per layer, every input connects to every neuron.
[Diagram: the same fully-connected network as before — inputs input1, input2, input3 feed the input layer, then hidden layers and the output layer; intermediate features (circle, iris, eye, face) lead to "89% likely to be a face".]
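A hedged Keras sketch of such a fully-connected network (layer sizes are arbitrary; only the three inputs and the single sigmoid output mirror the diagram):

```python
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(16, activation='relu', input_shape=(3,)))   # every input feeds every neuron
model.add(Dense(16, activation='relu'))                     # hidden layer
model.add(Dense(1, activation='sigmoid'))                   # e.g. probability of "face"
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
```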
Recurrent Neural Networks (RNN)
Appropriate when inputs are sequences.
We’ll cover in detail when we work on text.
[Diagram: the sequence "this is not a pipe" processed one word per time step (t=0 … t=4), shown together with the reversed sequence "pipe a not is this".]
Convolutional Neural Networks (ConvNets)
Appropriate for image tasks, but not limited to them.
We’ll cover in detail in the next session.
Hyperparameters
Activations: a zoo of nonlinear functions
Initializations: Distribution of initial weights. Not all zeros.
Optimizers: driving the gradient descent
Objectives: comparing a prediction to the truth
Regularizers: forcing the function we learn to remain “simple”
...and many more
Activations
Nonlinear functions
Sigmoid, tanh, hard tanh and softsign
ReLU (Rectified Linear Unit) and Leaky ReLU
Softplus and Exponential Linear Unit (ELU)
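Several of these are one-liners in NumPy (the leaky-ReLU slope and the ELU α below are common defaults, shown for illustration):

```python
import numpy as np

def sigmoid(x):            return 1.0 / (1.0 + np.exp(-x))
def tanh(x):               return np.tanh(x)
def relu(x):               return np.maximum(0.0, x)
def leaky_relu(x, a=0.01): return np.where(x > 0, x, a * x)
def softplus(x):           return np.log1p(np.exp(x))
def elu(x, alpha=1.0):     return np.where(x > 0, x, alpha * (np.exp(x) - 1))
```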
Softmax
Softmax turns raw scores into probabilities:
x1 = 6.78  → dog = 0.92536
x2 = 4.25  → cat = 0.07371
x3 = -0.16 → car = 0.00089
x4 = -3.5  → hat = 0.00003
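The numbers above can be reproduced with a few lines of NumPy; a minimal softmax:

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(scores - np.max(scores))   # subtract the max for numerical stability
    return e / e.sum()

raw = np.array([6.78, 4.25, -0.16, -3.5])            # x1 .. x4
print(softmax(raw))   # ≈ [0.92536, 0.07371, 0.00089, 0.00003] → dog, cat, car, hat
```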
Optimizers
Various algorithms for driving Gradient Descent
Tricks to speed up SGD.
Learn the learning rate.
Credit: Alec Radford
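One such trick, sketched below, is momentum; the 0.9 coefficient is a common default, and the random gradients merely stand in for real minibatch gradients:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(3)                  # parameters
velocity = np.zeros(3)           # running memory of past gradients
learning_rate, momentum = 0.01, 0.9

for step in range(100):
    grad = rng.normal(size=3)                          # stand-in for a minibatch gradient
    velocity = momentum * velocity - learning_rate * grad
    w = w + velocity                                   # the step keeps part of its previous direction
```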
Cost/Loss/Objective Functions
Cross-entropy Loss
[Same softmax example: raw scores x1 = 6.78, x2 = 4.25, x3 = -0.16, x4 = -3.5 become probabilities dog = 0.92536, cat = 0.07371, car = 0.00089, hat = 0.00003.]
The prediction is compared to the label (the truth); the loss is the negative log of the probability assigned to the true class.
If the label is dog: loss = 0.078
If the label is cat: loss = 2.607
If the label is car: loss = 7.024
If the label is hat: loss = 10.414
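A quick check of the table (small differences in the last digit come from the rounded probabilities):

```python
import numpy as np

probs = {'dog': 0.92536, 'cat': 0.07371, 'car': 0.00089, 'hat': 0.00003}
for label, p in probs.items():
    print(label, round(-np.log(p), 3))   # dog ≈ 0.078, cat ≈ 2.608, car ≈ 7.024, hat ≈ 10.414
```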
Regularization
Preventing Overfitting
Overfitting
[Figure: three fits to the same data — very simple (might not predict well), just right, and overfitting (rote memorization; will not generalize well).]
Regularization: Avoiding Overfitting
Idea: keep the functions “simple” by constraining the weights.
Loss = Error(prediction, truth) + L(keep solution simple)
L2 keeps all parameters medium-sized.
L1 kills many parameters (drives them to zero).
L1 + L2 together is sometimes used.
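A small sketch of what the extra penalty term looks like (the λ values and weights are arbitrary):

```python
import numpy as np

def regularized_loss(error, weights, l1=0.0, l2=0.0):
    """Loss = Error(prediction, truth) + a penalty that keeps the solution simple."""
    penalty = l1 * np.sum(np.abs(weights)) + l2 * np.sum(weights ** 2)
    return error + penalty

w = np.array([0.5, -1.2, 3.0])
print(regularized_loss(0.42, w, l2=0.01))   # L2: discourages very large weights
print(regularized_loss(0.42, w, l1=0.01))   # L1: pushes many weights toward zero
```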
Regularization: Dropout
At every step during training, ignore the output of a fraction p of the neurons.
p = 0.5 is a good default.
[Diagram: the network with inputs input1, input2, input3, with some neurons randomly dropped.]
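A minimal NumPy sketch of dropout at training time (the activations are made up; rescaling by 1/(1−p), so-called inverted dropout, is a common convention rather than something the slide specifies). In Keras, the equivalent is simply adding a Dropout(0.5) layer.

```python
import numpy as np

def dropout(activations, p=0.5, training=True):
    """During training, randomly ignore the output of a fraction p of the neurons."""
    if not training:
        return activations                              # at test time, keep every neuron
    mask = np.random.rand(*activations.shape) >= p      # keep each neuron with probability 1 - p
    return activations * mask / (1.0 - p)               # rescale so the expected output is unchanged

h = np.array([0.3, 1.2, 0.7, 0.05, 2.1])                # illustrative hidden-layer outputs
print(dropout(h, p=0.5))
```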
One Last Bit of Wisdom
Workshop : Keras & Addition RNN
https://github.com/holbertonschool/deep-learning/tree/master/Class%20%230
