Summary:
There are three parts in this presentation.
A. Why do we need Convolutional Neural Networks
- Problems we face today
- Solutions for problems
B. LeNet Overview
- The origin of LeNet
- The result after using LeNet model
C. LeNet Techniques
- LeNet structure
- Function of every layer
The GitHub repository linked below contains my reimplementation of LeNet, built without any deep learning package. I hope it helps you better understand the basics of Convolutional Neural Networks.
GitHub Link : https://github.com/HiCraigChen/LeNet
LinkedIn : https://www.linkedin.com/in/YungKueiChen
11. Introduction
Yann LeCun
• Director of AI Research, Facebook
• Main research interest: machine learning, particularly how it applies to perception, and more particularly to visual perception.
• LeNet Paper:
Gradient-Based Learning Applied to Document Recognition.
Source : Yann LeCun, http://yann.lecun.com/
15. Introduction
• Revolutionary
Even without traditional machine learning concepts, the result* (error rate: 0.95%) is the best among the machine learning methods compared.
*LeNet-5, source : Yann LeCun, http://yann.lecun.com/exdb/mnist/
16. Introduction
[Bar chart: TEST ERROR RATE (%) (the lower the better), axis from 0 to 14, comparing the methods below]
- linear classifier (1-layer NN)
- K-nearest-neighbors, Euclidean (L2)
- 2-layer NN, 300 hidden units, MSE
- SVM, Gaussian Kernel
- Convolutional net LeNet-5
17. Overview
Source : [LeCun et al., 1998]: Gradient-Based Learning Applied to Document Recognition, p. 7
18. Input
Source : [LeCun et al., 1998]: Gradient-Based Learning Applied to Document Recognition, p. 7
21. Input Layer
Source : http://yann.lecun.com/exdb/mnist/
Data : MNIST handwritten digits
training set : 60,000 examples
test set : 10,000 examples
Size : 28x28 pixels
Color : Grayscale
Range : 0~255 (pixel intensity)
23. Input Layer – Constant (Zero) Padding
Source : http://xrds.acm.org/blog/2016/06/convolutional-neural-networks-cnns-illustrated-explanation/
1. To make sure the input data fits our network structure.
2. To give the edge elements more chances to be filtered.
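The padding step can be sketched in NumPy as follows. This is a minimal illustration, not the code from the repository: the helper name `zero_pad` and the pad width of 2 are assumptions, chosen because the LeNet-5 paper describes a 32x32 input while MNIST images are 28x28.

```python
import numpy as np

def zero_pad(image, pad=2):
    """Surround a 2-D image with `pad` rows/columns of zeros on every side.

    `pad=2` is an assumption: it turns a 28x28 MNIST digit into 32x32.
    """
    return np.pad(image, pad_width=pad, mode="constant", constant_values=0)

# Dummy all-white 28x28 image standing in for an MNIST digit.
image = np.full((28, 28), 255, dtype=np.uint8)
padded = zero_pad(image)

print(padded.shape)  # (32, 32)
```

With the extra border, a kernel centered near the original edge pixels still fits inside the array, so those pixels take part in more filter positions.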
25. Convolutional Layer
Source : [LeCun et al., 1998]: Gradient-Based Learning Applied to Document Recognition, p. 7
26. Convolutional Layer – Function
Extract features from the input image
Source : An Intuitive Explanation of Convolutional Neural Networks
https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
27. Convolution
Convolution is a mathematical operation on two functions to
produce a third function, that is typically viewed as a
modified version of one of the original functions.
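For reference, the standard textbook definitions can be written as below. The symbols are illustrative and do not come from the slides: f and g are the two functions, I is an image, and K is a kernel.

```latex
% Continuous convolution of two functions f and g
(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau

% Discrete 2-D convolution of an image I with a kernel K
(I * K)(i, j) = \sum_{m} \sum_{n} I(i - m,\, j - n)\, K(m, n)
```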
28. Convolutional Layer Overview
Convolutional Layer = Multiply function + Sum function
[Figure: layer input → multiply by kernel → sum → layer output]
Source : https://mlnotebook.github.io/post/CNN1/
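The "multiply, then sum" idea can be sketched in plain NumPy as below. This is a minimal sketch under stated assumptions: the function name `conv2d_valid`, stride 1, and no padding are illustrative choices, and, as in most CNN code, the kernel is not flipped (strictly speaking this is cross-correlation).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding).

    At each position: multiply the patch by the kernel element-wise,
    then sum the products into a single output value.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)  # multiply + sum
    return out

# Toy 4x4 input and a 2x2 kernel of ones (each output is the sum of a patch).
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2))
feature_map = conv2d_valid(image, kernel)

print(feature_map.shape)  # (3, 3)
```

The explicit loops are slow but keep the two steps visible, which is the point of the slide.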
41. Pooling Layer – Function
Reduces the dimensionality of each feature map
but retains the most important information
Source : An Intuitive Explanation of Convolutional Neural Networks
https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
42. Pooling Layer Overview
Source : Using Convolutional Neural Networks for Image Recognition
https://www.embedded-vision.com/platinum-members/cadence/embedded-vision-training/documents/pages/neuralnetworksimagerecognition#3
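A pooling step can be sketched as follows. The helper `pool2d`, the 2x2 window, and the non-overlapping stride are assumptions for illustration. Both max and mean pooling are shown, since the slides do not say which variant is used. The LeNet-5 paper describes an averaging-style subsampling.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Downsample a 2-D feature map with non-overlapping size x size windows."""
    h_out = feature_map.shape[0] // size
    w_out = feature_map.shape[1] // size
    # Drop rows/columns that do not fill a complete window.
    cropped = feature_map[:h_out * size, :w_out * size]
    # Axes 1 and 3 index the position inside each window.
    blocks = cropped.reshape(h_out, size, w_out, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "mean":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'mean'")

fm = np.arange(16, dtype=float).reshape(4, 4)
pooled_max = pool2d(fm, mode="max")
pooled_mean = pool2d(fm, mode="mean")

print(pooled_max.shape)  # (2, 2)
```

Each 4x4 map shrinks to 2x2, so the following layers see a quarter of the values while the strongest (or average) response in each window is kept.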
54. Output
Source : [LeCun et al., 1998]: Gradient-Based Learning Applied to Document Recognition, p. 7
55. Output – Loss Function (Least Squared Error)
Output: $Y = \mathrm{SUM}\left((X^{T} - W)^{2}\right)$
Loss Function (Cost Function):
Evaluates the difference between the predicted value and the correct answer.
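One way to read the output formula is sketched below. It assumes that each row of W is a reference vector for one class and that the class with the smallest squared error is the prediction. The function name `squared_error_output` and the tiny 3-class example are hypothetical.

```python
import numpy as np

def squared_error_output(x, W):
    """Return one value per class: sum((x - W[k]) ** 2) for each row k of W."""
    diff = x[np.newaxis, :] - W        # broadcast x against every row of W
    return np.sum(diff ** 2, axis=1)

# Hypothetical example: 3 classes, 3 features per input.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
x = np.array([0.9, 0.1, 0.0])

Y = squared_error_output(x, W)
predicted = int(np.argmin(Y))          # smallest error wins

print(predicted)  # 0
```

Here `x` is closest to the first row of `W`, so class 0 has the lowest error.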
56. Output – One hot encoding
Example label: 9
Make sure the difference between any pair of digit labels is the same.
57. Output – One hot encoding
With raw integer labels, some digits look closer than others:
9-8 = 1 Closer!!!
9-5 = 4 Farther!!
Make sure the difference between any pair of digit labels is the same.
58. Output – One hot encoding
[Table: the one-hot vector for each digit label, 0 through 9]
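One-hot encoding can be sketched as below. The helper `one_hot` is illustrative, and placing the 1 at the index equal to the digit is an assumed (though common) convention. The example checks the property named on the slides: every pair of different labels is equally far apart.

```python
import numpy as np

def one_hot(label, num_classes=10):
    """Return a vector of zeros with a single 1 at position `label`."""
    vec = np.zeros(num_classes)
    vec[label] = 1.0
    return vec

nine, eight, five = one_hot(9), one_hot(8), one_hot(5)

# Squared distance between encoded labels no longer depends on the digits.
dist_9_8 = np.sum((nine - eight) ** 2)
dist_9_5 = np.sum((nine - five) ** 2)

print(dist_9_8, dist_9_5)  # 2.0 2.0
```

As integers, 9 is closer to 8 than to 5. As one-hot vectors, both pairs differ in exactly two positions, so the loss treats every wrong digit the same.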
Most of the features from the convolutional and pooling layers may be good for the classification task, but combinations of those features might be even better.