1
A Study of Deep Learning Models and
Bayesian Statistics
By: Pritish Yuvraj
Summer Research Fellow
Indian Statistical Institute, Kolkata
Guide: Prof. Rajat Kumar De
Machine Intelligence Unit
Indian Statistical Institute, Kolkata
3
Table of Contents
Serial No:  Content                        Page No:
1)          History                        3 - 4
2)          Restricted Boltzmann Machine   5 - 11
3)          Deep Belief Network            12 - 18
4)          Bayesian Statistics            19
5)          Conclusion                     20
4
1. History
● The ultimate aim of Artificial Intelligence is to reach a point where machines can
accomplish tasks that are dangerous, intricate, and arduous for human beings to
perform. Researchers are working in this direction and have invented multiple
algorithms, and with the colossal amount of money and manpower dedicated to
Artificial Intelligence, the field is advancing very fast.
● The history of Artificial Intelligence dates back to the 1940s when, as the author Pamela
McCorduck describes it, researchers attempted to cast human thinking as the mechanical
manipulation of symbols. As time passed, algorithms grounded in statistics appeared
that could "theoretically" solve real-life problems. Unfortunately, these solutions
were not applicable to real-life situations: the early Artificially Intelligent programs
failed to account for the importance of the environment, and hence failed in real life.
This paved the way for the advent of Machine Learning.
● Machine Learning, a subfield of Artificial Intelligence, evolved from studies of Pattern
Recognition and Computational Learning Theory. In 1959, Arthur Samuel defined machine
learning as a "field of study that gives computers the ability to learn without being
explicitly programmed". Machine Learning was able to overcome this deficiency of AI:
unlike the previous AI applications, ML incorporated a lot of data.
5
A notion took hold that the more data you gather from pragmatic sources, the better your program will
run; the world came to know the importance of "Data". A relevant question to ask here is: why did
Machine Learning do well? Because it was trained on real datasets recorded from sensors, cameras, and
recording devices. Statistical analysis of these data gave good insight into the convoluted patterns
stored in them, and hence Machine Learning researchers were able to carry the learned information over
into real-life applications. Some examples where Machine Learning is actively used are Spam Filtering,
Search Engines, Computer Vision, etc.
● At the present moment, a mammoth amount of data has already been collected and the process is still
continuing. Here comes the glitch: these data are mostly unlabelled. Data Mining and other techniques
perform the task of Unsupervised Learning better than classical Machine Learning. This stirred up an
environment to delve into a complex subfield of Machine Learning, "Deep Learning". Deep Learning had
existed since the 1940s but was never utilized, because:
● 1) Data was not sufficient
● 2) The required computation power was not available
● 3) There was no immediate necessity
● In this project, a study of Deep Learning algorithms was conducted by implementing the algorithms in
C++ using an object-oriented approach, and the results are shared. The following four algorithms were
implemented:
● 1) Restricted Boltzmann Machine
● 2) Deep Belief Network
● 3) Recurrent Neural Network
● 4) Convolutional Neural Network
6
2.1 Restricted Boltzmann Machine
● Found prominence after the
efforts of Geoffrey Hinton
(University of Toronto).
● Unsupervised or supervised,
depending on the application.
● Applications in Dimensionality
Reduction, Classification,
Collaborative Filtering,
Feature Learning, and Topic
Modelling.
7
2.2 RBM: Mathematical Formulas
Energy configuration:
$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j$
Probability distribution (where Z is the partition function):
$P(v, h) = \frac{1}{Z}\, e^{-E(v, h)}$
Probability of V given H (σ is the logistic sigmoid):
$P(v_i = 1 \mid h) = \sigma\big(a_i + \sum_j w_{ij} h_j\big)$
Probability of H given V:
$P(h_j = 1 \mid v) = \sigma\big(b_j + \sum_i w_{ij} v_i\big)$
Weight update:
$\Delta w_{ij} = \varepsilon\,\big(\langle v_i h_j\rangle_{\text{data}} - \langle v_i h_j\rangle_{\text{reconstruction}}\big)$
8
2.3 RBM: Training
● Take a training sample (v) and compute the probabilities of the hidden units.
● Sample a hidden activation vector (h) from this probability distribution.
● Compute the outer product of (v) and (h) (called the positive gradient, p(x)).
● From (h), sample a reconstruction (v') of the visible units, then resample the hidden activations (h')
from this probability distribution (Gibbs sampling).
● Compute the outer product of (v') and (h') (called the negative gradient, q(x)).
● Update the weights based on the difference between the positive gradient and the negative gradient,
as sketched in the code below.
● The aim is to minimize the KL divergence between p(x) and q(x), i.e., to maximize the common area under the two functions.
● In other words, the distributions behind the positive gradient and the negative gradient are made to converge.
● The better the convergence, the better the hidden layer's probability distribution models the input's probability distribution.
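The following is a minimal C++ sketch of one such CD-1 update for a binary RBM, assuming the sigmoid conditionals from section 2.2. All names here (cd1Update, sampleBernoulli, the W[i][j] weight layout) are hypothetical illustrations, not the project's actual code.

```cpp
// Minimal sketch of one CD-1 weight update for a binary RBM.
// Illustrative only: the names (cd1Update, sampleBernoulli, W[i][j])
// are hypothetical and not taken from the project's actual code.
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

static std::mt19937 rng{42};

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Sample a binary value that is 1 with probability p.
int sampleBernoulli(double p) {
    return std::bernoulli_distribution(p)(rng) ? 1 : 0;
}

// One CD-1 update on a single training sample v, with weights W
// (visible x hidden), hidden/visible biases bH/bV, learning rate eps.
void cd1Update(const std::vector<int>& v,
               std::vector<std::vector<double>>& W,
               std::vector<double>& bH, std::vector<double>& bV,
               double eps) {
    const std::size_t nV = v.size(), nH = bH.size();

    // Positive phase: p(h_j = 1 | v), then sample a hidden vector h.
    std::vector<double> pH(nH);
    std::vector<int> h(nH);
    for (std::size_t j = 0; j < nH; ++j) {
        double a = bH[j];
        for (std::size_t i = 0; i < nV; ++i) a += v[i] * W[i][j];
        pH[j] = sigmoid(a);
        h[j] = sampleBernoulli(pH[j]);
    }

    // Negative phase: reconstruct v' from h (one Gibbs step), then
    // recompute the hidden probabilities from the reconstruction.
    std::vector<double> pV(nV);
    std::vector<int> vRec(nV);
    for (std::size_t i = 0; i < nV; ++i) {
        double a = bV[i];
        for (std::size_t j = 0; j < nH; ++j) a += h[j] * W[i][j];
        pV[i] = sigmoid(a);
        vRec[i] = sampleBernoulli(pV[i]);
    }
    std::vector<double> pHRec(nH);
    for (std::size_t j = 0; j < nH; ++j) {
        double a = bH[j];
        for (std::size_t i = 0; i < nV; ++i) a += vRec[i] * W[i][j];
        pHRec[j] = sigmoid(a);
    }

    // Update: positive outer product minus negative outer product.
    for (std::size_t i = 0; i < nV; ++i)
        for (std::size_t j = 0; j < nH; ++j)
            W[i][j] += eps * (v[i] * pH[j] - vRec[i] * pHRec[j]);
    for (std::size_t j = 0; j < nH; ++j) bH[j] += eps * (pH[j] - pHRec[j]);
    for (std::size_t i = 0; i < nV; ++i) bV[i] += eps * (v[i] - pV[i]);
}
```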
9
Experiment 1: RBM
● Example taken from Edwin Chen's blog.
● Results were obtained after implementing the
RBM in C++. The code is available on GitHub
under the profile "Pritish Yuvraj".
● The RBM conducts the experiment in an
unsupervised way: it is not fed the comments.
Based only on the inputs, it has to create some
sort of pattern. The only hint it has is that it
needs to create 2 separate groups from the
given inputs.
        Harry Potter  Avatar  LOTR  Gladiator  Titanic  Glitter  Comments
Alice        1           1      1       0         0        0     Big SF/Fantasy fan
Bob          1           0      1       0         0        0     SF/Fantasy fan, but not Avatar
Carol        1           1      1       0         0        0     Big SF/Fantasy fan
David        0           0      1       1         1        0     Big Oscar-winners fan
Eric         0           0      1       1         1        0     Oscar fan, except Titanic
Fred         0           0      1       1         1        0     Big Oscar-winners fan
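For concreteness, a hypothetical driver for this setup, built on the cd1Update sketch (and its includes) from section 2.3. It is illustrative only; the project's actual C++ code is on GitHub.

```cpp
// Hypothetical driver for Experiment 1, reusing the cd1Update sketch
// from section 2.3: 6 visible units (movies), 2 hidden units,
// 50000 epochs over the preference matrix above.
int main() {
    const std::vector<std::vector<int>> data = {
        {1, 1, 1, 0, 0, 0},   // Alice
        {1, 0, 1, 0, 0, 0},   // Bob
        {1, 1, 1, 0, 0, 0},   // Carol
        {0, 0, 1, 1, 1, 0},   // David
        {0, 0, 1, 1, 1, 0},   // Eric
        {0, 0, 1, 1, 1, 0}};  // Fred
    std::vector<std::vector<double>> W(6, std::vector<double>(2, 0.0));
    std::vector<double> bH(2, 0.0), bV(6, 0.0);
    for (int epoch = 0; epoch < 50000; ++epoch)
        for (const auto& v : data) cd1Update(v, W, bH, bV, 0.1);
    // The sign pattern of W's two columns then groups the movies,
    // as in the weight table on the next slide.
    return 0;
}
```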
10
Conclusion 1: RBM

              Hidden Unit 1   Hidden Unit 2
Harry Potter     -7.70958       6.260625
Avatar          -13.7941        3.09608
LOTR3             8.89752       4.491787
Gladiator         7.87261      -6.69001
Titanic           7.87356      -6.73726
Glitter          -8.50164      -5.07191

● Result inference:
● 1) Harry Potter and Avatar form one group
(Science Fiction / Fantasy movies).
● 2) Gladiator and Titanic form another group
(Oscar winners).
● 3) LOTR3 and Glitter do not clearly belong to
either of the two groups.
● The experiment was conducted with 6 visible
units (inputs) and 2 hidden units (outputs).
● Number of epochs = 50,000
● Final error after iterating over all the
epochs = 5.683 × 10^-6
11
Experiment 2: RBM
● Applied the Restricted Boltzmann Machine
algorithm to a dataset made available by Yale
University, known as the "Yale Face Database".
● Reference: P. Belhumeur, J. Hespanha, D.
Kriegman, "Eigenfaces vs. Fisherfaces:
Recognition Using Class Specific Linear
Projection," IEEE Transactions on Pattern
Analysis and Machine Intelligence, July
1997, pp. 711-720.
● The next slide shows the conclusion of
the experiment: we try to generate an image
based on whatever the RBM has learned. The
number of epochs and hidden units used are
mentioned on the next slide.
12
Result: Dreaming Phase of an RBM
(The generated face images appeared on this slide; the three configurations were:)
● Hidden units: 100; Epochs: 5000
● Hidden units: 10; Epochs: 5000
● Hidden units: 2; Epochs: 5000
13
3. Deep Belief Network
History:
● Developed by Geoffrey Hinton and his
student Yee-Whye Teh.
● Widely regarded as the first effective Deep Learning algorithm.
About the Model:
● A generative graphical model.
● Can be used in an unsupervised /
supervised way.
● A composition of RBMs in a stack.
● A supervised Deep Belief Network can
be used for classification.
14
3.1 DBN formed by stacking RBMs
15
3.2 DBN: Algorithm
● 1) Train an RBM on the inputs (X) to obtain a weight matrix (W).
● 2) Transform (X) through the RBM to produce new data (X').
● 3) Repeat this procedure with X <- X' for the next pair of layers.
● 4) Stop before the top 2 layers.
● 5) Fine-tune the top 2 layers for supervised learning. (A sketch of steps 1-4 follows below.)
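A minimal C++ sketch of this greedy layer-wise procedure is shown below. The Rbm class and pretrainDbn are hypothetical (not the project's actual API), and train() is assumed to run CD-1 updates as in the section 2.3 sketch.

```cpp
// Minimal sketch of greedy layer-wise DBN pre-training (steps 1-4).
// The Rbm class here is hypothetical, not the project's actual API.
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct Rbm {
    std::vector<std::vector<double>> W;  // visible x hidden weights
    std::vector<double> bH, bV;          // hidden / visible biases

    Rbm(std::size_t nVisible, std::size_t nHidden)
        : W(nVisible, std::vector<double>(nHidden, 0.0)),
          bH(nHidden, 0.0), bV(nVisible, 0.0) {}

    // Assumed to run CD-1 updates over `data` for `epochs` passes,
    // as in the sketch of section 2.3 (body omitted in this sketch).
    void train(const std::vector<std::vector<double>>& data, int epochs) {
        (void)data; (void)epochs;
    }

    // Propagate one sample upward: h_j = sigmoid(bH_j + sum_i v_i w_ij).
    std::vector<double> up(const std::vector<double>& v) const {
        std::vector<double> h(bH.size());
        for (std::size_t j = 0; j < h.size(); ++j) {
            double a = bH[j];
            for (std::size_t i = 0; i < v.size(); ++i) a += v[i] * W[i][j];
            h[j] = 1.0 / (1.0 + std::exp(-a));
        }
        return h;
    }
};

// layerSizes = {nInput, nHidden1, nHidden2, ...}
std::vector<Rbm> pretrainDbn(std::vector<std::vector<double>> X,
                             const std::vector<std::size_t>& layerSizes,
                             int epochs) {
    std::vector<Rbm> stack;
    for (std::size_t l = 0; l + 1 < layerSizes.size(); ++l) {
        Rbm rbm(layerSizes[l], layerSizes[l + 1]);
        rbm.train(X, epochs);             // 1) train this layer's RBM on X
        for (auto& x : X) x = rbm.up(x);  // 2) X <- X'
        stack.push_back(std::move(rbm));  // 3) move to the next pair of layers
    }
    return stack;  // 4)-5) the top layers are then fine-tuned, supervised
}
```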
3.3 Fine-Tuning
Implemented using stacked RBM layers with a logistic
classifier in the fine-tuning stage. The implementation of this
program is available on GitHub under the profile "Pritish
Yuvraj".
Methods for fine-tuning:
● Feed-Forward Network
● Support Vector Machine
● Logistic Classifier
16
3.4 SoftMax Function
● The purpose of the SoftMax function is to
convert the outputs into the probabilities of
belonging to each group (for classification
problems).
● E.g.
– [0.1, 0.0001, 0.00002] becomes
– [0.3559, 0.3221, 0.3220]
– or approximately [36%, 32%, 32%].
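A minimal C++ sketch of the function (names hypothetical; subtracting the maximum before exponentiating is a standard numerical-stability trick):

```cpp
// Minimal SoftMax sketch: maps raw scores to probabilities that sum to 1.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

std::vector<double> softmax(const std::vector<double>& z) {
    const double m = *std::max_element(z.begin(), z.end()); // stability shift
    std::vector<double> p(z.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < z.size(); ++i)
        sum += (p[i] = std::exp(z[i] - m));
    for (double& pi : p) pi /= sum;  // normalize to a probability vector
    return p;
}
// softmax({0.1, 0.0001, 0.00002}) ≈ {0.3559, 0.3221, 0.3220},
// matching the slide's example up to rounding.
```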
17
Experiment 3: DBN
● The objective of the experiment was to
determine the accuracy of a Deep Belief
Network with respect to other Machine
Learning algorithms.
● The problem we picked was image
classification.
● To conduct the experiment, two image
databases were merged to create an
artificial database:
– the LFW database (Labeled Faces in the Wild,
Department of Computer Science, University
of Massachusetts, Amherst);
– the Flowers database (Department of Computer
Science, University of Oxford).
18
3.5 DBN Architecture for the Experiment
Preprocessing of images:
● First the images were converted to
black and white before being fed to the RBM.
● Image sizes were made uniform at 320 × 240 pixels.
● Number of input neurons: 77760.
● 2 hidden layers with 100 neurons
each.
● An output layer with 2 neurons,
classifying the image as either a flower
or a human.
● Fine-tuning was achieved using a
Logistic Regression classifier.
● The probability of an image belonging to a
group was decided by the
SoftMax function.
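For illustration, the architecture above could be wired with the hypothetical pretrainDbn sketch from section 3.2 roughly as follows (loadPreprocessedImages is an assumed helper, not the project's actual code):

```cpp
// Hypothetical wiring of the experiment's DBN, reusing the Rbm /
// pretrainDbn sketch from section 3.2.
std::vector<std::vector<double>> X = loadPreprocessedImages(); // assumed helper
const std::vector<std::size_t> layerSizes = {77760, 100, 100}; // input + 2 hidden
std::vector<Rbm> stack = pretrainDbn(X, layerSizes, /*epochs=*/100);
// A 2-output logistic-regression layer (flower vs. human) is then
// fine-tuned on top, with SoftMax turning its outputs into probabilities.
```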
19
3.6 DBN: Results

Algorithm                       Accuracy (in %)
Stochastic Gradient Descent      96.20%
Random Forest Tree Classifier    98.26%
Support Vector Machine           98.30%
Deep Belief Network             100.00%

Detailed report on Deep Belief Network performance:

Number of Epochs (of entire architecture)   Accuracy
10                                           74.91%
50                                           95.38%
100                                         100.00%
20
4. Bayesian Statistics
● Three approaches to Probability:
– Axiomatic: probability by definition and properties
– Relative Frequency: repeated trials
– Degree of belief (subjective): a personal measure of uncertainty
● Problems:
– The chance that a meteor strikes Earth is 1%
– The probability of rain today is 30%
– The chance of getting an A on the exam is 50%
4.1 Bayes' Theorem for Statistics
● Let θ represent the parameter(s)
● Let X represent the data
● Bayes' theorem: $f(\theta \mid X) = f(X \mid \theta)\, f(\theta)\, /\, f(X)$
● The left-hand side is a function of θ
● The denominator on the right-hand side does not depend on θ, so
$f(\theta \mid X) \propto f(X \mid \theta)\, f(\theta)$
● Posterior distribution ∝ Likelihood × Prior distribution
● Posterior dist'n = Constant × Likelihood × Prior dist'n
● The equation can be understood at the level of densities
● Goal: Explore the posterior distribution of θ
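As a worked illustration of "Posterior ∝ Likelihood × Prior" (a standard conjugate example, not taken from the slides):

```latex
% A standard Beta-Bernoulli conjugate example (illustrative addition).
% Prior: \theta ~ Beta(a, b); data X: k successes in n Bernoulli(\theta) trials.
\[
  f(\theta \mid X) \;\propto\;
  \underbrace{\theta^{k}(1-\theta)^{\,n-k}}_{\text{likelihood}} \cdot
  \underbrace{\theta^{a-1}(1-\theta)^{\,b-1}}_{\text{prior}}
  \;=\; \theta^{a+k-1}(1-\theta)^{\,b+n-k-1},
\]
% i.e. the kernel of a Beta(a+k, b+n-k) density, so the posterior is known
% without ever computing the normalizing constant f(X).
```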
21
5. Conclusion
● The project focused on the practical aspects of Deep Learning and on
extracting the statistical knowledge required for that.
● Deep Learning is an emerging field of Artificial Intelligence which has brought
us one step closer to the real vision of AI.
● The RBM is very important, as most of the data present in the world is unlabelled.
The same goes for the Deep Belief Network: we can use a lot of unlabelled data
to train the initial stack of RBMs in a DBN and then fine-tune on a
small supervised dataset.
● The DBN was more accurate than the SVM and the other Machine Learning algorithms,
as can be seen from the results of the experiment in the DBN section.
● Unlike classical Machine Learning, where an algorithm performs one particular task, Deep
Learning algorithms can perform multiple tasks; the same DBN can be used for
classifying images or classifying text.
● This was the original dream of Artificial Intelligence, from which we are still very far,
but Deep Learning has taken us one step closer to it.