Unsupervised Feature Learning:
A Literature Review
By: Amgad Muhammad & Mohamed EL Fadly

Outline
• Background
• Problem Definition
• Unsupervised Feature Learning
• Our Work
• Sparse Auto-encoder
• Preprocessing: PCA and Whitening
• Self-Taught Learning and Unsupervised Feature Learning

• References

Background
• Machine learning is one of the cornerstone fields in Artificial Intelligence, where machines learn to act autonomously and react to new situations without being pre-programmed.
• Machine learning has seen numerous successes, but applying learning algorithms today often means spending a long time hand-engineering the input feature representation. This is true for many problems in vision, audio, NLP, robotics, and other areas.
• There are many learning algorithms; among them are [1]:
1) Supervised learning
2) Unsupervised learning
Problem Definition
• The target of the supervised learning method can be summarized as follows:
  • Regression
  • Classification
• The first step to train a machine using the supervised learning method is collecting the data set, which in most cases is a very difficult and expensive process.
• The alternative approach is to measure and use everything, which leads to other problems, e.g. noisy data [2].
Unsupervised feature learning
• The unsupervised feature learning approach learns a higher-level representation of the unlabeled data's features by detecting patterns using various algorithms, e.g. the sparse coding algorithm [3].
• It is a self-taught learning framework developed to transfer knowledge from unlabeled data, which is much easier to obtain, to be used as a preprocessing step to enhance supervised inductive models.
• This framework is developed to tackle present issues in the supervised learning model and to increase its accuracy regardless of the domain of interest (vision, sound, and text) [4].
Our Work
• We will present some of the methods for unsupervised feature learning and deep learning, each of which automatically learns a good representation of the input from unlabeled data.
• We will be concentrating on the following algorithms, with more details in the following slides:
  • Sparse Autoencoder
  • PCA and Whitening
  • Self-Taught Learning
• We will also be focusing on the application of these algorithms to learn features from images.
Sparse Autoencoders

Sparse Auto-encoder

Autoencoder [6]
Neural Network
Before we get further into the details of the algorithm, we need to quickly go through neural networks.
To describe neural networks, we will begin with the simplest possible neural network, one that comprises a single "neuron." We will use the following diagram to denote a single neuron [5].

Single Neuron [8]

Neural Network

Sigmoid Activation Function

Sigmoid Function [8]
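This slide was image-only; for reference, the logistic sigmoid shown in the figure is the standard

```latex
f(z) = \frac{1}{1 + e^{-z}}, \qquad f'(z) = f(z)\,\bigl(1 - f(z)\bigr)
```

which squashes its input into the range (0, 1).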

Tanh Activation Function

Tanh Function [8]
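Likewise, the hyperbolic tangent activation in the figure is

```latex
f(z) = \tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}, \qquad f'(z) = 1 - f(z)^{2}
```

which squashes its input into the range (-1, 1).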

Neural Network Model
• A neural network is put together by hooking together many of our simple "neurons," so that the output of one neuron can be the input of another. For example, here is a small neural network.
• The circles labeled "+1" are called bias units and correspond to the intercept term. The leftmost layer of the network is called the input layer, and the rightmost layer the output layer. The middle layer of nodes is called the hidden layer, because its values are not observed in the training set [8].

Small Neural Network[8]
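As a concrete reading of the figure (using the notation of the cited tutorial [8], with f the activation function), the forward pass of this three-layer network computes

```latex
a^{(2)} = f\!\left(W^{(1)} x + b^{(1)}\right), \qquad h_{W,b}(x) = f\!\left(W^{(2)} a^{(2)} + b^{(2)}\right)
```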

Neural Network Model

Autoencoders and Sparsity

Autoencoders and Sparsity Algorithm

Autoencoders and Sparsity Algorithm –cont’d

Autoencoders and Sparsity Algorithm –cont’d

KL Function [6]
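The KL function in the figure is the Kullback-Leibler divergence used as the sparsity penalty in [6]; it measures how far the average activation of hidden unit j is from the sparsity target ρ:

```latex
\mathrm{KL}(\rho \,\|\, \hat{\rho}_j) = \rho \log \frac{\rho}{\hat{\rho}_j} + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_j}
```

This term is zero when the average activation equals ρ and grows as the two diverge, pushing hidden units to be inactive most of the time.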
Autoencoders and Sparsity Algorithm – Cont’d

Autoencoder Implementation
• We implemented a sparse autoencoder, trained on 8×8 image patches, using the L-BFGS optimization algorithm.

Step 1: Generate the training set
The first step is to generate a training set:
A random sample of 200 patches from the dataset.
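A minimal sketch of how such patch sampling can be done (not our original code; it assumes the dataset is a NumPy array `images` of grayscale images, and the normalization follows the UFLDL exercise [8]):

```python
import numpy as np

def sample_patches(images, num_patches=200, patch_size=8, seed=0):
    """Randomly crop patch_size x patch_size patches from a stack of images."""
    rng = np.random.default_rng(seed)
    n, h, w = images.shape
    patches = np.empty((num_patches, patch_size * patch_size))
    for i in range(num_patches):
        img = rng.integers(n)                 # pick a random image
        r = rng.integers(h - patch_size + 1)  # random top-left corner
        c = rng.integers(w - patch_size + 1)
        patch = images[img, r:r + patch_size, c:c + patch_size]
        patches[i] = patch.ravel()            # flatten to a 64-dim vector
    # Normalize to roughly [0.1, 0.9] for a sigmoid output layer,
    # as done in the UFLDL exercise [8]: truncate to +/-3 std, rescale.
    patches -= patches.mean()
    std = 3 * patches.std()
    patches = np.clip(patches, -std, std) / std
    return 0.1 + 0.4 * (patches + 1)
```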

Autoencoder Implementation
Step 2: Compute the sparse autoencoder objective
Compute the sparse autoencoder cost function Jsparse(W,b) and the corresponding derivatives of Jsparse with respect to the different parameters.
Step 3: Train the sparse autoencoder
After computing Jsparse and its derivatives, we minimize Jsparse with respect to its parameters and thereby train our sparse autoencoder. We trained our sparse autoencoder with the L-BFGS algorithm. Our neural network for training has 64 input units, 25 hidden units, and 64 output units.
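For concreteness, a condensed sketch of Steps 2 and 3 (not our original code; hyperparameter values follow the UFLDL exercise defaults [8], and `patches` comes from the sampling sketch above):

```python
import numpy as np
from scipy.optimize import minimize

n_in, n_hid = 64, 25
lam, rho, beta = 1e-4, 0.01, 3.0  # weight decay, sparsity target, sparsity weight

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(theta):
    """Split the flat parameter vector into W1, W2, b1, b2."""
    i = n_hid * n_in
    W1 = theta[:i].reshape(n_hid, n_in)
    W2 = theta[i:2 * i].reshape(n_in, n_hid)
    b1 = theta[2 * i:2 * i + n_hid]
    b2 = theta[2 * i + n_hid:]
    return W1, W2, b1, b2

def cost_and_grad(theta, X):
    """J_sparse(W, b) and its gradient; X holds one example per column."""
    W1, W2, b1, b2 = unpack(theta)
    m = X.shape[1]
    a2 = sigmoid(W1 @ X + b1[:, None])   # hidden activations
    a3 = sigmoid(W2 @ a2 + b2[:, None])  # reconstruction of the input
    rho_hat = a2.mean(axis=1)            # average activation per hidden unit
    # squared reconstruction error + weight decay + KL sparsity penalty
    J = (0.5 / m) * np.sum((a3 - X) ** 2) \
        + 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2)) \
        + beta * np.sum(rho * np.log(rho / rho_hat)
                        + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    # backpropagation, with the extra sparsity term on the hidden layer
    d3 = (a3 - X) * a3 * (1 - a3)
    sparse = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat))
    d2 = (W2.T @ d3 + sparse[:, None]) * a2 * (1 - a2)
    grad = np.concatenate([((d2 @ X.T) / m + lam * W1).ravel(),
                           ((d3 @ a2.T) / m + lam * W2).ravel(),
                           d2.mean(axis=1), d3.mean(axis=1)])
    return J, grad

# Random initialization, then minimize J_sparse with L-BFGS.
r = np.sqrt(6.0 / (n_in + n_hid + 1))
rng = np.random.default_rng(0)
theta0 = np.concatenate([rng.uniform(-r, r, 2 * n_in * n_hid),
                         np.zeros(n_hid + n_in)])
X = patches.T                            # (64, 200) training matrix
res = minimize(cost_and_grad, theta0, args=(X,), jac=True,
               method="L-BFGS-B", options={"maxiter": 400})
W1_opt = unpack(res.x)[0]                # rows visualize as edge detectors
```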

Autoencoder Implementation Results
After training, the sparse autoencoder successfully learned a set of edge detectors.

CPU: Intel Core i7 quad-core processor, 2.7 GHz
RAM: 6 GB
Training set: 200 patches of 8×8 images
Neural network for training: 64 input units, 25 hidden units, and 64 output units

Autoencoder Implementation Results
Training time: 39 seconds
Expected time [1]: less than a minute

Principal Component Analysis – PCA

Principal Component Analysis – PCA
• PCA is a dimensionality reduction mechanism used to eliminate highly correlated variables without sacrificing much of the detail [7].

PCA – Example
Example
• Given the 2D data example.
• This data has already been pre-processed using mean normalization.
• We want to find the principal directions of variation.

2D data example[8]
PCA – Example (Cont’d)

[Figure: the first and second principal directions, u1 and u2, overlaid on the 2D data example [8]]
PCA – Math
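The math on this slide was rendered as an image; as a brief reconstruction from the cited tutorial [8], PCA forms the covariance matrix of the mean-normalized data and takes its eigenvectors as the principal directions:

```latex
\Sigma = \frac{1}{m} \sum_{i=1}^{m} x^{(i)} \bigl(x^{(i)}\bigr)^{\mathsf{T}}, \qquad \Sigma = U \Lambda U^{\mathsf{T}}, \qquad x_{\text{rot}} = U^{\mathsf{T}} x
```

The columns u1, u2, ... of U, ordered by decreasing eigenvalue, are the directions of variation seen in the previous figure.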

PCA – Math

2D data example[8]

PCA – Dimensionality Reduction
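These slides were image-only; the step they cover [8] keeps only the top k components of the rotated data, with k commonly chosen to retain, say, 99% of the variance:

```latex
\tilde{x} = \bigl(x_{\text{rot},1}, \ldots, x_{\text{rot},k}\bigr)^{\mathsf{T}}, \qquad \frac{\sum_{j=1}^{k} \lambda_j}{\sum_{j=1}^{n} \lambda_j} \geq 0.99
```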

Whitening
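The whitening slide was also image-only; a minimal sketch of PCA whitening, assuming mean-normalized data `X` with one example per column and a small regularizer `epsilon` as in [8]:

```python
import numpy as np

def pca_whiten(X, k=None, epsilon=1e-5):
    """Rotate the data into the PCA basis and rescale to unit variance."""
    n, m = X.shape
    sigma = (X @ X.T) / m           # covariance of mean-normalized data
    U, S, _ = np.linalg.svd(sigma)  # columns of U: principal directions
    k = k or n                      # optionally drop trailing components
    x_rot = U[:, :k].T @ X          # rotated (and reduced) coordinates
    return x_rot / np.sqrt(S[:k, None] + epsilon)  # unit variance per coordinate
```

The `epsilon` term keeps the division stable when trailing eigenvalues are close to zero.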

Self-Taught Learning

Self-Taught learning and Unsupervised feature learning
Given an unlabeled data set, we can start by training a sparse autoencoder to extract features that give us a better, condensed representation of the data.

Neural Network[8]

Self-Taught learning and Unsupervised feature learning
• Once training is done, the network is ready to produce better features to represent the input, using the activations of the network's hidden layer [8].

Input layer of Neural Network[8]
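A one-line sketch of this step (reusing `sigmoid` and the trained `W1`, `b1` from the autoencoder sketch earlier):

```python
def extract_features(X, W1, b1):
    """New representation: hidden-layer activations a2 = f(W1 x + b1)."""
    return sigmoid(W1 @ X + b1[:, None])  # one feature vector per column of X
```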

Self-Taught Learning Application
• We used the self-taught learning paradigm with the sparse autoencoder and a softmax classifier to build a classifier for handwritten digits.
• The goal is to distinguish between the digits 0 to 4. We use the digits 5 to 9 as our "unlabeled" dataset; we then use a labeled dataset of the digits 0 to 4 to train the softmax classifier.

Self-Taught Learning Implementation
Step 1: Generate the input and test data sets
We used the datasets from the MNIST Handwritten Digit Database for this project.
Step 2: Train the sparse autoencoder
We used the unlabeled data (the digits 5 to 9) to train a sparse autoencoder. After training is complete, the results show a visualization of pen strokes, like the image shown to the right.
Step 3: Extract features
After the sparse autoencoder is trained, we use it to extract features from the handwritten digit images.
Step 4: Train and test the softmax regression model
We train a softmax classifier using the training-set features and labels, and finally compute the predictions and accuracy. A sketch of this pipeline is shown below.
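A condensed sketch of the whole pipeline under the assumptions above; `train_autoencoder` is a hypothetical wrapper around the L-BFGS training shown earlier, and scikit-learn's multinomial logistic regression stands in for the softmax classifier (the original exercise implements softmax regression directly):

```python
from sklearn.linear_model import LogisticRegression

# Digits 5-9 serve as the unlabeled set for feature learning.
W1, b1 = train_autoencoder(X_unlabeled)            # hypothetical wrapper

# Digits 0-4: represent each labeled example by its hidden activations.
train_feats = extract_features(X_train, W1, b1).T  # one row per example
test_feats = extract_features(X_test, W1, b1).T

clf = LogisticRegression(max_iter=1000).fit(train_feats, y_train)
print(f"Accuracy: {100 * clf.score(test_feats, y_test):.2f}%")
```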
Self-Taught Learning Setup Environment
CPU: Intel Core i7 quad-core processor, 2.7 GHz
RAM: 6 GB
Training set: 60,000 examples from the MNIST database
Unlabeled set: 29,404 examples
Supervised training set: 15,298 examples
Supervised testing set: 15,298 examples

Self-Taught Learning Results
The results after training is complete show a visualization of pen strokes, like the image below:

Self-Taught Learning Analysis
We compared our application's outputs with the Stanford course tutorial's outputs [8].

                               Our classifier    Tutorial's classifier
Training time                  16 minutes        25 minutes
Classifier score (accuracy)    98.208916%        98%

Future Work
We propose that if we were able to parallelize our code, or run the training on a GPU for example, it would boost performance and decrease the time needed to train the classifier.

References
[1] Taiwo Oladipupo Ayodele. New Advances in Machine Learning. InTech, 2010.
[2] S. B. Kotsiantis, I. D. Zaharakis, and P. E. Pintelas. Supervised machine learning: A review of classification techniques. Informatica, 31:249-268, 2007.
[3] Honglak Lee, Alexis Battle, Rajat Raina, and Andrew Y. Ng. Efficient sparse coding algorithms. In Advances in Neural Information Processing Systems, pages 801-808, 2006.
[4] Bruno A. Olshausen et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583):607-609, 1996.
[5] Simon O. Haykin. "Multilayer Perceptron," in Neural Networks and Learning Machines, 3rd ed., Prentice Hall, 2009.
[6] Andrew Ng. CS294A lecture notes, topic: "Sparse autoencoder," Stanford University, Jan. 11, 2011. Available: http://www.stanford.edu/class/cs294a/sparseAutoencoder_2011new.pdf. [Accessed Dec. 10, 2013].
[7] Aapo Hyvärinen, Jarmo Hurri, and Patrik O. Hoyer. "Principal components and whitening," in Natural Image Statistics: A Probabilistic Approach to Early Computational Vision, Vol. 39, Springer-Verlag, 2009, pp. 97-137.
[8] Andrew Ng, Jiquan Ngiam, Chuan Yu Foo, Yifan Mai, and Caroline Suen. "UFLDL Tutorial," April 7, 2013. [Online]. Available: http://deeplearning.stanford.edu/wiki/index.php/UFLDL_Tutorial. [Accessed Dec. 10, 2013].

Thank You!
