Detection of Face using Viola Jones and
Recognition using Backpropagation
Neural Network
A dissertation submitted in fulfillment
of the requirements for the award of the degree of
Master of Technology
(Electronics & Communication Engineering)
in
THE NORTHCAP UNIVERSITY,
Gurgaon
by
Smriti Tikoo
(14-ECP-015)
under the guidance of
Dr. Nitin Malik
Department of Electrical, Electronics and Communication Engineering
SCHOOL OF ENGINEERING AND TECHNOLOGY
Gurgaon, Haryana
May-2016
INDEX

CONTENTS
Candidate's Declaration
Acknowledgement
Abstract
List of Abbreviations
List of Tables
List of Figures
CHAPTER 1: INTRODUCTION
1.1 PROBLEM DEFINITION
1.2 FACE DETECTION
1.3 FACE DETECTION TECHNIQUES
1.4 FACE RECOGNITION
1.5 FEATURE DETECTION
1.6 NEURAL NETWORKS
1.7 BACKPROPAGATION
1.8 COMPARISON WITH OTHER TECHNOLOGIES
1.9 PROBLEM STATEMENT
CHAPTER 2: LITERATURE REVIEW
CHAPTER 3: PROPOSED METHODOLOGY
3.1 VIOLA JONES ALGORITHM
3.2 HAAR FEATURES
3.3 INTEGRAL IMAGE
3.4 ADABOOST TRAINING ALGORITHM
3.5 CASCADING CLASSIFIERS
CHAPTER 4: SOFTWARE IMPLEMENTATION
4.1 MATLAB 8.2 (R2013b)
4.2 OPEN SOURCE COMPUTER VISION
4.3 COMPUTER VISION TOOLS
4.4 VISION CASCADE OBJECT DETECTOR
4.5 NEURAL NETWORK TOOLS
4.6 NEURAL NETWORK FITTING TOOL
4.7 TRAINING ALGORITHM
4.8 PROGRAMMING CODE FOR FACE DETECTION
4.9 PROCEDURE OF FACE RECOGNITION
CHAPTER 5: RESULTS AND DISCUSSION
5.1 RESULTS OF RECOGNITION PROCESS
CHAPTER 6: CONCLUSION AND FUTURE SCOPE
6.1 CONCLUSIONS
6.2 FUTURE SCOPE
References
Publications
Plagiarism Report
CANDIDATE’S DECLARATION
I hereby declare that the work presented in this dissertation report, "Detection of Face using
the Viola-Jones Algorithm and Recognition using a Backpropagation Neural Network", in
fulfillment of the requirements for the award of the degree of Master of Technology in
Electronics and Communications Engineering at THE NORTHCAP UNIVERSITY,
Gurgaon, is an authentic record of my own work carried out under the guidance of Dr. Nitin
Malik (Associate Professor).
The matter embodied in this dissertation report has been submitted by me for the award of
the degree of Master of Technology.
Date:
Place: (Smriti Tikoo)
Acknowledgement
This research was mentored by Dr. Nitin Malik (Associate Professor) at The NorthCap
University, Gurgaon. I am grateful to him for his undivided support and professional expertise
in the area of research, and I express my deepest gratitude towards him for moderating my
research throughout, which made a significant difference in the course of the dissertation. I
would also like to thank those who voiced their views on earlier versions of my work.
Abstract
This thesis is an attempt at facial recognition, a prominent topic in communities such as image
processing, neural networks and computer vision. Face detection and recognition have long
been popular with research scholars, and diverse approaches have been applied to date to serve
this purpose. The rapid advent of biometric analysis systems, such as full-body scanners, iris
detection and recognition systems, fingerprint recognition systems, and surveillance systems
deployed for safety and security purposes, has contributed to this inclination. Advances have
been made using frontal and lateral views of the face, facial expressions such as anger,
happiness and gloominess, and both still images and video for detection and recognition. This
has led to newer methods for face detection and recognition being introduced to achieve
results that are accurate, economically feasible and secure. Techniques such as Principal
Component Analysis (PCA), Independent Component Analysis (ICA) and Linear Discriminant
Analysis (LDA) have been the predominant ones, but with improvements needed in these
approaches, neural-network-based recognition was a boon to the industry. It enhanced not only
the recognition but also the efficiency of the process. Backpropagation was chosen as the
learning method for its ability to recognize non-linear faces with an acceptance ratio of more
than 90% and an execution time of only a few seconds.
LIST OF ABBREVIATIONS
ANN Artificial Neural Network
BPNN Backpropagation Neural Network
DCT Discrete Cosine Transform
FFT Fast Fourier Transform
ICA Independent Component Analysis
LDA Linear Discriminant Analysis
MFC Multi Resolution Feature Concatenation
MLP Multi Layer Perceptron
MLNN Multi Layer Neural Network
MMV Multi Resolution Majority Voting
PCA Principal Component Analysis
LIST OF TABLES
Table 1: Sample data for sample images 1, 2, and 3.
Table 2: Sample data of sample image 2.
LIST OF FIGURES
Figure 1: Thumb impression as a biometric scan.
Figure 2: Fingerprint being computed into binary data.
Figure 3: Generalized block diagram of face recognition.
Figure 4: A group of faces detected.
Figure 5: Faces detected in a press conference.
Figure 6: Diagrammatic representation of the Viola-Jones algorithm.
Figure 7: Simplified neural network.
Figure 8: Graphical representation of the sigmoid function.
Figure 9: Flowchart of the Viola-Jones algorithm.
Figure 10: Rectangular features used in Viola-Jones.
Figure 11: Flowchart for face recognition using BPNN.
Figure 12: Finding the sum of a shaded rectangular area.
Figure 13: Cascade of classifiers, a simplified diagram.
Figure 14: Training the cascade detector.
Figure 15: Faces detected by the Viola-Jones algorithm, highlighted by bounding boxes.
Figure 16: Nose detected.
Figure 17: Mouth detected.
Figure 18: Eyes detected.
Figure 19: Sample images 1, 2, 3 in RGB, grayscale and binary format.
Figure 20: Segmented image.
Figure 21: Binary image histogram equivalents.
Figure 22: Sample image 1 in RGB, sample image 2 in grayscale, and the histogram equivalent of sample image 2.
Figure 23: A three-stage training procedure.
Figure 24: Parameters calculated during the training procedure.
Figure 25: Performance plot for sample data 1.
Figure 26: Performance plot of sample image 2.
CHAPTER 1. Introduction
1.1 Problem Definition
Automatic face recognition dates back to the 1960s, when pioneers such as Woody
Bledsoe, Helen Chan Wolf, and Charles Bisson introduced their work to the world. In 1964
and 1965, Bledsoe, along with Helen Chan and Charles Bisson, worked on getting a computer
to recognize human faces.
His work received little recognition at the time because agencies offered scant support. After
him the work was carried forward by Peter Hart at the Stanford Research Institute. The results
given by the computer were impressive when it was fed a gallery of images for recognition.
Around 1997, Christoph von der Malsburg and students from several reputed universities
got together to design a system that was even better than the previous one. The system was
robust and could identify people from photographs that did not offer a clear view of the face.
After that, technologies like Polar and FaceIt were introduced, which did the work in less
time and gave accurate results by offering features like a 3D view of the picture and
comparing it against a worldwide database. By 2006 many new algorithms had replaced
the previous ones, proving better by offering various new techniques for obtaining improved
results, and their performance was commendable.
Automatic recognition has seen a lot of improvement since 1993: the error rate has dropped
and enhancement methods to improve resolution have hit the market. Ever since then,
biometric authentication has been all over the place, ranging from home security systems and
surveillance cameras to fingerprint and iris authentication systems and user verification.
These systems have contributed to catching terrorists and other types of intruders, whether
online or in person. Recognition of faces has been under research since the 1960s, but it is only
in the last decade that acceptable results have been obtained.
Figure 1: Thumb impression as biometric scan
Figure 2: Finger print being computed into binary data
Researchers are still trying to attain better models and approaches that can recognize faces just
as our brain does, with utmost triviality, whatever the lighting conditions, a time lapse of
years between generations, skin discolorations, or even intricate changes to the face such as
botox or surgical treatments.
Deployed approaches do not account for the intricate details that differ from person to person
in the context of recognition results. The continuous changes in facial expression, depending
on the situation the person is experiencing, disclose important information about the structural
characteristics and texture, thereby helping the recognition process.
The many methods adopted for face recognition can be broadly classified as (a) the
holistic approach, which relies on taking the whole face as input in pixels, and (b) feature
extraction, which concentrates on capturing the features in a face, with certain methods
deployed to remove the redundant information and handle the dimensionality problems involved.
Three basic approaches in this direction are PCA, LDA and ICA; others added to the list
include neural network, fuzzy, and skin- and texture-based approaches, as well as the ASM
(Active Shape Model). The problem is approached first by detecting the face and its features
and then moving on to recognition. Many approaches are available for face detection, such as
the Viola-Jones algorithm, LBP, the AdaBoost algorithm, neural-network-based detection, and
the SNoW classifier method.
Figure 3: Generalized block diagram of face recognition
1.2 Face Detection
One of the foremost and most important steps in the procedure of facial recognition is the
detection of a face. It forms the first step in the direction of analysis of the entire face:
alignment, relighting, modeling, recognition, head-pose tracking, face authentication, and
tracking of the face and its expressions.
Recognition by computers is necessary for them to be able to interpret people's thoughts. The
aim of face detection is to detect any faces in an image and return the location and extent of
each face. For this, appropriate face modeling and segmentation must be done.
While carrying out face detection, sources of variation in facial appearance such as viewing
geometry (pose), illumination, the imaging process (resolution, focus, imaging noise,
perspective effects), and other factors like occlusion should be considered.
Detection approaches that rely on image formation help in detecting color, shape or motion as
well. Face detection is a specific case of object-class detection, in which the task is to find the
locations and sizes of all objects in an image that belong to a given class. Face-detection
algorithms generally concentrate on frontal facial images.
The image is then compared with the images in a database. Face detection is often used for
biometric authentication, video surveillance, human-computer interfaces and image database
management. It has also gained a great deal of interest among marketers.
Figure 4: A group of faces detected.
Figure 5: Faces of people detected during a press conference.
1.3 Face Detection Techniques
Many techniques are available for detecting a face in an image. The main ones are listed
below:
Viola-Jones Face Detection Algorithm: Proposed by Paul Viola and Michael Jones, it proves
to be one of a kind by providing competitive results in real-time situations. It not only aces
the detection of frontal faces but can also be trained to detect a variety of other object classes.
Its robust nature and its adaptability to practical situations make it popular among users. It
follows a four-stage procedure for face detection: first Haar feature selection, then integral
image creation, followed by AdaBoost training and the cascading of classifiers.
Haar feature selection matches the commonalities found in human faces. The integral image
calculates the rectangular features in fixed time, which gives it an advantage over more
sophisticated features. The integral image at coordinates (x, y) gives the sum of the pixels
above and to the left of (x, y).
Advantages:
1. Popular for accurate face detection.
2. High detection rates; fit for real-time use.
3. High accuracy.
4. Reduced computation due to the cascade of classifiers.
Limitations:
1. Prolonged training time.
2. Handles only a limited range of head poses.
3. Dark or poorly illuminated faces are often not detected.
Figure 6: Diagrammatic representation of the Viola-Jones algorithm
Local Binary Pattern (LBP): An efficient technique for describing the texture features of an
image. Its high computation speed and rotation invariance favor its usage in image retrieval,
texture analysis, recognition and segmentation. One of its successful applications is detecting
objects in a video through elimination of the backdrop.
It assigns a texture value to each pixel to facilitate matching with the objective at hand and to
ease the process of tracking through the video. LBP is mainly beneficial for recognizing the
key points in a target and forming a mask for joint colour-texture selection.
Advantages:
1. Apt for expressing texture features.
2. Found effective for texture analysis, image retrieval, and recognition and
segmentation of faces.
3. Spots a moving object with ease via elimination of the backdrop.
4. A simple approach: computationally simpler than Haar-like features, and fast.
5. The most vital properties of LBP features are tolerance of monotonic illumination
changes and computational simplicity.
Limitations:
1. Sensitive to minute changes in the localization process.
2. The error rate rises when large local regions are used.
3. Proves ineffective under non-monotonic illumination changes.
4. Usage restricted to binary and grayscale images.
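To make the idea concrete, a minimal 8-neighbour LBP sketch in MATLAB follows (the test
image cameraman.tif ships with the Image Processing Toolbox; the neighbour ordering is one
common convention, an assumption here):

I = double(imread('cameraman.tif'));
[rows, cols] = size(I);
lbp = zeros(rows-2, cols-2);
offsets = [-1 -1; -1 0; -1 1; 0 1; 1 1; 1 0; 1 -1; 0 -1];   % clockwise neighbours
for k = 1:8
    shifted = I(2+offsets(k,1):rows-1+offsets(k,1), 2+offsets(k,2):cols-1+offsets(k,2));
    lbp = lbp + (shifted >= I(2:rows-1, 2:cols-1)) * 2^(k-1); % one bit per neighbour
end
imshow(uint8(lbp));   % texture map: each pixel holds its 8-bit LBP code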
AdaBoost Algorithm for Face Detection: Its basic aim is to create a superior prediction rule
that is highly accurate by combining comparatively weak, inaccurate rules. It is the first
practical boosting algorithm to have garnered wide acceptance in various fields. It contributed
high detection rates, and thereby images could be processed at a faster rate.
From the selection of visual features to the linear combination of classifiers, AdaBoost does it
all to make a strong classifier. Overfitting is one of its few restrictions, along with noisy data
and outliers.
Its "adaptive" name is due to its tendency to generate a strong classifier by carrying out
multiple iterations. A strong classifier is formed by combining weak classifiers, each only
slightly correlated with the true classifier.
This iterative process of adding weak classifiers to the chain forming a strong classifier
proceeds with a weight vector updated at each round to concentrate on misclassified samples.
The outcome is a classifier that has higher accuracy than the weak learners' classifiers.
Advantages:
1. Usage extends to various classifiers.
2. Enhances the accuracy level of a classifier.
3. Easy to implement and commonly used.
4. No prior knowledge about the face structure is required; only a pair of inputs, a
training database and classification functions, is needed.
5. Adaptive: the positive and negative aspects are tested by the classifier at every stage to
avoid misclassification.
6. The training error tends exponentially towards zero in a finite number of iterations.
7. Easy to program, and its feature selection yields a classifier that is fast and simple.
Limitations:
1. Overfitting, noisy data and outliers pose a restriction to the algorithm.
2. The quality of detection relies on the consistency of the training set, its size and its
interclass variability.
3. Training is time-consuming, as computation time is proportional to the size of the
feature and example sets.
1.4 Face Recognition System: The modern era has brought about a revolution by bringing the
world closer to computerized applications. With scenarios such as breaches of security,
terrorist attacks, cyber crimes and fraud becoming rampant via various modes, the requirement
for authentication systems and verification technologies has become the need of the hour.
Although various biometric traits are available for carrying out the process, we have focused
on face recognition, as it is flexible and reliable irrespective of whether a person is aware that
his or her face is being scanned. In this procedure user identification plays a vital role in
authenticating a face against the available database.
Feasibility and effective recognition should be kept in mind while designing, and a simple and
robust system should be the sought-after solution. This makes wide acceptance possible and
makes it socially more popular as well. Personal verification processes also utilize biometrics
to make the procedures more accurate and efficient.
The major factors influencing the recognition method are illumination and the various pose
variations at hand. After face detection, feature extraction is the next step on the way to face
recognition, and many approaches are available for extracting features. The recognition
process encompasses the use of an automated, semi-automated or manual way to match facial
images.
Most common by far is two-dimensional face recognition, which is easier and less expensive
compared to the other approaches. The recognition system has the flexibility to operate in
either of two modes, verification or identification. But the course of action is not as easy as
the brain makes it feel, and it works appropriately only under certain set conditions.
Degradation in performance is often observed when the captured images lack appropriate
lighting, or when changing poses impose a restriction, apart from facial expressions. Some
recognition algorithms use landmarks or features from the image in which the face is to be
detected.
Others resort to normalization and compression of the facial database. Popular recognition
algorithms include Principal Component Analysis using eigenfaces, Linear Discriminant
Analysis, Elastic Bunch Graph Matching using the Fisherface algorithm, the Hidden Markov
model, Multilinear Subspace Learning using tensor representation, and neuronally motivated
dynamic link matching.
An emerging trend is three-dimensional recognition, which is argued to provide enhanced
accuracy; it utilizes sensors to capture the facial image of a person along with information
about its shape and other properties, such as the contours of the eye sockets.
1.5 Feature Detection:
Feature detection refers to methods that compute abstractions of image information and make
decisions at each image point about whether there is a feature to capture. The results of this
modus operandi are subsets of the original image domain, in the form of isolated points,
curves, etc.
By definition, a feature can be understood as a distinctive part of an image, depending on the
situation. Features might at times be the initial point of an algorithm, and therefore the
subsequent or overall algorithm will only be as effective as the feature detector.
One of the prominent aspects is recurring features, which may be detected in different images
of the same type as well. Feature detection is a common operation in image processing, carried
out as the next operation after detection of the face. It involves scanning each pixel to look for
features. For a lengthy algorithm, scanning is done only in the region of the image where
features are expected.
Types of Features: Apart from the usually visible features in a face, i.e., eyes, nose, mouth,
lips and forehead, there are various other features that are helpful in verifying someone's
identity.
Edges: These form the points where there is a boundary separating or joining two image
regions. Technically speaking, points having high gradient magnitude classify as edges.
Usually these are one-dimensional.
Corners / interest points: Point-like features anywhere in a facial image classify as corners.
They are two-dimensional, unlike edges. Corner points were discovered because edges
analyzed in earlier works showed rapid changes in direction.
Algorithms were then developed to avoid detailed edge detection, detecting corners as high
levels of curvature in the gradient. The next hurdle was that corners were detected on parts of
images that were not true corner points.
Blobs / regions of interest: These provide the corresponding image description in terms of
regions. The algorithms employed for detecting blobs find areas that are too smooth to be
detected as corner points. Blob descriptors contain points like the center of gravity;
distinguishing between a corner and a blob becomes difficult if an image is reduced from its
actual size.
Ridges: Especially for elongated objects, in the context of a grey-level image a ridge is a
generalization of the medial axis; in a realistic situation it is treated as a one-dimensional curve
depicting an axis of symmetry. Ridges are harder to extract from images.
Feature Extraction: On completion of the feature detection process, the next step is feature
extraction, which captures a section around each detected feature. Extraction involves
considerable image processing, and the outcome is called a feature descriptor or a feature
vector.
Approaches available for it include N-jets and local histograms. In addition to such attribute
information, the feature detection step by itself may also provide complementary attributes,
such as edge orientation and gradient magnitude in edge detection, and the polarity and
strength of the blob in blob detection. It is a hybrid method of data reduction: feature
extraction techniques extract a set of salient or discriminatory features, holistic or local, from
the face image, known as the feature space; otherwise the raw image would have to be treated
as a very high-dimensional feature space.
1.6 Neural Networks:
A neural network is a family of models built from multiple layers of interconnected units,
inspired by the neuron structure of the brain. Each circular node represents an artificial neuron,
and an arrow represents a connection from the output of one neuron to the input of another.
These networks are designed to approximate functions of a large number of inputs that are
generally unknown. The nodes in a neural network exchange data with one another. The
connections between nodes have numeric weights associated with them, which are altered to
attain an output matching the target output in the case of supervised learning, or to make the
network adaptive to inputs and capable of learning.
A network classifies as neural on the basis of the following criteria: it has a set of adaptive
weights and is capable of approximating non-linear functions of its inputs. The adaptive
weights can be thought of as connection strengths between neurons, which are tuned during
training and prediction.
Feed-Forward Neural Networks: These are networks where connections between the nodes do
not form a cycle, unlike recurrent neural networks. Feed-forward neural networks are the
simplest networks: information moves in the forward direction only, from the inputs to the
outputs via the hidden layers in between.
Supervised Learning: A learning method where desired outputs are known for the inputs, and
the associated weights are updated so that the computed output matches the desired output,
e.g., pattern recognition and regression models.
Unsupervised Learning: A learning method where the input data has no set responses against
which to compare the actual output, e.g., cluster analysis.
Single Perceptron: This consists of a single layer of inputs and outputs along with associated
numerical weights. The sum of the products of the weights and the inputs is calculated in each
node, and if the value is above some threshold the neuron is activated; otherwise it stays
deactivated. Such neurons classify as linear threshold units. Usually activation holds a value
of +1 and deactivation -1, with the threshold kept at 0.
The delta rule is utilized for training the perceptron; it calculates the error arising from the
mismatch between the actual and target outputs. In a multi-layer neural network a continuous
output is obtained. A common choice of activation is the so-called logistic function:

f(x) = 1 / (1 + e^(-x))    (eq. 1.1)

The logistic function is an S-shaped function, defined for values of x ranging from -∞ to +∞.
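A minimal sketch of a single linear threshold unit trained with the delta rule follows; the data
set (the AND function in +1/-1 coding) and the learning rate are illustrative assumptions:

X = [0 0; 0 1; 1 0; 1 1];            % four 2-D input patterns
T = [-1; -1; -1; 1];                 % AND function, +1 = activated, -1 = deactivated
w = zeros(2,1); b = 0; eta = 0.1;    % weights, bias (threshold), learning rate
for epoch = 1:20
    for i = 1:size(X,1)
        y = sign(X(i,:)*w + b);      % +1 above threshold, -1 otherwise
        if y == 0, y = -1; end
        err = T(i) - y;              % delta rule: target minus actual output
        w = w + eta * err * X(i,:)'; % adjust weights in proportion to the error
        b = b + eta * err;
    end
end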
Multilayer Perceptron: This is a two-layer (or deeper) neural network with the ability to
compute XOR. In the figure shown on the next page, the numerals within the neurons
represent the thresholds, and the numbers depicted on the interconnections are the weights of
the inputs. In this case, if the threshold is not attained then the output is zero.
This class of networks consists of multiple layers of computational units, usually
interconnected in a feed-forward way, and usually employing a sigmoid function as the
activation function. As per the universal approximation theorem, every continuous function
that maps intervals of real numbers to some output interval can be approximated by an MLP
with a single hidden layer. Most MLPs employ backpropagation as their learning technique.
Figure 7: A generalized structure of a neural network.
Sigmoid Function: A mathematical function having an S-shaped curve. It is often called a
special case of the logistic function.
Figure 8: Graphical representation of the sigmoid function.
It is defined by S(t) = 1/(1 + e^(-t)). It is a bounded, differentiable real function, defined for
all real values, with a positive derivative everywhere: S'(t) = S(t)(1 - S(t)).
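These definitions can be checked in a few lines of base MATLAB (nothing beyond built-in
functions is assumed):

S  = @(t) 1 ./ (1 + exp(-t));        % S(t) = 1/(1 + e^(-t))
dS = @(t) S(t) .* (1 - S(t));        % S'(t) = S(t)(1 - S(t))
t = linspace(-6, 6, 200);
plot(t, S(t), t, dS(t));             % reproduces the S-shaped curve of Figure 8
legend('S(t)', 'S''(t)');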
1.7 Backpropagation
Backpropagation, short for "backward propagation of errors", is a popular method of training
MLP artificial neural networks in conjunction with an optimization method. The gradient of a
loss function is calculated with respect to the associated weights, and the optimization method
is fed the gradient and alters the weights so as to reduce the loss function.
This algorithm works under the supervised learning rule, i.e., it requires a known, desired
output for each input value in order to calculate the loss function gradient. It usually makes
use of the sigmoid function as its activation. The algorithm can be split into two phases:
Phase 1: Propagation:
1. Forward propagation: the input is fed through the network to generate the output
activations.
2. Backward propagation: the difference between the actual and target outputs is
propagated backwards through the network to obtain the error at each layer.
Phase 2: Weight update:
1. The gradient of a weight is the product of the output error (delta) and the input activation.
2. A ratio (percentage) of the gradient is subtracted from the weight.
This ratio, the learning rate, affects the speed and quality of learning: the greater the ratio, the
more quickly the neurons are trained, but accuracy is better assured with a lower ratio. If a
weight update overshoots, the error will increase. The steps are repeated until satisfactory
results are attained. A minimal sketch of the two phases follows.
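The sketch below shows the two phases for a single hidden layer; the layer sizes, the random
sample and the learning rate lr are illustrative assumptions:

x = rand(4,1); t = rand(2,1);        % one training sample and its target
W1 = randn(3,4); W2 = randn(2,3); lr = 0.5;
S = @(z) 1 ./ (1 + exp(-z));         % sigmoid activation
for iter = 1:100
    h = S(W1 * x);                   % phase 1a: forward propagation (hidden layer)
    y = S(W2 * h);                   % phase 1a: forward propagation (output layer)
    d2 = (y - t) .* y .* (1 - y);    % phase 1b: output-layer error (delta)
    d1 = (W2' * d2) .* h .* (1 - h); % phase 1b: error propagated back to hidden layer
    W2 = W2 - lr * (d2 * h');        % phase 2: gradient = delta x input activation
    W1 = W1 - lr * (d1 * x');
end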
1.8 Comparison with Other Technologies: Of the available biometric authentication methods,
face recognition is among the easiest and most reliable. No cooperation is required from the
subject being captured for the identification and verification process. It is widely used at
airports, defence ministries, multiplexes and other areas that experience a threat to the security
of data or people.
Other biometrics like fingerprints, iris scans and speech recognition cannot perform mass
identification, unlike facial recognition systems. However, queries are being raised about the
efficiency of such systems.
Drawbacks: Though face recognition is flexible and socially acceptable, it is not a perfect way
to solve the problems at hand relating to the security of a country and its people. Face
recognition is not perfect and struggles to perform under certain conditions.
Certain constraints, such as the angles at which images are captured, poor illumination,
age-related or surgical changes to the face, image resolution, cluttered images, skin color and
facial expressions, do pose obstacles.
1.9 Problem Statement
The advent of biometrics brought about a change in the way the security of data, people and
country is treated. Be it via surveillance systems, full-body scans, retina scans, hand scans or
fingerprints, not only has it changed the security setup of the country, but it offers a sense of
reliability to users and systems as well. It has invented ways to protect and preserve
confidential data and guard systems of any type with the help of human-computer interaction.
It has brought a great combination of image processing and computer vision to light, which in
turn has boosted the business for detection and recognition systems. In this dissertation we
propose an approach to detect and recognize a facial image using a BPNN with the help of
MATLAB. In MATLAB we have worked with the neural network toolbox, using its tools to
train and process the image and obtain the performance plot.
CHAPTER 2. Literature Review
Whatever the approach, be it neural-network based [1][2] or otherwise, a large database of
facial images, and of non-facial images as well, is required to carry out gray-scale-based face
detection. The non-facial images pose a hurdle in that case.
Schneiderman and Kanade [4] have lent their modus operandi to frontal-face and profile-view
detection.
The approach for non-faces is a feature-based method that combines geometrical attributes
with belief networks [5]. For real-time situations [6], geometrical facial templates and the
Hough transform can also be put to use. Facial detectors like Markov random fields [7][8] and
Markov chains [9] detect faces using the spatial arrangement of gray-scale pixel values.
For tracking faces [10] whose initial position is known, model-based approaches are the
answer; e.g., a 3D deformable approach is good at tracking moving faces, provided that other
facial features are already known. Motion and color are a way of reducing the burden on
detection algorithms; combined with other available information, they can provide fast
detection and tracking [11].
In one research paper the combination of motion-based tracking with a Hidden Markov model
has already been used, and many proposals have been published wherein neural networks
combined with motion and color filters [12] have been put to use. Zhao [14] provided a survey
of and tasks for face recognition both of images in a video and of still images.
Turk [15] calculated eigenfaces with a linear algebra technique; this method enabled real-time
automated face recognition systems. Liao [16] opted for fast Fourier transform algorithms
before applying recognition.
Face recognition is performed using Principal Component Analysis (PCA) and Independent
Component Analysis (ICA) after the FFT has whitened the input images. Results have shown
that the whitening process enhances the recognition performance of both PCA and ICA.
Contrast enhancement and edge detection were used by Jarillo [17] for carrying out
transformations of the input images. Discriminatory information helpful for face classification
can be obtained from such experiments, namely with PCA and LDA (Linear Discriminant
Analysis); the YALE and FERET databases were used. Lin [18] opted for the Hausdorff
distance over the Euclidean distance in the recognition procedure, and satisfactory results were
obtained. The wavelet transform in combination with PCA was put to use by A. Eleyan [19].
Innovative methods were used, such as Multi-resolution Feature Concatenation (MFC), where
PCA is used for dimensionality reduction on each sub-band and the resulting sub-bands are
combined for classification, and Multi-resolution Majority Voting (MMV), where PCA and
classification are done separately on each sub-band. MMV outperforms the MFC approach.
A DCT approach was used by Shermina J. [20] to minimize the problem of illumination; the
DCT-processed image is later used in the recognition step, for which the PCA approach is
used. Unauthorized access is regulated by designing systems that provide security and
reliability, so building efficient face recognition systems is very necessary.
Agarwal et al. [3] deduced an approach of coding and decoding an image to recognize a face.
It was carried out in two stages: first using PCA as the method for feature extraction, and then
using a feed-forward BPNN for recognition. Comparisons were made in terms of efficiency
with K-means and Fuzzy Ant with fuzzy C-means, and a better recognition rate was obtained
compared to the other two techniques.
Bhati et al. [4] opted for eigenfaces as the feature vectors. The paper took a novel route by
including aspects like mean square error, recognition rate and training time. A single-layer
neural network can give 100% recognition accuracy if the correct learning rate is assigned to
it.
They proposed that the BPNN is more sensitive to the number of hidden neurons and the
learning rate, and that using wavelets for feature extraction helped both neural networks show
improved results. Kim et al. [5] suggested improving the recognition rate by combining the
MLNN with a method that computes a small number of eigenface feature vectors through
PCA.
The method improved functioning under variations in illumination and noise, minimizing the
recognition error. The rectangular feature technique was used to extract the face from the
background image; recognition was then performed by reducing the image size and denoising
the image, evaluating a proper vector through PCA, and finally applying the MLNN to the
resultant image.
Benzaoui et al. [6] compared the existing PCA matching method with a combined PCA and
MLNN approach, proposing a technique involving two stages: learning and detection. In the
first phase, images were converted to gray scale and then filtered with a median filter to reduce
noise. Feature extraction was done by choosing appropriate information from each picture to
prepare the data for classifier learning.
The system used the DCT to extract a vector of appropriate coefficients that symbolizes the
necessary parameters for modeling the original image. Because there were problems due to
variations in posture, Saudagare et al. [7] proposed a system for suppressing the consequences
of changes in illumination and pose, using a 2-level Haar wavelet transform to partition the
face image into seven sub-image bands.
From the sub-image bands, eigenface features were extracted and used as input for a
BPNN-based classification algorithm. Denoising the images was necessary for an enhanced
and accurate outcome. Daramola et al. [8] therefore designed a system with the following
procedure: face detection, pre-processing, principal component analysis (PCA) and
classification.
First, a database with images in different poses was made, followed by normalizing and
denoising the images. Eigenfaces were calculated from the training set, using PCA to retain
the components with the largest eigenvalues over the training set images.
In the ANN approach, face descriptors were used as input for training the network. The
non-linearity of face images was looked into by Kashem et al. [9].
A computational model for face detection and recognition that was accurate, simple and fast
in constrained environments was brought to light. Here eigenfaces [10] were used for
recognition to improve accuracy and reduce computation.
When PCA and BPNN techniques were combined, non-linear face images were easily
distinguished. The conclusion was that this method had an execution time of a few seconds
and an acceptance ratio of more than 90%.
In [11], Chung et al. referred to a new face recognition method based on Principal Component
Analysis (PCA) and Gabor filter responses. First, Gabor filtering was applied at predefined
fiducial points to extract robust facial features from the original face image. Second, the facial
features were transformed into eigenspace by PCA, for optimal classification of individual
facial representations.
CHAPTER 3 Proposed Methodology
As outlined in the problem statement, we propose an approach to detect and recognize a facial
image using a BPNN with the help of MATLAB 8.2. In MATLAB we have worked with the
neural network toolbox, within which we have made use of the neural network fitting tool to
train and test the facial image at hand. This tool maps numeric inputs to their targets. It helps
in training, testing and validating the data, and it evaluates the data on the mean square error
generated and the performance plot; if needed, a regression plot for the same data can also be
used to make an inference regarding the performance of the process. The procedure is as
follows (a minimal MATLAB sketch of the pipeline appears after the list):
1. Detection of face and its features using Viola Jones algorithm.
2. Conversion from rgb to grayscale and binary.
3. Segmentation of image can be done in any format.
4. Histogram equivalent of binary or grayscale image.
5. Each segmented part is trained individually using the back propagation neural
network tool.
6. The above mentioned steps can also be carried out without segmentation of the
image.
7. For recognition process neural network tool in Matlab is used to train and test the
sample data provided by the histogram equivalent of the image taken.
8. In neural network tool, curve fitting tool has been made use to train the data.
9. The training of data takes place in 3 stages mainly training, testing and validation.
10. The mean squared error is the parameter used to judge whether the training is giving
desirable results or not.
11. The data is retrained to achieve better performance plots with the least mean squared
error.
12. The training phase continues with different samples in order to check the performance
of the network and to get better results.
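The following hedged sketch covers steps 1-4 of the pipeline (the file name sample1.jpg is an
assumption, and at least one detected face is assumed):

I = imread('sample1.jpg');                   % a sample facial image
detector = vision.CascadeObjectDetector();   % step 1: Viola-Jones face detection
bbox = step(detector, I);
face = imcrop(I, bbox(1,:));                 % keep the first detected face
gray = rgb2gray(face);                       % step 2: RGB to grayscale...
bw = im2bw(gray, graythresh(gray));          % ...and to binary (Otsu threshold)
counts = imhist(gray);                       % step 4: histogram equivalent
% 'counts' is the numeric sample data later fed to the neural network fitting tool.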
3.1 Viola-Jones Algorithm:
This is a preferred algorithm for frontal face detection due to its high accuracy and good
adaptability to real-time situations. It was trained on around 5,000 frontal faces and 300
million non-face sub-windows. The faces are normalized, scaled and then translated. Faces
themselves offer numerous variations due to pose, marks, illumination, angle and innumerable
other factors. The key challenge in face detection is that every image, whoever it belongs to,
presents on the order of 10-50 thousand candidate locations and scales, while the number of
actual faces per image is only 0-50.
Of the four stages described before, the first stage, Haar feature selection, matches the
regularities found in face images: the cheek regions are lighter than the eye regions, and the
region enclosing the bridge of the nose is brighter than the eyes, the mouth and the nose. The
feature values are oriented gradients of pixel intensities, which are then matched by the
algorithm. For rectangle features, the sum of the pixels under the black region is subtracted
from the sum under the white region to get the value. Rectangular boxes are laid over the
features being detected to show how the pixel count is done.
Figure 9: Flowchart of the Viola-Jones algorithm (input image → Haar feature selection →
integral image → AdaBoost → cascading).
The integral image calculates all the rectangular features in a fixed interval of time, which
speeds their evaluation compared with more elaborate features that cannot be expressed as
rectangular features.
A variant of the AdaBoost learning algorithm is employed to select the best features and to
train the classifiers that use them. Construction of a strong classifier proceeds by cascading
the previously trained weak classifiers; the more complex the structure of a classifier, the
stronger it becomes for detection purposes.
Figure 10: Rectangular features used in the Viola-Jones algorithm.
Evaluating the classifiers takes a lot of time. To reduce this, every successive classifier is
applied only to the samples that passed the previous ones. Since every feature is related to a
location in a sub-window, if a sub-window is rejected from inspection by a classifier, no
further processing happens for that window and a new sub-window is examined.
The very first classifier's job is to roughly halve the number of times the entire cascade has to
do a full evaluation. Every stage of the cascade holds a strong classifier, so the features are
grouped into stages, with every stage possessing certain features. Each stage checks whether
or not a given sub-window classifies as a face. If it is not a face, it is discarded.
3.2 Haar Features:
They are digital image features used in object recognition. They owe their name to their
intuitive similarity with Haar wavelets and were used in the first real-time face detector.
Historically, working with only image intensities (i.e., the RGB pixel values at each and every
pixel of the image) made the task of feature calculation computationally expensive.
Viola and Jones adapted the idea of using Haar wavelets and developed the so-called Haar-like
features. A Haar-like feature considers adjacent rectangular regions at a specific location in a
detection window, sums up the pixel intensities in each region and calculates the difference
between these sums.
This difference is then used to categorize subsections of an image. For example, let us say we
have an image database with human faces. It is a common observation that among all faces the
region of the eyes is darker than the region of the cheeks.
Therefore a common Haar feature for face detection is a set of two adjacent rectangles that lie
above the eye and the cheek regions. The position of these rectangles is defined relative to a
detection window that acts like a bounding box for the target object (the face in this case).
In the detection phase of the Viola–Jones object detection framework, a window of the target
size is moved over the input image, and for each subsection of the image the Haar-like feature
is calculated.
This difference is then compared to a learned threshold that separates non-objects from objects.
Because such a Haar-like feature is only a weak learner or classifier (its detection quality is
slightly better than random guessing) a large number of Haar-like features are necessary to
describe an object with sufficient accuracy.
In the Viola–Jones object detection framework, the Haar-like features are therefore organized
in something called a classifier cascade to form a strong learner or classifier.
The key advantage of a Haar-like feature over most other features is its calculation speed. Due
to the use of integral images, a Haar-like feature of any size can be calculated in constant time
(approximately 60 microprocessor instructions for a 2-rectangle feature).
A simple rectangular Haar-like feature can be defined as the difference of the sums of the
pixels of areas inside a rectangle, which can be at any position and scale within the original
image. This feature set is called the 2-rectangle feature.
Viola and Jones also defined 3-rectangle features and 4-rectangle features. The values indicate
certain characteristics of a particular area of the image. Each feature type can indicate the
existence (or absence) of certain characteristics in the image, such as edges or changes in
texture. For example, a 2-rectangle feature can indicate where the border lies between a dark
region and a light region.
One of the contributions of Viola and Jones was to use summed area tables, which they called
integral images. An integral image can be defined as a two-dimensional lookup table in the
form of a matrix of the same size as the original image, where each element contains the sum
of all pixels located in the up-left region of the original image (relative to the element's
position). This allows the sum of any rectangular area in the image, at any position or scale,
to be computed using only four lookups:
sum = I(D) + I(A) - I(B) - I(C)    (eq. 3.1)

where the points A, B, C and D belong to the integral image I, as shown in Figure 12.
Each Haar-like feature may need more than four lookups, depending on how it was defined.
Viola and Jones's 2-rectangle features need six lookups, 3-rectangle features need eight
lookups, and 4-rectangle features need nine lookups.
Figure 11: Flowchart for facial recognition
Figure 12: Finding sum of the shaded rectangular area.
3.3 Integral Image
A summed area table is a data structure and algorithm for quickly and efficiently generating
the sum of values in a rectangular subset of a grid. In the image processing domain, it is also
known as an integral image. It was introduced to computer graphics in 1984 by Frank Crow
for use with mipmaps. In computer vision it was popularized by Lewis, then given the name
"integral image" and prominently used within the Viola-Jones object detection framework in
2001.
Historically, this principle is well known in the study of multi-dimensional probability
distribution functions, namely in computing 2D (or ND) probabilities (the area under the
probability distribution) from the respective cumulative distribution functions.
Algorithm: As the name suggests, the value at any point (x, y) in the summed area table is just
the sum of all the pixels above and to the left of (x, y), inclusive:

I(x, y) = Σ_{x' ≤ x, y' ≤ y} i(x', y')    (eq. 3.2)

Moreover, the summed area table can be computed efficiently in a single pass over the image,
using the fact that the value in the summed area table at (x, y) is just:

I(x, y) = i(x, y) + I(x-1, y) + I(x, y-1) - I(x-1, y-1)    (eq. 3.3)
Once the summed area table has been computed, evaluating the sum of intensities over any
rectangular area requires only four array references. This allows for a constant calculation
time that is independent of the size of the rectangular area. That is, using the notation of
Figure 12, with A = (x0, y0), B = (x1, y0), C = (x0, y1) and D = (x1, y1), the sum of i(x, y)
over the rectangle spanned by A, B, C and D is:

Σ_{x0 < x ≤ x1, y0 < y ≤ y1} i(x, y) = I(D) + I(A) - I(B) - I(C)    (eq. 3.4)
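This identity can be checked from scratch in a few lines of base MATLAB (the test matrix and
corner coordinates are arbitrary assumptions):

img = magic(8);                              % arbitrary test "image"
ii = cumsum(cumsum(img,1),2);                % eq. 3.2 computed in a single pass
iipad = zeros(size(ii)+1);                   % pad so that I(0,y) = I(x,0) = 0
iipad(2:end,2:end) = ii;
x0 = 2; y0 = 3; x1 = 6; y1 = 7;              % corners A=(x0,y0) ... D=(x1,y1)
four = iipad(y1+1,x1+1) + iipad(y0+1,x0+1) ...   % I(D) + I(A)
     - iipad(y0+1,x1+1) - iipad(y1+1,x0+1);      %  - I(B) - I(C)
direct = sum(sum(img(y0+1:y1, x0+1:x1)));    % exclusive of row y0 and column x0
assert(four == direct)                       % eq. 3.4 agrees with direct summation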
3.4 AdaBoost Training Algorithm:
AdaBoost, short for "Adaptive Boosting", is a machine-learning meta-algorithm formulated
by Yoav Freund and Robert Schapire, who won the Gödel Prize in 2003 for their work. It can
be used in conjunction with many other types of learning algorithms to improve their
performance.
The output of the other learning algorithms ('weak learners') is combined into a weighted sum
that represents the final output of the boosted classifier. AdaBoost is adaptive in the sense that
subsequent weak learners are tweaked in favor of those instances misclassified by previous
classifiers.
AdaBoost is sensitive to noisy data and outliers. In some problems it can be less susceptible
to the overfitting problem than other learning algorithms. The individual learners can be weak,
but as long as the performance of each one is slightly better than random guessing (e.g., their
error rate is smaller than 0.5 for binary classification), the final model can be proven to
converge to a strong learner.
While every learning algorithm tends to suit some problem types better than others, and
typically has many different parameters and configurations to adjust before achieving optimal
performance on a dataset, AdaBoost (with decision trees as the weak learners) is often referred
to as the best out-of-the-box classifier.
When used with decision tree learning, information gathered at each stage of the AdaBoost
algorithm about the relative 'hardness' of each training sample is fed into the tree-growing
algorithm, such that later trees tend to focus on harder-to-classify examples.
Problems in machine learning often suffer from the curse of dimensionality — each sample
may consist of a huge number of potential features (for instance, there can be 162,336 Haar
features, as used by the Viola–Jones object detection framework, in a 24×24 pixel image
window), and evaluating every feature can reduce not only the speed of classifier training and
execution, but in fact reduce predictive power, per the Hughes Effect.
Unlike neural networks and SVMs, the AdaBoost training process selects only those features
known to improve the predictive power of the model, reducing dimensionality and potentially
improving execution time, as irrelevant features do not need to be computed.
Training
AdaBoost refers to a particular method of training a boosted classifier. A boosted classifier is
a classifier of the form

F_T(x) = Σ_{t=1}^{T} f_t(x)    (eq. 3.5)

where each f_t is a weak learner that takes an object x as input and returns a real-valued result
indicating the class of the object. The sign of the weak learner's output identifies the predicted
object class, and the absolute value gives the confidence in that classification. Similarly, the
T-layer classifier is positive if the sample is believed to be in the positive class and negative
otherwise.
Each weak learner produces an output hypothesis h(x_i) for each sample in the training set.
At each iteration t, a weak learner is selected and assigned a coefficient α_t such that the total
training error E_t of the resulting t-stage boost classifier is minimized:

E_t = Σ_i E[F_{t-1}(x_i) + α_t h(x_i)]    (eq. 3.6)

Here F_{t-1}(x) is the boosted classifier that has been built up to the previous stage of
training, E(F) is some error function, and f_t(x) = α_t h(x) is the weak learner that is being
considered for addition to the final classifier.
Weighting: At each iteration of the training process, a weight w_{i,t} is assigned to each
sample in the training set equal to the current error E(F_{t-1}(x_i)) on that sample. These
weights can be used to inform the training of the weak learner; for instance, decision trees can
be grown that favor splitting sets of samples with high weights.
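A compact sketch of discrete AdaBoost with decision stumps on one-dimensional data
illustrates eqs. 3.5-3.6; the toy data set and the number of rounds T are assumptions:

x = (1:8)';  y = [1 1 1 -1 -1 -1 1 1]';      % labels in {-1,+1}
n = numel(x);  w = ones(n,1)/n;  T = 5;
alpha = zeros(T,1); thr = zeros(T,1); pol = zeros(T,1);
for t = 1:T
    best = inf;
    for s = 0.5:1:8.5                        % candidate stump thresholds
        for p = [-1 1]                       % stump polarity
            h = p * sign(x - s);
            e = sum(w .* (h ~= y));          % weighted training error
            if e < best, best = e; thr(t) = s; pol(t) = p; end
        end
    end
    alpha(t) = 0.5 * log((1 - best) / max(best, eps));  % weak-learner coefficient
    h = pol(t) * sign(x - thr(t));
    w = w .* exp(-alpha(t) * y .* h);        % re-weight: boost misclassified samples
    w = w / sum(w);
end
F = zeros(n,1);                              % strong classifier F_T(x) = sum_t alpha_t f_t(x)
for t = 1:T
    F = F + alpha(t) * pol(t) * sign(x - thr(t));
end
disp(mean(sign(F) == y))                     % training accuracy of the boosted classifier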
3.5 Cascading Classifiers
A cascade of classifiers is built from a series of stage classifiers, each of which passes the
sub-windows it accepts on to the next stage in a daisy chain, much as cascaded amplifier
stages pass a signal from one stage to the next.
Figure 13: Cascade of classifiers, a simplified diagram
On average, only 0.01% of all sub-windows are positive (faces), yet equal computation time
is spent on all sub-windows; most of the time should be spent only on potentially positive
sub-windows. A simple 2-feature classifier can achieve an almost 100% detection rate with a
50% false-positive rate. That classifier can act as the first layer of a series to filter out most
negative windows; a second layer with 10 features can tackle the "harder" negative windows
which survived the first layer, and so on.
A cascade of gradually more complex classifiers achieves even better detection rates. The
evaluation of the strong classifiers generated by the learning process can be done quickly, but
it is not fast enough to run in real time. For this reason, the strong classifiers are arranged in a
cascade in order of complexity, where each successive classifier is trained only on those
selected samples which pass through the preceding classifiers.
If at any stage in the cascade a classifier rejects the sub-window under inspection, no further
processing is performed on it and searching continues with the next sub-window. The cascade
therefore has the form of a degenerate tree. In the case of faces, the first classifier in the
cascade, called the attentional operator, uses only two features to achieve a false-negative rate
of approximately 0% and a false-positive rate of 40%. The effect of this single classifier is to
reduce by roughly half the number of times the entire cascade is evaluated.
In cascading, each stage consists of a strong classifier. So all the features are grouped into
several stages where each stage has certain number of features.
The job of each stage is to determine whether a given sub-window is definitely not a face or
may be a face. A given sub-window is immediately discarded as not a face if it fails in any of
the stages.
A simple framework for cascade training is given below (a hedged sketch of the corresponding
training loop follows the list):
- The user selects values for f, the maximum acceptable false-positive rate per layer, and d,
the minimum acceptable detection rate per layer.
- The user selects the target overall false-positive rate, Ftarget.
- P = set of positive examples.
- N = set of negative examples.
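The loop implied by these parameters can be sketched as follows; the per-layer rates are
illustrative, and the stage-training call is a stand-in (a real implementation would train an
AdaBoost stage and refill N with the cascade's current false positives):

f = 0.5; d = 0.99; Ftarget = 0.01;           % illustrative per-layer and target rates
F = 1.0; i = 0; stages = {};
train_stage = @(f,d) sprintf('stage meeting f=%.2f, d=%.2f', f, d);  % stand-in helper
while F > Ftarget
    i = i + 1;
    stages{end+1} = train_stage(f, d);       % add features until the layer meets f and d
    F = F * f;                               % each layer multiplies the overall FP rate by <= f
end
fprintf('%d stages give overall false-positive rate <= %.4f\n', i, F);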
CHAPTER 4 Software Implementation
4.1 MATLAB:
The software used for implementing the approach was MATLAB. MATLAB (matrix
laboratory) is a numerical computing environment with a fourth-generation programming
language, capable of performing a wide range of mathematical operations.
It allows a user to plot functions and data, implement algorithms, create user interfaces and
interface with programs written in different languages. It has numerous toolboxes to work
with, depending on the type of problem at hand. It has found users not only in the field of
engineering but also in economics and science, and it keeps improving itself by launching
newer, enhanced versions with novel tools to discover and put to use for varied purposes.
MATLAB was developed by Cleve Moler in the 1970s to give his students a better way to work
with LINPACK and EISPACK without having to learn Fortran. Engineers like Jack Little made
some much-needed improvements to it, and by the year 2000 newer libraries had been added
for matrix manipulation.
The application is built around the MATLAB scripting language, and usage typically involves
writing commands in the command window, which executes the written code. In MATLAB
we have worked within the neural network toolbox, in which the face recognition was done
for the image.
4.2 Open Source Computer Vision
OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed
at real-time computer vision, originally developed by Intel's research center in Nizhny
Novgorod (Russia), later supported by Willow Garage and now maintained by Itseez. The
library is cross-platform and free for use under the open-source BSD license.
Officially launched in 1999, the OpenCV project was initially an Intel Research initiative to
advance CPU-intensive applications, part of a series of projects including real-time ray
tracing and 3D display walls. The main contributors to the project included a number of
optimization experts in Intel Russia, as well as Intel’s Performance Library Team. In the early
days of OpenCV, the goals of the project were described as:
- Advance vision research by providing not only open but also optimized code for basic
vision infrastructure. No more reinventing the wheel.
- Disseminate vision knowledge by providing a common infrastructure that developers
could build on, so that code would be more readily readable and transferable.
- Advance vision-based commercial applications by making portable, performance-
optimized code available for free, with a license that did not require the applications
themselves to be open or free.
The first alpha version of OpenCV was released to the public at the IEEE Conference on
Computer Vision and Pattern Recognition in 2000, and five betas were released between 2001
and 2005. The first 1.0 version was released in 2006. In mid-2008, OpenCV obtained corporate
support from Willow Garage, and is now again under active development. A version 1.1 "pre-
release" was released in October 2008.
The second major release of OpenCV was in October 2009. OpenCV 2 includes major
changes to the C++ interface, aiming at easier, more type-safe patterns, new functions, and
better implementations for existing ones in terms of performance (especially on multi-core
systems). Official releases now occur every six months and development is now done by an
independent Russian team supported by commercial corporations.
In August 2012, support for OpenCV was taken over by a non-profit foundation, OpenCV.org,
which maintains a developer and user site.
OpenCV is written in C++ and its primary interface is in C++, but it still retains a less
comprehensive though extensive older C interface. There are bindings in Python, Java and
MATLAB/OCTAVE. The API for these interfaces can be found in the online
documentation. Wrappers in other languages such as C#, Perl, Ch, and Ruby have been
developed to encourage adoption by a wider audience. All of the new developments and
algorithms in OpenCV are now developed in the C++ interface.
4.3 Computer Vision Tools:
Computer Vision System Toolbox provides algorithms, functions, and apps for designing and
simulating computer vision and video processing systems. You can perform feature detection,
extraction, and matching; object detection and tracking; motion estimation; and video
processing. For 3-D computer vision, the system toolbox supports camera calibration, stereo
vision, 3-D reconstruction, and 3-D point cloud processing. With machine learning based
frameworks, you can train object detection, object recognition, and image retrieval systems.
Algorithms are available as MATLAB functions, System objects, and Simulink blocks. For
rapid prototyping and embedded system design, the system toolbox supports fixed-point
arithmetic and C-code generation.
Key Features:
 Object detection and tracking, including the Viola-Jones, Kanade-Lucas-Tomasi (KLT), and
Kalman filtering methods
 Training of object detection, object recognition, and image retrieval systems, including cascade
object detection and bag-of-features methods
 Camera calibration for single and stereo cameras, including automatic checkerboard detection
and an app for workflow automation
 Stereo vision, including rectification, disparity calculation, and 3-D reconstruction
 3-D point cloud processing, including I/O, visualization, registration, denoising, and geometric
shape fitting
 Feature detection, extraction, and matching
 Support for C-code generation and fixed-point arithmetic with code generation products.
Object Detection and Recognition: Object detection and recognition are used to locate, identify,
and categorize objects in images and video. Computer Vision System Toolbox provides a
comprehensive suite of algorithms and tools for object detection and recognition.
Object Classification: You can detect or recognize an object in an image by training an object
classifier using pattern recognition algorithms that create classifiers based on training data from
different object classes. The classifier accepts image data and assigns the appropriate object or
class label.
Computer Vision System Toolbox provides algorithms to train image classification and image
retrieval systems using the bag-of-words model. The system toolbox also provides feature
extraction techniques you can use to create custom classifiers using supervised classification
algorithms from Statistics and Machine Learning Toolbox.
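As a hedged sketch of such a bag-of-words workflow (the folder name is an assumption, and imageSet, bagOfFeatures, and trainImageCategoryClassifier ship only with more recent toolbox releases, so they may not exist in older ones):
imgSets = imageSet('trainingImages', 'recursive');               % hypothetical folder of labeled images
bag = bagOfFeatures(imgSets);                                    % build the visual vocabulary
categoryClassifier = trainImageCategoryClassifier(imgSets, bag); % supervised classifier on top of the bag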
4.4 Vision Cascade Object Detector:
The cascade object detector uses the Viola-Jones algorithm to detect people's faces, noses,
eyes, mouths, or upper bodies. One can also use the Training Image Labeler to train a custom
classifier to use with this System object.
Construction
detector = vision.CascadeObjectDetector creates a System object, detector, that detects objects
using the Viola-Jones algorithm. The ClassificationModel property controls the type of object
to detect. By default, the detector is configured to detect faces.
detector = vision.CascadeObjectDetector(MODEL) creates a System object, detector,
configured to detect objects defined by the input string, MODEL. The MODEL input describes
the type of object to detect; valid MODEL strings include 'FrontalFaceCART', 'UpperBody',
and 'ProfileFace'.
detector = vision.CascadeObjectDetector(XMLFILE) creates a System object, detector, and
configures it to use the custom classification model specified with the XMLFILE input. The
XMLFILE can be created using the trainCascadeObjectDetector function or the OpenCV
(Open Source Computer Vision) training functionality.
You must specify a full or relative path to the XMLFILE, if it is not on the MATLAB path.
detector = vision.CascadeObjectDetector(Name,Value) configures the cascade object detector
object properties. You specify these properties as one or more name-value pair arguments.
Unspecified properties have default values.
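For example, one might construct a detector with name-value pairs as below (the threshold value here is an arbitrary illustration, not a recommended setting):
detector = vision.CascadeObjectDetector('FrontalFaceCART', 'MergeThreshold', 8);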
To detect a feature:
1. Define and set up your cascade object detector using the constructor.
2. Call the step method with the input image, I, the cascade object detector object, detector,
and any optional properties. See the syntax below for using the step method.
Use the step syntax with input image, I, the selected Cascade object detector object, and any
optional properties to perform detection.
BBOX = step(detector, I) returns BBOX, an M-by-4 matrix defining M bounding boxes
containing the detected objects. This method performs multiscale object detection on the input
image, I. Each row of the output matrix, BBOX, contains a four-element vector, [x y width
height], that specifies in pixels, the upper-left corner and size of a bounding box. The input
image I, must be a grayscale or true color (RGB) image.
BBOX = step(detector, I,roi) detects objects within the rectangular search region specified
by roi. You must specify roi as a 4-element vector, [x y width height], that defines a rectangular
region of interest within image I. Set the UseROI property to true to use this syntax.
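A minimal sketch of ROI-based detection, assuming 'faces.jpg' is an image on the MATLAB path and that the release in use supports the UseROI property:
detector = vision.CascadeObjectDetector;
detector.UseROI = true;
I = imread('faces.jpg');        % hypothetical input image
roi = [50 50 200 200];          % [x y width height] search region, values chosen arbitrarily
BBOX = step(detector, I, roi);  % detect faces only inside the ROI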
Why Train a Detector?
The vision.CascadeObjectDetector System object comes with several pretrained classifiers for
detecting frontal faces, profile faces, noses, eyes, and the upper body. However, these
classifiers are not always sufficient for a particular application. Computer Vision System
Toolbox provides the trainCascadeObjectDetector function to train a custom classifier.
What Kinds of Objects Can You Detect?: The Computer Vision System Toolbox cascade
object detector can detect object categories whose aspect ratio does not vary significantly.
Objects whose aspect ratio remains approximately fixed include faces, stop signs, and cars
viewed from one side. The vision.CascadeObjectDetector System object detects objects in
images by sliding a window over the image. The detector then uses a cascade classifier to
decide whether the window contains the object of interest.
The size of the window varies to detect objects at different scales, but its aspect ratio remains
fixed. The detector is very sensitive to out-of-plane rotation, because the aspect ratio changes
for most 3-D objects. Thus, you need to train a detector for each orientation of the object.
Training a single detector to handle all orientations will not work.
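A hedged sketch of training such a custom detector (positiveInstances, negativeFolder, the output file name, and the parameter values are all assumptions for illustration):
% positiveInstances: struct array with imageFilename and objectBoundingBoxes fields
% negativeFolder: a folder of images that do not contain the object
trainCascadeObjectDetector('myObject.xml', positiveInstances, negativeFolder, ...
    'FalseAlarmRate', 0.2, 'NumCascadeStages', 5);
detector = vision.CascadeObjectDetector('myObject.xml');     % use the trained model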
Figure 14: Training cascade detector
How Does the Cascade Classifier Work?
The cascade classifier consists of stages, where each stage is an ensemble of weak learners.
The weak learners are simple classifiers called decision stumps. Each stage is trained using a
technique called boosting. Boosting provides the ability to train a highly accurate classifier by
taking a weighted average of the decisions made by the weak learners.
Each stage of the classifier labels the region defined by the current location of the sliding
window as either positive or negative. Positive indicates that an object was found
and negative indicates no objects were found. If the label is negative, the classification of this
region is complete, and the detector slides the window to the next location. If the label is
positive, the classifier passes the region to the next stage. The detector reports an object found
at the current window location when the final stage classifies the region as positive.
The stages are designed to reject negative samples as fast as possible. The assumption is that
the vast majority of windows do not contain the object of interest. Conversely, true positives
are rare and worth taking the time to verify.
 A true positive occurs when a positive sample is correctly classified.
 A false positive occurs when a negative sample is mistakenly classified as positive.
 A false negative occurs when a positive sample is mistakenly classified as negative.
To work well, each stage in the cascade must have a low false negative rate. If a stage
incorrectly labels an object as negative, the classification stops, and you cannot correct the
mistake. However, each stage can have a high false positive rate. Even if the detector
incorrectly labels a non-object as positive, you can correct the mistake in subsequent stages.
The overall false positive rate of the cascade classifier is f^s, where f is the false positive rate
per stage in the range (0, 1), and s is the number of stages. Similarly, the overall true positive
rate is t^s, where t is the true positive rate per stage in the range (0, 1]. Thus, adding more stages
reduces the overall false positive rate, but it also reduces the overall true positive rate.
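To make this concrete, a quick numerical check in MATLAB (the per-stage rates and stage count are chosen only for illustration):
f = 0.5;  t = 0.995;  s = 20;   % illustrative per-stage rates and number of stages
overallFPR = f^s                % about 9.5e-7
overallTPR = t^s                % about 0.90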
4.5 Neural Network Tools:
It provides algorithms, functions, and various apps to create, train, visualize, and simulate
neural networks. It offers regression, clustering, pattern recognition, dimensionality reduction,
dynamic system modeling, and control. The toolbox includes convolutional neural networks
and deep learning algorithms for various computing tasks. The Parallel Computing Toolbox
makes it possible to handle large data sets by speeding up training and spreading it across
multicore processors and GPUs. Among its various features, the toolbox supports supervised
and unsupervised learning algorithms and offers pre- and post-processing for better training
efficiency and enhanced network performance.
Under the network architecture it offers supervised and unsupervised networks; under
supervised networks it has four types to offer: feed-forward networks, radial basis networks,
learning vector quantization, and dynamic networks. We have used feed-forward networks to
compute our data; these have one-way connections and no feedback mechanism.
Feed-forward networks support nonlinear function fitting, pattern recognition, and prediction.
The supported feed-forward variants include feed-forward backpropagation, cascade-forward
backpropagation, feed-forward input-delay backpropagation, and linear and perceptron
networks. A small command-line sketch follows.
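As a sketch of creating such a network at the command line (the hidden layer size is an arbitrary choice):
net = feedforwardnet(10);       % one hidden layer of 10 neurons, trained with trainlm by default
view(net)                       % inspect the one-way, feedback-free architecture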
4.6 Neural Network Fitting Tool
nftool opens the neural network fitting tool GUI, which leads you through solving a data fitting
problem with a two-layer feed-forward network trained with Levenberg-Marquardt.
Neural networks are good at fitting functions. In fact, there is proof that a fairly simple neural
network can fit any practical function.
Suppose, for instance, that you have data from a housing application. You want to design a
network that can predict the value of a house (in $1000s), given 13 pieces of geographical and
real estate information. You have a total of 506 example homes for which you have those 13
items of data and their associated market values.
You can solve this problem in two ways:
 Use a graphical user interface, nftool, as described in Using the Neural Network Fitting Tool.
 Use command-line functions, as described in Using Command-Line Functions.
It is generally best to start with the GUI, and then to use the GUI to automatically generate
command-line scripts. Before using either method, first define the problem by selecting a data
set. Each GUI has access to many sample data sets that you can use to experiment with the
toolbox (see Neural Network Toolbox Sample Data Sets). If you have a specific problem that
you want to solve, you can load your own data into the workspace. The next section describes
the data format.
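For the housing example above, a minimal command-line equivalent of the GUI workflow might look as follows (house_dataset is one of the toolbox sample data sets):
[x, t] = house_dataset;         % 13 inputs and 1 target value per example home
net = fitnet(10);               % two-layer feed-forward fitting network
[net, tr] = train(net, x, t);   % trains with Levenberg-Marquardt by default
y = net(x);
perf = perform(net, t, y)       % mean squared error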
4.7 Training Algorithm
It supports training and learning functions, which are ways to manipulate the numerical weights
and biases associated with the network. A training function is a global algorithm affecting the
weights and biases of the whole network, whereas a learning function concentrates on the
individual weights and biases.
The algorithms supported by the Neural Network Toolbox include the Levenberg-Marquardt
algorithm and resilient backpropagation, along with many gradient descent and conjugate
gradient methods. The modular framework helps in developing custom training algorithms as
well. When training the network, error weights can be used to define the relative importance of
the desired outputs.
The algorithms can be accessed directly from the command line or via apps that show a
diagrammatic representation of the networks in use and provide features to monitor and train
the specific network. A suite of learning functions, including gradient descent, Hebbian
learning, LVQ, Widrow-Hoff, and Kohonen, is also provided.
trainlm is a network training function that updates weight and bias values according to
Levenberg-Marquardt optimization. trainlm is often the fastest backpropagation algorithm in
the toolbox, and is highly recommended as a first-choice supervised algorithm, although it does
require more memory than other algorithms.
trainlm supports training with validation and test vectors if the
network's NET.divideFcn property is set to a data division function. Validation vectors are used
to stop training early if the network performance on the validation vectors fails to improve or
remains the same for max_fail epochs in a row. Test vectors are used as a further check that
the network is generalizing well, but do not have any effect on training.
trainlm can train any network as long as its weight, net input, and transfer functions have
derivative functions.
Backpropagation is used to calculate the Jacobian jX of performance perf with respect to the
weight and bias variables X. Each variable is adjusted according to Levenberg-Marquardt:
jj = jX' * jX                  eq 4.1
je = jX' * E                   eq 4.2
dX = -(jj + I*mu) \ je         eq 4.3
where E is the vector of all errors and I is the identity matrix.
The adaptive value mu is increased by mu_inc until the change above results in a reduced
performance value. The change is then made to the network, and mu is decreased by mu_dec.
The parameter mem_reduc indicates how to use memory and speed to calculate the
Jacobian jX.
If mem_reduc is 1, then trainlm runs the fastest, but can require a lot of memory.
Increasing mem_reduc to 2 cuts some of the memory required by a factor of two, but
slows trainlm somewhat. Higher states continue to decrease the amount of memory needed and
increase training times.
Training stops when any of these conditions occurs:
 The maximum number of epochs (repetitions) is reached.
 The maximum amount of time is exceeded.
 Performance is minimized to the goal.
 The performance gradient falls below min_grad.
 mu exceeds mu_max.
 Validation performance has increased more than max_fail times since the last time it
decreased (when using validation).
In mathematics and computing, the Levenberg–Marquardt algorithm
(LMA), also known as the damped least-squares (DLS) method, is used to solve non-linear
least squares problems. These minimization problems arise especially in least squares curve
fitting. The LMA is used in many software applications for solving generic curve-fitting
problems. However, as for many fitting algorithms, the LMA finds only a local minimum,
which is not necessarily the global minimum.
The LMA interpolates between the Gauss–Newton algorithm (GNA) and the method
of gradient descent. The LMA is more robust than the GNA, which means that in many cases
it finds a solution even if it starts very far off the final minimum. For well-behaved functions
and reasonable starting parameters, the LMA tends to be a bit slower than the GNA. LMA can
also be viewed as Gauss–Newton using a trust region approach.
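Tying these parameters together, a minimal sketch of configuring trainlm from the command line (all values are illustrative, not tuned for this problem):
net = fitnet(10, 'trainlm');
net.trainParam.epochs   = 1000;   % maximum number of epochs
net.trainParam.goal     = 1e-5;   % performance goal
net.trainParam.max_fail = 6;      % validation failures allowed before early stopping
net.trainParam.mu       = 0.001;  % initial mu
net.trainParam.mu_max   = 1e10;   % training stops if mu exceeds this value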
4.8 Programming Code for Face Detection and its features
Code for face detection:
clear all
clc
% Detect objects using the Viola-Jones algorithm
% To detect faces
FDetect = vision.CascadeObjectDetector;
% Read the input image
I = imread('HarryPotter.jpg');
% Returns bounding box values based on the number of objects detected
BB = step(FDetect, I);
figure,
imshow(I); hold on
for i = 1:size(BB,1)
    rectangle('Position', BB(i,:), 'LineWidth', 5, 'LineStyle', '-', 'EdgeColor', 'r');
end
title('Face Detection');
hold off;
Figure 15: Faces detected by the algorithm are highlighted by bounding boxes
Code for Nose Detection:
%To detect Nose
NoseDetect = vision.CascadeObjectDetector('Nose','MergeThreshold',16);
BB=step(NoseDetect,I);
figure,
imshow(I); hold on
for i = 1:size(BB,1)
rectangle('Position',BB(i,:),'LineWidth',4,'LineStyle','-','EdgeColor','b');
end
title('Nose Detection');
hold off;
Figure 16: Nose detected
Explanation:
To denote the object of interest as a nose, the argument 'Nose' is passed:
vision.CascadeObjectDetector('Nose','MergeThreshold',16);
The default syntax for nose detection is:
vision.CascadeObjectDetector('Nose');
Code for Mouth Detection:
%To detect Mouth
MouthDetect = vision.CascadeObjectDetector('Mouth','MergeThreshold',16);
BB=step(MouthDetect,I);
figure,
imshow(I); hold on
for i = 1:size(BB,1)
rectangle('Position',BB(i,:),'LineWidth',4,'LineStyle','-','EdgeColor','r');
end
title('Mouth Detection');
hold off;
Figure 17: Mouth Detected
Code for eye detection:
%To detect Eyes
EyeDetect = vision.CascadeObjectDetector('EyePairBig');
%Read the input Image
I = imread('harry_potter.jpg');
BB=step(EyeDetect,I);
figure,imshow(I);
rectangle('Position',BB,'LineWidth',4,'LineStyle','-','EdgeColor','b');
title('Eyes Detection');
Eyes=imcrop(I,BB);
figure,imshow(Eyes);
Figure 18: Eyes detected
4.9 Procedure for face recognition:
After detecting the face and its features, we move on to the task of recognition, which is
completed using neural network training. Here I am using different faces to show the procedure
of face recognition; one can continue with the same images. First we read the image and
convert it to grayscale or binary, as we will be making use of the equivalent histogram for
feeding the data into the neural network.
Code for image conversion from RGB to grayscale and binary:
I = imread('im111.jpg');
J = rgb2gray(I);              % grayscale equivalent
K = im2bw(J);                 % binary equivalent
subplot(1,3,1); imshow(I);
subplot(1,3,2); imshow(J);
subplot(1,3,3); imshow(K);
hold on;
Figure 19: Sample images in RGB, grayscale, and binary formats
Code for image segmentation:
im = imread('im111.jpg');
n = fix(size(im,1)/2);        % row index at half the image height
A = im(1:n, :, :);            % top half of the image
B = im(n+1:end, :, :);        % bottom half of the image
subplot(1,2,1); imshow(A);
subplot(1,2,2); imshow(B);
hold on;
Figure 20: Segmented image
One can carry out the same procedure on the segmented image as well. I have shown the
segmentation of images, but recognition has been carried out with the grayscale-equivalent
histogram.
Code for binary image histogram (plotting the histogram in its own figure, so it does not
overwrite the third subplot):
I = imread('im111.jpg');
J = rgb2gray(I);
K = im2bw(J);
subplot(1,3,1); imshow(I);
subplot(1,3,2); imshow(J);
subplot(1,3,3); imshow(K);
figure, imhist(K);            % histogram of the binary image
hold on;
Figure 21: Binary image histogram equivalent
Code for calculating the histogram equivalent of a grayscale image:
I = imread('im111.jpg');
J = rgb2gray(I);
K = im2bw(J);
subplot(1,3,1); imshow(I);
subplot(1,3,2); imshow(J);
subplot(1,3,3); imshow(K);
figure, imhist(J);            % histogram of the grayscale image
hold on;
Figure 22: Sample image 1 in RGB, sample image 2 in grayscale, and the histogram equivalent
of sample image 2.
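To turn such a histogram into inputs for the network, one possible sketch (normalizing the bin counts is our own choice for illustration, not a step prescribed by the toolbox):
I = imread('im111.jpg');
J = rgb2gray(I);
counts = imhist(J);               % 256-bin grayscale histogram
features = counts / sum(counts);  % normalize so the bins sum to 1
% 'features' can now form one column of the input matrix fed to the network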
CHAPTER 5 Results and Discussion
For the training procedure, a three-layer neural network is used. Training and testing of the
data were done using the neural network tool in MATLAB. We used feed-forward networks
under the supervised learning architecture of the Neural Network Toolbox to compute our data;
these networks have one-way connections and no feedback mechanism.
5.1 Results of face recognition:
After opening the nftool GUI, we choose the data to be trained for the recognition procedure.
That data is usually imported from an Excel workbook.
Table 1: Sample data for sample images 1, 2, and 3
S.no Sample Input data Target data
1 1 10010101 10001111
2 2 11001100 11110000
3 3 10101100 10101011
After importing the data, the neural network tool starts the training, depicted by the diagram
shown below. Training is carried out in three stages: training, validation, and testing. In the
training stage, the network is adjusted according to the error generated during it. During
validation, network generalization is monitored and used to halt training. In testing, the final
solution is tested to confirm the predictive power of the network. Usually several training
sessions are conducted to enhance network performance and lower the mean squared error,
which is the average squared difference between output and target. A command-line sketch of
this workflow follows.
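A hedged sketch of the same workflow at the command line (the file name 'facedata.xlsx' and its column layout are assumptions for illustration only):
data = xlsread('facedata.xlsx');   % sample data exported from the Excel workbook
inputs  = data(:, 1:8)';           % input patterns, one column per sample
targets = data(:, 9:16)';          % target patterns, one column per sample
net = fitnet(10);                  % two-layer feed-forward fitting network
[net, tr] = train(net, inputs, targets);       % training, validation, testing splits
mseValue = perform(net, targets, net(inputs))  % mean squared error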
Figure 23: A three stage training procedure
The performance plot shown below is one criterion for checking the performance of the
network; regression plots can be devised for the same purpose, but for convenience I have
stuck to performance plots.
Figure 24: Parameters calculated during training
Figure 25: Performance plot for sample data 1.
Table 2: Sample data for sample image 2
Input data Target data
462 0 0 102 1100
342 0 0 78 0010
234 0 0 65 1001
500 0 0 132 1010
222 0 0 69 1011
165 0 0 45 0111
Figure 26: Performance plot of sample data for sample image 2.
CHAPTER 6 Conclusion and Future Scope
6.1 Conclusion:
In this work it has been shown that, given a facial image of a person, the network is able to
recognize the face of that person. The whole work is completed through the following steps:
facial images of different persons have been collected for showing the detection and
recognition process in detail.
For detection, we used an image of a group of people and detected their faces and features,
highlighting the detected regions with bounding boxes. Detection used the
vision.CascadeObjectDetector System object in the Computer Vision System Toolbox of
MATLAB 8.2, while recognition used the Neural Network Toolbox with a feed-forward
backpropagation architecture.
Within the Neural Network Toolbox, we worked with the neural network fitting tool to train
the facial image data. The image taken for recognition was converted into grayscale and binary
formats. Taking the grayscale image, we computed its histogram, from which the data was
compiled for image recognition.
One can also do the same using the binary image. In the neural network fitting tool, training
was carried out in three stages, namely training, validation, and testing, which have already
been explained earlier in the report. The training procedure computed the error to monitor the
performance of the network, and validation concentrated on network generalization.
In testing, the final solution was obtained by carrying out the training session 5-10 times in
order to achieve better outcomes. Mean squared error was the parameter chosen to assess the
network's performance, and low values were obtained for it. Thus, from the results, one can
conclude that the given image was recognized. The accuracy of this network is approximately
85%.
6.2 Future Scope:
Face recognition systems used today work very well under constrained conditions, although all
systems work much better with frontal mug-shot images and constant lighting.
All current face recognition algorithms fail under the vastly varying conditions under which
humans need to and are able to identify other people. Next generation person recognition
systems will need to recognize people in real-time and in much less constrained situations.
I believe that identification systems that are robust in natural environments, in the presence of
noise and illumination changes, cannot rely on a single modality, so fusion with other
modalities is essential.
Technology used in smart environments has to be unobtrusive and allow users to act freely.
Wearable systems in particular require their sensing technology to be small, low powered and
easily integrable with the user's clothing.
Considering all the requirements, identification systems that use face recognition and speaker
identification seem to us to have the most potential for wide-spread application. Cameras and
microphones today are very small, light-weight and have been successfully integrated with
wearable systems.
Audio and video based recognition systems have the critical advantage that they use the
modalities humans use for recognition. Finally, researchers are beginning to demonstrate that
unobtrusive audio-and-video based person identification systems can achieve high recognition
rates without requiring the user to be in highly controlled environments. Sooner or later the
process of face recognition may become fully automated, using the neural network as its base
technique in combination with other techniques.
REFERENCES:
[1] H.A. Rowley, S. Baluja, and T. Kanade, “Neural network - based face detection,” IEEE Trans.Pattern Analysis
and Machine Intelligence, vol. 20, no. 1, pp. 23–38, Jan. 1998.
[2] H.A. Rowley, S. Baluja, and T. Kanade, “Rotation invariant neural network-based face detection,” Proc. IEEE
Int’l Conf. Computer Vision and Pattern Recognition, pp. 38–44, 1998.
[3] K.K. Sung and T. Poggio, “Example-based learning for view-based human face detection,” IEEE Trans.
Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 39–51, Jan. 1998.
[4] H. Schneiderman and T. Kanade, “A statistical method for 3D object detection applied to faces and cars,”
Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, pp. 746–751, June 2000.
[5] K.C. Yow and R. Cipolla, “Feature-based human face detection,” Image and Vision Computing, vol. 25, no.
9, pp. 713–735, Sept. 1997.
[6] D. Maio and D. Maltoni, “Real-time face location on gray-scale static images,” Pattern Recognition, vol. 33,
no. 9, pp. 1525–1539, Sept. 2000.
[7] M.S. Lew and N. Huijsmans, “Information theory and face detection,” Proc. IEEE Int’l Conf. Pattern
Recognition, pp. 601–605, Aug. 1996.
[8] S.C. Dass and A.K. Jain, “Markov face models,” Proc. IEEE Int’l Conf.Computer Vision, pp. 680–687, July
2001.
[9] A.J. Colmenarez and T.S. Huang, “Face detection with information based maximum discrimination,” Proc.
IEEE Int’l Conf. Computer Vision and Pattern Recognition, pp. 782–787, June 1997.
[10] D. DeCarlo and D. Metaxas, “Optical flow constraints on deformable models with applications to face
tracking,” International Journal Computer Vision, vol. 38, no. 2, pp. 99–127, July 2000.
[11] V. Bakic and G. Stockman, “Menu selection by facial aspect,” Proc. Vision Interface, Canada, pp. 203–209,
May 1999.
[12] A. Colmenarez, B. Frey, and T. Huang, “Detection and tracking of faces and facial features,” Proc. IEEE
Int’l Conf. Image Processing, pp. 657–661, Oct. 1999.
[13] R. Féraud, O.J. Bernier, J.-E. Viallet, and M. Collobert, “A fast and accurate face detection based on neural
network,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23.
[14] W. Zhao, R. Chellappa, P.J. Phillips, and A. Rosenfeld, “Face Recognition: A Literature Survey,” ACM
Comput. Surv., vol. 35, no. 4, pp. 399–458, 2003.
[15] M.A. Turk and A.P. Pentland, “Face Recognition Using Eigenfaces,” IEEE Conf. on Computer Vision and
Pattern Recognition, pp. 586–591, 1991.
[16] Ling-Zhi Liao, Si-Wei Luo, and Mei Tian, “‘Whitened faces’ Recognition With PCA and ICA,” IEEE Signal
Processing Letters, vol. 14, no. 12, pp. 1008–1011, Dec. 2007.
[17] G. Jarillo, W. Pedrycz, and M. Reformat, “Aggregation of classifiers based on image transformations in
biometric face recognition,” Machine Vision and Applications, vol. 19, pp. 125–140, Springer-Verlag, 2007.
[18] Tej Pal Singh, “Face Recognition by using Feed Forward Back Propagation Neural Network”, International
Journal of Innovative Research in Technology & Science, vol.1, no.1.
[19] N.Revathy, T.Guhan, “Face recognition system using back propagation artificial neural networks”,
International Journal of Advanced Engineering Technology, vol.3, no. 1, 2012.
[20] Kwan-Ho Lin, Kin-Man Lam, and Wan-Chi Siu, “A New Approach using Modified Hausdorff Distances
with EigenFace for Human Face Recognition,” IEEE Seventh International Conference on Control, Automation,
Robotics and Vision, Singapore, 2–5 Dec., pp. 980–984, 2002.
[21] Zdravko Liposcak and Sven Loncaric, “Face Recognition From Profiles Using Morphological Operations,”
IEEE Conference on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, 1999.
[22] Simone Ceolin , William A.P Smith, Edwin Hancock, “Facial Shape Spaces from Surface Normals and
Geodesic Distance”, Digital Image Computing Techniques and Applications, 9th Biennial Conference of
Australian Pattern Recognition Society IEEE 3-5 Dec., Glenelg, pp-416-423, 2007
[23] Ki-Chung Chung, Seok Cheol Kee, and Sang Ryong Kim, “Face Recognition using Principal Component
Analysis of Gabor Filter Responses,” Proc. Int’l Workshop on Recognition, Analysis, and Tracking of Faces and
Gestures in Real-Time Systems, IEEE, 26–27 September, Corfu, pp. 53–57, 1999.
[24] Varsha Gupta and Dipesh Sharma, “A Study of Various Face Detection Methods,” International Journal of
Advanced Research in Computer and Communication Engineering, vol. 3, no. 5, May 2014.
[25] Irene Kotsia, Iaonnis Pitas, “Facial expression recognition in image sequences using geometric deformation
features and support vector machines”, IEEE transaction paper on image processing, vol. 16, no.1, pp-172-187,
January 2007.
[26] R. Rojas, “The backpropagation algorithm,” in Neural Networks, Springer-Verlag, pp. 149–182, 1996.
[27] Hosseien Lari-Najaffi, Mohammad Naseerudin, and Tariq Samad, “Effects of initial weights on back
propagation and its variations,” Systems, Man and Cybernetics, Conference Proceedings, IEEE International
Conference, 14–17 Nov., Cambridge, pp. 218–219, 1989.
[28] M.H. Ahmad Fadzil and H. Abu Bakar, “Human face recognition using neural networks,” Proc. ICIP-94,
IEEE International Conference on Image Processing, 13–16 November, Austin, pp. 936–939, 1994.
[29] N.K Sinha , M.M Gupta and D.H Rao, “Dynamic Neural Networks -an overview”, Industrial Technology
2000,Proceedings of IEEE International Conference,19-22 Jan, , pp-491-496, 2000
[30] Prachi Agarwal and Naveen Prakash, “An Efficient Back Propagation Neural Network Based Face
Recognition System Using Haar Wavelet Transform and PCA,” International Journal of Computer Science and
Mobile Computing, vol. 2, no. 5, pp. 386–395, May 2013.
[31] Dibber, “Backpropagation,” https://en.wikipedia.org/wiki/Backpropagation, 4 Jan 2005, accessed 20
September 2015.
[32] “Artificial Neural Networks,” https://en.wikipedia.org/wiki/Artificial_neural_network, accessed online 2
October 2001.
[33] Michael Nielsen, “Neural networks and deep learning,”
http://neuralnetworksanddeeplearning.com/chap2.html, 2016.
[34] Tyrell turing, “Feed forward neural network,” https://en.wikipedia.org/wiki/Feedforward_neural_network,
7 April 2005.
Publications:
Journal Publications
Smriti Tikoo and Nitin Malik, “Detection of face using Viola Jones algorithm and
recognition using Backpropagation neural network,” International Journal of
Computer Science and Mobile Computing, vol. 5, no. 5, pp. 288–295, May 2016.
Communicated: “Detection, segmentation and recognition of face and its features using
neural networks”, International Journal of Advanced Research in Computer and
Communication Engineering.

Contenu connexe

Tendances

ATM System by image processing
ATM System by image processingATM System by image processing
ATM System by image processing
shahab islam
 

Tendances (20)

License Plate Recognition
License Plate RecognitionLicense Plate Recognition
License Plate Recognition
 
Face recognition
Face recognitionFace recognition
Face recognition
 
Face Detection and Recognition System
Face Detection and Recognition SystemFace Detection and Recognition System
Face Detection and Recognition System
 
Face recognition attendance system
Face recognition attendance systemFace recognition attendance system
Face recognition attendance system
 
Face Recognition Technology
Face Recognition TechnologyFace Recognition Technology
Face Recognition Technology
 
Suspicious Activity Detection
Suspicious Activity DetectionSuspicious Activity Detection
Suspicious Activity Detection
 
Attendance Management System using Face Recognition
Attendance Management System using Face RecognitionAttendance Management System using Face Recognition
Attendance Management System using Face Recognition
 
Attendance system based on face recognition using python by Raihan Sikdar
Attendance system based on face recognition using python by Raihan SikdarAttendance system based on face recognition using python by Raihan Sikdar
Attendance system based on face recognition using python by Raihan Sikdar
 
ATM System by image processing
ATM System by image processingATM System by image processing
ATM System by image processing
 
Face Recognition Home Security System
Face Recognition Home Security SystemFace Recognition Home Security System
Face Recognition Home Security System
 
Real time drowsy driver detection
Real time drowsy driver detectionReal time drowsy driver detection
Real time drowsy driver detection
 
Gender and Age Detection using OpenCV.pptx
Gender and Age Detection using OpenCV.pptxGender and Age Detection using OpenCV.pptx
Gender and Age Detection using OpenCV.pptx
 
seminar presentation on Face ricognition technology
seminar presentation on Face ricognition technologyseminar presentation on Face ricognition technology
seminar presentation on Face ricognition technology
 
Security
SecuritySecurity
Security
 
Face Recognition Methods based on Convolutional Neural Networks
Face Recognition Methods based on Convolutional Neural NetworksFace Recognition Methods based on Convolutional Neural Networks
Face Recognition Methods based on Convolutional Neural Networks
 
Automated attendance system using Face recognition
Automated attendance system using Face recognitionAutomated attendance system using Face recognition
Automated attendance system using Face recognition
 
Face recognition
Face recognitionFace recognition
Face recognition
 
Face recognition system
Face recognition systemFace recognition system
Face recognition system
 
Ear Biometrics
Ear BiometricsEar Biometrics
Ear Biometrics
 
Facial expression recognition
Facial expression recognitionFacial expression recognition
Facial expression recognition
 

En vedette

Face recognition ppt
Face recognition pptFace recognition ppt
Face recognition ppt
Santosh Kumar
 
Face recognition technology - BEST PPT
Face recognition technology - BEST PPTFace recognition technology - BEST PPT
Face recognition technology - BEST PPT
Siddharth Modi
 

En vedette (16)

Face detection presentation slide
Face detection  presentation slideFace detection  presentation slide
Face detection presentation slide
 
Face recognition ppt
Face recognition pptFace recognition ppt
Face recognition ppt
 
Ppt for Online music store
Ppt for Online music storePpt for Online music store
Ppt for Online music store
 
Independent Research
Independent ResearchIndependent Research
Independent Research
 
Wits presentation 6_28072015
Wits presentation 6_28072015Wits presentation 6_28072015
Wits presentation 6_28072015
 
Face Recognition using C#
Face Recognition using C#Face Recognition using C#
Face Recognition using C#
 
What goes on during haar cascade face detection
What goes on during haar cascade face detectionWhat goes on during haar cascade face detection
What goes on during haar cascade face detection
 
Open cv nesne tespiti haar cascade sınıflandırıcısı
Open cv nesne tespiti haar cascade sınıflandırıcısıOpen cv nesne tespiti haar cascade sınıflandırıcısı
Open cv nesne tespiti haar cascade sınıflandırıcısı
 
Face Recognition with OpenCV and scikit-learn
Face Recognition with OpenCV and scikit-learnFace Recognition with OpenCV and scikit-learn
Face Recognition with OpenCV and scikit-learn
 
Face Detection
Face DetectionFace Detection
Face Detection
 
face detection
face detectionface detection
face detection
 
Face Recognition using OpenCV
Face Recognition using OpenCVFace Recognition using OpenCV
Face Recognition using OpenCV
 
Object detection
Object detectionObject detection
Object detection
 
Week6 face detection
Week6 face detectionWeek6 face detection
Week6 face detection
 
FACE RECOGNITION TECHNOLOGY
FACE RECOGNITION TECHNOLOGYFACE RECOGNITION TECHNOLOGY
FACE RECOGNITION TECHNOLOGY
 
Face recognition technology - BEST PPT
Face recognition technology - BEST PPTFace recognition technology - BEST PPT
Face recognition technology - BEST PPT
 

Similaire à Dissertation final report

Innovative Analytic and Holistic Combined Face Recognition and Verification M...
Innovative Analytic and Holistic Combined Face Recognition and Verification M...Innovative Analytic and Holistic Combined Face Recognition and Verification M...
Innovative Analytic and Holistic Combined Face Recognition and Verification M...
ijbuiiir1
 
Face Recognition & Detection Using Image Processing
Face Recognition & Detection Using Image ProcessingFace Recognition & Detection Using Image Processing
Face Recognition & Detection Using Image Processing
paperpublications3
 
Smriti's research paper
Smriti's research paperSmriti's research paper
Smriti's research paper
Smriti Tikoo
 
Paper id 24201475
Paper id 24201475Paper id 24201475
Paper id 24201475
IJRAT
 
FaceDetectionforColorImageBasedonMATLAB.pdf
FaceDetectionforColorImageBasedonMATLAB.pdfFaceDetectionforColorImageBasedonMATLAB.pdf
FaceDetectionforColorImageBasedonMATLAB.pdf
Anita Pal
 

Similaire à Dissertation final report (20)

Ijarcce 27
Ijarcce 27Ijarcce 27
Ijarcce 27
 
Age and Gender Classification using Convolutional Neural Network
Age and Gender Classification using Convolutional Neural NetworkAge and Gender Classification using Convolutional Neural Network
Age and Gender Classification using Convolutional Neural Network
 
Innovative Analytic and Holistic Combined Face Recognition and Verification M...
Innovative Analytic and Holistic Combined Face Recognition and Verification M...Innovative Analytic and Holistic Combined Face Recognition and Verification M...
Innovative Analytic and Holistic Combined Face Recognition and Verification M...
 
Face Recognition & Detection Using Image Processing
Face Recognition & Detection Using Image ProcessingFace Recognition & Detection Using Image Processing
Face Recognition & Detection Using Image Processing
 
Facial Emotion Recognition: A Survey
Facial Emotion Recognition: A SurveyFacial Emotion Recognition: A Survey
Facial Emotion Recognition: A Survey
 
Smriti's research paper
Smriti's research paperSmriti's research paper
Smriti's research paper
 
184
184184
184
 
Progression in Large Age-Gap Face Verification
Progression in Large Age-Gap Face VerificationProgression in Large Age-Gap Face Verification
Progression in Large Age-Gap Face Verification
 
Person identification based on facial biometrics in different lighting condit...
Person identification based on facial biometrics in different lighting condit...Person identification based on facial biometrics in different lighting condit...
Person identification based on facial biometrics in different lighting condit...
 
K017247882
K017247882K017247882
K017247882
 
A Hybrid Approach to Face Detection And Feature Extraction
A Hybrid Approach to Face Detection And Feature ExtractionA Hybrid Approach to Face Detection And Feature Extraction
A Hybrid Approach to Face Detection And Feature Extraction
 
AI Presentation- Human Face Detection Techniques- By Sudeep KC
AI Presentation- Human Face Detection Techniques- By Sudeep KCAI Presentation- Human Face Detection Techniques- By Sudeep KC
AI Presentation- Human Face Detection Techniques- By Sudeep KC
 
Paper id 24201475
Paper id 24201475Paper id 24201475
Paper id 24201475
 
AN EFFICIENT FACE RECOGNITION EMPLOYING SVM AND BU-LDP
AN EFFICIENT FACE RECOGNITION EMPLOYING SVM AND BU-LDPAN EFFICIENT FACE RECOGNITION EMPLOYING SVM AND BU-LDP
AN EFFICIENT FACE RECOGNITION EMPLOYING SVM AND BU-LDP
 
FaceDetectionforColorImageBasedonMATLAB.pdf
FaceDetectionforColorImageBasedonMATLAB.pdfFaceDetectionforColorImageBasedonMATLAB.pdf
FaceDetectionforColorImageBasedonMATLAB.pdf
 
PPT 7.4.2015
PPT 7.4.2015PPT 7.4.2015
PPT 7.4.2015
 
A face recognition system using convolutional feature extraction with linear...
A face recognition system using convolutional feature extraction  with linear...A face recognition system using convolutional feature extraction  with linear...
A face recognition system using convolutional feature extraction with linear...
 
Real-time face detection in digital video-based on Viola-Jones supported by c...
Real-time face detection in digital video-based on Viola-Jones supported by c...Real-time face detection in digital video-based on Viola-Jones supported by c...
Real-time face detection in digital video-based on Viola-Jones supported by c...
 
76 s201920
76 s20192076 s201920
76 s201920
 
40120140505010
4012014050501040120140505010
40120140505010
 

Plus de Smriti Tikoo

Minor projct(Broadband )
Minor projct(Broadband )Minor projct(Broadband )
Minor projct(Broadband )
Smriti Tikoo
 
Features of tms_320_2nd_generation_dsp
Features of tms_320_2nd_generation_dspFeatures of tms_320_2nd_generation_dsp
Features of tms_320_2nd_generation_dsp
Smriti Tikoo
 
Does one a reason to celebrate
Does one a reason to celebrateDoes one a reason to celebrate
Does one a reason to celebrate
Smriti Tikoo
 
Broadband Powerline Communication
Broadband Powerline CommunicationBroadband Powerline Communication
Broadband Powerline Communication
Smriti Tikoo
 

Plus de Smriti Tikoo (14)

A detailed study on fraud analysis of international trade on ecpommerce platf...
A detailed study on fraud analysis of international trade on ecpommerce platf...A detailed study on fraud analysis of international trade on ecpommerce platf...
A detailed study on fraud analysis of international trade on ecpommerce platf...
 
Fraud analysis
Fraud analysisFraud analysis
Fraud analysis
 
Minor projct(Broadband )
Minor projct(Broadband )Minor projct(Broadband )
Minor projct(Broadband )
 
Smriti
SmritiSmriti
Smriti
 
Video conferencing services
Video conferencing servicesVideo conferencing services
Video conferencing services
 
Bgp protocol
Bgp protocolBgp protocol
Bgp protocol
 
Features of tms_320_2nd_generation_dsp
Features of tms_320_2nd_generation_dspFeatures of tms_320_2nd_generation_dsp
Features of tms_320_2nd_generation_dsp
 
Internship report
Internship reportInternship report
Internship report
 
Does one a reason to celebrate
Does one a reason to celebrateDoes one a reason to celebrate
Does one a reason to celebrate
 
Does one a reason to celebrate
Does one a reason to celebrateDoes one a reason to celebrate
Does one a reason to celebrate
 
Embracing the singlehood
Embracing the singlehoodEmbracing the singlehood
Embracing the singlehood
 
Affect of the american sitcoms on the youth of today
Affect of the american sitcoms on the youth of todayAffect of the american sitcoms on the youth of today
Affect of the american sitcoms on the youth of today
 
GSM WHITE SPACES
GSM WHITE SPACESGSM WHITE SPACES
GSM WHITE SPACES
 
Broadband Powerline Communication
Broadband Powerline CommunicationBroadband Powerline Communication
Broadband Powerline Communication
 

Dernier

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ssuser89054b
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Kandungan 087776558899
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
mphochane1998
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
AldoGarca30
 

Dernier (20)

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdf
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 

Dissertation final report

  • 1. 1 Detection of Face using Viola Jones and Recognition using Backpropagation Neural Network A dissertation submitted in fulfillment of the requirements for the award degree of Masters of Technology (Electronics & Communication Engineering) in THE NORTHCAP UNIVERSITY, Gurgaon by Smriti Tikoo (14-ECP-015) under the guidance of Nitin Malik Department of Electrical, Electronics and Communication Engineering SCHOOL OF ENGINEERING AND TECHNOLOGY Gurgaon, Haryana May-2016
  • 2. 2 INDEX CONTENTS Page No. Candidate’s Declaration i Acknowledgement ii Abstract iii List of abbreviations 1 List of Tables 2 List of Figures 3 CHAPTER 1: INTRODUCTION 1.1 PROBLEM DEFINITION 11 1.2 FACE DETECTION 13 1.3 FACE DETECTION TECHNIQUES 15 1.4 FACE RECOGNITION 18 1.5 FEATURE DETECTION 20 1.6 NEURAL NETWORKS 21 1.7 BACKPROPAGATION 24 1.8 COMPARISON WITH OTHER TECHNOLOGIES 24 1.9 PROBLEM STATEMENT 25 CHAPTER 2: LITERATURE REVIEW 26 CHAPTER 3: PROPOSED METHODOLOGY 30 3.1 VIOLA JONES ALGORITHM 31 3.2 HAAR FEATURES 33 3.3 INTEGRAL IMAGE 35 3.4 ADA BOOST TRAINING ALGORITHM 37 3.5 CASCADING AMPLIFIERS 38
  • 3. 3 CHAPTER 4: SOFTWAREIMPLEMENTATION 4.1 MATALAB 8.2 (R2013B) 41 4.2 OPEN SOURCE COMPUTER VISION 41 4.3 COMPUTER VISON TOOLS 43 4.4 VISION CASCADE OBJECT DETECTOR 44 4.5 NEURAL NETWORK TOOLS 47 4.6 NEURAL NETWORK FITTING TOOL 47 4.7 TRAINING ALGORITM 48 4.8 PROGRAMMING CODE FOR FACE DETECTION 48 4.9 PROCEDURE OF FACE RECOGNITION 53 CHAPTER 5 : RESULTS AND DISCUSSION 5.1 RESULTS OF RECOGNITIONPROCESS 56 CHAPTER 6: CONCLUSION AND FUTURE SCOPE 6.1 CONCLUSIONS 60 6.2 FUTURE SCOPE 61 References 62 Publications 65 Plagiarism Report Date…………….
  • 4. 4 CANDIDATE’S DECLARATION I hereby declare that the work presented in this dissertation report “Detection of face using Viola Jones algorithm and recognition using Backpropagation Neural Network” in complete fulfillment of the requirements for the award of degree of Master of Technology in Electronics and Communications Engineering in THE NORTHCAP UNIVERSITY, Gurgaon is an authentic record of my own work carried out under the guidance of Dr. Nitin Malik (Associate Professor). The matter embodied in this dissertation report has been submitted by me for the award of degree of Master of Technology. Date: Place: (Smriti Tikoo)
  • 5. 5 Acknowledgement This research was mentored by Dr.Nitin Malik (Associate Professor) at The NorthCap University Gurgaon. I am grateful to him for his undivided support and professional expertise on the area of research. I would like to express my deepest gratitude towards him for moderating my research throughout which made a significant difference in the course of dissertation. I would like to express my appreciation for voicing their views on the earlier versions of my work..
  • 6. 6 Abstract This thesis is an attempt towards facial recognition which is a prominent topic among the communities such as image processing neural network and computer vision. Face detection and recognition has been prevalent with research scholars and diverse approaches have been incorporated till date to serve purpose. The rampant advent of biometric analysis systems, which may be full body scanners, or iris detection and recognition systems and the finger print recognition systems, and surveillance systems deployed for safety and security purposes have contributed to inclination towards same. Advances has been made with frontal view, lateral view of the face or using facial expressions such as anger, happiness and gloominess, still images and video image to be used for detection and recognition. This led to newer methods for face detection and recognition to be introduced in achieving accurate results and economically feasible and extremely secure. Techniques such as Principal Component analysis (PCA), Independent component analysis (ICA), Linear Discriminant Analysis (LDA), have been the predominant ones to be used .But with improvements needed in the previous approaches Neural Networks based recognition was like boon to the industry. It not only enhanced the recognition but also the efficiency of the process. Choosing Backpropagation as the learning method was clearly out of its efficiency to recognize non linear faces with an acceptance ratio of more than 90% and execution time of only few seconds.
  • 7. 7 LIST OF ABBREVIATIONS ANN Artificial Neural Network BPNN Backpropagation Neural Network DCT Direct Cosine Transform FFT Fast Fourier Transform ICA Independent Component Analysis LDA Linear Discriminant Analysis MFC Multi Resolution Feature Concatenation MLP Multi Layer Perceptron MLNN Multi Layer Neural Network MMV Multi Resolution Majority Voting PCA Principal Component Analysis
  • 8. 8 LIST OF TABLES Page No: Table1 Sample data for sample images 1, 2, and 3. 56 Table 2 Sample data of sample image 2. 58
  • 9. 9 LIST OF FIGURES Page No: Figure 1 Thumb impression as biometric scan . 12 Figure 2 Finger print being computed into binary data. 12 Figure 3 Generalized block diagram of face recognition. 13 Figure 4 A group of faces detected 14 Figure 5 Faces detected in a press conference. 15 Figure 6 Diagrammatical representation of viola Jones algorithm. 16 Figure 7 Simplified neural network. 23 Figure 8 Graphical representation of sigmoidal function. 23 Figure 9 Flowchart viola Jones algorithm 31 Figure 10 Rectangular features used to in Viola Jones 32 Figure 11 Flow chart for face recognition using BPNN. 35 Figure 12 Finding sum of shaded rectangular area 35 Figure 13 Cascading Amplifier a simplified diagram 39 Figure 14 Training cascade detector. 46 Figure 15 Faces detected by Viola Jones algorithm are highlighted by bounding boxes. 50 Figure 16 Nose detected. 51 Figure 17 Mouth detected. 52 Figure 18 Eyes detected. 52 Figure 19 Sample images 1, 2, 3 in rgb, gray scale and binary format. 53 Figure 20 Segmented image. 54
  • 10. 10 Figure 21 Binary image histogram equivalents. 54 Figure 22 Sample image1 in rgb, sample image 2 in gray and histogram equivalent of 55 sample image 2. Figure 23 A three stage training procedure. 57 Figure 24 Parameters calculated during training procedure. 57 Figure 25 Performance plot for sample data 1. 58 Figure 26 Performance plot of sample image 2. 59
  • 11. 11 CHAPTER 1. Introduction 1.1 Problem Definition Automatic recognition dates back to the years of 1960’s when pioneers such as Woody Bledsoe, Helen Chan Wolf, and Charles Bisson introduced their works to the world. In 1964 & 1965, Bledsoe, along with Helen Chan and Charles Bisson had worked on the computer to recognize human faces. But it didn’t allow much recognition to his work due to not much support from agencies. After him the work was carried forward by Peter Hart at the Stanford Research Institute. The results given by computer were terrific when feeded with a gallery of images for recognition. In around 1997, another approach by Christoph von der Malsburg and students of lot of reputed universities got together to design a system which was even better than the previous one. The system designed was robust and could identify people from photographs which didn’t offer a clear view of the facial images . After that technologies like Polar and FaceIt were introduced which were doing the work in lesser time and gave accurate results by offering features like a 3D view of the picture and comparing it to a database from worldwide. Till the year 2006 a lot of new algorithm replaced the previous ones and were proving to be better in terms of offering various new techniques to get better results and their performance was commendable. Automatic Recognition has seen a lot of improvement since 1993, the error rate has dropped down & enhancement methods to improve the resolution have hit the market. Ever since then biometric authentication is all over the place, these days ranging from home security systems, surveillance cameras, to fingerprint and iris authentication systems, user verification and etc.
  • 12. 12 These systems have contributed in catching terrorists and other type of intruders via cyber or in person. Recognition of faces has been under a research since 1960’s but it’s just over a decade that acceptable results have been obtained. Figure 1: Thumb impression as biometric scan Figure 2: Finger print being computed into binary data Researchers are still trying to attain better models and approaches to be able to recognize just like our brain does with utmost triviality be whatever the lighting conditions or time lapse of years between generations, skin discolorations’ or even after intricate changes in faces like undergoing botox or surgical treatments. Approaches deployed do not count the intricate details which differ from person to person in context of recognition results. The continuous changes in expression of face depending upon the situation being experienced by person disclose important information regarding the structural characteristics and the texture, thereby helping in recognition process. Of the many methods adopted towards recognition of face they can be broadly classified as a) holistic approach which relies of taking the whole face as an input in pixels and b) feature
extraction approaches, which concentrate on capturing the features of a face, with certain methods deployed to remove redundant information and handle dimensionality problems. Three basic approaches in this direction are PCA, LDA, and ICA; others added to the list include neural network, fuzzy, and skin- and texture-based approaches, as well as the ASM model. The problem is approached by first detecting the face and its features and then moving on to recognition. Many approaches are available for the detection of faces, such as the Viola-Jones algorithm, LBP, the AdaBoost algorithm, neural-network-based detection, and the SNoW classifier method.

Figure 3: Generalized block diagram of face recognition

1.2 Face Detection
One of the foremost and most important steps in the procedure of facial recognition is the detection of a face. It forms the first step towards analysis of the entire face: alignment, relighting, modeling, recognition, head pose tracking, face authentication, and tracking of the face and its expressions. Recognition by computers is necessary for them to be able to interpret people's intent. The aim of face detection is to detect any faces present in an image and return the location and extent of each face. For this, appropriate face modeling and segmentation must be done.
While carrying out face detection, sources of variation in facial appearance should be considered, such as viewing geometry (pose), illumination, the imaging process (resolution, focus, imaging noise, perspective effects), and other factors like occlusion. Detection approaches that rely on image formation can help in detecting color, shape, or motion as well. Face detection is a specific case of object-class detection, in which the task is to find the locations and sizes of all objects in an image that belong to a given class. Generally, face-detection algorithms concentrate on frontal facial images; the image is then compared with images in a database. Face detection is often used for biometric authentication, video surveillance, human-computer interfaces, and image database management, and it has gained a great deal of interest among marketers as well.

Figure 4: A group of faces detected
Figure 5: Faces of people detected during a press conference

1.3 Face Detection Techniques
A number of techniques are available for detecting a face in an image. They are listed below:

Viola-Jones Face Detection Algorithm: Proposed by Paul Viola and Michael Jones, it proves to be one of its kind by providing competitive results for real-time situations. It not only aces the detection of frontal faces but can also be modeled to detect a variety of other object classes. Its robust nature and adaptability to practical situations make it popular among users. It follows a four-stage procedure for face detection: first Haar feature selection, then integral image creation, followed by AdaBoost training and the cascading of classifiers. Haar feature selection matches the commonalities found in human faces. The integral image calculates the rectangular features in fixed time, which gives it an advantage over more sophisticated features. The integral image at coordinates (x, y) gives the sum of the pixels above and to the left of (x, y).
Advantages:
1. Popular for accurate face detection.
2. High detection rates, fit for real-time use.
3. High accuracy.
4. Reduced computation due to the use of cascaded classifiers.
Limitations:
1. Prolonged training time.
2. Limited range of head poses.
3. Dark or poorly illuminated faces are often missed.

Figure 6: Diagrammatic representation of the Viola-Jones algorithm

Local Binary Pattern (LBP): An efficient technique for describing the texture features of an image. Its high computation speed and rotation invariance assist its usage in image retrieval, texture analysis, recognition, and segmentation. One of its successful applications is detecting objects in a video by eliminating the backdrop.
It assigns a texture value to each pixel to facilitate matching with the objective at hand and to ease the process of tracking in video. LBP is mainly beneficial for recognizing the key points of a target and for forming a mask for joint color-texture selection.
Advantages:
1. Apt for expressing texture features.
2. Effective for texture analysis, image retrieval, and face recognition and segmentation.
3. Spots a moving object with ease by eliminating the backdrop.
4. A simple approach, computationally simpler and faster than Haar-like features.
5. The most vital properties of LBP features are tolerance to monotonic illumination changes and computational simplicity.
Limitations:
1. Insensitive to minute changes in the localization process.
2. The error rate rises when large local regions are used.
3. Ineffective under non-monotonic changes in illumination.
4. Usage is restricted to binary and grayscale images.

AdaBoost Algorithm for Face Detection: Its basic aim is to create a superior prediction rule that is highly accurate and is based on combining comparatively weak, inaccurate rules. It is the first practical boosting algorithm to have garnered wide acceptance in various fields. It contributes high detection rates, and thereby images are processed at a faster rate. From the selection of visual features to the linear combination of classifiers, AdaBoost does it all to make a strong classifier. Overfitting is one of its few restrictions, along with noisy data and outliers. Its "adaptive" name is due to its tendency to generate a strong classifier by carrying out multiple iterations. A strong classifier is formed by cascading weak classifiers, each only slightly correlated with the true classification.
A roundabout process of adding weak classifiers to the chain goes on in the formation of a strong classifier, with a weight vector added at each step to concentrate on misclassified samples. The outcome is a classifier that has higher accuracy than the weak learners' classifiers.
Advantages:
1. Usage extends to various classifiers.
2. Enhances the accuracy of a classifier.
3. Easy to implement and commonly used.
4. No prior knowledge about face structure is required: only a training database and the classification functions need to be fed in.
5. Adaptive: the positive and negative examples are tested by the classifier at every stage to avoid misclassification, and the training error tends exponentially towards zero in a finite number of iterations.
6. Easy to program, and its feature selection yields a classifier that is fast and simple.
Limitations:
1. Overfitting, noisy data, and outliers pose a restriction to the algorithm.
2. The quality of detection relies on the consistency of the training set, its size, and its interclass variability.
3. Training is time consuming, as computation time is proportional to the size of the feature and example sets.

1.4 Face Recognition System
The modern era has brought about a revolution by bringing the world closer through computerized applications. With scenarios such as breaches of security, terrorist attacks, cybercrime, and fraud becoming rampant via various modes, authentication systems and verification technologies have become the need of the hour. Although various biometric traits are available for carrying out the process, we have focused on face recognition, as it is flexible and reliable irrespective of whether a person is aware that his or her face is being scanned. In this procedure, user identification plays a vital role in authenticating a face against the available database. While designing such a system, feasibility and effective recognition should be kept in mind, and a simple and robust system is the sought-after solution. This makes its wide acceptance
possible and makes it socially popular as well. Personal verification processes also utilize biometrics to make their procedures more accurate and efficient. The major factors influencing the recognition method are illumination and the various pose variations at hand. After face detection, feature extraction is the next step on the way to face recognition, and many approaches are available for extracting features. The recognition process encompasses the use of an automated, semi-automated, or manual way to match facial images. Most common by far is two-dimensional face recognition, which is easier and less expensive than the other approaches. The recognition system has the flexibility to operate in either of two modes, verification or identification. But the course of action is not as easy as the brain makes it feel, and it works appropriately only under certain set conditions. Degradation in performance is often observed when the captured images lack appropriate lighting, or when changing poses and facial expressions pose a restriction. Some recognition algorithms use landmarks or features from the image in which the face is to be detected; others resort to normalization and compression of the facial database. Popular recognition algorithms include Principal Component Analysis using eigenfaces, Linear Discriminant Analysis, Elastic Bunch Graph Matching using the Fisherface algorithm, the Hidden Markov model, Multilinear Subspace Learning using tensor representation, and neuronally motivated dynamic link matching. An emerging trend is three-dimensional recognition, which is argued to provide enhanced accuracy; it utilizes sensors to capture the facial image of a person along with information regarding its shape and other properties, such as the contours of the eye sockets.
1.5 Feature Detection
Feature detection refers to methods that compute abstractions of image information and make decisions at each image point as to whether or not there is a feature to capture. The results of this modus operandi are subsets of the original image domain, in the form of isolated points, curves, etc. By definition, a feature can be understood as a distinctive part of an image, depending on the situation. Features are often the starting point of an algorithm, and therefore the subsequent or overall algorithm will only be as effective as its feature detector. One of the prominent requirements is repeatability: the same features should be detectable in different images of the same type. Feature detection is a low-level operation in image processing, usually performed right after face detection. It involves scanning each pixel to look for features; for a longer algorithm, the image is scanned only in the region of features.

Types of Features: Apart from the usually visible features in a face, i.e., eyes, nose, mouth, lips, and forehead, there are various other features that are also helpful in verifying someone's identity.

Edges: Points where there is a boundary separating or joining two image regions. Technically speaking, points having a high gradient magnitude classify as edges. They are usually one-dimensional.

Corners / interest points: Point-like features anywhere in a facial image classify as corners. They are two-dimensional, unlike edges. The discovery of corner points came from the observation that edges analyzed in earlier works showed rapid changes in direction; algorithms were then developed to avoid detailed edge detection, instead detecting high levels of curvature in the gradient. The next hurdle was that corners were being detected on parts of images that were not true corner points.

Blobs / regions of interest: These provide a complementary image description in terms of regions. The algorithms employed for detecting blobs pick up areas that are too smooth to be detected as corner points. Blob descriptors may contain points such as the center of gravity; distinguishing between a corner and a blob becomes difficult when an image is reduced from its actual size.
Ridges: Especially for elongated objects, in the context of a grey-level image a ridge is a generalization of the medial axis; in a realistic situation it is treated as a one-dimensional curve depicting an axis of symmetry. Ridges are harder to extract from images.

Feature Extraction: On completion of the feature detection process, the next step is feature extraction, which captures a section around each detected feature. Extraction involves a considerable amount of image processing, and the outcome is called a feature descriptor or feature vector. Approaches available for it include N-jets and local histograms. In addition to such attribute information, the feature detection step itself may provide complementary attributes, such as edge orientation and gradient magnitude in edge detection, and polarity and blob strength in blob detection. It is a hybrid method of data reduction: feature extraction techniques extract a set of salient or discriminatory features, holistic or local, from the face image, known as the feature space. Otherwise the image would have to be handled as a high-dimensional feature space.

1.6 Neural Networks
A neural network is a family of algorithms built from multiple layers of interconnected units, inspired by the neuron structure of the brain. Each circular node represents an artificial neuron, and an arrow represents a connection from the output of one neuron to the input of another. These networks are designed to approximate functions of a large number of inputs that are generally unknown. The nodes in a neural network exchange data with one another. The connections between the nodes have numeric weights associated with them, which are altered to attain an output matching the target output (in the case of supervised learning), making neural nets adaptive to inputs and capable of learning. A network is classified as neural if it has a set of adaptive weights and is capable of approximating nonlinear functions of its inputs. The adaptive weights can be thought of as connection strengths between neurons, which are tuned during training and prediction.

Feed-Forward Neural Networks: Here the connections between the nodes do not form a cycle, unlike in recurrent neural networks. Feed-forward neural networks are the
simplest networks: information moves in the forward direction only, from the inputs to the outputs via the hidden layers in between.

Supervised Learning: The desired responses are known, and the associated weights are updated so that the computed output matches the desired output, e.g., pattern recognition and regression models.

Unsupervised Learning: A learning method where the input data has no set responses against which to compare the actual output, e.g., cluster analysis.

Single Perceptron: Consists of a single layer of inputs and outputs along with associated numeric weights. The sum of the products of the weights and the inputs is calculated at each node, and if the value is above some threshold the neuron is activated; otherwise it stays deactivated. Such neurons classify as linear threshold units. Usually the activation holds a value of +1 and deactivation -1, with the threshold kept at 0. The delta rule is utilized for training the perceptron; it calculates the error arising from a mismatch between the actual and target outputs. In a multi-layer neural network a continuous output is obtained. A common choice of activation is the so-called logistic function:

    f(x) = 1 / (1 + e^(-x))        (eq. 1.1)

The logistic function is an S-shaped function, with x ranging from minus infinity to plus infinity.

Multilayer Perceptron: A two-layer neural network with the ability to perform XOR. In the figure shown on the next page, the numerals within the neurons represent the thresholds, and the numbers depicted on the interconnections are the weights of the inputs. In this case, if the threshold is not attained, the output is zero. This class of networks consists of multiple layers of computational units, usually interconnected in a feed-forward way, and usually employing a sigmoid function as the activation function. As per the universal approximation theorem, every continuous function that maps intervals of real numbers to some output can be approximated by an MLP with a single hidden layer. Most MLPs employ backpropagation as their learning technique. A small sketch of the logistic activation of eq. 1.1 is given below.
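As a quick, illustrative MATLAB sketch (not part of the thesis code), the following evaluates the logistic activation of eq. 1.1 and its derivative, and reproduces the S-shaped curve:

    % Logistic activation of eq. 1.1 and its derivative (illustrative sketch).
    sigmoid  = @(x) 1 ./ (1 + exp(-x));               % S(x) = 1/(1 + e^(-x))
    dsigmoid = @(x) sigmoid(x) .* (1 - sigmoid(x));   % S'(x) = S(x)(1 - S(x))
    x = -6:0.1:6;
    plot(x, sigmoid(x)), grid on                      % the S-curve of Figure 8
    title('Logistic (sigmoid) activation')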
Figure 7: A generalized structure of a neural network

Sigmoid Function: A mathematical function having an S-shaped curve, usually regarded as a special case of the logistic function.

Figure 8: Graphical representation of the sigmoidal function

It is defined by S(t) = 1 / (1 + e^(-t)). It is a bounded, differentiable real function that is defined for all real values and has a positive derivative: S'(t) = S(t)(1 - S(t)).

1.7 Backpropagation
Also called "backward propagation of errors", backpropagation is a popular method of training MLP artificial neural networks in conjunction with an optimization method. The gradient of a loss function is calculated with respect to the associated weights, and the optimization method is fed this gradient, which it uses to alter the weights so as to reduce the loss function. This algorithm works under the supervised learning rule, i.e., it requires a known, desired output for each input value in order to calculate the loss function gradient, and it usually makes use of the sigmoid function as its activation. The algorithm can be split into two phases:
Phase 1: Propagation
1. Forward propagation: The input is fed through the network to generate the output activations.
2. Backward propagation: The output activations are propagated backward through the network to generate the difference between the actual and target outputs at each layer.
Phase 2: Weight update
1. The gradient of each weight is the product of the propagated output difference and the input activation.
2. A ratio (percentage) of the gradient is subtracted from the weight. This ratio, the learning rate, affects the speed and quality of learning: the greater the ratio, the faster the neurons train, but accuracy is better assured with a lower ratio. If the weight update grows too far in the positive direction, the error will increase.
The steps are repeated until satisfactory results are attained. A minimal worked sketch follows.
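The following MATLAB sketch is a minimal, self-contained illustration of the two phases above, training a small 2-2-1 MLP on XOR by gradient descent. The network size, learning rate, and epoch count are illustrative choices, not values taken from the thesis experiments.

    X = [0 0 1 1; 0 1 0 1];            % four input samples, one per column
    T = [0 1 1 0];                     % XOR targets
    sig = @(z) 1 ./ (1 + exp(-z));

    rng(1);                            % reproducible random initialization
    W1 = randn(2, 2); b1 = randn(2, 1);    % hidden layer weights and biases
    W2 = randn(1, 2); b2 = randn;          % output layer weights and bias
    eta = 0.5;                         % learning rate (the "ratio" above)

    for epoch = 1:20000
        H = sig(W1*X + repmat(b1, 1, 4));          % Phase 1: forward propagation
        Y = sig(W2*H + repmat(b2, 1, 4));
        dY = (Y - T) .* Y .* (1 - Y);              % Phase 1: backward propagation
        dH = (W2' * dY) .* H .* (1 - H);           % (sigmoid derivative used twice)
        W2 = W2 - eta * dY * H';  b2 = b2 - eta * sum(dY);     % Phase 2: update
        W1 = W1 - eta * dH * X';  b1 = b1 - eta * sum(dH, 2);
    end
    disp(Y)                            % typically approaches the targets [0 1 1 0]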
1.8 Comparison with Other Technologies
Of the available biometric authentication methods, face recognition is among the easier and more reliable: no cooperation is required from the subject being captured for the identification and verification process. It is widely used in airports, defense ministries, multiplexes, and other areas that experience a threat to the security of data or people. Other biometrics, such as fingerprints, iris scans, and speech recognition, cannot perform mass identification in the way facial recognition systems can. However, questions are being raised about the efficiency of such systems.

Drawbacks: Though it is flexible and socially acceptable, it is not a perfect way to solve the problems at hand related to the security of a country and its people. Face recognition is not perfect and struggles to perform under certain conditions. Constraints such as the angles at which images are captured, poor illumination, age-related or surgical changes to the face, image resolution, cluttered images, skin color, and facial expressions do pose obstacles.
1.9 Problem Statement
The advent of biometrics brought about a change in the way the security of data, people, and country is treated. Be it via surveillance systems, full-body scans, retina scans, hand scans, or fingerprints, it has not only changed the security setup of a country but also offers a sense of reliability to users and systems alike. It has invented ways to protect and preserve confidential data and to guard systems of any type with the help of human-computer interaction. It has brought a great combination of image processing and computer vision to light, which in turn has boosted the business for detection and recognition systems. In this work we propose a similar approach to detect and recognize a facial image using a BPNN with the help of MATLAB. Within MATLAB we have worked under the neural network toolbox, using its tools to train and process the image to obtain the performance plot.

CHAPTER 2. Literature Review
Whatever the approach, be it neural-network based [1][2] or otherwise, a large database of facial and non-facial images is required to carry out grayscale-based face detection; the non-facial images pose a hurdle in that case.
Schneiderman and Kanade [4] lent their modus operandi to frontal-face and profile-view detection. One approach for non-faces is a feature-based method that encompasses geometrical attributes with belief networks [5]. For real-time situations [6], geometrical facial templates and the Hough transform can also be put to use. Facial detectors such as Markov random fields [7][8] and Markov chains [9] detect faces using the spatial arrangement of grayscale pixel values. For tracking faces [10] whose initial position is known, model-based approaches are the answer; e.g., a 3D deformable approach is good at tracking moving faces, provided the other facial features are already known. One way of reducing the burden on detection algorithms is to use motion and color; combined with other available information, they enable fast detection and tracking [11]. In one research paper, motion-based tracking has already been combined with the Hidden Markov model, and many proposals have been published in which neural networks are combined with motion and color filters [12].
Zhao [14] provided a survey and tasks for face recognition both in video and in still images. Turk [15] calculated eigenfaces with a linear algebra technique; this method enabled a real-time automated face recognition system. Liao [16] opted for fast Fourier transform algorithms before applying recognition: face recognition is performed using Principal Component Analysis (PCA) and Independent Component Analysis (ICA) after the FFT has whitened the input images. Results proved that the whitening process enhances recognition performance for both PCA and ICA.
Contrast enhancement and edge detection were used by Jarillo [17] for carrying out transformations of the input images. Discriminatory information helpful for face classification can be obtained from such experiments, namely with PCA and LDA (Linear Discriminant Analysis); the YALE and FERET databases were used. Lin [18] opted for the Hausdorff distance over the Euclidean distance in the recognition procedure, and satisfactory results were obtained. The wavelet transform in combination with PCA was put to use by A. Eleyan [19].
Innovative methods followed, such as Multiresolution Feature Concatenation (MFC), where PCA is used for dimensionality reduction on each sub-band and the resulting sub-bands are concatenated for classification, and Multiresolution Majority Voting (MMV), where PCA and classification are done separately on each sub-band. MMV outperforms the MFC approach. A DCT approach was used to minimize the illumination problem by Shermina J. [20]; the DCT-processed image is later used in the recognition step, for which a PCA approach is used.
Unauthorized access is regulated by designing systems that provide security and reliability, so building efficient face recognition systems is very necessary. Agarwal et al. [3] deduced an approach of coding and decoding an image to recognize a face. It was carried out in two stages: first using PCA as the method for feature extraction, and then using a feed-forward BPNN for recognition. Comparisons of efficiency were made with K-means and Fuzzy Ant with fuzzy C-means, and a better recognition rate was obtained compared to the other two techniques.
Bhati et al. [4] opted for eigenfaces as the feature vectors. The paper took a novel route by including aspects such as mean square error, recognition rate, and training time. A single-layer neural network can give 100% recognition accuracy if the correct learning rate is assigned to it. They proposed that a BPNN is more sensitive to the number of hidden neurons and the learning rate, and that using wavelets for feature extraction helped both neural networks show improved results.
Kim et al. [5] suggested improving the recognition rate by combining an MLNN with a method that computes a small number of eigenface feature vectors through PCA. The method improved functioning under variations in illumination and noise so as to minimize the recognition error. The rectangular feature technique was used to extract the face from the background image; recognition was then performed by reducing the image size and denoising it, a proper vector was evaluated through PCA, and finally the MLNN was applied to the resultant image. The existing PCA matching method was compared with the combined PCA-and-MLNN approach. Benzaoui et al. [6] proposed a technique involving two stages: learning and
detection. In the first phase, conversion to grayscale was done, and the images were then filtered by a median filter to reduce noise. Feature extraction was done by choosing appropriate information from each picture to prepare the data for classifier learning. The system used the DCT to extract a vector of coefficients that symbolizes the parameters necessary for modeling the original image.
There were problems due to variations in posture, so Saudagare et al. [7] proposed a system involving the inhibition of the consequences of changes in illumination and pose. They proposed a 2-level Haar wavelet transform to partition the face image into seven sub-image bands. From the sub-image bands, eigenface features were extracted and used as input for a BPNN-based classification algorithm.
Denoising the images was necessary for an enhanced and accurate outcome to be achieved. So Daramola et al. [8] designed a system with the following procedure: face detection, pre-processing, principal component analysis (PCA), and classification. First, a database with images in different poses was made, followed by normalization and denoising of the images. Eigenfaces were calculated from the training set; PCA was used to calculate the eigenvalues, retaining the larger eigenvalues across the training set images. In the ANN approach, face descriptors were used as input for training the network.
The non-linearity of the images was addressed by Kashem et al. [9]. A computational model for face detection and recognition, which was accurate, simple, and fast in constrained environments, was brought to light. Here eigenfaces [10] were used for recognition to raise the accuracy and reduce computation, whereas combining PCA and BPNN techniques allowed non-linear face images to be easily distinguished. The conclusion was that this method had an execution time of a few seconds and an acceptance ratio of more than 90%.
In [11], Chung et al. referred to a new face recognition method based on Principal Component Analysis (PCA) and Gabor filter responses. In the first step, Gabor filtering was applied at predefined fiducial points to extract robust facial features from the original face image. The second step involved transformation of the facial features into eigenspace by PCA, for optimal classification of individual facial representations.
CHAPTER 3. Proposed Methodology
As discussed in Section 1.9, biometrics has changed the way the security of data, people, and country is treated, and the combination of image processing and computer vision it has brought to light has boosted the business for detection and recognition
systems. In this work we propose an approach to detect and recognize a facial image using a BPNN with the help of MATLAB 8.2. Within MATLAB we have worked with the neural network toolbox, in which we have made use of the neural network fitting tool to train and test the facial image at hand. This tool maps numeric inputs to their targets; it helps in training, testing, and validating the data, and evaluates the data on the mean squared error generated and the performance plot (a regression plot of the same data can also be used) to make an inference regarding the performance of the process.
1. Detection of the face and its features using the Viola-Jones algorithm.
2. Conversion from RGB to grayscale and binary.
3. Segmentation of the image (can be done in any format).
4. Histogram equivalent of the binary or grayscale image.
5. Each segmented part is trained individually using the backpropagation neural network tool.
6. The above steps can also be carried out without segmentation of the image.
7. For the recognition process, the neural network tool in MATLAB is used to train and test the sample data provided by the histogram equivalent of the image taken.
8. Within the neural network tool, the curve fitting tool has been used to train the data.
9. The training of the data takes place in three stages: training, testing, and validation.
10. The mean squared error is the parameter used to judge whether the training is giving desirable results or not.
11. The data is retrained to achieve better performance plots with the least mean squared error.
12. The training phase continues with different samples in order to check the performance of the network and to get better results.
A sketch of the first steps of this pipeline is given below.
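As a rough illustration of steps 1-4, the MATLAB sketch below detects a face with the cascade detector and derives the grayscale, binary, and histogram forms. The file name 'sample1.jpg' is a placeholder for one of the sample images; the rest uses standard Image Processing and Computer Vision System Toolbox functions.

    faceDetector = vision.CascadeObjectDetector();   % Viola-Jones frontal-face model
    I    = imread('sample1.jpg');                    % placeholder file name
    bbox = step(faceDetector, I);                    % step 1: detect, [x y w h] rows
    face = imcrop(I, bbox(1, :));                    % assumes at least one face found

    gray = rgb2gray(face);                           % step 2: RGB -> grayscale
    bw   = im2bw(gray, graythresh(gray));            % step 2: grayscale -> binary (Otsu)
    h    = imhist(gray);                             % step 4: 256-bin histogram
    % h is the numeric vector later fed to the neural network fitting tool.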
3.1 Viola-Jones Algorithm
Viola-Jones is a preferred algorithm for frontal face detection due to its higher accuracy and better adaptability to real-time situations. It was trained on about 5,000 frontal faces and 300 million non-faces; the faces are normalized, scaled, and then translated. Faces themselves offer numerous variations due to pose, marks, illumination, angle, and various other factors. The key challenge in face detection is that every image, whether mine or yours, contains tens of thousands of candidate locations and scales, while actual faces are rare, roughly 0-50 per image. In the first of the four stages described before, Haar feature selection, the regularities found in face images are matched: the cheek regions are lighter than the eye regions, and the region enclosing the bridge of the nose is brighter than the eyes and the mouth. The values used are oriented gradients of pixel intensities, and these features are then matched by the algorithm. With rectangle features, the sum of the pixels under the black region is subtracted from the sum under the white region to get the feature value; rectangular boxes are laid over the features being detected to show how the pixel count is done. The integral image calculates all the rectangular features in a fixed interval of time, which speeds their evaluation compared to other, non-rectangular features. A variant of the AdaBoost learning algorithm is employed to select the best features and to train the classifiers that use them. Construction of the strong classifier then begins by cascading the previously trained weak classifiers; the more complex the structure of a classifier, the better and stronger it becomes for detection purposes.

Figure 9: Flow chart of the Viola-Jones algorithm (input image -> Haar feature selection -> integral image -> AdaBoost -> cascading)
Figure 10: Rectangular features used in the Viola-Jones algorithm

Evaluating all classifiers everywhere would take a lot of time. To avoid this, every successive classifier is applied only to the samples that passed the previous ones. Since every feature is related to a location in a sub-window, if a sub-window is rejected from inspection by a classifier, no further processing happens for that window and a new sub-window is searched. The very first cascade stage has the job of cheaply rejecting roughly half of the sub-windows that the entire cascade would otherwise have to evaluate. Every stage of the cascade consists of a strong classifier, so the features are grouped into stages, with every stage possessing a certain number of features. Each stage checks whether or not a given sub-window classifies as a face; if not a face, it is discarded.

3.2 Haar Features
Haar features are digital image features used in object recognition. They owe their name to their intuitive similarity to Haar wavelets, and they were used in the first real-time face detector. Historically, working with only image intensities (i.e., the RGB pixel values at each and every pixel of an image) made the task of feature calculation computationally expensive. Viola and Jones adapted the idea of using Haar wavelets and developed the so-called Haar-like features. A Haar-like feature considers adjacent rectangular regions at a specific location in a
detection window, sums up the pixel intensities in each region, and calculates the difference between these sums. This difference is then used to categorize subsections of an image. For example, say we have an image database with human faces. It is a common observation that among all faces the region of the eyes is darker than the region of the cheeks. Therefore, a common Haar feature for face detection is a set of two adjacent rectangles that lie above the eye and the cheek region. The position of these rectangles is defined relative to a detection window that acts like a bounding box for the target object (the face, in this case).
In the detection phase of the Viola-Jones object detection framework, a window of the target size is moved over the input image, and for each subsection of the image the Haar-like feature is calculated. This difference is then compared to a learned threshold that separates non-objects from objects. Because such a Haar-like feature is only a weak learner or classifier (its detection quality is slightly better than random guessing), a large number of Haar-like features are necessary to describe an object with sufficient accuracy. In the Viola-Jones object detection framework, the Haar-like features are therefore organized into something called a classifier cascade to form a strong learner or classifier.
The key advantage of a Haar-like feature over most other features is its calculation speed. Due to the use of integral images, a Haar-like feature of any size can be calculated in constant time (approximately 60 microprocessor instructions for a 2-rectangle feature). A simple rectangular Haar-like feature can be defined as the difference of the sums of pixels over areas inside the rectangle, which can be at any position and scale within the original image. This feature set is called the 2-rectangle feature; Viola and Jones also defined 3-rectangle and 4-rectangle features. The values indicate certain characteristics of a particular area of the image. Each feature type can indicate the existence (or absence) of certain characteristics in the image, such as edges or changes in texture. For example, a 2-rectangle feature can indicate where the border lies between a dark region and a light region.
One of the contributions of Viola and Jones was to use summed area tables [3], which they called integral images. Integral images can be defined as two-dimensional lookup tables in the form of a matrix of the same size as the original image. Each element of the integral image
contains the sum of all pixels located in the up-left region of the original image (relative to the element's position). This allows the sum of any rectangular area in the image, at any position or scale, to be computed using only four lookups. For a rectangle with corner points A (top-left), B (top-right), C (bottom-left), and D (bottom-right) in the integral image I, the sum of the shaded rectangular area is

    sum = I(D) + I(A) - I(B) - I(C)        (eq. 3.1)

as shown in the figure. Each Haar-like feature may need more than four lookups, depending on how it was defined: Viola and Jones's 2-rectangle features need six lookups, 3-rectangle features need eight lookups, and 4-rectangle features need nine lookups.

Figure 11: Flowchart for facial recognition
Figure 12: Finding the sum of the shaded rectangular area

3.3 Integral Image
A summed area table is a data structure and algorithm for quickly and efficiently generating the sum of values in a rectangular subset of a grid. In the image processing domain it is also known as an integral image. It was introduced to computer graphics in 1984 by Frank Crow for use with mipmaps. In computer vision it was popularized by Lewis, then given the name "integral image" and prominently used within the Viola-Jones object detection framework in 2001. Historically, this principle is very well known in the study of multi-dimensional probability distribution functions, namely in computing 2D (or ND) probabilities (the area under the probability distribution) from the respective cumulative distribution functions.

Algorithm: As the name suggests, the value at any point (x, y) in the summed area table is just the sum of all the pixels above and to the left of (x, y), inclusive:

    I(x, y) = sum over all x' <= x, y' <= y of i(x', y')        (eq. 3.2)

Moreover, the summed area table can be computed efficiently in a single pass over the image, using the fact that the value in the summed area table at (x, y) is just:

    I(x, y) = i(x, y) + I(x-1, y) + I(x, y-1) - I(x-1, y-1)        (eq. 3.3)

Once the summed area table has been computed, evaluating the sum of intensities over any rectangular area requires only four array references, giving a constant calculation time that is independent of the size of the rectangular area. That is, with A = (x0, y0), B = (x1, y0), C = (x0, y1), and D = (x1, y1), the sum of i(x, y) over the rectangle spanned by A, B, C, and D is:

    sum = I(D) + I(A) - I(B) - I(C)        (eq. 3.4)

A minimal sketch of both the construction and the four-lookup sum is given below.
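The following MATLAB sketch (illustrative, on a toy matrix rather than a real image) computes the summed area table with cumsum and verifies the four-lookup rectangle sum of eq. 3.4:

    I  = magic(6);                         % toy 6x6 "image"
    ii = cumsum(cumsum(I, 1), 2);          % eq. 3.3 in one pass: ii(y,x) = sum(I(1:y,1:x))

    y0 = 2; x0 = 2; y1 = 5; x1 = 4;        % rectangle corners (kept away from row/col 1)
    S  = ii(y1,x1) - ii(y0-1,x1) - ii(y1,x0-1) + ii(y0-1,x0-1);   % eq. 3.4, four lookups
    isequal(S, sum(sum(I(y0:y1, x0:x1))))  % returns logical 1 (true)
    % A 2-rectangle Haar-like feature is just the difference of two such sums.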
3.4 AdaBoost Training Algorithm
AdaBoost, short for "Adaptive Boosting", is a machine learning meta-algorithm formulated by Yoav Freund and Robert Schapire, who won the Gödel Prize in 2003 for their work. It can be used in conjunction with many other types of learning algorithms to improve their performance. The output of the other learning algorithms ('weak learners') is combined into a weighted sum that represents the final output of the boosted classifier. AdaBoost is adaptive in the sense that subsequent weak learners are tweaked in favor of those instances misclassified by previous classifiers. AdaBoost is sensitive to noisy data and outliers, though in some problems it can be less susceptible to the overfitting problem than other learning algorithms. The individual learners can be weak, but as long as the performance of each one is slightly better than random guessing (e.g., their error rate is smaller than 0.5 for binary classification), the final model can be proven to converge to a strong learner.
While every learning algorithm will tend to suit some problem types better than others, and will typically have many different parameters and configurations to be adjusted before
achieving optimal performance on a dataset, AdaBoost (with decision trees as the weak learners) is often referred to as the best out-of-the-box classifier. When used with decision tree learning, information gathered at each stage of the AdaBoost algorithm about the relative 'hardness' of each training sample is fed into the tree-growing algorithm, such that later trees tend to focus on harder-to-classify examples.
Problems in machine learning often suffer from the curse of dimensionality: each sample may consist of a huge number of potential features (for instance, there can be 162,336 Haar features, as used by the Viola-Jones object detection framework, in a 24x24 pixel image window), and evaluating every feature can reduce not only the speed of classifier training and execution, but in fact reduce predictive power, per the Hughes effect. Unlike neural networks and SVMs, the AdaBoost training process selects only those features known to improve the predictive power of the model, reducing dimensionality and potentially improving execution time, as irrelevant features do not need to be computed.

Training: AdaBoost refers to a particular method of training a boosted classifier. A boosted classifier is a classifier of the form

    F_T(x) = f_1(x) + f_2(x) + ... + f_T(x)        (eq. 3.5)

where each f_t is a weak learner that takes an object x as input and returns a real-valued result indicating the class of the object. The sign of the weak learner's output identifies the predicted object class, and the absolute value gives the confidence in that classification. Similarly, the T-layer classifier is positive if the sample is believed to be in the positive class and negative otherwise.
Each weak learner produces an output hypothesis h(x_i) for each sample x_i in the training set. At each iteration t, a weak learner is selected and assigned a coefficient alpha_t such that the total training error E_t of the resulting t-stage boosted classifier is minimized:

    E_t = sum over i of E[ F_(t-1)(x_i) + alpha_t * h(x_i) ]        (eq. 3.6)

Here F_(t-1) is the boosted classifier built up in the previous stages of training, E is some error function, and f_t(x) = alpha_t * h(x) is the weak learner being considered for addition to the final classifier. A minimal sketch follows.
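The sketch below is a minimal, illustrative AdaBoost with decision stumps on 1-D toy data (not the Haar-feature learner of the actual framework); it shows the re-weighting step and the weighted vote of eq. 3.5. All data and the number of rounds are made up for the example.

    x = 1:10;                                % 1-D toy samples
    y = [1 1 1 -1 -1 -1 1 1 1 -1];           % labels no single stump can separate
    w = ones(1, 10) / 10;                    % uniform initial sample weights
    T = 5;                                   % boosting rounds
    alpha = zeros(1, T); thr = zeros(1, T); pol = zeros(1, T);

    for t = 1:T
        best = inf;                          % pick the stump h(x) = pol*sign(x - thr)
        for th = 0.5:1:10.5                  % with the lowest weighted error
            for p = [-1 1]
                err = sum(w .* (p * sign(x - th) ~= y));
                if err < best, best = err; thr(t) = th; pol(t) = p; end
            end
        end
        h = pol(t) * sign(x - thr(t));
        alpha(t) = 0.5 * log((1 - best) / max(best, eps));   % learner coefficient
        w = w .* exp(-alpha(t) * (y .* h));                  % misclassified samples
        w = w / sum(w);                                      % gain weight
    end

    H = diag(pol) * sign(repmat(x, T, 1) - repmat(thr', 1, 10));  % row t holds h_t(x)
    F = sign(alpha * H);                     % strong classifier: sign of eq. 3.5
    fprintf('training accuracy: %d/10\n', sum(F == y));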
Weighting: At each iteration of the training process, a weight is assigned to each sample in the training set equal to the current error on that sample. These weights can be used to inform the training of the weak learner; for instance, decision trees can be grown that favor splitting sets of samples with high weights.

3.5 Cascading Amplifiers
The term is borrowed from electronics: a cascade amplifier is any two-port network constructed from a series of amplifiers, where each amplifier sends its output to the input of the next in a daisy chain [1]. (The complication in calculating the gain of cascaded stages is the non-ideal coupling between stages due to loading: for two cascaded common-emitter stages, the input resistance of the second stage forms a voltage divider with the output resistance of the first stage, so the total gain is not the product of the gains of the individual, separated stages.) In the Viola-Jones framework the cascaded stages are classifiers rather than amplifiers, but the same idea of chaining stages applies.

Figure 13: Cascading amplifier, a simplified diagram

On average, only 0.01% of all sub-windows are positive (faces), yet equal computation time would otherwise be spent on all sub-windows; most of the time must be spent only on potentially positive sub-windows. A simple 2-feature classifier can achieve an almost 100% detection rate with a 50% false positive rate. That classifier can act as the first layer of a series, filtering out most negative windows; a second layer with 10 features can then tackle the "harder" negative windows that survived the first layer, and so on.
A cascade of gradually more complex classifiers achieves even better detection rates. The evaluation of the strong classifiers generated by the learning process can be done quickly, but it isn't fast enough to run in real time. For this reason, the strong classifiers are arranged in a cascade in order of complexity, where each successive classifier is trained only on those selected samples which pass through the preceding classifiers. If at any stage in the cascade a classifier rejects the sub-window under inspection, no further processing is performed on it and the search continues with the next sub-window. The cascade therefore has the form of a degenerate tree. In the case of faces, the first classifier in the cascade, called the attentional operator, uses only two features to achieve a false negative rate of approximately 0% and a false positive rate of 40%. The effect of this single classifier is to reduce by roughly half the number of times the entire cascade is evaluated.
In cascading, each stage consists of a strong classifier, so all the features are grouped into several stages, where each stage has a certain number of features. The job of each stage is to determine whether a given sub-window is definitely not a face or may be a face. A given sub-window is immediately discarded as not a face if it fails any of the stages. A simple framework for cascade training is given below:
  - The user selects values for f, the maximum acceptable false positive rate per layer, and d, the minimum acceptable detection rate per layer.
  - The user selects the target overall false positive rate, Ftarget.
  - P = the set of positive examples.
  - N = the set of negative examples.
A two-line illustration of how these per-layer rates compound follows.
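To see why the per-layer rates f and d compound across s stages (and why Ftarget is reachable with modest per-layer requirements), consider this small illustration; the values are arbitrary examples, not from the thesis:

    f = 0.5; d = 0.99; s = 10;   % illustrative per-layer rates and cascade depth
    overallFP = f^s              % ~9.8e-4: false positives shrink geometrically
    overallTP = d^s              % ~0.904: detections decay much more slowly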
CHAPTER 4. Software Implementation

4.1 MATLAB
The software used for implementing the approach was MATLAB. MATLAB (matrix laboratory) is capable of performing various mathematical operations, with features such as a fourth-generation programming language for implementing an approach. It allows a user to plot functions and data, implement algorithms, create user interfaces, and interface with programs written in other languages. It has numerous toolboxes to work with, depending on the type of problem at hand. It has found users not only in the field of engineering but also in economics and science, and it keeps improving by launching newer, enhanced versions with novel tools to discover and put to use for varied purposes. It was developed by Cleve Moler in the 1970s to give his students a better option for working with libraries like LINPACK and EISPACK without having to learn Fortran. Researchers like Jack Little made some much-needed improvements to it, and by the year 2000 newer libraries had been added for matrix manipulation.
The application is built around the MATLAB scripting language, and usual usage includes writing commands in the command window, which executes the written code. In MATLAB we have worked under the neural network toolbox, wherein face recognition was done for the image.

4.2 Open Source Computer Vision
OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision, originally developed by Intel's research center in Nizhny Novgorod (Russia), later supported by Willow Garage, and now maintained by Itseez. The library is cross-platform and free for use under the open-source BSD license.
Officially launched in 1999, the OpenCV project was initially an Intel Research initiative to advance CPU-intensive applications, part of a series of projects including real-time ray tracing and 3D display walls. The main contributors to the project included a number of optimization experts at Intel Russia, as well as Intel's Performance Library Team. In the early days of OpenCV, the goals of the project were described as follows:
  - Advance vision research by providing not only open but also optimized code for basic vision infrastructure. No more reinventing.
  - Disseminate vision knowledge by providing a common infrastructure that developers could build on, so that code would be more readily readable and transferable.
  - Advance vision-based commercial applications by making portable, performance-optimized code available for free, with a license that did not require the applications themselves to be open or free.
The first alpha version of OpenCV was released to the public at the IEEE Conference on Computer Vision and Pattern Recognition in 2000, and five betas were released between 2001 and 2005. The first 1.0 version was released in 2006. In mid-2008, OpenCV obtained corporate support from Willow Garage and came under active development again. A version 1.1 "pre-release" was released in October 2008, and the second major release of OpenCV was in October 2009. OpenCV 2 includes major changes to the C++ interface, aiming at easier, more type-safe patterns, new functions, and better implementations of existing ones in terms of performance (especially on multi-core systems). Official releases now occur every six months, and development is done by an independent Russian team supported by commercial corporations.
In August 2012, support for OpenCV was taken over by the non-profit foundation OpenCV.org, which maintains a developer site [4] and a user site. OpenCV is written in C++, and its primary interface is in C++, but it still retains a less comprehensive though extensive older C interface. There are bindings for Python, Java, and MATLAB/Octave; the API for these interfaces can be found in the online documentation [6]. Wrappers in other languages such as C#, Perl, Ch, and Ruby have been developed to encourage adoption by a wider audience. All new developments and algorithms in OpenCV are now developed against the C++ interface.

4.3 Computer Vision Tools
Computer Vision System Toolbox provides algorithms, functions, and apps for designing and simulating computer vision and video processing systems. You can perform feature detection, extraction, and matching; object detection and tracking; motion estimation; and video processing. For 3-D computer vision, the system toolbox supports camera calibration, stereo vision, 3-D reconstruction, and 3-D point cloud processing. With machine-learning-based frameworks, you can train object detection, object recognition, and image retrieval systems. Algorithms are available as MATLAB functions, System objects, and Simulink blocks. For rapid prototyping and embedded system design, the system toolbox supports fixed-point arithmetic and C-code generation.
Key Features:
  - Object detection and tracking, including the Viola-Jones, Kanade-Lucas-Tomasi (KLT), and Kalman filtering methods
  - Training of object detection, object recognition, and image retrieval systems, including cascade object detection and bag-of-features methods
  - Camera calibration for single and stereo cameras, including automatic checkerboard detection and an app for workflow automation
  - Stereo vision, including rectification, disparity calculation, and 3-D reconstruction
  - 3-D point cloud processing, including I/O, visualization, registration, denoising, and geometric shape fitting
  - Feature detection, extraction, and matching
  - Support for C-code generation and fixed-point arithmetic with code generation products

Object Detection and Recognition: Object detection and recognition are used to locate, identify, and categorize objects in images and video. Computer Vision System Toolbox provides a comprehensive suite of algorithms and tools for object detection and recognition.

Object Classification: You can detect or recognize an object in an image by training an object classifier using pattern recognition algorithms that create classifiers based on training data from different object classes. The classifier accepts image data and assigns the appropriate object or class label. Computer Vision System Toolbox provides algorithms to train image classification and image retrieval systems using the bag-of-words model. The system toolbox also provides feature extraction techniques you can use to create custom classifiers using supervised classification algorithms from Statistics and Machine Learning Toolbox.

4.4 Vision Cascade Object Detector
The cascade object detector uses the Viola-Jones algorithm to detect people's faces, noses, eyes, mouths, or upper bodies. One can also use the Training Image Labeler to train a custom classifier to use with this System object.
Construction: detector = vision.CascadeObjectDetector creates a System object, detector, that detects objects using the Viola-Jones algorithm. The ClassificationModel property controls the type of object to detect; by default, the detector is configured to detect faces.
detector = vision.CascadeObjectDetector(MODEL) creates a System object, detector, configured to detect objects defined by the input string MODEL. The MODEL input describes the type of object to detect; there are several valid MODEL strings, such as 'FrontalFaceCART', 'UpperBody', and 'ProfileFace'.
detector = vision.CascadeObjectDetector(XMLFILE) creates a System object, detector, and configures it to use the custom classification model specified with the XMLFILE input. The XMLFILE can be created using the trainCascadeObjectDetector function or OpenCV (Open Source Computer Vision) training functionality.
You must specify a full or relative path to the XMLFILE if it is not on the MATLAB path.
detector = vision.CascadeObjectDetector(Name,Value) configures the cascade object detector's properties. You specify these properties as one or more name-value pair arguments; unspecified properties have default values.
To detect a feature:
1. Define and set up your cascade object detector using the constructor.
2. Call the step method with the input image, I, the cascade object detector object, detector, and any optional properties. See the syntax below for using the step method.
Use the step syntax with the input image, I, the selected cascade object detector object, and any optional properties to perform detection. BBOX = step(detector, I) returns BBOX, an M-by-4 matrix defining M bounding boxes containing the detected objects. This method performs multiscale object detection on the input image, I. Each row of the output matrix, BBOX, contains a four-element vector, [x y width height], that specifies the upper-left corner and size of a bounding box in pixels. The input image I must be a grayscale or truecolor (RGB) image. BBOX = step(detector, I, roi) detects objects within the rectangular search region specified by roi; you must specify roi as a four-element vector, [x y width height], that defines a rectangular region of interest within image I, and set the 'UseROI' property to true to use this syntax. A minimal usage sketch is given below.
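A minimal usage sketch of the detector and step syntax described above. 'visionteam.jpg' is one of the demo images that ships with the toolbox; any image file can be substituted.

    detector = vision.CascadeObjectDetector();            % defaults to frontal faces
    I    = imread('visionteam.jpg');                      % demo image from the toolbox
    bbox = step(detector, I);                             % M-by-4 [x y width height]
    out  = insertObjectAnnotation(I, 'rectangle', bbox, 'Face');
    figure, imshow(out), title('Detected faces')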
Why Train a Detector? The vision.CascadeObjectDetector System object comes with several pretrained classifiers for detecting frontal faces, profile faces, noses, eyes, and the upper body. However, these classifiers are not always sufficient for a particular application, so Computer Vision System Toolbox provides the trainCascadeObjectDetector function to train a custom classifier.
What Kinds of Objects Can You Detect? The Computer Vision System Toolbox cascade object detector can detect object categories whose aspect ratio does not vary significantly. Objects whose aspect ratio remains approximately fixed include faces, stop signs, and cars viewed from one side. The vision.CascadeObjectDetector System object detects objects in images by sliding a window over the image. The detector then uses a cascade classifier to decide whether the window contains the object of interest. The size of the window varies to detect objects at different scales, but its aspect ratio remains fixed. The detector is very sensitive to out-of-plane rotation, because the aspect ratio changes for most 3-D objects. Thus, you need to train a detector for each orientation of the object; training a single detector to handle all orientations will not work.

Figure 14: Training the cascade detector

How Does the Cascade Classifier Work? The cascade classifier consists of stages, where each stage is an ensemble of weak learners. The weak learners are simple classifiers called decision stumps. Each stage is trained using a technique called boosting. Boosting provides the ability to train a highly accurate classifier by taking a weighted average of the decisions made by the weak learners.
Each stage of the classifier labels the region defined by the current location of the sliding window as either positive or negative. Positive indicates that an object was found and negative indicates that no object was found. If the label is negative, the classification of this region is complete, and the detector slides the window to the next location. If the label is positive, the classifier passes the region to the next stage. The detector reports an object found at the current window location when the final stage classifies the region as positive.
The stages are designed to reject negative samples as fast as possible. The assumption is that the vast majority of windows do not contain the object of interest; conversely, true positives are rare and worth taking the time to verify.
  - A true positive occurs when a positive sample is correctly classified.
  - A false positive occurs when a negative sample is mistakenly classified as positive.
  - A false negative occurs when a positive sample is mistakenly classified as negative.
To work well, each stage in the cascade must have a low false negative rate: if a stage incorrectly labels an object as negative, the classification stops, and you cannot correct the mistake. However, each stage can have a high false positive rate; even if the detector incorrectly labels a non-object as positive, the mistake can be corrected in subsequent stages. The overall false positive rate of the cascade classifier is f^s, where f is the false positive rate per stage in the range (0, 1), and s is the number of stages. Similarly, the overall true positive rate is t^s, where t is the true positive rate per stage in the range (0, 1]. Thus, adding more stages reduces the overall false positive rate, but it also reduces the overall true positive rate.

4.5 Neural Network Tools
The Neural Network Toolbox provides algorithms, functions, and apps to create, train, visualize, and simulate neural networks. It offers regression, clustering, pattern recognition, dimensionality reduction, and dynamic system modeling and control. The toolbox includes convolutional neural networks and deep learning algorithms for various computing tasks. The Parallel Computing Toolbox makes it possible to handle large data sets by speeding up training and spreading it across multi-core processors and GPUs. Among its various features, it supports supervised and unsupervised learning algorithms and offers pre- and post-processing for better training efficiency and enhanced network performance.
Under network architectures it offers supervised and unsupervised networks; among supervised networks it has four types to offer: feed-forward networks, radial basis networks, learning vector quantization, and dynamic networks. We have used feed-forward networks to compute our data; these have one-way connections and no feedback mechanism. Feed-forward networks support nonlinear function fitting, pattern recognition, and prediction. The supported feed-forward networks include feed-forward backpropagation, cascade-forward backpropagation, feed-forward input-delay backpropagation, and linear and perceptron networks.
Neural networks are good at fitting functions; indeed, there is proof that a fairly simple neural network can fit any practical function. Suppose, for instance, that you have data from a housing application. You want to design a network that can predict the value of a house (in $1000s), given 13 pieces of geographical and real estate information, and you have a total of 506 example homes for which you have those 13 items of data and their associated market values. You can solve this problem in two ways:
 Use the graphical user interface, nftool, as described in Using the Neural Network Fitting Tool.
 Use command-line functions, as described in Using Command-Line Functions (a command-line sketch is given below).
It is generally best to start with the GUI, and then to use the GUI to automatically generate command-line scripts. Before using either method, first define the problem by selecting a data set. Each GUI has access to many sample data sets that you can use to experiment with the toolbox (see Neural Network Toolbox Sample Data Sets). If you have a specific problem that you want to solve, you can load your own data into the workspace.
4.7 Training Algorithm
The toolbox supports training and learning functions, which are ways to adjust the numerical weights and biases associated with a network. A training function is a global algorithm that affects the weights and biases of the whole network, whereas a learning function operates on individual weights and biases. The algorithms supported by the toolbox include the Levenberg-Marquardt algorithm, the resilient backpropagation algorithm, and many gradient descent and conjugate gradient methods. The modular framework also helps in developing custom training algorithms. When training the network, error weights can be used to check the relation between the outputs and the desired targets. The algorithms can be accessed directly from the command line or via apps that show a diagrammatic representation of the network in use and provide features to monitor and train it. A suite of learning functions, including gradient descent, Hebbian learning, LVQ, Widrow-Hoff and Kohonen, is also provided.
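As a bridge between the fitting tool described above and the training functions described next, the following is a minimal command-line sketch of the same housing fit, using the toolbox's built-in house_dataset sample; the hidden layer size of 10 is an illustrative choice, not a value prescribed by this work.

[x, t] = house_dataset;        % 13 inputs and 1 target value per example home
net = fitnet(10);              % two-layer feed-forward network, 10 hidden neurons
net.trainFcn = 'trainlm';      % Levenberg-Marquardt, the nftool default
[net, tr] = train(net, x, t);  % data is divided into training/validation/test sets
y = net(x);                    % simulate the trained network
perf = perform(net, t, y)      % mean squared error between targets and outputs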
trainlm is a network training function that updates weight and bias values according to Levenberg-Marquardt optimization. trainlm is often the fastest backpropagation algorithm in the toolbox and is highly recommended as a first-choice supervised algorithm, although it does require more memory than other algorithms.
trainlm supports training with validation and test vectors if the network's NET.divideFcn property is set to a data division function. Validation vectors are used to stop training early if the network performance on the validation vectors fails to improve or remains the same for max_fail epochs in a row. Test vectors are used as a further check that the network is generalizing well, but they do not have any effect on training.
trainlm can train any network as long as its weight, net input and transfer functions have derivative functions. Backpropagation is used to calculate the Jacobian jX of the performance perf with respect to the weight and bias variables X. Each variable is then adjusted according to the Levenberg-Marquardt update:
jj = jX' * jX (eq 4.1)
je = jX' * E (eq 4.2)
dX = -(jj + I*mu) \ je (eq 4.3)
where E is the vector of all errors, I is the identity matrix, and jX' denotes the transpose of the Jacobian. The adaptive value mu is increased by the factor mu_inc until the change above results in a reduced performance value; the change is then applied to the network and mu is decreased by the factor mu_dec.
The parameter mem_reduc indicates how to trade memory against speed when calculating the Jacobian jX. If mem_reduc is 1, trainlm runs fastest but can require a lot of memory. Increasing mem_reduc to 2 cuts the memory required roughly in half but slows trainlm somewhat; higher values continue to decrease the amount of memory needed at the cost of longer training times.
Training stops when any of these conditions occurs:
 The maximum number of epochs (repetitions) is reached.
 The maximum amount of time is exceeded.
 Performance is minimized to the goal.
 The performance gradient falls below min_grad.
 mu exceeds mu_max.
 Validation performance has increased more than max_fail times since the last time it decreased (when using validation).
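These stopping criteria map directly onto trainlm's training parameters. A minimal sketch of setting them from the command line follows; the numeric values shown are illustrative defaults, not values tuned for this work.

net = fitnet(10);                 % two-layer feed-forward network
net.trainFcn = 'trainlm';         % Levenberg-Marquardt training
net.trainParam.epochs   = 1000;   % maximum number of epochs
net.trainParam.goal     = 1e-5;   % stop when performance reaches this value
net.trainParam.min_grad = 1e-7;   % stop when the gradient falls below this
net.trainParam.max_fail = 6;      % validation failures allowed in a row
net.trainParam.mu       = 0.001;  % initial mu for the LM update
net.trainParam.mu_dec   = 0.1;    % factor applied after a successful step
net.trainParam.mu_inc   = 10;     % factor applied after a failed step
net.trainParam.mu_max   = 1e10;   % stop if mu exceeds this value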
In mathematics and computing, the Levenberg–Marquardt algorithm (LMA), also known as the damped least-squares (DLS) method, is used to solve non-linear least squares problems. These minimization problems arise especially in least squares curve fitting, and the LMA is used in many software applications for solving generic curve-fitting problems. However, as with many fitting algorithms, the LMA finds only a local minimum, which is not necessarily the global minimum. The LMA interpolates between the Gauss–Newton algorithm (GNA) and the method of gradient descent. The LMA is more robust than the GNA, meaning that in many cases it finds a solution even if it starts very far from the final minimum; for well-behaved functions and reasonable starting parameters, however, the LMA tends to be a bit slower than the GNA. The LMA can also be viewed as Gauss–Newton using a trust region approach.
4.8 Programming Code for Face Detection and its Features
Code for face detection:
clear all
clc
% Detect objects using the Viola-Jones algorithm
% To detect faces
FDetect = vision.CascadeObjectDetector;
% Read the input image
I = imread('HarryPotter.jpg');
% Returns bounding box values, one row [x y width height] per detected object
BB = step(FDetect,I);
figure,
imshow(I);
hold on
for i = 1:size(BB,1)
    rectangle('Position',BB(i,:),'LineWidth',5,'LineStyle','-','EdgeColor','r');
end
title('Face Detection');
hold off;
Figure 15: Faces detected by the algorithm are highlighted by bounding boxes
Code for nose detection:
% To detect the nose
NoseDetect = vision.CascadeObjectDetector('Nose','MergeThreshold',16);
BB = step(NoseDetect,I);
figure,
imshow(I);
hold on
for i = 1:size(BB,1)
    rectangle('Position',BB(i,:),'LineWidth',4,'LineStyle','-','EdgeColor','b');
end
title('Nose Detection');
hold off;
Figure 16: Nose detected
Explanation: to denote the nose as the object of interest, the argument 'Nose' is passed:
vision.CascadeObjectDetector('Nose','MergeThreshold',16);
The default syntax for nose detection is:
vision.CascadeObjectDetector('Nose');
The 'MergeThreshold' parameter sets how many overlapping detections must agree before a detection is reported; raising it suppresses spurious detections.
Code for mouth detection:
% To detect the mouth
MouthDetect = vision.CascadeObjectDetector('Mouth','MergeThreshold',16);
BB = step(MouthDetect,I);
figure,
imshow(I);
hold on
for i = 1:size(BB,1)
    rectangle('Position',BB(i,:),'LineWidth',4,'LineStyle','-','EdgeColor','r');
end
title('Mouth Detection');
hold off;
Figure 17: Mouth detected
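The same pattern extends to running several of the toolbox's built-in cascade models over one image. The following is a minimal sketch; the model names are the toolbox's built-in ones, and the uniform MergeThreshold of 16 is an illustrative choice rather than a tuned value.

% Sketch: run several built-in cascade models on the same image
models = {'FrontalFaceCART','Nose','Mouth'};
I = imread('HarryPotter.jpg');
figure, imshow(I); hold on
for m = 1:numel(models)
    det = vision.CascadeObjectDetector(models{m},'MergeThreshold',16);
    BB = step(det,I);
    for i = 1:size(BB,1)
        rectangle('Position',BB(i,:),'LineWidth',2,'EdgeColor','g');
    end
end
title('All detected features');
hold off;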
Code for eye detection:
% To detect the eyes
EyeDetect = vision.CascadeObjectDetector('EyePairBig');
% Read the input image
I = imread('harry_potter.jpg');
BB = step(EyeDetect,I);
figure, imshow(I);
rectangle('Position',BB,'LineWidth',4,'LineStyle','-','EdgeColor','b');
title('Eyes Detection');
Eyes = imcrop(I,BB);
figure, imshow(Eyes);
Figure 18: Eyes detected
4.9 Procedure for Face Recognition
After detecting faces and their features, we move on to the task of recognition, which is completed by training a neural network. Here different faces are used to illustrate the recognition procedure; one can continue with the same faces used above. First we read the image and convert it to grayscale or binary form, as the equivalent histogram will be used to feed the data into the neural network.
Code for image conversion from RGB to gray and binary:
I = imread('im111.jpg');
imshow(I)
J = rgb2gray(I);
subplot(1,3,1); imshow(I)
subplot(1,3,2); imshow(J)
K = im2bw(J);
subplot(1,3,3); imshow(K);
hold on;
Figure 19: Sample images in RGB, gray and binary formats
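Note that im2bw with a single argument uses a fixed threshold of 0.5. A minimal variant, assuming one prefers an automatically chosen threshold, uses Otsu's method via graythresh:

% Sketch: binarize with an automatically chosen (Otsu) threshold
level = graythresh(J);     % threshold in [0, 1] chosen by Otsu's method
K = im2bw(J, level);
figure, imshow(K);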
Code for image segmentation:
im = imread('im111.jpg');
n = fix(size(im,1)/2);      % middle row of the image
A = im(1:n,:,:);            % upper half
B = im(n+1:end,:,:);        % lower half
subplot(1,2,1); imshow(A);
subplot(1,2,2); imshow(B);
hold on;
Figure 20: Segmented image
One can carry out the same procedure on segmented images as well (a further variant is sketched below). Segmentation of the images has been shown here, but recognition has been carried out with the histogram equivalent of the grayscale image.
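For completeness, a minimal sketch of the analogous left/right split of the same image; this variant is illustrative only and was not used in the recognition experiments.

m = fix(size(im,2)/2);      % middle column of the image
L = im(:,1:m,:);            % left half
R = im(:,m+1:end,:);        % right half
subplot(1,2,1); imshow(L);
subplot(1,2,2); imshow(R);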
Code for the binary image histogram:
I = imread('im111.jpg');
J = rgb2gray(I);
K = im2bw(J);
subplot(1,3,1); imshow(I);
subplot(1,3,2); imshow(J);
subplot(1,3,3); imshow(K);
imhist(K);      % drawn into the current (third) panel, replacing imshow(K)
hold on;
Figure 21: Binary image histogram equivalent
Code for calculating the histogram equivalent of a grayscale image:
I = imread('im111.jpg');
J = rgb2gray(I);
K = im2bw(J);
subplot(1,3,1); imshow(I);
subplot(1,3,2); imshow(J);
subplot(1,3,3); imshow(K);
imhist(J);      % drawn into the current (third) panel, replacing imshow(K)
hold on;
Figure 22: Sample image 1 in RGB, sample image 2 in grayscale, and histogram equivalent of sample image 2
CHAPTER 5
Results and Discussion
For carrying out the training procedure, a three-layer neural network is used. The training and testing of the data has been done using the neural network tool in MATLAB. We have used feed-forward networks under the supervised learning architecture of the neural network toolbox to compute our data, in which connections run one way and no feedback mechanism is included.
5.1 Results of face recognition:
After opening the nftool we choose the data to be trained for the recognition procedure. That data is usually imported from an Excel workbook.
Table 1: Sample data for sample images 1, 2, 3
S.no   Sample   Input data   Target data
1      1        10010101     10001111
2      2        11001100     11110000
3      3        10101100     10101011
After importing the data, the neural network tool starts the training, depicted by the diagram shown below. It carries out training in three stages: training, validation and testing. In the training stage the network is adjusted according to the error generated during it. During validation, network generalization is monitored and used to halt the training. In testing, the final solution is tested to confirm the predictive power of the network. Usually several training sessions are conducted to enhance the performance of the network and lower the mean squared error, which is the average squared difference between output and target. A sketch of how such histogram-based training data can be assembled is given below.
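A minimal sketch of assembling grayscale-histogram features into a training set and fitting the network from the command line; the file names, hidden layer size and one-output-per-subject target coding are illustrative assumptions, not the exact data of Table 1.

% Sketch: one normalized 256-bin grayscale histogram per training image
files = {'im111.jpg','im112.jpg','im113.jpg'};   % hypothetical file names
X = zeros(256, numel(files));
for k = 1:numel(files)
    J = rgb2gray(imread(files{k}));
    h = double(imhist(J));
    X(:,k) = h / sum(h);            % normalize so histograms are comparable
end
T = eye(numel(files));              % illustrative one-target-per-subject coding
net = fitnet(10,'trainlm');         % two-layer feed-forward network
[net, tr] = train(net, X, T);       % tool divides data into train/val/test
mseValue = perform(net, T, net(X))  % mean squared error, as discussed above
% With only a few samples this is purely illustrative; in practice many
% images per subject are needed for the three-stage division to be meaningful.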
Figure 23: The three-stage training procedure
The performance plot shown below is one criterion for checking the performance of the network; regression plots can be devised for the same purpose, but for convenience I have kept to performance plots.
Figure 24: Parameters calculated during training
Figure 25: Performance plot for sample data 1
Table 2: Sample data for sample image 2
Input data          Target data
462  0  0  102      1100
342  0  0  78       0010
234  0  0  65       1001
500  0  0  132      1010
222  0  0  69       1011
165  0  0  45       0111
Figure 26: Performance plot of sample data for sample image 2
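To connect these plots to the accuracy figure reported in the next chapter, the following is a minimal sketch, reusing the hypothetical net, X and T from the sketch above, of estimating recognition accuracy from the network outputs:

Y = net(X);                                  % network outputs for the sample set
[~, predicted] = max(Y, [], 1);              % strongest output per column
[~, actual]    = max(T, [], 1);              % index of the target class
accuracy = 100 * mean(predicted == actual)   % percentage correctly recognized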
CHAPTER 6
Conclusion and Future Scope
6.1 Conclusion:
In this work it has been shown that, given a facial image of a person, the network is able to recognize the face of that person. The work was completed through the following steps. Facial images of different persons were collected to show the detection and recognition process in detail. For detection we used a group of people and detected their faces and features, highlighting the detected regions with bounding boxes. Detection used the vision.CascadeObjectDetector package in the Computer Vision Toolbox of MATLAB 8.2, while recognition used the neural network toolbox with a feed-forward backpropagation architecture; within it, we worked under the neural network fitting tool to train the facial image data.
The image taken for recognition was converted into grayscale and binary formats. Taking the grayscale image, we computed its histogram, from which the data was compiled for image recognition; one can also do the same using the binary image. In the neural network fitting tool the training was carried out in three stages, namely training, validation and testing, which have already been explained earlier in the report. The training procedure computed the error to monitor the performance of the network; in validation, network generalization was the area of concentration; in testing, the final solution was obtained by carrying out the training session 5-10 times in order to achieve better outcomes. Mean squared error was the parameter chosen to assess the network's performance, and low mean squared error values were obtained. Thus, from the results one can conclude that the given image was recognized. The accuracy of this network is approximately 85%.
6.2 Future Scope:
Face recognition systems used today work very well under constrained conditions, although all systems work much better with frontal mug-shot images and constant lighting. All current face recognition algorithms fail under the vastly varying conditions under which humans need to, and are able to, identify other people. Next-generation person recognition systems will need to recognize people in real time and in much less constrained situations. I believe that identification systems that are robust in natural environments, in the presence of noise and illumination changes, cannot rely on a single modality, so fusion with other modalities is essential. Technology used in smart environments has to be unobtrusive and allow users to act freely. Wearable systems in particular require their sensing technology to be small, low-powered and easily integrable with the user's clothing.
Considering all these requirements, identification systems that use face recognition and speaker identification seem to have the most potential for widespread application. Cameras and microphones today are very small and lightweight, and they have been successfully integrated with wearable systems. Audio- and video-based recognition systems have the critical advantage that they use the modalities humans use for recognition. Finally, researchers are beginning to demonstrate that unobtrusive audio- and video-based person identification systems can achieve high recognition rates without requiring the user to be in a highly controlled environment. Sooner or later the process of face recognition will become automated, using the neural network as its base technique in combination with other techniques.
REFERENCES:
[1] H.A. Rowley, S. Baluja, and T. Kanade, "Neural network-based face detection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 23-38, Jan. 1998.
[2] H.A. Rowley, S. Baluja, and T. Kanade, "Rotation invariant neural network-based face detection," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, pp. 38-44, 1998.
[3] K.K. Sung and T. Poggio, "Example-based learning for view-based human face detection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 39-51, Jan. 1998.
[4] H. Schneiderman and T. Kanade, "A statistical method for 3D object detection applied to faces and cars," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, pp. 746-751, June 2000.
[5] K.C. Yow and R. Cipolla, "Feature-based human face detection," Image and Vision Computing, vol. 25, no. 9, pp. 713-735, Sept. 1997.
[6] D. Maio and D. Maltoni, "Real-time face location on gray-scale static images," Pattern Recognition, vol. 33, no. 9, pp. 1525-1539, Sept. 2000.
[7] M.S. Lew and N. Huijsmans, "Information theory and face detection," Proc. IEEE Int'l Conf. Pattern Recognition, pp. 601-605, Aug. 1996.
[8] S.C. Dass and A.K. Jain, "Markov face models," Proc. IEEE Int'l Conf. Computer Vision, pp. 680-687, July 2001.
[9] A.J. Colmenarez and T.S. Huang, "Face detection with information-based maximum discrimination," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, pp. 782-787, June 1997.
[10] D. DeCarlo and D. Metaxas, "Optical flow constraints on deformable models with applications to face tracking," International Journal of Computer Vision, vol. 38, no. 2, pp. 99-127, July 2000.
[11] V. Bakic and G. Stockman, "Menu selection by facial aspect," Proc. Vision Interface, Canada, pp. 203-209, May 1999.
[12] A. Colmenarez, B. Frey, and T. Huang, "Detection and tracking of faces and facial features," Proc. IEEE Int'l Conf. Image Processing, pp. 657-661, Oct. 1999.
[13] R. Féraud, O.J. Bernier, J.-E. Viallet, and M. Collobert, "A fast and accurate face detection based on neural network," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23.
[14] W. Zhao, R. Chellappa, P.J. Phillips, and A. Rosenfeld, "Face Recognition: A Literature Survey," ACM Comput. Surv., vol. 35, no. 4, pp. 399-458, 2003.
[15] M.A. Turk and A.P. Pentland, "Face Recognition Using Eigenfaces," IEEE Conf. on Computer Vision and Pattern Recognition, pp. 586-591, 1991.
[16] Ling-Zhi Liao, Si-Wei Luo, and Mei Tian, ""Whitenedfaces" Recognition With PCA and ICA," IEEE Signal Processing Letters, vol. 14, no. 12, pp. 1008-1011, Dec. 2007.
[17] G. Jarillo, W. Pedrycz, and M. Reformat, "Aggregation of classifiers based on image transformations in biometric face recognition," Machine Vision and Applications, vol. 19, pp. 125-140, Springer-Verlag, 2008.
[18] Tej Pal Singh, "Face Recognition by using Feed Forward Back Propagation Neural Network," International Journal of Innovative Research in Technology & Science, vol. 1, no. 1.
[19] N. Revathy and T. Guhan, "Face recognition system using back propagation artificial neural networks," International Journal of Advanced Engineering Technology, vol. 3, no. 1, 2012.
[20] Kwan-Ho Lin, Kin-Man Lam, and Wan-Chi Siu, "A New Approach using Modified Hausdorff Distances with EigenFace for Human Face Recognition," IEEE Seventh International Conference on Control, Automation, Robotics and Vision, Singapore, 2-5 Dec., pp. 980-984, 2002.
[21] Zdravko Liposcak and Sven Loncaric, "Face Recognition From Profiles Using Morphological Operations," IEEE Conference on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, 1999.
[22] Simone Ceolin, William A.P. Smith, and Edwin Hancock, "Facial Shape Spaces from Surface Normals and Geodesic Distance," Digital Image Computing Techniques and Applications, 9th Biennial Conference of the Australian Pattern Recognition Society, IEEE, 3-5 Dec., Glenelg, pp. 416-423, 2007.
[23] Ki-Chung Chung, Seok Cheol Kee, and Sang Ryong Kim, "Face Recognition using Principal Component Analysis of Gabor Filter Responses," Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, Proceedings, International Workshop, IEEE, 26-27 September, Corfu, pp. 53-57, 1999.
[24] Varsha Gupta and Dipesh Sharma, "A Study of Various Face Detection Methods," International Journal of Advanced Research in Computer and Communication Engineering, vol. 3, no. 5, May 2014.
[25] Irene Kotsia and Ioannis Pitas, "Facial expression recognition in image sequences using geometric deformation features and support vector machines," IEEE Transactions on Image Processing, vol. 16, no. 1, pp. 172-187, January 2007.
[26] R. Rojas, "The back propagation algorithm," in Neural Networks, Springer-Verlag, pp. 149-182, 1996.
[27] Hosseien Lari-Najaffi, Mohammad Naseerudin, and Tariq Samad, "Effects of initial weights on back propagation and its variations," Systems, Man and Cybernetics, Conference Proceedings, IEEE International Conference, 14-17 Nov., Cambridge, pp. 218-219, 1989.
[28] M.H. Ahmad Fadzil and H. Abu Bakar, "Human face recognition using neural networks," Image Processing, Proceedings ICIP-94, IEEE International Conference, 13-16 November, Austin, pp. 936-939, 1994.
[29] N.K. Sinha, M.M. Gupta, and D.H. Rao, "Dynamic Neural Networks: an overview," Industrial Technology 2000, Proceedings of IEEE International Conference, 19-22 Jan., pp. 491-496, 2000.
[30] Prachi Agarwal and Naveen Prakash, "An Efficient Back Propagation Neural Network Based Face Recognition System Using Haar Wavelet Transform and PCA," International Journal of Computer Science and Mobile Computing, vol. 2, no. 5, pp. 386-395, May 2013.
[31] "Backpropagation," https://en.wikipedia.org/wiki/Backpropagation, accessed 20 September 2015.
[32] "Artificial Neural Networks," https://en.wikipedia.org/wiki/Artificial_neural_network, accessed online 2 October 2001.
[33] Michael Nielsen, "Neural networks and deep learning," http://neuralnetworksanddeeplearning.com/chap2.html, 2016.
[34] "Feedforward neural network," https://en.wikipedia.org/wiki/Feedforward_neural_network, accessed 7 April 2005.
Publications:
Journal Publications
Smriti Tikoo and Nitin Malik, "Detection of face using Viola Jones algorithm and recognition using Backpropagation neural network," International Journal of Computer Science and Mobile Computing, vol. 5, no. 5, pp. 288-295, May 2016.
Communicated
"Detection, segmentation and recognition of face and its features using neural networks," International Journal of Advanced Research in Computer and Communication Engineering.