An automatic attendance system has two stages: face detection and face recognition. Although many efficient algorithms exist for frontal face detection and recognition, most of them fail when large pose variation comes into the picture. In this presentation I combine two state-of-the-art methods to build an automatic attendance system.
2. OUTLINE
▪ Motivation
▪ Objective
▪ System Requirements
▪ Design Details
▪ Tried methods
▪ Inspiration
▪ Main design
▪ Status so far
▪ Future work
3. MOTIVATION
▪ Taking attendance in large classes is:
▪ Cumbersome
▪ Repetitive
▪ Consumes valuable class time
▪ What if we make an efficient face detection and recognition system for
this task?
4. OBJECTIVES
▪ Automatic user identification via face detection and recognition.
▪ Develop and implement an efficient face detection and recognition
system.
▪ End-to-end face recognition system using deep learning.
5. DIFFICULTIES
▪ Large pose variation
▪ Hidden faces & tiny faces
▪ Different illumination conditions, occlusions
6. SYSTEM REQUIREMENTS
▪ Hardware:
▪ A camera
▪ PC or Raspberry pi
▪ Software:
▪ Matlab 2013+
▪ Python 2.7
▪ Lasagne API
8. GOING DEEP INTO FACE RECOGNITION
▪ Various methods are employed to recognize a person in the wild.
▪ Compared with traditional handcrafted features such as high-dimensional
LBP, the Active Appearance Model (AAM), the Active Shape Model (ASM),
Bayesian face, or Gaussian face, deep features learnt automatically from
personal identity are more advantageous.
▪ In most deep-learning-based face recognition methods, the inputs to the
deep model are aligned face images.
9. TRIED METHODS
▪ Tal Hassner, Shai Harel, Eran Paz, Roee Enbar, "Effective Face Frontalization
in Unconstrained Images", CVPR 2015
11. TRIED METHODS
▪ Zhu, Xiangyu, et al. "Face Alignment Across Large Poses: A 3D
Solution", CVPR 2016
14. TRIED METHODS
▪ I have tried to implement some more papers, but they fail when we
are dealing with large pose.
▪ Instead of AAM or a fitted 3D model (3D frontalisation does not show
significant improvement over simple 2D alignment*), we used deep
learning techniques to recognize a face using only personal identity
cues.
* Banerjee, Sandipan, et al. "To Frontalize or Not To Frontalize: Do We Really Need Elaborate Pre-Processing to
Improve Face Recognition Performance?", arXiv:1610.04823 (2016).
15. INSPIRATION
▪ Our work is inspired by some of the state-of-the-art papers.
▪ DeepFace: Closing the Gap to Human-Level Performance in Face Verification, CVPR-2014
▪ FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR-2015
▪ DeepID3: Face Recognition with Very Deep Neural Networks, CVPR-2015
▪ Supervised Transformer Network for Efficient Face Detection, ECCV-2016
▪ Towards End-to-End Face Recognition through Alignment Learning, arXiv-2017
▪ Spatial transformer networks. NIPS-2015
▪ Finding Tiny Faces. arXiv-2016
16. MAIN DESIGN
▪ The complete architecture has two stages▪ The complete architecture has two stages
▪ Face Detection
▪ Face Recognition
17.
18. ARCHITECTURE FOR DETECTION
▪ The authors provide an in-depth analysis of image resolution, object
scale, and spatial context for the purpose of finding small faces.
▪ A detailed study of the paper is still pending, as I found it only very
recently; I briefly describe their architecture on the next slide.
22. WHERE IT FAILS?
▪ The proposed method works well for out-of-plane rotation, but its
accuracy drops when in-plane (2D) rotation comes into the picture.
▪ Some of the failure cases are shown on the next slide.
30. SPATIAL TRANSFORMER NETWORK
▪ According to the original DeepMind paper, the spatial transformer can
be used to implement any parametrizable transformation, including
translation, scaling, affine, and projective.
▪ Suppose that for the i-th target point p^t_i = (x^t_i, y^t_i, 1) in the output image,
a grid generator generates its source coordinates (x^s_i, y^s_i, 1) in the input
image according to the transformation parameters.
▪ Projective transformation equation (with eight parameters θ1…θ8):
x^s_i = (θ1 x^t_i + θ2 y^t_i + θ3) / (θ7 x^t_i + θ8 y^t_i + 1)
y^s_i = (θ4 x^t_i + θ5 y^t_i + θ6) / (θ7 x^t_i + θ8 y^t_i + 1)
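The grid-generator step above can be sketched in numpy; the parameter ordering θ1…θ8 is an assumption, chosen to match the standard eight-parameter projective (homography) form with θ9 fixed to 1:

```python
import numpy as np

def projective_grid(theta, height, width):
    """Map every target pixel (x_t, y_t) of an H x W output image to its
    source coordinates (x_s, y_s) under a projective transform.
    theta: 8 parameters [t1..t8]; returns an array of shape (H, W, 2)."""
    t1, t2, t3, t4, t5, t6, t7, t8 = theta
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    denom = t7 * xs + t8 * ys + 1.0
    x_src = (t1 * xs + t2 * ys + t3) / denom
    y_src = (t4 * xs + t5 * ys + t6) / denom
    return np.stack([x_src, y_src], axis=-1)

# Identity parameters leave the sampling grid unchanged.
grid = projective_grid([1, 0, 0, 0, 1, 0, 0, 0], 4, 4)
```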
33. SPATIAL TRANSFORMER NETWORK
▪ We use the bilinear kernel, so that
V_i = Σ_n Σ_m U_nm · max(0, 1 − |x^s_i − m|) · max(0, 1 − |y^s_i − n|)
▪ So the overall transformer model will be:
This is equivalent to convolving a sampling kernel
k with the source image of H × W dimension.
▪ All the blocks should be differentiable.
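A minimal numpy sketch of the bilinear sampling described above (the function name and the tiny 2x2 test image are illustrative, not from the slides):

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Sample img (H x W) at real-valued (x, y) with a bilinear kernel:
    V = sum_{n,m} U[n, m] * max(0, 1-|x-m|) * max(0, 1-|y-n|).
    Only the four neighbouring pixels have nonzero weight."""
    H, W = img.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    val = 0.0
    for n in (y0, y0 + 1):
        for m in (x0, x0 + 1):
            if 0 <= n < H and 0 <= m < W:
                val += img[n, m] * max(0, 1 - abs(x - m)) * max(0, 1 - abs(y - n))
    return val

img = np.array([[0.0, 1.0], [2.0, 3.0]])
# The midpoint of the 2x2 image averages all four pixels: (0+1+2+3)/4 = 1.5.
assert abs(bilinear_sample(img, 0.5, 0.5) - 1.5) < 1e-9
```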
37. SPATIAL TRANSFORMER NETWORK
During backward propagation, we need to calculate the gradient
of V_i with respect to each of the eight transformation parameters.
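As a sanity check on that differentiability, a hedged numpy sketch: for a pure horizontal translation parameter t1 (so x^s = x^t + t1), the gradient of V_i with respect to t1 equals dV/dx^s, and the analytic bilinear gradient can be compared against a finite difference:

```python
import numpy as np

def bilinear(img, x, y):
    """Bilinear sampling of img (H x W) at real-valued (x, y)."""
    H, W = img.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    v = 0.0
    for n in (y0, y0 + 1):
        for m in (x0, x0 + 1):
            if 0 <= n < H and 0 <= m < W:
                v += img[n, m] * max(0, 1 - abs(x - m)) * max(0, 1 - abs(y - n))
    return v

img = np.array([[0.0, 1.0], [2.0, 3.0]])
x, y, eps = 0.4, 0.25, 1e-6

# Numerical gradient dV/dx via central finite difference.
numeric = (bilinear(img, x + eps, y) - bilinear(img, x - eps, y)) / (2 * eps)
# Analytic gradient of the bilinear interpolant for 0 < x < 1, 0 < y < 1.
analytic = (1 - y) * (img[0, 1] - img[0, 0]) + y * (img[1, 1] - img[1, 0])
assert abs(numeric - analytic) < 1e-4
```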
44. SPATIAL TRANSFORMER NETWORK
▪ The similarity transformation is defined here as
x^s = λ(x^t cos α − y^t sin α) + t1,  y^s = λ(x^t sin α + y^t cos α) + t2
in which α is the rotation angle, λ is the scaling factor, and t1, t2 are the horizontal and
vertical translation displacements respectively. Analogously, the gradients of V_i with respect to α
and λ are shown below:
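The similarity transform above can be sketched directly (function name and test point are illustrative):

```python
import numpy as np

def similarity(alpha, lam, t1, t2, pts):
    """Apply the similarity transform x^s = lam * R(alpha) x^t + t.
    pts: (N, 2) target coordinates; returns (N, 2) source coordinates."""
    R = np.array([[np.cos(alpha), -np.sin(alpha)],
                  [np.sin(alpha),  np.cos(alpha)]])
    return lam * pts @ R.T + np.array([t1, t2])

# A 90-degree rotation (no scale, no shift) sends (1, 0) to (0, 1).
src = similarity(np.pi / 2, 1.0, 0.0, 0.0, np.array([[1.0, 0.0]]))
```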
47. STATUS SO FAR
▪ STN is implemented and tested on the Labeled Faces in the Wild (LFW) dataset.
▪ Out of 5423 classes, we took only 1000 because of computational
limitations.
▪ During training we did data augmentation with random 2D-affine transformations
on the face data to increase the training set size.
▪ We had 15399 training images, 3501 test images, and 2100 validation images
during training.
▪ We introduced a CNN architecture to extract deep features from the
transformed face.
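The random 2D-affine augmentation mentioned above might be sketched with scipy.ndimage; the parameter ranges (rotation, scale, shift) are assumptions, since the slides do not give them:

```python
import numpy as np
from scipy.ndimage import affine_transform

def random_affine(img, rng, max_angle=10.0, max_scale=0.1, max_shift=2.0):
    """Apply a small random rotation, scale, and shift to a 2D image.
    Parameter ranges are illustrative, not taken from the slides."""
    angle = np.deg2rad(rng.uniform(-max_angle, max_angle))
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    A = scale * np.array([[np.cos(angle), -np.sin(angle)],
                          [np.sin(angle),  np.cos(angle)]])
    # Rotate about the image centre, then apply a random shift.
    centre = (np.array(img.shape) - 1) / 2.0
    shift = rng.uniform(-max_shift, max_shift, size=2)
    offset = centre - A @ centre + shift
    # order=1 keeps values inside the original intensity range.
    return affine_transform(img, A, offset=offset, order=1, mode="nearest")

rng = np.random.RandomState(0)
face = rng.rand(32, 32)          # stand-in for a face crop
aug = random_affine(face, rng)
```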
48. STATUS SO FAR
▪ Output of the STN network
49. STATUS SO FAR
[Figure: STN architecture — stacked Conv layers with Pool & Actv stages, followed by three Dense layers and a final activation]
[Figure: Recognition architecture — Conv and Pool & Actv layers, followed by three Dense layers and activations]
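The slides do not give kernel sizes or strides, so here is a hedged sketch assuming 3x3 convolutions (padding 1) and 2x2 max-pooling, tracing how the spatial size shrinks through Conv/Conv/Pool blocks like those in the architecture above:

```python
def conv_out(size, kernel=3, pad=1, stride=1):
    """Spatial output size of a convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a max-pooling layer."""
    return (size - kernel) // stride + 1

size = 64  # assumed input resolution, not stated on the slides
for block in range(2):            # two Conv -> Conv -> Pool blocks
    size = conv_out(conv_out(size))
    size = pool_out(size)
print(size)  # 16
```

With these assumptions, 3x3 convs with padding 1 preserve spatial size, and each pooling stage halves it, so two blocks take 64 down to 16 before the Dense layers.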
51. FUTURE WORK
▪ Try to validate the architecture on real data (taken from a classroom)
▪ Without training a new CNN model, compare recognition accuracy with
ImageNet-winning pre-trained models.
▪ Add 2D-rotation-invariant face detection to the current model.