MAJOR PROJECT REPORT ON
AN EFFICIENT PARALLEL APPROACH FOR SCLERA VEIN
RECOGNITION
A dissertation work submitted in partial fulfilment of the
requirement for the award of the degree of
BACHELOR OF TECHNOLOGY
IN
(ELECTRONICS AND COMMUNICATION ENGINEERING)
BY
ADLA KIRANMAYI
ANNABATHULA SRILATHA
M.AYESHA MUBEEN
Under the guidance of
MS SYEDA SANA FATIMA
Assistant Professor
DEPARTMENT OF ELECTRONIC & COMMUNICATION
ENGINEERING
SHADAN WOMEN’S COLLEGE OF ENGINEERING &
TECHNOLOGY
(Affiliated to Jawaharlal Nehru Technological University, Hyderabad)
2011-2015
CERTIFICATE
This is to certify that the project report entitled “AN EFFICIENT
PARALLEL APPROACH FOR SCLERA VEIN RECOGNITION”, being
submitted by A. KIRANMAYI, A. SRILATHA and M. AYESHA MUBEEN
to Jawaharlal Nehru Technological University, Hyderabad, for the award of
the degree of Bachelor of Technology in Electronics and Communication
Engineering, is a record of bonafide work carried out by them under my
supervision and guidance.
The matter contained in this report has not been submitted to any other
university or institute for the award of any degree or diploma.
MS SYEDA SANA FATIMA                    MS S.SUNEETA
INTERNAL GUIDE                          HEAD OF THE DEPARTMENT

EXTERNAL GUIDE
ACKNOWLEDGEMENT
This is a report giving details of our project work titled “AN EFFICIENT
PARALLEL APPROACH FOR SCLERA VEIN RECOGNITION”.
Through this report, an attempt has been made to describe the theoretical
and practical aspects of our project to the fullest possible extent.
We take this opportunity to express our sincere appreciation to Professor
Ms. S.Suneeta, Head of the Department, and the department staff for their
invaluable suggestions and the keen interest they have shown in the
successful completion of this project.
We express our deep gratitude to our guide Ms. Syeda Sana Fatima, whose
invaluable guidance, suggestions and encouragement have immensely
helped in the successful completion of the project. This project will be an
asset to our academic profile.
It is with a profound sense of gratitude that we acknowledge our project
guide Ms. Syeda Sana Fatima for providing us with the live specification
and her valuable suggestions, which encouraged us to complete the project
successfully.
We are happy to express our gratitude to one and all who helped us in the
successful fulfilment of the project.
We are thankful to our principal Dr. MAZHER SALEEM, Shadan Women’s
College of Engineering and Technology, for encouraging us to take up this
project.
ADLA KIRANMAYI
ANNABATHULA SRILATHA
M.AYESHA MUBEEN
DECLARATION
We hereby declare that the work which is being presented in this project
entitled “AN EFFICIENT PARALLEL APPROACH FOR SCLERA
VEIN RECOGNITION” submitted towards the partial fulfilment of the
requirement for the award of the Degree of Bachelor Of Technology in
“Electronics and Communication Engineering” is an authentic record of
our work under the supervision of Ms. Syeda Sana Fatima, Assistant
Professor, and Ms. S.Suneeta, Head of the Department of Electronics and
Communication Engineering, SHADAN WOMEN’S COLLEGE OF
ENGINEERING AND TECHNOLOGY, affiliated to Jawaharlal Nehru
Technological University, Hyderabad.
The matter embodied in this report has not been submitted for the award of
any other degree.
ADLA KIRANMAYI
ANNABATHULA SRILATHA
M.AYESHA MUBEEN
INDEX
ABSTRACT………………………………………………………………………….i
CHAPTER 1: INTRODUCTION……………………………………………..…1-16
1.1 GENERAL
1.2 OVERVIEW ABOUT DIGITAL IMAGE PROCESSING
1.2.1 PREPROCESSING
1.2.2 IMAGE ENHANCEMENT
1.2.3 IMAGE RESTORATION:
1.2.4 IMAGE COMPRESSION
1.2.5 SEGMENTATION
1.2.6 IMAGE RESTORATION
1.2.7 FUNDAMENTAL STEPS
1.3 A SIMPLE IMAGE MODEL
1.4 IMAGE FILE FORMATS
1.5 TYPE OF IMAGES
1.5.1 BINARY IMAGES
1.5.2 GRAY SCALE IMAGE
1.5.3. COLOR IMAGE
1.5.4 INDEXED IMAGE
1.6 APPLICATIONS OF IMAGE PROCESSING
1.7 EXISTING SYSTEM
1.7.1 DISADVANTAGES OF EXISTING SYSTEM
1.8 LITERATURE SURVEY
1.9 PROPOSED SYSTEM
1.9.1 ADVANTAGES
CHAPTER 2: PROJECT DESCRIPTION…………………………………………17-46
2.1 INTRODUCTION
2.2 BACKGROUND OF SCLERA VEIN RECOGNITION
2.2.1 OVERVIEW OF SCLERA VEIN RECOGNITION
2.2.2 SCLERA SEGMENTATION
2.2.3 IRIS AND EYELID REFINEMENT
2.2.4 OCULAR SURFACE VASCULATURE
2.2.5 OVERVIEW OF THE LINE DESCRIPTOR-BASED SCLERA VEIN
2.3 EVOLUTION OF GPU ARCHITECTURE
2.3.1 PROGRAMMING A GPU FOR GRAPHICS
2.3.2 PROGRAMMING A GPU FOR GENERAL-PURPOSE
PROGRAMS (OLD)
2.3.3 PROGRAMMING A GPU FOR GENERAL-PURPOSE
PROGRAMS (NEW)
2.4 COARSE-TO-FINE TWO-STAGE MATCHING PROCESS
2.4.1. STAGE I: MATCHING WITH Y SHAPE DESCRIPTOR
2.4.2 STAGE II: FINE MATCHING USING WPL DESCRIPTOR
2.5. MAPPING THE SUBTASKS TO CUDA
2.5.1. MAPPING ALGORITHM TO BLOCKS
2.5.2. MAPPING INSIDE BLOCK
2.5.3. MEMORY MANAGEMENT
2.6 HISTOGRAM OF ORIENTED GRADIENTS
CHAPTER 3: SOFTWARE SPECIFICATION………………………………..…47-53
3.1 GENERAL
3.2 SOFTWARE REQUIREMENTS
3.3 INTRODUCTION
3.4 FEATURES OF MATLAB
3.4.1 INTERFACING WITH OTHER LANGUAGES
3.5 THE MATLAB SYSTEM
3.5.1 DESKTOP TOOLS
3.5.2 ANALYZING AND ACCESSING DATA
3.5.3 PERFORMING NUMERIC COMPUTATION
CHAPTER 4: IMPLEMENTATION……………………………………….……..54-69
4.1 GENERAL
4.2 CODING IMPLEMENTATION
4.3 SNAPSHOTS
CHAPTER 5: ………………………………………………………………70
CHAPTER 6: CONCLUSION & FUTURE SCOPE…………………………..71-72
6.1 CONCLUSION
6.2 REFERENCES
APPLICATION
LIST OF FIGURES

FIG NO.   FIG NAME                                                   PG. NO.
1.1       Fundamental blocks of digital image processing             2
1.2       Gray scale image                                           8
1.3       The additive model of RGB                                  9
1.4       The colors created by the subtractive model of CMYK        9
2.1       The diagram of a typical sclera vein recognition approach  19
2.2       Steps of segmentation                                      21
2.3       Glare area detection                                       21
2.4       Detection of the sclera area                               22
2.5       Pattern of veins                                           23
2.6       Sclera region and its vein patterns                        25
2.7       Filtering can take place simultaneously on different
          parts of the iris image                                    25
2.8       The sketch of parameters of segment descriptor             26
2.9       The weighting image                                        28
2.10      The module of sclera template matching                     28
2.11      The Y shape vessel branch in sclera                        28
2.12      The rotation and scale invariant character of Y shape
          vessel branch                                              29
2.13      The line descriptor of the sclera vessel pattern           30
2.14      The key elements of descriptor vector                      31
2.15      Simplified sclera matching steps on GPU                    32
2.16      Two-stage matching scheme                                  35
2.17      Example image from the UBIRIS database                     42
2.18      Occupancy on various thread numbers per block              43
2.19      The task assignment inside and outside the GPU             44
2.20      HOG features                                               46
4.1       Original sclera image                                      65
4.2       Binarised sclera image                                     65
4.3       Edge map subtracted image                                  66
4.4       Cropping ROI                                               66
4.5       ROI mask                                                   67
4.6       ROI finger sclera image                                    67
4.7       Enhanced sclera image                                      68
4.8       Feature extracted sclera image                             68
4.9       Matching with images in database                           69
4.10      Result                                                     69
ABSTRACT
Sclera vein recognition is shown to be a promising method for human
identification. However, its matching speed is slow, which could limit its
use in real-time applications. To improve the matching efficiency,
we proposed a new parallel sclera vein recognition method using a two-
stage parallel approach for registration and matching. First, we designed a
rotation- and scale-invariant Y shape descriptor based feature extraction
method to efficiently eliminate most unlikely matches. Second, we
developed a weighted polar line sclera descriptor structure to incorporate
mask information to reduce GPU memory cost. Third, we designed a
coarse-to-fine two-stage matching method. Finally, we developed a
mapping scheme to map the subtasks to GPU processing units. The
experimental results show that our proposed method can achieve dramatic
processing speed improvement without compromising the recognition
accuracy.
CHAPTER 1
INTRODUCTION
1.1 GENERAL
Digital image processing is the use of computer algorithms to
perform image processing on digital images. The 2D continuous image is
divided into N rows and M columns. The intersection of a row and a
column is called a pixel. The image can also be a function of other variables,
including depth, color, and time. An image given in the form of a
transparency, slide, photograph or an X-ray is first digitized and stored as a
matrix of binary digits in computer memory. This digitized image can then
be processed and/or displayed on a high-resolution television monitor. For
display, the image is stored in a rapid-access buffer memory, which
refreshes the monitor at a rate of 25 frames per second to produce a visually
continuous display.
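The row-and-column pixel grid described above can be sketched with a small array. numpy is used here purely for illustration (the report's own implementation is in MATLAB):

```python
import numpy as np

# A digital image as an N x M matrix: N rows, M columns, and one pixel
# at each row/column intersection (8-bit gray levels, 0..255).
N, M = 4, 5
img = np.zeros((N, M), dtype=np.uint8)
img[2, 3] = 128            # the pixel at row 2, column 3

print(img.shape)           # (4, 5)
print(int(img[2, 3]))      # 128
```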
1.2 OVERVIEW ABOUT DIGITAL IMAGE PROCESSING
The field of “Digital Image Processing” refers to processing the digital
images by means of a digital computer. In a broader sense, it can be
considered as a processing of any two dimensional data, where any image
(optical information) is represented as an array of real or complex numbers
represented by a definite number of bits. An image is represented as a two
dimensional function f(x,y), where ‘x’ and ‘y’ are spatial (plane)
coordinates and the amplitude of f at any pair of coordinates (x,y)
represents the intensity or gray level of the image at that point.
A digital image is one for which both the co-ordinates and the
amplitude values of f are all finite, discrete quantities. Hence, a digital
image is composed of a finite number of elements, each of which has a
particular location and value. These elements are called “pixels”. A digital
image is discrete in both spatial coordinates and brightness and it can be
considered as a matrix whose rows and column indices identify a point on
the image and the corresponding matrix element value identifies the gray
level at that point.
One of the first applications of digital images was in the newspaper
industry, when pictures were first sent by submarine cable between London
and New York. Introduction of the Bartlane cable picture transmission
system in the early 1920s reduced the time required to transport a picture
across the Atlantic from more than a week to less than three hours.
1.2.1 PREPROCESSING
In imaging science, image processing is any form of signal
processing for which the input is an image, such as a photograph or video
frame; the output of image processing may be either an image or a set of
characteristics or parameters related to the image. Most image-processing
techniques involve treating the image as a two-dimensional signal and
applying standard signal-processing techniques to it. Image processing
usually refers to digital image processing, but optical and analog image
processing also are possible. This article is about general techniques that
apply to all of them. The acquisition of images (producing the input image
in the first place) is referred to as imaging.
Image processing refers to processing of a 2D picture by a
computer. Basic definitions:
An image defined in the “real world” is considered to be a function
of two real variables, for example, a(x,y) with a as the amplitude (e.g.
brightness) of the image at the real coordinate position (x,y). Modern digital
technology has made it possible to manipulate multi-dimensional signals
with systems that range from simple digital circuits to advanced parallel
computers. The goal of this manipulation can be divided into three
categories:
Image processing (image in -> image out)
Image Analysis (image in -> measurements out)
Image Understanding (image in -> high-level description out)
An image may be considered to contain sub-images sometimes referred
to as regions-of-interest, ROIs, or simply regions. This concept reflects the
fact that images frequently contain collections of objects each of which can
be the basis for a region. In a sophisticated image processing system it
should be possible to apply specific image processing operations to selected
regions. Thus one part of an image (region) might be processed to suppress
motion blur while another part might be processed to improve colour
rendition.
Most usually, image processing systems require that the images be
available in digitized form, that is, arrays of finite length binary words. For
digitization, the given Image is sampled on a discrete grid and each sample
or pixel is quantized using a finite number of bits. The digitized image is
processed by a computer. To display a digital image, it is first converted
into an analog signal, which is scanned onto a display. Closely related to
image processing are computer graphics and computer vision. In computer
graphics, images are manually made from physical models of objects,
environments, and lighting, instead of being acquired (via imaging devices
such as cameras) from natural scenes, as in most animated movies.
Computer vision, on the other hand, is often considered high-level image
processing out of which a machine/computer/software intends to decipher
the physical contents of an image or a sequence of images (e.g., videos or
3D full-body magnetic resonance scans).
In modern sciences and technologies, images also gain a much
broader scope due to the ever-growing importance of scientific
visualization (of often large-scale complex scientific/experimental data).
Examples include microarray data in genetic research, or real-time multi-
asset portfolio trading in finance. Before processing, an image is converted
into a digital form. Digitization includes sampling of the image and
quantization of the sampled values. After converting the image into bit
information, processing is performed. This processing technique may be
Image enhancement, Image restoration, and Image compression.
1.2.2 IMAGE ENHANCEMENT:
It refers to accentuation, or sharpening, of image features such as
boundaries, or contrast to make a graphic display more useful for display &
analysis. This process does not increase the inherent information content in
data. It includes gray level & contrast manipulation, noise reduction, edge
crispening and sharpening, filtering, interpolation and magnification,
pseudo coloring, and so on.
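As a minimal sketch of the gray level & contrast manipulation listed above, here is a linear contrast stretch in Python (numpy assumed; a simplification for illustration, not the report's MATLAB code):

```python
import numpy as np

def contrast_stretch(img, lo=0, hi=255):
    # Linearly map the image's current min..max gray levels onto lo..hi,
    # widening contrast without adding any new information.
    mn, mx = int(img.min()), int(img.max())
    if mx == mn:
        return np.full_like(img, lo)
    out = (img.astype(float) - mn) * (hi - lo) / (mx - mn) + lo
    return out.astype(np.uint8)

img = np.array([[50, 100], [150, 200]], dtype=np.uint8)
print(contrast_stretch(img))   # [[  0  85] [170 255]]
```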
1.2.3 IMAGE RESTORATION:
It is concerned with filtering the observed image to minimize the
effect of degradations. Effectiveness of image restoration depends on the
extent and accuracy of the knowledge of degradation process as well as on
filter design. Image restoration differs from image enhancement in that the
latter is concerned with mere extraction or accentuation of image features.
1.2.4 IMAGE COMPRESSION:
It is concerned with minimizing the number of bits required to represent
an image. Applications of compression are in broadcast TV, remote sensing
via satellite, military communication via aircraft, radar, teleconferencing,
facsimile transmission of educational & business documents, medical
images that arise in computed tomography, magnetic resonance imaging
and digital radiology, motion pictures, satellite images, weather maps,
geological surveys and so on.
Text compression – CCITT GROUP3 & GROUP4
Still image compression – JPEG
Video image compression – MPEG
1.2.5 SEGMENTATION
In computer vision, image segmentation is the process of
partitioning a digital image into multiple segments (sets of pixels, also
known as super pixels). The goal of segmentation is to simplify and/or
change the representation of an image into something that is more
meaningful and easier to analyze. Image segmentation is typically used to
locate objects and boundaries (lines, curves, etc.) in images. More precisely,
image segmentation is the process of assigning a label to every pixel in an
image such that pixels with the same label share certain visual
characteristics.
The result of image segmentation is a set of segments that
collectively cover the entire image, or a set of contours extracted from the
image (see edge detection). Each of the pixels in a region is similar with
respect to some characteristic or computed property, such as
colour, intensity, or texture. Adjacent regions are significantly different
with respect to the same characteristic(s). When applied to a stack of
images, typical in medical imaging, the resulting contours after image
segmentation can be used to create 3D reconstructions with the help of
interpolation algorithms like marching cubes.
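A minimal sketch of the label-assignment view of segmentation, assuming a simple global threshold (real segmentation methods are far more elaborate):

```python
import numpy as np

def threshold_segment(img, t):
    # Assign label 1 to pixels brighter than t and label 0 otherwise, so
    # pixels sharing a label share a visual characteristic (intensity).
    return (img > t).astype(np.uint8)

img = np.array([[10, 200],
                [30, 220]], dtype=np.uint8)
labels = threshold_segment(img, 128)
print(labels)   # [[0 1] [0 1]]
```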
1.2.6 IMAGE RESTORATION
Image restoration like enhancement improves the qualities of image
but all the operations are mainly based on known, measured, or
degradations of the original image. Image restorations are used to restore
images with problems such as geometric distortion, improper focus,
repetitive noise, and camera motion. It is used to correct images for known
degradations.
1.2.7 FUNDAMENTAL STEPS
Image acquisition: to acquire a digital image
Image preprocessing: to improve the image in ways that increases the
chances for success of the other processes.
Image segmentation: to partition an input image into its constituent parts
or objects.
Image representation: to convert the input data to a form suitable for
computer processing.
Image description: to extract features that result in some quantitative
information of interest or features that are basic for differentiating one
class of objects from another.
Image recognition: to assign a label to an object based on the
information provided by its descriptors.
Image interpretation: to assign meaning to an ensemble of recognized
objects.
Knowledge about a problem domain is coded into an image processing
system in the form of a Knowledge database.
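The fundamental steps above can be sketched as a pipeline of stages applied in order; every stage name and field below is a hypothetical placeholder, not the report's actual code:

```python
# A sketch of the fundamental steps as an ordered pipeline.
def acquire():           return {"data": "raw"}
def preprocess(img):     img["preprocessed"] = True;   return img
def segment(img):        img["regions"] = ["object"];  return img
def describe(img):       img["features"] = [0.1, 0.2]; return img
def recognize(img):      img["label"] = "sclera";      return img
def interpret(img):      img["meaning"] = "match";     return img

img = acquire()
for stage in (preprocess, segment, describe, recognize, interpret):
    img = stage(img)     # a knowledge database could guide each stage
print(img["label"])      # sclera
```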
1.3 A SIMPLE IMAGE MODEL
To be suitable for computer processing, an image f(x,y) must be digitized
both spatially and in amplitude.
Digitization of the spatial coordinates (x,y) is called image sampling.
Amplitude digitization is called gray-level quantization.
The storage and processing requirements increase rapidly with the spatial
resolution and the number of gray levels.
Example: A 256 gray-level image of size 256x256 occupies 64K bytes of
memory.
Images of very low spatial resolution produce a checkerboard effect.
The use of an insufficient number of gray levels in smooth areas of a digital
image results in false contouring.
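The storage example above can be checked with a short calculation (the 64K figure follows from 8 bits per pixel for 256 gray levels):

```python
import math

# A 256 gray-level image needs log2(256) = 8 bits per pixel, so a
# 256 x 256 image occupies 256 * 256 * 8 / 8 = 65536 bytes = 64 KB.
rows, cols, gray_levels = 256, 256, 256
bits_per_pixel = math.ceil(math.log2(gray_levels))
total_bytes = rows * cols * bits_per_pixel // 8
print(bits_per_pixel)        # 8
print(total_bytes // 1024)   # 64
```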
1.4 IMAGE FILE FORMATS
There are two general groups of ‘images’: vector graphics (or line art)
and bitmaps (pixel-based or ‘images’). Some of the most common file
formats are:
GIF — Graphics Interchange Format. An 8-bit (256 colour), non-destructively
compressed bitmap format. Mostly used for the web. Has several
sub-standards, one of which is the animated GIF.
JPEG — Joint Photographic Experts Group. A very efficient (i.e. much
information per byte) destructively compressed 24-bit (16 million colours)
bitmap format. Widely used, especially for the web and Internet
(bandwidth-limited).
TIFF — Tagged Image File Format. The standard 24-bit publication bitmap
format. Compresses non-destructively with, for instance, Lempel-Ziv-Welch
(LZW) compression.
PS — Postscript, a standard vector format. Has numerous sub-standards
and can be difficult to transport across platforms and operating systems.
PSD – Adobe PhotoShop Document, a dedicated Photoshop format that
keeps all the information in an image including all the layers.
BMP — Bitmap file format.
1.5 TYPE OF IMAGES
There are four types of images:
1. Binary image.
2. Gray scale image.
3. Color image.
4. Indexed image.
1.5.1 BINARY IMAGES
A binary image is a digital image that has only two possible values for
each pixel. Typically the two colors used for a binary image are black and
white, though any two colors can be used. Binary images are also called
bi-level or two-level, meaning each pixel is stored as a single bit, i.e.,
a 0 or 1; hence the names black-and-white and B&W.
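The one-bit-per-pixel property can be illustrated directly; numpy's `packbits` stores eight binary pixels per byte (illustrative only):

```python
import numpy as np

# A binary image: every pixel is 0 or 1.
binary = np.array([[0, 1, 1],
                   [1, 0, 0]], dtype=np.uint8)

# packbits packs 8 pixels into each byte, matching the
# "single bit per pixel" storage described above.
packed = np.packbits(binary.flatten())
print(packed.nbytes)   # 1 -- one byte holds all six pixels
```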
1.5.2 GRAY SCALE IMAGE
In an 8-bit grayscale image, each picture element has an assigned intensity
that ranges from 0 to 255. A grayscale image is what people normally call
a black-and-white image, but the name emphasizes that such an image will
also include many shades of grey.
1.5.3 COLOR IMAGE
The RGB colour model relates very closely to the way we perceive
colour with the r, g and b receptors in our retinas. RGB uses additive colour
mixing and is the basic colour model used in television or any other
medium that projects colour with light. It is the basic colour model used in
computers and for web graphics, but it cannot be used for print production.
The secondary colours of RGB – cyan, magenta, and yellow – are formed
by mixing two of the primary colours (red, green or blue) and excluding the
third colour. Red and green combine to make yellow, green and blue to
make cyan, and blue and red form magenta. The combination of red, green,
and blue in full intensity makes white.
In Photoshop using the “screen” mode for the different layers in an
image will make the intensities mix together according to the additive
colour mixing model. This is analogous to stacking slide images on top of
each other and shining light through them.
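The additive mixing rules above (red + green = yellow, all three primaries = white) can be checked numerically; this is a simplified per-channel sketch:

```python
import numpy as np

red   = np.array([255, 0, 0])
green = np.array([0, 255, 0])
blue  = np.array([0, 0, 255])

# Additive colour mixing: channel-wise sum, clipped to the 8-bit range.
yellow = np.clip(red + green, 0, 255)
white  = np.clip(red + green + blue, 0, 255)
print(yellow)   # [255 255   0]
print(white)    # [255 255 255]
```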
CMYK — The 4-colour CMYK model used in printing lays down
overlapping layers of varying percentages of transparent cyan (C), magenta
(M) and yellow (Y) inks. In addition a layer of black (K) ink can be added.
The CMYK model uses the subtractive colour model.
1.5.4 INDEXED IMAGE
An indexed image consists of an array and a color map matrix. The
pixel values in the array are direct indices into a color map. By convention,
this documentation uses the variable name X to refer to the array and map
to refer to the color map. In computing, indexed color is a technique to
manage digital images' colors in a limited fashion, in order to save
computer memory and file storage, while speeding up display refresh and
file transfers. It is a form of vector quantization compression.
When an image is encoded in this way, color information is not
directly carried by the image pixel data, but is stored in a separate piece of
data called a palette: an array of color elements, in which every element, a
color, is indexed by its position within the array. The image pixels do not
contain the full specification of their color, but only an index into the palette.
This technique is sometimes referred as pseudocolor or indirect color, as
colors are addressed indirectly.
Perhaps the first device that supported palette colors was a random-
access frame buffer, described in 1975 by Kajiya, Sutherland and Cheadle.
This supported a palette of 256 36-bit RGB colors.
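The index-plus-palette scheme described above can be sketched as follows (the 3-colour palette is an arbitrary example):

```python
import numpy as np

# The palette (color map): each row is one RGB colour.
palette = np.array([[0, 0, 0],       # index 0: black
                    [255, 0, 0],     # index 1: red
                    [0, 0, 255]],    # index 2: blue
                   dtype=np.uint8)

# The index array X: pixels store palette positions, not colours.
X = np.array([[0, 1],
              [2, 1]], dtype=np.uint8)

rgb = palette[X]           # expand every index into its RGB triplet
print(rgb.shape)           # (2, 2, 3)
print(rgb[0, 1].tolist())  # [255, 0, 0]
```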
1.6 APPLICATIONS OF IMAGE PROCESSING
Interest in digital image processing methods stems from two principal
application areas:
1) Improvement of pictorial information for human interpretation.
2) Processing of scene data for autonomous machine perception.
In the second application area, interest focuses on procedures for
extracting, from an image, information in a form suitable for computer
processing.
Examples include automatic character recognition, industrial machine
vision for product assembly and inspection, military reconnaissance,
automatic processing of fingerprints, etc.
1.7 EXISTING SYSTEM:
Crihalmeanu and Ross proposed three approaches: a Speeded-Up Robust
Features (SURF)-based method, minutiae detection, and direct correlation
matching for feature registration and matching. Among these three methods,
the SURF method achieves the best accuracy. It takes an average of 1.5
seconds using the SURF method to perform a one-to-one matching.
Zhou et al. proposed a line-descriptor-based method for sclera vein
recognition. The matching step (including registration) is the most time-
consuming step in this sclera vein recognition system, costing about
1.2 seconds to perform a one-to-one matching. Both speeds were measured
on a PC with an Intel® Core™ 2 Duo 2.4 GHz processor and 4 GB
DRAM. Currently, sclera vein recognition algorithms are designed for
central processing unit (CPU)-based systems.
1.7.1 DISADVANTAGES OF EXISTING SYSTEM
1. Mask files are used to calculate the valid overlapping areas of two sclera
templates and to align the templates to the same coordinate system. But the
mask files are large and will preoccupy the GPU memory and slow down
the data transfer. Also, some of the processing on the mask files involves
convolution, whose performance is difficult to improve on the scalar
processing units in CUDA.
2. The procedure of sclera feature matching consists of a pipeline of several
computational stages with different memory and processing requirements.
There is no uniform mapping scheme applicable to all these stages.
3. When the scale of the sclera database is far larger than the number of
processing units on the GPU, parallel matching on the GPU is still unable to
satisfy the requirement of real-time performance.
1.8 LITERATURE SURVEY
1. S. Crihalmeanu and A. Ross, “Multispectral scleral patterns for
ocular biometric recognition,” Pattern Recognit. Lett., vol. 33, no. 14,
pp. 1860–1869, Oct. 2012.
Face recognition in unconstrained acquisition conditions is one of the
most challenging problems that has been actively researched in recent
years. It is well known that many state-of-the-arts still face recognition
algorithms perform well, when constrained (frontal, well illuminated, high-
resolution, sharp, and full) face images are acquired. However, their
performance degrades significantly when the test images contain variations
that are not present in the training images. In this paper, we highlight some
of the key issues in remote face recognition. We define the remote face
recognition as one where faces are several tens of meters (10-250m) from
the cameras. We then describe a remote face database which has been
acquired in an unconstrained outdoor maritime environment. Recognition
performance of a subset of existing still image-based face recognition
algorithms is evaluated on the remote face data set. Further, we define the
remote re-identification problem as matching a subject at one location with
candidate sets acquired at a different location and over time in remote
conditions. We provide preliminary experimental results on remote re-
identification. It is demonstrated that in addition to applying a good
classification algorithm, finding features that are robust to the variations
mentioned above and developing statistical models which can account for
these variations are very important for remote face recognition.
2. R. N. Rakvic, B. J. Ulis, R. P. Broussard, R. W. Ives, and N. Steiner,
“Parallelizing iris recognition,” IEEE Trans. Inf. Forensics Security.
With the rapidly expanded biometric data collected by various sectors
of government and industry for identification and verification purposes,
how to manage and process such Big Data draws great concern. Even
though modern processors are equipped with more cores and memory
capacity, it still requires careful design in order to utilize the hardware
resource effectively and the power consumption efficiently. This research
addresses this issue by investigating the workload characteristics of
biometric application. Taking Daugman’s iris matching algorithm, which
has been proven to be the most reliable iris matching method, as a case
study, we conduct performance profiling and binary instrumentation on the
benchmark to capture its execution behavior. The results show that data
loading and memory access incurs great performance overhead and
motivates us to move the biometrics computation to high-performance
architecture.
Modern iris recognition algorithms can be computationally intensive,
yet are designed for traditional sequential processing elements, such as a
personal computer. However, a parallel processing alternative using field
programmable gate arrays (FPGAs) offers an opportunity to speed up iris
recognition. Within the scope of this project, iris template generation with
directional filtering, which is a computationally expensive yet parallel
portion of a modern iris recognition algorithm, is parallelized on an FPGA
system. We will present a performance comparison of the parallelized
algorithm on the FPGA system to a traditional CPU-based version. The
parallelized template generation outperforms an optimized C++ code
version determining the information content of an iris approximately 324
times faster.
3. R. Derakhshani, A. Ross, and S. Crihalmeanu, “A new biometric
modality based on conjunctival vasculature,” in Proc. Artif. Neural
Netw. Eng., 2006, pp. 1–8.
A new biometric indicator based on the patterns of conjunctival
vasculature is proposed. Conjunctival vessels can be observed on the visible
part of the sclera that is exposed to the outside world. These vessels
demonstrate rich and specific details in visible light, and can be easily
photographed using a regular digital camera. In this paper we discuss
methods for conjunctival imaging, preprocessing, and feature extraction in
order to derive a suitable conjunctival vascular template for biometric
authentication. Commensurate classification methods along with the
observed accuracy are discussed. Experimental results suggest the potential
of using conjunctival vasculature as a biometric measure. Identification of
a person based on some unique set of features is an important task. The
human identification is possible with several biometric systems and sclera
recognition is one of the promising biometrics. The sclera is the white
portion of the human eye. The vein pattern seen in the sclera region is
unique to each person. Thus, the sclera vein pattern is a well suited
biometric technology for human identification. The existing methods used
for sclera recognition have some drawbacks like only frontal looking
images are preferred for matching and rotation variance is another problem.
These problems are completely eliminated in the proposed system by using
two feature extraction techniques. They are Histogram of Oriented
Gradients (HOG) and converting the image into polar form using the
bilinear interpolation technique. These two features help the proposed
system to become illumination invariant and rotation invariant. The
experimentation is done with the help of UBIRIS database. The
experimental result shows that the proposed sclera recognition method can
achieve better accuracy than the previous methods.
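The gradient-orientation histogram idea underlying HOG, mentioned above, can be sketched in a few lines of numpy; this omits the cell/block division and normalization of a full HOG descriptor and is only an illustration:

```python
import numpy as np

def orientation_histogram(img, bins=9):
    # Finite-difference gradients, then a histogram of gradient
    # orientations weighted by gradient magnitude.
    img = img.astype(float)
    gy, gx = np.gradient(img)                    # row- and column-direction
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180   # unsigned orientations
    hist, _ = np.histogram(ang, bins=bins, range=(0, 180), weights=mag)
    return hist

img = np.tile(np.arange(8), (8, 1))   # horizontal ramp: horizontal gradient only
h = orientation_histogram(img)
print(int(np.argmax(h)))              # 0 -- all weight falls in the 0-degree bin
```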
4. J. D. Owens, M. Houston, D. Luebke, S. Green, J. E. Stone, and J. C.
Phillips, “GPU computing,” Proc. IEEE, vol. 96, no. 5, pp. 879–899,
May 2008.
The graphics processing unit (GPU) has become an integral part of
today’s mainstream computing systems. Over the past six years, there has
been a marked increase in the performance and capabilities of GPUs. The
modern GPU is not only a powerful graphics engine but also a highly
parallel programmable processor featuring peak arithmetic and memory
bandwidth that substantially outpaces its CPU counterpart. The GPU’s
rapid increase in both programmability and capability has spawned a
research community that has successfully mapped a broad range of
computationally demanding, complex problems to the GPU. This effort in
general purpose computing on the GPU, also known as GPU computing,
has positioned the GPU as a compelling alternative to traditional
microprocessors in high-performance computer systems of the future. We
describe the background, hardware, and programming model for GPU
computing, summarize the state of the art in tools and techniques, and
present four GPU computing successes in game physics and computational
biophysics that deliver order-of-magnitude performance gains over
optimized CPU applications.
5. H. Proença and L. A. Alexandre, “UBIRIS: A noisy iris image
database,” in Proc. 13th Int. Conf. Image Anal. Process., 2005,
pp. 970–977.
This paper proposes algorithms for iris segmentation, quality
enhancement, match score fusion, and indexing to improve both the
accuracy and the speed of iris recognition. A curve evolution approach is
proposed to effectively segment a nonideal iris image using the modified
Mumford–Shah functional. Different enhancement algorithms are
concurrently applied on the segmented iris image to produce multiple
enhanced versions of the iris image. A support-vector-machine-based
learning algorithm selects locally enhanced regions from each globally
enhanced image and combines these good-quality regions to create a single
high-quality iris image. Two distinct features are extracted from the high-
quality iris image. The global textural feature is extracted using the 1-D log
polar Gabor transform, and the local topological feature is extracted using
Euler numbers. An intelligent fusion algorithm combines the textural and
topological matching scores to further improve the iris recognition
performance and reduce the false rejection rate, whereas an indexing
algorithm enables fast and accurate iris identification. The verification and
identification performance of the proposed algorithms is validated and
compared with other algorithms using the CASIA Version 3, ICE 2005, and
UBIRIS iris databases.
1.8 PROPOSED METHOD
We propose a new parallel sclera vein recognition method that uses a
two-stage parallel approach for registration and matching. It is a parallel
sclera matching solution for sclera vein recognition, built on our
sequential line-descriptor method using the CUDA GPU architecture.
CUDA is a highly parallel, multithreaded, many-core processor with
tremendous computational power.
It supports not only a traditional graphics pipeline but also computation
on non-graphical data. It is relatively straightforward to port our C
program for CUDA to an AMD-based GPU using OpenCL. Our CUDA
kernels can be directly converted to OpenCL kernels by accounting for the
differences in syntax for various keywords and built-in functions. The
mapping strategy is also effective in OpenCL if we regard the thread and
block in CUDA as the work-item and work-group in OpenCL. Most of our
optimization techniques, such as coalesced memory access and prefix
sum, work in OpenCL too. Moreover, since CUDA is a data-parallel
architecture, the implementation of our approach in OpenCL should be
programmed in a data-parallel model.
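The prefix-sum primitive mentioned above can be illustrated with a short sketch. This is not the report's CUDA code: it is a plain Python emulation of the stride-doubling (Hillis–Steele) scan pattern that data-parallel kernels commonly use, with each while-iteration standing in for one synchronized parallel step.

```python
# Illustrative sketch (not the report's CUDA code): an inclusive prefix sum
# computed with the Hillis-Steele stride-doubling pattern used by
# data-parallel kernels. Each while-iteration corresponds to one
# synchronized parallel pass on a GPU.
def parallel_style_prefix_sum(values):
    out = list(values)
    stride = 1
    while stride < len(out):
        # On a GPU every element would be updated by its own thread;
        # here we emulate one synchronized step with a snapshot copy.
        prev = list(out)
        for i in range(stride, len(out)):
            out[i] = prev[i] + prev[i - stride]
        stride *= 2
    return out

print(parallel_style_prefix_sum([3, 1, 7, 0, 4, 1, 6, 3]))
# -> [3, 4, 11, 11, 15, 16, 22, 25]
```

The same log-step structure is what makes the primitive portable between CUDA and OpenCL work-groups.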
In this research, we first discuss why the naïve parallel approach would
not work. We then propose the new sclera descriptor, the Y-shape sclera
feature-based efficient registration method, to speed up the mapping
scheme; introduce the “weighted polar line (WPL) descriptor”, which is
better suited for parallel computing and mitigates the mask size issue; and
develop our coarse-to-fine two-stage matching process to dramatically
improve the matching speed. These new approaches make the parallel
processing possible and efficient.
1.9 PROPOSED SYSTEM ADVANTAGES
1. To improve the efficiency, in this research we propose a new descriptor,
the Y-shape descriptor, which greatly helps improve the efficiency of the
coarse registration of two images and can be used to filter out some non-
matching pairs before refined matching.
2. We propose the coarse-to-fine two-stage matching process. In the first
stage, we match two images coarsely using the Y-shape descriptors, which
is very fast because no registration is needed. The matching result in this
stage helps filter out image pairs with low similarities.
CHAPTER 2
PROJECT DESCRIPTION
2.1 INTRODUCTION
The sclera is the opaque, white outer layer of the eye. The blood
vessel structure of the sclera is formed randomly and is unique to each
person, which makes it usable for human identification. Several
researchers have designed different sclera vein recognition methods and
have shown that sclera vein recognition is promising for human
identification. Crihalmeanu and Ross proposed three approaches: a
Speeded Up Robust Features (SURF)-based method, minutiae detection,
and direct correlation matching for feature registration and matching.
Among these three methods, the SURF method achieves the best accuracy.
It takes an average of 1.5 seconds using the SURF method to perform a
one-to-one matching. Zhou et al. proposed a line descriptor-based method
for sclera vein recognition. The matching step (including registration) is
the most time-consuming step in this sclera vein recognition system,
costing about 1.2 seconds to perform a one-to-one matching. Both speeds
were measured on a PC with an Intel® Core™ 2 Duo 2.4 GHz processor
and 4 GB DRAM. Currently, sclera vein recognition algorithms are
designed for central processing unit (CPU)-based systems.
As discussed, CPU-based systems are designed as sequential
processing devices, which may not be efficient for data processing where
the data can be parallelized. Because of the large time consumption of the
matching step, sclera vein recognition using a sequential method would be
very challenging to implement in a real-time biometric system, especially
when there is a large number of templates in the database for matching.
GPUs (general-purpose graphics processing units, GPGPUs) are now
popularly used for parallel computing to improve computational
processing speed and efficiency. The highly parallel structure of GPUs
makes them more effective than CPUs for data processing that can be
performed in parallel. GPUs have been widely used in biometric
recognition, such as speech recognition, text detection, handwriting
recognition, and face recognition. In iris recognition, the GPU has been
used to extract features, construct descriptors, and match templates.
GPUs are also used for object retrieval and image search. Park et al.
presented a performance evaluation of image processing algorithms, such
as linear feature extraction and multi-view stereo matching, on GPUs.
However, these approaches were designed for their specific biometric
recognition applications and feature searching methods. Therefore, they
may not be efficient for sclera vein recognition. Compute Unified Device
Architecture (CUDA), the computing engine of NVIDIA GPUs, is used in
this research. CUDA is a highly parallel, multithreaded, many-core
processor with tremendous computational power. It supports not only a
traditional graphics pipeline but also computation on non-graphical data.
More importantly, it offers an easier programming platform that
outperforms its CPU counterparts in terms of peak arithmetic intensity and
memory bandwidth. In this research, the goal is not to develop a unified
strategy to parallelize all sclera matching methods, because each method is
quite different from the others and would need a customized design. To
develop an efficient parallel computing scheme, different strategies would
be needed for different sclera vein recognition methods.
Rather, the goal is to develop a parallel sclera matching solution for
sclera vein recognition based on our sequential line-descriptor method
using the CUDA GPU architecture. However, the parallelization strategies
developed in this research can be applied to design parallel approaches for
other sclera vein recognition methods and can help parallelize general
pattern recognition methods. Based on the matching approach, there are
three challenges in mapping the task of sclera feature matching to the GPU.
1) Mask files are used to calculate the valid overlapping areas of two sclera
templates and to align the templates to the same coordinate system. But the
mask files are large and will occupy the GPU memory and slow down the
data transfer. Also, some of the processing on the mask files involves
convolution, whose performance is difficult to improve on the scalar
processing units of CUDA.
2) The procedure of sclera feature matching consists of a pipeline of several
computational stages with different memory and processing requirements.
There is no uniform mapping scheme applicable for all these stages.
3) When the scale of the sclera database is far larger than the number of
processing units on the GPU, parallel matching on the GPU is still unable to
satisfy the requirement of real-time performance. New designs are
necessary to help narrow down the search range. In summary, a naïve
parallel implementation of the algorithms would not work efficiently.
Note that it is relatively straightforward to port our C program for
CUDA to an AMD-based GPU using OpenCL. Our CUDA kernels can be
directly converted to OpenCL kernels by accounting for the differences in
syntax for various keywords and built-in functions. The mapping strategy is
also effective in OpenCL if we regard the thread and block in CUDA as the
work-item and work-group in OpenCL. Most of our optimization
techniques, such as coalesced memory access and prefix sum, work in
OpenCL too. Moreover, since CUDA is a data-parallel architecture, the
implementation of our approach in OpenCL should be programmed in a
data-parallel model.
In this research, we first discuss why the naïve parallel approach would not
work (Section 3). We then propose the new sclera descriptor, the Y-shape
sclera feature-based efficient registration method, to speed up the mapping
scheme (Section 4); introduce the “weighted polar line (WPL) descriptor”,
which is better suited for parallel computing and mitigates the mask size
issue (Section 5); and develop our coarse-to-fine two-stage matching
process to dramatically improve the matching speed (Section 6). These new
approaches make the parallel processing possible and efficient. However, it
is non-trivial to implement these algorithms in CUDA. We then develop the
implementation schemes to map our algorithms onto CUDA (Section 7). In
Section 2, we give a brief introduction to sclera vein recognition. In
Section 8, we present experiments using the proposed system. In Section 9,
we draw conclusions.
2.2 BACKGROUND OF SCLERA VEIN RECOGNITION
2.2.1 OVERVIEW OF SCLERA VEIN RECOGNITION
A typical sclera vein recognition system includes sclera
segmentation, feature enhancement, feature extraction, and feature
matching (Figure 1).
FIG
Sclera image segmentation is the first step in sclera vein recognition.
Several methods have been designed for sclera segmentation. Crihalmeanu
et al. presented a semi-automated system for sclera segmentation. They
used a clustering algorithm to classify color eye images into three clusters:
sclera, iris, and background. Later on, Crihalmeanu and Ross designed a
segmentation approach based on a normalized sclera index measure, which
includes coarse sclera segmentation, pupil region segmentation, and fine
sclera segmentation. Zhou et al. developed a skin tone plus “white
color”-based voting method for sclera segmentation in color images and an
Otsu’s thresholding-based method for grayscale images. After sclera
segmentation, it is necessary to enhance and extract the sclera features,
since the sclera vein patterns often lack contrast and are hard to detect.
Zhou et al. used a bank of multi-directional Gabor filters for vascular
pattern enhancement. Derakhshani et al. used contrast-limited adaptive
histogram equalization (CLAHE) to enhance the green color plane of the
RGB image, and a multi-scale region-growing approach to identify the
sclera veins from the image background. Crihalmeanu and Ross applied a
selective enhancement filter for blood vessels to extract features from the
green component of a color image. In the feature matching step,
Crihalmeanu and Ross proposed
three registration and matching approaches, including Speeded Up Robust
Features (SURF), which is based on interest-point detection; minutiae
detection, which is based on minutiae points of the vasculature structure;
and direct correlation matching, which relies on image registration. Zhou
et al. designed a line descriptor-based feature registration and matching
method.
The proposed sclera recognition consists of five steps: sclera
segmentation, vein pattern enhancement, feature extraction, feature
matching, and matching decision. Fig. 2 shows the block diagram of sclera
recognition. Two types of feature extraction are used in the proposed
method to achieve good identification accuracy. The characteristics
elicited from the blood vessel structure seen in the sclera region are the
Histogram of Oriented Gradients (HOG) and an interpolated Cartesian-to-
polar conversion. HOG is used to determine the gradient orientation and
edge orientations of the vein pattern in the sclera region of an eye image.
To become more computationally efficient, the image data are converted
to polar form, which is mainly suited to circular or quasi-circular objects.
These two characteristics are extracted from all the images in the database
and compared with the features of the query image to decide whether the
person is correctly identified or not. This procedure is done in the feature
matching step, which ultimately makes the matching decision. By using
the proposed feature extraction methods and matching techniques, human
identification is more accurate than in existing studies. In the proposed
method, two features of an image are extracted.
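The Cartesian-to-polar conversion described above can be sketched as follows; the iris-center coordinates used here are illustrative values, not data from the proposed system.

```python
import math

# Illustrative sketch of the Cartesian-to-polar conversion described above:
# each pixel coordinate is re-expressed as (r, theta) about the iris center.
# The center coordinates below are assumptions, not values from the report.
def to_polar(x, y, cx, cy):
    r = math.hypot(x - cx, y - cy)        # radial distance from the center
    theta = math.atan2(y - cy, x - cx)    # angle about the center
    return r, theta

r, theta = to_polar(13, 14, 10, 10)
print(r, theta)  # r = 5.0 for the 3-4-5 triangle around the center
```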
2.2.2 SCLERA SEGMENTATION
Sclera segmentation is the first step in sclera recognition. It consists
of three steps: glare area detection, sclera area estimation, and iris and
eyelid detection and refinement. Fig. shows the steps of segmentation.
FIG
Glare Area Detection: The glare area is a small bright area near the
pupil or iris; it is an unwanted portion of the eye image. A Sobel filter is
applied to detect the glare area present in the iris or pupil. The filter
operates only on grayscale images: a color image must first be converted
to grayscale before the Sobel filter is applied to detect the glare area.
Fig. 4 shows the result of the glare area detection.
FIG
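The glare-detection step can be sketched as follows. This is an illustrative Python implementation of a 3x3 Sobel gradient on a tiny synthetic grayscale patch; the image values and the response threshold are assumptions, not data from the report.

```python
# Illustrative sketch of glare detection: apply 3x3 Sobel kernels to a
# grayscale patch and flag strong edge responses around a bright blob.
# The synthetic image and the threshold are assumptions.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(gray):
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag

# A dark 5x5 "iris" patch containing one saturated glare pixel.
gray = [[20] * 5 for _ in range(5)]
gray[2][2] = 255
mag = sobel_magnitude(gray)
glare_edges = [(x, y) for y in range(5) for x in range(5) if mag[y][x] > 200]
print(glare_edges)  # edge responses ring the bright glare pixel
```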
Sclera area estimation: For the estimation of the sclera area, Otsu’s
thresholding method is applied. The steps of sclera area detection are:
selection of the region of interest (ROI), Otsu’s thresholding, and sclera
area detection. The left and right sclera areas are selected based on the iris
boundaries. Once the region of interest is selected, Otsu’s thresholding is
applied to obtain the potential sclera areas. The correct left sclera area
should be placed in the right and center positions, and the correct right
sclera area should be placed in the left and center. In this way, non-sclera
areas are eliminated.
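The Otsu's thresholding step used for sclera area estimation can be sketched as follows; the intensity values are synthetic, chosen only to show how the threshold separates dark background pixels from bright sclera pixels.

```python
# Illustrative sketch of Otsu's thresholding for sclera area estimation:
# pick the threshold that maximizes between-class variance. The tiny
# intensity list below is synthetic sample data, not data from the report.
def otsu_threshold(pixels):
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        w1 = total - w0
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Dark background pixels vs. bright sclera pixels separate cleanly.
pixels = [10, 12, 11, 13, 200, 205, 210, 198]
t = otsu_threshold(pixels)
print(t)  # a threshold between the two intensity clusters
sclera_mask = [p > t for p in pixels]
```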
2.2.3 IRIS AND EYELID REFINEMENT
The top and bottom of the sclera regions are the limits of the sclera
area. The upper eyelid, lower eyelid, and iris boundaries are then refined,
since these are all unwanted portions for recognition. To eliminate their
effects, refinement is done after the detection of the sclera area. Fig. shows
the result after Otsu’s thresholding and iris and eyelid refinement to detect
the right sclera area. The left sclera area is detected in the same way.
FIG
In the segmentation process, not all images are perfectly segmented.
Hence, feature extraction and matching are needed to reduce the impact of
segmentation faults. The vein patterns in the sclera area are not clearly
visible after segmentation, so vein pattern enhancement is performed to
make them more visible.
2.2.4 OCULAR SURFACE VASCULATURE
Human recognition using vascular patterns in the human body has
been studied in the context of fingers (Miura et al. 2004), palm (Lin and
Fan 2004) and retina (Hill 1999). In the case of retinal biometrics, a
special optical device for imaging the back of the eyeball is needed (Hill
1999). Due to its perceived invasiveness and the required degree of subject
cooperation, the use of retinal biometrics may not be acceptable to some
individuals. The conjunctiva is a thin, transparent, and moist tissue that
covers the outer surface of the eye. The part of the conjunctiva that covers
the inner lining of the eyelids is called palpebral conjunctiva, and the part
that covers the outer surface of the eye is called ocular (or the bulbar)
conjunctiva, which is the focus of this study. The ocular conjunctiva is very
thin and clear; thus the vasculature (including those of the episclera) is
easily visible through it. The visible microcirculation of conjunctiva offers a
rich and complex network of veins and fine microcirculation (Fig. 1). The
apparent complexity and specificity of these vascular patterns motivated us
to utilize them for personal identification (Derakhshani and Ross 2006).
FIG
We have found conjunctival vasculature to be a suitable biometric as it
conforms to the following criteria (Jain et al. 2004):
UNIVERSALITY: All normal living tissues, including that of the
conjunctiva and episclera, have vascular structure.
UNIQUENESS: Vasculature is created during embryonic vasculogenesis.
Its detailed final structure is mostly stochastic and, thus, unique. Even
though no comprehensive study on the uniqueness of vascular structures
has been conducted, studies of some targeted areas, such as those of the
eye fundus, confirm the uniqueness of such vascular patterns, even
between identical twins (Simon and Goldstein 1935, Tower 1955).
PERMANENCE: Other than cases of significant trauma, pathology, or
chemical intervention, spontaneous adult ocular vasculogenesis and
angiogenesis does not easily occur. Thus, the conjunctival vascular
structure is expected to have reasonable permanence (Joussen 2001).
PRACTICALITY: Conjunctival vasculature can be captured with commercial off
the shelf digital cameras under normal lighting conditions, making this
modality highly practical.
ACCEPTABILITY: Since the subject is not required to stare directly into
the camera lens, and given the possibility of capturing the conjunctival
vasculature from several feet away, this modality is non-intrusive and thus
more acceptable.
SPOOF-PROOFNESS: The fine, multi-surface structure of the ocular
veins makes them hard to reproduce as a physical artifact. Besides being a
stand-alone biometric modality, we anticipate that the addition of
conjunctival biometrics will enhance the performance of current iris-based
biometric systems in the following ways:
Improving accuracy by the addition of vascular features.
Facilitating recognition using off-angle iris images. For instance, if the iris
information is relegated to the left or right portions of the eye, the sclera
vein patterns will be further exposed. This feature makes sclera vasculature
a natural complement to the iris biometric.
Addressing the failure-to-enroll issue when iris patterns are not usable
(e.g., due to surgical procedures).
Reducing vulnerability to spoof attacks. For instance, when implemented
alongside iris systems, an attacker needs to reproduce not only the iris but
also different surfaces of the sclera along with the associated
microcirculation, and make them available on commensurate eye surfaces.
The first step in parallelizing an algorithm is to determine the
opportunities for simultaneous computation. The figure below demonstrates
the possibility of parallel directional filtering. Since the filter is computed
over different portions of the input image, the computation can be
performed in parallel (denoted by Elements below). In addition, individual
parallelization of each element of the filtering can also be performed. A
detailed discussion of our proposed parallelization is outside the scope of
this paper.
FIG
FIG
2.2.5 OVERVIEW OF THE LINE DESCRIPTOR-BASED SCLERA
VEIN RECOGNITION METHOD
The matching stage of the line descriptor-based method is a
bottleneck with regard to matching speed. In this section, we briefly
describe the line descriptor-based sclera vein recognition method. After
segmentation, vein patterns are enhanced by a bank of directional Gabor
filters. Binary morphological operations are used to thin the detected vein
structure down to a single-pixel-wide skeleton and remove the branch
points. The line descriptor is used to describe the segments in the vein
structure. Figure 2 shows a visual description of the line descriptor. Each
segment is described by three quantities: the segment’s angle to some
reference angle at the iris center, θ; the segment’s distance to the iris
center, r; and the dominant angular orientation of the line segment, ɸ.
Thus, the descriptor is S = (θ, r, ɸ)T. The individual components of the
line descriptor are calculated as:
FIG
Here fline(x) is the polynomial approximation of the line segment, (xl, yl)
is the center point of the line segment, (xi, yi) is the center of the detected
iris, and S is the line descriptor. In order to register the segments of the
vascular patterns, a RANSAC-based algorithm is used to estimate the best-
fit parameters for registration between the two sclera vascular patterns.
The registration algorithm randomly chooses two points, one from the test
template and one from the target template. It also randomly chooses a
scaling factor and a rotation value, based on a priori knowledge of the
database. Using these values, it calculates a fitness value for the
registration under these parameters.
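The RANSAC-style registration described above can be sketched as follows. The point sets, the a-priori scale and rotation ranges, and the inlier-counting fitness rule are all illustrative assumptions, not the report's actual parameters.

```python
import math
import random

# Hedged sketch of the RANSAC-style registration described above:
# repeatedly pick one descriptor center from each template plus a random
# scale/rotation, and keep the hypothesis whose transform brings the most
# centers close together. Point sets, parameter ranges, and the fitness
# rule are illustrative assumptions, not the report's actual values.
def fitness(test_pts, target_pts, dx, dy, scale, angle, tol=2.0):
    c, s = math.cos(angle), math.sin(angle)
    score = 0
    for (x, y) in test_pts:
        tx = scale * (c * x - s * y) + dx
        ty = scale * (s * x + c * y) + dy
        if any(math.hypot(tx - px, ty - py) < tol for (px, py) in target_pts):
            score += 1
    return score

def ransac_register(test_pts, target_pts, iters=500, seed=1):
    rng = random.Random(seed)
    best = (0, (0.0, 0.0, 1.0, 0.0))
    for _ in range(iters):
        ax, ay = rng.choice(test_pts)
        bx, by = rng.choice(target_pts)
        scale = rng.uniform(0.9, 1.1)    # assumed a-priori scale range
        angle = rng.uniform(-0.2, 0.2)   # assumed a-priori rotation range
        c, s = math.cos(angle), math.sin(angle)
        dx = bx - scale * (c * ax - s * ay)  # translation mapping a -> b
        dy = by - scale * (s * ax + c * ay)
        f = fitness(test_pts, target_pts, dx, dy, scale, angle)
        if f > best[0]:
            best = (f, (dx, dy, scale, angle))
    return best

target = [(10.0, 10.0), (30.0, 15.0), (22.0, 40.0)]
test = [(x - 5.0, y - 3.0) for (x, y) in target]  # same pattern, shifted
score, params = ransac_register(test, target)
print(score)  # with enough iterations all three centers should align
```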
After sclera template registration, each line segment in the test
template is compared to the line segments in the target template for
matches. In order to reduce the effect of segmentation errors, we created a
weighting image (Figure 3) from the sclera mask by setting interior pixels
of the sclera mask to 1, pixels within some distance of the boundary of the
mask to 0.5, and pixels outside the mask to 0.
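The weighting image construction can be sketched as follows, assuming a small binary mask and a one-pixel border band; both are illustrative.

```python
# Minimal sketch of the weighting image described above: interior mask
# pixels get weight 1, pixels within a small distance of the mask boundary
# get 0.5, and pixels outside the mask get 0. The mask and border distance
# below are illustrative assumptions.
def weighting_image(mask, border=1):
    h, w = len(mask), len(mask[0])

    def near_boundary(x, y):
        for dy in range(-border, border + 1):
            for dx in range(-border, border + 1):
                nx, ny = x + dx, y + dy
                if not (0 <= nx < w and 0 <= ny < h):
                    return True  # image edge treated as boundary
                if not mask[ny][nx]:
                    return True  # adjacent to a non-sclera pixel
        return False

    weights = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                weights[y][x] = 0.5 if near_boundary(x, y) else 1.0
    return weights

mask = [
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
weights = weighting_image(mask)
print(weights)  # only the single fully-interior pixel gets weight 1.0
```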
The matching score for two segment descriptors is calculated as
follows, where Si and Sj are two segment descriptors, m(Si, Sj) is the
matching score between segments Si and Sj, d(Si, Sj) is the Euclidean
distance between the segment descriptors’ center points (from Eqs. 6–8),
Dmatch is the matching distance threshold, and the matching angle
threshold limits the difference in orientation. The total matching score, M,
is the sum of the individual matching scores divided by the maximum
matching score for the minimal set between the test and target templates.
That is, one of the test or target templates has fewer points, and the sum of
its descriptors’ weights sets the maximum score that can be attained.
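The matching rule described above can be sketched as follows. The threshold values, like the descriptor tuples, are illustrative assumptions; the normalization by the smaller template's total weight follows the description of M above.

```python
import math

# Hedged sketch of the segment matching rule described above: two line
# descriptors match when their centers are within Dmatch and their
# orientations within an angle threshold; matched weights are summed and
# normalized by the smaller template's total weight. Thresholds and
# descriptor values (x, y, phi, w) are illustrative assumptions.
D_MATCH = 5.0      # matching distance threshold (assumed)
PHI_MATCH = 0.3    # matching angle threshold in radians (assumed)

def segment_match(si, sj):
    (xi, yi, phi_i, wi), (xj, yj, phi_j, wj) = si, sj
    d = math.hypot(xi - xj, yi - yj)
    if d <= D_MATCH and abs(phi_i - phi_j) <= PHI_MATCH:
        return min(wi, wj)
    return 0.0

def total_score(test, target):
    matched = sum(max(segment_match(si, sj) for sj in target) for si in test)
    # Normalize by the smaller template's total descriptor weight.
    max_score = min(sum(s[3] for s in test), sum(s[3] for s in target))
    return matched / max_score

test = [(10.0, 10.0, 0.10, 1.0), (30.0, 12.0, 1.00, 0.5)]
target = [(11.0, 10.5, 0.15, 1.0), (60.0, 40.0, 2.00, 1.0)]
score_val = total_score(test, target)
print(score_val)  # only the first pair matches -> 1.0 / 1.5
```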
FIG
movement of the eye. Y-shape branches are observed to be a stable feature
and can be used as a sclera feature descriptor. To detect the Y-shape
branches in the original template, we search for the nearest-neighbor set of
every line segment within a regular distance and classify the angles among
these neighbors. If there are two types of angle values in the line segment
set, the set may be inferred to be a Y-shape structure, and the line segment
angles are recorded as a new feature of the sclera.
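The Y-shape branch search can be sketched as follows. The neighborhood radius, the angle-grouping tolerance, and the line segments are illustrative assumptions; the sketch simply flags a neighborhood containing two distinct orientation types as a Y-shape candidate.

```python
import math

# Hedged sketch of the Y-shape search described above: gather line
# segments whose centers fall within a fixed radius, cluster their
# orientations, and flag the neighborhood as a Y branch when the
# orientations fall into more than one angle group. The radius, angle
# tolerance, and segment values are illustrative assumptions.
RADIUS = 10.0
ANGLE_TOL = 0.2  # radians: orientations closer than this count as one type

def angle_groups(angles, tol=ANGLE_TOL):
    groups = []
    for a in sorted(angles):
        if groups and a - groups[-1][-1] <= tol:
            groups[-1].append(a)
        else:
            groups.append([a])
    return groups

def find_y_branches(segments):
    branches = []
    for (x, y, phi) in segments:
        neigh = [s for s in segments
                 if math.hypot(s[0] - x, s[1] - y) <= RADIUS]
        groups = angle_groups([s[2] for s in neigh])
        if len(groups) >= 2:  # two or more orientation types -> Y structure
            branches.append((x, y, [g[0] for g in groups]))
    return branches

# Three nearby segments with two distinct orientations form a Y candidate;
# the far-away segment does not.
segments = [(0.0, 0.0, 0.1), (3.0, 2.0, 0.12), (2.0, 4.0, 1.2),
            (50.0, 50.0, 0.5)]
branches = find_y_branches(segments)
print(branches)
```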
There are two ways to measure both the orientation and the
relationship of every branch of Y-shape vessels: one is to use the angles of
every branch to the x-axis; the other is to use the angles between each
branch and the iris radial direction. The first method needs an additional
rotation operation to align the template. In our approach, we employed the
second method. As Figure 6 shows, ϕ1, ϕ2, and ϕ3 denote the angle
between each branch and the radius from the pupil center. Even when the
head tilts, the eye moves, or the camera zooms during image acquisition,
ϕ1, ϕ2, and ϕ3 are quite stable. To tolerate errors from the pupil center
calculation in the segmentation step, we also record the center position
(x, y) of the Y-shape branches as auxiliary parameters. Our rotation-, shift-,
and scale-invariant feature vector is thus defined as y(ϕ1, ϕ2, ϕ3, x, y).
The Y-shape descriptor is generated with reference to the iris center.
Therefore, it is automatically aligned to the iris center. It is a rotation- and
scale-invariant descriptor.
2.2.6 WPL SCLERA DESCRIPTOR
As we discussed in Section 2.2, the line
descriptor is extracted from the skeleton of vessel structure in binary images
(Figure 7). The skeleton is then broken into smaller segments. For each
segment, a line descriptor is created to record the center and orientation of
the segment. This descriptor is expressed as s(x, y, ɸ), where (x, y) is the
position of the center and ɸ is its orientation. Because of the limited
segmentation accuracy, descriptors at the boundary of the sclera area might
not be accurate and may contain spur edges resulting from the iris, eyelid,
and/or eyelashes. To be tolerant of such errors, the mask file
FIG
The line descriptor of the sclera vessel pattern. (a) An eye image. (b) Vessel
patterns in sclera. (c) Enhanced sclera vessel patterns. (d) Centers of line
segments of vessel patterns.
is designed to indicate whether a line segment belongs to the edge of the
sclera or not. However, in a GPU application, using the mask is
challenging, since the mask files are large and will occupy the GPU
memory and slow down the data transfer. During matching, a RANSAC-
type registration algorithm randomly selects the corresponding descriptors,
and the transform parameters between them are used to generate the
template transform affine matrix. After every template transform, the mask
data must also be transformed, and a new boundary must be calculated to
evaluate the weight of the transformed descriptor. This results in too many
convolutions in the processing unit.
To reduce the heavy data transfer and computation, we designed the
weighted polar line (WPL) descriptor structure, which includes the mask
information and can be automatically aligned. We extract the geometric
relationships of the descriptors and store them as a new descriptor. We use
a weighted image created by setting various weight values according to
position. The weights of descriptors beyond the sclera are set to 0, those
near the sclera boundary to 0.5, and interior descriptors to 1. In our work,
descriptor weights are calculated on their own mask by the CPU, and only
once.
The calculated result is saved as a component of the descriptor. The
sclera descriptor thus becomes s(x, y, ɸ, w), where w denotes the weight
of the point and may take the value 0, 0.5, or 1. To align two templates,
when a template is shifted to another location along the line connecting
their centers, all the descriptors of that template must be transformed. This
is faster if the two templates share a similar reference point. If we use the
center of the iris as the reference point, then when two templates are
compared, the correspondences are automatically aligned to each other,
since they have the same reference point. Every feature vector of the
template is a set of line segment descriptors composed of three variables
(Figure 8): the segment’s angle to the reference line through the iris center,
denoted θ; the distance between the segment’s center and the pupil center,
denoted r; and the dominant angular orientation of the segment, denoted ɸ.
To minimize the GPU computation, we also convert the descriptor values
from polar coordinates to rectangular coordinates during CPU
preprocessing.
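The CPU-side polar-to-rectangular preprocessing can be sketched as follows; the descriptor values are illustrative.

```python
import math

# Sketch of the CPU-side preprocessing described above: each descriptor
# (theta, r, phi, w), defined about the iris center, is also stored with
# its rectangular coordinates so the GPU kernels avoid trigonometry.
# The input values are illustrative.
def to_wpl(theta, r, phi, w):
    x = r * math.cos(theta)
    y = r * math.sin(theta)
    return (x, y, r, theta, phi, w)  # s(x, y, r, theta, phi, w)

desc = to_wpl(theta=math.pi / 3, r=2.0, phi=0.4, w=1.0)
print(desc)  # x = 2*cos(60 deg) = 1.0, y = 2*sin(60 deg) ~ 1.732
```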
The descriptor vector becomes s(x, y, r, θ, ɸ, w). The left and right
parts of the sclera in an eye may have different registration parameters. For
example, as an eyeball moves left, the left-part sclera patterns of the eye
may be compressed while the right-part sclera patterns are stretched.
In parallel matching, these two parts are assigned to threads in
different warps to allow different deformations. The multiprocessor in
CUDA manages threads in groups of 32 parallel threads called warps. We
reorganized the descriptors from the same side and saved
FIG
FIG
them at contiguous addresses. This meets the requirement of coalesced
memory access on the GPU.
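The reorganization by sclera side can be sketched as follows. Real CUDA code would lay the descriptors out in device memory; this Python sketch only shows the grouping into one contiguous buffer with an offset, which is what lets neighboring threads read neighboring addresses.

```python
# Sketch of the descriptor reorganization described above: descriptors are
# grouped by sclera side (left/right) and stored contiguously, so the
# threads of one warp read neighboring addresses (coalesced access on a
# GPU). The descriptors here are illustrative tuples of (side, x, y).
def reorganize_by_side(descriptors):
    left = [d for d in descriptors if d[0] == "left"]
    right = [d for d in descriptors if d[0] == "right"]
    # One contiguous buffer: all left-side entries, then all right-side
    # ones, plus the offset where the right side starts.
    buffer = left + right
    return buffer, len(left)

descriptors = [("right", 5, 6), ("left", 1, 2), ("right", 7, 8),
               ("left", 3, 4)]
buffer, right_offset = reorganize_by_side(descriptors)
print(buffer, right_offset)
# left entries first, then right; right_offset = 2
```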
After reorganizing the structure of the descriptors and adding the mask
information into the new descriptor, computation on the mask file is no
longer needed on the GPU. Matching with this feature is very fast because
the templates do not need to be re-registered every time after shifting.
Thus the cost of data transfer and computation on the GPU is reduced.
Matching on the new descriptor, the shift parameter generator in Figure 4
is simplified as in Figure 9.
2.3 EVOLUTION OF GPU ARCHITECTURE
The fixed-function pipeline lacked the generality to efficiently express
more complicated shading and lighting operations that are essential for
complex effects. The key step was replacing the fixed-function per-vertex
and per-fragment operations with user-specified programs run on each
vertex and fragment. Over the past six years, these vertex programs and
fragment programs have become increasingly capable, with larger limits
on their size and resource consumption, with more fully featured
instruction sets, and with more flexible control-flow operations. After
many years of separate instruction sets for vertex and fragment operations,
current GPUs support the unified Shader Model 4.0 on both vertex and
fragment shaders.
The hardware must support shader programs of at least 65 k static
instructions and unlimited dynamic instructions.
The instruction set, for the first time, supports both 32-bit integers and 32-
bit floating-point numbers.
The hardware must allow an arbitrary number of both direct and indirect
reads from global memory (texture).
Finally, dynamic flow control in the form of loops and branches must be
supported.
As the shader model has evolved and become more powerful, and GPU
applications of all types have increased vertex and fragment program
complexity, GPU architectures have increasingly focused on the
programmable parts of the graphics pipeline. Indeed, while previous
generations of GPUs could best be described as additions of
programmability to a fixed-function pipeline, today’s GPUs are better
characterized as a programmable engine surrounded by supporting fixed-
function units.
GENERAL-PURPOSE COMPUTING ON THE GPU
Mapping general-purpose computation onto the GPU uses the graphics
hardware in much the same way as any standard graphics application.
Because of this similarity, it is both easier and more difficult to explain the
process. On one hand, the actual operations are the same and are easy to
follow; on the other hand, the terminology is different between graphics
and general-purpose use. Harris provides an excellent description of this
mapping process.
We begin by describing GPU programming using graphics terminology,
then show how the same steps are used in a general-purpose way to author
GPGPU applications, and finally use the same steps to show the more
simple and direct way that today’s GPU computing applications are written.
2.3.1 PROGRAMMING A GPU FOR GRAPHICS
We begin with the same GPU pipeline that we described in Section II,
concentrating on the programmable aspects of this pipeline.
The programmer specifies geometry that covers a region on the screen.
The rasterizer generates a fragment at each pixel location covered by that
geometry.
Each fragment is shaded by the fragment program.
The fragment program computes the value of the fragment by a
combination of math operations and global memory reads from a global
“texture” memory.
The resulting image can then be used as texture on future passes through
the graphics pipeline.
2.3.2 PROGRAMMING A GPU FOR GENERAL-PURPOSE
PROGRAMS (OLD):
Co-opting this pipeline to perform general-purpose computation
involves the exact same steps but different terminology. A motivating
example is a fluid simulation computed over a grid: at each time step, we
compute the next state of the fluid for each grid point from the current state
at its grid point and at the grid points of its neighbors.
The programmer specifies a geometric primitive that covers a
computation domain of interest. The rasterizer generates a fragment at each
pixel location covered by that geometry. (In our example, our primitive
must cover a grid of fragments equal to the domain size of our fluid
simulation.)
Each fragment is shaded by an SPMD general-purpose fragment
program. (Each grid point runs the same program to update the state of its
fluid.)
The fragment program computes the value of the fragment by a
combination of math operations and “gather” accesses from global
memory. (Each grid point can access the state of its neighbors from the
previous time step in computing its current value.)
The resulting buffer in global memory can then be used as an input on
future passes. (The current state of the fluid will be used on the next time
step.)
2.3.3 PROGRAMMING A GPU FOR GENERAL-PURPOSE
PROGRAMS (NEW):
One of the historical difficulties in programming GPGPU applications
has been that despite their general-purpose tasks’ having nothing to do with
graphics, the applications still had to be programmed using graphics APIs.
In addition, the program had to be structured in terms of the graphics
pipeline, with the programmable units only accessible as an intermediate
step in that pipeline, when the programmer would almost certainly prefer to
access the programmable units directly. The programming environments we
describe in detail in Section IV are solving this difficulty by providing a
more natural, direct, non-graphics interface to the hardware and,
specifically, the programmable units. Today, GPU computing applications
are structured in the following way.
The programmer directly defines the computation domain of interest as a
structured grid of threads.
An SPMD general-purpose program computes the value of each thread.
The value for each thread is computed by a combination of math
operations and both "gather" (read) accesses from and "scatter" (write)
accesses to global memory. Unlike in the previous two
methods, the same buffer can be used for both reading and writing,
allowing more flexible algorithms (for example, in-place algorithms that
use less memory).
The resulting buffer in global memory can then be used as an input in
future computation.
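The modern model described above can be illustrated with a small sequential emulation. This is a sketch only, not the report's code: the grid of "threads", the kernel name, and the relaxation rule are all illustrative inventions; a real GPU would run the threads concurrently and would need care with read-after-write ordering on the shared buffer.

```python
# Illustrative sketch of the "new" GPU computing model, emulated
# sequentially: a structured grid of threads runs the same SPMD program,
# and each thread may gather (read) from and scatter (write) to the SAME
# global buffer, enabling in-place algorithms.

def spmd_launch(grid_size, kernel, buffer):
    """Emulate launching one thread per grid point (sequentially here)."""
    for tid in range(grid_size):
        kernel(tid, buffer)

def relax_kernel(tid, buf):
    """Each 'thread' averages its value with its right neighbor, in place."""
    right = (tid + 1) % len(buf)               # gather from global memory
    buf[tid] = (buf[tid] + buf[right]) / 2.0   # scatter to the same buffer

state = [0.0, 4.0, 8.0, 4.0]
spmd_launch(len(state), relax_kernel, state)
print(state)
```

Because the emulation is sequential, later "threads" see earlier threads' writes; on real hardware this ordering is not guaranteed, which is exactly why in-place GPU algorithms must be designed carefully.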
2.4 COARSE-TO-FINE TWO-STAGE MATCHING PROCESS
To further improve the matching process, we propose a coarse-to-fine
two-stage matching process. In the first stage, we match two images
coarsely using the Y-shape descriptors, which is very fast because no
registration is needed. The matching result in this stage helps filter out
image pairs with low similarity. After this step, some false positive
matches may still remain. In the second stage, we use the WPL descriptor
to register the two images for more detailed descriptor matching, including
scale and translation invariance. This stage includes the shift transform,
affine matrix generation, and final WPL descriptor matching. Overall, we
partitioned the registration and matching processing into four kernels in
CUDA (Figure 10): matching on the Y-shape descriptor, shift
transformation, affine matrix generation, and final WPL descriptor
matching. Combining these two stages, the matching program runs faster
and achieves a more accurate score.
2.4.1 STAGE I: MATCHING WITH Y-SHAPE DESCRIPTOR
Due to the scale and rotation invariance of the Y-shape features,
registration is unnecessary before matching on the Y-shape descriptor. The
whole matching algorithm is listed as Algorithm 1.
FIG
Here, ytei and ytaj are the Y-shape descriptors of the test template Tte
and the target template Tta, respectively. dϕ is the Euclidean distance of
the angle elements of the descriptor vectors defined in (3). dxy is the
Euclidean distance of two descriptor centers defined in (4). ni and di are
the number of matched descriptor pairs and the distance between their
centers, respectively. tϕ is a distance threshold and txy is the threshold
that restricts the search area. We set tϕ to 30 and txy to 675 in our
experiment.
To match two sclera templates, we search the areas near all the
Y-shape branches. The search area is limited to the corresponding left or
right half of the sclera in order to reduce the search range and time. The
distance of two branches is defined in (3), where ϕij is the angle between
the jth branch and the polar axis from the pupil center in descriptor i.
The number of matched pairs ni and the distance between Y-shape
branch centers di are stored as the matching result. We fuse the number of
matched branches and the average distance between matched branch
centers as in (2). Here, α is a factor to fuse the matching score, which was
set to 30 in our study. Ni and Nj are the total numbers of feature vectors in
templates i and j, respectively. The decision is regulated by the threshold t:
if the sclera's matching score is lower than t, the sclera will be discarded.
A sclera with a high matching score will be passed to the next, more
precise matching process.
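The Stage-I flow can be sketched as follows. This is a hedged illustration, not the report's code: the descriptor layout y = (ϕ1, ϕ2, ϕ3, x, y) and the thresholds tϕ = 30 and txy = 675 follow the text, but the score-fusion step stands in for Eq. (2), whose exact form is not reproduced here, and is purely illustrative.

```python
import math

# Hedged sketch of Stage-I coarse matching on Y-shape descriptors.
# Thresholds T_PHI and T_XY are the values quoted in the text; the fusion
# formula below is an illustrative stand-in for Eq. (2).

T_PHI, T_XY = 30.0, 675.0

def d_phi(a, b):
    """Euclidean distance of the angle elements of two Y descriptors (cf. Eq. 3)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a[:3], b[:3])))

def d_xy(a, b):
    """Euclidean distance of two descriptor centers (cf. Eq. 4)."""
    return math.hypot(a[3] - b[3], a[4] - b[4])

def coarse_match(T_te, T_ta, alpha=30.0):
    matches, dists = 0, []
    for y_te in T_te:
        # restrict the search to nearby target branches (t_xy window)
        for y_ta in T_ta:
            if d_xy(y_te, y_ta) < T_XY and d_phi(y_te, y_ta) < T_PHI:
                matches += 1
                dists.append(d_xy(y_te, y_ta))
                break
    if not matches:
        return 0.0
    # illustrative fusion of match count and mean center distance
    return matches / (1.0 + sum(dists) / (alpha * len(dists)))

te = [(10.0, 20.0, 30.0, 100.0, 100.0)]
ta = [(12.0, 19.0, 31.0, 120.0, 110.0)]
print(coarse_match(te, ta) > 0)
```

A template whose score falls below the threshold t would be discarded at this point, so only plausible pairs reach the expensive Stage-II registration.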
2.4.2 STAGE II: FINE MATCHING USING WPL DESCRIPTOR
The line-segment WPL descriptor reveals more vessel structure detail of
the sclera than the Y-shape descriptor. The variation of the sclera vessel
pattern is nonlinear because:
When acquiring an eye image at a different gaze angle, the vessel structure
appears to shrink or extend nonlinearly because the eyeball is spherical in
shape.
The sclera is made up of four layers: episclera, stroma, lamina fusca, and
endothelium. There are slight differences among the movements of these
layers.
Considering these factors, our registration employs both a single
shift transform and a multi-parameter transform that combines shift,
rotation, and scale together.
1) SHIFT PARAMETER SEARCH: As we discussed before,
segmentation may not be accurate. As a result, the detected iris center may
not be very accurate. The shift transform is designed to tolerate possible
errors in pupil center detection in the segmentation step. If there is no
deformation, or only very minor deformation, registration with the shift
transform alone would be adequate to achieve an accurate result. We
designed Algorithm 2 to obtain the optimized shift parameter. Here, Tte is
the test template and stei is the ith WPL descriptor of Tte; Tta is the target
template and staj is the jth WPL descriptor of Tta. d(stek, staj) is the
Euclidean distance of descriptors stek and staj:
Δsk is the shift value of two descriptors, defined as:
We first randomly select an equal number of segment descriptors
stek in the test template Tte from each quad and find their nearest neighbors
staj in the target template Tta. The shift offset between them is recorded as
a candidate registration shift factor Δsk. The final offset registration factor
is Δsoptim, which has the smallest standard deviation among these
candidate offsets.
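A minimal sketch of this shift search follows. It is an interpretation, not the report's Algorithm 2: "smallest standard deviation" is read here as picking the candidate offset that deviates least from the other candidates (a medoid), and descriptor centers are reduced to plain (x, y) pairs; the sample count and seed are illustrative.

```python
import math, random

# Hedged sketch of the shift parameter search (cf. Algorithm 2).
# Candidate offsets come from nearest-neighbour lookups of randomly
# sampled test descriptors; the most consistent candidate wins.

def nearest(center, targets):
    return min(targets, key=lambda t: math.dist(center, t))

def shift_search(test_centers, target_centers, n_samples=8, seed=0):
    rng = random.Random(seed)
    samples = rng.sample(test_centers, min(n_samples, len(test_centers)))
    # candidate shift Δs_k = nearest-neighbour offset for each sampled descriptor
    candidates = []
    for (x, y) in samples:
        tx, ty = nearest((x, y), target_centers)
        candidates.append((tx - x, ty - y))
    # pick the candidate with the smallest total deviation from the others
    def deviation(c):
        return sum(math.dist(c, o) for o in candidates)
    return min(candidates, key=deviation)

# A target template shifted by (+5, -3): the search should recover that offset.
test = [(10.0, 10.0), (40.0, 25.0), (70.0, 60.0)]
target = [(x + 5.0, y - 3.0) for (x, y) in test]
print(shift_search(test, target))
```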
2) AFFINE TRANSFORM PARAMETER SEARCH
The affine transform is designed to tolerate some deformation of the sclera
patterns in the matching step. The affine transform algorithm is shown in
Algorithm 3. The shift value in the parameter set is obtained by randomly
selecting a descriptor ste(it) and calculating the distance from its nearest
neighbor staj in Tta. We transform the test template by the matrix in (7).
At the end of each iteration, we count the number of matched descriptor
pairs between the transformed template and the target template. The factor
β determines whether a pair of descriptors is matched; we set it to 20
pixels in our experiment. After N iterations, the optimized transform
parameter set is determined by selecting the maximum matching number
m(it). Here, stei, Tte, staj, and Tta are defined as in Algorithm 2.
trshift(it), θ(it), and trscale(it) are the shift, rotation, and scale parameters
generated in the itth iteration. R(θ(it)), T(trshift(it)), and S(trscale(it)) are
the transform matrices defined in (7). To search for the optimized
transform parameters, we iterate N times to generate these parameters. In
our experiment, we set the number of iterations to 512.
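The random parameter search can be sketched as below. This is a hedged illustration of the idea, not the report's Algorithm 3: β = 20 and N = 512 are the values quoted in the text, but the parameter ranges, the point-set descriptors, and the seed are illustrative assumptions, and each iteration here plays the role of one independent GPU thread.

```python
import math, random

# Hedged sketch of the random affine-parameter search (cf. Algorithm 3).
# Each iteration draws a (rotation, scale, shift) triple, transforms the
# test points, and counts matches within BETA pixels; the best triple wins.

BETA = 20.0

def transform(p, theta, scale, shift):
    x, y = p
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * x - s * y) + shift[0],
            scale * (s * x + c * y) + shift[1])

def count_matches(points, targets):
    return sum(1 for p in points
               if min(math.dist(p, t) for t in targets) < BETA)

def affine_search(test_pts, target_pts, n_iter=512, seed=1):
    rng = random.Random(seed)
    best = (-1, None)
    for _ in range(n_iter):
        theta = rng.uniform(-0.2, 0.2)           # small rotation (assumed range)
        scale = rng.uniform(0.9, 1.1)            # small scale change (assumed range)
        shift = (rng.uniform(-30, 30), rng.uniform(-30, 30))
        moved = [transform(p, theta, scale, shift) for p in test_pts]
        m = count_matches(moved, target_pts)
        if m > best[0]:
            best = (m, (theta, scale, shift))
    return best

test = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
target = [(10.0, 5.0), (110.0, 5.0), (10.0, 105.0)]  # test shifted by (10, 5)
m, params = affine_search(test, target)
print(m)
```

On a GPU, the 512 iterations would be distributed across threads, which is why uncorrelated per-thread random numbers (Section 2.5.2) matter.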
3) REGISTRATION AND MATCHING ALGORITHM
Using the optimized parameter set determined from Algorithms 2
and 3, the test template is registered and matched simultaneously. The
registration and matching algorithm is listed in Algorithm 4. Here, stei,
Tte, staj, and Tta are defined as in Algorithms 2 and 3. θ(optm),
trshift(optm), trscale(optm), and Δsoptim are the registration parameters
attained from Algorithms 2 and 3. R(θ(optm))·T(trshift(optm))·S(trscale(optm))
is the descriptor transform matrix defined in Algorithm 3. ϕ is the angle
between the segment descriptor and the radius direction. w is the weight of
the descriptor, which indicates whether the descriptor is at the edge of the
sclera or not. To ensure that the nearest descriptors have a similar
orientation, we use a constant factor α to check the absolute difference of
the two ϕ values. In our experiment, we set α to 5. The total matching
score is the minimal score of the two transformed results divided by the
minimal matching score for the test template and the target template.
2.5. MAPPING THE SUBTASKS TO CUDA
CUDA is a single-instruction multiple-data (SIMD) system that
works as a coprocessor with a CPU. A CUDA device consists of many
streaming multiprocessors (SMs), and the programmer must partition the
parallel part of the program into threads that are mapped onto those
multiprocessors. There are multiple memory spaces in the CUDA memory
hierarchy: registers, local memory, shared memory, global memory,
constant memory, and texture memory. Registers and shared memory are
on-chip and fast to access; local memory, despite its name, resides in
off-chip device memory. Only shared memory can be accessed by other
threads within the same block, and it is available only in limited amounts.
Global memory, constant memory, and texture memory are off-chip
memories accessible by all threads, and accessing them can be very time
consuming.
Constant memory and texture memory are read-only and cacheable
memory. Mapping algorithms to CUDA to achieve efficient processing is
not a trivial task. There are several challenges in CUDA programming:
If threads in a warp take different control paths, all the branches are
executed serially. To improve performance, branch divergence within a
warp should be avoided.
Global memory is slower to access than on-chip memory. To hide this
latency, on-chip memory should be used preferentially; when global
memory accesses do occur, threads in the same warp should access
consecutive words so that the accesses coalesce.
Shared memory is much faster than the local and global memory spaces,
but it is organized into equally sized banks. If two memory requests from
different threads within a warp fall in the same memory bank, the accesses
are serialized. To get maximum performance, memory requests should be
scheduled to minimize bank conflicts.
2.5.1 MAPPING ALGORITHM TO BLOCKS
Because the proposed registration and matching algorithm has four
independent modules, all the modules will be converted to different kernels
on the GPU. These kernels are different in computation density, thus we
map them to the GPU by various map strategies to fully utilize the
computing power of CUDA. Figure 11 shows our scheme of CPU-GPU
task distribution and the partition among blocks and threads. Algorithm 1 is
partitioned into coarse-grained parallel subtasks.
We create a number of threads in this kernel equal to the number of
templates in the database. As the upper middle column of Figure 11 shows,
each target template is assigned to one thread, and each thread compares
one pair of templates. In our work, we use an NVIDIA C2070 as our GPU.
The thread and block counts are each set to 1024, which means we can
match our test template against up to 1024×1024 target templates at the
same time.
Algorithms 2-4 are partitioned into fine-grained subtasks, in which one
thread processes a section of descriptors. As the lower portion of the
middle column of Figure 11 shows, we assign a target template to one
block. Inside a block, one thread corresponds to a set of descriptors in this
template. This partition lets every block execute independently, with no
data exchange required between different blocks. When all threads
complete their corresponding descriptor fractions, the sum of the
intermediate results needs to be computed or compared. A parallel prefix
sum algorithm is used to calculate this sum, as shown on the right of
Figure 11. First, all odd-numbered threads compute the sum of consecutive
pairs of results. Then, recursively, every first of i (= 4, 8, 16, 32, 64, ...)
threads computes the prefix sum on the new results. The final result is
saved at the first address, which has the same variable name as the first
intermediate result.
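The pairwise reduction described above can be sketched sequentially. This is an illustration of the summation pattern only, not the report's kernel: on the GPU every addition at a given stride runs in parallel, whereas the inner loop here runs them one after another.

```python
# Hedged sketch of the pairwise parallel sum described in the text:
# round 1 adds consecutive pairs, each later round doubles the stride
# (i = 4, 8, 16, ...), and the final sum lands at the first address.

def parallel_sum(values):
    vals = list(values)
    n = len(vals)
    stride = 1
    while stride < n:
        # on a GPU, all threads at multiples of 2*stride work concurrently
        for i in range(0, n, 2 * stride):
            if i + stride < n:
                vals[i] += vals[i + stride]
        stride *= 2
    return vals[0]

print(parallel_sum([1, 2, 3, 4, 5, 6, 7, 8]))
```

The log2(n) rounds are what make this reduction attractive on a GPU compared with a sequential n-step accumulation.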
2.5.2 MAPPING INSIDE A BLOCK
In the shift argument search, there are two schemes we can choose to
map the task:
Map one pair of templates to all the threads in a block, so that every
thread takes charge of a fraction of the descriptors and cooperates with
the other threads.
Assign a single possible shift offset to a thread, so that all threads
compute independently except that the final result must be compared with
the other possible offsets.
Due to the great number of sum and synchronization operations in every
nearest-neighbor search step, we choose the second method to parallelize
the shift search. In the affine matrix generator, we map an entire
parameter-set search to a thread: every thread randomly generates a set of
parameters and tries them independently. The generated iterations are
assigned to all threads. The challenge of this step is that the randomly
generated numbers might be correlated among threads. In the rotation and
scale registration step, we used the Mersenne Twister pseudorandom
number generator because it can use bitwise arithmetic and has a long
period.
The Mersenne Twister, like most pseudorandom generators, is iterative;
it is therefore hard to parallelize a single twister state-update step among
several execution threads. To make sure that the thousands of threads in
the launch grid generate uncorrelated random sequences, many
simultaneous Mersenne Twisters need to run with different initial states in
parallel. But even "very different" (by any definition) initial state values
do not prevent the emission of correlated sequences by generators sharing
identical parameters. To solve this problem and to enable efficient
implementation of the Mersenne Twister on parallel architectures, we used
a special offline tool for the dynamic creation of Mersenne Twister
parameters, modified from the algorithm developed by Makoto Matsumoto
and Takuji Nishimura. In the registration and matching step, when
searching for the nearest neighbor, a line segment that has already been
matched should not be used again. In our approach, a flag
variable denoting whether the line has been matched is stored in
shared memory. To share the flags, all the threads in a block would have
to synchronize at every query step. Our solution is to use a single thread
in a block to process the matching.
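The per-thread generator structure can be sketched as follows. This sketch is only structural: Python's standard library exposes a single Mersenne Twister parameter set, so it cannot demonstrate the dynamic parameter creation discussed above; distinct integer seeds stand in for the per-thread parameterizations, with exactly the correlation caveat the text raises.

```python
import random

# Hedged sketch of per-thread random streams: one independent generator
# state per "thread". Unlike the dynamic-creation tool in the text, all
# generators here share the same twister parameters, so distinct seeds do
# NOT guarantee uncorrelated sequences -- this shows the structure only.

def make_thread_rngs(n_threads, base_seed=42):
    # base_seed and the seed offset scheme are illustrative assumptions
    return [random.Random(base_seed + tid) for tid in range(n_threads)]

rngs = make_thread_rngs(4)
draws = [rng.random() for rng in rngs]
print(len(set(draws)))
```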
2.5.3 MEMORY MANAGEMENT
The bandwidth inside the GPU board is much higher than the
bandwidth between host memory and device memory, so data transfer
between host and device can lead to long latency. As shown in Figure 11,
we load the entire target template set from the database without
considering when the templates will be processed; therefore, there is no
data transfer from host to device during the matching procedure. In global
memory, the components in the descriptors y(ϕ1, ϕ2, ϕ3, x, y) and
s(x, y, rθ, ϕ, w) are stored separately. This guarantees that the consecutive
kernels of Algorithms 2 to 4 can access their data at successive addresses.
Although such coalesced access reduces latency, frequent global memory
access was still a slow way to get data, so in our kernels we load the test
template into shared memory to accelerate memory access. Because
Algorithms 2 to 4 execute different numbers of iterations on the same
data, bank conflicts do not occur. To maximize our texture memory space,
we set the system cache to the lowest value and bound our target
descriptors to texture memory. Using this cacheable memory, our data
access was accelerated further.
FIG
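The "components stored separately" layout above is the classic structure-of-arrays choice. The sketch below contrasts it with an array-of-structures layout; the field names and values are illustrative, not the report's data.

```python
# Hedged sketch of the memory-layout choice described in 2.5.3: storing
# descriptor components separately (structure-of-arrays) rather than as
# whole records (array-of-structures), so consecutive "threads" read
# successive addresses -- the access pattern that coalesces on a GPU.

# array-of-structures: the components of one descriptor are interleaved
aos = [(1.0, 2.0, 0.5), (3.0, 4.0, 0.6), (5.0, 6.0, 0.7)]  # (x, y, w)

# structure-of-arrays: each component stored contiguously, as in the text
soa = {
    "x": [d[0] for d in aos],
    "y": [d[1] for d in aos],
    "w": [d[2] for d in aos],
}

# A kernel touching only x now reads one contiguous array instead of
# striding through interleaved records.
xs = soa["x"]
print(xs)
```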
2.6 HISTOGRAM OF ORIENTED GRADIENTS
The histogram of oriented gradients (HOG) is a feature descriptor
primarily applied in the design of target detection. In this paper, it is
applied as the feature for human recognition. In the sclera region, the vein
patterns are the edges of an image, so HOG is used to determine the
gradient orientations and edge orientations of the vein pattern in the sclera
region of an eye image. To carry out this technique, first divide the image
into small connected regions called cells. For each cell, compute the
histogram of gradient directions or edge orientations of the pixels. The
combination of the histograms of the different cells then represents the
descriptor. To improve accuracy, the histograms can be contrast-normalized
by calculating the intensity over a block and then using this value to
normalize all cells within the block. This normalization makes the result
invariant to geometric and photometric changes. The gradient magnitude
m(x, y) and orientation θ(x, y) are calculated using the x- and y-direction
gradients dx(x, y) and dy(x, y).
Orientation binning is the second step of HOG and creates the cell
histograms. Each pixel within a cell casts a weighted vote for the
orientation bin found in the gradient computation, with the gradient
magnitude used as the weight. The cells are rectangular, and the gradient
orientation bins are spread over 0 to 180 degrees, with opposite directions
counted as the same. Fig. 8 depicts the edge orientations of the picture
elements. If the images have any illumination or contrast changes, the
gradient strengths must be locally normalized; for that, cells are grouped
together into larger blocks. These blocks overlap, so that each cell
contributes more than once to the final descriptor. Here rectangular HOG
(R-HOG) blocks are applied, which are mainly square grids. The
performance of HOG is improved by applying a Gaussian window to each
block.
FIG
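The gradient and orientation-binning steps above can be sketched in plain Python. This is a minimal illustration, not the report's implementation: the cell size, bin count, and central-difference gradient are assumed choices, and block normalization is omitted for brevity.

```python
import math

# Hedged sketch of the HOG steps described above: central-difference
# gradients, orientation folded into 0-180 degrees (opposite directions
# count as the same), and magnitude-weighted cell histograms.

def hog_cell_histograms(img, cell=4, bins=9):
    h, w = len(img), len(img[0])
    hists = [[[0.0] * bins for _ in range(w // cell)] for _ in range(h // cell)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx = img[y][x + 1] - img[y][x - 1]
            dy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(dx, dy)
            ang = math.degrees(math.atan2(dy, dx)) % 180.0  # unsigned orientation
            cy, cx = y // cell, x // cell
            if cy < len(hists) and cx < len(hists[0]):
                b = int(ang / (180.0 / bins)) % bins
                hists[cy][cx][b] += mag          # magnitude-weighted vote
    return hists

# vertical edge -> horizontal gradient -> votes in the 0-degree bin
img = [[0.0] * 4 + [10.0] * 4 for _ in range(8)]
hists = hog_cell_histograms(img)
print(hists[0][0][0] > 0)
```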
60. CHAPTER 3
SOFTWARE SPECIFICATION
3.1 GENERAL
MATLAB (matrix laboratory) is a numerical
computing environment and fourth-generation programming language.
Developed by MathWorks, MATLAB allows matrix manipulations,
plotting of functions and data, implementation of algorithms, creation
of user interfaces, and interfacing with programs written in other languages,
including C, C++, Java, and Fortran.
Although MATLAB is intended primarily for numerical computing, an
optional toolbox uses the MuPAD symbolic engine, allowing access
to symbolic computing capabilities. An additional package, Simulink, adds
graphical multi-domain simulation and Model-Based
Design for dynamic and embedded systems.
In 2004, MATLAB had around one million users across industry
and academia. MATLAB users come from various backgrounds
of engineering, science, and economics. MATLAB is widely used in
academic and research institutions as well as industrial enterprises.
MATLAB was first adopted by researchers and practitioners
in control engineering, Little's specialty, but quickly spread to many other
domains. It is now also used in education, in particular the teaching
of linear algebra and numerical analysis, and is popular amongst scientists
involved in image processing. The MATLAB application is built around the
MATLAB language. The simplest way to execute MATLAB code is to type
it in the Command Window, which is one of the elements of the MATLAB
Desktop. When code is entered in the Command Window, MATLAB can
be used as an interactive mathematical shell. Sequences of commands can
be saved in a text file, typically using the MATLAB Editor, as a script or
encapsulated into a function, extending the commands available.
MATLAB provides a number of features for documenting and
sharing your work. You can integrate your MATLAB code with other
languages and applications, and distribute your MATLAB algorithms and
applications.
3.2 FEATURES OF MATLAB
High-level language for technical computing.
Development environment for managing code, files, and data.
Interactive tools for iterative exploration, design, and problem solving.
Mathematical functions for linear algebra, statistics, Fourier analysis,
filtering, optimization, and numerical integration.
2-D and 3-D graphics functions for visualizing data.
Tools for building custom graphical user interfaces.
Functions for integrating MATLAB based algorithms with external
applications and languages, such as C, C++, FORTRAN, Java™, COM,
and Microsoft Excel.
MATLAB is used in a vast range of areas, including signal and image
processing, communications, control design, test and measurement,
financial modeling and analysis, and computational biology. Add-on
toolboxes (collections of special-purpose MATLAB functions) extend the
MATLAB environment to solve particular classes of problems in these
application areas.
MATLAB can be used on personal computers and powerful
server systems, including the Cheaha compute cluster. With the addition of
the Parallel Computing Toolbox, the language can be extended with parallel
implementations for common computational functions, including for-loop
unrolling. Additionally, this toolbox supports offloading computationally
intensive workloads to Cheaha, the campus compute cluster. MATLAB is
one of a few languages in which each variable is a matrix (broadly
construed) and "knows" how big it is. Moreover, the fundamental operators
(e.g., addition, multiplication) are programmed to deal with matrices when
required, and the MATLAB environment handles much of the bothersome
housekeeping that makes all this possible. Since so many of the procedures
required for macro-investment analysis involve matrices, MATLAB
proves to be an extremely efficient language for both communication and
implementation.
3.2.1 INTERFACING WITH OTHER LANGUAGES
MATLAB can call functions and subroutines written in the C
programming language or FORTRAN. A wrapper function is created
allowing MATLAB data types to be passed and returned. The dynamically
loadable object files created by compiling such functions are termed "MEX-
files" (for MATLAB executable).
Libraries written in Java, ActiveX, or .NET can be directly called
from MATLAB, and many MATLAB libraries (for
example XML or SQL support) are implemented as wrappers around Java
or ActiveX libraries. Calling MATLAB from Java is more complicated, but
can be done with a MATLAB extension, which is sold separately by
MathWorks, or by using an undocumented mechanism called JMI
(Java-to-MATLAB Interface), which should not be confused with the
unrelated Java Metadata Interface that is also called JMI.
As alternatives to the MuPAD-based Symbolic Math Toolbox
available from MathWorks, MATLAB can be connected
to Maple or Mathematica.
Libraries also exist to import and export MathML.
Development Environment
Startup Accelerator for faster MATLAB startup on Windows, especially on
Windows XP, and for network installations.
Spreadsheet Import Tool that provides more options for selecting and
loading mixed textual and numeric data.
Readability and navigation improvements to warning and error messages in
the MATLAB command window.
Automatic variable and function renaming in the MATLAB Editor.
Developing Algorithms and Applications
MATLAB provides a high-level language and development
tools that let you quickly develop and analyze your algorithms and
applications.
The MATLAB Language
The MATLAB language supports the vector and matrix operations
that are fundamental to engineering and scientific problems. It enables fast
development and execution. With the MATLAB language, you can
program and develop algorithms faster than with traditional languages
because you do not need to perform low-level administrative tasks, such as
declaring variables, specifying data types, and allocating memory. In many
cases, MATLAB eliminates the need for ‘for’ loops. As a result, one line of
MATLAB code can often replace several lines of C or C++ code.
At the same time, MATLAB provides all the features of a traditional
programming language, including arithmetic operators, flow control, data
structures, data types, object-oriented programming (OOP), and debugging
features.
MATLAB lets you execute commands or groups of commands one
at a time, without compiling and linking, enabling you to quickly iterate to
the optimal solution. For fast execution of heavy matrix and vector
computations, MATLAB uses processor-optimized libraries. For general-
purpose scalar computations, MATLAB generates machine-code
instructions using its JIT (Just-In-Time) compilation technology.
This technology, which is available on most platforms, provides
execution speeds that rival those of traditional programming languages.
Development Tools
MATLAB includes development tools that help you implement
your algorithm efficiently. These include the following:
MATLAB Editor
Provides standard editing and debugging features, such as setting
breakpoints and single stepping
Code Analyzer
Checks your code for problems and recommends modifications to
maximize performance and maintainability
MATLAB Profiler
Records the time spent executing each line of code
Directory Reports
Scan all the files in a directory and report on code efficiency, file
differences, file dependencies, and code coverage
Designing Graphical User Interfaces
Use the interactive tool GUIDE (Graphical User Interface
Development Environment) to lay out, design, and edit user interfaces.
GUIDE lets you include list boxes, pull-down menus, push buttons, radio
buttons, and sliders, as well as MATLAB plots and Microsoft
ActiveX® controls. Alternatively, you can create GUIs programmatically
using MATLAB functions.
3.2.2 ANALYZING AND ACCESSING DATA
MATLAB supports the entire data analysis process, from acquiring
data from external devices and databases, through preprocessing,
visualization, and numerical analysis, to producing presentation-quality
output.
Data Analysis
MATLAB provides interactive tools and command-line functions for data
analysis operations, including:
Interpolating and decimating
Extracting sections of data, scaling, and averaging
Thresholding and smoothing
Correlation, Fourier analysis, and filtering
1-D peak, valley, and zero finding
Basic statistics and curve fitting
Matrix analysis
Data Access
MATLAB is an efficient platform for accessing data from
files, other applications, databases, and external devices. You can read data
from popular file formats, such as Microsoft Excel; ASCII text or binary
files; image, sound, and video files; and scientific files, such as HDF and
HDF5. Low-level binary file I/O functions let you work with data files in
any format. Additional functions let you read data from Web pages and
XML.
Visualizing Data
All the graphics features that are required to visualize engineering
and scientific data are available in MATLAB. These include 2-D and 3-D
plotting functions, 3-D volume visualization functions, tools for
interactively creating plots, and the ability to export results to all popular
graphics formats. You can customize plots by adding multiple axes;
changing line colors and markers; adding annotation, Latex equations, and
legends; and drawing shapes.
2-D Plotting
Visualizing vectors of data with 2-D plotting functions that create:
Line, area, bar, and pie charts.
Direction and velocity plots.
Histograms.
Polygons and surfaces.
Scatter/bubble plots.
Animations.
3-D Plotting and Volume Visualization
MATLAB provides functions for visualizing 2-D matrices, 3-D
scalar data, and 3-D vector data. You can use these functions to visualize
and understand large, often complex, multidimensional data, specifying
plot characteristics such as camera viewing angle, perspective, lighting
effect, light source locations, and transparency.
3-D plotting functions include:
Surface, contour, and mesh.
Image plots.
Cone, slice, stream, and isosurface.
3.2.3 PERFORMING NUMERIC COMPUTATION
MATLAB contains mathematical, statistical, and engineering
functions to support all common engineering and science operations. These
functions, developed by experts in mathematics, are the foundation of the
MATLAB language. The core math functions use the LAPACK and BLAS
linear algebra subroutine libraries and the FFTW Discrete Fourier
Transform library. Because these processor-dependent libraries are
optimized to the different platforms that MATLAB supports, they execute
faster than the equivalent C or C++ code.
MATLAB provides the following types of functions for performing
mathematical operations and analyzing data:
Matrix manipulation and linear algebra.
Polynomials and interpolation.
Fourier analysis and filtering.
Data analysis and statistics.
Optimization and numerical integration.
Ordinary differential equations (ODEs).
Partial differential equations (PDEs).
Sparse matrix operations.
MATLAB can perform arithmetic on a wide range of data types,
including doubles, singles, and integers.
CHAPTER 4
IMPLEMENTATION
4.1 GENERAL
MATLAB is a program that was originally designed to simplify the
implementation of numerical linear algebra routines. It has since grown into
something much bigger, and it is used to implement numerical algorithms
for a wide range of applications. The basic language used is very similar to
standard linear algebra notation, but there are a few extensions that will
likely cause you some problems at first.
4.2 SNAPSHOTS
ORIGINAL SCLERA IMAGE IS CONVERTED INTO GREY
SCALE IMAGE:
FIG
GREY SCALE IMAGE IS CONVERTED INTO BINARY IMAGE:
FIG
EDGE DETECTION IS DONE BY OTSU'S THRESHOLDING:
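The thresholding step named above can be sketched in pure Python. This is a hedged illustration of Otsu's method on a toy pixel list, not the project's MATLAB code: the sample values and the 256-level histogram are assumptions for the example.

```python
# Hedged sketch of Otsu's thresholding: choose the grey level that
# maximizes the between-class variance of the histogram, then binarize.

def otsu_threshold(pixels, levels=256):
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(levels):
        w0 += hist[t]                  # background (class 0) pixel count
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        w1 = total - w0                # foreground (class 1) pixel count
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# two clusters of grey levels: the threshold should fall between them
pixels = [10, 12, 11, 200, 210, 205, 9, 198]
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]
print(t, binary)
```

The same idea, applied per pixel to the grey-scale eye image, yields the binary vessel map shown in the snapshots.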
CHAPTER 5
APPLICATIONS
The applications of biometrics can be divided into the following
three main groups:
Commercial applications such as computer network login,
electronic data security, e-commerce, Internet access, ATMs, credit cards,
physical access control, cellular phones, PDAs, medical records
management, distance learning, etc.
Government applications such as national ID cards, correctional
facilities, driver's licenses, social security, welfare disbursement, border
control, passport control, etc.
Forensic applications such as corpse identification, criminal
investigation, terrorist identification, parenthood determination, missing
children, etc.
Traditionally, commercial applications have used knowledge-based
systems (e.g., PINs and passwords), government applications have used
token-based systems (e.g., ID cards and badges), and forensic applications
have relied on human experts to match biometric features.
Biometric systems are being increasingly deployed in large-scale civilian
applications. The Schiphol Premium scheme at the Amsterdam airport, for
example, employs iris-scan cards to speed up the passport and visa control
procedures.
CHAPTER 6
CONCLUSION AND FUTURE SCOPE
6.1 CONCLUSION
In this paper, we proposed a new parallel sclera vein recognition
method, which employs a two-stage parallel approach for registration and
matching. Even though the research focused on developing a parallel sclera
matching solution for the sequential line-descriptor method using the
CUDA GPU architecture, the parallel strategies developed in this research
can be applied to design parallel solutions to other sclera vein recognition
methods and to general pattern recognition methods. We designed the
Y-shape descriptor to narrow the search range and increase the matching
efficiency, which is a new feature extraction method that takes advantage
of the GPU structures. We developed the WPL descriptor to incorporate
mask information and make it more suitable for parallel computing, which
can dramatically reduce data transfer and computation. We then carefully
mapped our algorithms to GPU threads and blocks, which is an important
step to achieve parallel computation efficiency using a GPU. A work flow
with high arithmetic intensity to hide the memory access latency was
designed to partition the computation task across the heterogeneous system
of CPU and GPU, and even across the threads in the GPU. The proposed
method dramatically improves the matching efficiency without
compromising recognition accuracy.
6.2 REFERENCES
[1] C. W. Oyster, The Human Eye: Structure and Function. Sunderland:
Sinauer Associates, 1999.
[2] C. Cuevas, D. Berjon, F. Moran, and N. Garcia, “Moving object
detection for real-time augmented reality applications in a GPGPU,” IEEE
Trans. Consum. Electron., vol. 58, no. 1, pp. 117–125, Feb. 2012.
[3] D. C. Cireşan, U. Meier, L. M. Gambardella, and J. Schmidhuber, “Deep,
big, simple neural nets for handwritten digit recognition,” Neural Comput.,
vol. 22, no. 12, pp. 3207–3220, 2010.
[4] F. Z. Sakr, M. Taher, and A. M. Wahba, “High performance iris
recognition system on GPU,” in Proc. ICCES, 2011, pp. 237–242.
[5] G. Poli, J. H. Saito, J. F. Mari, and M. R. Zorzan, “Processing
neocognitron of face recognition on high performance environment based
on GPU with CUDA architecture,” in Proc. 20th Int. Symp. Comput.
Archit. High Perform. Comput., 2008, pp. 81–88.
[6] J. Antikainen, J. Havel, R. Josth, A. Herout, P. Zemcik, and M. Hauta-
Kasari, “Nonnegative tensor factorization accelerated using GPGPU,” IEEE
Trans. Parallel Distrib. Syst., vol. 22, no. 7, pp. 1135–1141, Feb. 2011.
[7] K.-S. Oh and K. Jung, “GPU implementation of neural networks,”
Pattern Recognit., vol. 37, no. 6, pp. 1311–1314, 2004.
[8] P. R. Dixon, T. Oonishi, and S. Furui, “Harnessing graphics processors
for the fast computation of acoustic likelihoods in speech recognition,”
Comput. Speech Lang., vol. 23, no. 4, pp. 510–526, 2009.
[9] P. Kaufman and A. Alm, “Clinical application,” Adler’s Physiology of
the Eye, 2003.
[10] R. N. Rakvic, B. J. Ulis, R. P. Broussard, R. W. Ives, and N. Steiner,
“Parallelizing iris recognition,” IEEE Trans. Inf. Forensics Security, vol. 4,
no. 4, pp. 812–823, Dec. 2009.
[11] S. Crihalmeanu and A. Ross, “Multispectral scleral patterns for ocular
biometric recognition,” Pattern Recognit. Lett., vol. 33, no. 14, pp. 1860–
1869, Oct. 2012.
[12] W. Wenying, Z. Dongming, Z. Yongdong, L. Jintao, and G.
Xiaoguang, “Robust spatial matching for object retrieval and its parallel
implementation on GPU,” IEEE Trans. Multimedia, vol. 13, no. 6, pp.
1308–1318, Dec. 2011.
[13] Y. Xu, S. Deka, and R. Righetti, “A hybrid CPU-GPGPU approach for
real-time elastography,” IEEE Trans. Ultrason., Ferroelectr. Freq. Control,
vol. 58, no. 12, pp. 2631–2645, Dec. 2011.
[14] Z. Zhou, E. Y. Du, N. L. Thomas, and E. J. Delp, “A comprehensive
multimodal eye recognition,” Signal, Image Video Process., vol. 7, no. 4,
pp. 619–631, Jul. 2013.
[15] Z. Zhou, E. Y. Du, N. L. Thomas, and E. J. Delp, “A comprehensive
approach for sclera image quality measure,” Int. J. Biometrics, vol. 5, no. 2,
pp. 181–198, 2013.
[16] Z. Zhou, E. Y. Du, N. L. Thomas, and E. J. Delp, “A new human
identification method: Sclera recognition,” IEEE Trans. Syst., Man,
Cybern. A, Syst., Humans, vol. 42, no. 3, pp. 571–583, May 2012.