SlideShare une entreprise Scribd logo
1  sur  8
Télécharger pour lire hors ligne
VISUAL SALIENCY MODEL USING SIFT
AND COMPARISON OF LEARNING
APPROACHES
Hamdi Yalın Yalıç1
1

Department of Computer Engineering, Hacettepe University, Ankara, Turkey
yalinyalic@cs.hacettepe.edu.tr

ABSTRACT
Humans' ability to detect and locate salient objects on images is remarkably fast and successful.
Performing this process by using eye tracking equipment is expensive and cannot be easily
applied, and computer modeling of this human behavior is still a problem to be solved. In our
study, one of the largest public eye-tracking databases [1] which has fixation points of 15
observers on 1003 images is used. In addition to low, medium and high-level features which
have been used in previous studies, SIFT features extracted from the images are used to
improve the classification accuracy of the models. A second contribution of this paper is the
comparison and statistical analysis of different machine learning methods that can be used to
train our model. As a result, a best feature set and learning model to predict where humans look
at images, is determined.

KEYWORDS
Image Processing, Computer Vision, Machine Learning Methods, Saliency Map, Eye-Tracking

1. INTRODUCTION
Understanding the region where people look on a scene can be useful in many applications such
as graphics, design, advertising and human-computer interaction. For example, in nonphotorealistic rendering, different levels of details are proposed for different areas of the picture
[2]. In this study, regions where people focus are processed in high-resolution, whilst for the other
regions, low-resolution processing is performed. Auto-crop of pictures, thumbnails or previews
from photos, relate partial displays of images on small screen mobile devices are examples of
other fields of application.
Hardware solutions are also available for finding the region where people look at a scene. An
observer sitting in front of a screen with an eye-tracking device has all the spots recorded at
which he is looking at a scene. Special setups and multiple calibration steps are needed for
working these tracking devices. Also this solution is relatively expensive and cannot be easily
obtained. For this reason, the prediction of where people are looking is required independently
from such tracking devices. As a solution, computer models are generated that calculate the
salient point which attracts an observer's attention. These models analyze an image and extract a
saliency map. One of the former studies is the model developed by Itti and Koch [3] and uses
biologically inspired features like orientation, intensity and color. For each of the features, a
saliency map is computed and combined into a single saliency map result which describes the
David C. Wyld et al. (Eds) : CCSIT, SIPP, AISC, PDCTA, NLP - 2014
pp. 275–282, 2014. © CS & IT-CSCP 2014

DOI : 10.5121/csit.2014.4223
276

Computer Science & Information Technology (CS & IT)

saliency of each pixel. Hou and Zhang [4] proposed a model which is independent of features or
other forms of prior knowledge of objects. They analyzed the log spectrum of an image, extracted
the spectral residual of an image in spectral domain, and proposed a method to construct the
saliency map in spatial domain. Bruce and Tsotsos [5] presents a visual saliency model based on a
first principle information theoretic formulation, named as Attention Based on Information
Maximization (AIM) which performs better than the Itti model. Avraham et al. [6] uses a
probabilistic model to mathematically estimate the most probable targets. Cerf at al. [7] improved
on Itti's model by adding face detection. The most important characteristic of these studies is that
models are derived mathematically, and not trained by some large eye-tracking dataset.
Kienze et al. [8] propose a model that learns saliency directly from human eye movement data.
They only use low level features like previous studies. But the study by Ehinger et al. [9] shows
that observers trying to search for pedestrians in a scene leads to a model combined of three
sources: low-level features, target features and scene context. As a result of this study, while
predicting human behavior, besides low-level features, high-level features which contain semantic
information about the content of the scene must be used.
Our study is most closely related to the saliency model proposed by Judd et al. [1]. They used
high and mid-level features as well as low levels. Another important contribution of their study is
that they train the model using a supervised learning method (support vector machine) and
observed positive contributions of the features in the result. To verify this, they needed real eyetracking data and they recorded viewer's eye fixations on images containing a large amount of
data. They published this database for further studies and in our study, this publicly available
database was used. Results of the different machine learning methods was compared and analyzed
in order to obtain the most accurate model. In addition to the work of others, SIFT [17] feature is
added to the feature set to improve the accuracy of the model.
In the following sections of this paper, dataset and data gathering protocol is analyzed, machine
learning methods will be briefly explained and the feature set that was used to train the model is
mentioned. Performance of different learning methods are compared and analyzed in the
experimental results. Future work will also be discussed in the conclusion.

2. DATASET AND LEARNING METHODS
2.1. Dataset Analysis
Large ground truth data is required in the detection of where humans look in a scene. One of the
largest studies in this field is the experiment of Judd et al. [1] practiced with 15 viewers using
1003 images. They collected 1003 random images from Flickr and LabelMe [10] (Figure 2) and
full resolution images were displayed for 3 seconds to viewers in a dark room using a chin rest for
a head stabilizer. An eye-tracker in front of them recorded their gaze path and fixation points
(Figures 3 and 4). In order to guarantee a high-quality result, the camera’s calibration was
checked every 50 images. They provided a memory test at the end of each display to motivate and
encourage users into paying attention, and asked them which of the images they had seen
previously.
When we analyze the dataset, we noticed that viewers primarily looked at the living things. In the
close up images, organs like the eye, nose and mouth were the most fixated points. But when
people were located that bit further away, viewers searched for faces and limbs, such as a hand, or
an arm. If a human does not exist in the scene, animals can also be said to attract the attention of
people. Other important salient regions are objects like signs and boards which contain text. The
Computer Science & Information Technology (CS & IT)

277

fixations in the dataset have a bias towards the center because of the fact that photographers tend
to place objects in the center of the scene.

2.2. Machine Learning Methods
In this study, different supervised learning methods are used while discovering the model of
visual saliency and classification results of them are compared. The methods used are explained
briefly in the section below.
2.2.1. SVM (Support Vector Machines)
This method tries to find the linear discriminant function (classifier) with the maximum margin
between two classes. It is robust to outliners thus has strong generalization ability. With the help
of the kernel function, which implicitly maps data to high-dimensional space, classification
accuracy will increase. The main advantage of the method is its success with high-dimensional
data.
2.2.2. C4.5 (Decision Tree)
This is an algorithm that constructs a simple depth-first decision tree. It uses information gain
with feature selection and it is easy to implement. Classification of unknown records is
considerably fast, but it is not suitable for large datasets because it needs to fit the entire data into
memory.
2.2.3. K-Nearest Neighbor
This method is based on the principle of the nearest k-neighbor of the new data. Different
computations can be used as a distance metric and the advantage of this classifier is that it doesn't
require any training time. But testing time can be considerable and classifying unknown records is
relatively expensive. It is also easily fooled in high dimensional spaces.
2.2.4. Naive Bayes
This is a statistical classifier which applies Bayes theorem. It assumes that attributes are
independent from each other and is robust to isolated noise points and irrelevant attributes.
However independence assumption may not hold for some attributes and it can cause a loss of
accuracy.
2.2.5. Adaboost
It is a combination of classifiers that is constructed from the training data. It adaptively changes
the distribution of training data with each iteration by focusing more on previously misclassified
records. It can also be used to increase the classification accuracy of other methods.

3. LEARNING OF THE VISUAL SALIENCY MODEL
3.1. Features Used in Learning
According to the results obtained by the analysis of the dataset, the features used in this study are
listed below. For all images in the database, the features are extracted for each pixel and used for
training of the model. MATLAB was used in image processing and feature extraction.
278

Computer Science & Information Technology (CS & IT)

3.1.1. Low-level Features
•

•

•
•

Local energy of steerable pyramid filters [13] are used because they are physiologically
conceivable and associated with visual attention. Pyramid sub-bands were computed in 4
orientations and 3 scales.
Intensity, orientation and color contrast were considered as important features for
saliency regions. 3 channels corresponding to these features were computed as described
in the Itti and Koch [3] model.
In addition to red, green and blue color channel values, and the probabilities of these
channels, were also used.
The probability of each color in a 3D color histogram was also a basic feature which was
computed by the images filtered using median filter at 6 different scales.

3.1.2. Mid-level Features
Humans inherently look at the horizon line because most of the objects are on the earth's surface.
Gist feature [14] was used to detect the horizon line.
3.1.3. High-level Features
Humans mostly looked at people and their faces as understood by the analysis of the database, the
following high-level features were extracted:
•
•

Viola Jones face detector [15],
Felzenszwalb person and car detector [16].

3.1.4. Central Priority
Humans generally place the object of interest near the center of the image while taking pictures.
For this reason, a feature which specifies the distance to the center for each pixel is used.
3.1.5. SIFT Keypoints
For objects in a scene, the interesting points on the object can be extracted to provide a feature
description of them. This description can be used to identify the object in a test image. The SIFT
[17] method can be used to detect these interesting points in an image. In this paper, the keypoint
localization step of the method is used. Low-contrast keypoints are discarded and locations of the
more interesting keypoints, which can be attractive to the human eye, are used. One contribution
of this paper is the using of the position of interest points on an image as a feature and increasing
the classification rate. Other local features, like SURF and GLOH were also experimented with,
but since SIFT had a better effect on the model, it was SIFT that we decided to go with.

3.2. Training Phase
While training and testing the saliency model, machine learning methods described in Section 2.2
were used. Images in the database were divided up as 80% training and 20% testing. That means
that 803 images were used for training and 201 for testing. 10 positively labeled pixels from the
most 5% salient locations and 10 negatively labeled pixels from the least 30% salient locations
were chosen from each image. Thus the examples are guaranteed to be both strongly positive and
strongly negative. Any samples on the boundary between the two regions, and samples within 10
pixels of the image boundary were not selected. It is noted that selecting more than 10 samples as
positive and negative did not increase the classification rate and contained redundant information.
Computer Science & Information Technology (CS & IT)

279

The dataset used to train the model consisted of 16,040 samples; the test dataset consisted of
4,020 samples. Every sample (point) has 34 continuous attribute and labeled as positive or
negative. Obtained data was given as an input to the described learning methods for training and a
more accurate model will be discussed. Details of the parameters used in the classifier training are
explained below:
•

In the model that uses SVM, all data were normalized and radial basis function (1) is used
for kernel function (other kernels produce results similar to radial). γ value is assigned to
0.8. Cost of misclassification didn't affect the result, so c is assigned to 8.
ଶ

݇൫‫ݔ‬௜ , ‫ݔ‬௝ ൯ = ݁‫ ݌ݔ‬ቀ−ߛฮ‫ݔ‬௜ − ‫ݔ‬௝ ฮ ቁ
•
•

•
•

(1)

In the model that uses C4.5 decision tree, for spanning variables, the minimum number of
objects (instances) in leaves m = 2 and pruning confidence level (cf) is assigned to 25%.
The number of nearest neighbors (k) used in classification is 9 for the kNN method and it
was observed that the value does not affect the result. In this model all attributes are
normalized and Euclidean distance is used as a distance metric.
In the NaiveBayes method, we used relative frequency as a prior in probability
estimation. Size of LOESS window is 0.5 and LOESS sample points are set to 100.
Adaboost method is used with SVM to improve the result obtained. Boosting with 10
SVM classifier is applied and weighted combination of weak learners is obtained.

3.3. Comparison of Learning Methods
In this section classification performance of machine learning methods will be discussed. While
sampling, a 5-fold cross validation is used and how the specified model accurately worked in
practice is shown.
Table 1. Evaluation results of the methods using same features
Method
SVM [1]

0,8825

C4.5

0,8410

0,8413

0,8407

kNN

0,8583

0,8407

NaiveBayes

0,8168

0,8209

0,8821

0,8818

AdaBoost

Sens

Results (same features)
Spec
AUC
0,8778
0,9597

CA
0,8801

Prec
0,8786

Recall
0,8825

0,8730

0,8409

0,8413

0,8759

0,949

0,8695

0,8407

0,8129

0,9029

0,8147

0,8209

0,8825

0,8818

0,8823

0,8818

Table 2. Evaluation results of the methods using same features and SIFT
Method

CA

Sens

Results (same features+SIFT)
Spec
AUC
Prec

Recall

SVM

0.9065

0.9089

0.9091

0.9884

0.9091

0.9089

C4.5

0.8610

0.8656

0.8611

0.8971

0.8649

0.8659

kNN

0.8820

0.8629

0.9059

0.9791

0.8991

0.8665

NaiveBayes

0.8423

0.8405

0.8327

0.9282

0.8355

0.8482

AdaBoost

0.9085

0.9072

0.9089

0.9099

0.9087

0.9055
280

Computer Science & Information Technology (CS & IT)

Table 1 and 2 shows the evaluation results of the methods used in this study. As a key: CA means
classification accuracy, Sens means sensitivity, Spec means specificity, AUC means area under
ROC curve, Prec means precision, Recall as itself.
Values found in the evaluation results show how the models can be compared to human
performance while looking at an image. The study [1] that our model is based on used the SVM
model (Table 1) and reached 88% classification accuracy. Similarly in our study, the SVM model
gives better results than other methods. The main reason for this performance is that our problem
is based on a binary classification prediction (positive and negative points). Also the dataset
contains high-dimensional data with large numbers, and this makes SVM more advantageous
over the other methods.
C4.5 decision tree method produced lower accuracy results because it doesn't have a suitable
structure for large datasets. In addition, it falls behind on computation time during training when
compared to other methods. In this work, Random Forest method was also used, but because of
the failure of the decision trees on this dataset and because the algorithm couldn't construct more
than 10 trees due to computational time limits, this method has not been reviewed in this paper.
The model which uses kNN method produced a lower accuracy because the error rate increases in
multi-dimensional space with this method. In addition, computational time of the testing stage is
very high when compared to other methods. Evaluation results show that the NaiveBayes method
is the least successful. Especially as dependencies between attributes (for example, color features
and their probabilities are dependent) leads this method to failure.
The Adaboost method produced similar results with SVM; even better in some evaluations. That
result is obvious because it used the successful method SVM, focusing more on previously
misclassified samples iteratively and constructed a better model. Saliency maps obtained from
some sample images using this model are shown in Figure 6.
Another comparison between learning methods is given in Figure 1 using ROC curve.

Figure 1. ROC curves of methods (Adaboost, SVM, kNN, C4.5, NaiveBayes)
Computer Science & Information Technology (CS & IT)

281

4. CONCLUSIONS
In this study, visual saliency models which try to predict where humans look at images are
developed. For this purpose, a real eye-tracking database is used and different levels of features
are extracted from the images. Using eye fixation points of 15 viewers, a ground truth saliency
map (Figure 5) is uncovered with this information when different learning methods are trained
and classification results are compared. As a conclusion of this study, machine learning methods
that can be applied to this dataset are explained with reasons. Another contribution of this study is
that the interest points extracted from the SIFT method are added to the feature set and better
classification results (~90%) were then obtained over previous studies.
As a future piece of work, we expect to speed up the image processing step and feature extraction.
However, training time of the model is not important because it can be done offline, but when a
system wants to find the most salient points or regions on an image, this process must be handled
quickly. If online saliency is successful, it can be integrated with many applications and can also
be used in mobile systems.

Figure 2. Some example images from the database.

Figure 3. Eye fixation points of one observer on the images.

Figure 4. Eye fixation points of all observers on the images.

Figure 5. Continuous saliency map obtained from convolving Gaussian over fixation points of all
observers.

Figure 6. Saliency map obtained from Adaboost
282

Computer Science & Information Technology (CS & IT)

REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]

[8]

[9]
[10]
[11]
[12]

[13]
[14]
[15]
[16]

[17]

Judd, T., Ehinger, K., Durand, F., & Torralba, A. (2009). "Learning to predict where humans look",
IEEE International Conference on Computer Vision, pp 2106–2113, IEEE Computer Society.
D. DeCarlo & A. Santella, (2002) "Stylization and abstraction of photographs", ACM Transactions on
Graphics, Vol. 21, No. 3, pp 769–776.
L. Itti & C. Koch, (2000) "A saliency-based search mechanism for overt and covert shifts of visual
attention", Vision Research, Vol. 40, pp 1489–1506.
X. Hou & L. Zhang, (2007) "Saliency detection: A spectral residual approach", Computer Vision and
Pattern Recognition, IEEE Computer Society Conference on, Vol 10, pp 1–8.
N. D. B. Bruce & J. K. Tsotsos, (2009) "Saliency, attention, and visual search: An information
theoretic approach", Journal of Vision, Vol. 9, No. 3, pp 1–24.
T. Avraham & M. Lindenbaum, (2009) "Esaliency: Meaningful attention using stochastic image
modeling", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 99, No. 1.
M. Cerf, J. Harel, W. Einhauser, & C. Koch, (2007),"Predicting human gaze using low-level saliency
combined with face detection", In J. C. Platt, D. Koller, Y. Singer, and S. T. Roweis, editors, NIPS.
MIT Press.
W. Kienzle, F. A.Wichmann, B. Scholkopf, & M. O. Franz, (2006) "A nonparametric approach to
bottom-up visual saliency", In B. Scholkopf, J. C. Platt, and T. Hoffman, editors, NIPS, pp 689–696.
MIT Press.
K. Ehinger, B. Hidalgo-Sotelo, A. Torralba, & A. Oliva, (2009) "Modeling search for people in 900
scenes: A combined source model of eye guidance", Visual Cognition.
B. Russell, A. Torralba, K. Murphy, & W. Freeman, (2005) "Labelme: a database and web-based tool
for image annotation", MIT AI Lab Memo AIM-2005-025, MIT CSAIL, Sept. 2005.
Cerf, M., Frady, E., & Koch, C., (2009) "Faces and text attract gaze independent of the task:
Experimental data and computer model", Journal of Vision, Vol. 9, No. 12, pp 1–15.
Subramanian, R., Katti, H., Sebe, N., Kankanhalli, M., & Chua, T. S., (2010) "An eye fixation
database for saliency detection in images", In European Conference on Computer Vision, Vol. 6314,
pp 30–43, Springer.
E. P. Simoncelli & W. T. Freeman, (1995) "The steerable pyramid: A flexible architecture for multiscale derivative computation", pp 444–447.
A. Oliva & A. Torralba, (2001) "Modeling the shape of the scene: A holistic representation of the
spatial envelope", International Journal of Computer Vision, Vol. 42, pp 145–175.
P. Viola & M. Jones, (2001) "Robust real-time object detection", In International Journal of
Computer Vision.
P. Felzenszwalb, D. McAllester, & D. Ramanan, (2008), "A discriminatively trained, multiscale,
deformable part model", Computer Vision and Pattern Recognition 2008 IEEE Conference on, pp 1–
8.
Lowe, D. G., (2004) "Distinctive image features from scale-invariant keypoints", International
journal of computer vision, Vol. 60, No. 2, pp 91-110.

AUTHOR
H. Yalın Yalıç was born in Ankara, Turkey in 1984. Yalıç received his B.Sc. degree
from the Department of Computer Engineering, Çankaya University, Ankara, Turkey in
2007. He received his M.Sc. degree from the Department of Computer Engineering,
Hacettepe University, Ankara, Turkey in 2010 and has continued with his Ph.D.
education in the same department. He has worked as a Research and Teaching Assistant
at Hacettepe University Computer Engineering Department since 2007. The author's
major field of study is computer vision, image and video processing.

Contenu connexe

Tendances

Smriti's research paper
Smriti's research paperSmriti's research paper
Smriti's research paperSmriti Tikoo
 
Facial expression recognition
Facial expression recognitionFacial expression recognition
Facial expression recognitionElyesMiri
 
IRJET - A Review on Face Recognition using Deep Learning Algorithm
IRJET -  	  A Review on Face Recognition using Deep Learning AlgorithmIRJET -  	  A Review on Face Recognition using Deep Learning Algorithm
IRJET - A Review on Face Recognition using Deep Learning AlgorithmIRJET Journal
 
FACIAL EXTRACTION AND LIP TRACKING USING FACIAL POINTS
FACIAL EXTRACTION AND LIP TRACKING USING FACIAL POINTSFACIAL EXTRACTION AND LIP TRACKING USING FACIAL POINTS
FACIAL EXTRACTION AND LIP TRACKING USING FACIAL POINTSijcseit
 
An interactive image segmentation using multiple user input’s
An interactive image segmentation using multiple user input’sAn interactive image segmentation using multiple user input’s
An interactive image segmentation using multiple user input’seSAT Publishing House
 
Facial expression recognition using pca and gabor with jaffe database 11748
Facial expression recognition using pca and gabor with jaffe database 11748Facial expression recognition using pca and gabor with jaffe database 11748
Facial expression recognition using pca and gabor with jaffe database 11748EditorIJAERD
 
Face Recognition using PCA and MSNN
Face Recognition using PCA and MSNNFace Recognition using PCA and MSNN
Face Recognition using PCA and MSNNRABI GAYAN
 
Facial expression recongnition Techniques, Database and Classifiers
Facial expression recongnition Techniques, Database and Classifiers Facial expression recongnition Techniques, Database and Classifiers
Facial expression recongnition Techniques, Database and Classifiers Rupinder Saini
 
IRJET - Facial In-Painting using Deep Learning in Machine Learning
IRJET -  	  Facial In-Painting using Deep Learning in Machine LearningIRJET -  	  Facial In-Painting using Deep Learning in Machine Learning
IRJET - Facial In-Painting using Deep Learning in Machine LearningIRJET Journal
 
FACE DETECTION USING PRINCIPAL COMPONENT ANALYSIS
FACE DETECTION USING PRINCIPAL COMPONENT ANALYSISFACE DETECTION USING PRINCIPAL COMPONENT ANALYSIS
FACE DETECTION USING PRINCIPAL COMPONENT ANALYSISIAEME Publication
 
Yoga Posture Classification using Computer Vision
Yoga Posture Classification using Computer VisionYoga Posture Classification using Computer Vision
Yoga Posture Classification using Computer VisionDr. Amarjeet Singh
 
IRJET- Spot Me - A Smart Attendance System based on Face Recognition
IRJET- Spot Me - A Smart Attendance System based on Face RecognitionIRJET- Spot Me - A Smart Attendance System based on Face Recognition
IRJET- Spot Me - A Smart Attendance System based on Face RecognitionIRJET Journal
 
Medoid based model for face recognition using eigen and fisher faces
Medoid based model for face recognition using eigen and fisher facesMedoid based model for face recognition using eigen and fisher faces
Medoid based model for face recognition using eigen and fisher facesijscmcj
 
Schematic model for analyzing mobility and detection of multiple
Schematic model for analyzing mobility and detection of multipleSchematic model for analyzing mobility and detection of multiple
Schematic model for analyzing mobility and detection of multipleIAEME Publication
 
IRJET- Automated Detection of Gender from Face Images
IRJET-  	  Automated Detection of Gender from Face ImagesIRJET-  	  Automated Detection of Gender from Face Images
IRJET- Automated Detection of Gender from Face ImagesIRJET Journal
 
Implementation of Face Recognition in Cloud Vision Using Eigen Faces
Implementation of Face Recognition in Cloud Vision Using Eigen FacesImplementation of Face Recognition in Cloud Vision Using Eigen Faces
Implementation of Face Recognition in Cloud Vision Using Eigen FacesIJERA Editor
 
A novel approach for performance parameter estimation of face recognition bas...
A novel approach for performance parameter estimation of face recognition bas...A novel approach for performance parameter estimation of face recognition bas...
A novel approach for performance parameter estimation of face recognition bas...IJMER
 

Tendances (20)

Smriti's research paper
Smriti's research paperSmriti's research paper
Smriti's research paper
 
Facial expression recognition
Facial expression recognitionFacial expression recognition
Facial expression recognition
 
IRJET - A Review on Face Recognition using Deep Learning Algorithm
IRJET -  	  A Review on Face Recognition using Deep Learning AlgorithmIRJET -  	  A Review on Face Recognition using Deep Learning Algorithm
IRJET - A Review on Face Recognition using Deep Learning Algorithm
 
FACIAL EXTRACTION AND LIP TRACKING USING FACIAL POINTS
FACIAL EXTRACTION AND LIP TRACKING USING FACIAL POINTSFACIAL EXTRACTION AND LIP TRACKING USING FACIAL POINTS
FACIAL EXTRACTION AND LIP TRACKING USING FACIAL POINTS
 
An interactive image segmentation using multiple user input’s
An interactive image segmentation using multiple user input’sAn interactive image segmentation using multiple user input’s
An interactive image segmentation using multiple user input’s
 
Facial expression recognition using pca and gabor with jaffe database 11748
Facial expression recognition using pca and gabor with jaffe database 11748Facial expression recognition using pca and gabor with jaffe database 11748
Facial expression recognition using pca and gabor with jaffe database 11748
 
Face Recognition using PCA and MSNN
Face Recognition using PCA and MSNNFace Recognition using PCA and MSNN
Face Recognition using PCA and MSNN
 
Facial expression recongnition Techniques, Database and Classifiers
Facial expression recongnition Techniques, Database and Classifiers Facial expression recongnition Techniques, Database and Classifiers
Facial expression recongnition Techniques, Database and Classifiers
 
An optimal face recoginition tool
An optimal face recoginition toolAn optimal face recoginition tool
An optimal face recoginition tool
 
IRJET - Facial In-Painting using Deep Learning in Machine Learning
IRJET -  	  Facial In-Painting using Deep Learning in Machine LearningIRJET -  	  Facial In-Painting using Deep Learning in Machine Learning
IRJET - Facial In-Painting using Deep Learning in Machine Learning
 
FACE DETECTION USING PRINCIPAL COMPONENT ANALYSIS
FACE DETECTION USING PRINCIPAL COMPONENT ANALYSISFACE DETECTION USING PRINCIPAL COMPONENT ANALYSIS
FACE DETECTION USING PRINCIPAL COMPONENT ANALYSIS
 
Yoga Posture Classification using Computer Vision
Yoga Posture Classification using Computer VisionYoga Posture Classification using Computer Vision
Yoga Posture Classification using Computer Vision
 
IRJET- Spot Me - A Smart Attendance System based on Face Recognition
IRJET- Spot Me - A Smart Attendance System based on Face RecognitionIRJET- Spot Me - A Smart Attendance System based on Face Recognition
IRJET- Spot Me - A Smart Attendance System based on Face Recognition
 
Medoid based model for face recognition using eigen and fisher faces
Medoid based model for face recognition using eigen and fisher facesMedoid based model for face recognition using eigen and fisher faces
Medoid based model for face recognition using eigen and fisher faces
 
I017525560
I017525560I017525560
I017525560
 
Schematic model for analyzing mobility and detection of multiple
Schematic model for analyzing mobility and detection of multipleSchematic model for analyzing mobility and detection of multiple
Schematic model for analyzing mobility and detection of multiple
 
IRJET- Automated Detection of Gender from Face Images
IRJET-  	  Automated Detection of Gender from Face ImagesIRJET-  	  Automated Detection of Gender from Face Images
IRJET- Automated Detection of Gender from Face Images
 
H0334749
H0334749H0334749
H0334749
 
Implementation of Face Recognition in Cloud Vision Using Eigen Faces
Implementation of Face Recognition in Cloud Vision Using Eigen FacesImplementation of Face Recognition in Cloud Vision Using Eigen Faces
Implementation of Face Recognition in Cloud Vision Using Eigen Faces
 
A novel approach for performance parameter estimation of face recognition bas...
A novel approach for performance parameter estimation of face recognition bas...A novel approach for performance parameter estimation of face recognition bas...
A novel approach for performance parameter estimation of face recognition bas...
 

En vedette

Visual Saliency: Learning to Detect Salient Objects
Visual Saliency: Learning to Detect Salient ObjectsVisual Saliency: Learning to Detect Salient Objects
Visual Saliency: Learning to Detect Salient ObjectsVicente Ordonez
 
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012Jia-Bin Huang
 
From Large Scale Image Categorization to Entry-Level Categories
From Large Scale Image Categorization to Entry-Level CategoriesFrom Large Scale Image Categorization to Entry-Level Categories
From Large Scale Image Categorization to Entry-Level CategoriesVicente Ordonez
 
Andrew Zisserman Talk - Part 1a
Andrew Zisserman Talk - Part 1aAndrew Zisserman Talk - Part 1a
Andrew Zisserman Talk - Part 1aAnton Konushin
 
(Reading Group) Saliency Detection: A Boolean Map Approach
(Reading Group) Saliency Detection: A Boolean Map Approach(Reading Group) Saliency Detection: A Boolean Map Approach
(Reading Group) Saliency Detection: A Boolean Map ApproachMohamed Elawady
 
Automatic Foreground object detection using Visual and Motion Saliency
Automatic Foreground object detection using Visual and Motion SaliencyAutomatic Foreground object detection using Visual and Motion Saliency
Automatic Foreground object detection using Visual and Motion SaliencyIJERD Editor
 
ITIS Web Présentation - PrestaShop - le ecommerce en 2015 - salon eCom GENEVE...
ITIS Web Présentation - PrestaShop - le ecommerce en 2015 - salon eCom GENEVE...ITIS Web Présentation - PrestaShop - le ecommerce en 2015 - salon eCom GENEVE...
ITIS Web Présentation - PrestaShop - le ecommerce en 2015 - salon eCom GENEVE...Paul Guillemin
 
Saliency-based Models of Image Content and their Application to Auto-Annotati...
Saliency-based Models of Image Content and their Application to Auto-Annotati...Saliency-based Models of Image Content and their Application to Auto-Annotati...
Saliency-based Models of Image Content and their Application to Auto-Annotati...Jonathon Hare
 
3D Riesz-wavelet Based Covariance Descriptors for Texture Classi cation of Lu...
3D Riesz-wavelet Based Covariance Descriptors for Texture Classication of Lu...3D Riesz-wavelet Based Covariance Descriptors for Texture Classication of Lu...
3D Riesz-wavelet Based Covariance Descriptors for Texture Classi cation of Lu...Institute of Information Systems (HES-SO)
 
Visual attention: models and performance
Visual attention: models and performanceVisual attention: models and performance
Visual attention: models and performanceOlivier Le Meur
 
Saccadic model of eye movements for free-viewing condition
Saccadic model of eye movements for free-viewing conditionSaccadic model of eye movements for free-viewing condition
Saccadic model of eye movements for free-viewing conditionOlivier Le Meur
 
Local feature descriptors for visual recognition
Local feature descriptors for visual recognitionLocal feature descriptors for visual recognition
Local feature descriptors for visual recognitionpotaters
 
CVPR2010: Context-aware saliency detection
CVPR2010: Context-aware saliency detectionCVPR2010: Context-aware saliency detection
CVPR2010: Context-aware saliency detectionzukun
 
Matching with Invariant Features
Matching with Invariant FeaturesMatching with Invariant Features
Matching with Invariant Featureszukun
 
Computational models of human visual attention driven by auditory cues
Computational models of human visual attention driven by auditory cuesComputational models of human visual attention driven by auditory cues
Computational models of human visual attention driven by auditory cuesAkisato Kimura
 
Feature Extraction and Principal Component Analysis
Feature Extraction and Principal Component AnalysisFeature Extraction and Principal Component Analysis
Feature Extraction and Principal Component AnalysisSayed Abulhasan Quadri
 
Matlab Feature Extraction Using Segmentation And Edge Detection
Matlab Feature Extraction Using Segmentation And Edge DetectionMatlab Feature Extraction Using Segmentation And Edge Detection
Matlab Feature Extraction Using Segmentation And Edge DetectionDataminingTools Inc
 
Image feature extraction
Image feature extractionImage feature extraction
Image feature extractionRushin Shah
 

En vedette (20)

Visual Saliency: Learning to Detect Salient Objects
Visual Saliency: Learning to Detect Salient ObjectsVisual Saliency: Learning to Detect Salient Objects
Visual Saliency: Learning to Detect Salient Objects
 
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
 
From Large Scale Image Categorization to Entry-Level Categories
From Large Scale Image Categorization to Entry-Level CategoriesFrom Large Scale Image Categorization to Entry-Level Categories
From Large Scale Image Categorization to Entry-Level Categories
 
Andrew Zisserman Talk - Part 1a
Andrew Zisserman Talk - Part 1aAndrew Zisserman Talk - Part 1a
Andrew Zisserman Talk - Part 1a
 
(Reading Group) Saliency Detection: A Boolean Map Approach
(Reading Group) Saliency Detection: A Boolean Map Approach(Reading Group) Saliency Detection: A Boolean Map Approach
(Reading Group) Saliency Detection: A Boolean Map Approach
 
Automatic Foreground object detection using Visual and Motion Saliency
Automatic Foreground object detection using Visual and Motion SaliencyAutomatic Foreground object detection using Visual and Motion Saliency
Automatic Foreground object detection using Visual and Motion Saliency
 
ITIS Web Présentation - PrestaShop - le ecommerce en 2015 - salon eCom GENEVE...
ITIS Web Présentation - PrestaShop - le ecommerce en 2015 - salon eCom GENEVE...ITIS Web Présentation - PrestaShop - le ecommerce en 2015 - salon eCom GENEVE...
ITIS Web Présentation - PrestaShop - le ecommerce en 2015 - salon eCom GENEVE...
 
Saliency-based Models of Image Content and their Application to Auto-Annotati...
Saliency-based Models of Image Content and their Application to Auto-Annotati...Saliency-based Models of Image Content and their Application to Auto-Annotati...
Saliency-based Models of Image Content and their Application to Auto-Annotati...
 
3D Riesz-wavelet Based Covariance Descriptors for Texture Classi cation of Lu...
3D Riesz-wavelet Based Covariance Descriptors for Texture Classication of Lu...3D Riesz-wavelet Based Covariance Descriptors for Texture Classication of Lu...
3D Riesz-wavelet Based Covariance Descriptors for Texture Classi cation of Lu...
 
Visual attention: models and performance
Visual attention: models and performanceVisual attention: models and performance
Visual attention: models and performance
 
Saccadic model of eye movements for free-viewing condition
Saccadic model of eye movements for free-viewing conditionSaccadic model of eye movements for free-viewing condition
Saccadic model of eye movements for free-viewing condition
 
Local feature descriptors for visual recognition
Local feature descriptors for visual recognitionLocal feature descriptors for visual recognition
Local feature descriptors for visual recognition
 
CVPR2010: Context-aware saliency detection
CVPR2010: Context-aware saliency detectionCVPR2010: Context-aware saliency detection
CVPR2010: Context-aware saliency detection
 
Matching with Invariant Features
Matching with Invariant FeaturesMatching with Invariant Features
Matching with Invariant Features
 
Computational models of human visual attention driven by auditory cues
Computational models of human visual attention driven by auditory cuesComputational models of human visual attention driven by auditory cues
Computational models of human visual attention driven by auditory cues
 
Feature Extraction and Principal Component Analysis
Feature Extraction and Principal Component AnalysisFeature Extraction and Principal Component Analysis
Feature Extraction and Principal Component Analysis
 
Feature Extraction
Feature ExtractionFeature Extraction
Feature Extraction
 
Matlab Feature Extraction Using Segmentation And Edge Detection
Matlab Feature Extraction Using Segmentation And Edge DetectionMatlab Feature Extraction Using Segmentation And Edge Detection
Matlab Feature Extraction Using Segmentation And Edge Detection
 
Image feature extraction
Image feature extractionImage feature extraction
Image feature extraction
 
Text Detection and Recognition
Text Detection and RecognitionText Detection and Recognition
Text Detection and Recognition
 

Similaire à Visual Saliency Model Using Sift and Comparison of Learning Approaches

Real-time eyeglass detection using transfer learning for non-standard facial...
Real-time eyeglass detection using transfer learning for  non-standard facial...Real-time eyeglass detection using transfer learning for  non-standard facial...
Real-time eyeglass detection using transfer learning for non-standard facial...IJECEIAES
 
Analysis of student sentiment during video class with multi-layer deep learni...
Analysis of student sentiment during video class with multi-layer deep learni...Analysis of student sentiment during video class with multi-layer deep learni...
Analysis of student sentiment during video class with multi-layer deep learni...IJECEIAES
 
CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERA CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERA ijaceeejournal
 
CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERACROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERAijaceeejournal
 
Paper id 24201475
Paper id 24201475Paper id 24201475
Paper id 24201475IJRAT
 
Automatic Attendance Management System Using Face Recognition
Automatic Attendance Management System Using Face RecognitionAutomatic Attendance Management System Using Face Recognition
Automatic Attendance Management System Using Face RecognitionKathryn Patel
 
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATIONMULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATIONgerogepatton
 
Multi-Level Feature Fusion Based Transfer Learning for Person Re-Identification
Multi-Level Feature Fusion Based Transfer Learning for Person Re-IdentificationMulti-Level Feature Fusion Based Transfer Learning for Person Re-Identification
Multi-Level Feature Fusion Based Transfer Learning for Person Re-Identificationgerogepatton
 
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATIONMULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATIONijaia
 
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEWFACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEWIRJET Journal
 
IRJET- Survey on Face Recognition using Biometrics
IRJET-  	  Survey on Face Recognition using BiometricsIRJET-  	  Survey on Face Recognition using Biometrics
IRJET- Survey on Face Recognition using BiometricsIRJET Journal
 
Face Recognition Smart Attendance System: (InClass System)
Face Recognition Smart Attendance System: (InClass System)Face Recognition Smart Attendance System: (InClass System)
Face Recognition Smart Attendance System: (InClass System)IRJET Journal
 
Image Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learningImage Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learningPRATHAMESH REGE
 
IRJET - Effective Workflow for High-Performance Recognition of Fruits using M...
IRJET - Effective Workflow for High-Performance Recognition of Fruits using M...IRJET - Effective Workflow for High-Performance Recognition of Fruits using M...
IRJET - Effective Workflow for High-Performance Recognition of Fruits using M...IRJET Journal
 
Face Annotation using Co-Relation based Matching for Improving Image Mining ...
Face Annotation using Co-Relation based Matching  for Improving Image Mining ...Face Annotation using Co-Relation based Matching  for Improving Image Mining ...
Face Annotation using Co-Relation based Matching for Improving Image Mining ...IRJET Journal
 
A Hybrid Approach to Face Detection And Feature Extraction
A Hybrid Approach to Face Detection And Feature ExtractionA Hybrid Approach to Face Detection And Feature Extraction
A Hybrid Approach to Face Detection And Feature Extractioniosrjce
 
Real time voting system using face recognition for different expressions and ...
Real time voting system using face recognition for different expressions and ...Real time voting system using face recognition for different expressions and ...
Real time voting system using face recognition for different expressions and ...eSAT Publishing House
 
FACIAL EMOTION RECOGNITION SYSTEM
FACIAL EMOTION RECOGNITION SYSTEMFACIAL EMOTION RECOGNITION SYSTEM
FACIAL EMOTION RECOGNITION SYSTEMIRJET Journal
 

Similaire à Visual Saliency Model Using Sift and Comparison of Learning Approaches (20)

Real-time eyeglass detection using transfer learning for non-standard facial...
Real-time eyeglass detection using transfer learning for  non-standard facial...Real-time eyeglass detection using transfer learning for  non-standard facial...
Real-time eyeglass detection using transfer learning for non-standard facial...
 
Analysis of student sentiment during video class with multi-layer deep learni...
Analysis of student sentiment during video class with multi-layer deep learni...Analysis of student sentiment during video class with multi-layer deep learni...
Analysis of student sentiment during video class with multi-layer deep learni...
 
CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERA CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERA
 
CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERACROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERA
 
Paper id 24201475
Paper id 24201475Paper id 24201475
Paper id 24201475
 
Automatic Attendance Management System Using Face Recognition
Automatic Attendance Management System Using Face RecognitionAutomatic Attendance Management System Using Face Recognition
Automatic Attendance Management System Using Face Recognition
 
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATIONMULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
 
Multi-Level Feature Fusion Based Transfer Learning for Person Re-Identification
Multi-Level Feature Fusion Based Transfer Learning for Person Re-IdentificationMulti-Level Feature Fusion Based Transfer Learning for Person Re-Identification
Multi-Level Feature Fusion Based Transfer Learning for Person Re-Identification
 
Real time facial expression analysis using pca
Real time facial expression analysis using pcaReal time facial expression analysis using pca
Real time facial expression analysis using pca
 
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATIONMULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
 
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEWFACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
 
IRJET- Survey on Face Recognition using Biometrics
IRJET-  	  Survey on Face Recognition using BiometricsIRJET-  	  Survey on Face Recognition using Biometrics
IRJET- Survey on Face Recognition using Biometrics
 
Face Recognition Smart Attendance System: (InClass System)
Face Recognition Smart Attendance System: (InClass System)Face Recognition Smart Attendance System: (InClass System)
Face Recognition Smart Attendance System: (InClass System)
 
Image Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learningImage Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learning
 
IRJET - Effective Workflow for High-Performance Recognition of Fruits using M...
IRJET - Effective Workflow for High-Performance Recognition of Fruits using M...IRJET - Effective Workflow for High-Performance Recognition of Fruits using M...
IRJET - Effective Workflow for High-Performance Recognition of Fruits using M...
 
Face Annotation using Co-Relation based Matching for Improving Image Mining ...
Face Annotation using Co-Relation based Matching  for Improving Image Mining ...Face Annotation using Co-Relation based Matching  for Improving Image Mining ...
Face Annotation using Co-Relation based Matching for Improving Image Mining ...
 
K017247882
K017247882K017247882
K017247882
 
A Hybrid Approach to Face Detection And Feature Extraction
A Hybrid Approach to Face Detection And Feature ExtractionA Hybrid Approach to Face Detection And Feature Extraction
A Hybrid Approach to Face Detection And Feature Extraction
 
Real time voting system using face recognition for different expressions and ...
Real time voting system using face recognition for different expressions and ...Real time voting system using face recognition for different expressions and ...
Real time voting system using face recognition for different expressions and ...
 
FACIAL EMOTION RECOGNITION SYSTEM
FACIAL EMOTION RECOGNITION SYSTEMFACIAL EMOTION RECOGNITION SYSTEM
FACIAL EMOTION RECOGNITION SYSTEM
 

Dernier

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 

Dernier (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 

Visual Saliency Model Using Sift and Comparison of Learning Approaches

  • 1. VISUAL SALIENCY MODEL USING SIFT AND COMPARISON OF LEARNING APPROACHES Hamdi Yalın Yalıç1 1 Department of Computer Engineering, Hacettepe University, Ankara, Turkey yalinyalic@cs.hacettepe.edu.tr ABSTRACT Humans' ability to detect and locate salient objects on images is remarkably fast and successful. Performing this process by using eye tracking equipment is expensive and cannot be easily applied, and computer modeling of this human behavior is still a problem to be solved. In our study, one of the largest public eye-tracking databases [1] which has fixation points of 15 observers on 1003 images is used. In addition to low, medium and high-level features which have been used in previous studies, SIFT features extracted from the images are used to improve the classification accuracy of the models. A second contribution of this paper is the comparison and statistical analysis of different machine learning methods that can be used to train our model. As a result, a best feature set and learning model to predict where humans look at images, is determined. KEYWORDS Image Processing, Computer Vision, Machine Learning Methods, Saliency Map, Eye-Tracking 1. INTRODUCTION Understanding the region where people look on a scene can be useful in many applications such as graphics, design, advertising and human-computer interaction. For example, in nonphotorealistic rendering, different levels of details are proposed for different areas of the picture [2]. In this study, regions where people focus are processed in high-resolution, whilst for the other regions, low-resolution processing is performed. Auto-crop of pictures, thumbnails or previews from photos, relate partial displays of images on small screen mobile devices are examples of other fields of application. Hardware solutions are also available for finding the region where people look at a scene. An observer sitting in front of a screen with an eye-tracking device has all the spots recorded at which he is looking at a scene. Special setups and multiple calibration steps are needed for working these tracking devices. Also this solution is relatively expensive and cannot be easily obtained. For this reason, the prediction of where people are looking is required independently from such tracking devices. As a solution, computer models are generated that calculate the salient point which attracts an observer's attention. These models analyze an image and extract a saliency map. One of the former studies is the model developed by Itti and Koch [3] and uses biologically inspired features like orientation, intensity and color. For each of the features, a saliency map is computed and combined into a single saliency map result which describes the David C. Wyld et al. (Eds) : CCSIT, SIPP, AISC, PDCTA, NLP - 2014 pp. 275–282, 2014. © CS & IT-CSCP 2014 DOI : 10.5121/csit.2014.4223
  • 2. 276 Computer Science & Information Technology (CS & IT) saliency of each pixel. Hou and Zhang [4] proposed a model which is independent of features or other forms of prior knowledge of objects. They analyzed the log spectrum of an image, extracted the spectral residual of an image in spectral domain, and proposed a method to construct the saliency map in spatial domain. Bruce and Tsotsos [5] presents a visual saliency model based on a first principle information theoretic formulation, named as Attention Based on Information Maximization (AIM) which performs better than the Itti model. Avraham et al. [6] uses a probabilistic model to mathematically estimate the most probable targets. Cerf at al. [7] improved on Itti's model by adding face detection. The most important characteristic of these studies is that models are derived mathematically, and not trained by some large eye-tracking dataset. Kienze et al. [8] propose a model that learns saliency directly from human eye movement data. They only use low level features like previous studies. But the study by Ehinger et al. [9] shows that observers trying to search for pedestrians in a scene leads to a model combined of three sources: low-level features, target features and scene context. As a result of this study, while predicting human behavior, besides low-level features, high-level features which contain semantic information about the content of the scene must be used. Our study is most closely related to the saliency model proposed by Judd et al. [1]. They used high and mid-level features as well as low levels. Another important contribution of their study is that they train the model using a supervised learning method (support vector machine) and observed positive contributions of the features in the result. To verify this, they needed real eyetracking data and they recorded viewer's eye fixations on images containing a large amount of data. They published this database for further studies and in our study, this publicly available database was used. Results of the different machine learning methods was compared and analyzed in order to obtain the most accurate model. In addition to the work of others, SIFT [17] feature is added to the feature set to improve the accuracy of the model. In the following sections of this paper, dataset and data gathering protocol is analyzed, machine learning methods will be briefly explained and the feature set that was used to train the model is mentioned. Performance of different learning methods are compared and analyzed in the experimental results. Future work will also be discussed in the conclusion. 2. DATASET AND LEARNING METHODS 2.1. Dataset Analysis Large ground truth data is required in the detection of where humans look in a scene. One of the largest studies in this field is the experiment of Judd et al. [1] practiced with 15 viewers using 1003 images. They collected 1003 random images from Flickr and LabelMe [10] (Figure 2) and full resolution images were displayed for 3 seconds to viewers in a dark room using a chin rest for a head stabilizer. An eye-tracker in front of them recorded their gaze path and fixation points (Figures 3 and 4). In order to guarantee a high-quality result, the camera’s calibration was checked every 50 images. They provided a memory test at the end of each display to motivate and encourage users into paying attention, and asked them which of the images they had seen previously. When we analyze the dataset, we noticed that viewers primarily looked at the living things. In the close up images, organs like the eye, nose and mouth were the most fixated points. But when people were located that bit further away, viewers searched for faces and limbs, such as a hand, or an arm. If a human does not exist in the scene, animals can also be said to attract the attention of people. Other important salient regions are objects like signs and boards which contain text. The
  • 3. Computer Science & Information Technology (CS & IT) 277 fixations in the dataset have a bias towards the center because of the fact that photographers tend to place objects in the center of the scene. 2.2. Machine Learning Methods In this study, different supervised learning methods are used while discovering the model of visual saliency and classification results of them are compared. The methods used are explained briefly in the section below. 2.2.1. SVM (Support Vector Machines) This method tries to find the linear discriminant function (classifier) with the maximum margin between two classes. It is robust to outliners thus has strong generalization ability. With the help of the kernel function, which implicitly maps data to high-dimensional space, classification accuracy will increase. The main advantage of the method is its success with high-dimensional data. 2.2.2. C4.5 (Decision Tree) This is an algorithm that constructs a simple depth-first decision tree. It uses information gain with feature selection and it is easy to implement. Classification of unknown records is considerably fast, but it is not suitable for large datasets because it needs to fit the entire data into memory. 2.2.3. K-Nearest Neighbor This method is based on the principle of the nearest k-neighbor of the new data. Different computations can be used as a distance metric and the advantage of this classifier is that it doesn't require any training time. But testing time can be considerable and classifying unknown records is relatively expensive. It is also easily fooled in high dimensional spaces. 2.2.4. Naive Bayes This is a statistical classifier which applies Bayes theorem. It assumes that attributes are independent from each other and is robust to isolated noise points and irrelevant attributes. However independence assumption may not hold for some attributes and it can cause a loss of accuracy. 2.2.5. Adaboost It is a combination of classifiers that is constructed from the training data. It adaptively changes the distribution of training data with each iteration by focusing more on previously misclassified records. It can also be used to increase the classification accuracy of other methods. 3. LEARNING OF THE VISUAL SALIENCY MODEL 3.1. Features Used in Learning According to the results obtained by the analysis of the dataset, the features used in this study are listed below. For all images in the database, the features are extracted for each pixel and used for training of the model. MATLAB was used in image processing and feature extraction.
  • 4. 278 Computer Science & Information Technology (CS & IT) 3.1.1. Low-level Features • • • • Local energy of steerable pyramid filters [13] are used because they are physiologically conceivable and associated with visual attention. Pyramid sub-bands were computed in 4 orientations and 3 scales. Intensity, orientation and color contrast were considered as important features for saliency regions. 3 channels corresponding to these features were computed as described in the Itti and Koch [3] model. In addition to red, green and blue color channel values, and the probabilities of these channels, were also used. The probability of each color in a 3D color histogram was also a basic feature which was computed by the images filtered using median filter at 6 different scales. 3.1.2. Mid-level Features Humans inherently look at the horizon line because most of the objects are on the earth's surface. Gist feature [14] was used to detect the horizon line. 3.1.3. High-level Features Humans mostly looked at people and their faces as understood by the analysis of the database, the following high-level features were extracted: • • Viola Jones face detector [15], Felzenszwalb person and car detector [16]. 3.1.4. Central Priority Humans generally place the object of interest near the center of the image while taking pictures. For this reason, a feature which specifies the distance to the center for each pixel is used. 3.1.5. SIFT Keypoints For objects in a scene, the interesting points on the object can be extracted to provide a feature description of them. This description can be used to identify the object in a test image. The SIFT [17] method can be used to detect these interesting points in an image. In this paper, the keypoint localization step of the method is used. Low-contrast keypoints are discarded and locations of the more interesting keypoints, which can be attractive to the human eye, are used. One contribution of this paper is the using of the position of interest points on an image as a feature and increasing the classification rate. Other local features, like SURF and GLOH were also experimented with, but since SIFT had a better effect on the model, it was SIFT that we decided to go with. 3.2. Training Phase While training and testing the saliency model, machine learning methods described in Section 2.2 were used. Images in the database were divided up as 80% training and 20% testing. That means that 803 images were used for training and 201 for testing. 10 positively labeled pixels from the most 5% salient locations and 10 negatively labeled pixels from the least 30% salient locations were chosen from each image. Thus the examples are guaranteed to be both strongly positive and strongly negative. Any samples on the boundary between the two regions, and samples within 10 pixels of the image boundary were not selected. It is noted that selecting more than 10 samples as positive and negative did not increase the classification rate and contained redundant information.
  • 5. Computer Science & Information Technology (CS & IT) 279 The dataset used to train the model consisted of 16,040 samples; the test dataset consisted of 4,020 samples. Every sample (point) has 34 continuous attribute and labeled as positive or negative. Obtained data was given as an input to the described learning methods for training and a more accurate model will be discussed. Details of the parameters used in the classifier training are explained below: • In the model that uses SVM, all data were normalized and radial basis function (1) is used for kernel function (other kernels produce results similar to radial). γ value is assigned to 0.8. Cost of misclassification didn't affect the result, so c is assigned to 8. ଶ ݇൫‫ݔ‬௜ , ‫ݔ‬௝ ൯ = ݁‫ ݌ݔ‬ቀ−ߛฮ‫ݔ‬௜ − ‫ݔ‬௝ ฮ ቁ • • • • (1) In the model that uses C4.5 decision tree, for spanning variables, the minimum number of objects (instances) in leaves m = 2 and pruning confidence level (cf) is assigned to 25%. The number of nearest neighbors (k) used in classification is 9 for the kNN method and it was observed that the value does not affect the result. In this model all attributes are normalized and Euclidean distance is used as a distance metric. In the NaiveBayes method, we used relative frequency as a prior in probability estimation. Size of LOESS window is 0.5 and LOESS sample points are set to 100. Adaboost method is used with SVM to improve the result obtained. Boosting with 10 SVM classifier is applied and weighted combination of weak learners is obtained. 3.3. Comparison of Learning Methods In this section classification performance of machine learning methods will be discussed. While sampling, a 5-fold cross validation is used and how the specified model accurately worked in practice is shown. Table 1. Evaluation results of the methods using same features Method SVM [1] 0,8825 C4.5 0,8410 0,8413 0,8407 kNN 0,8583 0,8407 NaiveBayes 0,8168 0,8209 0,8821 0,8818 AdaBoost Sens Results (same features) Spec AUC 0,8778 0,9597 CA 0,8801 Prec 0,8786 Recall 0,8825 0,8730 0,8409 0,8413 0,8759 0,949 0,8695 0,8407 0,8129 0,9029 0,8147 0,8209 0,8825 0,8818 0,8823 0,8818 Table 2. Evaluation results of the methods using same features and SIFT Method CA Sens Results (same features+SIFT) Spec AUC Prec Recall SVM 0.9065 0.9089 0.9091 0.9884 0.9091 0.9089 C4.5 0.8610 0.8656 0.8611 0.8971 0.8649 0.8659 kNN 0.8820 0.8629 0.9059 0.9791 0.8991 0.8665 NaiveBayes 0.8423 0.8405 0.8327 0.9282 0.8355 0.8482 AdaBoost 0.9085 0.9072 0.9089 0.9099 0.9087 0.9055
  • 6. 280 Computer Science & Information Technology (CS & IT) Table 1 and 2 shows the evaluation results of the methods used in this study. As a key: CA means classification accuracy, Sens means sensitivity, Spec means specificity, AUC means area under ROC curve, Prec means precision, Recall as itself. Values found in the evaluation results show how the models can be compared to human performance while looking at an image. The study [1] that our model is based on used the SVM model (Table 1) and reached 88% classification accuracy. Similarly in our study, the SVM model gives better results than other methods. The main reason for this performance is that our problem is based on a binary classification prediction (positive and negative points). Also the dataset contains high-dimensional data with large numbers, and this makes SVM more advantageous over the other methods. C4.5 decision tree method produced lower accuracy results because it doesn't have a suitable structure for large datasets. In addition, it falls behind on computation time during training when compared to other methods. In this work, Random Forest method was also used, but because of the failure of the decision trees on this dataset and because the algorithm couldn't construct more than 10 trees due to computational time limits, this method has not been reviewed in this paper. The model which uses kNN method produced a lower accuracy because the error rate increases in multi-dimensional space with this method. In addition, computational time of the testing stage is very high when compared to other methods. Evaluation results show that the NaiveBayes method is the least successful. Especially as dependencies between attributes (for example, color features and their probabilities are dependent) leads this method to failure. The Adaboost method produced similar results with SVM; even better in some evaluations. That result is obvious because it used the successful method SVM, focusing more on previously misclassified samples iteratively and constructed a better model. Saliency maps obtained from some sample images using this model are shown in Figure 6. Another comparison between learning methods is given in Figure 1 using ROC curve. Figure 1. ROC curves of methods (Adaboost, SVM, kNN, C4.5, NaiveBayes)
  • 7. Computer Science & Information Technology (CS & IT) 281 4. CONCLUSIONS In this study, visual saliency models which try to predict where humans look at images are developed. For this purpose, a real eye-tracking database is used and different levels of features are extracted from the images. Using eye fixation points of 15 viewers, a ground truth saliency map (Figure 5) is uncovered with this information when different learning methods are trained and classification results are compared. As a conclusion of this study, machine learning methods that can be applied to this dataset are explained with reasons. Another contribution of this study is that the interest points extracted from the SIFT method are added to the feature set and better classification results (~90%) were then obtained over previous studies. As a future piece of work, we expect to speed up the image processing step and feature extraction. However, training time of the model is not important because it can be done offline, but when a system wants to find the most salient points or regions on an image, this process must be handled quickly. If online saliency is successful, it can be integrated with many applications and can also be used in mobile systems. Figure 2. Some example images from the database. Figure 3. Eye fixation points of one observer on the images. Figure 4. Eye fixation points of all observers on the images. Figure 5. Continuous saliency map obtained from convolving Gaussian over fixation points of all observers. Figure 6. Saliency map obtained from Adaboost
  • 8. 282 Computer Science & Information Technology (CS & IT) REFERENCES [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] Judd, T., Ehinger, K., Durand, F., & Torralba, A. (2009). "Learning to predict where humans look", IEEE International Conference on Computer Vision, pp 2106–2113, IEEE Computer Society. D. DeCarlo & A. Santella, (2002) "Stylization and abstraction of photographs", ACM Transactions on Graphics, Vol. 21, No. 3, pp 769–776. L. Itti & C. Koch, (2000) "A saliency-based search mechanism for overt and covert shifts of visual attention", Vision Research, Vol. 40, pp 1489–1506. X. Hou & L. Zhang, (2007) "Saliency detection: A spectral residual approach", Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, Vol 10, pp 1–8. N. D. B. Bruce & J. K. Tsotsos, (2009) "Saliency, attention, and visual search: An information theoretic approach", Journal of Vision, Vol. 9, No. 3, pp 1–24. T. Avraham & M. Lindenbaum, (2009) "Esaliency: Meaningful attention using stochastic image modeling", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 99, No. 1. M. Cerf, J. Harel, W. Einhauser, & C. Koch, (2007),"Predicting human gaze using low-level saliency combined with face detection", In J. C. Platt, D. Koller, Y. Singer, and S. T. Roweis, editors, NIPS. MIT Press. W. Kienzle, F. A.Wichmann, B. Scholkopf, & M. O. Franz, (2006) "A nonparametric approach to bottom-up visual saliency", In B. Scholkopf, J. C. Platt, and T. Hoffman, editors, NIPS, pp 689–696. MIT Press. K. Ehinger, B. Hidalgo-Sotelo, A. Torralba, & A. Oliva, (2009) "Modeling search for people in 900 scenes: A combined source model of eye guidance", Visual Cognition. B. Russell, A. Torralba, K. Murphy, & W. Freeman, (2005) "Labelme: a database and web-based tool for image annotation", MIT AI Lab Memo AIM-2005-025, MIT CSAIL, Sept. 2005. Cerf, M., Frady, E., & Koch, C., (2009) "Faces and text attract gaze independent of the task: Experimental data and computer model", Journal of Vision, Vol. 9, No. 12, pp 1–15. Subramanian, R., Katti, H., Sebe, N., Kankanhalli, M., & Chua, T. S., (2010) "An eye fixation database for saliency detection in images", In European Conference on Computer Vision, Vol. 6314, pp 30–43, Springer. E. P. Simoncelli & W. T. Freeman, (1995) "The steerable pyramid: A flexible architecture for multiscale derivative computation", pp 444–447. A. Oliva & A. Torralba, (2001) "Modeling the shape of the scene: A holistic representation of the spatial envelope", International Journal of Computer Vision, Vol. 42, pp 145–175. P. Viola & M. Jones, (2001) "Robust real-time object detection", In International Journal of Computer Vision. P. Felzenszwalb, D. McAllester, & D. Ramanan, (2008), "A discriminatively trained, multiscale, deformable part model", Computer Vision and Pattern Recognition 2008 IEEE Conference on, pp 1– 8. Lowe, D. G., (2004) "Distinctive image features from scale-invariant keypoints", International journal of computer vision, Vol. 60, No. 2, pp 91-110. AUTHOR H. Yalın Yalıç was born in Ankara, Turkey in 1984. Yalıç received his B.Sc. degree from the Department of Computer Engineering, Çankaya University, Ankara, Turkey in 2007. He received his M.Sc. degree from the Department of Computer Engineering, Hacettepe University, Ankara, Turkey in 2010 and has continued with his Ph.D. education in the same department. He has worked as a Research and Teaching Assistant at Hacettepe University Computer Engineering Department since 2007. The author's major field of study is computer vision, image and video processing.