Natural Scene Statistics at Stereo Fixations

                   Yang Liu                                                  Lawrence K. Cormack                                     Alan C. Bovik
             Department of ECE                                             Department of Psychology                               Department of ECE
         University of Texas at Austin                                    University of Texas at Austin                       University of Texas at Austin
          young76@mail.utexas.edu                                          cormack@psy.utexas.edu                                bovik@ece.utexas.edu

Abstract

We conducted eye tracking experiments on naturalistic stereo images presented through a haploscope, and found that fixated luminance contrast and luminance gradient were generally higher than randomly selected luminance contrast and luminance gradient, which agrees with the previous literature. However, we also found that the fixated disparity contrast and disparity gradient were generally lower than randomly selected disparity contrast and disparity gradient. We discuss the implications of this remarkable result.

Keywords: fixation, stereopsis, natural scene statistics

Copyright © 2010 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail permissions@acm.org.
ETRA 2010, Austin, TX, March 22-24, 2010.
© 2010 ACM 978-1-60558-994-7/10/0003 $10.00

1       Introduction / Overview

Many studies have shown that several low-level luminance features at human fixations are significantly different from those at other areas. Reinagel and Zador [1999] found that the regions around human fixations tend to have higher spatial contrasts and spatial entropies than random fixation regions, which suggests that the human visual system may try to select image regions that help maximize the information content transmitted to the visual cortex, by minimizing the redundancy in the image representation. By varying the patch sizes around the fixations, Parkhurst and Niebur [2003] found that the largest difference between the luminance contrast of fixated regions and that of image-shuffled (pseudo-random) regions is observed when the patch size is 1 degree. In a recent study [Rajashekar et al. 2007], fixated image patches were foveated using an eccentricity-based model, and higher-order statistics were analyzed than in prior studies. The authors found that bandpass contrast showed a notably larger difference between fixations and random patches than other higher-order statistics. Other features found to attract fixations (in decreasing order of attractiveness) included bandpass luminance, RMS contrast, and luminance. These data were subsequently used to develop an image processing "fixation predictor" [Rajashekar et al. 2008].

The body of literature on visual fixations on 2D luminance images appears to be largely consistent across studies. However, there has been very little work done on analyzing the nature or statistics of images and scenes at the point of gaze in three dimensions. Certainly the three-dimensional attributes of the world affect the way we interact with it both visually and physically. Moreover, the 3D statistics of the natural world have likely played a role in the adaptation of the visual system [Liu et al. 2008]. One reason for the lack of studies on fixations in 3D space has been the dearth of 3D ground truth natural scene data. The few databases of naturalistic stereo images do not adequately fulfill this need, since what is needed are dense disparity maps for each scene. One very recent study [Jansen et al. 2009] used natural scenes with ground truth disparity maps acquired from laser scanning. They found that disparity appears to be a salient feature that affects eye movements. In particular, they found that the presence of disparity information may affect saccade length, but not duration; that disparity appears not to affect the saliency of luminance features; and that subjects tend to fixate nearer objects earlier than more distant objects. Each of these results agrees reasonably well with intuition. The authors also found that in 3D noise images, subjects tended to fixate depth discontinuities more frequently than smooth depth regions. One might view this result as intuitive also; as with luminance images, such locations might be deemed interesting.

2       Stereo Eyetracking

2.1     Observers

Three male observers participated in this study. Their stereo ability was verified to be normal using random dot stereograms. Observer LKC had extensive experience in psychophysical studies, and knew the purpose of the experiment. Observers JSL and CHY were naïve.

2.2     Stimulus

We manually selected 48 grayscale stereoscopic outdoor scenes [Hoyer and Hyvärinen 2000] that contained mountains, trees, water, rocks, bushes, etc., but avoided manmade objects. We did not have ground truth disparity data for these scenes. Instead, we relied on a local correlation method, which is simple yet biologically inspired, to estimate disparity. Models of binocular complex neurons [Anzai et al. 1999; Fleet et al. 1996; Ohzawa et al. 1990; Qian 1994] commonly contain a cross correlation term in their response function.

The simple correspondence algorithm was defined as follows. Given a pixel (xr, yr) in the right image, we defined a 161×5 (3.2°×0.1°) search window centered on the same pixel location (preferring zero disparity) in the left image. Given a 1°×1° patch in the right image centered on the pixel (xr, yr), the algorithm computed the cross correlation between the right patch and a candidate 1°×1° left patch centered on each pixel in the 161×5 search window. The left patch yielding the largest cross correlation was deemed to be the matched patch. The location (xl, yl) corresponding to the center of that patch was deemed the matched pixel for the right pixel (xr, yr); hence the horizontal disparity of (xr, yr) was taken to be D(xr, yr) = xr - xl.
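The correspondence search of Section 2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`ncc`, `patch`, `disparity_at`) are hypothetical, the correlation is assumed to be the normalized cross correlation, ties are assumed to resolve toward zero disparity, and the patch and window sizes are toy values. At roughly 50 pixels per degree, the sizes described in the text would correspond to about `half_patch=25`, `half_w=80`, `half_h=2`.

```python
import math
import random

def ncc(a, b):
    """Normalized cross correlation of two equal-length lists of pixel values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a) *
                    sum((q - mean_b) ** 2 for q in b))
    return num / den if den > 0 else 0.0

def patch(img, cx, cy, half):
    """Flattened square patch centered on (cx, cy), or None if it leaves the image."""
    h, w = len(img), len(img[0])
    if cx - half < 0 or cy - half < 0 or cx + half >= w or cy + half >= h:
        return None
    return [img[y][x]
            for y in range(cy - half, cy + half + 1)
            for x in range(cx - half, cx + half + 1)]

def disparity_at(right, left, xr, yr, half_patch=2, half_w=4, half_h=1):
    """Horizontal disparity D = xr - xl of the right-image pixel (xr, yr)."""
    ref = patch(right, xr, yr, half_patch)
    if ref is None:
        return None
    # Visit candidates nearest the zero-disparity position first, so that ties
    # resolve toward zero disparity (an assumption about the tie-break).
    offsets = sorted(((dx, dy)
                      for dy in range(-half_h, half_h + 1)
                      for dx in range(-half_w, half_w + 1)),
                     key=lambda o: abs(o[0]) + abs(o[1]))
    best_score, best_xl = -2.0, None
    for dx, dy in offsets:
        cand = patch(left, xr + dx, yr + dy, half_patch)
        if cand is None:
            continue
        score = ncc(ref, cand)
        if score > best_score:
            best_score, best_xl = score, xr + dx
    return None if best_xl is None else xr - best_xl

# Toy stereo pair: the right image is the left image shifted by 2 pixels.
rng = random.Random(0)
left = [[rng.random() for _ in range(30)] for _ in range(12)]
right = [row[2:] + [0.0, 0.0] for row in left]
print(disparity_at(right, left, xr=12, yr=6))
```

Running `disparity_at` at every pixel would give a dense disparity map of the kind the analysis needs; the brute-force search is written for clarity rather than speed.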
One might ask why, since we are interested only in local scene statistics, we did not record the movements of both eyes and use the disparity between the recorded left and right fixations to find the correct match. There are several reasons why this was not practical. First, fixation disparities occur between the right and left eyes. Most people have a fixation disparity of less than 6 arcmin, but it can be as large as 20 arcmin with peripheral visual targets [Wick 1985]. When fixation disparity occurs, the image of an object point that a person is trying to fixate does not fall on exactly corresponding points. Second, each Purkinje eye tracker has an accuracy of about 7 arcmin (the median offset from real fixations); hence the error between two eye trackers is about 14 arcmin, which corresponds to about 12 pixels. This is quite a large error, considering that the stereo images are only 800×600 pixels. Third, the fixation detection algorithm (associated with eye tracking) applied to the two eye paths can also introduce unwanted noise into the corresponding fixation locations; registered stereoscopic eyetracking is difficult to accomplish in practice. Lastly, since we are interested in disparity features within neighborhoods (not just points) of the fixations, a dense disparity map is required for all points in the neighborhood. Hence, local processing of the type that our disparity algorithm accomplishes would be required anyway.

2.3     Equipment

Stereo images were displayed on two 17 inch, gamma-calibrated monitors. The distance between the monitors and the observer was 124 cm. Each monitor's screen resolution was set at 800×600 pixels, corresponding to about 50 pixels per degree of visual angle. The total spatial extent of each display was thus about 16° × 12° of visual angle.

A haploscope was placed between the two monitors and the observers to completely separate the displays of the left and right monitors. Eye movements were recorded by an SRI Generation V Dual Purkinje eye tracker. This eye tracker has an accuracy of < 10′ of arc, a response time of under 1 ms, and a bandwidth of DC to > 400 Hz. The output of the eye tracker (horizontal and vertical eye position signals) was low-pass filtered in hardware and then sampled at 200 Hz by a National Instruments data acquisition board in a Pentium IV host computer, where the data were stored for offline analysis. The observers used a bite bar and a forehead rest to restrict their head movements.

A viewing session was composed of 48 stereo image pairs. At the beginning of each session, a 0.3°×0.3° crosshair was displayed at the centers of both monitors to help the observers fuse by fixating on it. When the correct binocular fixation was achieved (the crosshair was perceived as single), the observer pressed a button to start the calibration. A 3×3 calibration grid was displayed on each monitor. After the observer had visited all 9 dots, a linear interpolation was performed to establish the transformation between the output voltages of the eye tracker and the position of the subject's gaze on each computer display. The calibration also accounted for crosstalk between the horizontal and vertical voltage measurements. After correct calibration, a 0.3°×0.3° crosshair was displayed at the centers to force all observers to start from the same center position. Each stereo pair was displayed on the two monitors for 10 seconds, during which the eye movements were recorded. Between two consecutive image pairs, two identical Gaussian noise images were displayed for 3 seconds on both monitors to help suppress after-images of the previous stereo pair that might otherwise have attracted fixations. Then a 0.3°×0.3° crosshair was displayed at the centers of both monitors to help the observers fuse before the presentation of the next stereo pair.

The calibration routine was repeated every 10 images, and a calibration test was run every 5 images. The test required the observer to fixate for 500 ms, within a 5 s time limit, on a central square region (0.3°×0.3°) prior to progressing to the next image in the stimulus collection. If the calibration had drifted, the observer would be unable to satisfy this test, and the full calibration procedure was re-run.

Observers who became uncomfortable during the experiment were allowed to take a break of any duration they desired.

The ambient illumination in the experiment room was kept constant for all observers, with a minimum of 5 minutes of luminance adaptation provided while the eye tracker was calibrated.

[Figure 1. The luminance and disparity features. Panels: luminance, disparity, luminance gradient, disparity gradient, luminance contrast, disparity contrast.]

3       Analysis

3.1     Computation of scene statistics

Denote the right image as I, and the dense disparity map as D. We computed the luminance gradient map:

$$G_I = \sqrt{\left(\frac{\partial I}{\partial x}\right)^2 + \left(\frac{\partial I}{\partial y}\right)^2}$$

and the disparity gradient map:

$$G_D = \sqrt{\left(\frac{\partial D}{\partial x}\right)^2 + \left(\frac{\partial D}{\partial y}\right)^2}$$
using the Matlab function gradient(X). Here we define luminance contrast as the RMS contrast of a luminance patch:

$$C_I = \sqrt{\frac{1}{M}\sum_{k=1}^{M}\left(I_k - \bar{I}\right)^2},$$

and likewise, disparity contrast as

$$C_D = \sqrt{\frac{1}{M}\sum_{k=1}^{M}\left(D_k - \bar{D}\right)^2}$$

which is the standard deviation of a disparity patch. Here $I_k$ and $D_k$ are the $M$ luminance and disparity values in the patch, and $\bar{I}$ and $\bar{D}$ are their patch means. All of the following analysis is based on these four scene features: luminance contrast, disparity contrast, luminance gradient, and disparity gradient. Figure 1 depicts these scene features for one example stereo pair.

3.2     Random locations

Suppose an observer made $N_i$ fixations on the $i$-th image. Then the total number of fixations that the observer made during a session is $\sum_{i=1}^{48} N_i$. We assume that a random observer made the same number of fixations as the subject did. That is, for the $i$-th image, the random observer selected $N_i$ fixation locations uniformly distributed over the image plane. For each human observer, we assume that there are 100 random observers, each making the same number of fixations in each image as the human observer. For example, subject LKC made a total of 486 fixations on the 48 images, so each random observer also selected 486 random locations, and the total number of random locations is 48,600. We wanted to know whether there is a statistically significant difference between image features at fixations and those at randomly selected locations, which we assessed by comparing the human observer's data with the data of the 100 random observers.

3.3     Fixation/Random ratios

For the $i$-th image, we computed the mean luminance contrast at the fixations as:

$$\bar{C}_I^{f} = \frac{1}{N_i}\sum_{j=1}^{N_i} C_I\left(x_j^{f}, y_j^{f}\right),$$

where $(x_j^{f}, y_j^{f})$ is the location of the $j$-th fixation, and $C_I$ is the luminance contrast map.

We also computed the mean luminance contrast at the $N_i$ random locations as:

$$\bar{C}_I^{r} = \frac{1}{N_i}\sum_{j=1}^{N_i} C_I\left(x_j^{r}, y_j^{r}\right),$$

where $(x_j^{r}, y_j^{r})$ is the location of the $j$-th random location.

We defined the patch luminance gradient as the mean gradient of the patch:

$$\tilde{G}_I = \sum_{k=1}^{M} G_I\left(x_k, y_k\right) / M,$$

where $G_I$ is the luminance gradient map and $(x_k, y_k)$ ranges over the $M$ pixels of the patch. Similarly, we computed the mean patch luminance gradient of the fixations for the $i$-th image as:

$$\bar{G}_I^{f} = \frac{1}{N_i}\sum_{j=1}^{N_i} \tilde{G}_I\left(x_j^{f}, y_j^{f}\right),$$

where $(x_j^{f}, y_j^{f})$ is the location of the $j$-th fixation. We also computed the mean patch gradient of the $N_i$ random locations of the $i$-th image as:

$$\bar{G}_I^{r} = \frac{1}{N_i}\sum_{j=1}^{N_i} \tilde{G}_I\left(x_j^{r}, y_j^{r}\right),$$

where $(x_j^{r}, y_j^{r})$ is the location of the $j$-th random location.

For each image, we then defined the fixation-to-random luminance contrast ratio $R_{C_I} = \bar{C}_I^{f} / \bar{C}_I^{r}$ and the fixation-to-random luminance gradient ratio $R_{G_I} = \bar{G}_I^{f} / \bar{G}_I^{r}$. If $R_{C_I} > 1$, the fixated patches generally have a larger luminance contrast than randomly selected patches on the image being considered. If $R_{C_I} < 1$, the meaning is reversed. The same interpretation applies to the luminance gradient ratio.

The same analysis that was used for luminance contrast and luminance gradient was also applied to disparity. We calculated the mean disparity contrast $\bar{C}_D^{f}$ on the fixated patches, and the same quantity $\bar{C}_D^{r}$ on the randomly selected patches. The ratio of disparity contrast between the fixated patches and the randomly selected patches is defined as $R_{C_D} = \bar{C}_D^{f} / \bar{C}_D^{r}$.

The mean patch disparity gradient at the fixated patches ($\bar{G}_D^{f}$) and at the randomly selected patches ($\bar{G}_D^{r}$) was also calculated. The ratio of the disparity gradient between the fixated patches and the random patches is defined as $R_{G_D} = \bar{G}_D^{f} / \bar{G}_D^{r}$. If the ratios are significantly greater than 1, then fixated patches tend to have a larger disparity contrast and gradient than randomly picked locations.

[Figure 2. Mean luminance gradient ratios (red), and mean disparity gradient ratios (blue) of three observers. Axes: ratio versus patch size (deg).]

We ran 100 simulations for each image, and plotted the mean luminance gradient ratio $R_{G_I}$ with 95% confidence intervals (CIs), and the mean disparity gradient ratio $R_{G_D}$ with 95% CIs, for all subjects, as shown in Figure 2. For better comparison, we plotted 1 as a straight horizontal line across all patch sizes. The red curves show the ratios of luminance gradient, and the blue curves show the ratios of disparity gradient. Different markers were used to represent the observers: LKC (*), CHY (o), JSL (∆). We made a
similar ensemble comparison plot for the mean luminance contrast ratio and mean disparity contrast ratio, as displayed in Figure 3. It is clear that all the mean ratios of luminance gradient/contrast and their 95% CIs fall well above 1, while the mean ratios of disparity gradient/contrast fall well below 1. All of the results we obtained lead to the conclusion that humans tend to fixate regions of higher luminance variation and lower disparity variation.

[Figure 3. Mean luminance contrast ratios (red), and mean disparity contrast ratios (blue) of three observers. Axes: ratio versus patch size (deg).]

4       Discussion

The most immediate explanation for this behavioral phenomenon is that the binocular vision system actively seeks to fixate scene points that will simplify or enhance stereoscopic perception. The nature of this enhancement is less clear. We suggest that the binocular visual system, unless directed otherwise by a higher-level mediator, seeks fixations that simplify the computational process of disparity and depth calculation. In particular, we propose that the binocular vision system seeks to avoid, when possible, regions where the disparity computations are complicated by missing information, such as occlusions, or by rapid changes in disparity, which may be harder to resolve.

Many fMRI studies have reported that V1 activity increases with disparity variation [Tootell et al. 1988; Backus et al. 2001]. Georgieva et al. [2009] showed that activity in occipital cortex and ventral IPS, at the edges of the V3A complex, was correlated with the amplitude of the disparity variation that subjects perceived in the stimuli. All of these studies show that the processing of large, complex disparity shapes involves the allocation of more neuronal resources and energy than does the processing of smooth disparity fields.

Another (related) possibility is that when viewing an object, observers tend to fixate towards the center of the object, leaving the object boundaries (often associated with depth discontinuities) to peripheral processing involving larger receptive fields both in space and disparity. Such a strategy would allow the fovea to operate under better-posed stereo viewing conditions, and to process 3D surface detail, while the periphery could simultaneously encode the (statistically) large disparities associated with object boundaries.

References

ANZAI, A., OHZAWA, I., AND FREEMAN, R. D. 1999. Neural Mechanisms for Processing Binocular Information II. Complex Cells. J Neurophysiol 82, 2, 909-924.

BACKUS, B. T., FLEET, D. J., PARKER, A. J., AND HEEGER, D. J. 2001. Human Cortical Activity Correlates With Stereoscopic Depth Perception. J Neurophysiol 86, 4, 2054-2068.

FLEET, D. J., WAGNER, H., AND HEEGER, D. J. 1996. Neural Encoding of Binocular Disparity: Energy Models, Position Shifts and Phase Shifts. Vision Research 36, 1839-1857.

GEORGIEVA, S., PEETERS, R., KOLSTER, H., TODD, J. T., AND ORBAN, G. A. 2009. The Processing of Three-Dimensional Shape from Disparity in the Human Brain. J. Neurosci. 29, 3, 727-742.

HOYER, P. O., AND HYVÄRINEN, A. 2000. Independent Component Analysis Applied to Feature Extraction from Colour and Stereo Images. Network: Computation in Neural Systems 11, 191-210.

JANSEN, L., ONAT, S., AND KÖNIG, P. 2009. Influence of Disparity on Fixation and Saccades in Free Viewing of Natural Scenes. Journal of Vision 9, 1, 1-19.

LIU, Y., BOVIK, A. C., AND CORMACK, L. K. 2008. Disparity Statistics in Natural Scenes. Journal of Vision 8, 11, 1-14.

OHZAWA, I., DEANGELIS, G., AND FREEMAN, R. 1990. Stereoscopic Depth Discrimination in the Visual Cortex: Neurons Ideally Suited as Disparity Detectors. Science 249, 4972, 1037-1041.

PARKHURST, D. J., AND NIEBUR, E. 2003. Scene Content Selected by Active Vision. Spatial Vision 16, 2, 125-154.

QIAN, N. 1994. Computing Stereo Disparity and Motion with Known Binocular Cell Properties. Neural Computation 6, 3, 390-404.

RAJASHEKAR, U., VAN DER LINDE, I., BOVIK, A. C., AND CORMACK, L. K. 2007. Foveated Analysis of Image Features at Fixations. Vision Research 47, 25, 3160-3172.

RAJASHEKAR, U., VAN DER LINDE, I., BOVIK, A. C., AND CORMACK, L. K. 2008. GAFFE: A Gaze-attentive Fixation Finding Engine. IEEE Transactions on Image Processing 17, 4, 564-573.

REINAGEL, P., AND ZADOR, A. M. 1999. Natural Scene Statistics at the Center of Gaze. Network: Computation in Neural Systems 10, 4, 341-350.

TOOTELL, R. B., HAMILTON, S. L., SILVERMAN, M. S., AND SWITKES, E. 1988. Functional Anatomy of Macaque Striate Cortex. I. Ocular Dominance, Binocular Interactions, and Baseline Conditions. J. Neurosci. 8, 5, 1500-1530.

WICK, B. 1985. Forced Vergence Fixation Disparity Curves at Distance and Near in an Asymptomatic Young Adult Population. American Journal of Optometry and Physiological Optics 62, 9, 591-599.
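As a supplementary illustration of the scene features of Section 3.1, the following Python sketch computes a gradient magnitude map and the patch-level contrast and mean gradient. It is a minimal sketch under assumptions rather than the authors' code: the function names are hypothetical, the gradient map is assumed to be the gradient magnitude, the differences follow the unit-spacing scheme of Matlab's gradient (central differences in the interior, one-sided differences at the borders), and contrast is taken as the population standard deviation of the patch.

```python
import math

def _diff(seq, i):
    """Central difference inside the sequence, one-sided at its two ends."""
    if i == 0:
        return seq[1] - seq[0]
    if i == len(seq) - 1:
        return seq[-1] - seq[-2]
    return (seq[i + 1] - seq[i - 1]) / 2.0

def gradient_magnitude(img):
    """Gradient magnitude map of a 2D list (needs at least 2 rows and 2 columns)."""
    h, w = len(img), len(img[0])
    cols = [[img[y][x] for y in range(h)] for x in range(w)]
    return [[math.hypot(_diff(img[y], x), _diff(cols[x], y))
             for x in range(w)] for y in range(h)]

def patch_values(img, cx, cy, half):
    """Values of the square patch centered on (cx, cy); it must lie inside the image."""
    h, w = len(img), len(img[0])
    if cx - half < 0 or cy - half < 0 or cx + half >= w or cy + half >= h:
        raise ValueError("patch extends beyond the image")
    return [img[y][x]
            for y in range(cy - half, cy + half + 1)
            for x in range(cx - half, cx + half + 1)]

def patch_contrast(img, cx, cy, half):
    """Contrast of a patch, taken here as the standard deviation of its values."""
    vals = patch_values(img, cx, cy, half)
    mean = sum(vals) / len(vals)
    return math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))

def patch_mean_gradient(grad, cx, cy, half):
    """Mean of a gradient magnitude map over a patch."""
    vals = patch_values(grad, cx, cy, half)
    return sum(vals) / len(vals)

# Toy input: a horizontal luminance ramp with slope 3 per pixel.
ramp = [[3.0 * x for x in range(6)] for _ in range(5)]
grad = gradient_magnitude(ramp)
print(grad[2][2], patch_contrast(ramp, 2, 2, 1), patch_mean_gradient(grad, 2, 2, 1))
```

The same functions apply unchanged to a disparity map, which yields the disparity gradient and disparity contrast.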

Hyperspectral Data Compression Using Spatial-Spectral Lossless Coding Technique
CSCJournals
 

Similaire à Liu Natural Scene Statistics At Stereo Fixations (20)

Cassandra audio-video sensor fusion for aggression detection
Cassandra  audio-video sensor fusion for aggression detectionCassandra  audio-video sensor fusion for aggression detection
Cassandra audio-video sensor fusion for aggression detection
 
poster09
poster09poster09
poster09
 
Automated Diagnosis of Glaucoma using Haralick Texture Features
Automated Diagnosis of Glaucoma using Haralick Texture FeaturesAutomated Diagnosis of Glaucoma using Haralick Texture Features
Automated Diagnosis of Glaucoma using Haralick Texture Features
 
How we look at photographs 1 (JSPS&TJ 2005)
How we look at photographs 1 (JSPS&TJ 2005)How we look at photographs 1 (JSPS&TJ 2005)
How we look at photographs 1 (JSPS&TJ 2005)
 
P.maria sheeba 15 mco010
P.maria sheeba 15 mco010P.maria sheeba 15 mco010
P.maria sheeba 15 mco010
 
Presentación del Filtro de Bosso 2
Presentación del Filtro de Bosso 2Presentación del Filtro de Bosso 2
Presentación del Filtro de Bosso 2
 
Content-based Image Retrieval
Content-based Image RetrievalContent-based Image Retrieval
Content-based Image Retrieval
 
HIGH RESOLUTION MRI BRAIN IMAGE SEGMENTATION TECHNIQUE USING HOLDER EXPONENT
HIGH RESOLUTION MRI BRAIN IMAGE SEGMENTATION TECHNIQUE USING HOLDER EXPONENTHIGH RESOLUTION MRI BRAIN IMAGE SEGMENTATION TECHNIQUE USING HOLDER EXPONENT
HIGH RESOLUTION MRI BRAIN IMAGE SEGMENTATION TECHNIQUE USING HOLDER EXPONENT
 
High Resolution Mri Brain Image Segmentation Technique Using Holder Exponent
High Resolution Mri Brain Image Segmentation Technique Using Holder Exponent  High Resolution Mri Brain Image Segmentation Technique Using Holder Exponent
High Resolution Mri Brain Image Segmentation Technique Using Holder Exponent
 
Image texture analysis techniques survey-1
Image texture analysis techniques  survey-1Image texture analysis techniques  survey-1
Image texture analysis techniques survey-1
 
COMPUTATIONALLY EFFICIENT TWO STAGE SEQUENTIAL FRAMEWORK FOR STEREO MATCHING
COMPUTATIONALLY EFFICIENT TWO STAGE SEQUENTIAL FRAMEWORK FOR STEREO MATCHINGCOMPUTATIONALLY EFFICIENT TWO STAGE SEQUENTIAL FRAMEWORK FOR STEREO MATCHING
COMPUTATIONALLY EFFICIENT TWO STAGE SEQUENTIAL FRAMEWORK FOR STEREO MATCHING
 
European project PROMET - Deep Real-Time Optoacoustic Imaging of Breast Tumors
European project PROMET - Deep Real-Time Optoacoustic Imaging of Breast TumorsEuropean project PROMET - Deep Real-Time Optoacoustic Imaging of Breast Tumors
European project PROMET - Deep Real-Time Optoacoustic Imaging of Breast Tumors
 
Congenital Bucolic and Farming Region Taxonomy Using Neural Networks for Remo...
Congenital Bucolic and Farming Region Taxonomy Using Neural Networks for Remo...Congenital Bucolic and Farming Region Taxonomy Using Neural Networks for Remo...
Congenital Bucolic and Farming Region Taxonomy Using Neural Networks for Remo...
 
Hyperspectral Data Compression Using Spatial-Spectral Lossless Coding Technique
Hyperspectral Data Compression Using Spatial-Spectral Lossless Coding TechniqueHyperspectral Data Compression Using Spatial-Spectral Lossless Coding Technique
Hyperspectral Data Compression Using Spatial-Spectral Lossless Coding Technique
 
DOMAIN SPECIFIC CBIR FOR HIGHLY TEXTURED IMAGES
DOMAIN SPECIFIC CBIR FOR HIGHLY TEXTURED IMAGESDOMAIN SPECIFIC CBIR FOR HIGHLY TEXTURED IMAGES
DOMAIN SPECIFIC CBIR FOR HIGHLY TEXTURED IMAGES
 
Image Denoising Based On Sparse Representation In A Probabilistic Framework
Image Denoising Based On Sparse Representation In A Probabilistic FrameworkImage Denoising Based On Sparse Representation In A Probabilistic Framework
Image Denoising Based On Sparse Representation In A Probabilistic Framework
 
Supraorbital Margins for Identification of Sexual Dimorphism and Age Detectio...
Supraorbital Margins for Identification of Sexual Dimorphism and Age Detectio...Supraorbital Margins for Identification of Sexual Dimorphism and Age Detectio...
Supraorbital Margins for Identification of Sexual Dimorphism and Age Detectio...
 
A Hybrid Data Analysis And Mesh Refinement Paradigm For Conformal Voxel
A Hybrid Data Analysis And Mesh Refinement Paradigm For Conformal VoxelA Hybrid Data Analysis And Mesh Refinement Paradigm For Conformal Voxel
A Hybrid Data Analysis And Mesh Refinement Paradigm For Conformal Voxel
 
Maximizing Strength of Digital Watermarks Using Fuzzy Logic
Maximizing Strength of Digital Watermarks Using Fuzzy LogicMaximizing Strength of Digital Watermarks Using Fuzzy Logic
Maximizing Strength of Digital Watermarks Using Fuzzy Logic
 
Optics group research overview
Optics group research overviewOptics group research overview
Optics group research overview
 

Plus de Kalle

Plus de Kalle (20)

Blignaut Visual Span And Other Parameters For The Generation Of Heatmaps
Blignaut Visual Span And Other Parameters For The Generation Of HeatmapsBlignaut Visual Span And Other Parameters For The Generation Of Heatmaps
Blignaut Visual Span And Other Parameters For The Generation Of Heatmaps
 
Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...
Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...
Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...
 
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
 
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
 
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
 
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze ControlUrbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
 
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
 
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic TrainingTien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
 
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
 
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser OphthalmoscopeStevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
 
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
 
Skovsgaard Small Target Selection With Gaze Alone
Skovsgaard Small Target Selection With Gaze AloneSkovsgaard Small Target Selection With Gaze Alone
Skovsgaard Small Target Selection With Gaze Alone
 
San Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
San Agustin Evaluation Of A Low Cost Open Source Gaze TrackerSan Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
San Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
 
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural TasksRyan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
 
Rosengrant Gaze Scribing In Physics Problem Solving
Rosengrant Gaze Scribing In Physics Problem SolvingRosengrant Gaze Scribing In Physics Problem Solving
Rosengrant Gaze Scribing In Physics Problem Solving
 
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual SearchQvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
 
Prats Interpretation Of Geometric Shapes An Eye Movement Study
Prats Interpretation Of Geometric Shapes An Eye Movement StudyPrats Interpretation Of Geometric Shapes An Eye Movement Study
Prats Interpretation Of Geometric Shapes An Eye Movement Study
 
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
 
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
 
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
 

Natural Scene Statistics at Stereo Fixations

Yang Liu, Department of ECE, University of Texas at Austin, young76@mail.utexas.edu
Lawrence K. Cormack, Department of Psychology, University of Texas at Austin, cormack@psy.utexas.edu
Alan C. Bovik, Department of ECE, University of Texas at Austin, bovik@ece.utexas.edu

Abstract

We conducted eye tracking experiments on naturalistic stereo images presented through a haploscope, and found that fixated luminance contrast and luminance gradient were generally higher than randomly selected luminance contrast and luminance gradient, which agrees with the previous literature. However, we also found that the fixated disparity contrast and disparity gradient were generally lower than randomly selected disparity contrast and disparity gradient. We discuss the implications of this remarkable result.

Keywords: fixation, stereopsis, natural scene statistics

Copyright © 2010 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail permissions@acm.org. ETRA 2010, Austin, TX, March 22 – 24, 2010. © 2010 ACM 978-1-60558-994-7/10/0003 $10.00

1 Introduction / Overview

Many studies have shown that several low-level luminance features at human fixations are significantly different from those at other areas. Reinagel and Zador [1999] found that the regions around human fixations tend to have higher spatial contrasts and spatial entropies than random fixation regions, which suggests that the human visual system may try to select image regions that help maximize the information content transmitted to the visual cortex, by minimizing the redundancy in the image representation. By varying the patch sizes around the fixations, Parkhurst and Niebur [2003] found that the largest difference between the luminance contrast of fixated regions and that of image-shuffled (pseudo-random) regions is observed when the patch size is 1 degree. In a recent study [Rajashekar et al. 2007], fixated image patches were foveated using an eccentricity-based model, and higher-order statistics were analyzed than in prior studies. They found that bandpass contrast showed a notably larger difference between fixations and random patches than other higher-order statistics. Other features found to attract fixations (in decreasing order of attractiveness) included bandpass luminance, RMS contrast, and luminance. These data were subsequently used to develop an image processing “fixation predictor” [Rajashekar et al. 2008].

The body of literature on visual fixations on 2D luminance images appears to be largely consistent across the studies. However, there has been very little work done on analyzing the nature or statistics of images and scenes at the point of gaze in three dimensions. Certainly the three-dimensional attributes of the world affect the way we interact with it both visually and physically. Moreover, the 3D statistics of the natural world have likely played a role in the adaptation of the visual system [Liu et al. 2008]. One reason for the lack of studies on fixations in 3D space has been the dearth of 3D ground truth natural scene data. The few databases of naturalistic stereo images do not adequately fulfill this need, since what is needed are dense disparity maps for each scene. One very recent study [Jansen et al., 2009] used natural scenes with ground truth disparity maps acquired from laser scanning. They found that disparity appears to be a salient feature that affects eye movements. In particular, they found that the presence of disparity information may affect saccade length, but not duration; that disparity appears not to affect the saliency of luminance features; and that subjects tend to fixate nearer objects earlier than more distant objects. A reasonable agreement with intuition may be found in each of the results. The authors also found that in 3D noise images, subjects tended to fixate depth discontinuities more frequently than smooth depth regions. One might view this result as intuitive also; as with luminance images, such locations might be deemed interesting.

2 Stereo Eyetracking

2.1 Observers

Three male observers participated in this study. Their stereo ability was confirmed to be normal by presenting them with random dot stereograms. Observer LKC has extensive experience in psychophysical studies, and knew the purpose of the experiment. Observers JSL and CHY were naïve.

2.2 Stimulus

We manually selected 48 grayscale stereoscopic outdoor scenes [Hoyer & Hyvärinen, 2000] that contained mountains, trees, water, rocks, bushes, etc., but avoided manmade objects. We did not have ground truth disparity data for these scenes. Instead, we relied on a local correlation method that is simple yet biologically inspired. Models of binocular complex neurons [Anzai et al. 1999; Fleet et al. 1996; Ohzawa et al. 1990; Qian, 1994] commonly contain a cross correlation term in their response function.

The simple correspondence algorithm was defined as follows. Given a pixel $(x_r, y_r)$ in the right image, we defined a 161×5 (3.2°×0.1°) search window centered on the same pixel location (preferring zero disparity) in the left image. Given a 1°×1° patch in the right image centered on the pixel $(x_r, y_r)$, the algorithm computed the cross correlation between the right patch and a candidate 1°×1° left patch centered on each pixel in the 161×5 search window. The left patch yielding the largest cross correlation was deemed to be the matched patch. The location $(x_l, y_l)$ corresponding to the center of that patch was deemed the matched pixel for the right pixel $(x_r, y_r)$; hence the horizontal disparity of $(x_r, y_r)$ was taken to be $D(x_r, y_r) = x_r - x_l$.
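The correspondence search described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names (`normalized_xcorr`, `match_disparity`) and parameters are hypothetical, and the use of a mean-subtracted, normalized cross correlation is an assumption, since the text says only "cross correlation". The default sizes assume about 50 pixels per degree, so a 1°×1° patch is roughly 51×51 pixels and the search window is 161×5 pixels.

```python
import numpy as np

def normalized_xcorr(a, b):
    """Normalized cross correlation of two equal-sized patches (assumed variant)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def match_disparity(left, right, xr, yr,
                    patch_half=25, search_half_w=80, search_half_h=2):
    """Return D(xr, yr) = xr - xl for the best-matching left patch.

    The search window in the left image is centered on (xr, yr) and spans
    (2*search_half_w + 1) x (2*search_half_h + 1) pixels, i.e. 161 x 5 by default.
    """
    h, w = right.shape
    if not (patch_half <= xr < w - patch_half and patch_half <= yr < h - patch_half):
        raise ValueError("right patch must lie fully inside the image")
    right_patch = right[yr - patch_half: yr + patch_half + 1,
                        xr - patch_half: xr + patch_half + 1]
    best_score, best_xl = -np.inf, xr
    for yl in range(yr - search_half_h, yr + search_half_h + 1):
        for xl in range(xr - search_half_w, xr + search_half_w + 1):
            # Skip candidates whose patch would fall outside the left image.
            if (yl - patch_half < 0 or xl - patch_half < 0 or
                    yl + patch_half >= left.shape[0] or
                    xl + patch_half >= left.shape[1]):
                continue
            left_patch = left[yl - patch_half: yl + patch_half + 1,
                              xl - patch_half: xl + patch_half + 1]
            score = normalized_xcorr(right_patch, left_patch)
            if score > best_score:
                best_score, best_xl = score, xl
    return xr - best_xl
```

Calling `match_disparity` for every pixel of the right image would yield a dense disparity map; the brute-force double loop is kept for clarity rather than speed.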
It could be argued that, since we are only interested in local scene statistics, why not record the movements of both eyes, and use the disparity between the recorded left and right fixations to find the correct match? There are several reasons why this was not practical. First, fixation disparities occur between the right and left eyes. Most people have a fixation disparity of less than 6 arcmin, but it can be as large as 20 arcmin with peripheral visual targets [Wick, 1985]. When fixation disparity occurs, the image of an object point that a person is trying to fixate does not fall on exactly corresponding points. Secondly, each Purkinje eye tracker has an accuracy of about 7 arcmin (the median offset from real fixations); hence the error between two eye trackers is about 14 arcmin, which corresponds to about 12 pixels. This is quite a large error, considering that the stereo images are only 800×600. Thirdly, the fixation detection algorithm (associated with eye-tracking) applied to the two eye paths can also introduce unwanted noise into the corresponding fixation locations; registered stereoscopic eyetracking is difficult to accomplish in practice. Lastly, since we are interested in disparity features within neighborhoods (not just points) of the fixations, a dense disparity map is required for all points in the neighborhood. Hence, local processing of the type that our disparity algorithm accomplishes would be required anyway.

2.3 Equipment

Stereo images were displayed on two 17 inch, gamma calibrated monitors. The distance between the monitors and the observer was 124 cm. Each monitor’s screen resolution was set at 800×600 pixels, corresponding to about 50 pixels per degree of visual angle. The total spatial extent of each display was thus about 16°×12° of visual angle.

A haploscope was placed between the two monitors and the observers to completely separate the displays from the left and right monitors. Eye movements were recorded by an SRI Generation V Dual Purkinje eye tracker. This eye tracker has an accuracy of < 10′ of arc, a response time of under 1 ms, and a bandwidth of DC to > 400 Hz. The output of the eye tracker (horizontal and vertical eye position signals) was low-pass filtered in hardware and then sampled at 200 Hz by a National Instruments data acquisition board in a Pentium IV host computer, where the data were stored for offline analysis. The observers used a bite bar and a forehead rest to restrict their head movements.

A viewing session was composed of 48 viewed stereo image pairs. At the beginning of each session, a 0.3°×0.3° crosshair was displayed on the centers of both monitors to help the observers fuse by fixating on it. When the correct binocular fixation was achieved (the crosshair was perceived as single), the observer pressed a button to start the calibration. A 3×3 calibration grid was displayed on each monitor. After the observers visited all 9 dots, a linear interpolation was done to establish the transformation between the output voltages of the eye tracker and the position of the subject’s gaze on each computer display. The calibration also accounted for crosstalk between the horizontal and vertical voltage measurements. After correct calibration, a 0.3°×0.3° crosshair was displayed on the centers to force all observers to start from the same center position. The stereo images were displayed on the two monitors for 10 seconds, during which the eye movements were recorded. Between two consecutive image pairs, two identical Gaussian noise images were displayed for 3 seconds on both monitors to help suppress after-images of the previous stereo pair that might otherwise have attracted fixations. Then a 0.3°×0.3° crosshair was displayed on the centers of both monitors to help the observers fuse before the presentation of the next stereo pair.

This calibration routine was repeated compulsorily every 10 images, and a calibration test was run every 5 images. The test required that the observer fixate for 500 ms within a 5 s time limit on a central square region (0.3°×0.3°) prior to progressing to the next image in the stimulus collection. If the calibration had drifted, the observer would be unable to satisfy this test, and the full calibration procedure was re-run.

Observers who became uncomfortable during the experiment were allowed to take a break of any duration they desired.

The ambient illumination in the experiment room was kept constant for all observers, with a minimum of 5 minutes of luminance adaptation provided while the eye tracker was calibrated.

Figure 1. The luminance and disparity features. (Panels: luminance, disparity, luminance gradient, disparity gradient, luminance contrast, disparity contrast.)

3 Analysis

3.1 Computation of scene statistics

Denote the right image as $I$ and the dense disparity map as $D$. We computed the luminance gradient map

$$G_I = \sqrt{\left(\frac{\partial I}{\partial x}\right)^2 + \left(\frac{\partial I}{\partial y}\right)^2}$$

and the disparity gradient map

$$G_D = \sqrt{\left(\frac{\partial D}{\partial x}\right)^2 + \left(\frac{\partial D}{\partial y}\right)^2}$$
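As a rough illustration of the gradient maps, the following Python sketch uses `numpy.gradient`, which computes central differences in the interior much like the Matlab `gradient` function. Combining the two directional derivatives into a gradient magnitude is an assumption made here for illustration, and the function name `gradient_magnitude` is hypothetical.

```python
import numpy as np

def gradient_magnitude(img):
    """Per-pixel gradient magnitude of a 2D map (luminance I or disparity D)."""
    # For a 2D array, np.gradient returns derivatives along rows (y), then columns (x).
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return np.hypot(gx, gy)

# Example: a linear ramp 3*x + 4*y has gradient magnitude 5 at every pixel.
yy, xx = np.mgrid[0:6, 0:8]
ramp = 3.0 * xx + 4.0 * yy
ramp_gradient = gradient_magnitude(ramp)
```

The same function would apply unchanged to the right image and to the dense disparity map.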
Both gradient maps were computed using the Matlab function gradient(X). Here we define luminance contrast as the RMS contrast of a luminance patch $P$ containing $M$ pixels:

$$C_I = \sqrt{\frac{1}{M} \sum_{(x,y) \in P} \left( I(x,y) - \bar{I}_P \right)^2}$$

and likewise, disparity contrast as

$$C_D = \sqrt{\frac{1}{M} \sum_{(x,y) \in P} \left( D(x,y) - \bar{D}_P \right)^2}$$

which is the standard deviation of a disparity patch; $\bar{I}_P$ and $\bar{D}_P$ denote the patch means. All of the following analysis is based on these four scene features: luminance contrast, disparity contrast, luminance gradient, and disparity gradient. Figure 1 depicts these scene features for one example stereo pair.

3.2 Random locations

Suppose an observer made $N_i$ fixations on the $i$-th image. Then the total number of fixations that the observer made during a session is $N = \sum_{i=1}^{48} N_i$. We assume that a random observer made the same number of fixations as the subject did. That is, for the $i$-th image, the random observer selected $N_i$ fixations uniformly distributed on the image plane. For each human observer, we assume that there are 100 random observers, each making the same number of fixations in each image as the human observer. For example, subject LKC made 486 fixations in total over the 48 images, so each random observer also selected 486 random locations, for a total of 48,600 random locations. We wanted to know whether there is a statistically significant difference between image features at fixations and those at randomly selected locations by comparing the human observer’s data with the 100 random observers’ data.

3.3 Fixation/Random ratios

For the $i$-th image, we computed the mean luminance contrast at the fixations as:

$$\bar{C}_{I,i}^{f} = \frac{1}{N_i} \sum_{j=1}^{N_i} C_I(x_j, y_j)$$

where $(x_j, y_j)$ is the location of the $j$-th fixation, and $C_I$ is the luminance contrast map.

We also computed the mean luminance contrast at the random locations as:

$$\bar{C}_{I,i}^{r} = \frac{1}{N_i} \sum_{j=1}^{N_i} C_I(x'_j, y'_j)$$

where $(x'_j, y'_j)$ is the location of the $j$-th random location.

We defined the patch luminance gradient as the mean gradient of the patch $P(x,y)$ centered on $(x,y)$:

$$\bar{G}_I(x,y) = \sum_{(u,v) \in P(x,y)} G_I(u,v) \, / \, M$$

where $G_I$ is the luminance gradient map. Similarly, we computed the mean patch luminance gradient of the fixations for the $i$-th image as:

$$\bar{G}_{I,i}^{f} = \frac{1}{N_i} \sum_{j=1}^{N_i} \bar{G}_I(x_j, y_j)$$

where $(x_j, y_j)$ is the location of the $j$-th fixation.

We also computed the mean patch gradient of the random locations of the $i$-th image as:

$$\bar{G}_{I,i}^{r} = \frac{1}{N_i} \sum_{j=1}^{N_i} \bar{G}_I(x'_j, y'_j)$$

where $(x'_j, y'_j)$ is the location of the $j$-th random location.

For each image, we then defined the fixation-to-random luminance contrast ratio $R_{C_I} = \bar{C}_{I,i}^{f} / \bar{C}_{I,i}^{r}$ and the fixation-to-random luminance gradient ratio $R_{G_I} = \bar{G}_{I,i}^{f} / \bar{G}_{I,i}^{r}$. If $R_{C_I} > 1$, the fixated patches generally have a larger luminance contrast than randomly selected patches on the image being considered. If $R_{C_I} < 1$, the meaning is reversed. The same interpretation applies to the luminance gradient ratio.

The same analysis method that was used on the luminance contrast and luminance gradient was also applied to disparity. We calculated the mean disparity contrast on the fixated patches, $\bar{C}_{D,i}^{f}$, and the same quantity on the randomly selected patches, $\bar{C}_{D,i}^{r}$. The ratio of disparity contrast between the fixated patches and the randomly selected patches is defined as $R_{C_D} = \bar{C}_{D,i}^{f} / \bar{C}_{D,i}^{r}$.

The mean patch disparity gradient at the fixated patches ($\bar{G}_{D,i}^{f}$) and at the randomly selected patches ($\bar{G}_{D,i}^{r}$) was also calculated. The ratio of the disparity gradient between the fixated patches and the random patches is defined as $R_{G_D} = \bar{G}_{D,i}^{f} / \bar{G}_{D,i}^{r}$. If the ratios are significantly greater than 1, then fixated patches tend to have a larger disparity contrast and gradient than randomly picked locations.

Figure 2. Mean luminance gradient ratios (red), and mean disparity gradient ratios (blue) of three observers. (Axes: ratio versus patch size in degrees.)

We ran 100 simulations for each image, and plotted the mean luminance gradient ratio $R_{G_I}$ with 95% confidence intervals (CI), and the mean disparity gradient ratio $R_{G_D}$ with 95% CIs, for all subjects as shown in Figure 2. For better comparison, we plotted 1 as a straight horizontal line across all patch sizes. The red curves show the ratios of luminance gradient, and the blue curves show the ratios of disparity gradient. Different markers were used to represent the observers: LKC (*), CHY (o), JSL (∆).
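The fixation-to-random comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study. It assumes that a feature map (for example a contrast or patch-gradient map) has already been computed per pixel, that random observers draw integer pixel locations uniformly, and that the 95% CI is a normal-approximation interval over the simulated ratios; the text does not say how the intervals were obtained. The names `fixation_random_ratios` and `mean_with_ci95` are hypothetical.

```python
import numpy as np

def fixation_random_ratios(feature_map, fix_xy, n_random_observers=100, rng=None):
    """Ratios of the mean feature at fixations to the mean feature at random
    locations, one ratio per simulated random observer, for a single image."""
    rng = np.random.default_rng() if rng is None else rng
    fix_xy = np.asarray(fix_xy, dtype=int)          # rows of (x, y) pixel positions
    h, w = feature_map.shape
    fixated_mean = feature_map[fix_xy[:, 1], fix_xy[:, 0]].mean()
    ratios = np.empty(n_random_observers)
    for k in range(n_random_observers):
        # Each random observer makes as many "fixations" as the human observer.
        xs = rng.integers(0, w, size=len(fix_xy))
        ys = rng.integers(0, h, size=len(fix_xy))
        ratios[k] = fixated_mean / feature_map[ys, xs].mean()
    return ratios

def mean_with_ci95(values):
    """Mean and normal-approximation 95% confidence interval (assumed method)."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    half_width = 1.96 * values.std(ddof=1) / np.sqrt(len(values))
    return mean, mean - half_width, mean + half_width
```

A mean ratio whose interval lies entirely above 1 would indicate that the feature is larger at fixations than at random locations, and an interval entirely below 1 would indicate the opposite.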
A similar ensemble comparison plot for the mean luminance contrast ratio and mean disparity contrast ratio is displayed in Figure 3. It is clear that all the mean ratios of luminance gradient/contrast and their 95% CIs fall well above 1, while the mean ratios of disparity gradient/contrast fall well below 1. All of the results we obtained lead to the conclusion that humans tend to fixate at regions of higher luminance variation and lower disparity variation.

Figure 3. Mean luminance contrast ratios (red), and mean disparity contrast ratios (blue) of three observers. (Axes: ratio versus patch size in degrees.)

4 Discussion

The most immediate explanation for this behavioral phenomenon is that the binocular vision system actively seeks to fixate scene points that will simplify or enhance stereoscopic perception. The nature of this enhancement is less clear. We suggest that the binocular visual system, unless directed otherwise by a higher-level mediator, seeks fixations that simplify the computational process of disparity and depth calculation. In particular, we propose that the binocular vision system seeks to avoid, when possible, regions where the disparity computations are complicated by missing information, such as occlusions, or by rapid changes in disparity, which may be harder to resolve.

Many fMRI studies have reported that V1 activity increases with disparity variation [Tootell et al. 1988; Backus et al. 2001]. Georgieva et al. [2009] showed that activity in occipital cortex and ventral IPS, at the edges of the V3A complex, was correlated with the amplitude of the disparity variation that subjects perceived in the stimuli. All of these studies show that the processing of large, complex disparity shapes involves the allocation of more neuronal resources and energy than does the processing of smooth disparity fields.

Another (related) possibility is that when viewing an object, observers tend to fixate towards the center of the object, leaving the object boundaries (often associated with depth discontinuities) to peripheral processing involving larger receptive fields both in space and disparity. Such a strategy would allow the fovea to operate under better-posed stereo viewing conditions, and to process 3D surface detail, while the periphery could simultaneously encode the (statistically) large disparities associated with object boundaries.

References

ANZAI, A., OHZAWA, I., AND FREEMAN, R. D. 1999. Neural Mechanisms for Processing Binocular Information II. Complex Cells. J Neurophysiol, 82, 2, 909-924.

BACKUS, B. T., FLEET, D. J., PARKER, A. J., AND HEEGER, D. J. 2001. Human Cortical Activity Correlates With Stereoscopic Depth Perception. J Neurophysiol, 86, 4, 2054-2068.

FLEET, D. J., WAGNER, H., AND HEEGER, D. J. 1996. Neural Encoding of Binocular Disparity: Energy Models, Position Shifts and Phase Shifts. Vision Research, 36, 1839-1857.

GEORGIEVA, S., PEETERS, R., KOLSTER, H., TODD, J. T., AND ORBAN, G. A. 2009. The Processing of Three-Dimensional Shape from Disparity in the Human Brain. J. Neurosci., 29, 3, 727-742.

HOYER, P. O., AND HYVÄRINEN, A. 2000. Independent Component Analysis Applied to Feature Extraction from Colour and Stereo Images. Network Computation in Neural Systems, 11, 191-210.

JANSEN, L., ONAT, S., AND KÖNIG, P. 2009. Influence of Disparity on Fixation and Saccades in Free Viewing of Natural Scenes. Journal of Vision 9, 1, 1-19.

LIU, Y., BOVIK, A. C., AND CORMACK, L. K. 2008. Disparity Statistics in Natural Scenes. Journal of Vision 8, 11, 1-14.

OHZAWA, I., DEANGELIS, G., AND FREEMAN, R. 1990. Stereoscopic Depth Discrimination in The Visual Cortex: Neurons Ideally Suited as Disparity Detectors. Science, 249, 4972, 1037-1041.

PARKHURST, D. J., AND NIEBUR, E. 2003. Scene Content Selected by Active Vision. Spatial Vision 16, 2, 125-154.

QIAN, N. 1994. Computing Stereo Disparity and Motion with Known Binocular Cell Properties. Neural Computation, 6, 3, 390-404.

RAJASHEKAR, U., VAN DER LINDE, I., BOVIK, A. C., AND CORMACK, L. K. 2007. Foveated Analysis of Image Features at Fixations. Vision Research 47, 25, 3160-3172.

RAJASHEKAR, U., VAN DER LINDE, I., BOVIK, A. C., AND CORMACK, L. K. 2008. GAFFE: A Gaze-attentive Fixation Finding Engine. IEEE Transactions on Image Processing 17, 4, 564-573.

REINAGEL, P., AND ZADOR, A. M. 1999. Natural Scene Statistics at the Center of Gaze. Networks 10, 4, 341-350.

TOOTELL, R. B., HAMILTON, S. L., SILVERMAN, M. S., AND SWITKES, E. 1988. Functional Anatomy of Macaque Striate Cortex. I. Ocular Dominance, Binocular Interactions, and Baseline Conditions. J. Neurosci., 8, 5, 1500-1530.

WICK, B. 1985. Forced Vergence Fixation Disparity Curves at Distance and Near in an Asymptomatic Young Adult Population. American Journal of Optometry and Physiological Optics, 62, 9, 591-599.