We present saccadic models, an alternative way to predict where observers look. Unlike saliency models, saccadic models generate plausible visual scanpaths, from which saliency maps can be computed. These models have the further advantage of being adaptable to different viewing conditions, viewing tasks and types of visual scene. We demonstrate that saccadic models outperform existing saliency models at predicting where an observer looks in both a free-viewing condition and a quality-task condition (i.e. when observers have to score the quality of an image). To this end, the joint distributions of saccade amplitudes and orientations in both conditions (i.e. free-viewing and quality task) were estimated from eye-tracking data. Thanks to saccadic models, we hope to improve the performance of saliency-based quality metrics and, more generally, our capacity to predict where we look within visual scenes when performing visual tasks.
How do saccadic models help predict where we look during a visual task? Application to visual quality assessment.
Olivier Le Meur¹, Antoine Coutrot²
olemeur@irisa.fr
¹ IRISA - University of Rennes 1, France
² CoMPLEX, University College London, UK
February 17, 2016
Preamble
These slides are based on the following papers:
O. Le Meur & Z. Liu, Saccadic model of eye movements for free-viewing condition, Vision Research, 2015, doi:10.1016/j.visres.2014.12.026.
O. Le Meur & A. Coutrot, Introducing context-dependent and spatially-variant viewing biases in saccadic models, Vision Research, 2016.
Outline
1 Introduction
   Covert vs overt
   Bottom-Up vs Top-Down
2 Saliency models
3 Saccadic models
   Presentation
   Proposed model
4 Tuning saccadic model
   A flexible framework
   Example: the quality task
5 Conclusion
Introduction (1/2)
Visual attention
Posner proposed the following definition (Posner, 1980). Visual attention is used:
• to select important areas of our visual field (alerting);
• to search for a target in cluttered scenes (searching).
There are several kinds of visual attention:
• Overt visual attention: involving eye movements;
• Covert visual attention: without eye movements (covert fixations are not observable).
Introduction (2/2)
Bottom-Up vs Top-Down
• Bottom-Up: some things draw attention reflexively, in a task-independent way (involuntary; very quick; unconscious);
• Top-Down: some things draw volitional attention, in a task-dependent way (voluntary; very slow; conscious).
Saliency model (1/3)
Computational models of visual attention aim at predicting where we look within a scene.
• A saliency model computes a 2D static saliency map from an input image.
• Most state-of-the-art models simulate bottom-up overt visual attention.
Saliency model (2/3)
Taxonomy of models:
• Information-theoretic models;
• Cognitive models;
• Graphical models;
• Spectral-analysis models;
• Pattern-classification models;
• Bayesian models.
Extracted from (Borji and Itti, 2013).
Saliency model (3/3)
The picture is much clearer than 10 years ago!
New datasets, new metrics/benchmarks, new protocols, new methods, new devices...
BUT
Important aspects of our visual system are clearly overlooked:
• Current models implicitly assume that eyes are equally likely to move in any direction;
• Viewing biases are not taken into account;
• The temporal dimension is not considered (static saliency map).
Presentation (1/1)
• Eye movements are composed of fixations and saccades. A sequence of fixations is called a visual scanpath.
• When looking at visual scenes, we perform on average 4 visual fixations per second.
Saccadic models are used:
1 to compute plausible visual scanpaths (stochastic, saccade amplitudes/orientations...);
2 to infer the scanpath-based saliency map ⇔ to predict salient areas!
Saccadic model (1/6)
Let $I : \Omega \subset \mathbb{R}^2 \rightarrow \mathbb{R}^3$ be an image and $x_t$ a fixation point at time $t$.
We consider the 2D discrete conditional probability:
$p(x \mid x_{t-1}, \ldots, x_{t-T}) \propto p_{BU}(x)\, p_B(d, \phi)\, p_M(x, t)$
• $p_{BU} : \Omega \rightarrow [0, 1]$ is the grayscale saliency map;
• $p_B(d, \phi)$ represents the joint probability distribution of saccade amplitudes and orientations; $d$ is the saccade amplitude between two fixation points $x_t$ and $x_{t-1}$ (expressed in degrees of visual angle), and $\phi$ is the angle between these two points (expressed in degrees);
• $p_M(x, t)$ represents the memory state of the location $x$ at time $t$. This time-dependent term simulates the inhibition of return.
O. Le Meur & Z. Liu, Saccadic model of eye movements for free-viewing condition, Vision Research, 2015.
O. Le Meur & A. Coutrot, Introducing context-dependent and spatially-variant viewing biases in saccadic models, Vision Research, 2016.
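To make the product of the three terms concrete, here is a minimal Python sketch (not the authors' implementation) that evaluates this conditional probability over every pixel location. The array shapes, the pixel-based amplitude, and the p_b_lookup helper are illustrative assumptions.

```python
import numpy as np

def conditional_probability(p_bu, p_b_lookup, p_m, x_prev):
    """Combine the three terms of the saccadic model into one probability map.

    p_bu       : (H, W) bottom-up saliency map with values in [0, 1]
    p_b_lookup : callable (d, phi) -> viewing-bias probability; amplitudes
                 are in pixels here, whereas the paper works in degrees of
                 visual angle
    p_m        : (H, W) memory/inhibition-of-return map at the current time
    x_prev     : (row, col) coordinates of the previous fixation
    """
    h, w = p_bu.shape
    rows, cols = np.mgrid[0:h, 0:w]
    dy, dx = rows - x_prev[0], cols - x_prev[1]
    d = np.hypot(dx, dy)                  # saccade amplitude to every location
    phi = np.arctan2(dy, dx)              # saccade orientation to every location
    p = p_bu * p_b_lookup(d, phi) * p_m   # elementwise product of the 3 terms
    return p / p.sum()                    # normalise to a proper distribution
```

The next slides detail how each of the three terms is obtained.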
Saccadic model (2/6)
Bottom-up saliency map
$p(x \mid x_{t-1}, \ldots, x_{t-T}) \propto p_{BU}(x)\, p_B(d, \phi)\, p_M(x, t)$
• $p_{BU}$ is the bottom-up saliency map.
  – It is obtained by saliency aggregation (Le Meur and Liu, 2014) of the GBVS model (Harel et al., 2006) and the RARE2012 model (Riche et al., 2013). These models present a good trade-off between quality and complexity.
  – $p_{BU}(x)$ is constant over time: (Tatler et al., 2005) indeed demonstrated that bottom-up influences do not vanish over time.
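As a rough illustration of the aggregation step, the sketch below normalises and averages precomputed saliency maps; simple averaging is only a stand-in for the more elaborate scheme of (Le Meur and Liu, 2014).

```python
import numpy as np

def aggregate_saliency(maps):
    # `maps` is a list of precomputed saliency maps (e.g. GBVS and RARE2012
    # outputs) with identical shapes. Each is rescaled to [0, 1] and the
    # results are averaged; the epsilon guards against flat maps.
    norm = [(m - m.min()) / (m.max() - m.min() + 1e-12) for m in maps]
    return np.mean(norm, axis=0)
```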
Saccadic model (3/6)
Memory effect and inhibition of return (IoR)
$p(x \mid x_{t-1}, \ldots, x_{t-T}) \propto p_{BU}(x)\, p_B(d, \phi)\, p_M(x, t)$
• $p_M(x, t)$ represents the memory effect and IoR of the location $x$ at time $t$. It is composed of two terms: inhibition and recovery.
  – The spatial IoR effect declines as a Gaussian function $\Phi_{\sigma_i}(d)$ of the Euclidean distance $d$ from the attended location (Bennett and Pratt, 2001);
  – The temporal decline of the IoR effect is simulated by a simple linear model.
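A minimal sketch of how such a memory map could be updated after each new fixation; sigma_i and recovery_rate are illustrative free parameters, not the values used in the paper.

```python
import numpy as np

def update_memory_map(p_m, fixation, sigma_i=2.0, recovery_rate=0.05):
    # Inhibit locations around the new fixation with a Gaussian spatial
    # profile (Bennett and Pratt, 2001), then let inhibited locations
    # recover linearly over time, clamped to the [0, 1] range.
    h, w = p_m.shape
    rows, cols = np.mgrid[0:h, 0:w]
    d = np.hypot(rows - fixation[0], cols - fixation[1])
    inhibition = np.exp(-d**2 / (2.0 * sigma_i**2))  # Gaussian decline with distance
    p_m = np.minimum(p_m, 1.0 - inhibition)          # inhibit around the fixation
    return np.minimum(p_m + recovery_rate, 1.0)      # simple linear recovery
```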
Saccadic model (4/6)
Viewing biases
$p(x \mid x_{t-1}, \ldots, x_{t-T}) \propto p_{BU}(x)\, p_B(d, \phi)\, p_M(x, t)$
• $p_B(d, \phi)$ represents the joint probability distribution of saccade amplitudes and orientations; $d$ and $\phi$ represent the distance and the angle between each pair of successive fixations, respectively.
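Such a joint distribution can be estimated from recorded scanpaths as a normalised 2D histogram. A sketch, assuming fixations come as (x, y) coordinates already expressed in degrees of visual angle:

```python
import numpy as np

def estimate_saccade_distribution(scanpaths, n_amp_bins=40, n_ang_bins=36):
    # Collect the amplitude d and orientation phi of every saccade, i.e.
    # of every pair of successive fixations, across all scanpaths.
    amps, angles = [], []
    for scanpath in scanpaths:
        diffs = np.diff(np.asarray(scanpath, dtype=float), axis=0)
        amps.append(np.hypot(diffs[:, 0], diffs[:, 1]))
        angles.append(np.arctan2(diffs[:, 1], diffs[:, 0]))
    amps, angles = np.concatenate(amps), np.concatenate(angles)
    # Normalised 2D histogram over (amplitude, orientation); the bin
    # counts are illustrative choices.
    hist, amp_edges, ang_edges = np.histogram2d(
        amps, angles, bins=[n_amp_bins, n_ang_bins],
        range=[[0.0, amps.max()], [-np.pi, np.pi]])
    return hist / hist.sum(), amp_edges, ang_edges
```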
Proposed model (5/6)
Selecting the next fixation point
$p(x \mid x_{t-1}, \ldots, x_{t-T}) \propto p_{BU}(x)\, p_B(d, \phi)\, p_M(x, t)$
• Stochastic behavior:
We generate $N_c = 5$ random locations according to the 2D discrete conditional probability $p(x \mid x_{t-1}, \ldots, x_{t-T})$.
The location with the highest saliency gain is chosen as the next fixation point $x^*_t$.
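A sketch of this stochastic selection rule, using the bottom-up saliency map as a stand-in for the "saliency gain" (an assumption; see the papers for the exact criterion):

```python
import numpy as np

def select_next_fixation(p, p_bu, n_candidates=5, rng=None):
    # Draw Nc candidate locations from the conditional probability map p,
    # then keep the candidate with the highest bottom-up saliency.
    rng = rng or np.random.default_rng()
    h, w = p.shape
    idx = rng.choice(h * w, size=n_candidates, p=p.ravel() / p.sum())
    best = idx[np.argmax(p_bu.ravel()[idx])]
    return np.unravel_index(best, (h, w))  # (row, col) of the next fixation
```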
Proposed model (6/6)
Scanpaths and saliency maps
• Scanpaths composed of 10 fixations: [figure]
• Scanpath-based saliency maps: [figure] (a) original image; (b) human saliency map; (c) GBVS saliency map; (d) GBVS-SM saliency maps computed from the simulated scanpaths.
See papers for more details
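A common convention for turning simulated scanpaths into a scanpath-based saliency map is to accumulate an isotropic Gaussian at every fixation; a sketch of that convention (the width sigma is an assumed value, not necessarily the papers'):

```python
import numpy as np

def scanpath_saliency_map(scanpaths, shape, sigma=15.0):
    # Sum a 2D Gaussian centred on each (row, col) fixation of each
    # simulated scanpath, then rescale the map to [0, 1].
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    sal = np.zeros(shape)
    for scanpath in scanpaths:
        for r, c in scanpath:
            sal += np.exp(-((rows - r)**2 + (cols - c)**2) / (2 * sigma**2))
    return sal / sal.max()
```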
Tuning saccadic model (1/1)
The proposed framework allows us to adapt the saccadic model to different conditions (thanks to the joint distribution $p_B(d, \phi)$):
• for specific visual scenes: scene-dependent and spatially-variant distributions;
• for specific categories of observers, e.g. young or elderly;
• for a given task... (see the sketch below).
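A sketch of what this tuning amounts to: the model itself is untouched, and only the joint distribution fed into it changes with the condition. The dictionary keys are illustrative.

```python
def tuned_viewing_bias(distributions, condition):
    # `distributions` maps condition names (e.g. "free-viewing",
    # "quality-task") to 2D amplitude/orientation histograms such as
    # those produced by the estimation sketch shown earlier.
    p_b = distributions[condition]
    return p_b / p_b.sum()  # ensure a proper joint probability distribution
```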
Tuning saccadic model for a quality task (1/3)
Our goal is to reproduce the visual behavior of observers while they are performing a quality task.
We re-analyse the eye-tracking data of (Ninassi et al., 2007):
• FT: eye movements were recorded while observers freely looked at unimpaired images;
• Ref. QT: eye movements were recorded while observers looked at the unimpaired images during the quality task;
• Deg. QT: eye movements were recorded while observers looked at the degraded images during the quality task.
The Double Stimulus Impairment Scale was used to perform the subjective quality evaluation.
Tuning saccadic model for a quality task (2/3)
[Figure: joint distributions of saccade amplitudes and orientations. From left to right: FT, Ref. QT and Deg. QT.]
• The joint distribution of the Deg. QT condition is less focussed than the other two ⇒ higher visual exploration.
• Observers performed smaller saccades in the Ref. QT condition than in the FT and Deg. QT conditions ⇒ memorization.
Tuning saccadic model for a quality task (3/3)
• Preliminary results: [figure]
Conclusion (1/1)
The key take-home messages:
• A new saccadic model that performs well to
  – produce plausible scanpaths;
  – detect the most salient regions of visual scenes.
• A flexible approach!
Future work:
• Dealing with the limitations of the current implementation.
The presentation is available on my SlideShare account: http://fr.slideshare.net/OlivierLeMeur.
References
P. J. Bennett and J. Pratt. The spatial distribution of inhibition of return. Psychological Science, 12:76–80, 2001.
A. Borji and L. Itti. State-of-the-art in visual attention modeling. IEEE Trans. on Pattern Analysis and Machine Intelligence, 35:185–207, 2013.
J. Harel, C. Koch, and P. Perona. Graph-based visual saliency. In Proceedings of Neural Information Processing Systems (NIPS), 2006.
O. Le Meur and Z. Liu. Saliency aggregation: Does unity make strength? In ACCV, 2014.
A. Ninassi, O. Le Meur, P. Le Callet, and D. Barba. Does where you gaze on an image affect your perception of quality? Applying visual attention to image quality metric. In ICIP, 2007.
M. I. Posner. Orienting of attention. Quarterly Journal of Experimental Psychology, 32:3–25, 1980.
N. Riche, M. Mancas, M. Duvinage, M. Mibulumukini, B. Gosselin, and T. Dutoit. RARE2012: A multi-scale rarity-based saliency detection with its comparative statistical analysis. Signal Processing: Image Communication, 28(6):642–658, 2013. doi:10.1016/j.image.2013.03.009.