SlideShare une entreprise Scribd logo
1  sur  33
Télécharger pour lire hors ligne
Kevin McGuinness
kevin.mcguinness@dcu.ie
Assistant Professor
School of Electronic Engineering
Dublin City University
http://bit.ly/dlcv2018
#DLUPC
Saliency Prediction
Day 3 Lecture 5
1
The importance of visual attention
The importance of visual attention
The importance of visual attention
The importance of visual attention
Why don’t we see the changes?
We don’t really see the whole image
We only focus on small specific regions: the salient parts
Human beings reliably attend to the same regions of images
when shown
What we perceive
Where we look
What we actually see
Saliency prediction
Produce a computational model of visual attention: predict where humans will look.
Often want to map an image to a heatmap (saliency map).
12
Salient object detection?
Often confused with saliency prediction, but a different task.
Figure from: Progressive Attention Guided Recurrent Network for Salient Object Detection, CVPR 2018 13
Datasets
14
MIT 300
300 natural indoor and
outdoor scenes.
39 observers. 3 sec free view.
ETL 400 ISCAN eye tracker
Test set only: no training data
or public ground truth
http://saliency.mit.edu/results
_mit300.html
A Benchmark of Computational Models of Saliency to Predict Human Fixations [MIT tech report 2012] 15
Fixations and saliency maps
Raw eye tracker data needs to be processed to produce saliency maps
Eye tracker
Fixation
detection
Saliency map
generation
Raw gaze
location-time tracks
Eye fixations
Detect saccades using
distance/velocity thresholding,
clustering
Rendering, Gaussian blur,
normalizing
16
MIT 1003
1003 natural indoor and outdoor
scenes.
15 observers. 3 sec free view.
ETL 400 ISCAN eye tracker
Training dataset for MIT 300
Learning to Predict where Humans Look [ICCV 2009] 17
iSUN
Large scale dataset of natural
scenes
20,608 images with avg. 3
observers each
Collected using webcams and
Amazon Mechanical Turk
Used in LSUN challenge
2015/2016
Xu et al. TurkerGaze: Crowdsourcing Saliency with Webcam based Eye Tracking arXiv 2015. 18
SALICON
Another large scaled dataset of
images from MS COCO dataset
10K train, 5K val, 5K test
Simulated crowdsourced
attention using mouse
movements and simulated
artificial foveation.
Jiang et al. SALICON: Saliency in Context, CVPR 2015
19
Models
20
SalNet: deep visual saliency model
Predict map of visual attention from image pixels
(find the parts of the image that stand out)
● Feedforward 8 layer “fully convolutional”
architecture
● Transfer learning in bottom 3 layers from
pretrained VGG-M model on ImageNet
● Trained on SALICON dataset
Predicted Ground truth
Pan, McGuinness, et al. Shallow and Deep Convolutional Networks for Saliency Prediction, CVPR 2016 http://arxiv.org/abs/1603.00845
21
ImageGroundtruthPrediction
22
ImageGroundtruthPrediction
23
SalGAN
Adversarial loss
Data loss
Junting Pan, Cristian Canton, Kevin McGuinness, Noel E. O’Connor, Jordi Torres, Elisa Sayrol and Xavier Giro-i-Nieto. “SalGAN: Visual
Saliency Prediction with Generative Adversarial Networks.” arXiv. 2017.
24
SalNet and SalGAN benchmarks
25
Deep Gaze
Simple linear model trained on
activations of all conv layers
(upsampled) from AlexNet
Softmax output over full image,
categorical cross entropy.
L1
regularization used to
encourage sparsity.
Kümmerer et al. Deep gaze 1: Boosting saliency prediction with feature maps trained on imagenet. ICLR workshops 2015
26
MLNet
Cornia et al. A Deep Multi-Level Network for Saliency Prediction. ICPR 2016.
27
SALICON
Huang et al., SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks, ICCV 2015 28
DeepFix
Weights initialized from
VGG16 trained on ImageNet
Dilated convolutions
Location biased convolutions
Inception layers
Kruthiventi et al. DeepFix: A Fully Convolutional
Neural Network for predicting Human Eye Fixations
https://arxiv.org/abs/1510.02927
29
Deep Gaze II
Kümmerer, et al. Understanding low-and high-level contributions to fixation prediction. ICCV 2017
30
From image to video saliency?
Bak et al. Spatio-temporal saliency networks for dynamic saliency prediction. IEEE Trans. Multimedia 2017
SalNet
31
From image to video saliency
Gorji and Clark, Going From Image to Video Saliency: Augmenting Image Salience With Dynamic Attentional Push, CVPR 2018
32
Questions?
33

Contenu connexe

Tendances

Discrete And Continuous Simulation
Discrete And Continuous SimulationDiscrete And Continuous Simulation
Discrete And Continuous SimulationNguyen Chien
 
Chapter 13 Linear Factor Models
Chapter 13 Linear Factor ModelsChapter 13 Linear Factor Models
Chapter 13 Linear Factor ModelsKyeongUkJang
 
Webinar: Survival Analysis for Marketing Attribution - July 17, 2013
Webinar: Survival Analysis for Marketing Attribution - July 17, 2013Webinar: Survival Analysis for Marketing Attribution - July 17, 2013
Webinar: Survival Analysis for Marketing Attribution - July 17, 2013Revolution Analytics
 
PR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic ModelsPR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic ModelsHyeongmin Lee
 
An Empirical Study of Training Self-Supervised Vision Transformers.pptx
An Empirical Study of Training Self-Supervised Vision Transformers.pptxAn Empirical Study of Training Self-Supervised Vision Transformers.pptx
An Empirical Study of Training Self-Supervised Vision Transformers.pptxSangmin Woo
 
Dimensionality reduction with UMAP
Dimensionality reduction with UMAPDimensionality reduction with UMAP
Dimensionality reduction with UMAPJakub Bartczuk
 
Master's Thesis Presentation
Master's Thesis PresentationMaster's Thesis Presentation
Master's Thesis PresentationWajdi Khattel
 
[PR12] categorical reparameterization with gumbel softmax
[PR12] categorical reparameterization with gumbel softmax[PR12] categorical reparameterization with gumbel softmax
[PR12] categorical reparameterization with gumbel softmaxJaeJun Yoo
 
capsule network
capsule networkcapsule network
capsule network민기 정
 
Physics-Informed Machine Learning
Physics-Informed Machine LearningPhysics-Informed Machine Learning
Physics-Informed Machine LearningOmarYounis21
 
A review of net lift models
A review of net lift modelsA review of net lift models
A review of net lift modelsZixia Wang
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelYanbin Kong
 
Introduction to Genetic Algorithms
Introduction to Genetic AlgorithmsIntroduction to Genetic Algorithms
Introduction to Genetic AlgorithmsAhmed Othman
 
Explicit Density Models
Explicit Density ModelsExplicit Density Models
Explicit Density ModelsSangwoo Mo
 
Intro to Reinforcement learning - part III
Intro to Reinforcement learning - part IIIIntro to Reinforcement learning - part III
Intro to Reinforcement learning - part IIIMikko Mäkipää
 
Echo state networks and locomotion patterns
Echo state networks and locomotion patternsEcho state networks and locomotion patterns
Echo state networks and locomotion patternsVito Strano
 

Tendances (20)

Discrete And Continuous Simulation
Discrete And Continuous SimulationDiscrete And Continuous Simulation
Discrete And Continuous Simulation
 
Chapter 13 Linear Factor Models
Chapter 13 Linear Factor ModelsChapter 13 Linear Factor Models
Chapter 13 Linear Factor Models
 
Webinar: Survival Analysis for Marketing Attribution - July 17, 2013
Webinar: Survival Analysis for Marketing Attribution - July 17, 2013Webinar: Survival Analysis for Marketing Attribution - July 17, 2013
Webinar: Survival Analysis for Marketing Attribution - July 17, 2013
 
AlexNet
AlexNetAlexNet
AlexNet
 
PR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic ModelsPR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic Models
 
An Empirical Study of Training Self-Supervised Vision Transformers.pptx
An Empirical Study of Training Self-Supervised Vision Transformers.pptxAn Empirical Study of Training Self-Supervised Vision Transformers.pptx
An Empirical Study of Training Self-Supervised Vision Transformers.pptx
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
 
Dimensionality reduction with UMAP
Dimensionality reduction with UMAPDimensionality reduction with UMAP
Dimensionality reduction with UMAP
 
Master's Thesis Presentation
Master's Thesis PresentationMaster's Thesis Presentation
Master's Thesis Presentation
 
[PR12] categorical reparameterization with gumbel softmax
[PR12] categorical reparameterization with gumbel softmax[PR12] categorical reparameterization with gumbel softmax
[PR12] categorical reparameterization with gumbel softmax
 
Genetic algorithm
Genetic algorithm Genetic algorithm
Genetic algorithm
 
Deep Learning for Computer Vision: Attention Models (UPC 2016)
Deep Learning for Computer Vision: Attention Models (UPC 2016)Deep Learning for Computer Vision: Attention Models (UPC 2016)
Deep Learning for Computer Vision: Attention Models (UPC 2016)
 
capsule network
capsule networkcapsule network
capsule network
 
Physics-Informed Machine Learning
Physics-Informed Machine LearningPhysics-Informed Machine Learning
Physics-Informed Machine Learning
 
A review of net lift models
A review of net lift modelsA review of net lift models
A review of net lift models
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative Model
 
Introduction to Genetic Algorithms
Introduction to Genetic AlgorithmsIntroduction to Genetic Algorithms
Introduction to Genetic Algorithms
 
Explicit Density Models
Explicit Density ModelsExplicit Density Models
Explicit Density Models
 
Intro to Reinforcement learning - part III
Intro to Reinforcement learning - part IIIIntro to Reinforcement learning - part III
Intro to Reinforcement learning - part III
 
Echo state networks and locomotion patterns
Echo state networks and locomotion patternsEcho state networks and locomotion patterns
Echo state networks and locomotion patterns
 

Similaire à Visual Attention Prediction Using Saliency Models

最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - Hiroshi Fukui
 
Review of Pose Recognition Systems
Review of Pose Recognition SystemsReview of Pose Recognition Systems
Review of Pose Recognition Systemsvivatechijri
 
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958HPS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958HSaurabh Kumar
 
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...IRJET Journal
 
lec_11_self_supervised_learning.pdf
lec_11_self_supervised_learning.pdflec_11_self_supervised_learning.pdf
lec_11_self_supervised_learning.pdfAlamgirAkash3
 
Artificial Intelligence for Vision: A walkthrough of recent breakthroughs
Artificial Intelligence for Vision:  A walkthrough of recent breakthroughsArtificial Intelligence for Vision:  A walkthrough of recent breakthroughs
Artificial Intelligence for Vision: A walkthrough of recent breakthroughsNikolas Markou
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Representational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningRepresentational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningMLAI2
 
Satellite Image Classification with Deep Learning Survey
Satellite Image Classification with Deep Learning SurveySatellite Image Classification with Deep Learning Survey
Satellite Image Classification with Deep Learning Surveyijtsrd
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesFellowship at Vodafone FutureLab
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...ijtsrd
 
Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...Shakas Technologies
 
Nicolae_Denut_Theses
Nicolae_Denut_ThesesNicolae_Denut_Theses
Nicolae_Denut_ThesesNicolae Denut
 

Similaire à Visual Attention Prediction Using Saliency Models (20)

最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
 
The Importance of Time in Visual Attention Models
The Importance of Time in Visual Attention ModelsThe Importance of Time in Visual Attention Models
The Importance of Time in Visual Attention Models
 
End to-end convolutional network for saliency prediction
End to-end convolutional network for saliency predictionEnd to-end convolutional network for saliency prediction
End to-end convolutional network for saliency prediction
 
Review of Pose Recognition Systems
Review of Pose Recognition SystemsReview of Pose Recognition Systems
Review of Pose Recognition Systems
 
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958HPS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H
 
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
 
lec_11_self_supervised_learning.pdf
lec_11_self_supervised_learning.pdflec_11_self_supervised_learning.pdf
lec_11_self_supervised_learning.pdf
 
Artificial Intelligence for Vision: A walkthrough of recent breakthroughs
Artificial Intelligence for Vision:  A walkthrough of recent breakthroughsArtificial Intelligence for Vision:  A walkthrough of recent breakthroughs
Artificial Intelligence for Vision: A walkthrough of recent breakthroughs
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
 
Representational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningRepresentational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual Learning
 
paper
paperpaper
paper
 
Satellite Image Classification with Deep Learning Survey
Satellite Image Classification with Deep Learning SurveySatellite Image Classification with Deep Learning Survey
Satellite Image Classification with Deep Learning Survey
 
AR/SLAM for end-users
AR/SLAM for end-usersAR/SLAM for end-users
AR/SLAM for end-users
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network Approaches
 
Learning where to look: focus and attention in deep vision
Learning where to look: focus and attention in deep visionLearning where to look: focus and attention in deep vision
Learning where to look: focus and attention in deep vision
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)
Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)
Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)
 
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
 
Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...
 
Nicolae_Denut_Theses
Nicolae_Denut_ThesesNicolae_Denut_Theses
Nicolae_Denut_Theses
 

Plus de Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Universitat Politècnica de Catalunya
 

Plus de Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
 

Dernier

BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 

Dernier (20)

BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 

Visual Attention Prediction Using Saliency Models

  • 1. Kevin McGuinness kevin.mcguinness@dcu.ie Assistant Professor School of Electronic Engineering Dublin City University http://bit.ly/dlcv2018 #DLUPC Saliency Prediction Day 3 Lecture 5 1
  • 2. The importance of visual attention
  • 3.
  • 4. The importance of visual attention
  • 5. The importance of visual attention
  • 6.
  • 7. The importance of visual attention
  • 8. Why don’t we see the changes? We don’t really see the whole image We only focus on small specific regions: the salient parts Human beings reliably attend to the same regions of images when shown
  • 12. Saliency prediction Produce a computational model of visual attention: predict where humans will look. Often want to map an image to a heatmap (saliency map). 12
  • 13. Salient object detection? Often confused with saliency prediction, but a different task. Figure from: Progressive Attention Guided Recurrent Network for Salient Object Detection, CVPR 2018 13
  • 15. MIT 300 300 natural indoor and outdoor scenes. 39 observers. 3 sec free view. ETL 400 ISCAN eye tracker Test set only: no training data or public ground truth http://saliency.mit.edu/results _mit300.html A Benchmark of Computational Models of Saliency to Predict Human Fixations [MIT tech report 2012] 15
  • 16. Fixations and saliency maps Raw eye tracker data needs to be processed to produce saliency maps Eye tracker Fixation detection Saliency map generation Raw gaze location-time tracks Eye fixations Detect saccades using distance/velocity thresholding, clustering Rendering, Gaussian blur, normalizing 16
  • 17. MIT 1003 1003 natural indoor and outdoor scenes. 15 observers. 3 sec free view. ETL 400 ISCAN eye tracker Training dataset for MIT 300 Learning to Predict where Humans Look [ICCV 2009] 17
  • 18. iSUN Large scale dataset of natural scenes 20,608 images with avg. 3 observers each Collected using webcams and Amazon Mechanical Turk Used in LSUN challenge 2015/2016 Xu et al. TurkerGaze: Crowdsourcing Saliency with Webcam based Eye Tracking arXiv 2015. 18
  • 19. SALICON Another large scaled dataset of images from MS COCO dataset 10K train, 5K val, 5K test Simulated crowdsourced attention using mouse movements and simulated artificial foveation. Jiang et al. SALICON: Saliency in Context, CVPR 2015 19
  • 21. SalNet: deep visual saliency model Predict map of visual attention from image pixels (find the parts of the image that stand out) ● Feedforward 8 layer “fully convolutional” architecture ● Transfer learning in bottom 3 layers from pretrained VGG-M model on ImageNet ● Trained on SALICON dataset Predicted Ground truth Pan, McGuinness, et al. Shallow and Deep Convolutional Networks for Saliency Prediction, CVPR 2016 http://arxiv.org/abs/1603.00845 21
  • 24. SalGAN Adversarial loss Data loss Junting Pan, Cristian Canton, Kevin McGuinness, Noel E. O’Connor, Jordi Torres, Elisa Sayrol and Xavier Giro-i-Nieto. “SalGAN: Visual Saliency Prediction with Generative Adversarial Networks.” arXiv. 2017. 24
  • 25. SalNet and SalGAN benchmarks 25
  • 26. Deep Gaze Simple linear model trained on activations of all conv layers (upsampled) from AlexNet Softmax output over full image, categorical cross entropy. L1 regularization used to encourage sparsity. Kümmerer et al. Deep gaze 1: Boosting saliency prediction with feature maps trained on imagenet. ICLR workshops 2015 26
  • 27. MLNet Cornia et al. A Deep Multi-Level Network for Saliency Prediction. ICPR 2016. 27
  • 28. SALICON Huang et al., SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks, ICCV 2015 28
  • 29. DeepFix Weights initialized from VGG16 trained on ImageNet Dilated convolutions Location biased convolutions Inception layers Kruthiventi et al. DeepFix: A Fully Convolutional Neural Network for predicting Human Eye Fixations https://arxiv.org/abs/1510.02927 29
  • 30. Deep Gaze II Kümmerer, et al. Understanding low-and high-level contributions to fixation prediction. ICCV 2017 30
  • 31. From image to video saliency? Bak et al. Spatio-temporal saliency networks for dynamic saliency prediction. IEEE Trans. Multimedia 2017 SalNet 31
  • 32. From image to video saliency Gorji and Clark, Going From Image to Video Saliency: Augmenting Image Salience With Dynamic Attentional Push, CVPR 2018 32