SlideShare une entreprise Scribd logo
1  sur  44
Télécharger pour lire hors ligne
Paper Reviews in
Visual Attention
1
2018.3.29
SNU DATAMINING CENTER
MINKI CHUNG
WHO AM I 2
▸ Chung Minki
▸ BS, KAIST, IE, 2016
▸ MS, SNU, IE, 2018..?!
▸ Vision Projects
▸ Working on Semantic Image Inpainting
WHAT IS VISUAL ATTENTION 3
▸ Attention is HOT nowadays
▸ http://openaccess.thecvf.com/CVPR2017_search.py
▸ http://search.iclr2018.smerity.com/search/?query=attention
WHAT IS VISUAL ATTENTION 4
▸ Maybe heard of
▸ "Neural Machine Translation by Jointly Learning to Align and Translate"
▸ "Show, Attend, and Tell: Neural Image Caption"
Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio, 2015, ICLR. "Neural Machine Translation by Jointly Learning to Align and Translate"
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, 2015, ICML.
"Show, Attend, and Tell: Neural Image Caption Generation with Visual Attention"
WHAT IS VISUAL ATTENTION 5
▸ More,
Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention"
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, NIPS, 2014. "Spatial Transformer Network"
Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-
grained Image Recognition"
Siavash Gorji, James J. Clark, 2017, CVPR. "Attentional Push: A Deep Convolutional Network for Augmenting Image Salience
with Shared Attention Modeling in Social Scenes"
WHAT IS VISUAL ATTENTION 6
▸ Visual Attention:
▸ Attend on certain part of image to solve a task more efficiently
▸ Deep learning, the black box model → Interpretability
TABLE OF CONTENTS 7
▸ Early Works
▸ Recurrent Attention Model (RAM)
▸ Spatial Transformer Network (STN)
▸ Recent Works of visual attention
▸ in ICLR
▸ in CVPR
PREREQUISITE 8
▸ CNN, Transpose Convolution(or Deconvolution), Dilated Convolution
▸ RNN
▸ MLP
▸ GAN
https://towardsdatascience.com/types-of-convolutions-in-deep-learning-717013397f4d
EARLY WORKS
:RAM, STN
9
RECURRENT ATTENTION MODEL 10
▸ Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu, 2014, NIPS.
"Recurrent Models of Visual Attention"
▸ Google DeepMind, 563 citations
▸ Motivation: Confronted by large image, human process image sequentially,
selecting where and what to look
▸ Tackle ConvNet limitation: poor scalability with increasing input image size
RECURRENT ATTENTION MODEL 11
▸ Multiple Object Recognition with Visual Attention (DRAM), 2015, ICLR
▸ Refined architecture version of RAM
▸ RNN Structure with multi-resolution crop, called glimpse
▸ Architecture:
Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention"
RECURRENT ATTENTION MODEL 12
▸ Architecture:
Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention"
WHERE TO SEE
WHAT TO SEE
provide initial state
locate glimpse
outputs the inputs for rnn(1)
for multiple objects
RECURRENT ATTENTION MODEL 13
▸ Demo
▸ Single object classification
https://github.com/kevinzakka/recurrent-visual-attention
RECURRENT ATTENTION MODEL 14
▸ Training:
▸ maximize
Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention"
LOWERBOUND F
multiple object case
RECURRENT ATTENTION MODEL 15
▸ Cont'd:
Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention"
REINFORCE
RECURRENT ATTENTION MODEL 16
▸ Experiments & Results
▸ MNIST, SVHN
Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention"
SPATIAL TRANSFORMER NETWORK 17
▸ Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014
NIPS. "Spatial Transformer Network"
▸ Google DeepMind, 624 citations
▸ Motivation: Human process distorted objects by un-distorting it
▸ ConvNet is not actually invariant to large transformation(only realised over a
deep hierarchy of max-pooling)
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network"
https://kevinzakka.github.io/2017/01/18/stn-part2/
SPATIAL TRANSFORMER NETWORK 18
▸ Architecture:
▸ three parts: localisation net, sampling grid, sampler
▸ Assume 𝛵𝜃 is 2D affine transformation A𝜃,
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network"
regression
H,W,C H',W',C
SPATIAL TRANSFORMER NETWORK 19
▸ 𝛵𝜃, for attention becomes:
▸ Allowing cropping, translation, isotropic scaling
▸ In case if a bilinear sampling kernel,
▸ Differentiable, Modular,
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network"
SPATIAL TRANSFORMER NETWORK 20
▸ Experiments and Results
▸ MNIST
▸ SVHN
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network"
SPATIAL TRANSFORMER NETWORK 21
▸ Experiments and Results
▸ Fine-grained classification (CUB-200-211 bird classification dataset)
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network"
SPATIAL TRANSFORMER NETWORK 22
▸ Already implemented in Tensorlayer
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network"
RECURRENT ATTENTIONAL NETWORKS FOR SALIENCY DETECTION 23
▸ Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional
Networks for Saliency Detection"
▸ RAM(Glimpse system) + STN(Differentiability) for Saliency Detection
Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional Networks for Saliency Detection"
RECURRENT ATTENTIONAL NETWORKS FOR SALIENCY DETECTION 24
▸ Recurrent Attentional Convolutional-Deconvolutional Network (RACDNN)
▸ Architecture
Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional Networks for Saliency Detection"
RECURRENT ATTENTIONAL NETWORKS FOR SALIENCY DETECTION 25
▸ Experiments & Results
Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional Networks for Saliency Detection"
RECENT WORKS
:ICLR, CVPR
26
GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION 27
▸ Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR.
"Generative Image Inpainting with Contextual Attention"
▸ Adobe Research
Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention
GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION 28
▸ Architecture
▸ Two-stage(coarse to fine)
▸ Global and Local W-GANS
▸ Spatially discounted reconstruction loss(𝑙1): 𝛾
Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention
USE W-GAN
attention
𝑙
GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION 29
▸ Attention
Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention
fx,y
bx,y
Calculate cosine similarity:
GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION 30
▸ Experiments & Results
Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention
LEARN TO PAY ATTENTION 31
▸ Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn
to Pay Attention"
▸ Very simple
Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention"
LEARN TO PAY ATTENTION 32
▸ Architecture
Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention"
Attention
Compatibility
function(dot
product)
LEARN TO PAY ATTENTION 33
▸ Experiments & Results
▸ Image classification and fine-grained recognition
Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention"
LEARN TO PAY ATTENTION 34
▸ Experiments & Results
▸ Weakly supervised semantic segmentation
Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention"
LOOK CLOSER TO SEE BETTER 35
▸ Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better:
Recurrent Attention Convolutional Neural Network for Fine-grained Image
Recognition"
▸ Fine-grained image recognition:
▸ Discriminative region localization + fine-grained feature learning
Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-
grained Image Recognition"
LOOK CLOSER TO SEE BETTER 36
▸ Recurrent Attention Convolutional Neural Network (RA-CNN)
▸ Multi-scale networks: classification sub-network, attention proposal sub-
network(APN)
▸ Finer-scale network (coarse to fine)
▸ Intra-scale softmax loss for classification, inter-scale pairwise ranking loss for
APN
Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-
grained Image Recognition"
LOOK CLOSER TO SEE BETTER 37
▸ RA-CNN architecture:
Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-
grained Image Recognition"
bilinear
interpolation
to amplify
LOOK CLOSER TO SEE BETTER 38
▸ Training:
▸ Multi-task loss:
Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-
grained Image Recognition"
forces
LOOK CLOSER TO SEE BETTER 39
▸ Experiments & Results
▸ CUB-200-211 Bird Dataset
Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-
grained Image Recognition"
LOOK CLOSER TO SEE BETTER 40
▸ Experiments & Results
▸ Stanford Dogs, Stanford Cars
Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-
grained Image Recognition"
SUMMARY 41
▸ Attention for efficiency, better performance, interpretability
▸ Many types of Attention:
▸ RAM
▸ STN
▸ RAM+STN
▸ Others
ANY Q?
42
REFERERNCE 43
▸ Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio, 2015, ICLR. "Neural Machine Translation by Jointly
Learning to Align and Translate"
▸ Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard
Zemel, Yoshua Bengio, 2015, ICML. "Show, Attend, and Tell: Neural Image Caption Generation with Visual
Attention"
▸ Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu, 2014, NIPS. "Recurrent Models of Visual
Attention"
▸ Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual
Attention"
▸ Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014 NIPS. "Spatial Transformer
Network"
▸ Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional Networks for Saliency Detection"
▸ Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image
Inpainting with Contextual Attention"
▸ Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention"
▸ Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention
Convolutional Neural Network for Fine-grained Image Recognition"
END OF
DOCUMENT
44

Contenu connexe

Similaire à Paper Reviews on Visual Attention

Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Universitat Politècnica de Catalunya
 
(Research Note) Delving deeper into convolutional neural networks for camera ...
(Research Note) Delving deeper into convolutional neural networks for camera ...(Research Note) Delving deeper into convolutional neural networks for camera ...
(Research Note) Delving deeper into convolutional neural networks for camera ...Jacky Liu
 
Cs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingCs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingYanbin Kong
 
Modeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networksModeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networksNAVER Engineering
 
What Would Shannon Do?
What Would Shannon Do?What Would Shannon Do?
What Would Shannon Do?Karen Ullrich
 
ICCES 2017 - Crowd Density Estimation Method using Regression Analysis
ICCES 2017 - Crowd Density Estimation Method using Regression AnalysisICCES 2017 - Crowd Density Estimation Method using Regression Analysis
ICCES 2017 - Crowd Density Estimation Method using Regression AnalysisAhmed Gad
 
Towards better analysis of deep convolutional neural networks
Towards better analysis of deep convolutional neural networksTowards better analysis of deep convolutional neural networks
Towards better analysis of deep convolutional neural networks曾 子芸
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesFellowship at Vodafone FutureLab
 
DLD_WeightSharing_Slide
DLD_WeightSharing_SlideDLD_WeightSharing_Slide
DLD_WeightSharing_SlideKang-Ho Lee
 
Supervised Learning of Sparsity-Promoting Regularizers for Denoising
Supervised Learning of Sparsity-Promoting Regularizers for DenoisingSupervised Learning of Sparsity-Promoting Regularizers for Denoising
Supervised Learning of Sparsity-Promoting Regularizers for DenoisingMike McCann
 
capsule network
capsule networkcapsule network
capsule network민기 정
 
Deep Neural Networks 
that talk (Back)… with style
Deep Neural Networks 
that talk (Back)… with styleDeep Neural Networks 
that talk (Back)… with style
Deep Neural Networks 
that talk (Back)… with styleRoelof Pieters
 
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Wanjin Yu
 
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...Electronic Arts / DICE
 
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-ResolutionTaegyun Jeon
 
Intermediate inception network for person re-identification
Intermediate inception network for person re-identificationIntermediate inception network for person re-identification
Intermediate inception network for person re-identificationHuan-Cheng Hsu
 

Similaire à Paper Reviews on Visual Attention (20)

Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
 
(Research Note) Delving deeper into convolutional neural networks for camera ...
(Research Note) Delving deeper into convolutional neural networks for camera ...(Research Note) Delving deeper into convolutional neural networks for camera ...
(Research Note) Delving deeper into convolutional neural networks for camera ...
 
Cs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingCs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and Understanding
 
Modeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networksModeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networks
 
One Perceptron to Rule Them All: Language and Vision
One Perceptron to Rule Them All: Language and VisionOne Perceptron to Rule Them All: Language and Vision
One Perceptron to Rule Them All: Language and Vision
 
What Would Shannon Do?
What Would Shannon Do?What Would Shannon Do?
What Would Shannon Do?
 
Learning where to look: focus and attention in deep vision
Learning where to look: focus and attention in deep visionLearning where to look: focus and attention in deep vision
Learning where to look: focus and attention in deep vision
 
ICCES 2017 - Crowd Density Estimation Method using Regression Analysis
ICCES 2017 - Crowd Density Estimation Method using Regression AnalysisICCES 2017 - Crowd Density Estimation Method using Regression Analysis
ICCES 2017 - Crowd Density Estimation Method using Regression Analysis
 
Towards better analysis of deep convolutional neural networks
Towards better analysis of deep convolutional neural networksTowards better analysis of deep convolutional neural networks
Towards better analysis of deep convolutional neural networks
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network Approaches
 
DLD_WeightSharing_Slide
DLD_WeightSharing_SlideDLD_WeightSharing_Slide
DLD_WeightSharing_Slide
 
Supervised Learning of Sparsity-Promoting Regularizers for Denoising
Supervised Learning of Sparsity-Promoting Regularizers for DenoisingSupervised Learning of Sparsity-Promoting Regularizers for Denoising
Supervised Learning of Sparsity-Promoting Regularizers for Denoising
 
capsule network
capsule networkcapsule network
capsule network
 
Deep Neural Networks 
that talk (Back)… with style
Deep Neural Networks 
that talk (Back)… with styleDeep Neural Networks 
that talk (Back)… with style
Deep Neural Networks 
that talk (Back)… with style
 
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
 
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
 
Trip Report Seattle
Trip Report SeattleTrip Report Seattle
Trip Report Seattle
 
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...
 
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
 
Intermediate inception network for person re-identification
Intermediate inception network for person re-identificationIntermediate inception network for person re-identification
Intermediate inception network for person re-identification
 

Dernier

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 

Dernier (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

Paper Reviews on Visual Attention

  • 1. Paper Reviews in Visual Attention 1 2018.3.29 SNU DATAMINING CENTER MINKI CHUNG
  • 2. WHO AM I 2 ▸ Chung Minki ▸ BS, KAIST, IE, 2016 ▸ MS, SNU, IE, 2018..?! ▸ Vision Projects ▸ Working on Semantic Image Inpainting
  • 3. WHAT IS VISUAL ATTENTION 3 ▸ Attention is HOT nowadays ▸ http://openaccess.thecvf.com/CVPR2017_search.py ▸ http://search.iclr2018.smerity.com/search/?query=attention
  • 4. WHAT IS VISUAL ATTENTION 4 ▸ Maybe heard of ▸ "Neural Machine Translation by Jointly Learning to Align and Translate" ▸ "Show, Attend, and Tell: Neural Image Caption" Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio, 2015, ICLR. "Neural Machine Translation by Jointly Learning to Align and Translate" Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, 2015, ICML. "Show, Attend, and Tell: Neural Image Caption Generation with Visual Attention"
  • 5. WHAT IS VISUAL ATTENTION 5 ▸ More, Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention" Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, NIPS, 2014. "Spatial Transformer Network" Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine- grained Image Recognition" Siavash Gorji, James J. Clark, 2017, CVPR. "Attentional Push: A Deep Convolutional Network for Augmenting Image Salience with Shared Attention Modeling in Social Scenes"
  • 6. WHAT IS VISUAL ATTENTION 6 ▸ Visual Attention: ▸ Attend on certain part of image to solve a task more efficiently ▸ Deep learning, the black box model → Interpretability
  • 7. TABLE OF CONTENTS 7 ▸ Early Works ▸ Recurrent Attention Model (RAM) ▸ Spatial Transformer Network (STN) ▸ Recent Works of visual attention ▸ in ICLR ▸ in CVPR
  • 8. PREREQUISITE 8 ▸ CNN, Transpose Convolution(or Deconvolution), Dilated Convolution ▸ RNN ▸ MLP ▸ GAN https://towardsdatascience.com/types-of-convolutions-in-deep-learning-717013397f4d
  • 10. RECURRENT ATTENTION MODEL 10 ▸ Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu, 2014, NIPS. "Recurrent Models of Visual Attention" ▸ Google DeepMind, 563 citations ▸ Motivation: Confronted by large image, human process image sequentially, selecting where and what to look ▸ Tackle ConvNet limitation: poor scalability with increasing input image size
  • 11. RECURRENT ATTENTION MODEL 11 ▸ Multiple Object Recognition with Visual Attention (DRAM), 2015, ICLR ▸ Refined architecture version of RAM ▸ RNN Structure with multi-resolution crop, called glimpse ▸ Architecture: Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention"
  • 12. RECURRENT ATTENTION MODEL 12 ▸ Architecture: Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention" WHERE TO SEE WHAT TO SEE provide initial state locate glimpse outputs the inputs for rnn(1) for multiple objects
  • 13. RECURRENT ATTENTION MODEL 13 ▸ Demo ▸ Single object classification https://github.com/kevinzakka/recurrent-visual-attention
  • 14. RECURRENT ATTENTION MODEL 14 ▸ Training: ▸ maximize Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention" LOWERBOUND F multiple object case
  • 15. RECURRENT ATTENTION MODEL 15 ▸ Cont'd: Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention" REINFORCE
  • 16. RECURRENT ATTENTION MODEL 16 ▸ Experiments & Results ▸ MNIST, SVHN Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention"
  • 17. SPATIAL TRANSFORMER NETWORK 17 ▸ Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014 NIPS. "Spatial Transformer Network" ▸ Google DeepMind, 624 citations ▸ Motivation: Human process distorted objects by un-distorting it ▸ ConvNet is not actually invariant to large transformation(only realised over a deep hierarchy of max-pooling) Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network" https://kevinzakka.github.io/2017/01/18/stn-part2/
  • 18. SPATIAL TRANSFORMER NETWORK 18 ▸ Architecture: ▸ three parts: localisation net, sampling grid, sampler ▸ Assume 𝛵𝜃 is 2D affine transformation A𝜃, Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network" regression H,W,C H',W',C
  • 19. SPATIAL TRANSFORMER NETWORK 19 ▸ 𝛵𝜃, for attention becomes: ▸ Allowing cropping, translation, isotropic scaling ▸ In case if a bilinear sampling kernel, ▸ Differentiable, Modular, Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network"
  • 20. SPATIAL TRANSFORMER NETWORK 20 ▸ Experiments and Results ▸ MNIST ▸ SVHN Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network"
  • 21. SPATIAL TRANSFORMER NETWORK 21 ▸ Experiments and Results ▸ Fine-grained classification (CUB-200-211 bird classification dataset) Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network"
  • 22. SPATIAL TRANSFORMER NETWORK 22 ▸ Already implemented in Tensorlayer Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014, NIPS. "Spatial Transformer Network"
  • 23. RECURRENT ATTENTIONAL NETWORKS FOR SALIENCY DETECTION 23 ▸ Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional Networks for Saliency Detection" ▸ RAM(Glimpse system) + STN(Differentiability) for Saliency Detection Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional Networks for Saliency Detection"
  • 24. RECURRENT ATTENTIONAL NETWORKS FOR SALIENCY DETECTION 24 ▸ Recurrent Attentional Convolutional-Deconvolutional Network (RACDNN) ▸ Architecture Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional Networks for Saliency Detection"
  • 25. RECURRENT ATTENTIONAL NETWORKS FOR SALIENCY DETECTION 25 ▸ Experiments & Results Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional Networks for Saliency Detection"
  • 27. GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION 27 ▸ Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention" ▸ Adobe Research Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention
  • 28. GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION 28 ▸ Architecture ▸ Two-stage(coarse to fine) ▸ Global and Local W-GANS ▸ Spatially discounted reconstruction loss(𝑙1): 𝛾 Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention USE W-GAN attention 𝑙
  • 29. GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION 29 ▸ Attention Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention fx,y bx,y Calculate cosine similarity:
  • 30. GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION 30 ▸ Experiments & Results Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention
  • 31. LEARN TO PAY ATTENTION 31 ▸ Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention" ▸ Very simple Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention"
  • 32. LEARN TO PAY ATTENTION 32 ▸ Architecture Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention" Attention Compatibility function(dot product)
  • 33. LEARN TO PAY ATTENTION 33 ▸ Experiments & Results ▸ Image classification and fine-grained recognition Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention"
  • 34. LEARN TO PAY ATTENTION 34 ▸ Experiments & Results ▸ Weakly supervised semantic segmentation Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention"
  • 35. LOOK CLOSER TO SEE BETTER 35 ▸ Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition" ▸ Fine-grained image recognition: ▸ Discriminative region localization + fine-grained feature learning Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine- grained Image Recognition"
  • 36. LOOK CLOSER TO SEE BETTER 36 ▸ Recurrent Attention Convolutional Neural Network (RA-CNN) ▸ Multi-scale networks: classification sub-network, attention proposal sub- network(APN) ▸ Finer-scale network (coarse to fine) ▸ Intra-scale softmax loss for classification, inter-scale pairwise ranking loss for APN Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine- grained Image Recognition"
  • 37. LOOK CLOSER TO SEE BETTER 37 ▸ RA-CNN architecture: Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine- grained Image Recognition" bilinear interpolation to amplify
  • 38. LOOK CLOSER TO SEE BETTER 38 ▸ Training: ▸ Multi-task loss: Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine- grained Image Recognition" forces
  • 39. LOOK CLOSER TO SEE BETTER 39 ▸ Experiments & Results ▸ CUB-200-211 Bird Dataset Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine- grained Image Recognition"
  • 40. LOOK CLOSER TO SEE BETTER 40 ▸ Experiments & Results ▸ Stanford Dogs, Stanford Cars Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine- grained Image Recognition"
  • 41. SUMMARY 41 ▸ Attention for efficiency, better performance, interpretability ▸ Many types of Attention: ▸ RAM ▸ STN ▸ RAM+STN ▸ Others
  • 43. REFERERNCE 43 ▸ Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio, 2015, ICLR. "Neural Machine Translation by Jointly Learning to Align and Translate" ▸ Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, 2015, ICML. "Show, Attend, and Tell: Neural Image Caption Generation with Visual Attention" ▸ Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu, 2014, NIPS. "Recurrent Models of Visual Attention" ▸ Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ILCR. "Multiple Object Recognition With Visual Attention" ▸ Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2014 NIPS. "Spatial Transformer Network" ▸ Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional Networks for Saliency Detection" ▸ Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention" ▸ Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention" ▸ Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition"