SlideShare une entreprise Scribd logo
1  sur  54
Télécharger pour lire hors ligne
Class-Weighted Convolutional
Features for Image Retrieval
Xavier Giró-i-NietoAlbert Jiménez Jose Alvarez [Code][arXiv]
Image Retrieval
Google Search By Image
2
Image Retrieval
Given an image query, generate a ranked list of similar images
Task Definition
3
Image Retrieval
Given an image query, generate a ranked list of similar images
Task Definition
4
What are the properties that we want for these systems?
Image Retrieval
Desirable System Properties
5
Invariant to scale, illumination, translation
Be able to provide a fast searchMemory Efficient
Image Retrieval
Desirable System Properties
6
Invariant to scale, illumination, translation
Be able to provide a fast searchMemory Efficient
How to achieve them?
Image Retrieval
General Pipeline
● Encode images into representations that are
○ Invariant
○ Compact
7
8
Image Retrieval
Image Representations
● Hand-Crafted Features (Before 2012)
○ SIFT
○ SIFT + BoW
○ VLAD
○ Fisher Vectors
Josef Sivic, Andrew Zisserman, et al. Video google: A text retrieval approach to object matching in videos. In ICCV, 2003.
9
Image Retrieval
Image Representations
● Learned Features (After 2012)
○ Convolutional Neural Networks (CNNs)
.
.
.
.
.
.
Convolutional Layer Max-Pooling Layer Fully-Connected Layer
Flattening
Class Predictions
Input Image
Softmax Activation
Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition.
arXiv preprint arXiv:1409.1556. (2014)
VGG-16
10
Image Retrieval
Image Representations
● Learned Features (After 2012)
○ Convolutional Neural Networks (CNNs)
.
.
.
.
.
.
Flattening
Large Datasets
Labels
Computational Power (GPUs)
Training is expensive!
11
Retrieval
System
➔ Collecting, annotating and cleaning a large dataset to train models
requires great effort.
➔ Training or fine-tuning a model every time new data is added is neither
efficient nor scalable.
Dynamic datasets of unlabeled images
Image Representations
Challenges for Learning Features in Retrieval
12
➔ Transferring the knowledge of a model trained in a large-scale dataset.
(e.g. ImageNet)
One solution that does not involve training is Transfer Learning
Image Representations
Challenges for Learning Features
Classification → Retrieval
How?
Outline
▷ Related Work
▷ Proposal
▷ Experiments
▷ Conclusions
13
Related Work
14
Babenko, A., Slesarev, A., Chigorin, A., & Lempitsky, V. (2014). Neural codes for image retrieval. In ECCV 2014
Razavian, A., Azizpour, H., Sullivan, J., & Carlsson, S. (2014). CNN features off-the-shelf: an astounding baseline for recognition. In
DeepVision CVPRW 2014
Fully-connected features as global representation
15
Image Representations
Fully-Connected Features
.
.
.
.
.
.
Convolutional Layer Max-Pooling Layer Fully-Connected Layer
Flattening
Class Predictions
Input Image
Softmax Activation
Neural Codes
Babenko, A., & Lempitsky, V. (2015). Aggregating local deep features for image retrieval. ICCV 2015
Tolias, G., Sicre, R., & Jégou, H. (2016). Particular object retrieval with integral max-pooling of CNN activations. ICLR 2016
Kalantidis, Y., Mellina, C., & Osindero, S. (2015). Cross-dimensional Weighting for Aggregated Deep Convolutional Features. arXiv
preprint arXiv:1512.04065.
16
Image Representations
Convolutional Features
.
.
.
.
.
.
Convolutional Layer Max-Pooling Layer Fully-Connected Layer
Flattening
Class Predictions
Input Image
Softmax Activation
Convolutional
Features
Convolutional features as global representation
➔ They convey the spatial information of the image
17
Image Representations
From Feature Maps to Compact Representations
(Fh
x Fw
x 3) x K Filters
Stride: S, Padding: P
Ih
x Iw
x 3 (H x W) x K Feature Maps
H = (Ih
− Fh
+ 2P) / S +1
W = (Iw
− Fw
+ 2P) / S +1
Image Convolutional layer Feature Maps
18
Image Representations
From Feature Maps to Compact Representations
1 0 1
0 1 0
9 0 4
1 2 1
0 0 0
0 0 0
0 0 1
0 0 0
5 1 1
Max Pooling
Max Pooling
Max Pooling
9
2
5
3 Feature Maps (3x3) Compact vector of (1x3)Pooling Operation
19
Image Representations
3 Feature Maps (3x3)
1 0 1
1 1 0
9 0 4
1 2 1
0 0 0
0 0 0
0 0 1
0 0 0
5 1 1
Sum Pooling
Sum Pooling
Sum Pooling
Compact vector of (1x3)
17
4
8
From Feature Maps to Compact Representations
Pooling Operation
Image Representations
Related Works
SPoC R-MAC BoW CroW
Combine and aggregate convolutional features:
● SPoC → Gaussian prior + Sum Pooling
● R-MAC → Rigid grid of regions + Max Pooling
● BoW → Bag of Visual Words
● CroW → Spatial weighting based on normalized sum of feature maps (boost
salient content) + channel weighting based on sparsity + Sum Pooling
20
Image Representations
Related Works
SPoC R-MAC BoW CroW
21
All except CroW:
● Apply fixed weights
● Based on fixed regions
Not based on image contents
Proposal
22
Dog
Tennis Ball
Sand bar
Seashore
23
Motivation
Image Semantics
Objective
.
.
.
.
.
.
Flattening
Class Predictions
Input Image Convolutional
Features
Combine semantic content of the images with convolutional features
Explicit
Semantic
Information
Statue
GAP .
.
.
.
.
.
w1
w2
w3
wN
Class 1
(Tennis Ball)
VGG-16
CAMi
= w1,i
∗ conv1
+ w2,i
∗ conv2
+ … + wN,i
∗ convN
Class N
CAM
(Tennis Ball)
B. Zhou, A. Khosla, Lapedriza. A., A. Oliva, and A. Torralba. 2016. Learning Deep Features for Discriminative Localization. CVPR
(2016).
25
CAM layer
Class Activation Maps
Obtaining Discriminative Regions
Tennis Ball Weimaraner
Simple Classes
26
Class Activation Maps
Examples
Sand Bar Seashore
Complex Classes
27
Class Activation Maps
Examples
Encode images dynamically combining their semantics with convolutional features using
only the knowledge contained inside a trained network
Compact
Descriptor
Class-Weighted Convolutional Features
28
GAP .
.
.
.
.
.
w1
w2
w3
wN
Class 1
(Tennis Ball)
CAM layer
Class N
Proposal
29
CNN
CW1
CW2
CWK
Class 1
(Weimaraner) F1
= [ f1,1
, f1,2
, … , f1,K
]
CW1
CW2
CWK F2
= [ f2,1
, f2,2
, … , f2,K
]
CW1
CW2
CWK F3
= [ f3,1
, f3,2
, … , f3,K
]
Class 2
(Tennis Ball)
Class 3
(Seashore)
Sum-Pooling
l2
norm
PCA + l2
PCA+ l2
PCA + l2
l2
Sum-Pooling
l2
norm
Sum-Pooling
l2
norm
DI
= [d1
, … , dK
]
Image: I
2. Feature Weighting and Pooling 3. Descriptor Aggregation
CAM layer
Feat. Ext.
1. Features & CAMs Extraction
Image Encoding Pipeline
GAP .
.
.
.
.
.
w1
w2
w3
wN
Class 1
(Tennis Ball)
CAM layer
Class N
Feature Extraction
CAMs Computation
Depth = K
In a single forward pass we extract convolutional features and image CAMs
Normalized
CAMs
[0,1]
30
Image Encoding Pipeline
1. Features & CAMs Extraction
CW1
CW2
CWK
F1
= [f1,1
, f1,2
, … , f1,K
]
CW1
CW2
CWK
F2
= [f2,1
, f2,2
, … , f2,K
]
CW1
CW2
CWK
FN
= [fN,1
, fN,2
, … , fN,K
]
Sum-Pooling
l2
norm
Sum-Pooling
l2
norm
Sum-Pooling
l2
norm
.
.
.
.
.
.
One vector per class:
31
Tennis Ball
(Class 1)
Weimaraner
(Class 2)
Seashore
(Class N)
Spatial Weighting using Class Activation Maps
Channel Weighting based on feature maps sparsity (as in CroW)
Image Encoding Pipeline
2. Feature Weighting and Pooling
32
F1
= [ f1,1
, f1,2
, … , f1,K
]
F2
= [ f2,1
, f2,2
, … , f2,K
]
F N
= [ fN,1
, fN,2
, … , fN,K
]
PCA + l2
PCA+ l2
PCA + l2
l2
DI
= [d1
, … , dK
]
➔ Which classes do we aggregate?
➔ What is the optimal number of classes (NC
) ?
➔ What is the optimal number of classes (NPCA
) to compute the PCA ?
.
.
.
Image Encoding Pipeline
3. Descriptor Aggregation
Image
Encoding
Pipeline
Store all the
class vectors (Fc
)
Image Dataset
Query
Image
Encoding
Pipeline
CNN Most Probable Classes Prediction
from query (Retriever, Seashore...)
Query Descriptor
Dataset Descriptor
Aggregation
Scores
33
Descriptor Aggregation Strategies
Online Aggregation (OnA)
Image
Encoding
Pipeline
Store final descriptor
Image Dataset
Query
Image
Encoding
Pipeline
CNN Most Probable Classes Prediction
Query Descriptor
Dataset Descriptor
Aggregation
Scores
Pre-Computed Descriptors
34
Descriptor Aggregation Strategies
Offline Aggregation (OfA)
Experiments
35
● Framework: Keras with Theano as backend / PyTorch
● Images resized to 1024x720 (keeping aspect ratio)
● We explored using different CNNs as feature extractors
36
Experimental Setup
Experimental Setup
Examples of CAMs from different networks
37
Experimental Setup
Examples of CAMs from different networks
38
● VGG-16 CAM model as feature extractor (pre-trained on ImageNet)
● Features from Conv5_1 Layer
GAP .
.
.
.
.
.
w1
w2
w3
wN
CAM layer
39
Conv5_1
Experimental Setup
● Datasets
○ Oxford 5k
○ Paris 6k
○ 100k Distractors (Flickr) → Oxford105k, Paris106k
● Using the features that fall inside the cropped query
● PCA computed with Oxf5k when testing in Par6k and vice versa
Philbin, J. , Chum, O. , Isard, M. , Sivic, J. and Zisserman, A. Object retrieval with large vocabularies and fast spatial matching, CVPR 2007
Philbin, J. , Chum, O. , Isard, M. , Sivic, J. and Zisserman, A. Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image
Databases. CVPR 2008
40
Experimental Setup
● Scores computed with cosine similarity
41
● Evaluation metric: mean Average Precision (mAP)
Experimental Setup
42
Finding the optimal parameters
Ablation Studies
● We carried out ablation studies to validate our method
● Computational Burden
● Number of top-classes to add: NC
● Number of top-classes to compute the PCA matrix: NPCA
43
Ablation Studies
Baseline Results & Computational Burden
44
Ablation Studies
Networks Comparison
45
Nc
corresponds to the number of classes aggregated.
Npca
corresponds to the classes used to compute the PCA (computed with data of
Oxford when testing in Paris and vice versa).
Ablation Studies
Sensitivity with respect to NC
and Npca
46
The first 16 classes correspond to: vault, bell cote, bannister, analog clock, movie
theater, coil, pier, dome, pedestal, flagpole, church, chime, suspension bridge, birdhouse,
sundial, triumphal arch. All of them are related to buildings.
Ablation Studies
Using predefined classes
47
Comparison with State-of-the-Art
Nc
= 64 , Npca
= 1
48
Search Post-Processing
● We establish a set of thresholds based on the max value of the
normalized CAM
○ 1%, 10%, 20%, 30%, 40%
● Generate a bounding box covering the largest connected element
● Compare the descriptor of each new region of the target image with the
query descriptor and generate new scores.
● Results a bit better if we average over 2 most probable CAMs
● For the top-R target images
Region Proposals for Re-Ranking
Query Expansion
● Generate a new query by l2
normalizing the sum of the top-QE target descriptors
49
Green - Ground Truth
Orange → Red - Different Thresholds
Region Proposals using CAMs
50
Green - Ground Truth
Orange → Red - Different Thresholds
Region Proposals using CAMs
51
Comparison with State-of-the-Art
After Re-Ranking and Query Expansion
Nc
= 64 , Npca
= 1, Nre-ranking
= 6
Conclusions
52
▷ In this work we proposed a technique to build compact image
representations focusing on their semantic content.
▷ We employed an image encoding pipeline that makes use of a
pre-trained CNN and Class Activation Maps to extract discriminative
regions from the image and weight its convolutional features
accordingly
▷ Our experiments demonstrated that selecting the relevant content of
an image to build the image descriptor is beneficial, and contributes to
increase the retrieval performance. The proposed approach establishes
a new state-of-the-art compared to methods that build image
representations combining off-the-shelf features using random or fixed
grid regions.
53
Conclusions
5. Conclusions
54

Contenu connexe

Tendances

Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationDat Nguyen
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Object detection - RCNNs vs Retinanet
Object detection - RCNNs vs RetinanetObject detection - RCNNs vs Retinanet
Object detection - RCNNs vs RetinanetRishabh Indoria
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Hwa Pyung Kim
 
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Universitat Politècnica de Catalunya
 
VJAI Paper Reading#3-KDD2019-ClusterGCN
VJAI Paper Reading#3-KDD2019-ClusterGCNVJAI Paper Reading#3-KDD2019-ClusterGCN
VJAI Paper Reading#3-KDD2019-ClusterGCNDat Nguyen
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingwolf
 
Review: You Only Look One-level Feature
Review: You Only Look One-level FeatureReview: You Only Look One-level Feature
Review: You Only Look One-level FeatureDongmin Choi
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkNader Karimi
 
Object Detection Methods using Deep Learning
Object Detection Methods using Deep LearningObject Detection Methods using Deep Learning
Object Detection Methods using Deep LearningSungjoon Choi
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basicsBrodmann17
 

Tendances (20)

Region-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object RetrievalRegion-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object Retrieval
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance Segmentation
 
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
 
Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)
Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)
Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)
 
Mask R-CNN
Mask R-CNNMask R-CNN
Mask R-CNN
 
Object detection - RCNNs vs Retinanet
Object detection - RCNNs vs RetinanetObject detection - RCNNs vs Retinanet
Object detection - RCNNs vs Retinanet
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)
 
Deep Learning for Computer Vision: Object Detection (UPC 2016)
Deep Learning for Computer Vision: Object Detection (UPC 2016)Deep Learning for Computer Vision: Object Detection (UPC 2016)
Deep Learning for Computer Vision: Object Detection (UPC 2016)
 
Visual Object Analysis using Regions and Local Features
Visual Object Analysis using Regions and Local FeaturesVisual Object Analysis using Regions and Local Features
Visual Object Analysis using Regions and Local Features
 
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
 
VJAI Paper Reading#3-KDD2019-ClusterGCN
VJAI Paper Reading#3-KDD2019-ClusterGCNVJAI Paper Reading#3-KDD2019-ClusterGCN
VJAI Paper Reading#3-KDD2019-ClusterGCN
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble tracking
 
Lec11 object-re-id
Lec11 object-re-idLec11 object-re-id
Lec11 object-re-id
 
Review: You Only Look One-level Feature
Review: You Only Look One-level FeatureReview: You Only Look One-level Feature
Review: You Only Look One-level Feature
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
 
Object Detection Methods using Deep Learning
Object Detection Methods using Deep LearningObject Detection Methods using Deep Learning
Object Detection Methods using Deep Learning
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basics
 
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
 

Similaire à Class Weighted Convolutional Features for Image Retrieval

D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Supervised Learning of Semantic Classes for Image Annotation and Retrieval
Supervised Learning of Semantic Classes for Image Annotation and RetrievalSupervised Learning of Semantic Classes for Image Annotation and Retrieval
Supervised Learning of Semantic Classes for Image Annotation and RetrievalLukas Tencer
 
#10 pydata warsaw object detection with dn ns
#10   pydata warsaw object detection with dn ns#10   pydata warsaw object detection with dn ns
#10 pydata warsaw object detection with dn nsAndrew Brozek
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術CHENHuiMei
 
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...Sean Moran
 
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)Yusuke Uchida
 
Methods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data SetsMethods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data SetsRyan B Harvey, CSDP, CSM
 
object detection paper review
object detection paper reviewobject detection paper review
object detection paper reviewYoonho Na
 
Placing Images with Refined Language Models and Similarity Search with PCA-re...
Placing Images with Refined Language Models and Similarity Search with PCA-re...Placing Images with Refined Language Models and Similarity Search with PCA-re...
Placing Images with Refined Language Models and Similarity Search with PCA-re...Symeon Papadopoulos
 
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015Jia-Bin Huang
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer VisionSungjoon Choi
 
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineFast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineSoma Boubou
 
NetVLAD: CNN architecture for weakly supervised place recognition
NetVLAD:  CNN architecture for weakly supervised place recognitionNetVLAD:  CNN architecture for weakly supervised place recognition
NetVLAD: CNN architecture for weakly supervised place recognitionGeunhee Cho
 
Learning visual explanations for DCNN-based image classifiers using an attent...
Learning visual explanations for DCNN-based image classifiers using an attent...Learning visual explanations for DCNN-based image classifiers using an attent...
Learning visual explanations for DCNN-based image classifiers using an attent...VasileiosMezaris
 
Computer Vision image classification
Computer Vision image classificationComputer Vision image classification
Computer Vision image classificationWael Badawy
 
On-the-fly Visual Category Search in Web-scale Image Collections
On-the-fly Visual Category Search in Web-scale Image CollectionsOn-the-fly Visual Category Search in Web-scale Image Collections
On-the-fly Visual Category Search in Web-scale Image CollectionsKen Chatfield
 

Similaire à Class Weighted Convolutional Features for Image Retrieval (20)

D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
 
ECCV WS 2012 (Frank)
ECCV WS 2012 (Frank)ECCV WS 2012 (Frank)
ECCV WS 2012 (Frank)
 
Lec07 aggregation-and-retrieval-system
Lec07 aggregation-and-retrieval-systemLec07 aggregation-and-retrieval-system
Lec07 aggregation-and-retrieval-system
 
Supervised Learning of Semantic Classes for Image Annotation and Retrieval
Supervised Learning of Semantic Classes for Image Annotation and RetrievalSupervised Learning of Semantic Classes for Image Annotation and Retrieval
Supervised Learning of Semantic Classes for Image Annotation and Retrieval
 
#10 pydata warsaw object detection with dn ns
#10   pydata warsaw object detection with dn ns#10   pydata warsaw object detection with dn ns
#10 pydata warsaw object detection with dn ns
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術
 
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
 
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
 
Methods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data SetsMethods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data Sets
 
object detection paper review
object detection paper reviewobject detection paper review
object detection paper review
 
Placing Images with Refined Language Models and Similarity Search with PCA-re...
Placing Images with Refined Language Models and Similarity Search with PCA-re...Placing Images with Refined Language Models and Similarity Search with PCA-re...
Placing Images with Refined Language Models and Similarity Search with PCA-re...
 
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineFast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
 
D3L4-objects.pdf
D3L4-objects.pdfD3L4-objects.pdf
D3L4-objects.pdf
 
NetVLAD: CNN architecture for weakly supervised place recognition
NetVLAD:  CNN architecture for weakly supervised place recognitionNetVLAD:  CNN architecture for weakly supervised place recognition
NetVLAD: CNN architecture for weakly supervised place recognition
 
Learning visual explanations for DCNN-based image classifiers using an attent...
Learning visual explanations for DCNN-based image classifiers using an attent...Learning visual explanations for DCNN-based image classifiers using an attent...
Learning visual explanations for DCNN-based image classifiers using an attent...
 
Computer Vision image classification
Computer Vision image classificationComputer Vision image classification
Computer Vision image classification
 
On-the-fly Visual Category Search in Web-scale Image Collections
On-the-fly Visual Category Search in Web-scale Image CollectionsOn-the-fly Visual Category Search in Web-scale Image Collections
On-the-fly Visual Category Search in Web-scale Image Collections
 
Poster_Final
Poster_FinalPoster_Final
Poster_Final
 

Plus de Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 

Plus de Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Dernier

Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 

Dernier (20)

Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 

Class Weighted Convolutional Features for Image Retrieval

  • 1. Class-Weighted Convolutional Features for Image Retrieval Xavier Giró-i-NietoAlbert Jiménez Jose Alvarez [Code][arXiv]
  • 3. Image Retrieval Given an image query, generate a ranked list of similar images Task Definition 3
  • 4. Image Retrieval Given an image query, generate a ranked list of similar images Task Definition 4 What are the properties that we want for these systems?
  • 5. Image Retrieval Desirable System Properties 5 Invariant to scale, illumination, translation Be able to provide a fast searchMemory Efficient
  • 6. Image Retrieval Desirable System Properties 6 Invariant to scale, illumination, translation Be able to provide a fast searchMemory Efficient How to achieve them?
  • 7. Image Retrieval General Pipeline ● Encode images into representations that are ○ Invariant ○ Compact 7
  • 8. 8 Image Retrieval Image Representations ● Hand-Crafted Features (Before 2012) ○ SIFT ○ SIFT + BoW ○ VLAD ○ Fisher Vectors Josef Sivic, Andrew Zisserman, et al. Video google: A text retrieval approach to object matching in videos. In ICCV, 2003.
  • 9. 9 Image Retrieval Image Representations ● Learned Features (After 2012) ○ Convolutional Neural Networks (CNNs) . . . . . . Convolutional Layer Max-Pooling Layer Fully-Connected Layer Flattening Class Predictions Input Image Softmax Activation Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. (2014) VGG-16
  • 10. 10 Image Retrieval Image Representations ● Learned Features (After 2012) ○ Convolutional Neural Networks (CNNs) . . . . . . Flattening Large Datasets Labels Computational Power (GPUs) Training is expensive!
  • 11. 11 Retrieval System ➔ Collecting, annotating and cleaning a large dataset to train models requires great effort. ➔ Training or fine-tuning a model every time new data is added is neither efficient nor scalable. Dynamic datasets of unlabeled images Image Representations Challenges for Learning Features in Retrieval
  • 12. 12 ➔ Transferring the knowledge of a model trained in a large-scale dataset. (e.g. ImageNet) One solution that does not involve training is Transfer Learning Image Representations Challenges for Learning Features Classification → Retrieval How?
  • 13. Outline ▷ Related Work ▷ Proposal ▷ Experiments ▷ Conclusions 13
  • 15. Babenko, A., Slesarev, A., Chigorin, A., & Lempitsky, V. (2014). Neural codes for image retrieval. In ECCV 2014 Razavian, A., Azizpour, H., Sullivan, J., & Carlsson, S. (2014). CNN features off-the-shelf: an astounding baseline for recognition. In DeepVision CVPRW 2014 Fully-connected features as global representation 15 Image Representations Fully-Connected Features . . . . . . Convolutional Layer Max-Pooling Layer Fully-Connected Layer Flattening Class Predictions Input Image Softmax Activation Neural Codes
  • 16. Babenko, A., & Lempitsky, V. (2015). Aggregating local deep features for image retrieval. ICCV 2015 Tolias, G., Sicre, R., & Jégou, H. (2016). Particular object retrieval with integral max-pooling of CNN activations. ICLR 2016 Kalantidis, Y., Mellina, C., & Osindero, S. (2015). Cross-dimensional Weighting for Aggregated Deep Convolutional Features. arXiv preprint arXiv:1512.04065. 16 Image Representations Convolutional Features . . . . . . Convolutional Layer Max-Pooling Layer Fully-Connected Layer Flattening Class Predictions Input Image Softmax Activation Convolutional Features Convolutional features as global representation ➔ They convey the spatial information of the image
  • 17. 17 Image Representations From Feature Maps to Compact Representations (Fh x Fw x 3) x K Filters Stride: S, Padding: P Ih x Iw x 3 (H x W) x K Feature Maps H = (Ih − Fh + 2P) / S +1 W = (Iw − Fw + 2P) / S +1 Image Convolutional layer Feature Maps
  • 18. 18 Image Representations From Feature Maps to Compact Representations 1 0 1 0 1 0 9 0 4 1 2 1 0 0 0 0 0 0 0 0 1 0 0 0 5 1 1 Max Pooling Max Pooling Max Pooling 9 2 5 3 Feature Maps (3x3) Compact vector of (1x3)Pooling Operation
  • 19. 19 Image Representations 3 Feature Maps (3x3) 1 0 1 1 1 0 9 0 4 1 2 1 0 0 0 0 0 0 0 0 1 0 0 0 5 1 1 Sum Pooling Sum Pooling Sum Pooling Compact vector of (1x3) 17 4 8 From Feature Maps to Compact Representations Pooling Operation
  • 20. Image Representations Related Works SPoC R-MAC BoW CroW Combine and aggregate convolutional features: ● SPoC → Gaussian prior + Sum Pooling ● R-MAC → Rigid grid of regions + Max Pooling ● BoW → Bag of Visual Words ● CroW → Spatial weighting based on normalized sum of feature maps (boost salient content) + channel weighting based on sparsity + Sum Pooling 20
  • 21. Image Representations Related Works SPoC R-MAC BoW CroW 21 All except CroW: ● Apply fixed weights ● Based on fixed regions Not based on image contents
  • 24. Objective . . . . . . Flattening Class Predictions Input Image Convolutional Features Combine semantic content of the images with convolutional features Explicit Semantic Information Statue
  • 25. GAP . . . . . . w1 w2 w3 wN Class 1 (Tennis Ball) VGG-16 CAMi = w1,i ∗ conv1 + w2,i ∗ conv2 + … + wN,i ∗ convN Class N CAM (Tennis Ball) B. Zhou, A. Khosla, Lapedriza. A., A. Oliva, and A. Torralba. 2016. Learning Deep Features for Discriminative Localization. CVPR (2016). 25 CAM layer Class Activation Maps Obtaining Discriminative Regions
  • 26. Tennis Ball Weimaraner Simple Classes 26 Class Activation Maps Examples
  • 27. Sand Bar Seashore Complex Classes 27 Class Activation Maps Examples
  • 28. Encode images dynamically combining their semantics with convolutional features using only the knowledge contained inside a trained network Compact Descriptor Class-Weighted Convolutional Features 28 GAP . . . . . . w1 w2 w3 wN Class 1 (Tennis Ball) CAM layer Class N Proposal
  • 29. 29 CNN CW1 CW2 CWK Class 1 (Weimaraner) F1 = [ f1,1 , f1,2 , … , f1,K ] CW1 CW2 CWK F2 = [ f2,1 , f2,2 , … , f2,K ] CW1 CW2 CWK F3 = [ f3,1 , f3,2 , … , f3,K ] Class 2 (Tennis Ball) Class 3 (Seashore) Sum-Pooling l2 norm PCA + l2 PCA+ l2 PCA + l2 l2 Sum-Pooling l2 norm Sum-Pooling l2 norm DI = [d1 , … , dK ] Image: I 2. Feature Weighting and Pooling 3. Descriptor Aggregation CAM layer Feat. Ext. 1. Features & CAMs Extraction Image Encoding Pipeline
  • 30. GAP . . . . . . w1 w2 w3 wN Class 1 (Tennis Ball) CAM layer Class N Feature Extraction CAMs Computation Depth = K In a single forward pass we extract convolutional features and image CAMs Normalized CAMs [0,1] 30 Image Encoding Pipeline 1. Features & CAMs Extraction
  • 31. CW1 CW2 CWK F1 = [f1,1 , f1,2 , … , f1,K ] CW1 CW2 CWK F2 = [f2,1 , f2,2 , … , f2,K ] CW1 CW2 CWK FN = [fN,1 , fN,2 , … , fN,K ] Sum-Pooling l2 norm Sum-Pooling l2 norm Sum-Pooling l2 norm . . . . . . One vector per class: 31 Tennis Ball (Class 1) Weimaraner (Class 2) Seashore (Class N) Spatial Weighting using Class Activation Maps Channel Weighting based on feature maps sparsity (as in CroW) Image Encoding Pipeline 2. Feature Weighting and Pooling
  • 32. 32 F1 = [ f1,1 , f1,2 , … , f1,K ] F2 = [ f2,1 , f2,2 , … , f2,K ] F N = [ fN,1 , fN,2 , … , fN,K ] PCA + l2 PCA+ l2 PCA + l2 l2 DI = [d1 , … , dK ] ➔ Which classes do we aggregate? ➔ What is the optimal number of classes (NC ) ? ➔ What is the optimal number of classes (NPCA ) to compute the PCA ? . . . Image Encoding Pipeline 3. Descriptor Aggregation
  • 33. Image Encoding Pipeline Store all the class vectors (Fc ) Image Dataset Query Image Encoding Pipeline CNN Most Probable Classes Prediction from query (Retriever, Seashore...) Query Descriptor Dataset Descriptor Aggregation Scores 33 Descriptor Aggregation Strategies Online Aggregation (OnA)
  • 34. Image Encoding Pipeline Store final descriptor Image Dataset Query Image Encoding Pipeline CNN Most Probable Classes Prediction Query Descriptor Dataset Descriptor Aggregation Scores Pre-Computed Descriptors 34 Descriptor Aggregation Strategies Offline Aggregation (OfA)
  • 36. ● Framework: Keras with Theano as backend / PyTorch ● Images resized to 1024x720 (keeping aspect ratio) ● We explored using different CNNs as feature extractors 36 Experimental Setup
  • 37. Experimental Setup Examples of CAMs from different networks 37
  • 38. Experimental Setup Examples of CAMs from different networks 38
  • 39. ● VGG-16 CAM model as feature extractor (pre-trained on ImageNet) ● Features from Conv5_1 Layer GAP . . . . . . w1 w2 w3 wN CAM layer 39 Conv5_1 Experimental Setup
  • 40. ● Datasets ○ Oxford 5k ○ Paris 6k ○ 100k Distractors (Flickr) → Oxford105k, Paris106k ● Using the features that fall inside the cropped query ● PCA computed with Oxf5k when testing in Par6k and vice versa Philbin, J. , Chum, O. , Isard, M. , Sivic, J. and Zisserman, A. Object retrieval with large vocabularies and fast spatial matching, CVPR 2007 Philbin, J. , Chum, O. , Isard, M. , Sivic, J. and Zisserman, A. Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases. CVPR 2008 40 Experimental Setup
  • 41. ● Scores computed with cosine similarity 41 ● Evaluation metric: mean Average Precision (mAP) Experimental Setup
  • 42. 42 Finding the optimal parameters Ablation Studies ● We carried out ablation studies to validate our method ● Computational Burden ● Number of top-classes to add: NC ● Number of top-classes to compute the PCA matrix: NPCA
  • 43. 43 Ablation Studies Baseline Results & Computational Burden
  • 45. 45 Nc corresponds to the number of classes aggregated. Npca corresponds to the classes used to compute the PCA (computed with data of Oxford when testing in Paris and vice versa). Ablation Studies Sensitivity with respect to NC and Npca
  • 46. 46 The first 16 classes correspond to: vault, bell cote, bannister, analog clock, movie theater, coil, pier, dome, pedestal, flagpole, church, chime, suspension bridge, birdhouse, sundial, triumphal arch. All of them are related to buildings. Ablation Studies Using predefined classes
  • 48. 48 Search Post-Processing ● We establish a set of thresholds based on the max value of the normalized CAM ○ 1%, 10%, 20%, 30%, 40% ● Generate a bounding box covering the largest connected element ● Compare the descriptor of each new region of the target image with the query descriptor and generate new scores. ● Results a bit better if we average over 2 most probable CAMs ● For the top-R target images Region Proposals for Re-Ranking Query Expansion ● Generate a new query by l2 normalizing the sum of the top-QE target descriptors
  • 49. 49 Green - Ground Truth Orange → Red - Different Thresholds Region Proposals using CAMs
  • 50. 50 Green - Ground Truth Orange → Red - Different Thresholds Region Proposals using CAMs
  • 51. 51 Comparison with State-of-the-Art After Re-Ranking and Query Expansion Nc = 64 , Npca = 1, Nre-ranking = 6
  • 53. ▷ In this work we proposed a technique to build compact image representations focusing on their semantic content. ▷ We employed an image encoding pipeline that makes use of a pre-trained CNN and Class Activation Maps to extract discriminative regions from the image and weight its convolutional features accordingly ▷ Our experiments demonstrated that selecting the relevant content of an image to build the image descriptor is beneficial, and contributes to increase the retrieval performance. The proposed approach establishes a new state-of-the-art compared to methods that build image representations combining off-the-shelf features using random or fixed grid regions. 53 Conclusions