IEEE International Conference on Multimedia & Expo 2013
Augmenting Descriptors for Fine-grained Visual
Categorization Using Polynomial Embedding
Hideki Nakayama
Graduate School of Information Science and Technology
The University of Tokyo
Outline
 Background and Motivation
 Our solution
 Polynomial embedding
 Experiment
 Fine-grained categorization
 Comparison with state-of-the-art
 Conclusion
Fine-grained visual categorization (FGVC)
 Distinguish hundreds of fine-grained objects
under a certain domain
(e.g., species of animals and plants)
 Complement to traditional object recognition problems
[Figure: generic object recognition on Caltech-256 [Griffin et al., 2007] (Airplane, Monitor, Dog) vs. FGVC on Caltech-Bird-200 [Welinder et al., 2010] (Yellow Warbler, Prairie Warbler, Pine Warbler)]
Motivation
 We need highly discriminative features to
distinguish visually very similar categories
 Especially at the local level
Two basic ideas
 1. Co-occurrence (correlation) of neighboring local descriptors
 Shapelet [Sabzmeydani et al., 2007], Covariance feature [Tuzel et al., 2006], GGV [Harada et al., 2012]
☺ Expected to capture middle-level local information
☹ Results in high-dimensional local features
 2. State-of-the-art bag-of-words representation
 Based on higher-order statistics of local features
 Fisher vector [Perronnin et al., 2010]: 2ND dimensions
 VLAD [Jegou et al., 2010]: ND dimensions
(N: number of visual words, D: size of local features)
☺ Remarkably high performance, enables linear classification
☹ Dimensionality increases linearly with the size of local features
Conflict: the high-dimensional local features of idea 1 make the representations of idea 2 very large
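To make the conflict concrete, the arithmetic can be sketched as below. The helper names `vlad_dim` and `fisher_dim` are hypothetical; the 64 visual words, the 64-dim latent descriptor, and the 18,528-dim polynomial vector are the figures quoted elsewhere in this deck.

```python
def vlad_dim(n_words: int, d: int) -> int:
    """VLAD: one D-dim residual sum per visual word, i.e. N * D."""
    return n_words * d


def fisher_dim(n_words: int, d: int) -> int:
    """Fisher vector: mean and variance gradients per visual word, i.e. 2 * N * D."""
    return 2 * n_words * d


# Compare encoding the raw 4-neighbor polynomial vector with encoding a 64-dim latent descriptor.
for name, d in [("64-dim latent descriptor", 64), ("4-neighbor polynomial vector", 18528)]:
    print(f"{name}: VLAD {vlad_dim(64, d):,} dims, Fisher vector {fisher_dim(64, d):,} dims")
```

Encoding the polynomial vector directly would give a Fisher vector of over two million dimensions per region, which is why the local features are compressed before encoding.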
Our approach
 Compress polynomial vectors of neighboring local features
with supervised dimensionality reduction
 Discriminative latent descriptor
 Encode by means of bag-of-words (Fisher vector)
 Logistic regression classifier
[Pipeline: descriptor (e.g. SIFT), dense sampling → polynomial vectors (1,000~10,000 dim) → CCA with category labels (training) → latent descriptor (64 dim) → Fisher vector → logistic regression classifier]
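Exploit co-occurrence information
 Polynomial vector of a descriptor (e.g. SIFT) at the target position and its left and right neighbors:
\[
p^{2}_{(x,y)} =
\begin{pmatrix}
v_{(x,y)} \\
\mathrm{upperVec}\left(v_{(x,y)} v_{(x,y)}^{T}\right) \\
\mathrm{Vec}\left(v_{(x,y)} v_{(x-\delta,y)}^{T}\right) \\
\mathrm{Vec}\left(v_{(x,y)} v_{(x+\delta,y)}^{T}\right)
\end{pmatrix}
\]
$v_{(x,y)}$: descriptor at target position, $v_{(x-\delta,y)}$: neighbor (left), $v_{(x+\delta,y)}$: neighbor (right)
$\mathrm{Vec}(\cdot)$: flattened vector of a matrix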
Exploit co-occurrence information
 More spatial information can be integrated with more
neighbors (but the vector becomes high-dimensional)
0-neighbor (2,144 dim):
\[
p^{0}_{(x,y)} =
\begin{pmatrix}
v_{(x,y)} \\
\mathrm{upperVec}\left(v_{(x,y)} v_{(x,y)}^{T}\right)
\end{pmatrix}
\]
2-neighbors (10,336 dim):
\[
p^{2}_{(x,y)} =
\begin{pmatrix}
v_{(x,y)} \\
\mathrm{upperVec}\left(v_{(x,y)} v_{(x,y)}^{T}\right) \\
\mathrm{Vec}\left(v_{(x,y)} v_{(x-\delta,y)}^{T}\right) \\
\mathrm{Vec}\left(v_{(x,y)} v_{(x+\delta,y)}^{T}\right)
\end{pmatrix}
\]
4-neighbors (18,528 dim):
\[
p^{4}_{(x,y)} =
\begin{pmatrix}
v_{(x,y)} \\
\mathrm{upperVec}\left(v_{(x,y)} v_{(x,y)}^{T}\right) \\
\mathrm{Vec}\left(v_{(x,y)} v_{(x-\delta,y)}^{T}\right) \\
\mathrm{Vec}\left(v_{(x,y)} v_{(x+\delta,y)}^{T}\right) \\
\mathrm{Vec}\left(v_{(x,y)} v_{(x,y-\delta)}^{T}\right) \\
\mathrm{Vec}\left(v_{(x,y)} v_{(x,y+\delta)}^{T}\right)
\end{pmatrix}
\]
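The construction of the polynomial vectors can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the author's implementation: the function names and the `OFFSETS` table are hypothetical, the input descriptor is assumed to be 64-dimensional, and upperVec is assumed to keep the upper triangle including the diagonal. Under those assumptions the sizes match the 2,144, 10,336 and 18,528 dimensions quoted on the slide.

```python
import numpy as np

# Neighbor offsets (dy, dx) in units of delta; the ordering is an assumption.
OFFSETS = {0: [], 2: [(0, -1), (0, 1)], 4: [(0, -1), (0, 1), (-1, 0), (1, 0)]}


def polynomial_vector(v, neighbors=()):
    """Stack v, upperVec(v v^T), and Vec(v n^T) for every neighbor descriptor n."""
    v = np.asarray(v, dtype=float)
    upper = np.triu_indices(v.size)  # upper triangle, diagonal included
    parts = [v, np.outer(v, v)[upper]]
    for n in neighbors:
        parts.append(np.outer(v, np.asarray(n, dtype=float)).ravel())
    return np.concatenate(parts)


def dense_polynomial_vectors(grid, delta=1, n_neighbors=2):
    """grid: (H, W, D) densely sampled descriptors; border positions are skipped."""
    height, width, _ = grid.shape
    vectors = []
    for y in range(delta, height - delta):
        for x in range(delta, width - delta):
            nbrs = [grid[y + dy * delta, x + dx * delta] for dy, dx in OFFSETS[n_neighbors]]
            vectors.append(polynomial_vector(grid[y, x], nbrs))
    return np.array(vectors)


rng = np.random.default_rng(0)
grid = rng.normal(size=(5, 6, 64))  # toy 5x6 grid of 64-dim descriptors
for k in (0, 2, 4):
    print(k, dense_polynomial_vectors(grid, delta=1, n_neighbors=k).shape)
```

Each extra neighbor adds one full D x D outer product (4,096 entries for D = 64), which is where the rapid growth in dimensionality comes from.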
Supervised dimensionality reduction
 Patch feature and label pairs
 Category label: Binary occurrence vector
 Strong supervision assumption
 Most patches should be related to the content (category)
 (Somewhat) justified for FGVC considering the applications
 Users will target the object, and sometimes can give a segmentation
(We do not perform manual segmentation in this work, though)
[Figure: every patch of an image of Allium triquetrum is paired with the same binary label vector $(0, 0, 1, 0)^{T}$]
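The patch-level pairing described above can be sketched as follows. The function name `patch_label_pairs` and the toy shapes are hypothetical; the only idea taken from the slide is that every patch inherits the binary occurrence vector of its image.

```python
import numpy as np


def patch_label_pairs(patch_features_per_image, image_labels, n_classes):
    """Pair every patch feature with the binary label vector of its image.

    patch_features_per_image: list of (n_patches_i, D) arrays, one per image.
    image_labels: list of integer class indices, one per image.
    """
    features = np.vstack(patch_features_per_image)
    counts = [len(f) for f in patch_features_per_image]
    one_hot = np.eye(n_classes)[image_labels]        # (n_images, n_classes)
    labels = np.repeat(one_hot, counts, axis=0)      # one row per patch
    return features, labels


# Toy example: two images with 3 and 2 patches, 4 categories.
feats = [np.zeros((3, 8)), np.ones((2, 8))]
P, L = patch_label_pairs(feats, [2, 0], n_classes=4)
print(P.shape, L.shape)
print(L[0])  # label vector shared by all patches of the first image
```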
Supervised dimensionality reduction
 Canonical Correlation Analysis (CCA) [Hotelling, 1936]
$p$: patch feature (polynomials), $l$: label feature
CCA finds linear transformations $s = A^{T}(p - \bar{p})$, $t = B^{T}(l - \bar{l})$ that maximize the correlation between $s$ and $t$
\[
C_{pl} C_{ll}^{-1} C_{lp} A = C_{pp} A \Lambda^{2} \quad (A^{T} C_{pp} A = I)
\]
\[
C_{lp} C_{pp}^{-1} C_{pl} B = C_{ll} B \Lambda^{2} \quad (B^{T} C_{ll} B = I)
\]
$C$: covariance matrices, $\Lambda$: canonical correlations
[Figure: image feature $p$ and label feature $l$ are projected to $s$ and $t$ in the canonical space]
Latent descriptor: $s = A^{T}(p - \bar{p})$, 1,000~10,000 dim → 64 dim (discriminative)
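A minimal NumPy sketch of the first eigenproblem above, which is the only one needed to obtain the latent descriptor. The function names are hypothetical, and the ridge term `reg` added to both covariance matrices is an assumption made here for numerical stability (centered one-hot labels give a singular $C_{ll}$); the slides do not say how the problem is regularized or solved.

```python
import numpy as np


def fit_cca(P, L, n_components, reg=1e-3):
    """Solve C_pl C_ll^-1 C_lp A = C_pp A Lambda^2 with A^T C_pp A = I."""
    n = P.shape[0]
    p_mean = P.mean(axis=0)
    Pc = P - p_mean
    Lc = L - L.mean(axis=0)
    C_pp = Pc.T @ Pc / n + reg * np.eye(P.shape[1])  # reg: assumed ridge term
    C_ll = Lc.T @ Lc / n + reg * np.eye(L.shape[1])
    C_pl = Pc.T @ Lc / n
    # Whiten with the Cholesky factor C_pp = R R^T, then solve a symmetric problem.
    R = np.linalg.cholesky(C_pp)
    X = np.linalg.solve(R, C_pl)                     # R^-1 C_pl
    M = X @ np.linalg.solve(C_ll, X.T)               # R^-1 C_pl C_ll^-1 C_lp R^-T
    M = (M + M.T) / 2                                # remove round-off asymmetry
    evals, evecs = np.linalg.eigh(M)
    order = np.argsort(evals)[::-1][:n_components]
    A = np.linalg.solve(R.T, evecs[:, order])        # A = R^-T U
    corr = np.sqrt(np.clip(evals[order], 0.0, 1.0))  # canonical correlations
    return A, p_mean, corr


def to_latent(P, A, p_mean):
    """Latent descriptor s = A^T (p - mean), applied row-wise."""
    return (P - p_mean) @ A


# Toy data: 500 patches, 20-dim features, 5 categories (not the paper's data).
rng = np.random.default_rng(0)
y = rng.integers(0, 5, size=500)
L = np.eye(5)[y]
P = rng.normal(size=(500, 20)) + 3.0 * L @ rng.normal(size=(5, 20))
A, p_mean, corr = fit_cca(P, L, n_components=3)
print(to_latent(P, A, p_mean).shape, corr)
```

Only `A` and the feature mean are needed at test time, so the label side of CCA is used during training only.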
Fisher Vector [Perronnin et al., 2010]
 State-of-the-art bag-of-words encoding method using
higher-order statistics of descriptors (mean and variance)
http://www.image-net.org/challenges/LSVRC/2010/ILSVRC2010_XRCE.pdf
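As a rough illustration of the encoding, the sketch below computes the mean and variance gradients of a diagonal-covariance GMM, using the standard normalization of the improved Fisher vector. It assumes the GMM parameters are already fitted and omits the power and L2 normalization steps; the function name and argument layout are hypothetical and not taken from the slides.

```python
import numpy as np


def fisher_vector(X, weights, means, sigmas):
    """Fisher vector of descriptors X under a diagonal GMM.

    X: (T, D) descriptors; weights: (N,); means, sigmas: (N, D), sigmas = std devs.
    Returns a vector of length 2 * N * D (mean part, then variance part).
    """
    T, D = X.shape
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]  # (T, N, D)
    # Log of weight * Gaussian density, used for the soft assignments.
    log_prob = (-0.5 * (diff ** 2).sum(axis=2)
                - np.log(sigmas).sum(axis=1)[None, :]
                - 0.5 * D * np.log(2 * np.pi)
                + np.log(weights)[None, :])
    log_prob -= log_prob.max(axis=1, keepdims=True)                  # stabilize exp
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)                        # (T, N) posteriors
    g_mu = (gamma[:, :, None] * diff).sum(axis=0) / (T * np.sqrt(weights)[:, None])
    g_sigma = ((gamma[:, :, None] * (diff ** 2 - 1)).sum(axis=0)
               / (T * np.sqrt(2 * weights)[:, None]))
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                  # toy descriptors
fv = fisher_vector(X, np.full(4, 0.25), rng.normal(size=(4, 8)), np.ones((4, 8)))
print(fv.shape)                                # 2 * N * D entries
```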
Experiments
Experimental setup
 FGVC Datasets
 Oxford-Flowers-102
 Caltech-Bird-200
 Descriptors
 SIFT, C-SIFT, Opponent-SIFT, Self Similarity
 Compressed into 64dim using several methods
 Fisher Vector
 64 Gaussians (visual words)
 Global + 3 horizontal spatial regions
 Classifier
 Logistic regression
 Evaluation
 Mean classification accuracy
Results: comparison with PCA and CCA
 Our method substantially improves performance for all
descriptors
 Just applying CCA to concatenated neighbors does not
improve performance
 Polynomial embedding makes sense (non-linear convolution)
[Figure: classification performance (%) with different embedding methods (all 64 dim) on the Flower and Bird datasets for SIFT, C-SIFT, Opp.-SIFT and SSIM. Bars: PCA (baseline), CCA (4-neighbors, without polynomials), PolEmb (4-neighbors, ours)]
Results: number of neighbors
 Including more neighbors improves performance
[Figure: classification performance (%) of our method with different numbers of neighbors (0, 2 and 4) on the Flower and Bird datasets for SIFT, C-SIFT, Opp.-SIFT and SSIM]
Comparison with other work
Our final system
 Combine four descriptors in a late-fusion
approach (SIFT, C-SIFT, Opp.-SIFT, SSIM)
 Sum of log-likelihoods output by each classifier
(weighted by its individual confidence)
[Figure: descriptors 1 to K (e.g. SIFT) each pass through the same pipeline (dense sampling → polynomial vectors → CCA trained with category labels → latent descriptor → Fisher vector → logistic regression classifier); the classifier outputs are summed to predict the category, e.g. Allium triquetrum]
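The fusion rule can be sketched as a weighted sum of per-classifier log-likelihoods. How the confidence weights are obtained is not stated on the slide, so `weights` is simply an input here (for instance, a validation accuracy per descriptor would be one hypothetical choice); the function name is also an assumption.

```python
import numpy as np


def late_fusion(log_probs, weights):
    """Fuse K classifiers by a weighted sum of their log-likelihoods.

    log_probs: (K, n_samples, n_classes) log-probabilities, one slice per classifier.
    weights: (K,) confidence weight of each classifier (assumed to be given).
    Returns the predicted class per sample and the fused scores.
    """
    log_probs = np.asarray(log_probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    fused = np.tensordot(weights, log_probs, axes=1)  # (n_samples, n_classes)
    return fused.argmax(axis=1), fused


# Two classifiers that disagree on one sample with two classes.
log_probs = np.log([[[0.6, 0.4]], [[0.2, 0.8]]])
print(late_fusion(log_probs, [1.0, 1.0])[0])  # equal weights
print(late_fusion(log_probs, [5.0, 1.0])[0])  # first classifier trusted more
```

With equal weights the more decisive second classifier wins; giving the first classifier a larger weight flips the decision.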
Comparison on FGVC datasets
 Our method outperforms previous work on bird
and flower datasets
For the bird dataset, [32] uses the bounding box only for training images,
therefore the result is not directly comparable to ours.
[Table: mean classification accuracy (%) on the bird and flower datasets for previous work and for our system with PCA (baseline), PolEmb, and PCA+PolEmb]
ImageCLEF 2013 Plant Identification
 Identify 250 plant species from different organs (Leaf,
Flower, Fruit, etc.)
 Took 1st place in the Natural Background task, and in 4 of 5
subtasks.
 (Coming in Sept., 2013.)
[Figure: example organs (Flower, Fruit, Leaf, Stem, Entire) for Kaki Persimmon, Silver birch and Boxelder maple]
Conclusion
 A simple but effective method for FGVC
 Embedding co-occurrence patterns of neighboring descriptors
 Obtain a discriminative, low-dimensional latent descriptor
to use together with the Fisher vector
 Polynomial embedding greatly improves the performance,
indicating the importance of non-linearity
 Patch-level strong supervision approximation
 Not always perfect but reasonable for FGVC problems
 Future work
 Theoretical analysis (probabilistic interpretation)
 Multiple instance dimensionality reduction
Thank you!
 Any questions?
Object and scene categorization
 Caltech-101 (Object dataset)
 MIT-Indoor-67 (Scene dataset)
Results: Object and scene categorization
 Our method seems less effective here than
in FGVC problems
 Combining the PCA feature with our feature
consistently improves performance
[Figure: mean classification accuracy (%) on Caltech-101 and MIT-Indoor-67 for SIFT, C-SIFT and Opp.-SIFT]


  • 1. IEEE International Conference on Multimedia & Expo 2013. Augmenting Descriptors for Fine-grained Visual Categorization Using Polynomial Embedding. Hideki Nakayama, Graduate School of Information Science and Technology, The University of Tokyo
  • 2. Outline
      - Background and Motivation
      - Our solution
          - Polynomial embedding
      - Experiment
          - Fine-grained categorization
          - Comparison with state-of-the-art
      - Conclusion
  • 3. Fine-grained visual categorization (FGVC)
      - Distinguish hundreds of fine-grained objects within a certain domain (e.g., species of animals and plants)
      - Complements traditional object recognition problems
      - [Figure: generic object recognition on Caltech-256 [Griffin et al., 2007] (Airplane, Monitor, Dog) vs. FGVC on Caltech-Bird-200 [Welinder et al., 2010] (Yellow Warbler, Prairie Warbler, Pine Warbler)]
  • 4. Motivation
      - We need highly discriminative features to distinguish visually very similar categories
      - Especially at the local level
  • 5. Two basic ideas
      - 1. Co-occurrence (correlation) of neighboring local descriptors
          - Shapelet [Sabzmeydani et al., 2007], covariance feature [Tuzel et al., 2006], GGV [Harada et al., 2012]
          - ☺ Expected to capture middle-level local information
          - ☹ Results in high-dimensional local features
      - 2. State-of-the-art bag-of-words representation
          - Based on higher-order statistics of local features
          - Fisher vector [Perronnin et al., 2010]: 2ND dimensions
          - VLAD [Jegou et al., 2010]: ND dimensions
          - (N: number of visual words, D: size of local features)
          - ☺ Remarkably high performance; enables linear classification
          - ☹ Dimensionality increases linearly with the size of local features
      - The two ideas conflict: co-occurrence features are high-dimensional, while the encoding size grows linearly with D
  • 6. Our approach
      - Compress polynomials of neighboring local feature vectors with supervised dimensionality reduction
          - Discriminative latent descriptor
      - Encode by means of bag-of-words (Fisher vector)
      - Logistic regression classifier
      - [Pipeline figure: descriptor (e.g., SIFT), dense sampling → polynomial vectors (1,000 to 10,000 dim) → CCA with category labels (training) → latent descriptor (64 dim) → Fisher vector → logistic regression classifier]
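  • 7. Exploit co-occurrence information
      - Polynomial vector built from the descriptor at the target position, $\mathbf{v}_{(x,y)}$ (e.g., SIFT), and its left and right neighbors, $\mathbf{v}_{(x-\delta,y)}$ and $\mathbf{v}_{(x+\delta,y)}$:
        $$\mathbf{p}^{(2)}_{(x,y)} = \begin{pmatrix} \mathbf{v}_{(x,y)} \\ \mathrm{upperVec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x,y)}^{T}\right) \\ \mathrm{Vec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x-\delta,y)}^{T}\right) \\ \mathrm{Vec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x+\delta,y)}^{T}\right) \end{pmatrix}$$
      - $\mathrm{Vec}(\cdot)$: flattened vector of a matrix ($\mathrm{upperVec}$ flattens only the upper-triangular part, since $\mathbf{v}\mathbf{v}^{T}$ is symmetric)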
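  • 8. Exploit co-occurrence information
      - More spatial information can be integrated with more neighbors (but the vector becomes high-dimensional)
      - 0-neighbor (2,144 dim):
        $$\mathbf{p}^{(0)}_{(x,y)} = \begin{pmatrix} \mathbf{v}_{(x,y)} \\ \mathrm{upperVec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x,y)}^{T}\right) \end{pmatrix}$$
      - 2-neighbors (10,336 dim):
        $$\mathbf{p}^{(2)}_{(x,y)} = \begin{pmatrix} \mathbf{v}_{(x,y)} \\ \mathrm{upperVec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x,y)}^{T}\right) \\ \mathrm{Vec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x-\delta,y)}^{T}\right) \\ \mathrm{Vec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x+\delta,y)}^{T}\right) \end{pmatrix}$$
      - 4-neighbors (18,528 dim):
        $$\mathbf{p}^{(4)}_{(x,y)} = \begin{pmatrix} \mathbf{v}_{(x,y)} \\ \mathrm{upperVec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x,y)}^{T}\right) \\ \mathrm{Vec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x-\delta,y)}^{T}\right) \\ \mathrm{Vec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x+\delta,y)}^{T}\right) \\ \mathrm{Vec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x,y-\delta)}^{T}\right) \\ \mathrm{Vec}\left(\mathbf{v}_{(x,y)}\mathbf{v}_{(x,y+\delta)}^{T}\right) \end{pmatrix}$$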
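The dimensionalities quoted on this slide can be checked with a short NumPy sketch. It assumes a 64-dim base descriptor (an inference from the quoted sizes, since 64 + 64·65/2 = 2,144) and uses random vectors in place of real SIFT descriptors; the function names `upper_vec` and `polynomial_vector` are hypothetical and not taken from the talk.

```python
import numpy as np

def upper_vec(m):
    """Flatten the upper triangle (including the diagonal) of a square matrix."""
    rows, cols = np.triu_indices(m.shape[0])
    return m[rows, cols]

def polynomial_vector(center, neighbors=()):
    """Stack v, upperVec(v v^T), and Vec(v n^T) for every neighbor n."""
    center = np.asarray(center, dtype=float)
    parts = [center, upper_vec(np.outer(center, center))]
    for n in neighbors:
        parts.append(np.outer(center, np.asarray(n, dtype=float)).ravel())
    return np.concatenate(parts)

rng = np.random.default_rng(0)
v = rng.normal(size=64)           # descriptor at the target position (64-dim assumed)
nbrs = rng.normal(size=(4, 64))   # stand-ins for left, right, upper, lower neighbors

dims = [polynomial_vector(v, nbrs[:k]).size for k in (0, 2, 4)]
print(dims)  # [2144, 10336, 18528]
```

Each additional neighbor contributes a full 64 × 64 outer product (4,096 entries), which is why the size grows so quickly and why a compression step is needed before encoding.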
  • 10. Supervised dimensionality reduction
      - Patch feature and label pairs
          - Category label: binary occurrence vector
      - Strong supervision assumption
          - Most patches should be related to the content (category)
      - (Somewhat) justified for FGVC considering the applications
          - Users will target the object and can sometimes provide a segmentation
          - (We do not perform manual segmentation in this work, though)
      - [Figure: patches from an image of Allium triquetrum, each paired with the same binary label vector (0, 0, 1, 0)]
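  • 11. Supervised dimensionality reduction
      - Canonical Correlation Analysis (CCA) [Hotelling, 1936]
      - $\mathbf{p}$: patch feature (polynomials); $\mathbf{l}$: label feature
      - CCA finds linear transformations $\mathbf{s} = A^{T}(\mathbf{p} - \bar{\mathbf{p}})$ and $\mathbf{t} = B^{T}(\mathbf{l} - \bar{\mathbf{l}})$ that maximize the correlation between $\mathbf{s}$ and $\mathbf{t}$:
        $$C_{pl} C_{ll}^{-1} C_{lp} A = C_{pp} A \Lambda^{2} \quad \left(A^{T} C_{pp} A = I\right)$$
        $$C_{lp} C_{pp}^{-1} C_{pl} B = C_{ll} B \Lambda^{2} \quad \left(B^{T} C_{ll} B = I\right)$$
      - $C$: covariance matrices; $\Lambda$: canonical correlations
      - Latent descriptor (discriminative): $\mathbf{s} = A^{T}(\mathbf{p} - \bar{\mathbf{p}})$, reducing 1,000 to 10,000 dim to 64 dim
      - [Figure: image feature p and label feature l projected to s and t in the canonical space]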
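A minimal NumPy sketch of the patch-feature side of this CCA step follows. It solves the first eigenproblem on the slide by whitening with a Cholesky factor of the patch covariance. The ridge term `reg`, the function name `fit_cca`, and the synthetic data are assumptions added for illustration; the talk does not specify how the problem is solved or regularized.

```python
import numpy as np

def fit_cca(P, L, dim, reg=1e-3):
    """Return (mean_p, A, corr) so that the latent descriptor is s = A.T @ (p - mean_p)."""
    n = P.shape[0]
    mean_p, mean_l = P.mean(axis=0), L.mean(axis=0)
    Pc, Lc = P - mean_p, L - mean_l
    # Covariances; the small ridge (an assumption) keeps them invertible.
    C_pp = Pc.T @ Pc / n + reg * np.eye(P.shape[1])
    C_ll = Lc.T @ Lc / n + reg * np.eye(L.shape[1])
    C_pl = Pc.T @ Lc / n
    # Whitening: with C_pp = R R^T and A = R^{-T} U, the generalized problem
    # C_pl C_ll^{-1} C_lp A = C_pp A Lambda^2 becomes a symmetric eigenproblem in U.
    R = np.linalg.cholesky(C_pp)
    M = np.linalg.solve(R, C_pl)                 # R^{-1} C_pl
    S = M @ np.linalg.solve(C_ll, M.T)           # R^{-1} C_pl C_ll^{-1} C_lp R^{-T}
    evals, evecs = np.linalg.eigh(S)             # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:dim]        # keep the largest correlations
    A = np.linalg.solve(R.T, evecs[:, order])    # satisfies A^T C_pp A = I
    corr = np.sqrt(np.clip(evals[order], 0.0, 1.0))
    return mean_p, A, corr

# Synthetic stand-in data: 3 categories, 10-dim "patch features".
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=300)
L = np.eye(3)[labels]                # binary occurrence (one-hot) label vectors
P = rng.normal(size=(300, 10))
P[:, :3] += 2.0 * L                  # first three coordinates carry class information

mean_p, A, corr = fit_cca(P, L, dim=2)
latent = (P - mean_p) @ A            # rows are s = A^T (p - mean_p)
print(latent.shape)  # (300, 2)
```

Only the projection `A` for the patch features is needed at test time, since labels are unavailable there; the label-side transformation `B` is therefore omitted from the sketch.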
  • 13. Fisher Vector [Perronnin et al., 2010]
      - State-of-the-art bag-of-words encoding method using higher-level statistics of descriptors (mean and variance)
      - http://www.image-net.org/challenges/LSVRC/2010/ILSVRC2010_XRCE.pdf
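As a rough illustration of where the 2ND dimensionality comes from, the sketch below computes the mean and variance gradient statistics of a Fisher vector for a diagonal-covariance Gaussian mixture. It is a simplified version under stated assumptions: the mixture parameters are random rather than fitted, the power and L2 normalization steps often applied afterwards are omitted, and the function name `fisher_vector` is hypothetical.

```python
import numpy as np

def fisher_vector(X, weights, means, sigmas):
    """Encode descriptors X (n, D) with a diagonal GMM of N components into a 2*N*D vector."""
    n = X.shape[0]
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]   # (n, N, D)
    # Log of weight * Gaussian density, up to a constant shared by all components.
    log_p = (-0.5 * (diff ** 2).sum(axis=2)
             - np.log(sigmas).sum(axis=1)[None, :]
             + np.log(weights)[None, :])
    log_p -= log_p.max(axis=1, keepdims=True)        # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)        # soft assignments (n, N)
    # Gradient statistics with respect to the means and the standard deviations.
    g_mean = (gamma[:, :, None] * diff).sum(axis=0) / (n * np.sqrt(weights)[:, None])
    g_var = (gamma[:, :, None] * (diff ** 2 - 1.0)).sum(axis=0) / (n * np.sqrt(2.0 * weights)[:, None])
    return np.concatenate([g_mean.ravel(), g_var.ravel()])

rng = np.random.default_rng(0)
N, D = 64, 64                                 # 64 Gaussians, 64-dim latent descriptors
weights = np.full(N, 1.0 / N)                 # placeholder mixture, not a fitted model
means = rng.normal(size=(N, D))
sigmas = np.ones((N, D))
X = rng.normal(size=(200, D))                 # stand-in for the descriptors of one region

fv = fisher_vector(X, weights, means, sigmas)
print(fv.shape)  # (8192,)
```

With N = 64 and D = 64 the encoding has 2 × 64 × 64 = 8,192 dimensions per spatial region, which shows why the descriptor size D has to stay small.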
  • 15. Experimental setup
      - FGVC datasets
          - Oxford-Flowers-102
          - Caltech-Bird-200
      - Descriptors
          - SIFT, C-SIFT, Opponent-SIFT, Self Similarity
          - Compressed into 64 dim using several methods
      - Fisher vector
          - 64 Gaussians (visual words)
          - Global + 3 horizontal spatial regions
      - Classifier
          - Logistic regression
      - Evaluation
          - Mean classification accuracy
      - [Example images: Flowers, Birds]
  • 16. Results: comparison with PCA and CCA
      - Our method substantially improves performance for all descriptors
      - Just applying CCA to concatenated neighbors does not improve performance
      - Polynomial embedding makes sense (non-linear convolution)
      - [Figure: classification performance (%) with different embedding methods (all 64 dim) on the Flower and Bird datasets, for SIFT, C-SIFT, Opp.-SIFT, and SSIM. Bars: PCA (baseline), CCA (4-neighbors, without polynomials), PolEmb (4-neighbors, ours)]
  • 17. Results: number of neighbors
      - Including more neighbors improves performance
      - [Figure: classification performance (%) of our method with different numbers of neighbors on the Flower and Bird datasets, for SIFT, C-SIFT, Opp.-SIFT, and SSIM]
  • 19. Our final system
      - Combine four descriptors (SIFT, C-SIFT, Opp.-SIFT, SSIM) in a late-fusion approach
      - Sum of log-likelihoods output by each classifier (weighted by its individual confidence)
      - [Pipeline figure: for each descriptor 1 to K (e.g., SIFT): dense sampling → polynomial vectors → CCA with category labels (training) → latent descriptor → Fisher vector → logistic regression classifier; the K classifier outputs are summed to predict the category (example: Allium triquetrum)]
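The late-fusion rule can be sketched as a weighted sum of per-classifier log-probabilities. The slide does not say how each classifier's confidence is measured, so the weights below are placeholders (for instance, they could be validation accuracies); the function name `late_fusion` and the toy probabilities are likewise illustrative.

```python
import numpy as np

def late_fusion(log_probs, weights):
    """Weighted sum of per-classifier log-probabilities; returns the predicted class per sample."""
    log_probs = np.asarray(log_probs, dtype=float)   # shape (K, n_samples, n_classes)
    weights = np.asarray(weights, dtype=float)       # shape (K,)
    fused = np.tensordot(weights, log_probs, axes=1) # shape (n_samples, n_classes)
    return fused.argmax(axis=1)

# Toy outputs of K = 2 classifiers for 2 samples and 3 classes.
probs = np.array([
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],   # classifier 1
    [[0.2, 0.7, 0.1], [0.1, 0.2, 0.7]],   # classifier 2
])
weights = [0.8, 0.4]                       # assumed confidences, not values from the talk

pred = late_fusion(np.log(probs), weights)
print(pred.tolist())  # [0, 2]
```

In this toy case the more trusted first classifier decides the first sample, while the second classifier's strong vote for class 2 overrides it on the second sample.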
  • 20. Comparison on FGVC datasets
      - Our method outperforms previous work on the bird and flower datasets
      - For the bird dataset, [32] uses the bounding box only for training images; therefore, the result is not directly comparable to ours.
      - [Table: mean classification accuracy (%) for previous work and for our (PCA) baseline, (PolEmb), and (PCA+PolEmb) variants]
  • 21. ImageCLEF 2013 Plant Identification
      - Identify 250 plant species from different organs (Leaf, Flower, Fruit, etc.)
      - Took 1st place in the Natural Background task, and in 4 of 5 subtasks
      - (Coming in Sept. 2013.)
      - [Figure: example images by organ (Flower, Fruit, Leaf, Stem, Entire) for species such as Kaki Persimmon, Silver birch, and Boxelder maple]
  • 22. Conclusion
      - A simple but effective method for FGVC
          - Embedding co-occurrence patterns of neighboring descriptors
          - Obtain a discriminative, low-dimensional latent descriptor to use together with the Fisher vector
          - Polynomial embedding greatly improves performance, indicating the importance of non-linearity
      - Patch-level strong supervision approximation
          - Not always perfect but reasonable for FGVC problems
      - Future work
          - Theoretical analysis (probabilistic interpretation)
          - Multiple instance dimensionality reduction
  • 23. Thank you! Any questions?
  • 24. Object and scene categorization
      - Caltech-101 (object dataset)
      - MIT-Indoor-67 (scene dataset)
  • 25. Results: object and scene categorization
      - Our method does not seem to be as effective as in FGVC problems
      - Combining the PCA feature with our feature consistently improves performance
      - [Figure: mean classification accuracy (%) on Caltech-101 and MIT-Indoor-67, for SIFT, C-SIFT, and Opp.-SIFT]