Visual Object Analysis using
Regions and Local Features
Carles Ventura Royo
Co-advisors
Xavier Giró i Nieto
Verónica Vilaplana Besler
Tutor
Ferran Marqués Acosta
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Conclusions
2
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Conclusions
3
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Conclusions
4
Introduction: Semantic segmentation
5
Instance
segmentation
Class
segmentation
boat
Introduction: Semantic segmentation
6
Part I: Single view Part II: Multiview
STATE OF
THE ART
OUR
RESULTS
Introduction: Visual Object Analysis
7
vs
Objects Scene
Introduction: Regions
8
Introduction: Regions
9
[Figure: image partition into numbered regions and the corresponding BINARY PARTITION TREE]
Introduction: Regions
10
[Figure: image partition into numbered regions and the corresponding REGION ADJACENCY GRAPH]
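A region adjacency graph can be sketched in a few lines. This is only an illustration, assuming the partition is given as a 2D grid of region labels and that two regions are adjacent when they share a 4-connected pixel edge; the function name and the toy label map are hypothetical, not taken from the thesis.

```python
def region_adjacency_graph(labels):
    """Map each region label to the set of labels it touches (4-connectivity)."""
    rows, cols = len(labels), len(labels[0])
    graph = {}
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            graph.setdefault(a, set())
            # Looking only right and down visits every pixel edge exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph


# Toy partition with four regions (hypothetical example).
labels = [
    [1, 2, 3],
    [1, 2, 3],
    [4, 4, 3],
]
rag = region_adjacency_graph(labels)
```

In this toy partition regions 1 and 3 never touch, so no edge links them in the graph.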
Introduction: Local Features
11
Local Features Global Features
Introduction: Local Features Aggregation
12
• Bag of Features (BoF) [1]
vector
quantization
codebook
Bag of Features
[1] G Csurka et al, Visual Categorization with Bags of Keypoints. ECCV’04
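The Bag of Features steps (vector quantization against a codebook, then a histogram) can be sketched as follows. This is a minimal illustration, assuming Euclidean nearest-codeword assignment and an L1-normalised histogram; the codebook and features are toy values, and the function names are hypothetical.

```python
import math


def nearest_codeword(feature, codebook):
    """Index of the codeword closest to `feature` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(feature, codebook[k]))


def bag_of_features(features, codebook):
    """Histogram of codeword assignments, normalised to sum to 1."""
    hist = [0.0] * len(codebook)
    for f in features:
        hist[nearest_codeword(f, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy codebook and local features (hypothetical values).
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
features = [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8), (3.8, 0.1)]
bof = bag_of_features(features, codebook)
```

The descriptor length equals the codebook size, whatever the number of local features in the image or region.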
Introduction: Local Features Aggregation
13
• Pooling
First Order Average Pooling (O1P) [1]: $\frac{1}{N} \sum_{i=1}^{N} x_i$
Second Order Average Pooling (O2P) [2]: $\frac{1}{N} \sum_{i=1}^{N} x_i x_i^T$
$x_i$: local features
No need for a codebook / High dimensionality
[1] Y Boureau et al, A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
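Both pooling operators can be written directly from their definitions. The sketch below uses plain Python lists and covers only the averaging step shown on the slide; any further normalisation applied in the cited papers is left out, and the toy features are hypothetical.

```python
def o1p(features):
    """First-order average pooling: mean of the local features (length d)."""
    n, d = len(features), len(features[0])
    return [sum(x[j] for x in features) / n for j in range(d)]


def o2p(features):
    """Second-order average pooling: mean of the outer products x_i x_i^T (d x d)."""
    n, d = len(features), len(features[0])
    return [
        [sum(x[r] * x[c] for x in features) / n for c in range(d)]
        for r in range(d)
    ]


# Two toy 2-dimensional local features (hypothetical values).
features = [(1.0, 2.0), (3.0, 4.0)]
mean = o1p(features)
second = o2p(features)
```

For d-dimensional local features O1P yields d values while O2P yields a symmetric d x d matrix, which is the source of the high dimensionality noted on the slide.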
Part I
Context analysis
in semantic segmentation
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Conclusions
15
Introduction: Context
16
Semantic context [1,2] Spatial context
[1] M Bar, Visual Objects in Context. Nature Reviews Neuroscience 2004
[2] A Rabinovich et al, Objects in Context. ICCV’07
GOAL: Analyze the influence of the
spatial context in object recognition
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Conclusions
17
Related Work: Ideal scenario
18
Ground
truth
object
location
[1] J.R.R. Uijlings et al., The Visual Extent of an Object. IJCV’12
Conclusion: Aggregating the local features over three region pools
(interior, border and surround) increases the performance [1]
Related Work: Realistic scenario
• Pipeline [1]
19
Input
image
Generate
object
candidates
Rank
object
candidates
Predict
class
scores
Aggregate
high-rank
candidates
[1] J Carreira et al, Object Recognition as Ranking
Holistic Figure-Ground Hypotheses. CVPR’10
Semantic
partition
Related Work: Realistic scenario
• How is each class predictor trained? [1]
20
TRAINING DATA (overlap scores): 0.8179, 0.6861, 0.9013, 0.7381, 0.7105, 0.6462
An SVR is used to learn the function that
predicts the overlap for each class
SVR: os = f([O2PF O2PG])
[O2PF_1 O2PG_1] → os_1
[O2PF_2 O2PG_2] → os_2
…
[O2PF_N O2PG_N] → os_N
GOAL: CHANGE SPATIAL CODIFICATION
[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
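The regression targets os_i are overlap scores between an object candidate and the ground-truth object. A minimal sketch follows, assuming the overlap is the intersection over union of the two pixel sets (a common choice; the slide does not spell out the formula). The function name and the toy masks are hypothetical.

```python
def overlap(segment, ground_truth):
    """Intersection over union of two pixel sets."""
    segment, ground_truth = set(segment), set(ground_truth)
    union = segment | ground_truth
    return len(segment & ground_truth) / len(union) if union else 0.0


# Toy masks as sets of (row, col) pixels (hypothetical values).
ground_truth = {(r, c) for r in range(4) for c in range(4)}   # rows 0-3, 16 pixels
candidate = {(r, c) for r in range(2, 6) for c in range(4)}   # rows 2-5, 16 pixels
score = overlap(candidate, ground_truth)
```

Each training candidate then contributes one pair (descriptor, score) to the per-class regressor.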
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Conclusions
21
Contributions
• Figure-Border-Ground spatial pooling in the realistic scenario
22
SVR: os = f([O2PF O2PB O2PG])
[O2PF_1 O2PB_1 O2PG_1] → os_1
[O2PF_2 O2PB_2 O2PG_2] → os_2
…
[O2PF_N O2PB_N O2PG_N] → os_N
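The Figure-Border-Ground descriptor is the concatenation of one O2P block per region pool. The sketch below assumes each local feature has already been assigned to one pool ("F", "B" or "G"); how the border band is delimited is not shown here, and the function names and toy values are hypothetical.

```python
def o2p(features):
    """Second-order average pooling: mean of the outer products x_i x_i^T."""
    n, d = len(features), len(features[0])
    return [
        [sum(x[r] * x[c] for x in features) / n for c in range(d)]
        for r in range(d)
    ]


def fbg_descriptor(features, pool_of):
    """Concatenate the flattened O2P of the figure, border and ground pools."""
    d = len(features[0])
    descriptor = []
    for pool in ("F", "B", "G"):
        members = [x for x, p in zip(features, pool_of) if p == pool]
        # An empty pool contributes a block of zeros (an assumption of this sketch).
        matrix = o2p(members) if members else [[0.0] * d for _ in range(d)]
        descriptor.extend(value for row in matrix for value in row)
    return descriptor


# Toy local features and their pool assignment (hypothetical values).
features = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 0.0)]
pool_of = ["F", "F", "B", "G"]
desc = fbg_descriptor(features, pool_of)
```

The resulting vector is three times as long as a single-pool O2P descriptor, which is the price paid for encoding where each feature lies relative to the object contour.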
Contributions
• Contour-based spatial pyramid [1]: crown-based
23
SVR: os = f([O2PF O2PSR1 O2PSR2 O2PSR3 O2PSR4])
[O2PF_1 O2PSR1_1 O2PSR2_1 O2PSR3_1 O2PSR4_1] → os_1
[O2PF_2 O2PSR1_2 O2PSR2_2 O2PSR3_2 O2PSR4_2] → os_2
…
[O2PF_N O2PSR1_N O2PSR2_N O2PSR3_N O2PSR4_N] → os_N
[1] S Lazebnik et al, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR’06
Contributions
• Contour-based spatial pyramid [1]: Cartesian-based
24
SVR: os = f([O2PF O2PSR1 O2PSR2 O2PSR3 O2PSR4])
[O2PF_1 O2PSR1_1 O2PSR2_1 O2PSR3_1 O2PSR4_1] → os_1
[O2PF_2 O2PSR1_2 O2PSR2_2 O2PSR3_2 O2PSR4_2] → os_2
…
[O2PF_N O2PSR1_N O2PSR2_N O2PSR3_N O2PSR4_N] → os_N
[1] S Lazebnik et al, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR’06
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Conclusions
25
Experiments
• Pascal VOC segmentation challenge 2011 & 2012 [1]
• Train, validation and test subsets
• Train: 1,112 (2011) / 1,464 (2012)
• Validation: 1,111 (2011) / 1,449 (2012)
• Test: 1,111 (2011) / 1,456 (2012)
• 20 semantic classes
• aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog,
horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor
• Evaluation measure: Average Accuracy Classification (AAC)
[1] M Everingham et al, The PASCAL Visual Object Classes (VOC) Challenge. IJCV’10
Experiments: Local Features Aggregation
27
• Pooling
First Order Average Pooling (O1P) [1]: $\frac{1}{N} \sum_{i=1}^{N} x_i$
Second Order Average Pooling (O2P) [2]: $\frac{1}{N} \sum_{i=1}^{N} x_i x_i^T$
$x_i$: local features
No need for a codebook / High dimensionality
[1] Y Boureau et al, A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
Experiments
• Ideal scenario
• Train set: train11
• Test set: val11
28
             F [1]   F-B     F-G [1]   F-B-G
eSIFT [1]    63.9    66.2    66.4      68.6
eMSIFT [1]   64.8    68.9    67.7      70.8
[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
Experiments
• Ideal scenario
• Train set: train11
• Test set: val11
29
                     F [1]   F-B     F-B-G
Non SP               64.8    68.9    70.8
Crown-based SP       68.7    71.1    71.7
Cartesian-based SP   67.7    71.6    72.7
[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
Experiments
• Ideal scenario
• Train set: train11
• Test set: val11
30
Figure              SP (Figure)   Border         Ground         AAC
eSIFT+eMSIFT+eLBP   -             -              eSIFT          72.98 [1]
eSIFT+eMSIFT        -             eSIFT+eMSIFT   eSIFT+eMSIFT   73.84
eSIFT+eMSIFT+eLBP   eMSIFT        eSIFT+eMSIFT   eSIFT+eMSIFT   75.86
[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
Experiments
• Realistic scenario (CPMC [1])
• Train set: train11
• Test set: val11
31
Figure              SP (Figure)   Border   Ground   AAC
eSIFT               -             -        eSIFT    28.6 [2]
eSIFT               -             eSIFT    eSIFT    34.8
eSIFT+eMSIFT+eLBP   -             -        eSIFT    37.2 [2]
eSIFT               eSIFT         eSIFT    eSIFT    37.4
eSIFT+eMSIFT+eLBP   eSIFT         eSIFT    eSIFT    39.6
[1] J Carreira et al, Constrained parametric min-cuts for automatic object segmentation. CVPR’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
Experiments
• Realistic scenario (CPMC [1])
• Train set: trainval11/12
• Test set: test11/12
32
        F-G [2]   F-B-G   SP(F)-B-G
VOC11   38.8      43.8    40.3
VOC12   39.9      42.2    40.8
[1] J Carreira et al, Constrained parametric min-cuts for automatic object segmentation. CVPR’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
Experiments
• Realistic scenario (MCG [1])
• Train set: train11
• Test set: val11
33
       F-G [2]   F-B-G   SP(F)-B-G
CPMC   37.2      38.9    39.6
MCG    30.9      34.1    36.1
[1] P Arbeláez et al, Multiscale combinatorial grouping. CVPR’14
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
Experiments: Qualitative evaluation
34
[Figure: qualitative comparison of F-G and F-B-G segmentations with their predicted class labels (aeroplane to cow)]
Experiments: Qualitative evaluation
35
[Figure: qualitative comparison of F-G and F-B-G segmentations with their predicted class labels (diningtable to tvmonitor)]
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Conclusions
36
Conclusions
• Figure-Border-Ground spatial pooling improves the original Figure-
Ground pooling in both ideal and realistic scenarios
• The Border region pool carries the richest contextual information
• The Cartesian-based spatial pyramid outperforms the crown-based
spatial pyramid, but both of them may result in overfitting
• Both Figure-Border-Ground pooling and Cartesian-based spatial
pyramid have been validated with MCG object candidates
• Published in ICIP’15
37
Part II
Multiresolution co-clustering for
uncalibrated multiview segmentation
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Conclusions
39
Introduction
40
STATE OF THE ART / OUR RESULTS
Introduction
• First goal: improving generic segmentation
• Motion-based region adjacency graph
• New resolution parameterization
• Relaxing hierarchical constraints with a two-step architecture
• Practical framework for a global optimization
• Second goal: improving semantic segmentation
• Semantic-based generic segmentation
• Automatic resolution selection technique
• Generic segmentation based semantic segmentation
Introduction
• Co-segmentation
• Video segmentation
• Co-clustering
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Conclusions
43
Related Work: Co-clustering framework [1,2]
• Objective: Find the clusters that define the coherent regions across
the different views at multiple resolutions
44
[1] D Glasner et al, Contour-based joint clustering of multiple segmentations. CVPR’11
[2] D Varas et al, Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations. ICCV’15
LEAVES
PARTITIONS
CO-CLUSTERED PARTITIONS
INPUT
IMAGES
HIERARCHIES
Related Work: Co-clustering framework [1,2]
• Objective: Find the clusters that define the coherent regions across
the different views
45
view 1 view 2 view 1 view 2
LEAVES PARTITIONS CO-CLUSTERED PARTITIONS
[1] D Glasner et al, Contour-based joint clustering of multiple segmentations. CVPR’11
[2] D Varas et al, Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations. ICCV’15
R2
Related Work: Co-clustering framework
• Representation with boundary variables
• Intra-image boundary variables: D1,2, D1,3, D2,3, D4,5, D5,6
• Inter-image boundary variables: D1,4, D1,5, D2,4, D2,5, D3,6
46
view 1 view 2 view 1 view 2
LEAVES PARTITIONS CO-CLUSTERED PARTITIONS
D1,2 = 0 D1,4 = 0
D1,3 = 1 D1,5 = 0
D2,3 = 1 D2,4 = 0
D4,5 = 0 D2,5 = 0
D5,6 = 1 D3,6 = 0
R2
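The boundary variables on this slide follow directly from a cluster assignment: D_{i,j} is 0 when regions i and j fall in the same cluster and 1 otherwise. The sketch below reproduces the listed values; the cluster labels "a" and "b" are hypothetical names, and the grouping {1, 2, 4, 5} / {3, 6} is inferred from the values shown.

```python
def boundary_variables(cluster_of, pairs):
    """D[i, j] = 0 if regions i and j share a cluster, 1 if a boundary separates them."""
    return {(i, j): int(cluster_of[i] != cluster_of[j]) for i, j in pairs}


# Regions 1-3 belong to view 1, regions 4-6 to view 2.
cluster_of = {1: "a", 2: "a", 4: "a", 5: "a", 3: "b", 6: "b"}
intra_pairs = [(1, 2), (1, 3), (2, 3), (4, 5), (5, 6)]
inter_pairs = [(1, 4), (1, 5), (2, 4), (2, 5), (3, 6)]
D = boundary_variables(cluster_of, intra_pairs + inter_pairs)
```

All inter-image variables are 0 here because every listed pair links regions of the same co-cluster across the two views.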
Related Work: Co-clustering framework
• How are the values of the boundary variables chosen?
47
view 1 view 2
LEAVES PARTITIONS
INTRA INTERACTIONS INTER INTERACTIONS
Q1,2, Q1,3, Q2,3, Q4,5, Q5,6 Q1,4, Q1,5, Q2,4, Q2,5, Q3,6
R2
Related Work: Co-clustering framework
• Hierarchical constraint
48
view 1 view 2
1 2
3
4 5
6
Co-clustered partitions cannot
violate the hierarchical structures
R2
Related Work: Co-clustering framework
• Hierarchical constraint
49
view 1 view 2
1 3
2
4 5
6
Co-clustered partitions cannot
violate the hierarchical structures
R2
Related Work: Co-clustering framework
• Multiresolution parameterization
50
view 1 view 2
LEAVES PARTITIONS
…
R2
Related Work: Co-clustering framework
• Iterative approach
51
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Conclusions
52
Contribution I: Motion-based adjacency
53
View #i View #i-1
Contribution I: Motion-based adjacency
• Similarity computation
• RAG definition
54
View #i View #i-1
Contribution II: Resolution parameterization
55
view 1 view 2
LEAVES PARTITIONS …
Original parameterization
Proposed parameterization
= ???
= 2
R2
Contribution III: Two-step iterative architecture
• Hierarchical constraints are not imposed in the second step
56
Contribution III: Two-step iterative architecture
57
First step Second step
Contribution III: Two-step iterative architecture
58
Contribution IV: Generic global co-clustering
59
• All co-clustered partitions
resulting from the iterative
architecture are fed into a
global optimization
• The reduction in the
number of regions makes
the global optimization
feasible
Contribution V: Semantic global co-clustering
60
• Semantic information is
introduced in the global
optimization
Contribution V: Semantic global co-clustering
61
GENERIC
CO-CLUSTERING
SEMANTIC
SEGMENTATIONS
SEMANTIC
CO-CLUSTERING
Contribution VI: Automatic resolution selection
62
view 1 view 2
LEAVES PARTITIONS …
MULTIRESOLUTION
CO-CLUSTERING
• We propose a method that
automatically selects the
resolution that best fits
the semantic information
SEMANTIC
PARTITIONS
SINGLE RESOLUTION
CO-CLUSTERING
R2
Contribution VII: Coherent semantic partitions
63
view 1 view 2
LEAVES PARTITIONS
SEMANTIC PARTITIONS
SINGLE RESOLUTION
CO-CLUSTERING
COHERENT
SEMANTIC PARTITIONS
R2
Contribution VII: Coherent semantic partitions
64
STATE OF
THE ART [1]
OUR
RESULTS
[1] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Conclusions
65
Experiments: Dataset
• Multiview dataset [1]
[1] A. Kowdle et al, Multiple view object cosegmentation using appearance and stereo cues. ECCV’12
Experiments: Generic co-clustering
67
Co-segmentation techniques
Video segmentation techniques
Co-clustering techniques
• I-1S: Motion-compensated one-step
iterative (baseline)
• I-2S: Two-step iterative
• UCM+I-1S: First step is replaced by a cut
from a hierarchical segmentation algorithm
• I-2S+GG: Two-step iterative followed by
generic global optimization
Experiments: Generic co-clustering
68
             CO-CLUSTERING                 CO-SEGMENTATION     VIDEO SEGMENTATION            BASELINES
             I-2S      UCM+I-1S  I-2S+GG   [KX12]    [JBP12]   [XXC12]   [GKHE10]  [GCS13]   UCM+Pr    I-1S
BMW          0.72      0.68      0.70      0.42      0.56      0.70      0.65      0.63      0.62      0.67
Chair        0.79      0.77      0.76      0.53      0.78      0.80      0.76      0.47      0.59      0.78
Couch        0.93      0.95      0.94      0.78      0.90      0.85      0.88      0.73      0.89      0.90
GardenChair  0.84      0.63      0.87      0.31      0.52      0.70      0.68      0.63      0.84      0.80
Motorbike    0.76      0.77      0.77      0.39      0.39      0.71      0.73      0.46      0.54      0.70
Teddy        0.92      0.92      0.92      0.69      0.87      0.88      0.84      0.85      0.82      0.90
Average      0.83      0.79      0.83      0.52      0.67      0.77      0.76      0.63      0.72      0.79
• Two-step iterative co-clustering techniques (I-2S and I-2S+GG)
outperform other state-of-the-art techniques
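The Average row can be recomputed from the per-sequence scores as a quick consistency check. The sketch below copies the values from the table above and averages each column over the six sequences; the variable names are hypothetical.

```python
methods = ["I-2S", "UCM+I-1S", "I-2S+GG", "[KX12]", "[JBP12]",
           "[XXC12]", "[GKHE10]", "[GCS13]", "UCM+Pr", "I-1S"]

# One row per sequence, one column per method, in the order of `methods`.
scores = {
    "BMW":         [0.72, 0.68, 0.70, 0.42, 0.56, 0.70, 0.65, 0.63, 0.62, 0.67],
    "Chair":       [0.79, 0.77, 0.76, 0.53, 0.78, 0.80, 0.76, 0.47, 0.59, 0.78],
    "Couch":       [0.93, 0.95, 0.94, 0.78, 0.90, 0.85, 0.88, 0.73, 0.89, 0.90],
    "GardenChair": [0.84, 0.63, 0.87, 0.31, 0.52, 0.70, 0.68, 0.63, 0.84, 0.80],
    "Motorbike":   [0.76, 0.77, 0.77, 0.39, 0.39, 0.71, 0.73, 0.46, 0.54, 0.70],
    "Teddy":       [0.92, 0.92, 0.92, 0.69, 0.87, 0.88, 0.84, 0.85, 0.82, 0.90],
}

# Column-wise mean, rounded to two decimals as in the table.
averages = {
    method: round(sum(row[k] for row in scores.values()) / len(scores), 2)
    for k, method in enumerate(methods)
}
best = max(averages.values())
top_methods = [m for m in methods if averages[m] == best]
```

The two methods with the highest average are the two-step variants, which matches the claim on the slide.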
Experiments: Semantic co-clustering
69
Co-clustering techniques
• I-2S+GG(MR): Multiresolution global
generic co-clustering
• I-2S+SG(MR): Multiresolution global
semantic co-clustering
• I-2S+GG(SR): Single resolution global
generic co-clustering
• I-2S+SG(SR): Single resolution global
semantic co-clustering
Semantic segmentation techniques
• SCSS: Semantic co-clustering based
semantic segmentation
• GCSS: Generic co-clustering based
semantic segmentation
• [ZJRP+15]: state-of-the-art
[ZJRP+15] S Zheng et al, Conditional Random Fields as
Recurrent Neural Networks. ICCV’15
Experiments: Qualitative assessment
70
Experiments: Qualitative assessment
71
Experiments: Qualitative assessment
72
leaves
partition
I-2S I-2S+GG I-2S+SG SCSS [ZJRP+15]
[ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15
Experiments: Qualitative assessment
73
leaves
partition
I-2S I-2S+GG I-2S+SG SCSS
[ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15
[ZJRP+15]
Experiments: Qualitative assessment
74
Occlusion/Object Boundary Detection Dataset [GVB11]
Ballet and Breakdancers datasets [ZKU+04]
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Conclusions
75
Conclusions
• The use of motion cues significantly improved the performance
• The new resolution parameterization allowed us to have a more uniform
distribution of resolutions
• The two-step architecture improved the performance of the original one-
step architecture
• Although global optimization is now feasible, there is no clear gain for
generic co-clustering. However, it is useful for semantic co-clustering.
• Applying the resolution selection technique causes only a small decrease
in performance
• Submitted to ECCV’16 (awaiting decision)
76
Future Work
• Extending experiments to video datasets
• VSB100 (Video Segmentation Benchmark) [1]
• Cityscapes [2]
• Extending experiments to calibrated scenarios
• Training end-to-end CNNs for multiview semantic segmentation
77
[1] F Galasso et al, A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis. ICCV’13
[2] M Cordts et al, The cityscapes dataset for semantic urban scene understanding. CVPR’16
Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Introduction
• Related Work
• Contributions
• Experiments
• Conclusions
• Conclusions
78
Conclusions
• Results achieved in the first part by considering new spatial
configurations are now obsolete after the outstanding results
achieved by deep learning techniques.
• Results from deep learning techniques were used in the second part.
• The proposed multiresolution co-clustering has improved state-of-
the-art results, but we should consider an end-to-end deep learning
approach to achieve a more significant improvement.
• Semantic segmentation techniques evolve really fast, making this field
very competitive and challenging.
79
Publications
• Related to the thesis
• C. Ventura, D. Varas, X. Giro-i-Nieto, V. Vilaplana, F. Marques. Semantically driven
multiresolution co-clustering for uncalibrated multiview segmentation. Submitted to
the European Conference on Computer Vision (ECCV) 2016. Under review.
• C. Ventura, X. Giro-i-Nieto, V. Vilaplana, K. McGuinness, F. Marques, Noel E O'Connor.
Improving spatial codification in semantic segmentation. International Conference on
Image Processing (ICIP) 2015.
• C. Ventura. Visual object analysis using regions and interest points. ACM
international conference on Multimedia 2013.
80
Publications
• Other publications:
• K. McGuinness, E. Mohedano, Z. Zhang, F. Hu, R. Albatal, Cathal Gurrin, N.E O'Connor, A. F.
Smeaton, A. Salvador, X. Giro-i-Nieto, C. Ventura. Insight Centre for Data Analytics (DCU) at
TRECVid 2014: instance search and semantic indexing tasks. TRECVID Workshop 2014.
• C. Ventura, V. Vilaplana, X. Giro-i-Nieto, F. Marques. Improving retrieval accuracy of Hierarchical
Cellular Trees for generic metric spaces. Multimedia Tools and Applications, 2014.
• C. Ventura, X. Giro-i-Nieto, V. Vilaplana, D. Giribet, E. Carasusan. Automatic keyframe selection
based on mutual reinforcement algorithm. International Workshop on Content-Based
Multimedia Indexing (CBMI) 2013.
• C. Ventura, M. Tella-Amo, X. Giro-i-Nieto. UPC at MediaEval 2013 Hyperlinking Task. MediaEval
2013.
• C. Ventura, M. Martos, X. Giro-i-Nieto, V. Vilaplana, F. Marques. Hierarchical navigation and
visual search for video keyframe retrieval. International Conference on Multimedia Modeling
2012.
81
82
Introduction: Context
Source: A. Oliva and A. Torralba, The role of context in object recognition
Introduction: Context
Source: A. Oliva and A. Torralba, The role of context in object recognition
Introduction: Context
Source: T. Malisiewicz and A. A. Efros, Improving spatial support for objects via multiple segmentations.
Related Work: Realistic scenario
Source: J. Carreira et al., Semantic segmentation with second-order pooling
Input image
Object segment
hypotheses
Ranked object
segment hypotheses
(class independent)
object
plausibility
score
Related Work: Realistic scenario
Source: J. Carreira et al., Semantic segmentation with second-order pooling
Predict overlap estimate of each segment to each
object class and sort segments by maximal score
Aggregate high-rank segments
Related Work: Realistic scenario
88
0.8179
0.6861
0.9013
0.7381
0.7105
0.6462
TRAINING
DATA
TEST
DATA
?0.4905
[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
Related Work: Co-clustering framework
• What are the contour elements?
89
view 1 view 2
LEAVES PARTITIONS
Which contour elements are considered to compute Q1,4?
• Contour elements of R1
• Contour elements of R4
Related Work: Co-clustering framework
90
INTRA INTERACTIONS INTER INTERACTIONS
Related Work: Co-clustering framework
91
Related Work: Co-clustering framework
92
LINEAR PROGRAMMING RELAXATION
Related Work: Co-clustering framework
93
1
2
3
4
5
Intra: Q1,2 = -0.81
Q3,4 = -0.81, Q3,5 = -0.81, Q4,5 = -0.49
Inter: Q1,3 = 2.81e+03
Q1,4 = -1.36e+03
Q1,5 = -1.45e+03
Q2,3 = -2.81e+03
Q2,4 = 1.36e+03
Q2,5 = 1.45e+03
x 0
x 0
x 1
Q4,5 = -0.49 ⇒ D4,5 = 1 ??
D4,5 ≤ D4,2 + D2,5
D4,2 = 0, D2,5 = 0 ⇒ D4,5 = 0
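The inequality D4,5 ≤ D4,2 + D2,5 is a triangle constraint: if region 2 is merged with both 4 and 5, then 4 and 5 must be merged too. A minimal check for violated triangles is sketched below, assuming binary boundary variables stored with the smaller region index first; the function name is hypothetical.

```python
from itertools import permutations


def violated_triangles(D, regions):
    """List the triples (i, j, k) for which D[i,j] > D[i,k] + D[k,j]."""
    def d(i, j):
        return D[(min(i, j), max(i, j))]

    return [
        (i, j, k)
        for i, j, k in permutations(regions, 3)
        if i < j and d(i, j) > d(i, k) + d(k, j)
    ]


# Assignment that keeps a boundary between 4 and 5 although both merge with 2.
inconsistent = {(2, 4): 0, (2, 5): 0, (4, 5): 1}
# Same assignment after forcing D4,5 = 0.
consistent = {(2, 4): 0, (2, 5): 0, (4, 5): 0}

before = violated_triangles(inconsistent, [2, 4, 5])
after = violated_triangles(consistent, [2, 4, 5])
```

In the optimization these inequalities act as constraints, so an assignment such as the first one is never returned as a solution.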
Related Work: Co-clustering framework
94
LEAVES PARTITIONS CO-CLUSTERED PARTITIONS
Related Work: Co-clustering framework
• Hierarchical constraint
95
PARENT NODE 11
Inter-sibling boundaries:
Intra-sibling boundaries:
Related Work: Co-clustering framework
• Multiresolution parameterization
96
: Number of active contours
to encode leave contours
: Maximum fraction to describe
the r-th coarse level
: Maximum difference between
consecutive levels
= 9 = 0.5 = 0.1
4.53.6
Related Work: Co-clustering framework
• Iterative approach
97
Contribution II: Resolution parameterization
98
Selected inter-sibling boundaries:
Contributions
• Semantic global co-clustering
99
1. Class assignment to regions
2. Similarity penalizations
• Regions from same partition
with different classes
3. Optimization constraints
• Regions from same partition
with same class
• Regions from different partitions
with different class
Contribution VI: Automatic resolution selection
• Some applications require a single resolution
100
[Figure: choice between resolution levels l1 and l2 given the semantic classes C1, C2 and C3]
l1 or l2? → l1
Experiments: Semantic co-clustering
101
Conclusions
• Multiresolution co-clustering framework for uncalibrated multiview
sequences
• Two-step architecture
• Global optimization
• Semantic-based co-clustering with resolution selection
• Submitted to ECCV’16 (awaiting decision)
102
Conclusions
• Part I: Improving spatial codification in semantic segmentation
• Figure-Border-Ground in realistic scenario
• Contour-based spatial pyramid
• Part II: Multiresolution co-clustering for uncalibrated multiview
segmentation
• Results from Part I are replaced by SoA deep learning techniques
• Generic co-clustering for multiview sequences
• Semantic co-clustering for multiview sequences
103

Contenu connexe

Tendances

Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingwolf
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Advanced deep learning based object detection methods
Advanced deep learning based object detection methodsAdvanced deep learning based object detection methods
Advanced deep learning based object detection methodsBrodmann17
 
Deep image retrieval learning global representations for image search
Deep image retrieval  learning global representations for image searchDeep image retrieval  learning global representations for image search
Deep image retrieval learning global representations for image searchUniversitat Politècnica de Catalunya
 
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Focal loss for dense object detection
Focal loss for dense object detectionFocal loss for dense object detection
Focal loss for dense object detectionDaeHeeKim31
 
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksUsman Qayyum
 

Tendances (20)

Convolutional Features for Instance Search
Convolutional Features for Instance SearchConvolutional Features for Instance Search
Convolutional Features for Instance Search
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble tracking
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
 
Advanced deep learning based object detection methods
Advanced deep learning based object detection methodsAdvanced deep learning based object detection methods
Advanced deep learning based object detection methods
 
Hierarchical Object Detection with Deep Reinforcement Learning
Hierarchical Object Detection with Deep Reinforcement LearningHierarchical Object Detection with Deep Reinforcement Learning
Hierarchical Object Detection with Deep Reinforcement Learning
 
Deep image retrieval learning global representations for image search
Deep image retrieval  learning global representations for image searchDeep image retrieval  learning global representations for image search
Deep image retrieval learning global representations for image search
 
Deep Learning for Computer Vision: Segmentation (UPC 2016)
Deep Learning for Computer Vision: Segmentation (UPC 2016)Deep Learning for Computer Vision: Segmentation (UPC 2016)
Deep Learning for Computer Vision: Segmentation (UPC 2016)
 
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
 
Recurrent Instance Segmentation (UPC Reading Group)
Recurrent Instance Segmentation (UPC Reading Group)Recurrent Instance Segmentation (UPC Reading Group)
Recurrent Instance Segmentation (UPC Reading Group)
 
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
 
Object Detection - Míriam Bellver - UPC Barcelona 2018
Object Detection - Míriam Bellver - UPC Barcelona 2018Object Detection - Míriam Bellver - UPC Barcelona 2018
Object Detection - Míriam Bellver - UPC Barcelona 2018
 
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
 
Focal loss for dense object detection
Focal loss for dense object detectionFocal loss for dense object detection
Focal loss for dense object detection
 
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
 
Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018
Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018
Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018
 
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
 
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
 
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural Networks
 

Visual Object Analysis using Regions and Local Features

  • 1. Visual Object Analysis using Regions and Local Features. Carles Ventura Royo. Co-advisors: Xavier Giró i Nieto, Verónica Vilaplana Besler. Tutor: Ferran Marqués Acosta.
  • 2. Outline: Introduction; Part I: Context Analysis in semantic segmentation; Part II: Multiresolution co-clustering for uncalibrated multiview segmentation; Conclusions.
  • 3. Outline: Introduction; Part I: Context Analysis in semantic segmentation (Introduction, Related Work, Contributions, Experiments, Conclusions); Part II: Multiresolution co-clustering for uncalibrated multiview segmentation; Conclusions.
  • 4. Outline: Introduction; Part I: Context Analysis in semantic segmentation; Part II: Multiresolution co-clustering for uncalibrated multiview segmentation (Introduction, Related Work, Contributions, Experiments, Conclusions); Conclusions.
  • 6. Introduction: Semantic segmentation. Part I: Single view. Part II: Multiview. [Figure panels: state of the art, our results]
  • 7. Introduction: Visual Object Analysis. Objects vs. scene.
  • 9. Introduction: Regions. [Figure: Binary Partition Tree]
  • 11. Introduction: Local Features. [Figure panels: local features, global features]
  • 12. Introduction: Local Features Aggregation. Bag of Features (BoF) [1]. [Figure labels: vector quantization, codebook, Bag of Features] [1] G. Csurka et al., Visual Categorization with Bags of Keypoints. ECCV’04
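
The Bag of Features idea on this slide can be sketched as follows: each local descriptor is assigned to its nearest codeword (vector quantization) and the assignments are counted into a normalized histogram. This is only a minimal illustration. The function names, the hand-picked two-word codebook and the toy descriptors are assumptions made for the example; in practice the codebook is typically learned by clustering (for instance k-means).

```python
def nearest_codeword(x, codebook):
    """Index of the codeword closest to x in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(x, codebook[k]))


def bag_of_features(features, codebook):
    """Normalized histogram of codeword assignments (vector quantization)."""
    hist = [0.0] * len(codebook)
    for x in features:
        hist[nearest_codeword(x, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical codebook with two codewords and four toy 2-D descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
local_features = [[1.0, 0.5], [9.0, 11.0], [0.2, 0.1], [0.0, 1.0]]
bof = bag_of_features(local_features, codebook)
```

The resulting vector has one entry per codeword, so its length is fixed by the codebook size rather than by the number of local features in the image or region.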
  • 13. Introduction: Local Features Aggregation. Pooling. First Order Average Pooling (O1P) [1]: $\frac{1}{N}\sum_{i=1}^{N} x_i$. Second Order Average Pooling (O2P) [2]: $\frac{1}{N}\sum_{i=1}^{N} x_i x_i^T$, where $x_i$ are the local features. No codebook needed; high dimensionality. [1] Y. Boureau et al., A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML’10 [2] J. Carreira et al., Semantic segmentation with second-order pooling. ECCV’12
  • 14. Part I: Context analysis in semantic segmentation
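
As a minimal sketch of the two pooling operators on this slide, the functions below compute the first-order average (the mean of the local features) and the second-order average (the mean of their outer products) in plain Python. The function names and the toy descriptors are illustrative assumptions, and any further normalization applied in [2] is left out.

```python
def o1p(features):
    """First-order average pooling: (1/N) * sum_i x_i."""
    n = len(features)
    dim = len(features[0])
    return [sum(x[d] for x in features) / n for d in range(dim)]


def o2p(features):
    """Second-order average pooling: (1/N) * sum_i x_i x_i^T (a D x D matrix)."""
    n = len(features)
    dim = len(features[0])
    return [[sum(x[r] * x[c] for x in features) / n for c in range(dim)]
            for r in range(dim)]


# Toy example: three 2-D local descriptors (made-up values).
local_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mean_vec = o1p(local_features)       # vector of length D
second_moment = o2p(local_features)  # symmetric D x D matrix
```

Neither function needs a codebook, but the O2P output grows as D x D while O1P stays at D, which is the high-dimensionality cost noted on the slide.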
  • 16. Introduction: Context. Semantic context [1,2]; spatial context. GOAL: Analyze the influence of spatial context in object recognition. [1] M. Bar, Visual Objects in Context. Nature Reviews Neuroscience 2004 [2] A. Rabinovich et al., Objects in Context. ICCV’07
  • 18. Related Work: Ideal scenario. Ground-truth object location. Conclusion: Aggregating the local features over three region pools (interior, border and surround) increases the performance [1]. [1] J.R.R. Uijlings et al., The Visual Extent of an Object. IJCV’12
  • 19. Related Work: Realistic scenario. Pipeline [1]: Input image → Generate object candidates → Rank object candidates → Predict class scores → Aggregate high-rank candidates → Semantic partition. [1] J. Carreira et al., Object Recognition as Ranking Holistic Figure-Ground Hypotheses. CVPR’10
  • 20. Related Work: Realistic scenario • How is each class predictor trained? [1] • TRAINING DATA: candidate segments with their overlap scores (0.8179, 0.6861, 0.9013, 0.7381, 0.7105, 0.6462) • An SVR is used to learn the function that predicts the overlap for each class: os = f([O2PF O2PG]) • Training pairs: ([O2PF_1 O2PG_1], os_1), ([O2PF_2 O2PG_2], os_2), …, ([O2PF_N O2PG_N], os_N) • GOAL: change the spatial codification [1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
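    The regression target on this slide is the overlap between a candidate segment and a ground-truth object. The slide does not give the formula; the sketch below assumes the usual intersection-over-union definition on binary masks, and the function name and toy masks are illustrative only.

    ```python
    def overlap(mask_a, mask_b):
        """Intersection over union of two same-sized binary masks (lists of rows)."""
        inter = union = 0
        for row_a, row_b in zip(mask_a, mask_b):
            for a, b in zip(row_a, row_b):
                inter += 1 if (a and b) else 0
                union += 1 if (a or b) else 0
        return inter / union if union else 0.0


    candidate = [[1, 1, 0],
                 [1, 1, 0]]
    ground_truth = [[0, 1, 1],
                    [0, 1, 1]]
    score = overlap(candidate, ground_truth)  # 2 shared pixels / 6 covered pixels
    ```

    Scores of this kind, one per training candidate, would play the role of os_1 … os_N when fitting the regressor.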
  • 21. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Conclusions 21
  • 22. Contributions • Figure-Border-Ground spatial pooling in the realistic scenario 22 os_1 os_2 os_N SVR os = f([O2PF O2PB O2PG]) [O2PF_1 O2PB_1 O2PG_1] [O2PF_2 O2PB_2 O2PG_2] [O2PF_N O2PB_N O2PG_N] …
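    The descriptor [O2PF O2PB O2PG] on this slide is a concatenation of one pooled vector per spatial pool. The sketch below shows only that concatenation step. It assumes each local feature already carries a pool label ('F', 'B' or 'G'); how the border pool is delimited and any normalisation of the pooled matrices are not given on the slide and are left out. All names are hypothetical.

    ```python
    def o2p(features, dim):
        """Second-order average pooling, flattened row by row; zeros for an empty pool."""
        if not features:
            return [0.0] * (dim * dim)
        n = len(features)
        return [sum(x[r] * x[c] for x in features) / n
                for r in range(dim) for c in range(dim)]


    def fbg_descriptor(features, pools, dim):
        """Concatenate [O2PF O2PB O2PG] for one object candidate.

        pools[i] is 'F', 'B' or 'G': the (assumed precomputed) pool of feature i.
        """
        descriptor = []
        for pool in ("F", "B", "G"):
            members = [x for x, p in zip(features, pools) if p == pool]
            descriptor.extend(o2p(members, dim))
        return descriptor


    feats = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
    labels = ["F", "B", "G"]
    desc = fbg_descriptor(feats, labels, dim=2)  # 3 pools x 4 values = 12 entries
    ```

    Adding the border pool lengthens the descriptor by one O2P block relative to the Figure-Ground baseline, so the regressor sees the border statistics as separate inputs.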
  • 23. Contributions • Contour-based spatial pyramid [1]: crown-based 23 os_1 os_2 os_N SVR os = f([O2PF O2PSR1 O2PSR2 O2PSR3 O2PSR4]) [O2PF_1 O2PSR1_1 O2PSR2_1 O2PSR3_1 O2PSR4_1] [O2PF_2 O2PSR1_2 O2PSR2_2 O2PSR3_2 O2PSR4_2] [O2PF_N O2PSR1_N O2PSR2_N O2PSR3_N O2PSR4_N] [1] S Lazebnik et al, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR’06 …
  • 24. Contributions • Contour-based spatial pyramid [1]: Cartesian-based 24 os_1 os_2 os_N SVR os = f([O2PF O2PSR1 O2PSR2 O2PSR3 O2PSR4]) [O2PF_1 O2PSR1_1 O2PSR2_1 O2PSR3_1 O2PSR4_1] [O2PF_2 O2PSR1_2 O2PSR2_2 O2PSR3_2 O2PSR4_2] [O2PF_N O2PSR1_N O2PSR2_N O2PSR3_N O2PSR4_N] [1] S Lazebnik et al, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR’06 …
  • 25. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Conclusions 25
  • 26. Experiments • Pascal VOC segmentation challenge 2011 & 2012 [1] • Train, validation and test subsets • Train: 1,112 (2011) / 1,464 (2012) • Validation: 1,111 (2011) / 1,449 (2012) • Test: 1,111 (2011) / 1,456 (2012) • 20 semantic classes • aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor • Evaluation measure: Average Accuracy Classification (AAC) [1] M Everingham et al, The PASCAL Visual Object Classes (VOC) Challenge. IJCV’10
  • 27. Experiments: Local Features Aggregation • Pooling • First Order Average Pooling (O1P) [1]: $\frac{1}{N}\sum_{i=1}^{N} x_i$ • Second Order Average Pooling (O2P) [2]: $\frac{1}{N}\sum_{i=1}^{N} x_i x_i^T$ • $x_i$: local features • No need for a codebook • High dimensionality [1] Y Boureau et al, A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML’10 [2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  • 28. Experiments • Ideal scenario • Train set: train11 • Test set: val11
                   F [1]   F-B    F-G [1]   F-B-G
      eSIFT [1]    63.9    66.2   66.4      68.6
      eMSIFT [1]   64.8    68.9   67.7      70.8
    [1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  • 29. Experiments • Ideal scenario • Train set: train11 • Test set: val11
                           F [1]   F-B    F-B-G
      Non SP               64.8    68.9   70.8
      Crown-based SP       68.7    71.1   71.7
      Cartesian-based SP   67.7    71.6   72.7
    [1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  • 30. Experiments • Ideal scenario • Train set: train11 • Test set: val11 (- marks a pool that is not used)
      Figure              SP (Figure)   Border         Ground         AAC
      eSIFT+eMSIFT+eLBP   -             -              eSIFT          72.98 [1]
      eSIFT+eMSIFT        -             eSIFT+eMSIFT   eSIFT+eMSIFT   73.84
      eSIFT+eMSIFT+eLBP   eMSIFT        eSIFT+eMSIFT   eSIFT+eMSIFT   75.86
    [1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  • 31. Experiments • Realistic scenario (CPMC [1]) • Train set: train11 • Test set: val11 (- marks a pool that is not used)
      Figure              SP (Figure)   Border   Ground   AAC
      eSIFT               -             -        eSIFT    28.6 [2]
      eSIFT               -             eSIFT    eSIFT    34.8
      eSIFT+eMSIFT+eLBP   -             -        eSIFT    37.2 [2]
      eSIFT               eSIFT         eSIFT    eSIFT    37.4
      eSIFT+eMSIFT+eLBP   eSIFT         eSIFT    eSIFT    39.6
    [1] J Carreira et al, Constrained parametric min-cuts for automatic object segmentation. CVPR’10
    [2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  • 32. Experiments • Realistic scenario (CPMC [1]) • Train set: trainval11/12 • Test set: test11/12
              F-G [2]   F-B-G   SP(F)-B-G
      VOC11   38.8      43.8    40.3
      VOC12   39.9      42.2    40.8
    [1] J Carreira et al, Constrained parametric min-cuts for automatic object segmentation. CVPR’10
    [2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  • 33. Experiments • Realistic scenario (MCG [1]) • Train set: train11 • Test set: val11
             F-G [2]   F-B-G   SP(F)-B-G
      CPMC   37.2      38.9    39.6
      MCG    30.9      34.1    36.1
    [1] P Arbeláez et al, Multiscale combinatorial grouping. CVPR’14
    [2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  • 34. Experiments: Qualitative evaluation 34 F-G F-B-G F-G F-B-G aeroplane bicycle bicycle cat bird motorbike boat bottle bus bus motorbike car chair cat chair chair horse bird cow
  • 35. Experiments: Qualitative evaluation 35 F-G F-B-G F-G F-B-G chair diningtable cow dog person horse person motorbike motorbike motorbike person pottedplant bottle sheep sofa cat bus train train tvmonitor
  • 36. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Conclusions 36
  • 37. Conclusions • Figure-Border-Ground spatial pooling improves the original Figure-Ground pooling in both ideal and realistic scenarios • The Border region pool carries the richest contextual information • The Cartesian-based spatial pyramid outperforms the crown-based spatial pyramid, but both of them may result in overfitting • Both Figure-Border-Ground pooling and Cartesian-based spatial pyramid have been validated with MCG object candidates • Published in ICIP’15
  • 38. Part II Multiresolution co-clustering for uncalibrated multiview segmentation
  • 39. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Conclusions 39
  • 41. Introduction • First goal: improving generic segmentation 41 • Motion-based region adjacency graph • New resolution parameterization • Relaxing hierarchical constraints with a two-step architecture • Practical framework for a global optimization • Second goal: improving semantic segmentation • Semantic-based generic segmentation • Automatic resolution selection technique • Generic segmentation based semantic segmentation
  • 42. Introduction • Co-segmentation 42 • Video segmentation • Co-clustering
  • 43. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Conclusions 43
  • 44. Related Work: Co-clustering framework [1,2] • Objective: Find the clusters that define the coherent regions across the different views at multiple resolutions 44 [2] D Varas et al, Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations. ICCV’15 [1] D Glasner et al, Contour-based joint clustering of multiple segmentations. CVPR’11 LEAVES PARTITIONS CO-CLUSTERED PARTITIONS INPUT IMAGES HIERARCHIES
  • 45. Related Work: Co-clustering framework [1,2] • Objective: Find the clusters that define the coherent regions across the different views 45 view 1 view 2 view 1 view 2 LEAVES PARTITIONS CO-CLUSTERED PARTITIONS [2] D Varas et al, Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations. ICCV’15 [1] D Glasner et al, Contour-based joint clustering of multiple segmentations. CVPR’11 R2
  • 46. Related Work: Co-clustering framework • Representation with boundary variables • Intra-image boundary variables: D1,2, D1,3, D2,3, D4,5, D5,6 • Inter-image boundary variables: D1,4, D1,5, D2,4, D2,5, D3,6 46 view 1 view 2 view 1 view 2 LEAVES PARTITIONS CO-CLUSTERED PARTITIONS D1,2 = 0 D1,4 = 0 D1,3 = 1 D1,5 = 0 D2,3 = 1 D2,4 = 0 D4,5 = 0 D2,5 = 0 D5,6 = 1 D3,6 = 0 R2
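    The boundary-variable values listed on this slide follow directly from a cluster assignment: D is 0 for two regions in the same cluster and 1 otherwise. A minimal sketch, assuming regions 1 to 3 belong to view 1 and regions 4 to 6 to view 2 (as the intra/inter split suggests); the cluster names 'a' and 'b' and the function name are arbitrary.

    ```python
    def boundary_variables(cluster_of, pairs):
        """D[i, j] = 0 when regions i and j share a cluster, 1 otherwise."""
        return {(i, j): int(cluster_of[i] != cluster_of[j]) for i, j in pairs}


    # Co-clustering implied by the slide: {1, 2, 4, 5} and {3, 6}.
    cluster_of = {1: "a", 2: "a", 3: "b", 4: "a", 5: "a", 6: "b"}
    intra = [(1, 2), (1, 3), (2, 3), (4, 5), (5, 6)]
    inter = [(1, 4), (1, 5), (2, 4), (2, 5), (3, 6)]
    D = boundary_variables(cluster_of, intra + inter)
    ```

    With this assignment the dictionary reproduces the ten values shown on the slide, for example D[(1, 3)] is 1 and D[(3, 6)] is 0.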
  • 47. Related Work: Co-clustering framework • How are the values of the boundary variables chosen? 47 view 1 view 2 LEAVES PARTITIONS INTRA INTERACTIONS INTER INTERACTIONS Q1,2, Q1,3, Q2,3, Q4,5, Q5,6 Q1,4, Q1,5, Q2,4, Q2,5, Q3,6 R2
  • 48. Related Work: Co-clustering framework • Hierarchical constraint 48 view 1 view 2 1 2 3 4 5 6 Co-clustered partitions cannot violate the hierarchical structures R2
  • 49. Related Work: Co-clustering framework • Hierarchical constraint 49 view 1 view 2 1 3 2 4 5 6 Co-clustered partitions cannot violate the hierarchical structures R2
  • 50. Related Work: Co-clustering framework • Multiresolution parameterization 50 view 1 view 2 LEAVES PARTITIONS … R2
  • 51. Related Work: Co-clustering framework • Iterative approach 51
  • 52. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Conclusions 52
  • 53. Contribution I: Motion-based adjacency 53 View #i View #i-1
  • 54. Contribution I: Motion-based adjacency • Similarity computation • RAG definition 54 View #i View #i-1
  • 55. Contribution II: Resolution parameterization 55 view 1 view 2 LEAVES PARTITIONS … Original parameterization Proposed parameterization = ??? = 2 R2
  • 56. Contribution III: Two-step iterative architecture • Hierarchical constraints are not imposed in a second step 56
  • 57. Contribution III: Two-step iterative architecture 57 First step Second step
  • 58. Contribution III: Two-step iterative architecture 58
  • 59. Contribution IV: Generic global co-clustering 59 • All co-clustered partitions resulting from the iterative architecture are fed into a global optimization • The reduction on the number of regions makes the global optimization feasible
  • 60. Contribution V: Semantic global co-clustering 60 • Semantic information is introduced in the global optimization
  • 61. Contribution V: Semantic global co-clustering 61 GENERIC CO-CLUSTERING SEMANTIC SEGMENTATIONS SEMANTIC CO-CLUSTERING
  • 62. Contribution VI: Automatic resolution selection 62 view 1 view 2 LEAVES PARTITIONS … MULTIRESOLUTION CO-CLUSTERING • We propose a method that automatically selects the resolution that best fits with the semantic information SEMANTIC PARTITIONS SINGLE RESOLUTION CO-CLUSTERING R2
  • 63. Contribution VII: Coherent semantic partitions 63 view 1 view 2 LEAVES PARTITIONS SEMANTIC PARTITIONS SINGLE RESOLUTION CO-CLUSTERING COHERENT SEMANTIC PARTITIONS R2
  • 64. Contribution VII: Coherent semantic partitions 64 STATE OF THE ART [1] OUR RESULTS [1] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15
  • 65. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Conclusions 65
  • 66. Experiments: Dataset • Multiview dataset [1] [1] A. Kowdle et al, Multiple view object cosegmentation using appearance and stereo cues. ECCV’12
  • 67. Experiments: Generic co-clustering 67 Co-segmentation techniques Video segmentation techniques Co-clustering techniques • I-1S: Motion-compensated one-step iterative (baseline) • I-2S: Two-step iterative • UCM+I-1S: First step is replaced by a cut from a hierarchical segmentation algorithm • I-2S+GG: Two-step iterative followed by generic global optimization
  • 68. Experiments: Generic co-clustering
                    I-2S   UCM+I-1S   I-2S+GG   [KX12]   [JBP12]   [XXC12]   [GKHE10]   [GCS13]   UCM+Pr   I-1S
      BMW           0.72   0.68       0.70      0.42     0.56      0.70      0.65       0.63      0.62     0.67
      Chair         0.79   0.77       0.76      0.53     0.78      0.80      0.76       0.47      0.59     0.78
      Couch         0.93   0.95       0.94      0.78     0.90      0.85      0.88       0.73      0.89     0.90
      GardenChair   0.84   0.63       0.87      0.31     0.52      0.70      0.68       0.63      0.84     0.80
      Motorbike     0.76   0.77       0.77      0.39     0.39      0.71      0.73       0.46      0.54     0.70
      Teddy         0.92   0.92       0.92      0.69     0.87      0.88      0.84       0.85      0.82     0.90
      Average       0.83   0.79       0.83      0.52     0.67      0.77      0.76       0.63      0.72     0.79
    Column groups on the slide, left to right: CO-CLUSTERING, CO-SEGMENTATION, VIDEO SEGMENTATION, BASELINES
    • Two-step iterative co-clustering techniques (I-2S and I-2S+GG) outperform other state-of-the-art techniques
  • 69. Experiments: Semantic co-clustering 69 Co-clustering techniques • I-2S+GG(MR): Multiresolution global generic co-clustering • I-2S+SG(MR): Multiresolution global semantic co-clustering • I-2S+GG(SR): Single resolution global generic co-clustering • I-2S+SG(SR): Single resolution global semantic co-clustering Semantic segmentation techniques • SCSS: Semantic co-clustering based semantic segmentation • GCSS: Generic co-clustering based semantic segmentation • [ZJRP+15]: state-of-the-art [ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15
  • 72. Experiments: Qualitative assessment 72 leaves partition I-2S I-2S+GG I-2S+SG SCSS [ZJRP+15] [ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15
  • 73. Experiments: Qualitative assessment 73 leaves partition I-2S I-2S+GG I-2S+SG SCSS [ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15 [ZJRP+15]
  • 74. Experiments: Qualitative assessment • Occlusion/Object Boundary Detection Dataset [GVB11] • Ballet and Breakdancers datasets [ZKU+04]
  • 75. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Conclusions 75
  • 76. Conclusions • The use of motion cues significantly improved the performance • The new resolution parameterization gave a more uniform distribution of resolutions • The two-step architecture improved on the performance of the original one-step architecture • Although global optimization is now feasible, there is no clear gain for generic co-clustering. However, it is useful for semantic co-clustering. • Applying the resolution selection technique causes only a small decrease in performance • Submitted to ECCV’16 (awaiting decision)
  • 77. Future Work • Extending experiments to video datasets • VSB100 (Video Segmentation Benchmark) [1] • Cityscapes [2] • Extending experiments to calibrated scenarios • Training end-to-end CNNs for multiview semantic segmentation 77 [1] F Galasso et al, A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis. ICCV’13 [2] M Cordts et al, The cityscapes dataset for semantic urban scene understanding. CVPR’16
  • 78. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Conclusions 78
  • 79. Conclusions • Results achieved in the first part by considering new spatial configurations are now obsolete after the outstanding results achieved by deep learning techniques. • Results from deep learning techniques were used in the second part. • The proposed multiresolution co-clustering has improved state-of-the-art results, but we should consider an end-to-end deep learning approach to achieve a more significant improvement. • Semantic segmentation techniques evolve very fast, making this field highly competitive and challenging.
  • 80. Publications • Related to the thesis • C. Ventura, D. Varas, X. Giro-i-Nieto, V. Vilaplana, F. Marques. Semantically driven multiresolution co-clustering for uncalibrated multiview segmentation. Submitted to the European Conference on Computer Vision (ECCV) 2016. Under review. • C. Ventura, X. Giro-i-Nieto, V. Vilaplana, K. McGuinness, F. Marques, Noel E O'Connor. Improving spatial codification in semantic segmentation. International Conference on Image Processing (ICIP) 2015. • C. Ventura. Visual object analysis using regions and interest points. ACM international conference on Multimedia 2013.
  • 81. Publications • Other publications: • K. McGuinness, E. Mohedano, Z. Zhang, F. Hu, R. Albatal, Cathal Gurrin, N.E O'Connor, A. F. Smeaton, A. Salvador, X. Giro-i-Nieto, C. Ventura. Insight Centre for Data Analytics (DCU) at TRECVid 2014: instance search and semantic indexing tasks. TRECVID Workshop 2014. • C. Ventura, V. Vilaplana, X. Giro-i-Nieto, F. Marques. Improving retrieval accuracy of Hierarchical Cellular Trees for generic metric spaces. Multimedia Tools and Applications, 2014. • C. Ventura, X. Giro-i-Nieto, V. Vilaplana, D. Giribet, E. Carasusan. Automatic keyframe selection based on mutual reinforcement algorithm. International Workshop on Content-Based Multimedia Indexing (CBMI) 2013. • C. Ventura, M. Tella-Amo, X. Giro-i-Nieto. UPC at MediaEval 2013 Hyperlinking Task. MediaEval 2013. • C. Ventura, M. Martos, X. Giro-i-Nieto, V. Vilaplana, F. Marques. Hierarchical navigation and visual search for video keyframe retrieval. International Conference on Multimedia Modeling 2012. 81
  • 83. Introduction: Context • Source: A. Oliva and A. Torralba, The role of context in object recognition
  • 84. Introduction: Context • Source: A. Oliva and A. Torralba, The role of context in object recognition
  • 85. Introduction: Context • Source: T. Malisiewicz and A. A. Efros, Improving spatial support for objects via multiple segmentations.
  • 86. Related Work: Realistic scenario • Source: J. Carreira et al., Semantic segmentation with second-order pooling • Input image • Object segment hypotheses • Ranked object segment hypotheses (class independent), each with an object plausibility score
  • 87. Related Work: Realistic scenario • Source: J. Carreira et al., Semantic segmentation with second-order pooling • Predict the overlap estimate of each segment with each object class and sort segments by maximal score • Aggregate high-rank segments
  • 88. Related Work: Realistic scenario 88 0.8179 0.6861 0.9013 0.7381 0.7105 0.6462 TRAINING DATA TEST DATA ?0.4905 [1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  • 89. Related Work: Co-clustering framework • What are the contour elements? 89 view 1 view 2 LEAVES PARTITIONS Which contour elements are considered to compute Q1,4? • Contour elements of R1 • Contour elements of R4
  • 90. Related Work: Co-clustering framework 90 INTRA INTERACTIONS INTER INTERACTIONS
  • 92. Related Work: Co-clustering framework 92 LINEAR PROGRAMMING RELAXATION
  • 93. Related Work: Co-clustering framework • Example with regions 1 to 5 • Intra: Q1,2 = -0.81, Q3,4 = -0.81, Q3,5 = -0.81, Q4,5 = -0.49 • Inter: Q1,3 = 2.81e+03, Q1,4 = -1.36e+03, Q1,5 = -1.45e+03, Q2,3 = -2.81e+03, Q2,4 = 1.36e+03, Q2,5 = 1.45e+03 • Q4,5 = -0.49: is D4,5 = 1? • The constraint $D_{4,5} \le D_{4,2} + D_{2,5}$ with D4,2 = 0 and D2,5 = 0 gives D4,5 = 0
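    The constraint D4,5 ≤ D4,2 + D2,5 on this slide is a triangle inequality: if regions 4 and 5 are each merged with region 2, they cannot be separated from each other. The sketch below checks that condition for a set of boundary variables. It is an illustrative consistency check with hypothetical names, not the optimization used in the thesis.

    ```python
    from itertools import permutations


    def get(D, i, j):
        """Look up a symmetric boundary variable stored under either key order."""
        return D[(i, j)] if (i, j) in D else D[(j, i)]


    def triangle_violations(D, regions):
        """List the triples (i, j, k) for which D[i,j] > D[i,k] + D[k,j]."""
        bad = []
        for i, j, k in permutations(regions, 3):
            if i < j and get(D, i, j) > get(D, i, k) + get(D, k, j):
                bad.append((i, j, k))
        return bad


    # Regions 2, 4 and 5 from the slide; D4,5 = 1 is the inconsistent choice.
    D = {(2, 4): 0, (2, 5): 0, (4, 5): 1}
    before = triangle_violations(D, [2, 4, 5])  # [(4, 5, 2)]
    D[(4, 5)] = 0
    after = triangle_violations(D, [2, 4, 5])   # []
    ```

    The check assumes a value is available for every pair among the listed regions; pairs that are not adjacent would need to be excluded or given a value first.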
  • 94. Related Work: Co-clustering framework 94 LEAVES PARTITIONS CO-CLUSTERED PARTITIONS
  • 95. Related Work: Co-clustering framework • Hierarchical constraint 95 PARENT NODE 11 Inter-sibling boundaries: Intra-sibling boundaries:
  • 96. Related Work: Co-clustering framework • Multiresolution parameterization • Number of active contours to encode leaf contours: 9 • Maximum fraction to describe the r-th coarse level: 0.5 • Maximum difference between consecutive levels: 0.1 • Values shown in the figure: 4.5, 3.6
  • 97. Related Work: Co-clustering framework • Iterative approach 97
  • 98. Contribution II: Resolution parameterization 98 Selected inter-sibling boundaries:
  • 99. Contributions • Semantic global co-clustering • 1. Class assignment to regions • 2. Similarity penalizations • Regions from the same partition with different classes • 3. Optimization constraints • Regions from the same partition with the same class • Regions from different partitions with different classes
  • 100. Contribution VI: Automatic resolution selection • Some applications require a single resolution 100 l1 l2 C1 C2 C3 l1 C1 C2U l2 C2 C2 l1 or l2 ? l1
  • 102. Conclusions • Multiresolution co-clustering framework for uncalibrated multiview sequences • Two-step architecture • Global optimization • Semantic-based co-clustering with resolution selection • Submitted to ECCV’16 (awaiting decision)
  • 103. Conclusions • Part I: Improving spatial codification in semantic segmentation • Figure-Border-Ground in realistic scenario • Contour-based spatial pyramid • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Results from Part I are replaced by SoA deep learning techniques • Generic co-clustering for multiview sequences • Semantic co-clustering for multiview sequences 103