Visual Object Analysis using Regions and Local Features

Visual Object Analysis using
Regions and Local Features
Carles Ventura Royo
Co-advisors
Xavier Giró i Nieto
Verónica Vilaplana Besler
Tutor
Ferran Marqués Acosta

Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview segmentation
• Conclusions

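Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
  • Introduction
  • Related Work
  • Contributions
  • Experiments
  • Conclusions
• Part II: Multiresolution co-clustering for uncalibrated multiview segmentation
• Conclusions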

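Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview segmentation
  • Introduction
  • Related Work
  • Contributions
  • Experiments
  • Conclusions
• Conclusions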

Introduction: Semantic segmentation
[Figure: instance segmentation vs. class segmentation of an image labeled "boat"]

Introduction: Semantic segmentation
• Part I: Single view
• Part II: Multiview
[Figure: state of the art vs. our results]

Introduction: Visual Object Analysis
[Figure: objects vs. scene]

Introduction: Regions
[Figure: image partition into 10 regions and its BINARY PARTITION TREE with nodes 1 to 19]
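A Binary Partition Tree can be sketched as a greedy merging of the most similar pair of adjacent regions until a single root remains. The code below is only an illustration under stated assumptions, not the construction used in the thesis: each region is described by a mean value and a size, similarity is the absolute difference of means, and the function name `build_bpt` is hypothetical.

```python
def build_bpt(means, sizes, edges):
    """Greedy BPT construction (toy merging criterion, assumed for illustration).

    means, sizes: dicts {leaf_id: mean value} and {leaf_id: pixel count}
    edges: iterable of (i, j) pairs of adjacent leaf regions
    Returns the list of merges as (child_a, child_b, parent_id).
    """
    means, sizes = dict(means), dict(sizes)
    neighbors = {r: set() for r in means}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)

    merges = []
    next_id = max(means) + 1
    while len(neighbors) > 1:
        # Candidate merges: every pair of adjacent regions, scored by mean difference.
        candidates = [(abs(means[a] - means[b]), a, b)
                      for a in neighbors for b in neighbors[a] if a < b]
        if not candidates:
            break  # the remaining regions are not connected
        _, a, b = min(candidates)
        parent = next_id
        next_id += 1
        # The parent model is the size-weighted average of its two children.
        sizes[parent] = sizes[a] + sizes[b]
        means[parent] = (means[a] * sizes[a] + means[b] * sizes[b]) / sizes[parent]
        # The parent inherits the neighbors of both children.
        neighbors[parent] = (neighbors[a] | neighbors[b]) - {a, b}
        for n in neighbors[parent]:
            neighbors[n] -= {a, b}
            neighbors[n].add(parent)
        del neighbors[a], neighbors[b]
        merges.append((a, b, parent))
    return merges


toy_merges = build_bpt(
    means={1: 10.0, 2: 12.0, 3: 50.0, 4: 52.0},
    sizes={1: 1, 2: 1, 3: 1, 4: 1},
    edges=[(1, 2), (2, 3), (3, 4)],
)
```

With N connected leaves the loop performs N-1 merges, so the tree has 2N-1 nodes; 10 leaves give 19 nodes.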

Introduction: Regions
[Figure: image partition into 10 regions and its REGION ADJACENCY GRAPH]
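As a minimal sketch of how a Region Adjacency Graph can be derived from a partition, the code below collects every pair of labels that touch under 4-connectivity. The function name `region_adjacency_graph`, the choice of 4-connectivity and the toy label map are assumptions made for illustration.

```python
import numpy as np


def region_adjacency_graph(labels):
    """Return the set of adjacent label pairs (i, j), i < j, in a 2-D label map."""
    labels = np.asarray(labels)
    edges = set()
    # Compare every pixel with its right neighbor, then with its bottom neighbor.
    shifted_pairs = ((labels[:, :-1], labels[:, 1:]),
                     (labels[:-1, :], labels[1:, :]))
    for a, b in shifted_pairs:
        differ = a != b
        for i, j in zip(a[differ].tolist(), b[differ].tolist()):
            edges.add((min(i, j), max(i, j)))
    return edges


partition = [[1, 1, 2],
             [1, 3, 2],
             [3, 3, 2]]
rag_edges = region_adjacency_graph(partition)
```

Each returned pair is an edge of the graph; the nodes are the region labels themselves.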

Introduction: Local Features
[Figure: local features vs. global features]

Introduction: Local Features Aggregation
• Bag of Features (BoF) [1]
[Figure: local features → vector quantization with a codebook → Bag of Features]
[1] G Csurka et al, Visual Categorization with Bags of Keypoints. ECCV’04
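The Bag of Features aggregation can be sketched as follows: each local descriptor is assigned to its nearest codeword and the assignments are counted into a histogram. This is a minimal sketch, assuming Euclidean distance, a codebook that is already given (in practice it is usually learned by clustering training descriptors) and L1 normalization; the function name `bag_of_features` and the toy data are hypothetical.

```python
import numpy as np


def bag_of_features(descriptors, codebook):
    """Quantize each descriptor to its nearest codeword and count occurrences."""
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # vector quantization
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()  # L1-normalized histogram


codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.7, 5.2]]
bof = bag_of_features(descriptors, codebook)
```

The descriptor length equals the codebook size, regardless of how many local features the image or region contains.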

Introduction: Local Features Aggregation
• Pooling
• First Order Average Pooling (O1P) [1]: $\frac{1}{N}\sum_{i=1}^{N} x_i$
• Second Order Average Pooling (O2P) [2]: $\frac{1}{N}\sum_{i=1}^{N} x_i x_i^T$
• $x_i$: local features
• No need for a codebook
• High dimensionality
[1] Y Boureau et al, A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
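The two pooling formulas translate directly into a few lines of array code. The sketch below assumes the N local features are stacked as the rows of an N x d array; the function names are hypothetical, and any further normalization used in the cited works is omitted.

```python
import numpy as np


def first_order_pooling(x):
    """O1P: average of the N local descriptors (rows of x)."""
    x = np.asarray(x, dtype=float)
    return x.mean(axis=0)


def second_order_pooling(x):
    """O2P: average of the outer products x_i x_i^T, a symmetric d x d matrix."""
    x = np.asarray(x, dtype=float)
    return x.T @ x / x.shape[0]


features = np.array([[1.0, 2.0],
                     [3.0, 4.0]])
o1p = first_order_pooling(features)   # length-d vector
o2p = second_order_pooling(features)  # d x d matrix
```

For d-dimensional features O1P has d entries while O2P has d(d+1)/2 distinct entries, which is where the high dimensionality comes from.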

Part I
Context analysis
in semantic segmentation

Introduction: Context
• Semantic context [1,2]
• Spatial context
• GOAL: Analyze the influence of the spatial context in object recognition
[1] M Bar, Visual Objects in Context. Nature Reviews Neuroscience 2004
[2] A Rabinovich et al, Objects in Context. ICCV’07

Related Work: Ideal scenario
[Figure: ground truth object location]
• Conclusion: Aggregating the local features over three region pools (interior, border and surround) increases the performance [1]
[1] J.R.R. Uijlings et al., The Visual Extent of an Object. IJCV’12

Related Work: Realistic scenario
• Pipeline [1]: Input image → Generate object candidates → Rank object candidates → Predict class scores → Aggregate high-rank candidates → Semantic partition
[1] J Carreira et al, Object Recognition as Ranking Holistic Figure-Ground Hypotheses. CVPR’10

• How is each class predictor trained? [1]
• An SVR is used to learn the function that predicts the overlap for each class: os = f([O2PF O2PG])
• TRAINING DATA: pairs of pooled descriptor and overlap score: ([O2PF_1 O2PG_1], os_1), ([O2PF_2 O2PG_2], os_2), …, ([O2PF_N O2PG_N], os_N)
• Example overlap scores: 0.8179, 0.6861, 0.9013, 0.7381, 0.7105, 0.6462
• GOAL: CHANGE SPATIAL CODIFICATION
[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
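The regression target of each class predictor is an overlap score between an object candidate and the ground truth. A minimal sketch of that target is given below, assuming the overlap is the intersection over union of two binary masks; the function name `overlap_score` and the toy masks are hypothetical, and the SVR training itself is not shown.

```python
import numpy as np


def overlap_score(candidate, ground_truth):
    """Intersection over union between two binary masks (assumed overlap measure)."""
    candidate = np.asarray(candidate, dtype=bool)
    ground_truth = np.asarray(ground_truth, dtype=bool)
    union = np.logical_or(candidate, ground_truth).sum()
    if union == 0:
        return 0.0  # both masks empty: define the overlap as zero
    return float(np.logical_and(candidate, ground_truth).sum() / union)


candidate_mask = [[1, 1, 0],
                  [1, 1, 0]]
ground_truth_mask = [[0, 1, 1],
                     [0, 1, 1]]
os_example = overlap_score(candidate_mask, ground_truth_mask)
```

Each training pair then couples the pooled descriptor of a candidate with its score, and one regressor per class is fitted on those pairs.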

Contributions
• Figure-Border-Ground spatial pooling in the realistic scenario
• SVR: os = f([O2PF O2PB O2PG])
• Training pairs: ([O2PF_1 O2PB_1 O2PG_1], os_1), ([O2PF_2 O2PB_2 O2PG_2], os_2), …, ([O2PF_N O2PB_N O2PG_N], os_N)
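A Figure-Border-Ground descriptor can be sketched by pooling the local features separately over three masks and concatenating the results. The slide does not define the border pool, so the code below assumes it is a band around the candidate contour (dilation minus erosion with a 4-connected neighborhood), while figure and ground are the candidate mask and its complement. All function names and the `width` parameter are hypothetical.

```python
import numpy as np


def dilate(mask):
    """Binary dilation with a 4-connected structuring element."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    return (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
            | p[1:-1, :-2] | p[1:-1, 2:])


def erode(mask):
    """Binary erosion, obtained by dilating the complement."""
    return ~dilate(~mask)


def o2p(features, mask):
    """Second-order average pooling of the features selected by mask."""
    x = features[mask]  # (n, d) descriptors inside the pool
    d = features.shape[-1]
    if len(x) == 0:
        return np.zeros((d, d))
    return x.T @ x / len(x)


def fbg_descriptor(features, mask, width=1):
    """Concatenate O2P over the figure, border and ground pools (assumed definitions)."""
    mask = np.asarray(mask, dtype=bool)
    outer, inner = mask, mask
    for _ in range(width):
        outer, inner = dilate(outer), erode(inner)
    pools = [mask, outer & ~inner, ~mask]  # F, B, G
    return np.concatenate([o2p(features, m).ravel() for m in pools])


toy_features = np.ones((5, 5, 2))        # one 2-D feature per pixel
toy_mask = np.zeros((5, 5), dtype=bool)
toy_mask[1:4, 1:4] = True                # 3 x 3 object candidate
fbg = fbg_descriptor(toy_features, toy_mask)
```

Adding the border pool grows the descriptor from two to three pooled blocks, so its length is 3·d² before any symmetry is exploited.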

Contributions
• Contour-based spatial pyramid [1]: crown-based
• SVR: os = f([O2PF O2PSR1 O2PSR2 O2PSR3 O2PSR4])
• Training pairs: ([O2PF_1 O2PSR1_1 O2PSR2_1 O2PSR3_1 O2PSR4_1], os_1), …, ([O2PF_N O2PSR1_N O2PSR2_N O2PSR3_N O2PSR4_N], os_N)
[1] S Lazebnik et al, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR’06

Contributions
• Contour-based spatial pyramid [1]: Cartesian-based
• SVR: os = f([O2PF O2PSR1 O2PSR2 O2PSR3 O2PSR4])
• Training pairs: …, ([O2PF_N O2PSR1_N O2PSR2_N O2PSR3_N O2PSR4_N], os_N)
[1] S Lazebnik et al, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR’06

Experiments
• Pascal VOC segmentation challenge 2011 & 2012 [1]
• Train, validation and test subsets
• Train: 1,112 (2011) / 1,464 (2012)
• Validation: 1,111 (2011) / 1,449 (2012)
• Test: 1,111 (2011) / 1,456 (2012)
• 20 semantic classes
• aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor
• Evaluation measure: Average Accuracy Classification
[1] M Everingham et al, The PASCAL Visual Object Classes (VOC) Challenge. IJCV’10

Experiments: Local Features Aggregation
• Pooling
• First Order Average Pooling (O1P) [1]: $\frac{1}{N}\sum_{i=1}^{N} x_i$
• Second Order Average Pooling (O2P) [2]: $\frac{1}{N}\sum_{i=1}^{N} x_i x_i^T$
• $x_i$: local features
• No need for a codebook
• High dimensionality
[1] Y Boureau et al, A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Experiments
• Ideal scenario
• Train set: train11
• Test set: val11
             F [1]   F-B    F-G [1]   F-B-G
eSIFT [1]    63.9    66.2   66.4      68.6
eMSIFT [1]   64.8    68.9   67.7      70.8
[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Experiments
• Ideal scenario
• Test set: val11
                     F [1]   F-B    F-B-G
Non SP               64.8    68.9   70.8
Crown-based SP       68.7    71.1   71.7
Cartesian-based SP   67.7    71.6   72.7

Experiments
• Ideal scenario
• Test set: val11
Figure              SP (Figure)   Border         Ground         AAC
eSIFT+eMSIFT+eLBP   -             -              eSIFT          72.98 [1]
eSIFT+eMSIFT        -             eSIFT+eMSIFT   eSIFT+eMSIFT   73.84
eSIFT+eMSIFT+eLBP   eMSIFT        eSIFT+eMSIFT   eSIFT+eMSIFT   75.86

Experiments
• Realistic scenario (CPMC [1])
• Test set: val11
Figure              SP (Figure)   Border   Ground   AAC
eSIFT               -             -        eSIFT    28.6 [2]
eSIFT               -             eSIFT    eSIFT    34.8
eSIFT+eMSIFT+eLBP   -             -        eSIFT    37.2 [2]
eSIFT               eSIFT         eSIFT    eSIFT    37.4
eSIFT+eMSIFT+eLBP   eSIFT         eSIFT    eSIFT    39.6
[1] J Carreira et al, Constrained parametric min-cuts for automatic object segmentation. CVPR’10

Experiments
• Realistic scenario (CPMC [1])
• Train set: trainval11/12
• Test set: test11/12
        F-G [2]   F-B-G   SP(F)-B-G
VOC11   38.8      43.8    40.3
VOC12   39.9      42.2    40.8
[1] J Carreira et al, Constrained parametric min-cuts for automatic object segmentation. CVPR’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Experiments
• Realistic scenario (MCG [1])
• Test set: val11
       F-G [2]   F-B-G   SP(F)-B-G
CPMC   37.2      38.9    39.6
MCG    30.9      34.1    36.1
[1] P Arbeláez et al, Multiscale combinatorial grouping. CVPR’14
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Experiments: Qualitative evaluation
[Figure: F-G vs. F-B-G segmentations with predicted class labels, for example images of aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair and cow]

Experiments: Qualitative evaluation
[Figure: F-G vs. F-B-G segmentations with predicted class labels, for example images of diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train and tvmonitor]

Conclusions
• Figure-Border-Ground spatial pooling improves the original Figure-Ground pooling in both ideal and realistic scenarios
• The Border region pool carries the richest contextual information
• The Cartesian-based spatial pyramid outperforms the crown-based spatial pyramid, but both of them may result in overfitting
• Both Figure-Border-Ground pooling and Cartesian-based spatial pyramid have been validated with MCG object candidates
• Published in ICIP’15

Part II
Multiresolution co-clustering for
uncalibrated multiview segmentation

Introduction
[Figure: state of the art vs. our results]

Introduction
• First goal: improving generic segmentation
  • Motion-based region adjacency graph
  • New resolution parameterization
  • Relaxing hierarchical constraints with a two-step architecture
  • Practical framework for a global optimization
• Second goal: improving semantic segmentation
  • Semantic-based generic segmentation
  • Automatic resolution selection technique
  • Generic segmentation based semantic segmentation

Introduction
• Co-segmentation
• Video segmentation
• Co-clustering

Related Work: Co-clustering framework [1,2]
• Objective: Find the clusters that define the coherent regions across the different views at multiple resolutions
[Figure: input images, hierarchies, leaves partitions and co-clustered partitions]
[1] D Glasner et al, Contour-based joint clustering of multiple segmentations. CVPR’11
[2] D Varas et al, Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations. ICCV’15

Related Work: Co-clustering framework [1,2]
• Objective: Find the clusters that define the coherent regions across the different views
[Figure: leaves partitions and co-clustered partitions of view 1 and view 2]
[1] D Glasner et al, Contour-based joint clustering of multiple segmentations. CVPR’11
[2] D Varas et al, Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations. ICCV’15

Related Work: Co-clustering framework
• Representation with boundary variables
• Intra-image boundary variables: D1,2, D1,3, D2,3, D4,5, D5,6
• Inter-image boundary variables: D1,4, D1,5, D2,4, D2,5, D3,6
[Figure: leaves partitions and co-clustered partitions of view 1 and view 2]
• Intra-image values in the example: D1,2 = 0, D1,3 = 1, D2,3 = 1, D4,5 = 0, D5,6 = 1
• Inter-image values in the example: D1,4 = 0, D1,5 = 0, D2,4 = 0, D2,5 = 0, D3,6 = 0
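The boundary-variable representation can be illustrated with a short sketch that derives each D value from a cluster assignment. It assumes that D = 0 means the two regions end up in the same cluster and D = 1 means the boundary between them is kept; the cluster names "A" and "B" and the function name `boundary_variables` are hypothetical. The assignment below (regions 1, 2, 4, 5 together and regions 3, 6 together) reproduces the values listed on the slide.

```python
def boundary_variables(cluster_of, pairs):
    """D[i, j] = 0 if regions i and j share a cluster, 1 otherwise."""
    return {(i, j): int(cluster_of[i] != cluster_of[j]) for i, j in pairs}


# Regions 1-3 belong to view 1 and regions 4-6 to view 2.
cluster_of = {1: "A", 2: "A", 4: "A", 5: "A", 3: "B", 6: "B"}
intra_pairs = [(1, 2), (1, 3), (2, 3), (4, 5), (5, 6)]
inter_pairs = [(1, 4), (1, 5), (2, 4), (2, 5), (3, 6)]
D = boundary_variables(cluster_of, intra_pairs + inter_pairs)
```

Only pairs of regions that interact (adjacent within a view or matched across views) get a variable, which keeps the number of unknowns small.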

• How are the values of the boundary variables chosen?
[Figure: leaves partitions of view 1 and view 2]
• Intra interactions: Q1,2, Q1,3, Q2,3, Q4,5, Q5,6
• Inter interactions: Q1,4, Q1,5, Q2,4, Q2,5, Q3,6

• Hierarchical constraint
• Co-clustered partitions cannot violate the hierarchical structures
[Figure: hierarchies over regions 1 to 3 (view 1) and 4 to 6 (view 2), shown for two different merging orders in view 1]

• Multiresolution parameterization
[Figure: leaves partitions of view 1 and view 2 at successive resolutions]

• Iterative approach

Contribution I: Motion-based adjacency
• Similarity computation
• RAG definition
[Figure: View #i and View #i-1]

Contribution II: Resolution parameterization
[Figure: leaves partitions of view 1 and view 2; original parameterization (= ???) vs. proposed parameterization (= 2)]

Contribution III: Two-step iterative architecture
• Hierarchical constraints are not imposed in a second step
[Figure: first step and second step]

Contribution IV: Generic global co-clustering
• All co-clustered partitions resulting from the iterative architecture are fed into a global optimization
• The reduction in the number of regions makes the global optimization feasible

Contribution V: Semantic global co-clustering
• Semantic information is introduced in the global optimization
[Figure: generic co-clustering and semantic segmentations are combined into semantic co-clustering]

Contribution VI: Automatic resolution selection
• We propose a method that automatically selects the resolution that best fits the semantic information
[Figure: leaves partitions of view 1 and view 2, multiresolution co-clustering, semantic partitions and the resulting single resolution co-clustering]

Contribution VII: Coherent semantic partitions
[Figure: leaves partitions of view 1 and view 2, semantic partitions, single resolution co-clustering and the resulting coherent semantic partitions]

Contribution VII: Coherent semantic partitions
[Figure: state of the art [1] vs. our results]
[1] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15

Experiments: Dataset
• Multiview dataset [1]
[1] A. Kowdle et al., Multiple view object cosegmentation using appearance and stereo cues. ECCV’12

Experiments: Generic co-clustering
• Co-segmentation techniques
• Video segmentation techniques
• Co-clustering techniques
  • I-1S: Motion-compensated one-step iterative (baseline)
  • I-2S: Two-step iterative
  • UCM+I-1S: First step is replaced by a cut from a hierarchical segmentation algorithm
  • I-2S+GG: Two-step iterative followed by generic global optimization

Experiments: Generic co-clustering
              CO-CLUSTERING               CO-SEGMENTATION    VIDEO SEGMENTATION             BASELINES
              I-2S   UCM+I-1S   I-2S+GG   [KX12]   [JBP12]   [XXC12]   [GKHE10]   [GCS13]   UCM+Pr   I-1S
BMW           0.72   0.68       0.70      0.42     0.56      0.70      0.65       0.63      0.62     0.67
Chair         0.79   0.77       0.76      0.53     0.78      0.80      0.76       0.47      0.59     0.78
Couch         0.93   0.95       0.94      0.78     0.90      0.85      0.88       0.73      0.89     0.90
GardenChair   0.84   0.63       0.87      0.31     0.52      0.70      0.68       0.63      0.84     0.80
Motorbike     0.76   0.77       0.77      0.39     0.39      0.71      0.73       0.46      0.54     0.70
Teddy         0.92   0.92       0.92      0.69     0.87      0.88      0.84       0.85      0.82     0.90
Average       0.83   0.79       0.83      0.52     0.67      0.77      0.76       0.63      0.72     0.79
• Two-step iterative co-clustering techniques (I-2S and I-2S+GG) outperform other state-of-the-art techniques

Experiments: Semantic co-clustering
• Co-clustering techniques
  • I-2S+GG(MR): Multiresolution global generic co-clustering
  • I-2S+SG(MR): Multiresolution global semantic co-clustering
  • I-2S+GG(SR): Single resolution global generic co-clustering
  • I-2S+SG(SR): Single resolution global semantic co-clustering
• Semantic segmentation techniques
  • SCSS: Semantic co-clustering based semantic segmentation
  • GCSS: Generic co-clustering based semantic segmentation
  • [ZJRP+15]: state-of-the-art
[ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15

Experiments: Qualitative assessment
[Figure: qualitative results; columns: leaves partition, I-2S, I-2S+GG, I-2S+SG, SCSS, [ZJRP+15]]
[Figure: results on the Occlusion/Object Boundary Detection Dataset [GVB11] and on the Ballet and Breakdancers datasets [ZKU+04]]
[ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15

Conclusions
• The use of motion cues significantly improved the performance
• The new resolution parameterization allowed us to have a more uniform distribution of resolutions
• The two-step architecture improved the performance of the original one-step architecture
• Although global optimization is now feasible, there is no clear gain for generic co-clustering. However, it is useful for semantic co-clustering.
• Applying the resolution selection technique results in a small decrease in performance
• Submitted to ECCV’16 (awaiting decision)

Future Work
• Extending experiments to video datasets
  • VSB100 (Video Segmentation Benchmark) [1]
  • Cityscapes [2]
• Extending experiments to calibrated scenarios
• Training end-to-end CNNs for multiview semantic segmentation
[1] F Galasso et al, A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis. ICCV’13
[2] M Cordts et al, The cityscapes dataset for semantic urban scene understanding. CVPR’16

Conclusions
• Results achieved in the first part by considering new spatial configurations are now obsolete after the outstanding results achieved by deep learning techniques.
• Results from deep learning techniques were used in the second part.
• The proposed multiresolution co-clustering has improved state-of-the-art results, but we should consider an end-to-end deep learning approach to achieve a more significant improvement.
• Semantic segmentation techniques evolve really fast, making this field very competitive and challenging.

Publications
• Related to the thesis
  • C. Ventura, D. Varas, X. Giro-i-Nieto, V. Vilaplana, F. Marques. Semantically driven multiresolution co-clustering for uncalibrated multiview segmentation. Submitted to the European Conference on Computer Vision (ECCV) 2016. Under review.
  • C. Ventura, X. Giro-i-Nieto, V. Vilaplana, K. McGuinness, F. Marques, Noel E O'Connor. Improving spatial codification in semantic segmentation. International Conference on Image Processing (ICIP) 2015.
  • C. Ventura. Visual object analysis using regions and interest points. ACM international conference on Multimedia 2013.

Publications
• Other publications:
  • K. McGuinness, E. Mohedano, Z. Zhang, F. Hu, R. Albatal, Cathal Gurrin, N.E O'Connor, A. F. Smeaton, A. Salvador, X. Giro-i-Nieto, C. Ventura. Insight Centre for Data Analytics (DCU) at TRECVid 2014: instance search and semantic indexing tasks. TRECVID Workshop 2014.
  • C. Ventura, V. Vilaplana, X. Giro-i-Nieto, F. Marques. Improving retrieval accuracy of Hierarchical Cellular Trees for generic metric spaces. Multimedia Tools and Applications, 2014.
  • C. Ventura, X. Giro-i-Nieto, V. Vilaplana, D. Giribet, E. Carasusan. Automatic keyframe selection based on mutual reinforcement algorithm. International Workshop on Content-Based Multimedia Indexing (CBMI) 2013.
  • C. Ventura, M. Tella-Amo, X. Giro-i-Nieto. UPC at MediaEval 2013 Hyperlinking Task. MediaEval 2013.
  • C. Ventura, M. Martos, X. Giro-i-Nieto, V. Vilaplana, F. Marques. Hierarchical navigation and visual search for video keyframe retrieval. International Conference on Multimedia Modeling 2012.

Source: A. Oliva and A. Torralba, The role of context in object recognition

Source: A. Oliva and A. Torralba, The role of context in object recognition

Source: T. Malisiewicz and A. A. Efros, Improving spatial support for objects via multiple segmentations.

[Figure: input image → object segment hypotheses → ranked object segment hypotheses (class independent), each with an object plausibility score]
Source: J. Carreira et al., Semantic segmentation with second-order pooling

• Predict overlap estimate of each segment to each object class and sort segments by maximal score
• Aggregate high-rank segments
Source: J. Carreira et al., Semantic segmentation with second-order pooling

• TRAINING DATA: candidates with overlap scores 0.8179, 0.6861, 0.9013, 0.7381, 0.7105, 0.6462
• TEST DATA: overlap score to be predicted (0.4905)

• What are the contour elements?
[Figure: leaves partitions of view 1 and view 2]
• Which contour elements are considered to compute Q1,4?
  • Contour elements of R1
  • Contour elements of R4

[Figure: intra interactions and inter interactions]

[Figure: linear programming relaxation]
1
2
3
4
5
Intra: Q1,2 = -0.81
Q3,4 = -0.81, Q3,5 = -0.81, Q4,5 = -0.49
Inter: Q1,3 = 2.81e+03
Q1,4 = -1.36e+03
Q1,5 = -1.45e+03
Q2,3 = -2.81e+03
Q2,4 = 1.36e+03
Q2,5 = 1.45e+03
x 0
x 0
x 1
Q4,5 = -0.49 D4,5 = 1 ??
𝐷4,5 ≤ 𝐷4,2 + 𝐷2,5
D4,2 = 0, D2,5 = 0 D4,5 = 0
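The constraint D4,5 ≤ D4,2 + D2,5 is a triangle inequality: if region 2 is merged with both 4 and 5, then 4 and 5 cannot be separated. A minimal sketch of a checker for such constraints is given below; the function name `violated_triangles` is hypothetical and the sketch assumes a D value is available for every pair of the listed regions.

```python
from itertools import permutations


def violated_triangles(D, regions):
    """Return triples (i, j, k) with i < j where D[i, j] > D[i, k] + D[k, j]."""
    def d(a, b):
        # Boundary variables are symmetric; keys are stored as (small, large).
        return D[(min(a, b), max(a, b))]

    return [(i, j, k) for i, j, k in permutations(regions, 3)
            if i < j and d(i, j) > d(i, k) + d(k, j)]


# Values from the example: regions 4 and 5 are each merged with region 2.
D_example = {(2, 4): 0, (2, 5): 0, (4, 5): 1}
before = violated_triangles(D_example, [2, 4, 5])

D_example[(4, 5)] = 0  # enforce the constraint
after = violated_triangles(D_example, [2, 4, 5])
```

Setting D4,5 = 1 is reported as a violation, and setting it to 0 clears it, which matches the conclusion of the example.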

[Figure: parent node 11 with its inter-sibling boundaries and intra-sibling boundaries]

• Multiresolution parameterization
  • Number of active contours to encode leaves contours (= 9)
  • Maximum fraction to describe the r-th coarse level (= 0.5)
  • Maximum difference between consecutive levels (= 0.1)
[Figure values: 4.5, 3.6]

• Iterative approach

Contribution II: Resolution parameterization
[Figure: selected inter-sibling boundaries]

Contributions
• Semantic global co-clustering
1. Class assignment to regions
2. Similarity penalizations
  • Regions from same partition with different classes
3. Optimization constraints
  • Regions from same partition with same class
  • Regions from different partitions with different class

Contribution VI: Automatic resolution selection
• Some applications require a single resolution
[Figure: resolutions l1 and l2 compared against semantic classes C1, C2 and C3; l1 or l2? l1 is selected]

Experiments: Semantic co-clustering

Conclusions
• Multiresolution co-clustering framework for uncalibrated multiview sequences
• Two-step architecture
• Global optimization
• Semantic-based co-clustering with resolution selection
• Submitted to ECCV’16 (awaiting decision)

Conclusions
• Part I: Improving spatial codification in semantic segmentation
  • Figure-Border-Ground in realistic scenario
  • Contour-based spatial pyramid
  • Results from Part I are replaced by SoA deep learning techniques
• Part II: Multiresolution co-clustering for uncalibrated multiview segmentation
  • Generic co-clustering for multiview sequences
  • Semantic co-clustering for multiview sequences
