1. Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers Abhinav Gupta and Larry S. Davis University of Maryland, College Park Proceedings of ECCV 2008 Presented by: Debaleena Chattopadhyay
2. Presentation Outline - The Problem Definition - The Novelty - The Problem Solution - The Results
3. The Problem Definition To learn visual classifiers for object recognition from weakly labeled data Input: an image with the labels city, mountain, sky, sun Expected output: the image regions labeled sun, sky, mountain, city
4. Novelty To learn visual classifiers for object recognition from weakly labeled data, utilizing additional language constructs Input: an image with labels: (Nouns) city, mountain, sky, sun (Relations) below(mountain, sky), below(mountain, sun), above(sky, city), above(sun, city), brighter(sun, mountain), brighter(sun, city), behind(mountain, city), convex(sun, city), in(sun, sky), smaller(sun, sky) Expected output: the image regions labeled sun, sky, mountain, city
6. Overview Nouns: SEA, SKY, SUN Pairs of nouns: (SEA, SUN), (SEA, SKY), (SKY, SEA), (SKY, SUN), (SUN, SKY), (SUN, SEA) Relationships: in, above, below
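Since relationships such as "above" or "in" are not symmetric, every ordered pair of distinct nouns is considered. A minimal sketch of this enumeration (the function name is illustrative):

```python
from itertools import permutations

def noun_pairs(nouns):
    """Enumerate all ordered pairs of distinct nouns.

    Relationships like 'above' or 'in' are not symmetric, so both
    (SKY, SEA) and (SEA, SKY) appear in the output.
    """
    return list(permutations(nouns, 2))

# Three nouns yield 3 * 2 = 6 ordered pairs, matching the slide.
pairs = noun_pairs(["SEA", "SKY", "SUN"])
```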
14. E-step: Update the assignments of nouns to image regions, given the current classifier parameters CA and CR
16. Learning the Model EM approach: simultaneously solve the correspondence problem and learn the parameters of the classifiers (noun and relationship). E-step: Compute the noun assignments using the parameters from the previous iteration: P(noun i assigned to region j) = Σ_{A ∈ Aij} P(A | I, N, R), where Aij is the subset of all possible assignments for an image in which noun i is assigned to region j, I denotes the image features, and N and R the annotated nouns and relationships.
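The marginalization in the E-step can be sketched as follows. This is a toy illustration, not the paper's exact formulation: the score arrays and the tuple representation of assignments are assumptions.

```python
import numpy as np

def e_step(noun_scores, rel_scores, assignments):
    """Sketch of the E-step: P(noun i -> region j) by marginalizing
    over full assignments.

    noun_scores[i, j] : likelihood of noun i under region j's classifier
    rel_scores[k]     : likelihood of the annotated relationships under
                        assignments[k]
    assignments       : candidate assignments; assignments[k][i] is the
                        region that noun i maps to
    """
    n_nouns, n_regions = noun_scores.shape
    # Unnormalized posterior of each full assignment A: the product of
    # its noun likelihoods times its relationship likelihood.
    post = np.array([
        rel_scores[k] * np.prod([noun_scores[i, a[i]] for i in range(n_nouns)])
        for k, a in enumerate(assignments)
    ])
    post /= post.sum()
    # Marginalize: sum the posterior over the subset Aij of assignments
    # in which noun i is assigned to region j.
    p = np.zeros((n_nouns, n_regions))
    for k, a in enumerate(assignments):
        for i in range(n_nouns):
            p[i, a[i]] += post[k]
    return p
```

With uniform noun scores, the relationship likelihoods alone decide which assignment is favored, which is exactly the leverage the paper gets from prepositions and comparative adjectives.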
18. Learning the Model EM approach: simultaneously solve the correspondence problem and learn the parameters of the classifiers (noun and relationship). M-step: Update the model parameters given the updated assignments from the E-step. The maximum-likelihood parameters depend on the classifier used. To utilize contextual information for labeling test images, priors on relationships, P(r | ns, np), are also learned from a co-occurrence table after the relationship annotations are generated.
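For a Gaussian noun classifier, the update could look like this sketch; weighting the update by the E-step posteriors is an assumption about how the soft assignments enter the maximum-likelihood estimate.

```python
import numpy as np

def m_step_gaussian(features, resp):
    """Sketch of the M-step update for a Gaussian noun classifier.

    features : (n_regions, d) array of region feature vectors
    resp     : (n_regions,) soft assignments P(noun -> region)
               from the E-step
    Returns the responsibility-weighted ML mean and variance.
    """
    w = resp / resp.sum()
    mean = (w[:, None] * features).sum(axis=0)
    var = (w[:, None] * (features - mean) ** 2).sum(axis=0)
    return mean, var
```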
31. Range of semantics identified: both algorithms give similar performance (L)
32. Frequency correct: the latter algorithm performs better in the number of times a noun is identified (R). Compared: Nouns only; Nouns & Relationships (human); Nouns & Relationships (learned); the proposed EM algorithm bootstrapped by IBM Model 1; and the proposed EM algorithm bootstrapped by Duygulu et al.
37. Experimental Results Precision-Recall: Precision ratio: the ratio of the number of images correctly annotated with a word to the number of images the algorithm annotated with that word (with respect to human observers). Recall ratio: the ratio of the number of images correctly annotated with a word by the algorithm to the number of images that should have been annotated with that word (with respect to the Corel annotations).
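The two ratios reduce to simple counts per annotation word, as in this sketch (names are illustrative):

```python
def precision_recall(n_predicted, n_correct, n_should_have):
    """Sketch of the precision and recall ratios for one annotation word.

    n_predicted   : images the algorithm annotated with the word
    n_correct     : of those, images judged correct by human observers
    n_should_have : images that should carry the word, per the Corel
                    ground-truth annotations
    """
    precision = n_correct / n_predicted if n_predicted else 0.0
    recall = n_correct / n_should_have if n_should_have else 0.0
    return precision, recall
```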
39. This paper proposes an EM-based method to simultaneously learn visual classifiers for nouns, prepositions, and comparative adjectives.
The task is to determine the correspondence between image regions and semantic object classes. Problem: there are significant ambiguities in the correspondence between visual features and object classes.
Instead of using only the co-occurrence of nouns and image features over large image databases to determine the correspondence, additional language constructs are considered, namely "prepositions" and "comparative adjectives". The paper simultaneously learns the visual features defining "nouns" and the differential visual features defining "binary relationships" using an EM approach.
This is not applicable to binary relationships if models for the nouns are not given. Related work has used spatial relationships between image patches for scene recognition: a feature-mining approach extracts discriminative image patches, and the relationships between them are interpreted as adjectives or prepositions. The authors mined relationships among more than two image patches as well. An SVM is trained on the mined data, with different types of adjectives and prepositions encoded; the encoding is based on an image representation of multi-scale local patches and the spatial pyramid representation. SIFT descriptors represent each appearance patch. First, visual code words are recognized in an image, and then relationships are extracted using the Apriori mining algorithm. Other related work introduces an approach to jointly learn detectors for object classes and attributes (color and texture) based on a co-training algorithm. Object-to-attribute is a one-way association here, i.e. a red table or a metallic table, but not both. Here, too, the image is divided into a number of windows, and joint multiple-instance learning forces the learners for both the object class and the attribute class to cooperate in labeling windows that must contain both the object and the attribute. The method focuses on windows that are salient and homogeneous to select candidate windows. In most cases the object-detection average precision is better than with the separate-learning approach, and moreover an attribute-object combination not in the training set can also be detected by combining visual-attribute and object detectors learned from the other categories.
Visual features are based on appearance and shape. Initialization is with random assignments.
Word sense disambiguation is not taken into account.
Aij refers to the subset of the set of all possible assignments for an image in which noun i is assigned to region j.
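Assuming a one-to-one assignment of nouns to distinct regions, the subset Aij can be enumerated directly, as in this toy sketch (the paper's assignment space may differ):

```python
from itertools import permutations

def subset_Aij(n_nouns, n_regions, i, j):
    """Enumerate Aij: all assignments (noun index -> region index)
    in which noun i is mapped to region j.  Assignments are taken to
    be one-to-one, i.e. distinct nouns go to distinct regions."""
    return [a for a in permutations(range(n_regions), n_nouns) if a[i] == j]
```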
For a Gaussian classifier we estimate the mean and variance. Initialization is random; the authors use the result of Barnard's translation-based model, though any image annotation approach with localization would work. After learning the maximum-likelihood parameters, the relationship classifier and the assignment are used to find possible relationships between all pairs of words. From these generated relationship annotations a co-occurrence table is formed, which is used to compute P(r | ns, np).
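The relationship prior could be estimated from the generated annotations as normalized co-occurrence counts, roughly as in this sketch (the triple representation is an assumption):

```python
from collections import Counter

def relationship_prior(annotations):
    """Sketch: estimate P(r | ns, np) from generated relationship
    annotations, given as (relation, subject_noun, object_noun)
    triples, by normalizing co-occurrence counts per noun pair."""
    pair_counts = Counter((ns, np_) for _, ns, np_ in annotations)
    triple_counts = Counter(annotations)
    return {
        (r, ns, np_): c / pair_counts[(ns, np_)]
        for (r, ns, np_), c in triple_counts.items()
    }
```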
For each region, we have two nodes corresponding to the noun and the image features from that region. For all possible pairs of regions, we have another two nodes representing a relationship word and differential features from that pair of regions. An example of a Bayesian network with 3 regions: the rjk represent the possible words for the relationship between regions (j, k). Due to the non-symmetric nature of relationships, we consider both (j, k) and (k, j) pairs (in the figure only one is shown). The magenta blocks in the image represent differential features (Ijk).
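The node layout described above can be sketched as follows (node naming is illustrative):

```python
from itertools import permutations

def network_nodes(n_regions):
    """Sketch of the Bayesian network's node layout: one (noun, image
    features) node pair per region, plus one (relationship word,
    differential features) node pair per ordered region pair (j, k),
    since relationships are not symmetric."""
    region_nodes = ([("noun", j) for j in range(n_regions)]
                    + [("image_features", j) for j in range(n_regions)])
    pair_nodes = ([("relationship", j, k)
                   for j, k in permutations(range(n_regions), 2)]
                  + [("diff_features", j, k)
                     for j, k in permutations(range(n_regions), 2)])
    return region_nodes, pair_nodes
```

With 3 regions this yields 6 per-region nodes and, over the 6 ordered region pairs, 12 pair nodes.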
The relationship model is based on differential features. The parameter-learning M-step therefore also involves feature selection for the relationship classifiers.
The first measure counts the number of words that are labeled properly by the algorithm; each word has equal importance regardless of the frequency with which it occurs. In the second case, a word which occurs more frequently is given higher importance. Using the first measure, both algorithms have similar performance because each correctly labels one word. However, using the second measure the latter algorithm is better, as "sky" is more common and hence the number of correctly identified regions is higher for the latter algorithm. A co-occurrence-based translation model (IBM Model 1) and a translation-based model with mixing probabilities (Duygulu et al.) form the baseline algorithms.
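The two evaluation measures can be sketched as follows (names are illustrative):

```python
def evaluation_measures(correct_words, word_freq):
    """Sketch of the two evaluation measures.

    correct_words : set of words the algorithm labels correctly
    word_freq     : dict word -> number of occurrences in the corpus

    The first measure counts each correctly labeled word once; the
    second weights each word by its frequency, so getting a common
    word like 'sky' right contributes more.
    """
    range_correct = len(correct_words)
    frequency_correct = sum(word_freq[w] for w in correct_words)
    return range_correct, frequency_correct
```

Two algorithms that each label one word correctly tie on the first measure, but the one that gets the more frequent word wins on the second, as in the sky-versus-rarer-word example above.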