Fashion 10000: An Enriched Dataset of Fashion and Clothing

Presentation of the Fashion 10000 data set for the ACM Multimedia Systems Conference 2014.

Published in: Science, Lifestyle, Business

  1. Fashion 10000: An Enriched Dataset of Fashion and Clothing
     Presentation: Michael Riegler, Klagenfurt University & TU Delft
     Babak Loni, TU Delft
     Lei Yen Cheung, TU Delft
     Alessandro Bozzon, TU Delft
     Luke Gottlieb, ICSI
     Martha Larson, TU Delft
  2. Table of Contents • Introduction • Dataset Collection • Dataset Annotation – Statistics • Applications of Dataset • Conclusion
  3. The Dataset • Social images • At least 10000 fashion-related images • Social metadata • Creative Commons images • Annotated with different labels
  4. The Collection
     Wikipedia: 470 fashion categories
     Flickr:
       - Query only CC attribution images
       - Query should also appear in tags
       - Top relevant images
     Result: 32K images, 262 categories
     Fashion 10000: Flickr images + MTurk annotations + metadata
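The three Flickr query constraints above could be expressed roughly as follows. This is only a sketch under assumptions, not the authors' collection script: it builds a request URL for the public `flickr.photos.search` method, and the function name `build_search_url`, the `per_page` default, and passing the category name unchanged as both free text and tag are illustrative choices.

```python
from urllib.parse import urlencode

FLICKR_REST = "https://api.flickr.com/services/rest/"
CC_ATTRIBUTION = "4"  # Flickr license id for the CC Attribution license


def build_search_url(category: str, api_key: str, per_page: int = 200) -> str:
    """Build a flickr.photos.search URL for one fashion category (hypothetical helper)."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "text": category,           # free-text query
        "tags": category,           # assumption: the query must also appear in the tags
        "license": CC_ATTRIBUTION,  # only CC attribution images
        "sort": "relevance",        # top relevant images first
        "per_page": per_page,       # assumption: cap on photos fetched per category
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST + "?" + urlencode(params)
```

Calling `build_search_url("cowboy hat", "YOUR_KEY")` returns a URL that can be fetched with any HTTP client; the key is a placeholder.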
  5. Metadata • Collected in XML and CSV format – Title, description, owner, tags, location, geo-parameters • Additional metadata: Info, Geos, Context, Tags, Notes, Favorites, Urls, Comments
  6. General Statistics
     Pairs (fashion item, photo): 32,398
     Number of distinct fashion categories: 262
     Max / avg / min number of photos per fashion item: 200 / 122.95 / 10
     Number of photos with geo annotations: 7,933
     Total number of comments: 58,578
     Max / avg / min number of comments per photo: 575 / 7.35 / 1
     Total number of (tag, photo) pairs: 460,907
     Total number of distinct tags: 56,275
     Max / avg / min number of tags per photo: 136 / 15.15 / 1
     Total number of (note, photo) pairs: 5,892
     Max / avg / min number of notes per photo: 195 / 5.31 / 1
     Total number of favorites: 37,131
     Max / avg / min number of favorites per photo: 20 / 3.61 / 1
     Total number of contexts: 110,505
     Max / avg / min number of contexts per photo: 206 / 3.93 / 1
  10. Dataset Annotation • Some images might not be relevant to fashion and clothing • The ground truth differentiates relevant from non-relevant images
  11. Dataset Annotation • We used AMT to create ground truth for the images • The fashion category is described with a definition from Wikipedia • 6 questions to create 6 labels for each of the images • We also ask about the workers' familiarity with the fashion category
  12. HIT Design
  13. HIT Design
  14. HIT Questions (Labels)
      Question: possible values
      Q1) Fashion / clothing related: yes, no, notsure
      Q2) Specialty clothing item (image category): yes, no, notsure
      Q3) Number of people: nopeople, onepeople, manypeople
      Q4) Professional model or not?: yes, no, notapp (not applicable)
      Q5) Person wearing fashion?: yes, no, noperson, notapp (not applicable)
      Q6) Formal / informal: formalmen, formalwomen, informalmen, informalwomen, other (cross-dressing or multiple persons), notapp (not applicable)
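The six questions and their value sets lend themselves to a small validation table. The sketch below is illustrative only: the value strings come from the slide, while the `Q1` to `Q6` keys, the `ALLOWED` name, and the `invalid_answers` helper are assumptions about how one might store a worker's assignment.

```python
# Allowed answers per question, as listed on the slide.
ALLOWED = {
    "Q1": {"yes", "no", "notsure"},
    "Q2": {"yes", "no", "notsure"},
    "Q3": {"nopeople", "onepeople", "manypeople"},
    "Q4": {"yes", "no", "notapp"},
    "Q5": {"yes", "no", "noperson", "notapp"},
    "Q6": {"formalmen", "formalwomen", "informalmen",
           "informalwomen", "other", "notapp"},
}


def invalid_answers(assignment: dict) -> list:
    """Return the question ids whose answer is missing or not an allowed value."""
    return [q for q, allowed in ALLOWED.items()
            if assignment.get(q) not in allowed]
```

An assignment that answers every question with a listed value yields an empty list; anything else names the offending questions, which is one simple basis for rejecting an assignment.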
  15. Annotation Statistics
      Total number of assignments: 24,457
      Rejected assignments: 4 %
      Total number of unique workers: 1470
      Avg. number of assignments per worker: 17
      Avg. completion time: 127 sec
      Avg. familiarity of workers with fashion items: 5.8 (range 1-7)
      Kappa value per question: Q1 0.66, Q2 0.65, Q3 0.85, Q4 0.51, Q5 0.38, Q6 0.48
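The slide does not say which kappa variant produced these values. As a hedged illustration, the sketch below computes Fleiss' kappa, one common agreement measure when several workers label each item; it assumes every item received the same number of ratings, and the function name and input layout are illustrative.

```python
from collections import Counter


def fleiss_kappa(ratings: list) -> float:
    """Fleiss' kappa; ratings[i] is the list of labels given to item i."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    counts = [Counter(r) for r in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    p_items = [
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_bar = sum(p_items) / n_items

    # Expected agreement from the overall label proportions.
    totals = Counter()
    for cnt in counts:
        totals.update(cnt)
    p_exp = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())

    if p_exp == 1.0:  # only one label was ever used
        return 1.0
    return (p_bar - p_exp) / (1.0 - p_exp)
```

Under this measure, the high value for Q3 (number of people) and the low value for Q5 (person wearing fashion) would indicate that workers agreed far more on the former than on the latter.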
  16. Dataset Statistics • Using the generated ground truth, the statistics about the images were calculated
      Number of fashion-related images: 18,487
      Number of images with many people: 7,417
      Number of images with one person: 9,771
      Number of images with no person: 13,179
      Number of images with intention of showing fashion: 9,096
      Number of professional fashion images: 2,814
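The slides do not describe how the worker answers were merged into one ground-truth label per image. One plausible approach, assumed here purely for illustration, is a strict majority vote per question followed by a count of the resulting labels; the names `majority_label` and `count_labels` are hypothetical.

```python
from collections import Counter
from typing import Optional


def majority_label(votes: list) -> Optional[str]:
    """Return the label chosen by more than half of the workers, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


def count_labels(votes_per_image: dict) -> Counter:
    """Count images per aggregated label; None collects images with no majority."""
    return Counter(majority_label(v) for v in votes_per_image.values())
```

Applied to the Q1 answers of all images, `count_labels` would give the number of fashion-related images as the count for `"yes"`, under the majority-vote assumption.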
  17. Applications of the Dataset • Developing social media content analysis – Game with a purpose (domino game) • Basis for the Brave New Task in the MediaEval multimedia benchmarking initiative • Use case for the proof of intentional framing
  18. Conclusion • Fashion dataset • Six different labels • AMT-generated ground truth • Can be used in various research areas • Evaluated in the MediaEval Benchmark
  19. Michael Riegler m.a.riegler@tudelft.nl Thank you!