14. http://lora-aroyo.org @laroyo
in the (very near) future
most visitors will be digital-born
not bound by time or location
native to new forms of co-makership
native to new media
Siebe Weide, Max Meijer and Marieke Krabshuis (2012).
Agenda 2026: Study on the Future of the Dutch Museum Sector
22.
the expertise of Rijksmuseum professionals lies
in annotating their collection
with art-historical information, e.g. when works
were created, by whom, etc.
26.
Engage with Games
training the general crowd to be a niche:
a game in which players can carry out expert
annotation tasks with some assistance
36.
Challenges
Crowdsourcing typically undertaken in isolation
Low reproducibility rates
Difficult to estimate & control the time to complete
Difficult to assess & compare quality
Demands continuous promotional effort
Active learning (human-in-the-loop) needs different expertise
Difficult to incorporate results into existing content infrastructure
40.
Goal:
Which mood is most appropriate
for each song?
Choose one:
Cluster 1: passionate, rousing, confident, boisterous, rowdy
Cluster 2: rollicking, cheerful, fun, sweet, amiable, good-natured
Cluster 3: literate, poignant, wistful, bittersweet, autumnal, brooding
Cluster 4: humorous, silly, campy, quirky, whimsical, witty, wry
Cluster 5: aggressive, fiery, tense, anxious, intense, volatile, visceral
Other: does not fit into any of the 5 clusters
(Lee and Hu 2012)
1 song - 1 mood???
43. Web & Media Group
this all results in
simplification of context
45.
Assess Impact of Results
● Identify Crowdsourcing Goals through user log analysis
○ # queries, # unique queries, # queries of specific type
○ ranked by popularity
○ ranked by popularity and with error, e.g.
■ # queries entered over 50 times with 0 results
■ # queries of specific type with 0 results
○ which will have biggest impact
○ which has biggest urgency
● … or through other user analysis
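The log-analysis counts on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log format (a list of `(query, result_count)` pairs), the function name `summarize_query_log`, and the sample queries are all hypothetical and do not come from the talk.

```python
from collections import Counter

def summarize_query_log(log, min_count=50):
    """Summarize a search log given as (query, result_count) pairs.

    The log format is an assumption made for this sketch.
    """
    counts = Counter()     # how often each normalized query was entered
    zero_hits = Counter()  # how often each query returned 0 results
    for query, n_results in log:
        q = query.strip().lower()
        counts[q] += 1
        if n_results == 0:
            zero_hits[q] += 1
    total = sum(counts.values())
    return {
        "total_queries": total,
        "unique_queries": len(counts),
        # all queries, ranked by popularity
        "by_popularity": counts.most_common(),
        # queries entered more than min_count times with 0 results
        "frequent_zero_result": [
            (q, c) for q, c in zero_hits.most_common() if c > min_count
        ],
        # share of all submitted queries that found nothing
        "zero_result_share": sum(zero_hits.values()) / total if total else 0.0,
    }

# Made-up sample log; a low threshold keeps the example small.
log = [("bernhard", 3)] * 2 + [("Engeland 1953", 0)] * 3 + [("molen", 0)]
stats = summarize_query_log(log, min_count=2)
print(stats["frequent_zero_result"])
```

Queries that are both popular and consistently empty are natural candidates for a crowdsourcing goal, since filling those gaps affects the most users.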
47.
for example
in video search
people search for fragments
experts annotate full videos
35% of search queries return no results
49.
Measure Quality
“On the Role of User-Generated Metadata in A/V Collections”, Riste Gligorov et al. KCAP2011
time-based annotation
bernhard
88% of the tags useful
for specific genres
describe short segments
often not very specific
don’t describe program as a whole
50.
for example
in video search
video annotation is time-consuming
5 times the video duration
experts use a specific vocabulary
that is unknown to general audiences
51.
Measure Quality
“On the Role of User-Generated Metadata in A/V Collections”, Riste Gligorov et al. KCAP2011
user vocabulary
8% in professional vocabulary
23% in Dutch lexicon
89% found on Google
locations (7%), e.g. engeland
persons (31%)
objects (57%)
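Figures such as "8% in professional vocabulary" are coverage ratios: the share of user tags that also occur in a reference vocabulary. A minimal sketch of that calculation follows. The function name `coverage`, the choice to count distinct, case-folded tags, and the sample data are assumptions made for illustration; the cited study may have counted differently.

```python
def coverage(user_tags, vocabulary):
    """Fraction of distinct user tags that also occur in `vocabulary`.

    Tags are compared case-insensitively (an assumption of this sketch).
    """
    distinct = {t.strip().lower() for t in user_tags if t.strip()}
    if not distinct:
        return 0.0
    vocab = {v.strip().lower() for v in vocabulary}
    return len(distinct & vocab) / len(distinct)

# Made-up tags and a made-up reference vocabulary, not data from the study.
user_tags = ["engeland", "bernhard", "Fiets", "fiets", "molen"]
professional_vocab = ["Engeland", "koningshuis"]
print(f"{coverage(user_tags, professional_vocab):.0%}")
```

Running the same function against several reference sets (professional thesaurus, general lexicon) gives comparable figures for how far user vocabulary diverges from expert vocabulary.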
52.
human subjectivity, ambiguity & uncertainty of expression
natural part of human semantics
53.
measure quality
quality is not just about spam
quality is typically multi-dimensional
understand the diversity in crowd answers
do not ignore multitude of interpretations
understand the variety of contexts
identify cases with high ambiguity, similarity, …
experiment with explicit metrics
experiment with different designs
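One possible explicit metric for spotting high-ambiguity cases is the entropy of the crowd's answers per item. The sketch below is an illustration only, not the metric used in the talk; the function name `ambiguity`, the six-option setup, and the sample votes are assumptions.

```python
import math
from collections import Counter

def ambiguity(answers, n_options):
    """Shannon entropy of the answers for one item, scaled to [0, 1].

    0.0 means every worker gave the same answer; 1.0 means the answers
    are spread evenly over all `n_options` possible choices.
    """
    if not answers or n_options < 2:
        return 0.0
    n = len(answers)
    entropy = sum(
        (c / n) * math.log2(n / c) for c in Counter(answers).values()
    )
    return entropy / math.log2(n_options)

# Made-up votes for two songs over six choices (five clusters plus "Other").
votes = {
    "song_a": ["Cluster 1"] * 6,
    "song_b": ["Cluster 3", "Cluster 5"] * 3,
}
# Most ambiguous items first.
ranked = sorted(votes, key=lambda s: ambiguity(votes[s], 6), reverse=True)
print(ranked)
```

Items with a high score are not necessarily low-quality: the disagreement may reflect several valid interpretations, which is a reason to inspect them rather than discard them.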
54.
Measure Progress
6 months 2 years
340,551 tags 36,981 tags
137,421 matches
602 items 1,782 items
555 registered players 2,017 users (taggers)
thousands of anonymous players
12,279 visits (3+ min online)
44,362 pageviews
Riste Gligorov, Michiel Hildebrand, Jacco van Ossenbruggen, Guus Schreiber, Lora Aroyo
(2011). On the role of user-generated metadata in audio visual collections. International
Conference on Knowledge Capture (K-CAP '11), pages 145-152.