Defining spatial representations for the meaning of sounds
1. Vrije Universiteit Amsterdam
Faculty of Sciences / Computer Sciences / Benjamin Timmermans
By Benjamin Timmermans
In collaboration with Emiel van Miltenburg
“This is the sound of a piece of A4 paper being cut with scissors. The paper was wedged
between the capsules of a Rode NT-4 stereo microphone and so was in contact with
them.”
Author tags:
Scissors Stereo Cutting Foley Paper
AMBIGUOUS SOUND
Experts describe different things than what people are
interested in
PROBLEMS
WHAT IS SEARCHED FOR
• Top 5000 search queries of freesound.org
• Representing 5.6 million searches over 5 months
HYPOTHESIS:
THE CROWD CAN PROVIDE BETTER PERCEPTUAL
DESCRIPTIONS THAN EXPERTS
• 2,148 sounds from freesound.org
• 19,098 tags by experts
• 2,008 unique terms
• Avg. 8.9 terms per sound
DATASET
• CrowdTruth + CrowdFlower
• US/UK/AUS/CA crowd workers
• 10 annotations for each sound
• 3 sounds in one task
• $0.02 reward
EXPERIMENTAL SETUP
Three versions:
• 38 x 3 sounds < 30 sec.
• 300 x 3 sounds
   • short: < 1 sec.
   • medium: 5 to 6 sec.
   • long: 17 to 21 sec.
• 378 x 3 sounds < 1 sec.
DATASET
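As a quick consistency check, the three versions can be added up. This is only a sketch: it assumes "N x 3 sounds" means N tasks of three sounds each, and the dictionary labels are illustrative names, not terms from the slides.

```python
# Assumption: "N x 3 sounds" is read as N tasks containing 3 sounds each.
SOUNDS_PER_TASK = 3
ANNOTATIONS_PER_SOUND = 10  # "10 annotations for each sound"

# Illustrative labels for the three versions of the experiment.
tasks_per_version = {
    "under 30 sec": 38,
    "short/medium/long": 300,
    "under 1 sec": 378,
}

total_tasks = sum(tasks_per_version.values())
total_sounds = total_tasks * SOUNDS_PER_TASK
total_annotations = total_sounds * ANNOTATIONS_PER_SOUND

print(total_tasks, total_sounds, total_annotations)
```

Under that reading, the three versions together cover exactly the number of sounds reported for the dataset.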
• No phonetic words
• No sentences
• Correct spelling
• Comma separated tags
• Unlimited number of tags
TASK INSTRUCTIONS
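A minimal sketch of how a worker's comma-separated answer could be split into tags while filtering out sentence-like entries. The function name `parse_tags` and the three-word cutoff are assumptions made for illustration; the slides do not say how "no sentences" was enforced, and spelling or phonetic words are not checked here.

```python
# Assumption: an entry with more than this many words is treated as a sentence.
MAX_WORDS_PER_TAG = 3


def parse_tags(raw: str) -> list[str]:
    """Split a comma-separated answer into tags.

    Empty entries and sentence-like entries are dropped; there is no
    upper limit on the number of tags returned.
    """
    tags = []
    for part in raw.split(","):
        tag = " ".join(part.split())  # trim and collapse whitespace
        if not tag:
            continue  # skip empty entries such as ", ,"
        if len(tag.split()) > MAX_WORDS_PER_TAG:
            continue  # looks like a sentence rather than a tag
        tags.append(tag)
    return tags


example = parse_tags("dog,  barking , , this is a dog that barks loudly")
```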
• Remove spammers using CrowdTruth metrics
• Clean up non-alphanumeric characters
   • Loud! -> loud
• Spell correction
   • Barkking -> barking
• Cluster on morphology, substrings, word order
   • Walk, walks, walking -> walk
   • Dog, Fat dog -> Fat dog
   • Running dog = Dog running
POST-PROCESSING
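The cleaning and clustering steps above can be sketched roughly as follows. This is not the pipeline used in the study: spammer removal and spell correction are left out, the suffix stripping is a deliberately naive stand-in for a real stemmer, and all function names are hypothetical.

```python
import re


def clean(tag: str) -> str:
    """Lowercase and drop non-alphanumeric characters: 'Loud!' -> 'loud'."""
    tag = re.sub(r"[^a-z0-9 ]+", "", tag.lower())
    return " ".join(tag.split())


def stem(word: str) -> str:
    """Naive suffix stripping (assumption; a real stemmer would do better)."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def key(tag: str) -> tuple[str, ...]:
    """Stemmed, order-independent key, so 'Running dog' matches 'Dog running'."""
    return tuple(sorted(stem(word) for word in clean(tag).split()))


def cluster(tags: list[str]) -> dict[tuple[str, ...], list[str]]:
    """Group tags by key, then fold a group into its single more specific match."""
    groups: dict[tuple[str, ...], list[str]] = {}
    for tag in tags:
        k = key(tag)
        if k:  # ignore tags that are empty after cleaning
            groups.setdefault(k, []).append(tag)
    # 'Dog' is merged into 'Fat dog' when exactly one longer group contains its words.
    for k in sorted(groups, key=len):
        supersets = [other for other in groups if set(k) < set(other)]
        if len(supersets) == 1:
            groups[supersets[0]].extend(groups.pop(k))
    return groups


clusters = cluster(["Walk", "walks", "walking", "Dog", "Fat dog", "Loud!"])
```

When a short tag fits more than one longer tag, this sketch leaves it unmerged rather than guessing which cluster it belongs to.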
• Expert sound tags often do not describe what is heard
• Crowd can efficiently generate useful sound tags
• Crowd provided better perceptual descriptions
• Perceptual descriptions improve search
CONCLUSION
• Investigate the type of content described with sound tags
   • Specificity
   • Subjectivity
• Further investigate influence of sound length
• Improve search for sounds
FUTURE WORK