Italian Information Retrieval 2013 - Workshop (http://iir2013.isti.cnr.it) - Distributional Models vs. Linked Data: leveraging crowdsourcing to personalize music playlists
1. IIR 2013 - 4th Italian Information Retrieval Workshop
Pisa (Italy), 17.01.2013
Cataldo Musto, Fedelucio Narducci, Giovanni Semeraro, Pasquale Lops, Marco de Gemmis
Distributional models vs. Linked Data:
exploiting crowdsourcing to
personalize music playlists
2. exponential growth
of the available music
C. Musto, F. Narducci, G. Semeraro, P. Lops, M. de Gemmis.
Distributional models vs. Linked Data: exploiting crowdsourcing to personalize music playlists - IIR 2013 - 17.01.13
3. Some stats
28,000,000 songs available on iTunes Store (*)
around 31,000 hours of music
a typical user spends 1.5 hours per day listening to music
=
56 years
to listen to the whole iTunes Library
(*) http://www.digitalmusicnews.com/permalink/2012/120425itunes
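The 56-year figure follows from the two numbers quoted above. A quick check, assuming 365 listening days per year (the variable names are only illustrative):

```python
# Figures quoted on the slide
total_hours = 31_000      # approximate hours of music
hours_per_day = 1.5       # typical daily listening time

# Assumption: the user listens every day, 365 days per year
days_needed = total_hours / hours_per_day
years_needed = days_needed / 365

print(f"{years_needed:.1f} years")  # 56.6 years
```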
4. Information Overload
5. what music should I listen to?
6. solution
personalization.
7. solution
personalized music playlists
8. Is this something new?
No.
9. Amazon.com
Recommendations
10. Genius @iTunes
Recommendations
11. Recommendations
12. All the state of the art
platforms share an
important drawback.
13. training is a bottleneck.
14. need for
explicit
information
about
user interests.
15. social media
provide information about user preferences
16. example
user preferences in music from Facebook
17. Our contribution
Play.me
18. Play.me
personalized music playlists
• Goal
• To provide users with personalized music playlists
• Insights
• Extraction of explicit user preferences from Facebook
• Playlist creation by enriching explicit user preferences.
• New artists are added to those explicitly extracted from
Facebook
• Comparison of two enrichment techniques
19. Play.me
architecture
20. Play.me
architecture
21. Play.me
pre-processing
• Crawling from Last.fm
• Public API
• Content-based features
• Name of the artist + Social tags
• Noise processing
• Information locally stored
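The slide does not say how noise is handled. As a rough sketch, assuming "noise processing" means normalizing tag spelling and dropping rarely used tags, the step could look like this (`clean_tags`, `min_count`, and the sample data are hypothetical, not taken from the Play.me code or from Last.fm):

```python
from collections import Counter

def clean_tags(raw_tags, min_count=2):
    """Normalize raw social tags and drop rare ones.

    raw_tags: iterable of (tag, count) pairs crawled for one artist.
    min_count: hypothetical threshold; tags used less often count as noise.
    """
    merged = Counter()
    for tag, count in raw_tags:
        # Lowercase, treat hyphens as spaces, collapse repeated whitespace
        normalized = " ".join(tag.lower().replace("-", " ").split())
        if normalized:
            merged[normalized] += count
    return {tag: count for tag, count in merged.items() if count >= min_count}

# Hypothetical crawl result for one artist (not real Last.fm data)
raw = [("Post-Rock", 90), ("post rock", 40), ("  Ambient ", 75),
       ("seen live", 1), ("", 5)]

# Locally stored profile: artist name plus cleaned tags
profile = {"artist": "Sigur Rós", "tags": clean_tags(raw)}
print(profile)
```

Spelling variants of the same tag are merged before the threshold is applied, so "Post-Rock" and "post rock" count as one feature.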
22. Play.me
pre-processing
Sigur Rós tag cloud from Last.fm
23. Play.me
architecture
24. Play.me
data extraction from Facebook
25. Play.me
data extraction from Facebook
explicit preferences
26. Play.me
data extraction from Facebook
implicit preferences
27. Play.me
architecture
28. Play.me
enrichment
• Rationale
• Given a set of explicit preferences extracted from
Facebook
• Play.me enriches this set
• Extraction of artists similar to those the user
explicitly likes
29. Play.me
enrichment example
Coldplay extracted from Facebook
enrichment
radiohead red hot chili peppers kings of leon
30. Play.me
architecture
31. Play.me
playlist
Most popular songs of the artists extracted from Last.fm (as well as
those added through the enrichment) are proposed to the user.
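A minimal sketch of this playlist step, assuming the top tracks of each artist are already available locally as a ranked list. The function name, the `tracks_per_artist` parameter, and the placeholder track names are hypothetical:

```python
def build_playlist(liked_artists, added_artists, top_tracks, tracks_per_artist=2):
    """Collect the most popular tracks of liked and enriched artists."""
    playlist = []
    seen = set()
    for artist in list(liked_artists) + list(added_artists):
        if artist in seen:      # an artist may be both liked and added
            continue
        seen.add(artist)
        # top_tracks[artist] is assumed to be sorted by popularity
        for track in top_tracks.get(artist, [])[:tracks_per_artist]:
            playlist.append((artist, track))
    return playlist

# Placeholder data, not real popularity rankings
top_tracks = {
    "coldplay": ["track C1", "track C2", "track C3"],
    "radiohead": ["track R1", "track R2"],
    "kings of leon": ["track K1"],
}

playlist = build_playlist(["coldplay"], ["radiohead", "kings of leon"], top_tracks)
print(playlist)
```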
32. let’s go
deeper
33. Play.me
enrichment
• Comparison of two approaches
• Content-based strategy
• Distributional Models
• Linked Data
34. Play.me
enrichment based on Distributional Models
• Content-based strategy
• Each artist is modeled through a set of tags
• Each artist is represented as a point in a
semantic geometrical space
• Distributional Models
• Similarity calculations to extract the most
similar artists.
35. distributional models
“meaning
is its use”
L.Wittgenstein
(Austrian philosopher)
36. distributional models
insight
by analyzing a large corpus of textual data, it is possible
to infer information about the usage (and thus the meaning)
of the terms.
example
37. distributional models
term/context matrix (WordSpace)
c1 c2 c3 c4 c5 c6 c7 c8 c9
t1 ✔ ✔ ✔ ✔
t2 ✔ ✔ ✔ ✔
t3 ✔ ✔ ✔
t4 ✔ ✔ ✔ ✔
38. distributional models
beer vs. glass: good overlap
c1 c2 c3 c4 c5 c6 c7 c8 c9
t1 ✔ ✔ ✔ ✔
t2 ✔ ✔ ✔ ✔
t3 ✔ ✔ ✔
t4 ✔ ✔ ✔ ✔
39. distributional models
beer vs. spoon: no overlap
c1 c2 c3 c4 c5 c6 c7 c8 c9
t1 ✔ ✔ ✔ ✔
t2 ✔ ✔ ✔ ✔
t3 ✔ ✔ ✔
t4 ✔ ✔ ✔ ✔
40. distributional models
rock vs. post rock = good overlap
c1 c2 c3 c4 c5 c6
rock ✔ ✔ ✔
post rock ✔ ✔
jazz ✔
classical ✔ ✔ ✔
41. distributional models
rock vs. classical = no overlap
c1 c2 c3 c4 c5 c6
rock ✔ ✔ ✔
post rock ✔ ✔
jazz ✔
classical ✔ ✔ ✔
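The overlap idea can be made concrete with a small sketch. The context assignments below are illustrative only (they are not the exact cells ticked on the slide), and Jaccard overlap is used here as one simple measure, not necessarily the one used in Play.me:

```python
# Assumption: which contexts each tag occurs in is made up for illustration
contexts = {
    "rock":      {"c1", "c2", "c3"},
    "post rock": {"c2", "c3"},
    "jazz":      {"c4"},
    "classical": {"c4", "c5", "c6"},
}

def overlap(term_a, term_b):
    """Jaccard overlap between the context sets of two terms."""
    a, b = contexts[term_a], contexts[term_b]
    return len(a & b) / len(a | b)

print(overlap("rock", "post rock"))   # shared contexts: good overlap
print(overlap("rock", "classical"))   # no shared context: 0.0
```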
42. representation of documents (*)
can be inferred by combining the representation of
the terms (**) occurring in the document.
(*) documents = artists
(**) terms = tags
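One possible way to combine tag vectors into an artist vector is a weighted sum. The slides do not state the combination operator, so the sum, the function name, and the toy vectors below are assumptions:

```python
def artist_vector(tags, tag_vectors):
    """Sum the context vectors of an artist's tags, weighted by tag count."""
    dims = len(next(iter(tag_vectors.values())))
    vec = [0.0] * dims
    for tag, weight in tags.items():
        # Tags without a known vector contribute nothing
        for i, value in enumerate(tag_vectors.get(tag, [0.0] * dims)):
            vec[i] += weight * value
    return vec

# Toy term/context vectors (1 = the tag occurs in that context)
tag_vectors = {
    "rock":      [1, 1, 1, 0, 0, 0],
    "post rock": [0, 1, 1, 0, 0, 0],
    "classical": [0, 0, 0, 1, 1, 1],
}

# Hypothetical artist tagged once with "rock" and once with "post rock"
d1 = artist_vector({"rock": 1, "post rock": 1}, tag_vectors)
print(d1)  # [1.0, 2.0, 2.0, 0.0, 0.0, 0.0]
```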
43. distributional models
term/context matrix (DocSpace)
c1 c2 c3 c4 c5 c6 c7 c8 c9
t2 ✔ ✔ ✔ ✔
t3 ✔ ✔ ✔
d1 ✔ ✔ ✔ ✔ ✔
44. Play.me
enrichment based on Distributional Models
Coldplay
Radiohead
Kings of Leon
Lady Gaga
45. Play.me
enrichment based on Distributional Models
input: vector space representation
output: artists with the highest cosine similarity
radiohead the killers kings of leon
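The ranking step can be sketched as follows. The artist vectors are made-up numbers chosen for illustration, not values from the Last.fm crawl, and `most_similar` is a hypothetical helper name:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def most_similar(target, vectors, n=3):
    """Return the n artists whose vectors are closest to the target's."""
    scores = [(name, cosine(vectors[target], vec))
              for name, vec in vectors.items() if name != target]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:n]]

# Hypothetical artist vectors in a 4-dimensional space
vectors = {
    "coldplay":      [3, 2, 0, 1],
    "radiohead":     [3, 2, 1, 1],
    "the killers":   [2, 2, 0, 0],
    "kings of leon": [2, 1, 0, 2],
    "lady gaga":     [0, 0, 4, 1],
}

print(most_similar("coldplay", vectors, n=3))
# ['radiohead', 'the killers', 'kings of leon']
```

Here `n` plays the role of the number of artists added for each extracted artist.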
46. Linked Open Data Cloud
47. Linked Open Data Cloud
Structured
(RDF)
representation
of the information
stored in Wikipedia.
48. Play.me
enrichment based on Linked Data
Coldplay play Alternative Rock
49. Play.me
RDF triple
Relationships are explicitly encoded in RDF.
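As a toy illustration, a triple can be modeled as a (subject, predicate, object) tuple. The plain-string identifiers and the `match` helper are simplifications for this sketch; real RDF uses URIs and a proper triple store:

```python
# Each triple is (subject, predicate, object); identifiers are illustrative
triples = [
    ("Coldplay", "genre", "Alternative Rock"),
    ("Radiohead", "genre", "Alternative Rock"),
    ("Lady Gaga", "genre", "Pop"),
]

def match(pattern, store):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in store
            if all(p is None or p == v for p, v in zip(pattern, t))]

# Who plays Alternative Rock?
print(match((None, "genre", "Alternative Rock"), triples))
```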
50. Play.me
enrichment based on Linked Data
• Linked Open Data Cloud
• Each artist is mapped on a DBpedia node.
• unique URI
• Relationships between artists (nodes) are explicitly
encoded
• e.g. genre, artist category, etc.
• Use of SPARQL to extract artists (nodes) that
share the same features
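The slides do not show the actual query. A hedged sketch of what such a query might look like, assuming genre (the DBpedia ontology property `dbo:genre`) is the shared feature; `similar_artists_query` is a hypothetical helper that only builds the query text and does not contact any endpoint:

```python
def similar_artists_query(artist_uri, limit=3):
    """Build a SPARQL query for artists sharing a genre with artist_uri."""
    # Assumption: genre is the shared feature; other properties could be used
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?artist WHERE {{
  <{artist_uri}> dbo:genre ?genre .
  ?artist dbo:genre ?genre .
  FILTER (?artist != <{artist_uri}>)
}}
LIMIT {limit}
""".strip()

query = similar_artists_query("http://dbpedia.org/resource/Coldplay")
print(query)
```

The resulting string could then be sent to a SPARQL endpoint with any SPARQL client.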
51. Play.me
enrichment based on Linked Data
input: SPARQL query
output: artists sharing the same properties
radiohead the smiths the verve
52. recap
enrichment process
input: artist output: similar artists
coldplay the smiths
Linked Data
radiohead
the verve
kings of leon
Distributional Models
radiohead
53. experimental
evaluation.
54. experimental design
• Experiment
• Which enrichment technique provides users
with the best playlists?
55. experimental design
settings
• 30 users
• Heterogeneous musical knowledge
• Last.fm crawl: 228,878 artists
• Extraction & Recommendation step
• 325 artists extracted
• 11 per user, on average
56. experimental setup
Given a playlist, each user can freely express her own
feedback (like/dislike) on the proposed tracks.
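The results slides refer to precision. Assuming precision here means the share of liked tracks among all rated tracks in a playlist (the slides do not give the formula), it could be computed as in this sketch with hypothetical feedback:

```python
def precision(feedback):
    """Percentage of 'like' votes among all votes for one playlist."""
    if not feedback:
        return 0.0
    likes = sum(1 for vote in feedback if vote == "like")
    return 100.0 * likes / len(feedback)

# Hypothetical feedback from one user on a four-track playlist
feedback = ["like", "like", "dislike", "like"]
print(precision(feedback))  # 75.0
```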
57. experimental setup
Experiment repeated three times (one run with Linked Data enrichment, another
one with Distributional Models, one with a simple baseline based on popularity).
58. experimental setup
Users were unaware of the adopted configuration.
59. experimental design
results
[Chart: Linked Data vs. Distributional Models vs. Baseline (Popularity) for n=1, n=2, n=3; value labels 76.3, 75.2, 69.7, 65.9, 64.6, 63.2 and 58 (three times); axis ticks 55, 61.25, 67.5, 73.75, 80]
n = number of artists added for each extracted artist
60. experimental design
results
[Chart: Linked Data vs. Distributional Models vs. Baseline (Popularity) for n=1, n=2, n=3]
distributional models outperform linked data
61. experimental design
results
[Chart: Linked Data vs. Distributional Models vs. Baseline (Popularity) for n=1, n=2, n=3]
precision of distributional models drops more rapidly
62. experimental design
results
[Chart: Linked Data vs. Distributional Models vs. Baseline (Popularity) for n=1, n=2, n=3]
good results for the baseline as well (poor music knowledge?)
63. conclusions.
64. both enrichment techniques
outperform the baseline
65. distributional models
outperform linked data
66. future research.
67. merging different
enrichment techniques
68. evaluation with user-based metrics
(serendipity, novelty, unexpectedness)
69. modeling context.
70. questions?