'Meaning is its use' - Towards the use of distributional semantics for content-based recommender systems
1. SASWeb 2012 - Workshop on Social, Semantic and Adaptive Web
Montréal (Canada), 16.07.2012
‘Meaning is its use’: towards the use
of distributional semantics for
content-based recommender systems
Cataldo Musto, Ph.D.
University of Bari Aldo Moro (Italy) - cataldo.musto@uniba.it
2. “meaning
is its use”
L.Wittgenstein
(Austrian philosopher)
Cataldo Musto, ‘Meaning is its use’: towards the use of distributional semantics for content-based recommender systems. SASWeb Workshop, UMAP 2012, 16.07.12
3. semantics:
study of meaning
Greek: σημαντικός
4. semantics plays a key role for
most of the adaptive systems.
5. adaptive systems can benefit
from semantic representation
of the information.
6. example.
7. Recommender Systems
Relevant items (movies, news, books, etc.) are pushed to the
user according to her preferences or her needs.
8. content-based recommenders
Suggest items similar to those liked in the past by the user
9. scenario.
book recommendation
10. content-based recommenders
book recommendation: key concepts
• Each book has to be described through a set of
textual features
• e.g., title of the book, summary, etc.
• Each user is described through textual features as well
• Recommendations are provided by calculating
the overlap between the textual description of the
book and the features stored in the user profile
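A minimal sketch of the overlap idea above, assuming a plain keyword representation and a Jaccard score. The `tokenize` and `overlap` helpers and the sample books are hypothetical illustrations, not taken from the talk.

```python
def tokenize(text):
    """Lowercase a text and split it into a set of word features."""
    return {w.strip(".,;:!?\"'()") for w in text.lower().split()} - {""}


def overlap(item_text, profile_features):
    """Jaccard overlap between an item's textual features and a user profile."""
    item_features = tokenize(item_text)
    union = item_features | profile_features
    if not union:
        return 0.0
    return len(item_features & profile_features) / len(union)


# Hypothetical data: a profile built from books the user liked in the past.
profile = tokenize("detective murder mystery london")
books = {
    "mystery": "A detective investigates a murder in London",
    "recipes": "Quick pasta recipes for busy cooks",
}
scores = {title: overlap(text, profile) for title, text in books.items()}
best = max(scores, key=scores.get)
```

With these toy texts the mystery book shares three keywords with the profile and the recipe book shares none, so only exact string matches count. That limitation is what the following slides address.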
11. “I love turkey.
I will choose it for my holidays”
?
12. “what book will the
user be interested in?”
vs.
13. how can we boost
content-based recommender systems
with semantics ?
14. alternative representation
distributional models
(Firth, 1957)
Firth, J.R. A synopsis of linguistic theory
1930-1955. In Studies in Linguistic Analysis,
pp. 1-32, 1957.
15. distributional models
insight
by analyzing a large corpus of textual data it is possible
to infer information about the usage (and thus the meaning)
of the terms.
16. distributional models
insight
by analyzing a large corpus of textual data it is possible
to infer information about the usage (and thus the meaning)
of the terms.
example
17. distributional models
insight
by analyzing a large corpus of textual data it is possible
to infer information about the usage (and thus the meaning)
of the terms.
distributional hypothesis
“words that share similar contexts
(usages) share similar meanings”
18. distributional models
• Key: definition of what is the
‘context’
• Different granularities
are possible
• Document
• Paragraph
• Sentence
• Sliding window of words
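As an illustration of the finest granularity listed above, the sketch below collects sliding-window contexts for each term. The window size, the `window_contexts` helper and the two-sentence corpus are assumptions chosen for the example.

```python
from collections import Counter, defaultdict


def window_contexts(tokens, size=2):
    """Count, for each term, the words seen within `size` positions of it."""
    contexts = defaultdict(Counter)
    for i, term in enumerate(tokens):
        lo, hi = max(0, i - size), min(len(tokens), i + size + 1)
        for j in range(lo, hi):
            if j != i:
                contexts[term][tokens[j]] += 1
    return contexts


# Hypothetical two-sentence corpus, already tokenized.
corpus = [
    "i drink a glass of beer".split(),
    "i drink a glass of wine".split(),
]
counts = defaultdict(Counter)
for sentence in corpus:
    for term, ctx in window_contexts(sentence, size=2).items():
        counts[term].update(ctx)
```

In this toy corpus "beer" and "wine" end up with identical context counts, which is the kind of evidence the distributional hypothesis relies on.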
19. distributional models
term/context matrix (WordSpace)
c1 c2 c3 c4 c5 c6 c7 c8 c9
t1 ✔ ✔ ✔ ✔
t2 ✔ ✔ ✔ ✔
t3 ✔ ✔ ✔
t4 ✔ ✔ ✔ ✔
20. WordSpace
example
beer
wine
glass
spoon
21. distributional models
beer vs. glass: good overlap
c1 c2 c3 c4 c5 c6 c7 c8 c9
t1 ✔ ✔ ✔ ✔
t2 ✔ ✔ ✔ ✔
t3 ✔ ✔ ✔
t4 ✔ ✔ ✔ ✔
22. distributional models
beer vs. spoon: no overlap
c1 c2 c3 c4 c5 c6 c7 c8 c9
t1 ✔ ✔ ✔ ✔
t2 ✔ ✔ ✔ ✔
t3 ✔ ✔ ✔
t4 ✔ ✔ ✔ ✔
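The "good overlap" versus "no overlap" comparison can be sketched with cosine similarity over rows of a term/context matrix. The column positions of the check marks are not recoverable from the slides, so the rows below are hypothetical values picked only to reproduce the two cases.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical rows of a term/context matrix over contexts c1..c9
# (1 = the term occurs in that context).
word_space = {
    "beer":  [1, 0, 1, 0, 1, 0, 1, 0, 0],
    "glass": [1, 0, 1, 0, 1, 0, 0, 0, 1],
    "spoon": [0, 1, 0, 1, 0, 1, 0, 0, 0],
}
beer_glass = cosine(word_space["beer"], word_space["glass"])
beer_spoon = cosine(word_space["beer"], word_space["spoon"])
```

Here "beer" and "glass" share three of their four contexts, while "beer" and "spoon" share none.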
23. representation of documents can be inferred
by combining the representation of the terms
occurring in the document.
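One simple way to combine term representations is a component-wise sum, sketched below. The talk does not state which combination is used, so the sum and the toy term vectors are assumptions.

```python
def add_vectors(vectors, size):
    """Component-wise sum of a list of vectors of length `size`."""
    total = [0] * size
    for vec in vectors:
        for i, value in enumerate(vec):
            total[i] += value
    return total


# Hypothetical term vectors over contexts c1..c9.
term_vectors = {
    "beer":  [1, 0, 1, 0, 1, 0, 1, 0, 0],
    "glass": [1, 0, 1, 0, 1, 0, 0, 0, 1],
}


def document_vector(tokens, term_vectors, size=9):
    """Represent a document as the sum of the vectors of its known terms."""
    known = [term_vectors[t] for t in tokens if t in term_vectors]
    return add_vectors(known, size)


d1 = document_vector("a glass of beer".split(), term_vectors)
```

Terms without a vector ("a", "of") are simply skipped, so the document lands in the same space as the terms it contains.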
24. distributional models
term/context matrix (DocSpace)
c1 c2 c3 c4 c5 c6 c7 c8 c9
t2 ✔ ✔ ✔ ✔
t3 ✔ ✔ ✔
d1 ✔ ✔ ✔ ✔ ✔
25. distributional models
similarity between documents (DocSpace)
c1 c2 c3 c4 c5 c6 c7 c8 c9
d1 ✔ ✔ ✔ ✔ ✔
d2 ✔ ✔ ✔
d3 ✔ ✔
d4 ✔ ✔ ✔ ✔ ✔
26. distributional models
similarity between documents: good overlap
c1 c2 c3 c4 c5 c6 c7 c8 c9
d1 ✔ ✔ ✔ ✔ ✔
d2 ✔ ✔ ✔
d3 ✔ ✔
d4 ✔ ✔ ✔ ✔ ✔
27. distributional models
similarity between documents: no overlap
c1 c2 c3 c4 c5 c6 c7 c8 c9
d1 ✔ ✔ ✔ ✔ ✔
d2 ✔ ✔ ✔
d3 ✔ ✔
d4 ✔ ✔ ✔ ✔ ✔
28. distributional models
recap
models for representing terms/
documents in large vector spaces
light semantics
it is simple to calculate
similarities between words
and documents
29. strength:
representations based on
distributional models are
inherently multilingual.
30. distributional models
multilingual representation
• Assumption
• The distribution of the terms is (almost) language-
independent
drink / bere
beer / birra
glass / bicchiere
31. distributional models
multilingual representation
• Assumption
• The distribution of the terms is (almost) language-
independent
The position of the concept of beer in a WordSpace will
always be the same, regardless of the language!
32. (english) WordSpace
beer
wine
spoon
dog
33. (italian) WordSpace
relationships between terms stay the same
regardless of the language!
birra
vino
cucchiaio
cane
34. multilingual representation
comes at no cost.
Thanks to the distributional hypothesis.
35. distributional models
recap
models for representing terms/
documents in large vector spaces
light semantics
it is simple to calculate
similarities between words
and documents
representation is inherently
multilingual
36. how to combine
distributional models
with
content-based recommender systems?
37. a novel recommendation framework based on VSM
eVSM
enhanced Vector Space Model
38. eVSM building blocks
distributional models.
39. eVSM representation
mystery book
poetry book
recipe book
recipe book
40. user profile
how to represent it?
• In eVSM each item is represented as a vector
• The user profile needs a vector space representation as well
• How?
• For example, by combining vectors of the items (documents)
the user liked in the past
41. user profile
Items Rating Threshold
VSM representation of RI-based profile for user u
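The profile formula on this slide did not survive extraction; only the labels "Items", "Rating" and "Threshold" remain. One plausible reading, assumed here and not confirmed by the source, is that the profile is the sum of the vectors of the items the user rated above a threshold. The `build_profile` helper and all data are hypothetical.

```python
def build_profile(item_vectors, ratings, threshold):
    """Sum the vectors of the items whose rating is above `threshold`."""
    size = len(next(iter(item_vectors.values())))
    profile = [0.0] * size
    for item, rating in ratings.items():
        if rating > threshold:
            for i, value in enumerate(item_vectors[item]):
                profile[i] += value
    return profile


# Hypothetical item vectors and ratings on a 1-5 scale.
item_vectors = {
    "mystery_1": [1.0, 0.0, 1.0],
    "mystery_2": [1.0, 1.0, 0.0],
    "recipes_1": [0.0, 0.0, 1.0],
}
ratings = {"mystery_1": 5, "mystery_2": 4, "recipes_1": 2}
profile = build_profile(item_vectors, ratings, threshold=3)
```

Only the two mystery books pass the threshold, so the profile vector points toward the region of the space where they sit.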
42. eVSM scenario
mystery book
user profile
poetry book
recipe book
recipe book
43. eVSM scenario
Recommendation task seen as similarity calculation between vectors
mystery book
user profile
poetry book
recipe book
recipe book
44. eVSM scenario
recommender system suggests mystery book.
mystery book
user profile
poetry book
recipe book
recipe book
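The scenario above, where the item closest to the user profile is suggested, can be sketched as a cosine ranking. The three-dimensional vectors and the `recommend` helper are hypothetical and chosen only so that the profile lies near the mystery book.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical item and profile vectors in the same space.
items = {
    "mystery book": [0.9, 0.1, 0.0],
    "poetry book": [0.1, 0.9, 0.1],
    "recipe book": [0.0, 0.1, 0.9],
}
profile = [0.8, 0.2, 0.1]


def recommend(profile, items, top_n=1):
    """Return the `top_n` item names most similar to the profile."""
    ranked = sorted(items, key=lambda name: cosine(profile, items[name]), reverse=True)
    return ranked[:top_n]


suggestion = recommend(profile, items)
```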
45. Why does a multilingual
representation matter?
46. Language issues
• VSM representation is language-dependent
• A user profile built in one language cannot be
exploited to recommend
items described in another language
47. eVSM
language-dependent recommendations
user profile content-based recommendations
48. Vector Space Model
multilingual scenario
d1 d2 d3 p
t1 basketball ✔ ✔
t2 italian ✔ ✔
t3 bargnani ✔ ✔
t4 pallacanestro ✔ ✔
t5 italiana ✔ ✔
(callout: italian document)
49. Vector Space Model
multilingual scenario
d1 d2 d3 p
t1 basketball ✔ ✔
t2 italian ✔ ✔
(callout: english documents)
t3 bargnani ✔ ✔
t4 pallacanestro ✔ ✔
t5 italiana ✔ ✔
50. Vector Space Model
multilingual scenario
d1 d2 d3 p
t1 basketball ✔ ✔
t2 italian ✔ ✔
t3 bargnani ✔ ✔
t4 pallacanestro ✔ ✔
t5 italiana ✔ ✔
51. eVSM
multilingual scenario
d1 d2 d3 p
t1 basketball ✔ ✔
t2 italian ✔ ✔
(callout: user interested in basketball, italian language)
t3 bargnani ✔ ✔
t4 pallacanestro ✔ ✔
t5 italiana ✔ ✔
52. eVSM
language-dependent recommendations
d1 d2 d3 p
t1 basketball ✔ ✔
t2 italian ✔ ✔
t3 bargnani ✔ ✔
t4 pallacanestro ✔ ✔
t5 italiana ✔ ✔
53. eVSM
language-dependent recommendations
d1 d2 d3 p
t1 basketball ✔ ✔
X
t2 italian ✔ ✔
X
t3 bargnani ✔ ✔
t4 pallacanestro ✔ ✔
t5 italiana ✔ ✔
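The mismatch marked on this slide can be reproduced with a plain keyword vector space: a profile expressed in English terms has no dimension in common with a document written only in Italian terms. The vocabulary reuses the five terms from the table, while the document contents are assumptions, since the column positions of the check marks are not recoverable.

```python
import math

vocabulary = ["basketball", "italian", "bargnani", "pallacanestro", "italiana"]


def bag_of_words(tokens):
    """Binary keyword vector over the fixed vocabulary."""
    return [1 if term in tokens else 0 for term in vocabulary]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical texts: an English profile, an English and an Italian document.
profile = bag_of_words({"basketball", "italian"})
doc_en = bag_of_words({"basketball", "italian", "bargnani"})
doc_it = bag_of_words({"pallacanestro", "italiana"})

sim_en = cosine(profile, doc_en)
sim_it = cosine(profile, doc_it)
```

The Italian document is about the same topic, yet its similarity to the profile is exactly zero because keyword matching never crosses the language boundary.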
54. eVSM overcomes this issue.
55. eVSM
providing suggestions - multilingual scenario
Parallel DocSpaces built upon the same set of content
DocSpace for L1 (italian), contexts c1 c2 c3 c4 c5 . . . ck: d1, d2, d3, d4, d5
DocSpace for L2 (english), contexts c1 c2 c3 c4 c5 . . . ck: d1, d2, d3, d4, d5
56. eVSM
providing suggestions - multilingual scenario
DocSpace for L1 (italian): italian football news, italian user profile, italian basketball news, italian politics news, italian politics news
DocSpace for L2 (english): english football news, english basketball news, english politics news, english politics news
57. eVSM
providing suggestions - multilingual scenario
Parallel DocSpaces built upon the same set of content
DocSpace for L1 (italian), contexts c1 c2 c3 c4 c5 . . . ck: d1, d2, d3, d4, p_L1 (user profile in L1)
DocSpace for L2, contexts c1 c2 c3 c4 c5 . . . ck: d1, d2, d3, d4, d5
58. eVSM
providing suggestions - multilingual scenario
Parallel DocSpaces built upon the same set of content
DocSpace for L1, contexts c1 c2 c3 c4 c5 . . . ck: d1, d2, d3, d4, p_L1
DocSpace for L2, contexts c1 c2 c3 c4 c5 . . . ck: d1, d2, d3, d4, p_L1
we can project the user profile into the
DocSpace of english items
59. eVSM
providing suggestions - multilingual scenario
DocSpace for L1 (italian): italian football news, italian user profile, italian basketball news, italian politics news, italian politics news
DocSpace for L2 (english): english football news, english basketball news, english politics news, english politics news
60. eVSM
providing suggestions - multilingual scenario
DocSpace for L1 (italian): italian football news, italian user profile, italian basketball news, italian politics news, italian politics news
DocSpace for L2 (english): english football news, italian user profile, english basketball news, english politics news, english politics news
through similarity calculations an english news item
about basketball is returned as a recommendation!
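A rough sketch of the cross-language step, assuming the two DocSpaces share the same context dimensions so that a profile built from Italian items can be compared directly with English item vectors. How the slides' parallel spaces are actually aligned is not described here, and every vector below is hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical parallel DocSpaces: same context dimensions c1..c3 in both.
docspace_it = {
    "italian football news":   [0.9, 0.1, 0.0],
    "italian basketball news": [0.1, 0.9, 0.0],
    "italian politics news":   [0.0, 0.1, 0.9],
}
docspace_en = {
    "english football news":   [0.8, 0.2, 0.0],
    "english basketball news": [0.2, 0.8, 0.1],
    "english politics news":   [0.1, 0.0, 0.9],
}

# Profile built in L1 from the Italian items the user liked.
liked = ["italian basketball news"]
profile_it = [sum(docspace_it[d][i] for d in liked) for i in range(3)]

# "Projection": reuse the same coordinates in the L2 space and rank its items.
best_en = max(docspace_en, key=lambda d: cosine(profile_it, docspace_en[d]))
```

Because the coordinates mean the same thing in both spaces under this assumption, no translation step is needed before ranking the English items.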
61. eVSM
multilanguage recommendations
italian
english
62. recap and contributions.
63. adaptive systems.
64. semantics.
65. recommender systems.
66. eVSM
67. richer representation based on
distributional models
68. framework for multilingual recommendations
69. experimental evaluation
applications
70. applications
‘in vitro’
experiments
Cataldo Musto - Enhanced Vector Space Models for Content-based Recommender Systems - Ph.D. defense - University of Bari Aldo Moro, Italy - 08.06.12
71. movie recommendation
‘in vitro’ experiments
• Goal: to provide users with recommendations about movies
worth watching.
• Subset of 100k MovieLens dataset + Wikipedia content
• Monolingual and Multilingual settings
72. experimental design
experiments
• Experiment
• How does the model perform with respect to other
state-of-the-art approaches?
• VSM - Vector Space Model
• LSI - Latent Semantic Indexing
• Bayes - Bayes Text Classifier
73. experiment
MovieLens dataset
[Bar chart: P@1, P@3, P@5 and P@10 for eVSM, VSM, LSI and Bayes; y-axis from 84 to 87]
Gap always around 1%
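The chart reports precision at several cut-offs (P@1, P@3, P@5, P@10). A minimal sketch of that metric follows; the ranking and the relevance judgments are hypothetical and unrelated to the MovieLens results.

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the first k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    return hits / k


# Hypothetical ranking and relevance judgments for one user.
ranking = ["m1", "m2", "m3", "m4", "m5"]
relevant = {"m1", "m3", "m4"}
p_at_1 = precision_at_k(ranking, relevant, 1)
p_at_3 = precision_at_k(ranking, relevant, 3)
p_at_5 = precision_at_k(ranking, relevant, 5)
```

In an evaluation like the one on this slide, such per-user values would typically be averaged over all test users for each cut-off.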
74. experiment
MovieLens dataset
[Bar chart: P@1, P@3, P@5 and P@10 for eVSM, VSM, LSI and Bayes; y-axis from 84 to 87]
Significant Improvement
75. applications
‘in vivo’
experiments
76. Play.me
personalized music playlists
• Goal
• To provide users with personalized music playlists
• Methodology
• Extraction of explicit user preferences from Facebook
• Playlist creation by enriching explicit user preferences with similar artists.
• Comparison of two enrichment algorithms
• DBPedia-based enrichment
• Distributional models-based enrichment
77. Play.me
architecture
78. Play.me
architecture
79. Extractor
insight
Social media provide an unlimited, trustworthy
and continuously updated flow of information about
user interests and needs.
80. Extractor
insight
Social Media are a cheap and effective way
to overcome cold start.
81. Myusic
data extraction from Facebook
82. Myusic
data extraction from Facebook
explicit preferences
83. Myusic
data extraction from Facebook
implicit preferences
84. Play.me
architecture
85. Play.fm
enrichment
• Given a set of explicit preferences extracted from
Facebook
• Play.me enriches this set by calculating artists similar
to those the user explicitly likes
86. Play.fm
enrichment example
Coldplay extracted from Facebook
enrichment
radiohead red hot chili peppers kings of leon
87. Play.fm
enrichment
• Comparison of two approaches
• Linked Data
• Distributional Models
88. Linked Open Data Cloud
Structured
(RDF)
representation
of the information
stored in Wikipedia.
89. Play.fm
enrichment based on Distributional Models
• Distributional Models
• Each artist is represented through a set of
tags
• Each artist is represented as a point in a
distributional DocSpace
• Similarity calculations to extract the most
similar artists.
90. Play.fm
enrichment based on Distributional Models
Coldplay
Radiohead
Kings of Leon
Lady Gaga
91. Play.fm
enrichment based on Distributional Models
input: vector space representation
output: artists with the highest cosine similarity
radiohead the killers kings of leon
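The enrichment step, with artists described by tags and the nearest ones selected by cosine similarity, might be sketched as follows. The tag sets are invented for illustration and the `most_similar` helper is hypothetical; the real system's tag source and vector construction are not shown in the slides.

```python
import math
from collections import Counter


def cosine_counts(a, b):
    """Cosine similarity between two sparse tag-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical tag sets for a few artists.
artist_tags = {
    "Coldplay": Counter(["rock", "alternative", "britpop", "pop"]),
    "Radiohead": Counter(["rock", "alternative", "britpop", "electronic"]),
    "Kings of Leon": Counter(["rock", "alternative", "indie"]),
    "Lady Gaga": Counter(["pop", "dance", "electronic"]),
}


def most_similar(seed, artist_tags, n=2):
    """Return the n artists whose tag vectors are closest to the seed artist."""
    others = [a for a in artist_tags if a != seed]
    others.sort(key=lambda a: cosine_counts(artist_tags[seed], artist_tags[a]), reverse=True)
    return others[:n]


enrichment = most_similar("Coldplay", artist_tags)
```

The parameter `n` plays the role of the number of artists added per extracted artist.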
92. Play.me
architecture
93. Play.me
playlist
Most popular songs of the artists extracted from Facebook (as well
as those added through the enrichment) are proposed to the user.
94. experimental design
• Experiment
• Which enrichment technique
provides users with the best playlists?
• User study with 30 users.
95. experimental design
results
[Bar chart: Linked Data, Distributional Models and Baseline (Popularity) at m=1, m=2 and m=3; y-axis from 55 to 80]
m = number of artists added for each extracted artist
96. conclusions.
97. eVSM outperforms
state-of-the-art approaches
98. semantic representation based
on distributional models
effectively tackles the language issues of CBRS
99. end.
100. questions?