Knowledge Graph Embeddings for Recommender Systems

PhD Candidate
Enrico Palumbo (Links Foundation, EURECOM, Politecnico di Torino)
Referees
Prof. Paolo Cremonesi, Politecnico di Milano
Dr. Cataldo Musto, Università degli Studi di Bari “Aldo Moro”
Examiners
Prof. Alejandro Bellogin, Universidad Autonóma de Madrid
Prof. Silvia Chiusano, Politecnico di Torino
Prof. Paolo Garza, Politecnico di Torino
27/04/2020
Advisors
Prof. Elena Baralis, Politecnico di Torino
Dr. Giuseppe Rizzo, Links Foundation
Prof. Raphaël Troncy, EURECOM

RECOMMENDER SYSTEMS SEMANTICS
2

KNOWLEDGE GRAPH
EMBEDDINGS FOR
RECOMMENDER SYSTEMS
3

entity matching
for knowledge
graph generation
KNOWLEDGE GRAPH
EMBEDDINGS FOR
RECOMMENDER SYSTEMS
4

TOURIST PATH
RECOMMENDATION
entity matching
for knowledge
graph generation
KNOWLEDGE GRAPH
EMBEDDINGS FOR
RECOMMENDER SYSTEMS
5

MORE IS LESS: THE PARADOX OF CHOICE
“But as the number of choices keeps growing, negative
aspects of having a multitude of options begin to appear. As
the number of choices grows further, the negatives escalate
until we become overloaded. At this point, choice no longer
liberates, but debilitates. It might even be said to tyrannize.”
Barry Schwartz, The Paradox of Choice: Why More Is Less
7

RECOMMENDER SYSTEMS
“Ads are shifting toward not just digitization but also personalization [...]. Already, 35 percent of what consumers purchase on Amazon and 75 percent of what they watch on Netflix come from product recommendations.”¹
“We think the combined effect of personalization and recommendations save us more than $1B per year.”²
¹ McKinsey: https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers
² Gomez-Uribe, Carlos A., and Neil Hunt. "The Netflix recommender system: Algorithms, business value, and innovation." ACM Transactions on Management Information Systems (TMIS) 6.4 (2015): 1-19.
8

ITEM RECOMMENDATION
[Figure: a user and a list of items ranked by the ranking function ρ(u, i), with example scores 0.8, 0.6, 0.5, 0.2]
9

CONTENT-BASED FILTERING
Content similarities with items liked in the past
[Figure: user Jane and the entities Kill_Bill_Vol.2 and Samuel_Jackson, connected by “starring” edges; example scores ρ(u, i) = 0.8 and 0.0]
State-of-the-art
10

COLLABORATIVE FILTERING
Collaborative: users who watch x also like y...
[Figure: users Jane and Mark and the movies Kill_Bill_Vol.2 and Taxi_Driver; example scores ρ(u, i) = 0.8 and 0.0]
State-of-the-art
11

BEYOND CONTENT-BASED AND COLLABORATIVE...
Content-based: TF-IDF [1], word embeddings [2], knowledge-aware [3]
Issues:
● requires item model
● over-specialization
Collaborative: UserKNN [4], ItemKNN [5], Matrix Factorization [6], SLIM [7]
Issues:
● requires feedback data
● new items
● explainability
Hybrid: FM [8], S-SLIMs [9], knowledge-aware [10]
Issues:
● best of both worlds!
State-of-the-art
12

KNOWLEDGE GRAPH
● K = (E, R, O)
● E = entities
● O = ontology
● Γ = types of relations
● R ⊂ E × Γ × E = typed relations
● Enables data integration and linking
● Fundamental concept of Linked Open Data and the Semantic Web
13

KNOWLEDGE GRAPH EMBEDDINGS
● Learning feature vectors for entities and relations for downstream prediction tasks. Map the knowledge graph into a vector space.
● Translational models (TransE [11], TransH [12], TransR [13]), semantic matching models (RESCAL [14], DistMult [15]), random-walk models (RDF2Vec [16])
● Common applications [17]:
  ● Link prediction
  ● Entity relatedness and linking
  ● Entity matching and resolution
  ● Recommendations!
State-of-the-art
14

RQ1) How can knowledge graph embeddings be used to create
hybrid, accurate, non-obvious and semantics-aware
recommendations?
15

Knowledge-aware recommender systems
[Figure: knowledge graph with users (Jane, Mark), movies (Kill Bill Vol.2, Taxi Driver, Star Wars Ep.1, The Avengers) and item attributes (Samuel Jackson, Quentin Tarantino), connected by “feedback”, “starring” and “director” edges]
Knowledge Graph (KG):
● Users, items, item attributes, user attributes = entities
● User-item interactions: “feedback” relation. Collaborative filtering.
● Item content: “director”, “starring”, “music composer”, …, as relations. Content-based filtering.
collaborative + content = hybrid recommender
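As a minimal sketch of the idea above (not the thesis code), a hybrid knowledge graph can be stored as a set of (head, relation, tail) triples in which "feedback" edges carry the collaborative signal and the other relations carry item content. The entity names are taken from the slide figure; the helper names `entities`, `relation_types` and `neighbours` are hypothetical.

```python
from collections import defaultdict

# Toy typed relations, written as (head, relation, tail) triples.
# "feedback" edges are user-item interactions; the rest is item content.
triples = {
    ("Jane", "feedback", "Kill_Bill_Vol.2"),
    ("Mark", "feedback", "Kill_Bill_Vol.2"),
    ("Mark", "feedback", "Taxi_Driver"),
    ("Kill_Bill_Vol.2", "starring", "Samuel_Jackson"),
    ("Kill_Bill_Vol.2", "director", "Quentin_Tarantino"),
}


def entities(triples):
    """Every node that appears as head or tail of a triple."""
    return {h for h, _, _ in triples} | {t for _, _, t in triples}


def relation_types(triples):
    """The set of relation labels used in the graph."""
    return {r for _, r, _ in triples}


def neighbours(triples):
    """Undirected adjacency sets, ignoring relation labels."""
    adj = defaultdict(set)
    for h, _, t in triples:
        adj[h].add(t)
        adj[t].add(h)
    return adj
```

Users, items and item attributes all live in the same entity set, which is what lets a single embedding model see collaborative and content information together.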
17

objective: user-item relatedness
[Figure: the same knowledge graph, with the user-item relatedness ρ(u,i) to be estimated]
18

graph structure
[Figure: the same knowledge graph showing only nodes and edges: Kill Bill Vol.2, Samuel Jackson, Jane, Mark, Quentin Tarantino, Taxi Driver, Star Wars Ep.1, The Avengers]
19

node2vec [18]
[Figure: the knowledge graph (Kill Bill Vol.2, Samuel Jackson, Jane, Mark, Quentin Tarantino, Taxi Driver, Star Wars Ep.1, The Avengers)]
Random walks generate a graph “text”:
● Mark, Kill_Bill_Vol.2, Quentin_Tarantino, ...
● Jane, Kill_Bill_Vol.2, Samuel_Jackson, …
● Kill_Bill_Vol.2, Samuel_Jackson, The_Avengers, Jane, …
● Samuel_Jackson, The_Avengers, Jane, …
● …
Graph “text” → word2vec → ρ(u,i)
Palumbo E., Rizzo G., Troncy R., Baralis E., Osella M., Ferro E. (2018) Knowledge Graph Embeddings with node2vec for Item Recommendation. In: Gangemi A. et al. (eds) The Semantic Web: ESWC 2018 Satellite Events. ESWC 2018. Lecture Notes in Computer Science, vol 11155. Springer, Cham
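The walk-generation step can be illustrated with a small sketch. It assumes uniform (unbiased) random walks over an undirected adjacency map, which is a simplification of node2vec; the function name `random_walks` and its parameters are hypothetical. The resulting node sequences play the role of sentences and could then be passed to any word2vec implementation.

```python
import random


def random_walks(adj, walks_per_node=2, walk_length=4, seed=0):
    """Generate fixed-length uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(adj):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                nbrs = sorted(adj[walk[-1]])
                if not nbrs:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks


# Toy graph using entity names from the slides.
adj = {
    "Jane": {"Kill_Bill_Vol.2"},
    "Mark": {"Kill_Bill_Vol.2", "Taxi_Driver"},
    "Kill_Bill_Vol.2": {"Jane", "Mark", "Samuel_Jackson"},
    "Taxi_Driver": {"Mark"},
    "Samuel_Jackson": {"Kill_Bill_Vol.2"},
}
walks = random_walks(adj)
```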
20

SEMANTICS
[Figure: the same knowledge graph, now with its typed edges: “starring”, “director”, “feedback”]
21

ENTITY2REC
Pipeline: KG → property-specific subgraphs → node2vec on each subgraph → ρ_p(u,i) → aggregation → ρ(u,i) → top-N items
Graphs: property-specific subgraphs (PSS)
Features: property-specific relatedness scores ρ_p(u,i)
Ranking function: global relatedness score ρ(u,i) = f(ρ_p(u,i))
22

ENTITY2REC (2017): COLLABORATIVE-CONTENT SUBGRAPHS
[Figure, two panels. Feedback subgraph: users Jane and Mark connected to movies (e.g. The Avengers) by “feedback” edges. Starring subgraph: movies (Kill Bill Vol.2, Taxi Driver, Star Wars Ep.1, The Avengers, Jackie Brown) connected to actors (Samuel Jackson, Robert De Niro, Jodie Foster) by “starring” edges.]
Palumbo, Enrico, Giuseppe Rizzo, and Raphaël Troncy. "Entity2rec: Learning user-item relatedness from knowledge graphs for top-n item recommendation." Proceedings of the eleventh ACM conference on recommender systems. 2017.
23

ENTITY2REC (2017): COLLABORATIVE-CONTENT SUBGRAPHS
[Figure: director subgraph. Titles (Kill Bill Vol.2, Taxi Driver, The Avengers, Jackie Brown, Pulp Fiction, The Departed, Wolf of Wall Street, Buffy, Angel) connected to directors (Quentin Tarantino, Joss Whedon, Martin Scorsese) by “director” edges.]
24

ENTITY2REC: HYBRID SUBGRAPHS
[Figure, two panels. feedback_starring subgraph: users Jane and Mark, movies (Kill Bill Vol.2, Star Wars Ep.1, The Avengers, Jackie Brown) and actors (Samuel Jackson, Robert De Niro, Jodie Foster), connected by “feedback” and “starring” edges. feedback_director subgraph: users Jane and Mark, movies (Kill Bill Vol.2, Taxi Driver, The Avengers) and directors (Quentin Tarantino, Joss Whedon, Martin Scorsese), connected by “feedback” and “director” edges.]
Palumbo, Enrico, et al. "entity2rec: Property-specific Knowledge Graph Embeddings for Item Recommendation." Expert Systems with Applications (2020): 113235.
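The construction of hybrid subgraphs can be sketched as follows, assuming the knowledge graph is given as (head, relation, tail) triples: each content property gets its own edge list, and the "feedback" edges are added to every one of them. The function name `hybrid_subgraphs` is hypothetical; the subgraph naming (`feedback_starring`, `feedback_director`) follows the slides.

```python
from collections import defaultdict


def hybrid_subgraphs(triples, feedback="feedback"):
    """Return one edge list per content property, each including feedback edges."""
    feedback_edges = [(h, t) for h, r, t in triples if r == feedback]
    by_property = defaultdict(list)
    for h, r, t in triples:
        if r != feedback:
            by_property[r].append((h, t))
    return {
        f"{feedback}_{prop}": feedback_edges + edges
        for prop, edges in by_property.items()
    }


# Toy triples using entity names from the slides.
triples = [
    ("Jane", "feedback", "Kill_Bill_Vol.2"),
    ("Mark", "feedback", "Taxi_Driver"),
    ("Kill_Bill_Vol.2", "starring", "Samuel_Jackson"),
    ("Kill_Bill_Vol.2", "director", "Quentin_Tarantino"),
    ("Taxi_Driver", "director", "Martin_Scorsese"),
]
subgraphs = hybrid_subgraphs(triples)
```

Each edge list would then be embedded separately, so that every property yields its own relatedness score.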
25

AGGREGATION
ρ(u,i) = f(ρ_p(u,i)), where ρ(u,i) = user-item relatedness = ranking function
➢ Learning to rank: supervised (LambdaMART, AdaRank)
➢ Average: average of the property-specific relatedness scores
➢ Min: minimum of the property-specific relatedness scores
➢ Max: maximum of the property-specific relatedness scores
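The unsupervised aggregation functions are simple enough to sketch directly. The example below assumes the property-specific scores are already available as a dictionary per candidate item; the score values and the names `global_relatedness` and `top_n` are made up for illustration.

```python
from statistics import mean

AGGREGATORS = {"avg": mean, "min": min, "max": max}


def global_relatedness(property_scores, how="avg"):
    """Aggregate property-specific scores into one global score."""
    return AGGREGATORS[how](list(property_scores.values()))


def top_n(candidates, n, how="avg"):
    """Rank candidate items by decreasing global relatedness and keep the first n."""
    ranked = sorted(
        candidates,
        key=lambda item: global_relatedness(candidates[item], how),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical property-specific scores for one user and two candidate items.
scores = {
    "Taxi_Driver": {"feedback_starring": 0.2, "feedback_director": 0.6},
    "The_Avengers": {"feedback_starring": 0.9, "feedback_director": 0.1},
}
```

With these toy numbers the average and the maximum both favour The_Avengers, while the minimum favours Taxi_Driver, which shows how the choice of f changes the ranking.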
26

TRANSLATIONAL MODELS FOR ITEM RECOMMENDATION
[Figure: user u translated by the “feedback” vector, with candidate items i1, i2, i3]
TransE: ρ(u,i) = -D(u + feedback, i)
TransH: ρ(u,i) = -D(u_⊥ + d_feedback, i_⊥)
TransR: ρ(u,i) = -D(u_feedback + feedback, i_feedback)
Palumbo, Enrico, et al. "Translational Models for Item Recommendation." The Semantic Web: ESWC 2018 Satellite Events, Heraklion, Crete, Greece, June 3-7, 2018, Revised Selected Papers 11155 (2018): 478.
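To make the ranking functions concrete, here is a small sketch of the TransE and TransH scores with D taken as the Euclidean distance. The embeddings are toy vectors rather than trained ones; the hyperplane projection v_⊥ = v - (w·v)w with a unit normal w follows the TransH paper [12]; all function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance D between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def transe_score(u, feedback, i):
    """TransE ranking function: negative distance between u + feedback and i."""
    translated = [x + r for x, r in zip(u, feedback)]
    return -euclidean(translated, i)


def project_on_hyperplane(v, w):
    """Project v on the hyperplane with unit normal w: v - (w.v) w."""
    dot = sum(a * b for a, b in zip(w, v))
    return [a - dot * b for a, b in zip(v, w)]


def transh_score(u, d_feedback, w_feedback, i):
    """TransH ranking function: TransE applied to the projected vectors."""
    u_perp = project_on_hyperplane(u, w_feedback)
    i_perp = project_on_hyperplane(i, w_feedback)
    return transe_score(u_perp, d_feedback, i_perp)
```

An item that sits exactly at u + feedback gets the maximum score of 0, and the score decreases as the item moves away from that point.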
27

DATASETS
Dataset        Item type       Feedback   Users   Items   Sparsity   Entropy
Movielens 1M   movies          ratings    6040    3226    95.1%      7.17
LastFM         music artists   listened   1865    9765    99.6%      7.77
LibraryThing   books           ratings    6789    9926    99.4%      8.26
29

DATASETS: DBPEDIA MAPPING
Mappings: https://github.com/sisinflab/LODrecsys-datasets
31

EVALUATION: CANDIDATES
Candidate items: all items unrated by u [19]
[Figure: a user and a list of items ranked by the ranking function ρ(u, i), with example scores 0.8, 0.6, 0.5, 0.2]
32

EVALUATION: METRICS
[Figure: ranked list with scores ρ(u, i) = 0.8, 0.6, 0.5, 0.2 and relevance labels Y = 1, 0, 0, 1]
P@N_u = fraction of the top N items that are relevant
R@N_u = fraction of all the relevant items for the user that appear in the top N items
SER@N_u = fraction of the top N items that are relevant and non-obvious
NOV@N_u = average of -log p(i) over the top N items
Each metric is then averaged over all users.
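The per-user metrics can be sketched as below. Several details are assumptions rather than definitions from the slides: the set of "obvious" items is passed in explicitly (for instance, the most popular items), p(i) is taken to be the fraction of users who interacted with item i, and the logarithm is base 2. All function names are hypothetical.

```python
import math


def precision_at_n(ranked, relevant, n):
    """Fraction of the top n items that are relevant."""
    return sum(1 for i in ranked[:n] if i in relevant) / n


def recall_at_n(ranked, relevant, n):
    """Fraction of the user's relevant items that appear in the top n."""
    if not relevant:
        return 0.0
    return sum(1 for i in ranked[:n] if i in relevant) / len(relevant)


def serendipity_at_n(ranked, relevant, obvious, n):
    """Fraction of the top n items that are relevant and not in the obvious set."""
    return sum(1 for i in ranked[:n] if i in relevant and i not in obvious) / n


def novelty_at_n(ranked, popularity, n):
    """Average of -log2 p(i) over the top n items (log base is an assumption)."""
    top = ranked[:n]
    return sum(-math.log2(popularity[i]) for i in top) / len(top)
```

For example, with the ranking a, b, c, d, relevant items {a, d} and N = 2, precision and recall are both 0.5; if a is considered obvious, serendipity at 2 drops to 0.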
33

EXPERIMENT 1:
ENTITY2REC (2017) VS ENTITY2REC
collaborative/content subgraphs vs hybrid subgraphs
34

MOVIELENS 1M
System                     P@5      R@5      SER@5    NOV@5
entity2rec_lambda          0.2125   0.0967   0.1913   9.654
entity2rec_avg             0.2372   0.1045   0.2125   9.577
entity2rec_min             0.2198   0.0976   0.1946   9.466
entity2rec_max             0.2206   0.0951   0.2038   10.046
entity2rec_lambda (2017)   0.1836   0.0748   0.1640   9.948
entity2rec_avg (2017)      0.0578   0.0234   0.0523   11.085
entity2rec_min (2017)      0.0166   0.009    0.0166   11.541
entity2rec_max (2017)      0.0099   0.0023   0.0095   12.129

LAST FM
System                     P@5      R@5      SER@5    NOV@5
entity2rec_lambda          0.1852   0.1066   0.1512   10.101
entity2rec_avg             0.2062   0.1191   0.1682   10.379
entity2rec_min             0.2055   0.1191   0.1664   9.807
entity2rec_max             0.1693   0.0986   0.1423   10.243
entity2rec_lambda (2017)   0.1469   0.0844   0.1194   11.092
entity2rec_avg (2017)      0.0597   0.0351   0.0574   13.143
entity2rec_min (2017)      0.0002   0.0001   0.0002   13.090
entity2rec_max (2017)      0.1387   0.0801   0.1063   11.426

LIBRARY THING
System                     P@5      R@5      SER@5    NOV@5
entity2rec_lambda          0.1271   0.0803   0.1229   12.469
entity2rec_avg             0.1800   0.1072   0.1736   12.886
entity2rec_min             0.1831   0.1084   0.1757   11.709
entity2rec_max             0.1634   0.0984   0.1591   12.783
entity2rec_lambda (2017)   0.1322   0.0746   0.1285   13.000
entity2rec_avg (2017)      0.0720   0.0495   0.0719   13.481
entity2rec_min (2017)      0.0060   0.0027   0.0060   13.396
entity2rec_max (2017)      0.0319   0.0250   0.0316   14.549
● entity2rec, i.e. hybrid subgraphs, performs better than entity2rec (2017), i.e. collaborative/content subgraphs, on all the datasets. Higher novelty is associated with much lower precision.
● Supervised learning to rank is fundamental for entity2rec (2017), but it is no longer beneficial for entity2rec: the best results are given by simple aggregation functions such as average or minimum.
35

EXPERIMENT 2:
ENTITY2REC VS SOTA
36

COMPARISON
● entity2rec: hybrid property-specific subgraphs (slide 19)
● node2vec: graph embedding algorithm applied on knowledge graph as a whole for
recommendations (slide 17)
● TransE, TransR, TransH: translational models for KG embeddings recommendations (slide 24)
● RankingFM [8]: hybrid non KG-based method, Factorization Machine with ranking
regularization. DBpedia data is used as item side information
● BPRMF [20]: Matrix Factorization optimized using Bayesian Personalized Ranking
● WRMF [21]: Matrix Factorization where weighting matrix is used to account for different
confidence levels in user-item feedback
● LeastSquareSLIM [9]: Sparse LInear Method optimized using least squares
● BPRSLIM [20]: Sparse LInear Method optimized using Bayesian Personalized Ranking
● ItemKNN [4]: item-based K-nearest neighbors recommender
● MostPop: non-personalized heuristic, recommends the N most popular items to all users
37

EXPERIMENT 3:
MODEL INTERPRETABILITY
39

[Figure: “starring” and “director” hybrid subgraphs, each producing a property-specific score: ρ_starring(u,i) and ρ_director(u,i)]
One feature = one property
ρ(u,i) = avg(ρ_p(u,i)): average of the property-specific scores
40

MOVIELENS 1M
Property P@5 R@5 SER@5 NOV@5
feedback_dbo:cinematography 0.1847 0.0813 0.1675 9.835
feedback_dbo:director 0.1913 0.0842 0.1741 9.859
feedback_dbo:distributor 0.1846 0.0805 0.1673 9.894
feedback_dbo:editing 0.1829 0.0810 0.1668 9.855
feedback_dbo:musicComposer 0.1861 0.0817 0.1691 9.891
feedback_dbo:producer 0.1777 0.0826 0.1603 10.349
feedback_dbo:starring 0.2113 0.0937 0.1965 9.957
feedback_dbo:writer 0.1808 0.0822 0.1652 10.393
feedback_dct:subject 0.2249 0.0958 0.2044 9.831
all 0.2372 0.1045 0.2125 9.577
feedback 0.1801 0.0814 0.1629 9.881
41
Row legend: each feedback_* row is ρ_p(u,i) computed on only one hybrid subgraph (feedback + content); “all” is entity2rec with all properties; “feedback” is entity2rec on the feedback graph only.
dct:subject (e.g. dbc:Palme_d'Or_winners) gets the best results; dbo:starring also gets good results.
No property alone is better than all properties combined. Almost all properties are better than feedback alone.

LAST FM
Property P@5 R@5 SER@5 NOV@5
feedback_dbo:associatedBand 0.1539 0.0894 0.1253 11.05
feedback_dbo:associatedMusicalArtist 0.1575 0.0915 0.1299 10.95
feedback_dbo:bandMember 0.1511 0.0873 0.1217 11.60
feedback_dbo:birthPlace 0.1612 0.0925 0.1287 11.08
feedback_dbo:formerBandMember 0.1580 0.0909 0.1274 11.48
feedback_dbo:genre 0.1801 0.1042 0.1466 10.33
feedback_dbo:hometown 0.1708 0.0979 0.1371 10.36
feedback_dbo:instrument 0.1601 0.0919 0.1270 11.25
feedback_dbo:occupation 0.1457 0.0844 0.1103 10.96
feedback_dbo:recordLabel 0.1856 0.1076 0.1532 10.42
all 0.2062 0.1191 0.1682 10.38
feedback 0.1542 0.0886 0.1198 11.51
dct:subject gets the best results; dbo:recordLabel and dbo:genre also get good results.
No property alone is better than all properties combined. Almost all properties are better than feedback alone.

LIBRARY THING
Property P@5 R@5 SER@5 NOV@5
feedback_dbo:author 0.1603 0.0972 0.1556 12.736
feedback_dbo:country 0.1629 0.0976 0.1572 12.360
feedback_dbo:coverArtist 0.1625 0.0973 0.1571 12.362
feedback_dbo:language 0.1619 0.0971 0.1558 12.350
feedback_dbo:literaryGenre 0.1633 0.0978 0.1583 12.353
feedback_dbo:mediaType 0.1610 0.0956 0.1559 12.411
feedback_dbo:previousWork 0.1688 0.1001 0.1630 12.523
feedback_dbo:publisher 0.1643 0.0979 0.1588 12.326
feedback_dbo:series 0.1635 0.0976 0.1581 12.460
feedback_dbo:subsequentWork 0.1687 0.1007 0.1634 12.520
all 0.1800 0.1072 0.1736 12.089
feedback 0.1632 0.0976 0.1578 12.409
dct:subject gets the best results; dbo:previousWork and dbo:subsequentWork also get good results.
No property alone is better than all properties combined. Almost all properties are better than feedback alone.
42

[Figure: a user asks “Suggest me a movie with my favorite actors.”; items are ranked by the property-specific score ρ_starring(u,i), with example scores 0.8, 0.6, 0.5, 0.2]
43

WHY DOES THE USER LIKE IT?
[Figure: the property-specific scores ρ_director(u,i) and ρ_starring(u,i) for a recommended item]
44

ONLINE EXPERIMENT: TINDERBOOK
● 2,210,000 new books are published every year
● It is hard to find a good book to read: most readers typically give up on a book in the early chapters
● entity2rec proved particularly effective on the LibraryThing dataset -> book recommendation
Problem: how to generate recommendations for new users with entity2rec? A user embedding cannot be generated at runtime.
Requirement: no login, recommendations given a single book that the user likes (“seed book”)
Palumbo, E., Buzio, A., Gaiardo, A., Rizzo, G., Troncy, R. and Baralis, E., 2019, June. Tinderbook: Fall in Love with Culture. In European Semantic Web Conference (pp. 590-605). Springer, Cham.
46

TINDERBOOK WORKFLOW
http://www.tinderbook.it
entity2rec
onboarding
47

ENTITY2REC FOR COLD START: ITEM-ITEM RELATEDNESS
Pipeline: KG → hybrid property-specific subgraphs → node2vec on each subgraph → ρ_p(i’,i) → aggregation → ρ(i’,i) → top-N items
Graphs: hybrid property-specific subgraphs
Features: item-item property-specific relatedness scores ρ_p(i’,i)
Ranking function: global item-item relatedness score ρ(i’,i) = f(ρ_p(i’,i)), with f = average
[Figure: hybrid subgraphs with “feedback”, “director” and “starring” edges, with the seed book highlighted in each]
48

item-item offline evaluation
System P@5 R@5 SER@5 NOV@5
entity2rec 0.0549 0.0508 0.0514 11.099
ItemKNN 0.0484 0.0472 0.0463 12.200
RDF2Vec 0.0315 0.0288 0.0311 13.913
TF-IDF DBpedia 0.0322 0.0283 0.0312 12.568
MostPop 0.0343 0.0256 0.007 8.4525
49

ONBOARDING: A/B TESTING
● How to present books in the onboarding phase?
Popularity-biased sampling with temperature:
p(book) ∝ popularity(book)^(1/T)
T < 1: “rich get richer” effect
T = 1: no effect
T > 1: more uniform sampling
● T = 0.3: more than 90% of the seed books are concentrated in the top 10% in terms of popularity
● T = 1: the popularity bias, although still strong, decreases
50
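The temperature-based sampling rule can be sketched in a few lines. The popularity counts below are made up, the function names are hypothetical, and sampling is done with replacement for simplicity, which may differ from the deployed application.

```python
import random


def sampling_probabilities(popularity, temperature):
    """Normalize popularity ** (1 / T) into a probability distribution."""
    weights = {b: p ** (1.0 / temperature) for b, p in popularity.items()}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


def sample_books(popularity, temperature, k, seed=0):
    """Draw k books (with replacement) according to the tempered distribution."""
    probs = sampling_probabilities(popularity, temperature)
    books = list(probs)
    rng = random.Random(seed)
    return rng.choices(books, weights=[probs[b] for b in books], k=k)


# Hypothetical popularity counts for two books.
popularity = {"hit": 8, "mid": 2}
```

With these counts the popular book is drawn with probability 0.8 at T = 1, about 0.94 at T = 0.5 and about 0.67 at T = 2, which matches the qualitative behaviour listed on the slide.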

A/B TESTING
Duration = two weeks, number of sessions = 470
Metric         T = 0.3                 T = 1                    p-value    significant
P@5            0.497368 ± 0.026381     0.495833 ± 0.052701      9.79E-01   no
SER@5          0.417105 ± 0.024892     0.437500 ± 0.047382      7.07E-01   no
NOV@5          8.315443 ± 0.176832     10.095039 ± 0.347261     2.30E-05   yes
completeness   0.903947 ± 0.018229     0.937500 ± 0.025108      2.86E-01   no
discard        6.321229 ± 0.663185     12.544643 ± 2.070238     2.09E-03   yes
dropout        0.131285 ± 0.019150     0.178571 ± 0.039930      2.45E-01   no
seed_pop       0.002626 ± 0.000060     0.000835 ± 0.000086      2.74E-48   yes
T = 1 is better than T = 0.3: more novelty without significantly affecting dropout.
51

CONTRIBUTION 2: STACKED THRESHOLD-BASED
ENTITY MATCHING (STEM)
52

KNOWLEDGE GRAPH GENERATION: ENTITY MATCHING
Name Zip code Address
Lowry bank 56348 223 main st.
JP bank 19390 101 aven.
JP fsb bank 56347 223 main st.
JPR bank 19390 101 aven.
In many practical applications, you need to create a Knowledge Graph, matching records to
create a single entity. Hard task: implicit semantics, mistakes, misspellings, info missing in data.
53

THRESHOLD-BASED CLASSIFIERS
1. Threshold-based classifiers are commonly used (Silk [22], Duke*)
2. Property-wise similarities: s_name, s_zip, s_address
3. f(s_name, s_zip, s_address) > t
Trade-off between precision and recall:
1. High t: many false negatives, low recall
2. Low t: many false positives, low precision
Can we increase both at the same time?
*https://github.com/larsga/Duke
54

RQ2) Can ensemble learning algorithms such as stacked
generalization improve the accuracy of threshold-based
classiﬁers in the entity matching process?
55

STEM: STACKED THRESHOLD-BASED ENTITY MATCHING
[Figure: example records (Lowry bank, 56348, 223 main st.; JP bank, 19390, 101 aven.; JP fsb bank, 56347, main st.; JPR bank, 11390, 1 aven.) compared pairwise as (e1, e2)]
Threshold classifier: f(e1, e2; t)
Threshold perturbation: f1(e1, e2; t+2d), f2(e1, e2; t+d), f3(e1, e2; t), f4(e1, e2; t-d), f5(e1, e2; t-2d)
Features: x1, x2, x3, x4, x5
Supervised classifier: F(x; t)
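The threshold perturbation step can be sketched as follows. The base matcher here is an assumption: it simply averages the property-wise similarities and compares the result with t, whereas real tools such as Silk or Duke combine similarities in their own way. The function names are hypothetical, and the supervised classifier that would be trained on the resulting features is left out.

```python
def threshold_classifier(similarities, t):
    """Toy base matcher: 1 if the mean property-wise similarity exceeds t, else 0."""
    score = sum(similarities) / len(similarities)
    return 1 if score > t else 0


def stem_features(similarities, t, d, n_steps=2):
    """Binary predictions of the base matcher at thresholds t+2d, ..., t, ..., t-2d."""
    thresholds = [t + k * d for k in range(n_steps, -n_steps - 1, -1)]
    return [threshold_classifier(similarities, th) for th in thresholds]


# Hypothetical similarities (name, zip code, address) for one pair of records.
pair_similarities = [1.0, 0.5, 0.75]
features = stem_features(pair_similarities, t=0.7, d=0.1)
```

Each record pair is thus described by a short binary vector (x1, ..., x5) that records how robust the match decision is to the choice of threshold, and a supervised classifier F can learn the final decision from labelled pairs.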
56

RESULTS
Palumbo, Enrico, Giuseppe Rizzo, and Raphaël Troncy. “STEM: Stacked Threshold-based Entity Matching”, Semantic Web, 2019
STEM consistently improves precision, recall and thus f-score. STEM-NB performs better than STEM-LIN.
58

RESULTS
Troncy R., Rizzo G., Jameson A., Corcho O., Plu J., Palumbo E., et al. (2017) 3cixty: Building Comprehensive Knowledge Bases For City Exploration. In Web Semantics:
Science, Services and Agents on the World Wide Web, 2017
STEM-NB used to reconcile places to build a knowledge graph of places for the 3cixty project.
Improves precision, recall and f-score.
59

CONTRIBUTION 3: PATH RECOMMENDER
60

TOURIST PATHS
Matt, a British man, is going to visit the Picasso Museum in Antibes this afternoon. Afterwards, he may be interested in:
● first, having a beer in an Irish pub: Hop Store
● then, having dinner in a French restaurant: Le Jardin
● and, finally, attending the Jazz à Juan event in a jazz club
Path: a sequence of venues that Matt may be interested in visiting after the Picasso Museum
Sequence-aware Recommender System [23]
61

RQ3) How can we create a recommender system that learns to
recommend tourist paths, effectively leveraging the temporal
correlation among tourist activities?
62

DATA COLLECTION
● Collect public check-ins from Swarmapp (Foursquare) users publishing on Twitter
● Link each tweet to a Foursquare venue to extract the venue category
● Split a sequence when the temporal interval between check-ins is more than 8 hours
● Published open source as the Semantic Trails Dataset
Path: Art_Museum, Café, French_Restaurant, Rock_Club
Monti, D., Palumbo, E., Rizzo, G., Troncy, R. and Morisio, M., 2018. Semantic Trails of City Explorations: How Do We Live a City. arXiv preprint arXiv:1812.04367.
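The 8-hour splitting rule can be illustrated with a short sketch. It assumes the check-ins of a single user are available as (timestamp, venue category) pairs; the function name `split_into_trails` and the example timestamps are made up.

```python
from datetime import datetime, timedelta


def split_into_trails(checkins, max_gap=timedelta(hours=8)):
    """Split one user's check-ins into paths whenever the gap exceeds max_gap."""
    trails = []
    previous_time = None
    for time, category in sorted(checkins):
        if previous_time is None or time - previous_time > max_gap:
            trails.append([])  # start a new path
        trails[-1].append(category)
        previous_time = time
    return trails


# Hypothetical check-ins for one user over two days.
checkins = [
    (datetime(2018, 5, 1, 10, 0), "Cafeteria"),
    (datetime(2018, 5, 1, 12, 0), "Art_Museum"),
    (datetime(2018, 5, 1, 19, 30), "Park"),
    (datetime(2018, 5, 1, 23, 0), "Sushi_Restaurant"),
    (datetime(2018, 5, 2, 9, 30), "Art_Museum"),
]
trails = split_into_trails(checkins)
```

Here the first four check-ins form one path, and the check-in on the following morning, more than 8 hours later, starts a new one.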
63

PATH RECOMMENDER
[Figure: RNN architecture unrolled over four steps. Input layer: encoding of the inputs x0..x3 (Cafeteria, Art_Museum, Park, Sushi_Restaurant). Hidden layer(s): GRU cells with states h0..h3, with dropout. Output layer: softmax outputs o0..o3. Target: Art_Museum, Park, Sushi_Restaurant, EOP.]
Palumbo, Enrico, et al. "Predicting Your Next Stop-over from Location-based Social Network Data with Recurrent Neural Networks." RecTour@ RecSys. 2017.
64

EXPERIMENTAL SETUP
Datasets
● Foursquare
● Yes.com (music playlists)
Evaluation protocol
● Sequeval: evaluation framework for sequence-aware recommender systems
Baselines
● MostPop, Random, Unigram, Bigram, Conditional Random Field (CRF)
Monti, D., Palumbo, E., Rizzo, G. and Morisio, M., 2019. Sequeval: An offline evaluation framework for sequence-based recommender systems. Information, 10(5), p.174.
65

RESULTS
[Figure: results on Foursquare and Yes.com]
Serendipity is best for the RNN on both datasets
66

RESULTS
[Figure: results on Foursquare and Yes.com]
The popularity bias is so strong in Foursquare that MostPop (MP) has the best precision. This is not the case in Yes.com
67

PLAYLIST CONTINUATION
Task
● Given a playlist with some initial information (e.g. tracks, title), continue the playlist with other
relevant tracks
Dataset
● Main track: Million Playlist Dataset (MPD)
● Creative track: MPD + external data sources (song lyrics)
Approach
● RNN architecture like Path Recommender with side information extracted from titles and lyrics
Results
● 36/113 in main track, 14/33 in creative track
68

RQ1
How can knowledge graph embeddings be used to create hybrid,
accurate, non-obvious and semantics-aware recommendations?
Knowledge graph including user-item interactions and item-item interactions. KG
embeddings. Ranking function = user-item relatedness in the vector space.
We propose entity2rec:
a. extends node2vec by encoding the semantics of the KG properties in its features
b. creates accurate (high precision, high recall) and non-obvious (high serendipity, good
novelty) recommendations. Particularly effective if sparsity is high and popularity bias is
not too strong.
c. direct interpretation of features that can be used to assess feature importance,
configure recommendation to specific user requests, and transparency
d. can be used in a cold start scenario leveraging item-item relatedness as shown in the
Tinderbook application
e. integrates seamlessly with Linked Open Data and Semantic Web technologies
70

RQ2
Can ensemble learning algorithms such as stacked generalization
improve the performance of threshold-based classifiers in the entity
matching process?
Threshold-based classifiers are commonly used to match entities when creating a
knowledge graph. The final decision threshold creates a trade-off between
precision and recall of the entity matching process.
We propose STEM (Stacked Threshold-based Entity Matching):
a. Starts from a threshold-based classifier, generates binary match predictions from
several different thresholds, and stacks a supervised classifier on top of these
predictions for the final binary decision (match/ no match)
b. it is independent from the nature of the threshold-based classifier
c. improves F-score up to 44%
d. results are consistent across different datasets and domains (finance, music, tourism)
e. it has been applied in a concrete use-case in the context of the 3cixty European project
to generate a knowledge graph of places in the city of Nice
71

RQ3
How can we create a recommender system that learns to recommend
tourist paths, effectively leveraging the temporal correlation among
tourist activities?
Sequence-aware recommendation as a next word prediction problem.
We propose the Path Recommender:
a. we collect data from Foursquare check-ins and extract sequences of venue categories
to describe user paths. Dataset is publicly released as the Semantic Trails Dataset.
b. we propose a set of metrics to evaluate sequence-aware recommender systems. Code
is publicly released as Sequeval.
c. RNN architecture to predict the next Point Of Interest as a next word prediction
problem.
d. Path Recommender outperforms non-neural network approaches such as MostPop,
unigram, bigram or CRF
e. The architecture has been successfully applied to related tasks such as the generation of music playlists
72

Publications (2017-2020; types: Journal, Conference, Workshop, Poster)
RecSys17*, RecTour17*, JWS, DL4KGS* x2, REVEAL18, RecSysChallenge18, ESWC18*, ESWC19*, SWJ*, Information, ESWA*
*first author
Community
DL4KGS workshop @ ESWC18 (organizer)
TDKE, IPM, IEEE Access, IS (reviewer)
ISWC, UMAP (sub-reviewer)
Resources
entity2rec, STEM, RecSys18 challenge, Semantic Trails Dataset, sequeval
Awards
Polito PhD quality award (2019)
Best poster paper award (ESWC18)
Other
3 master thesis co-supervisions
EU research projects: Snowball, 3cixty, PasTime, CEDUS
73

FUTURE WORK (1/2)
Contribution 1: entity2rec
● Knowledge modelling and data integration: creation/mapping to a KG is expensive, often issues
in coverage and data quality. How to improve this?
● Online embeddings training: training time for embeddings is long. When new users/items are
added to the graph, update embeddings without re-training from scratch.
● Explanations: online experiment on entity2rec explanations
● Negative feedback and weights: experiment with weights on properties for different levels of confidence. What if the user expresses negative preference for an item?
● Similarity functions: always relied on cosine similarity to measure relatedness in entity2rec and
node2vec. What about other measures?
● Personalized entity linking/disambiguation: use of personalized user-entity relatedness scores
to disambiguate entities in text for a particular user
● Tinderbook: improve diversity and novelty of recommendations
● Translational models: multi-type user-item interactions (click, discard, like, buy...)
75

FUTURE WORK (2/2)
Contribution 2: STEM
● Supervised classifier: so far, only used SVM. What about other classifiers?
● Parameters for ensemble: only used the final decision threshold as a parameter. Experiment
with other supervised classifiers and other parameters for stacking.
● Parallel implementation: improve computing time by parallelizing implementation
● Generalization to other tasks: is the approach generalizable to other threshold-based
classifiers in other tasks?
Contribution 3: Path Recommender
● Instance Recommendation: only used venue categories. How to generate recommendations
for instances?
● Data quality: check-ins contain information that is not useful for tourist path recommendations (bots, irrelevant venue categories). Best way to filter these out?
● Knowledge and sequence-aware recommendations: how to better combine these two? For instance, entity2rec and path recommender?
76

REFERENCES
[1]: Pasquale Lops, Marco De Gemmis, and Giovanni Semeraro. Content-based recommender systems: State of the art and trends. In Recommender systems handbook, pages 73–105. Springer, 2011.
[2]: Cataldo Musto, Giovanni Semeraro, Marco de Gemmis, and Pasquale Lops. Learning word embeddings from wikipedia for content-based recommender systems. Advances in Information Retrieval,
pages 729–734, Cham, 2016.
[3]: De Gemmis, Marco, et al. Semantics-aware content-based recommender systems. Recommender Systems Handbook. Springer, Boston, MA, 2015. 119-159.
[4]: Christian Desrosiers and George Karypis. A comprehensive survey of neighborhood-based recommendation methods. In Recommender systems handbook, pages 107–144. 2011.
[5]: Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide
Web, WWW ’01, pages 285–295, New York, NY, USA, 2001. ACM.
[6]: Yehuda Koren, Robert Bell, and Chris Volinsky. Matrix factorization techniques for recommender systems. Computer, (8):30–37, 2009
[7]: Xia Ning and George Karypis. Slim: Sparse linear methods for top-n recommender systems. In Data Mining (ICDM), 2011 IEEE 11th International Conference on, pages 497–506. IEEE, 2011.
[8]: Steffen Rendle. Factorization machines. In Data Mining (ICDM), 2010 IEEE 10th International Conference on, pages 995–1000. IEEE, 2010.
[9]: Xia Ning and George Karypis. Sparse linear methods with side information for top-n recommendations. In Proceedings of the sixth ACM conference on Recommender systems, pages 155–162,
2012.
[10]: Guo, Qingyu, et al. A Survey on Knowledge Graph-Based Recommender Systems. arXiv preprint arXiv:2003.00911 (2020).
[11]: Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multi-relational data. In Advances in neural information
processing systems, pages 2787–2795, 2013.
[12]: Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. Knowledge graph embedding by translating on hyperplanes. In AAAI, volume 14, pages 1112–1119, 2014.
[13]: Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. Learning entity and relation embeddings for knowledge graph completion. In AAAI, volume 15, pages 2181–2187, 2015.
[14]: Maximilian Nickel, Volker Tresp, and Hans-Peter Kriegel. A three-way model for collective learning on multi-relational data. In ICML, volume 11, pages 809–816, 2011.
[15]: B. Yang, W.-t. Yih, X. He, J. Gao, and L. Deng, Embedding Entities and Relations for Learning and Inference in Knowledge Bases, CoRR, vol. abs/1412.6575, 2014.
[16]: Ristoski, P., Rosati, J., Di Noia, T., De Leone, R. and Paulheim, H., 2019. RDF2Vec: RDF graph embeddings and their applications. Semantic Web, 10(4), pp.721-752.
[17]: Quan Wang, Zhendong Mao, Bin Wang, and Li Guo. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering,
29(12):2724–2743, 2017.
[18]: Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
pages 855–864. ACM, 2016
[19]: Alejandro Bellogin, Pablo Castells, and Ivan Cantador. Precision-oriented evaluation of recommender systems: an algorithmic comparison. In Proceedings of the fifth ACM conference on
Recommender systems, pages 333–336. ACM, 2011.
[20]: Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. Bpr: Bayesian personalized ranking from implicit feedback. In Proceedings of the twenty-fifth conference on
uncertainty in artificial intelligence, pages 452–461. AUAI Press, 2009.
[21]: Hu, Yifan, Yehuda Koren, and Chris Volinsky. Collaborative filtering for implicit feedback datasets. 2008 Eighth IEEE International Conference on Data Mining. Ieee, 2008.
[22]: Robert Isele, Anja Jentzsch, and Christian Bizer. Silk server-adding missing links while consuming linked data. In Proceedings of the First International Conference on Consuming Linked
Data-Volume 665, pages 85–96. CEUR-WS.org, 2010
[23]: Quadrana, Massimo, Paolo Cremonesi, and Dietmar Jannach. Sequence-aware recommender systems. ACM Computing Surveys (CSUR) 51.4 (2018): 1-36.
78

DATASETS: POPULARITY BIAS
H_Movielens1M = 7.17
H_LastFM = 7.77
H_LibraryThing = 8.26
80

DATASETS: DBPEDIA PROPERTIES
Movielens 1M: dbo:director, dbo:starring, dbo:distributor, dbo:writer, dbo:musicComposer, dbo:producer, dbo:cinematography, dbo:editing, dct:subject
LibraryThing: dbo:author, dbo:publisher, dbo:literaryGenre, dbo:mediaType, dbo:subsequentWork, dbo:previousWork, dbo:series, dbo:country, dbo:language, dbo:coverArtist, dct:subject
LastFM: dbo:genre, dbo:recordLabel, dbo:hometown, dbo:associatedBand, dbo:associatedMusicalArtist, dbo:birthPlace, dbo:bandMember, dbo:formerBandMember, dbo:occupation, dbo:instrument, dct:subject
Select the properties with the most data from the DBpedia ontology: take the top N properties such that the (N+1)-th property has less than 50% of the data of the N-th property. Then, add dct:subject.
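One possible reading of this selection rule is sketched below: properties are sorted by how many triples they have, and the list is cut at the first position where the next property has less than half the data of the current one. This interpretation, the function name `select_properties`, the triple counts, and the two low-count property names are assumptions made for illustration.

```python
def select_properties(counts, ratio=0.5, always_include=("dct:subject",)):
    """Keep the top properties up to the first sharp drop, then add dct:subject."""
    ranked = sorted(counts, key=counts.get, reverse=True)
    selected = []
    for position, prop in enumerate(ranked):
        selected.append(prop)
        is_last = position == len(ranked) - 1
        # Stop when the next property has less than `ratio` of this one's data.
        if not is_last and counts[ranked[position + 1]] < ratio * counts[prop]:
            break
    for prop in always_include:
        if prop not in selected:
            selected.append(prop)
    return selected


# Hypothetical numbers of triples per property.
counts = {
    "dbo:director": 1000,
    "dbo:starring": 900,
    "dbo:writer": 800,
    "dbo:narrator": 300,
    "dbo:gross": 250,
}
selected = select_properties(counts)
```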
81

DATASETS: DATA SPLITTING
We split the data into a training set X_train, a validation set X_val and a test set X_test containing, for each user, respectively 70%, 10% and 20% of the ratings.
Users with fewer than 10 ratings are removed from the dataset, as well as items that do not have a corresponding entry in DBpedia.
In this process, we lose 674 movies from Movielens1M, 27 users and 7867 musical artists from LastFM, and 323 users and 27305 books from LibraryThing.
Yes.com is split randomly, Foursquare is split temporally.
82
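A per-user 70/10/20 split can be sketched as follows. The slides do not say how each user's ratings are ordered before splitting, so the random shuffle, the rounding of the split sizes, and the function name `split_ratings` are assumptions; the filtering of items without a DBpedia entry is not shown.

```python
import random
from collections import defaultdict


def split_ratings(ratings, train=0.7, val=0.1, min_ratings=10, seed=0):
    """Split (user, item, rating) rows per user into train, validation and test."""
    by_user = defaultdict(list)
    for user, item, rating in ratings:
        by_user[user].append((user, item, rating))
    rng = random.Random(seed)
    x_train, x_val, x_test = [], [], []
    for user in sorted(by_user):
        rows = by_user[user]
        if len(rows) < min_ratings:  # drop users with too few ratings
            continue
        rng.shuffle(rows)
        n_train = int(round(train * len(rows)))
        n_val = int(round(val * len(rows)))
        x_train += rows[:n_train]
        x_val += rows[n_train:n_train + n_val]
        x_test += rows[n_train + n_val:]
    return x_train, x_val, x_test
```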

83
MOVIELENS 1M LAST FM
LIBRARY THING

MOVIELENS 1M
84
C1 = {p : 4, q : 1, d : 200, l :
100, c : 30, n : 50}
C2 = {p : 4, q : 1, d : 200, l :
100, c : 50, n : 100}

LAST FM
85
C1 = {p : 4, q : 1, d : 200, l :
100, c : 30, n : 50}
C3 = {p : 4, q : 4, d : 200, l :
100, c : 60, n : 100}

LIBRARY THING
86
C1 = {p : 4, q : 1, d : 200, l :
100, c : 30, n : 50}
C4 = {p : 1, q : 1, d : 200, l :
100, c : 50, n : 100}

MOVIELENS 1M
C2 = {p : 4, q : 1, d : 200, l : 100, c : 50, n : 100}

LAST FM
C3 = {p : 4, q : 4, d : 200, l : 100, c : 60, n : 100}

LIBRARY THING
C4 = {p : 1, q : 1, d : 200, l : 100, c : 50, n : 100}

PERSONALIZED ENTITY DISAMBIGUATION
ρ(u,e)
"Samuel Jackson is my hero!"

TRANSLATIONAL MODELS
TransE
Model: h + r ~ t
Simple, but not suitable for 1-to-N, N-to-1, N-to-N relations
Complexity: O(nd + md)
TransH
Model: h_⊥ + d_r ~ t_⊥
Multiple representations for different relations, obtained by projecting on hyperplanes
Complexity: O(nd + 2md)
TransR
Model: h_r + r ~ t_r
Embeds entities and relations in different vector spaces through a projection matrix M_r
Complexity: O(nd + mdk)

TRANSLATIONAL MODELS
D = Euclidean distance
TransE
Score function: f(h, r, t) = D(h + r, t)
TransH
Score function: f(h, r, t) = D(h_⊥ + d_r, t_⊥)
TransR
Score function: f(h, r, t) = D(h_r + r, t_r)
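The three score functions can be sketched in plain Python as follows. The projections follow the standard definitions of these models: in TransH, h_⊥ = h - (w_r · h) w_r for a unit normal vector w_r of the relation hyperplane, and in TransR, h_r = M_r h. The function names and the toy vectors are illustrative assumptions, not taken from the slides. A lower score means a more plausible triple.

```python
import math


def euclidean(a, b):
    """Euclidean distance D between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_on_hyperplane(v, w):
    """Project v on the hyperplane with unit normal w: v - (w . v) w."""
    scale = dot(w, v)
    return [x - scale * y for x, y in zip(v, w)]


def matvec(m, v):
    """Multiply a k x d matrix (list of rows) by a d-dimensional vector."""
    return [dot(row, v) for row in m]


def score_transe(h, r, t):
    # f(h, r, t) = D(h + r, t)
    return euclidean(add(h, r), t)


def score_transh(h, d_r, t, w_r):
    # f(h, r, t) = D(h_perp + d_r, t_perp), with w_r assumed to have unit norm
    h_perp = project_on_hyperplane(h, w_r)
    t_perp = project_on_hyperplane(t, w_r)
    return euclidean(add(h_perp, d_r), t_perp)


def score_transr(h, r, t, m_r):
    # f(h, r, t) = D(h_r + r, t_r), with h_r = M_r h and t_r = M_r t
    h_r = matvec(m_r, h)
    t_r = matvec(m_r, t)
    return euclidean(add(h_r, r), t_r)


# Toy triples that each model fits perfectly (score 0).
s_e = score_transe([1, 0], [0, 1], [1, 1])
s_h = score_transh([1, 2], [2, 0], [3, 5], w_r=[0, 1])
s_r = score_transr([1, 2], [5], [3, 5], m_r=[[1, 1]])
```

In the TransH example, the head and tail differ along the hyperplane normal, yet the score is still zero. That is how the model can represent relations that are not one-to-one.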

node2vec random walks
[Figure: one step of a node2vec walk around nodes X-1 and X0, with unnormalized transition weights 1, 1/p, 1/q and 1/q]
p: return parameter
q: in-out parameter
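A minimal sketch of how p and q bias a second-order node2vec walk on an unweighted graph. After moving from a previous node to the current one, each candidate next node gets an unnormalized weight: 1/p if it is the previous node, 1 if it is also a neighbor of the previous node, and 1/q otherwise. The toy graph and the function names are illustrative assumptions.

```python
import random


def transition_weights(graph, prev, curr, p, q):
    """Unnormalized node2vec weights for leaving `curr` after arriving from `prev`."""
    weights = {}
    for nxt in graph[curr]:
        if nxt == prev:
            weights[nxt] = 1.0 / p  # going back to the previous node
        elif nxt in graph[prev]:
            weights[nxt] = 1.0      # staying at distance 1 from the previous node
        else:
            weights[nxt] = 1.0 / q  # moving away, to distance 2
    return weights


def next_node(graph, prev, curr, p, q, rng):
    """Sample the next node of the walk according to the biased weights."""
    weights = transition_weights(graph, prev, curr, p, q)
    nodes = sorted(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


# Toy undirected graph as adjacency sets (assumption, for illustration only).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
w_q1 = transition_weights(graph, prev="a", curr="b", p=4, q=1)
w_q4 = transition_weights(graph, prev="a", curr="b", p=4, q=4)
step = next_node(graph, "a", "b", 4, 1, random.Random(0))
```

With p = 4, returning to the previous node is four times less likely than moving to a shared neighbor, so the walk explores more. Raising q to 4 also discourages moving outward.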

CONFIGURATION
● node2vec, entity2rec
○ Movielens1M: p ∈ {0.25, 1, 4}, q ∈ {0.25, 1, 4}, d ∈ {200, 500}, l ∈ {10, 20, 30, 50, 100}, c ∈ {10, 20, 30}, n ∈ {10, 50}. C1 = {p : 4, q : 1, d : 200, l : 100, c : 30, n : 50} is optimal in this range. C2 = {p : 4, q : 1, d : 200, l : 100, c : 50, n : 100}, found manually, outperforms C1.
○ LastFM: p ∈ {1, 4}, q ∈ {1, 4}, c ∈ {30, 40, 50, 60}, l ∈ {60, 100, 120}, n ∈ {50, 100}. C3 = {p : 4, q : 4, d : 200, l : 100, c : 60, n : 100} is optimal in this range.
○ LibraryThing: p ∈ {1, 4}, q ∈ {1, 4}, c ∈ {30, 50}. C4 = {p : 1, q : 1, d : 200, l : 100, c : 50, n : 100}
● RankingFM
○ l1 ∈ {10**-12, 10**-6}, factors ∈ {32, 100}, ranking_reg ∈ {0.1, 0.5}, max_iter ∈ {25, 50}.
○ Movielens1M: l1: 10**-6, factors: 32, rank_reg: 0.5, max_iter: 25
○ LastFM: l1: 10**-6, factors: 100, rank_reg: 0.1, max_iter: 50
○ LibraryThing: l1: 10**-6, factors: 100, rank_reg: 0.5, max_iter: 25
● Translational models
○ d ∈ {10, 20, 30, 50, 100, 200}. d=50 for TransE and TransR, d=100 for TransH, for all datasets. learning_rate=0.001.
● Collaborative filtering
○ MyMediaLite configuration: http://www.mymedialite.net/documentation/item_prediction.html
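As an illustration of the size of the Movielens1M search space, the grid above can be enumerated as follows. This is a sketch of the enumeration only; training and evaluating each configuration is not shown, and the variable names are assumptions.

```python
from itertools import product

# Hyper-parameter ranges listed for Movielens1M.
grid = {
    "p": [0.25, 1, 4],
    "q": [0.25, 1, 4],
    "d": [200, 500],
    "l": [10, 20, 30, 50, 100],
    "c": [10, 20, 30],
    "n": [10, 50],
}

names = list(grid)
# One dictionary per combination of values.
configurations = [
    dict(zip(names, values)) for values in product(*(grid[name] for name in names))
]

C1 = {"p": 4, "q": 1, "d": 200, "l": 100, "c": 30, "n": 50}
C2 = {"p": 4, "q": 1, "d": 200, "l": 100, "c": 50, "n": 100}

n_configurations = len(configurations)
c1_in_grid = C1 in configurations
c2_in_grid = C2 in configurations
```

C1 belongs to the grid, while C2 uses c = 50 and n = 100, which lie outside the listed ranges. This is consistent with C2 having been found manually.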

BECAUSE YOU WATCHED TAXI DRIVER...
ρ_starring(u, i’)
SIMILARITY WITH OTHER ITEMS

STARRING SAMUEL JACKSON...
ρ(u,e)
SIMILARITY WITH OTHER ENTITIES...

LSTM
[Figure: unrolled LSTM network with an input layer (x0 to x3), a hidden layer (h0 to h3) and an output layer (weights Wo followed by a softmax) over the track sequence T0 to T4]
Input features: track w2v embeddings, album w2v embeddings, artist w2v embeddings, title2rec embeddings, lyrics features
Monti, D., Palumbo, E., Rizzo, G., Lisena, P., Troncy, R., Fell, M., Cabrio, E. and Morisio, M., 2018. An ensemble approach of recurrent neural networks using pre-trained
embeddings for playlist completion. In Proceedings of the ACM Recommender Systems Challenge 2018 (pp. 1-6).

TINDERBOOK ARCHITECTURE
[Figure: a user (Jane) interacts with the User Interface, which talks to the API; the API relies on the entity2rec item-item model and on MongoDB. Labels: Seed, Discard, Feedback, Book cover, Book metadata]

STEM - SUMMARY
● Threshold-based classifiers are commonly used for entity
matching
● The final decision threshold creates a trade-off between precision and recall
● Stacking can be applied to the final decision threshold,
significantly improving both precision and recall at the same
time
● Experiments show that this conclusion is independent of the
threshold-based classifier and the dataset
Palumbo, Enrico, Giuseppe Rizzo, and Raphaël Troncy. "An Ensemble Approach to Financial Entity Matching for the FEIII 2016 Challenge." Proceedings of the Second International Workshop on Data Science for Macro-Modeling. ACM, 2016.
Troncy R., Rizzo G., Jameson A., Corcho O., Plu J., Palumbo E., et al. "3cixty: Building Comprehensive Knowledge Bases For City Exploration." Web Semantics: Science, Services and Agents on the World Wide Web, 2017.
Palumbo, Enrico, Giuseppe Rizzo, and Raphaël Troncy. "STEM: Stacked Threshold-based Entity Matching." Semantic Web, 2019.

INTERPRETABILITY - FINDINGS
● entity2rec features have a direct one-to-one connection with knowledge
graph properties
● Some properties are more important than others: dct:subject
performs best when used alone
● Content helps: all properties together outperform any single property,
and results improve with respect to feedback only
● Knowledge graphs can be used to generate explanations both in terms
of related items and in terms of item content
● Recommendations can be tailored to specific user requests using the
semantics of features

PATH RECOMMENDER - SUMMARY
● Created a publicly available dataset of sequences of user check-ins
● Created a publicly available software library for the evaluation of
sequence-aware recommender systems
● Recurrent Neural Networks can be used to effectively create tourist paths
and music playlists
● Experiments show that Recurrent Neural Networks perform better
than traditional sequential models
● The analogy between language modelling and sequence prediction is powerful:
sequence-aware recommender systems can leverage all recent improvements
in NLP

TINDERBOOK - FINDINGS
● entity2rec can perform recommendations in a cold start scenario using
item-item relatedness
● In an offline experiment, entity2rec outperforms ItemKNN (collaborative),
RDF2Vec (content-based), TF-IDF (content-based) and MostPop
● entity2rec integrates seamlessly with Semantic Web technologies
● P@5 ~ 50%. Roughly one book out of two is liked by the user, given a
single book as a seed.
● Positive feedback from users, who say Tinderbook is fun to use and
generally accurate

FINDINGS
● entity2rec consistently outperforms node2vec: property-specific
embeddings are beneficial
● entity2rec has the best precision, recall and serendipity on all
datasets, except for P@5 on Movielens1M, where WRMF is best. WRMF
has low novelty and does not work as well for users with little feedback
data (lower recall).
● entity2rec performs particularly well on LibraryThing, where sparsity
is high and popularity bias is lower. It shows a decent level of novelty across
datasets. ItemKNN has the best novelty.
● Translational models have good novelty, but generally do not
perform as well as other KG embedding methods.
