Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering

Multiverse Recommendation: N-dimensional Tensor
Factorization for Context-aware Collaborative
Filtering
Alexandros Karatzoglou1 Xavier Amatriain 1 Linas Baltrunas 2
Nuria Oliver 2
1Telefonica Research
Barcelona, Spain
2Free University of Bolzano
Bolzano, Italy
November 4, 2010
1

Context in Recommender Systems
2

Context in Recommender Systems
Context is an important factor to consider in personalized
Recommendation
2

Current State of the Art in Context- aware
Recommendation
3

Recommendation
Pre-Filtering Techniques
3

Recommendation
Post-Filtering Techniques
3

Recommendation
Contextual modeling
3

Recommendation
Contextual modeling
The approach presented here ﬁts in the Contextual Modeling category
3

Collaborative Filtering problem setting
Typically data sizes e.g. Netlix data n = 5 × 105, m = 17 × 103
4

Standard Matrix Factorization
Find U ∈ Rn×d and M ∈ Rd×m so that F = UM
minimizeU,ML(F, Y) + λΩ(U, M)
Movies!
Users!
U!
M!
5

Multiverse Recommendation: Tensors for Context
Aware Collaborative Filtering
Movies!
Users!
6

Tensors for Context Aware Collaborative Filtering
Movies!
Users!
U!
C!
M!
S!
7

Tensors for Context Aware Collaborative Filtering
Movies!
Users!
U!
C!
M!
S!
Fijk = S ×U Ui∗ ×M Mj∗ ×C Ck∗
R[U, M, C, S] := L(F, Y) + Ω[U, M, C] + Ω[S]
7

Regularization
Ω[F] = λM M 2
F + λU U 2
F + λC C 2
F
Ω[S] := λS S 2
F (1)
8

Squared Error Loss Function
Many implementations of MF used a simple squared error regression
loss function
l(f, y) =
1
2
(f − y)2
thus the loss over all users and items is:
L(F, Y) =
n
i
m
j
l(fij, yij)
Note that this loss provides an estimate of the conditional mean
9

Absolute Error Loss Function
Alternatively one can use the absolute error loss function
l(f, y) = |f − y|
thus the loss over all users and items is:
L(F, Y) =
n
i
m
j
l(fij, yij)
which provides an estimate of the conditional median
10

Optimization - Stochastic Gradient Descent for TF
The partial gradients with respect to U, M, C and S can then be
written as:
∂Ui∗
l(Fijk , Yijk ) = ∂Fijk
l(Fijk , Yijk )S ×M Mj∗ ×C Ck∗
∂Mj∗
l(Fijk , Yijk )S ×U Ui∗ ×C Ck∗
∂Ck∗
l(Fijk , Yijk )S ×U Ui∗ ×M Mj∗
∂Sl(Fijk , Yijk ) = ∂Fijk
l(Fijk , Yijk )Ui∗ ⊗ Mj∗ ⊗ Ck∗
11

The partial gradients with respect to U, M, C and S can then be
written as:
∂Ui∗
l(Fijk , Yijk )S ×M Mj∗ ×C Ck∗
∂Mj∗
l(Fijk , Yijk )S ×U Ui∗ ×C Ck∗
∂Ck∗
l(Fijk , Yijk )S ×U Ui∗ ×M Mj∗
∂Sl(Fijk , Yijk ) = ∂Fijk
l(Fijk , Yijk )Ui∗ ⊗ Mj∗ ⊗ Ck∗
We then iteratively update the parameter matrices and tensors using
the following update rules:
Ut+1
i∗ = Ut
i∗ − η∂UL − ηλUUi∗
Mt+1
j∗ = Mt
j∗ − η∂ML − ηλMMj∗
Ct+1
k∗ = Ct
k∗ − η∂CL − ηλCCk∗
St+1
= St
− η∂Sl(Fijk , Yijk ) − ηλSS
where η is the learning rate. 11

Movies!
Users!
U!
C!
M!
S!
12

Movies!
Users!
U!
C!
M!
S!
13

Movies!
Users!
U!
C!
M!
S!
14

Movies!
Users!
U!
C!
M!
S!
15

Movies!
Users!
U!
C!
M!
S!
16

Experimental evaluation
We evaluate our model on contextual rating data and computing the
Mean Absolute Error (MAE),using 5-fold cross validation deﬁned as
follows:
MAE =
1
K
n,m,c
ijk
Dijk |Yijk − Fijk |
17

Data
Data set Users Movies Context Dim. Ratings Scale
Yahoo! 7642 11915 2 221K 1-5
Adom. 84 192 5 1464 1-13
Food 212 20 2 6360 1-5
Table: Data set statistics
18

Context Aware Methods
Pre-filtering based approach, (G. Adomavicius et.al), computes
recommendations using only the ratings made in the same context
as the target one
Item splitting method (L. Baltrunas, F. Ricci) which identifies items
which have significant differences in their rating under different
context situations.
19

Results: Context vs. No Context
No
context
Tensor
Factorization
1.9
2.0
2.1
2.2
2.3
2.4
2.5
MAE
(a)
No
context
Tensor
Factorization
0.80
0.85
0.90
0.95
MAE
(b)
Figure: Comparison of matrix (no context) and tensor (context) factorization
on the Adom and Food data.
20

Yahoo Artiﬁcial Data
0.60
0.65
0.70
0.75
0.80
0.85
0.90
0.95
1.00
MAE
α=0.1
0.60
0.65
0.70
0.75
0.80
0.85
0.90
0.95
1.00
α=0.5
0.60
0.65
0.70
0.75
0.80
0.85
0.90
0.95
1.00
α=0.9
No Context Reduction Item-Split Tensor Factorization
Figure: Comparison of context-aware methods on the Yahoo! artiﬁcial data
21

Yahoo Artiﬁcial Data
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Probability of contextual influence
0.60
0.65
0.70
0.75
0.80
0.85
0.90
0.95MAE
No Context
Reduction
Item-Split
Tensor Factorization
22

Reduction Item-Split Tensor
Factorization
1.9
2.0
2.1
2.2
2.3
2.4
2.5
MAE
Figure: Comparison of context-aware methods on the Adom data.
23

No
context
Reduction Tensor
Factorization
0.80
0.85
0.90
0.95
MAE
Figure: Comparison of context-aware methods on the Food data.
24

Conclusions
Tensor Factorization methods seem to be promising for CARS
Many different TF methods exist
Future work: extend to implicit taste data
Tensor representation of context data seems promising
25

Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering

Recommandé

Recommandé

Contenu connexe

Similaire à Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering

Similaire à Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering (20)

Plus de Alexandros Karatzoglou

Plus de Alexandros Karatzoglou (9)

Dernier

Dernier (20)

Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering