SlideShare une entreprise Scribd logo
1  sur  82
Télécharger pour lire hors ligne
Factorization Models & Polynomial Regression Factorization Machines Applications Summary 
Factorization Machines 
Steen Rendle 
Current aliation: Google Inc. 
Work was done at University of Konstanz 
MLConf, November 14, 2014 
Steen Rendle 1 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Models 
Linear/ Polynomial Regression 
Comparison 
Factorization Machines 
Applications 
Summary 
Steen Rendle 2 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization 
Example for data: Matrix Factorization: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
^ Y := W Ht ; W 2 RjUjk ;H 2 RjIjk 
k is the rank of the reconstruction. 
Steen Rendle 3 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization 
Example for data: Matrix Factorization: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
^ Y := W Ht ; W 2 RjUjk ;H 2 RjIjk 
^y(u; i) = ^yu;i = 
Xk 
f =1 
wu;f hi ;f = hwu; hi i 
k is the rank of the reconstruction. 
Steen Rendle 3 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization  Extensions 
Example for data: Examples for models: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
^yMF(u; i ) := 
Xk 
f =1 
vu;f vi ;f = hvu; vi i 
Steen Rendle 4 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization  Extensions 
Example for data: Examples for models: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
^yMF(u; i ) := 
Xk 
f =1 
vu;f vi ;f = hvu; vi i 
^ySVD++(u; i) := 
* 
vu + 
X 
j2N(u) 
vj ; vi 
+ 
^yFact-KNN(u; i ) := 
1 
jR(u)j 
X 
j2R(u) 
ru;j hvi ; vj i 
Steen Rendle 4 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization  Extensions 
Example for data: Examples for models: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
^yMF(u; i ) := 
Xk 
f =1 
vu;f vi ;f = hvu; vi i 
^ySVD++(u; i) := 
* 
vu + 
X 
j2N(u) 
vj ; vi 
+ 
^yFact-KNN(u; i ) := 
1 
jR(u)j 
X 
j2R(u) 
ru;j hvi ; vj i 
Rating 
Matrix 
time 
^ytimeSVD(u; i ; t) := hvu + vu;t ; vi i 
^ytimeTF(u; i ; t) := 
Xk 
f =1 
vu;f vi ;f vt;f 
: : : 
Steen Rendle 4 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Tensor Factorization 
Example for data: Examples for models: 
Triples of Subject, Predicate, Object 
^yPARAFAC(s; p; o) := 
Xk 
f =1 
vs;f vp;f vo;f 
^yPITF(s; p; o) := hvs ; vpi + hvs ; voi + hvp; voi 
: : : 
Steen Rendle 5 / 53 
[illustration from Drumond et al. 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Sequential Factorization Models 
Example for data: Examples for models: 
Bt Bt­3 
b 
b a b 
a c 
User 1 ? 
c 
e 
c c 
a 
? 
d c e e ? 
? 
User 2 
User 3 
User 4 
Bt­2 
Bt­1 
a 
^yFMC(u; i ; t) := 
X 
l2Bt1 
hvi ; vl i 
^yFPMC(u; i ; t) := hvu; vi i + 
X 
l2Bt1 
hvi ; vl i 
: : : 
Steen Rendle 6 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Models: Discussion 
I Advantages 
I Can estimate interactions between two (or more) variables even if 
the cross is not observed. 
I E.g. user  movie, current product  next product, user  query  
url, : : : 
Steen Rendle 7 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Models: Discussion 
I Advantages 
I Can estimate interactions between two (or more) variables even if 
the cross is not observed. 
I E.g. user  movie, current product  next product, user  query  
url, : : : 
I Downsides 
I Factorization models are usually build speci
cally for each problem. 
I Learning algorithms and implementations are tailored to individual 
models. 
Steen Rendle 7 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Models 
Linear/ Polynomial Regression 
Comparison 
Factorization Machines 
Applications 
Summary 
Steen Rendle 8 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Data and Variable Representation 
Many standard ML approaches work with real valued feature vectors as 
input. It allows to represent, e.g.: 
I any number of variables 
I categorical domains by using dummy indicator variables 
I numerical domains 
I set-categorical domains by using dummy indicator variables 
Using this representation allows to apply a wide variety of standard 
models (e.g. linear regression, SVM, etc.). 
Steen Rendle 9 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Linear Regression 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation: 
^y(x) := w0 + 
Xp 
i=1 
wi xi 
I Model parameters: 
w0 2 R; w 2 Rp 
O(p) model parameters. 
Steen Rendle 10 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Polynomial Regression 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation (degree 2): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
wi ;j xi xj 
I Model parameters: 
w0 2 R; w 2 Rp; W 2 Rpp 
O(p2) model parameters. 
Steen Rendle 11 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Models 
Linear/ Polynomial Regression 
Comparison 
Factorization Machines 
Applications 
Summary 
Steen Rendle 12 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Representation: Matrix/ Tensor vs. Feature Vectors 
Matrix/ Tensor data can be represented by feature vectors: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
Steen Rendle 13 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Representation: Matrix/ Tensor vs. Feature Vectors 
Matrix/ Tensor data can be represented by feature vectors: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
, 
# User Movie Rating 
1 Alice Titanic 5 
2 Alice Notting Hill 3 
3 Alice Star Wars 1 
4 Bob Star Wars 4 
5 Bob Star Trek 5 
6 Charlie Titanic 1 
7 Charlie Star Wars 5 
. . . . . . . . . . . . 
Steen Rendle 13 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Representation: Matrix/ Tensor vs. Feature Vectors 
Matrix/ Tensor data can be represented by feature vectors: 
# User Movie Rating 
1 Alice Titanic 5 
2 Alice Notting Hill 3 
3 Alice Star Wars 1 
4 Bob Star Wars 4 
5 Bob Star Trek 5 
6 Charlie Titanic 1 
7 Charlie Star Wars 5 
. . . . . . . . . . . . 
) 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
Target y 
5 
3 
1 y(3) 
4 
5 
1 
5 
y(1) 
y(2) 
y(4) 
y(5) 
y(6) 
y(7) 
Steen Rendle 13 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
Target y 
5 
3 
1 y(3) 
4 
5 
1 
5 
y(1) 
y(2) 
y(4) 
y(5) 
y(6) 
y(7) 
Applying regression models to this data leads to: 
Steen Rendle 14 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
Target y 
5 
3 
1 y(3) 
4 
5 
1 
5 
y(1) 
y(2) 
y(4) 
y(5) 
y(6) 
y(7) 
Applying regression models to this data leads to: 
Linear regression: ^y(x) = w0 + wu + wi 
Steen Rendle 14 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
Target y 
5 
3 
1 y(3) 
4 
5 
1 
5 
y(1) 
y(2) 
y(4) 
y(5) 
y(6) 
y(7) 
Applying regression models to this data leads to: 
Linear regression: ^y(x) = w0 + wu + wi 
Polynomial regression: ^y(x) = w0 + wu + wi + wu;i 
Steen Rendle 14 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
Target y 
5 
3 
1 y(3) 
4 
5 
1 
5 
y(1) 
y(2) 
y(4) 
y(5) 
y(6) 
y(7) 
Applying regression models to this data leads to: 
Linear regression: ^y(x) = w0 + wu + wi 
Polynomial regression: ^y(x) = w0 + wu + wi + wu;i 
Matrix factorization: ^y(u; i) = hwu; hi i 
Steen Rendle 14 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
I ) Linear regression is not expressive enough. 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
I ) Linear regression is not expressive enough. 
I Polynomial regression includes pairwise interactions but cannot 
estimate them from the data. 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
I ) Linear regression is not expressive enough. 
I Polynomial regression includes pairwise interactions but cannot 
estimate them from the data. 
I n  p2: number of cases is much smaller than number of model 
parameters. 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
I ) Linear regression is not expressive enough. 
I Polynomial regression includes pairwise interactions but cannot 
estimate them from the data. 
I n  p2: number of cases is much smaller than number of model 
parameters. 
I Max.-likelihood estimator for a pairwise eect is: 
wi ;j = 
( 
y  w0  wi  wu; if (i ; j ; y) 2 S: 
not de
ned; else 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
I ) Linear regression is not expressive enough. 
I Polynomial regression includes pairwise interactions but cannot 
estimate them from the data. 
I n  p2: number of cases is much smaller than number of model 
parameters. 
I Max.-likelihood estimator for a pairwise eect is: 
wi ;j = 
( 
y  w0  wi  wu; if (i ; j ; y) 2 S: 
not de
ned; else 
I Polynomial regression cannot generalize to any unobserved pairwise 
eect. 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Model 
Examples 
Properties 
Learning 
libFM Software 
Applications 
Summary 
Steen Rendle 16 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Machine (FM) 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation (degree 2): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Model parameters: 
w0 2 R; w 2 Rp; V 2 Rpk 
Steen Rendle 17 / 53 
[Rendle 2010, Rendle 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Machine (FM) 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation (degree 2): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Model parameters: 
w0 2 R; w 2 Rp; V 2 Rpk 
Compared to Polynomial regression: 
I Model equation (degree 2): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
wi ;j xi xj 
I Model parameters: 
w0 2 R; w 2 Rp; W 2 Rpp 
Steen Rendle 17 / 53 
[Rendle 2010, Rendle 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Machine (FM) 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation (degree 2): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Model parameters: 
w0 2 R; w 2 Rp; V 2 Rpk 
Steen Rendle 17 / 53 
[Rendle 2010, Rendle 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Machine (FM) 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation (degree 3): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
+ 
Xp 
i=1 
Xp 
ji 
Xp 
lj 
Xk 
f =1 
v(3) 
i ;f v(3) 
j ;f v(3) 
l ;f xi xj xl 
I Model parameters: 
w0 2 R; w 2 Rp; V 2 Rpk ; V(3) 2 Rpk 
Steen Rendle 17 / 53 
[Rendle 2010, Rendle 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Machines: Discussion 
I FMs work with real valued input. 
I FMs include variable interactions like polynomial regression. 
I Model parameters for interactions are factorized. 
I Number of model parameters is O(k p) (instead of O(p2) for poly. 
regr.). 
Steen Rendle 18 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Model 
Examples 
Properties 
Learning 
libFM Software 
Applications 
Summary 
Steen Rendle 19 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization and Factorization Machines 
Two categorical variables encoded with real valued predictor variables: 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
With this data, the FM is identical to MF with biases1: 
^y(x) = w0 + wu + wi + hvu; vi i | {z } 
MF 
1libFM, k = 128, MCMC inference, Net
ix RMSE=0.8937 
Steen Rendle 20 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
RDF-Triple Prediction with Factorization Machines 
Three categorical variables encoded with real valued predictor variables: 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 0 0 0 1 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
S1 S2 S3 ... P1 P2 P3 P4 ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
1 
0 
0 
0 
0 
0 
0 
... 
... 
... 
... 
... 
0 0 0 1 ... 
O1 O2 O3 O4 ... 
Subject Predicate Object 
With this data, the FM is equivalent to the PITF model: 
^y(x) := w0 + ws + wp + wo + hvs ; vpi + hvs ; voi + hvp; voi 
[PITF: Rendle et al. 2010, WSDM Best Student Paper, ECML 2009 Best DC Award] 
Steen Rendle 21 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Time with Factorization Machines 
Two categorical variables and time as linear predictor: 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
0.2 
0.6 
0.61 
0.3 
0.5 
0.1 
0.8 
Time 
The FM model would correspond to: 
^y(x) := w0 + wi + wu + t wtime + hvu; vi i + t hvu; vtimei + t hvi ; vtimei 
Steen Rendle 22 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Time with Factorization Machines 
Two categorical variables and time discretized in bins (b(t)): 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
1 
0 
0 
1 
0 
1 
0 
0 
1 
1 
0 
1 
0 
0 
Time 
0 
0 
0 
0 
0 
0 
1 
T1 T2 T3 
The FM model would correspond to:2 
^y(x) := w0 + wi + wu + wb(t) + hvu; vi i + hvu; vb(t)i + hvi ; vb(t)i 
2libFM, k = 128, MCMC inference, Net
ix RMSE=0.8873 
Steen Rendle 22 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
SVD++ 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 0.3 0.3 0.3 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
0.3 
0.3 
0 
0 
0.5 
0.3 
0.3 
0 
0 
0 
0.3 
0.3 
0.5 
0.5 
0.5 
0 
0 
0.5 
0.5 
0 
... 
... 
... 
... 
... 
0.5 0 0.5 0 ... 
TI NH SW ST ... 
User Movie Other Movies rated 
With this data, the FM3 is identical to: 
^y(x) = 
SVD++ z }| { 
w0 + wu + wi + hvu; vi i + 
1 p 
jNuj 
X 
l2Nu 
hvi ; vl i 
+ 
1 p 
jNuj 
X 
l2Nu 
0 
@wl + hvu; vl i + 
1 p 
jNuj 
X 
l 02Nu ;l 0l 
hvl ; v0 
l i 
1 
A 
3libFM, k = 128, MCMC inference, Net
ix RMSE=0.8865 
Steen Rendle 23 / 53 
[Koren, 2008]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorizing Personalized Markov Chains (FPMC) 
Two categorical variables (u,i ), one set categorical (Bt1): 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 0.5 0.5 0 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
u1 u2 u3 ... A B C D ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
0 
... 
... 
... 
... 
... 
1 0 0 0 ... 
User Product 
A B C D ... 
Last Basket 
Sequential Baskets 
u1 A,B C 
u2 C D 
u3 A C 
FM is equivalent to 
^y(x) := w0 + wu + wi + 
1 
jBt1j 
X 
j2Bt1 
wj + hvu; vi i + 
1 
jBt1j 
X 
j2Bt1 
hvi ; vj i + ::: 
Steen Rendle 24 / 53 
[Rendle et al. 2010, WWW Best Paper]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Model 
Examples 
Properties 
Learning 
libFM Software 
Applications 
Summary 
Steen Rendle 25 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Computation Complexity 
Factorization Machine model equation: 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Trivial computation: O(p2 k) 
Steen Rendle 26 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Computation Complexity 
Factorization Machine model equation: 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Trivial computation: O(p2 k) 
I Ecient computation can be done in: O(p k) 
Steen Rendle 26 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Computation Complexity 
Factorization Machine model equation: 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Trivial computation: O(p2 k) 
I Ecient computation can be done in: O(p k) 
I Making use of many zeros in x even in: O(Nz (x) k), where Nz (x) is 
the number of non-zero elements in vector x. 
Steen Rendle 26 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Ecient Computation 
The model equation of an FM can be computed in O(p k). 
Steen Rendle 27 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Ecient Computation 
The model equation of an FM can be computed in O(p k). 
Proof: 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
= w0 + 
Xp 
i=1 
wi xi + 
1 
2 
Xk 
f =1 
2 
4 
  Xp 
i=1 
xi vi ;f 
!2 
 
Xp 
i=1 
(xi vi ;f )2 
3 
5 
Steen Rendle 27 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Ecient Computation 
The model equation of an FM can be computed in O(p k). 
Proof: 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
= w0 + 
Xp 
i=1 
wi xi + 
1 
2 
Xk 
f =1 
2 
4 
  Xp 
i=1 
xi vi ;f 
!2 
 
Xp 
i=1 
(xi vi ;f )2 
3 
5 
I In the sums over i , only non-zero xi elements have to be summed up 
) O(Nz (x) k). 
I (The complexity of polynomial regression is O(Nz (x)2).) 
Steen Rendle 27 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Multilinearity 
FMs are multilinear: 
8 2  = fw0;w1; : : : ;wp; v1;1; : : : ; vp;kg : ^y(x; ) = h()(x)  + g()(x) 
where g() and h() do not depend on the value of . 
Steen Rendle 28 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Multilinearity 
FMs are multilinear: 
8 2  = fw0;w1; : : : ;wp; v1;1; : : : ; vp;kg : ^y(x; ) = h()(x)  + g()(x) 
where g() and h() do not depend on the value of . 
E.g. for second order eects ( = vl ;f ): 
^y(x; vl;f ) := 
g(vl;f )(x) 
z }| { 
w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
j=i+1 
Xk 
f 0=1 
(f 06=f )_(l62fi ;jg) 
vi ;f 0 vj;f 0 xi xj 
+ vl;f xl 
X 
i=1;i6=l 
vi ;f xi 
| {z } 
h(vl;f )(x) 
Steen Rendle 28 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Model 
Examples 
Properties 
Learning 
libFM Software 
Applications 
Summary 
Steen Rendle 29 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Learning 
Using these properties, learning algorithms can be developed: 
I L2-regularized regression and classi
cation: 
I Stochastic gradient descent [Rendle, 2010] 
I Alternating least squares/ Coordinate Descent [Rendle et al., 2011, 
Rendle 2012] 
I Markov Chain Monte Carlo (for Bayesian FMs) [Freudenthaler et al. 
2011, Rendle 2012] 
I L2-regularized ranking: 
I Stochastic gradient descent [Rendle, 2010] 
All the proposed learning algorithms have a runtime of O(k Nz (X) i ), 
where i is the number of iterations and Nz (X) the number of non-zero 
elements in the design matrix X. 
Steen Rendle 30 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Stochastic Gradient Descent (SGD) 
I For each training case (x; y) 2 S, SGD updates the FM model 
parameter  using: 
0 =    
 
(^y(x)  y)h()(x) + () 
 
I  is the learning rate / step size. 
I () is the regularization value of the parameter . 
I SGD can easily be applied to other loss functions. 
Steen Rendle 31 / 53 
[Rendle, 2010]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Coordinate Descent (CD) 
I CD updates each FM model parameter  using: 
0 = 
P 
(x;y)2S 
 
y  g()(x) 
 
h()(x) 
P 
(x;y)2S h2 
()(x) + () 
I Using caches of intermediate results, the runtime for updating all 
model parameters is O(k Nz (X)). 
I CD can be extended to classi
cation [Rendle, 2012]. 
Steen Rendle 32 / 53 
[Rendle et al., 2011]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Gibbs Sampling (MCMC) 
I Gibbs sampling with a block for each FM model parameter : 
jS; n fg  N 
  
 
P 
(x;y)2S 
 
y  g()(x) 
 
h()(x) 
 
P 
(x;y)2S h2 
()(x) + () 
; 
1 
 
P 
(x;y)2S h2 
()(x) + () 
! 
I Mean is the same as for CD ) computational complexity is also 
O(k Nz (X)). 
I MCMC can be extended to classi
cation using link functions. 
Steen Rendle 33 / 53 
[Freudenthaler et al. 2011, Rendle 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Learning Regularization Values 
 v ,v 
yi 
w ,w 
wj 
w0 
xij 
i=1,...,n 
v j 
 
j=1,...,p 
 , 0 ,0 
yi 
wj 
w0 
w 
xij 
i=1,...,n 
v j 
 
j=1,...,p 
w0 ,w0 
0 ,0 
w0 ,w0 
w  v v 
Standard FM with priors. Two level FM with hyperpriors. 
Steen Rendle 34 / 53 
[Freudenthaler et al., 2011]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Model 
Examples 
Properties 
Learning 
libFM Software 
Applications 
Summary 
Steen Rendle 35 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
libFM Software 
libFM is an implementation of FMs 
I Model: second-order FMs 
I Learning/ inference: SGD, ALS, MCMC 
I Classi
cation and regression 
I Uses the same data format as LIBSVM, LIBLINEAR [Lin et. al], 
SVMlight [Joachims]. 
I Supports variable grouping. 
I Open source: GPLv3. 
Steen Rendle 36 / 53 
[http://www.libfm.org/]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Applications 
Recommender Systems 
Link Prediction in Social Networks 
Clickthrough Prediction 
Personalized Ranking 
Student Performance Prediction 
Kaggle Competitions 
Summary 
Steen Rendle 37 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
(Context-aware) Rating Prediction 
I Main variables: 
I User ID (categorical) 
I Item ID (categorical) 
I Additional variables: 
I time 
I mood 
I user pro
le 
I item meta data 
I . . . 
I Examples: Net
ix prize, Movielens, KDDCup 2011 
+ 
♪ + + 
Song 
User Time Mood 
Steen Rendle 38 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Net
ix Prize 
Netflix Prize: Prediction Error 
Public Leaderboard 
RMS Error 
0.86 0.87 0.88 0.89 0.90 
user, movie 
user, movie, day 
user, movie, impl. 
user, movie, 
day, impl. 
SGD Matrix 
Factorization 
user, movie, 
day, impl., 
freq, lin. day 
$1M Prize 
I k = 128 factors, 512 MCMC samples (no burnin phase, initialization 
from random) 
I MCMC inference (no hyperparameters (learning rate, regularization) 
to specify) 
Steen Rendle 39 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Net
ix Prize 
Method (Name) Ref. Learning Method k Quiz RMSE 
Models using user ID and item ID 
Probabilistic Matrix Factorization [14, 13] Batch GD 40 *0.9170 
Probabilistic Matrix Factorization [14, 13] Batch GD 150 0.9211 
Matrix Factorization [6] Variational Bayes 30 *0.9141 
Matchbox [15] Variational Bayes 50 *0.9100 
ALS-MF [7] ALS 100 0.9079 
ALS-MF [7] ALS 1000 *0.9018 
SVD/ MF [3] SGD 100 0.9025 
SVD/ MF [3] SGD 200 *0.9009 
Bayesian Probablistic Matrix Factorization 
[13] MCMC 150 0.8965 
(BPMF) 
Bayesian Probablistic Matrix Factorization 
(BPMF) 
[13] MCMC 300 *0.8954 
FM, pred. var: user ID, movie ID - MCMC 128 0.8937 
Models using implicit feedback 
Probabilistic Matrix Factorization with Cons- 
traints 
[14] Batch GD 30 *0.9016 
SVD++ [3] SGD 100 0.8924 
SVD++ [3] SGD 200 *0.8911 
BSRM/F [18] MCMC 100 0.8926 
BSRM/F [18] MCMC 400 *0.8874 
FM, pred. var: user ID, movie ID, impl. - MCMC 128 0.8865 
Steen Rendle 40 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Net
ix Prize 
Method (Name) Ref. Learning Method k Quiz RMSE 
Models using time information 
Bayesian Probabilistic Tensor Factorization 
[17] MCMC 30 *0.9044 
(BPTF) 
FM, pred. var: user ID, movie ID, day - MCMC 128 0.8873 
Models using time and implicit feedback 
timeSVD++ [5] SGD 100 0.8805 
timeSVD++ [5] SGD 200 *0.8799 
FM, pred. var: user ID, movie ID, day, impl. - MCMC 128 0.8809 
FM, pred. var: user ID, movie ID, day, impl. - MCMC 256 0.8794 
Assorted models 
BRISMF/UM NB corrected [16] SGD 1000 *0.8904 
BMFSI plus side information [8] MCMC 100 *0.8875 
timeSVD++ plus frequencies [4] SGD 200 0.8777 
timeSVD++ plus frequencies [4] SGD 2000 *0.8762 
FM, pred. var: user ID, movie ID, day, impl., 
- MCMC 128 0.8779 
freq., lin. day 
FM, pred. var: user ID, movie ID, day, impl., 
freq., lin. day 
- MCMC 256 0.8771 
Steen Rendle 40 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Applications 
Recommender Systems 
Link Prediction in Social Networks 
Clickthrough Prediction 
Personalized Ranking 
Student Performance Prediction 
Kaggle Competitions 
Summary 
Steen Rendle 41 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Link Prediction in Social Networks 
I Main variables: 
I Actor A ID 
I Actor B ID 
I Additional variables: 
I pro
les 
I actions 
I . . . 
+ 
Actor A Actor B 
Steen Rendle 42 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
KDDCup 2012: Track 1 
KDDCup 2012 Track 1: Prediction Quality 
Public Leaderboard Private Leaderboard 
Mean Average Precision @3 
0.32 0.34 0.36 0.38 0.40 0.42 
none 
gender, age, ... 
keywords 
friends 
all 
none 
gender, age, ... 
keywords 
friends 
all 
Top 1 
Top 5 
Top 10 
Top 100 
I k = 22 factors, 512 MCMC samples (no burnin phase, initialization 
from random) 
I MCMC inference (no hyperparameters (learning rate, regularization) 
to specify) 
Steen Rendle 43 / 53 
[Awarded 2nd place (out of 658 teams)]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Applications 
Recommender Systems 
Link Prediction in Social Networks 
Clickthrough Prediction 
Personalized Ranking 
Student Performance Prediction 
Kaggle Competitions 
Summary 
Steen Rendle 44 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Clickthrough Prediction 
I Main variables: 
I User ID 
I Query ID 
I Ad/ Link ID 
I Additional variables: 
I query tokens 
I user pro
le 
I . . . 
+ 
keyword... + 
Link 1 
Link 2 
Link 3 
User Query Ad/ Link 
Steen Rendle 45 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
KDDCup 2012: Track 2 
Model Inference wAUC (public) wAUC (private) 
ID-based model (k = 0) SGD 0.78050 0.78086 
Attribute-based model (k = 8) MCMC 0.77409 0.77555 
Mixed model (k = 8) SGD 0.79011 0.79321 
Final ensemble n/a 0.79857 0.80178 
Ensemble 
I Rank positions (not predicted clickthrough rates) are used. 
I The MCMC attribute-based model and dierent variations of the 
. SGD models are included. 
Steen Rendle 46 / 53 
[Awarded 3rd place (out of 171 teams)]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Applications 
Recommender Systems 
Link Prediction in Social Networks 
Clickthrough Prediction 
Personalized Ranking 
Student Performance Prediction 
Kaggle Competitions 
Summary 
Steen Rendle 47 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
ECML/PKDD Discovery Challenge 2013 
I Problem: Recommend given names. 
I Main variables: 
I User ID 
I Name ID 
I Additional variables: 
I session info 
I string representation for each name 
I . . . 
I FM approach won 1st place (online track) and 2nd (oine track). 
Steen Rendle 48 / 53

Contenu connexe

Tendances

Factorization Machines and Applications in Recommender Systems
Factorization Machines and Applications in Recommender SystemsFactorization Machines and Applications in Recommender Systems
Factorization Machines and Applications in Recommender SystemsEvgeniy Marinov
 
Context-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick ViewContext-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick ViewYONG ZHENG
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsYONG ZHENG
 
Simple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in MahoutSimple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in MahoutData Science London
 
Time series predictions using LSTMs
Time series predictions using LSTMsTime series predictions using LSTMs
Time series predictions using LSTMsSetu Chokshi
 
Collaborative Filtering using KNN
Collaborative Filtering using KNNCollaborative Filtering using KNN
Collaborative Filtering using KNNŞeyda Hatipoğlu
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix ScaleJustin Basilico
 
CF Models for Music Recommendations At Spotify
CF Models for Music Recommendations At SpotifyCF Models for Music Recommendations At Spotify
CF Models for Music Recommendations At SpotifyVidhya Murali
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRUananth
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Alexandros Karatzoglou
 
Music recommendations @ MLConf 2014
Music recommendations @ MLConf 2014Music recommendations @ MLConf 2014
Music recommendations @ MLConf 2014Erik Bernhardsson
 
Actor critic algorithm
Actor critic algorithmActor critic algorithm
Actor critic algorithmJie-Han Chen
 
Music Recommendations at Scale with Spark
Music Recommendations at Scale with SparkMusic Recommendations at Scale with Spark
Music Recommendations at Scale with SparkChris Johnson
 
Recommender Systems from A to Z – Model Training
Recommender Systems from A to Z – Model TrainingRecommender Systems from A to Z – Model Training
Recommender Systems from A to Z – Model TrainingCrossing Minds
 
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Anoop Deoras
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsJustin Basilico
 
Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systemsNAVER Engineering
 

Tendances (20)

Factorization Machines and Applications in Recommender Systems
Factorization Machines and Applications in Recommender SystemsFactorization Machines and Applications in Recommender Systems
Factorization Machines and Applications in Recommender Systems
 
Context-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick ViewContext-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick View
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender Systems
 
Simple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in MahoutSimple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in Mahout
 
Introduction to Transformer Model
Introduction to Transformer ModelIntroduction to Transformer Model
Introduction to Transformer Model
 
Time series predictions using LSTMs
Time series predictions using LSTMsTime series predictions using LSTMs
Time series predictions using LSTMs
 
Collaborative Filtering using KNN
Collaborative Filtering using KNNCollaborative Filtering using KNN
Collaborative Filtering using KNN
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix Scale
 
CF Models for Music Recommendations At Spotify
CF Models for Music Recommendations At SpotifyCF Models for Music Recommendations At Spotify
CF Models for Music Recommendations At Spotify
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial
 
Music recommendations @ MLConf 2014
Music recommendations @ MLConf 2014Music recommendations @ MLConf 2014
Music recommendations @ MLConf 2014
 
Actor critic algorithm
Actor critic algorithmActor critic algorithm
Actor critic algorithm
 
Music Recommendations at Scale with Spark
Music Recommendations at Scale with SparkMusic Recommendations at Scale with Spark
Music Recommendations at Scale with Spark
 
Recommender Systems from A to Z – Model Training
Recommender Systems from A to Z – Model TrainingRecommender Systems from A to Z – Model Training
Recommender Systems from A to Z – Model Training
 
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systems
 
Transformer Zoo
Transformer ZooTransformer Zoo
Transformer Zoo
 

En vedette

Intro to Factorization Machines
Intro to Factorization MachinesIntro to Factorization Machines
Intro to Factorization MachinesPavel Kalaidin
 
Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFMLiangjie Hong
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Xavier Amatriain
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineNYC Predictive Analytics
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architectureLiang Xiang
 
Ted Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFTed Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFMLconf
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFMLconf
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFMLconf
 
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFLise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFMLconf
 
Warsaw Data Science - Factorization Machines Introduction
Warsaw Data Science -  Factorization Machines IntroductionWarsaw Data Science -  Factorization Machines Introduction
Warsaw Data Science - Factorization Machines IntroductionBartlomiej Twardowski
 
Quoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFQuoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFMLconf
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...Sri Ambati
 
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFAmeet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFMLconf
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning SystemsXavier Amatriain
 
NIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank AggregationNIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank Aggregationsleepy_yoshi
 
Evan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATLEvan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATLMLconf
 
Naive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsNaive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsDKALab
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFMLconf
 
Warsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick ReviewWarsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick ReviewBartlomiej Twardowski
 
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15MLconf
 

En vedette (20)

Intro to Factorization Machines
Intro to Factorization MachinesIntro to Factorization Machines
Intro to Factorization Machines
 
Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFM
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
 
Ted Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFTed Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SF
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
 
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFLise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
 
Warsaw Data Science - Factorization Machines Introduction
Warsaw Data Science -  Factorization Machines IntroductionWarsaw Data Science -  Factorization Machines Introduction
Warsaw Data Science - Factorization Machines Introduction
 
Quoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFQuoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SF
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...
 
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFAmeet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 
NIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank AggregationNIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank Aggregation
 
Evan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATLEvan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATL
 
Naive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsNaive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event Models
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SF
 
Warsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick ReviewWarsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick Review
 
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
 

Similaire à Steffen Rendle, Research Scientist, Google at MLconf SF

PF_IDETC_2012_Souma
PF_IDETC_2012_SoumaPF_IDETC_2012_Souma
PF_IDETC_2012_SoumaMDO_Lab
 
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...Francisco (Paco) Florez-Revuelta
 
CP3_SDM_2010_Souma
CP3_SDM_2010_SoumaCP3_SDM_2010_Souma
CP3_SDM_2010_SoumaMDO_Lab
 
Применение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботамиПрименение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботамиSkolkovo Robotics Center
 
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...Kanika Anand
 
The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...Thanh Hieu
 
PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)Nidhi Gopal
 
A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...JuanPabloCarbajal3
 
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010Souma Chowdhury
 
Prpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesPrpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesMohammad
 
Projection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsProjection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsUniversity of Glasgow
 
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...Sergey Staroletov
 
PF_MAO_2010_Souam
PF_MAO_2010_SouamPF_MAO_2010_Souam
PF_MAO_2010_SouamMDO_Lab
 
Development of Multi-Level ROM
Development of Multi-Level ROMDevelopment of Multi-Level ROM
Development of Multi-Level ROMMohammad
 

Similaire à Steffen Rendle, Research Scientist, Google at MLconf SF (20)

PF_IDETC_2012_Souma
PF_IDETC_2012_SoumaPF_IDETC_2012_Souma
PF_IDETC_2012_Souma
 
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
 
CP3_SDM_2010_Souma
CP3_SDM_2010_SoumaCP3_SDM_2010_Souma
CP3_SDM_2010_Souma
 
Применение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботамиПрименение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботами
 
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
 
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
 
FMI output gap
FMI output gapFMI output gap
FMI output gap
 
Wp13105
Wp13105Wp13105
Wp13105
 
The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...
 
Compact 3 d_resist_models_general
Compact 3 d_resist_models_generalCompact 3 d_resist_models_general
Compact 3 d_resist_models_general
 
PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)
 
A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...
 
Phd Defense 2007
Phd Defense 2007Phd Defense 2007
Phd Defense 2007
 
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
 
Prpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesPrpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfaces
 
AbdoSummerANS_mod3
AbdoSummerANS_mod3AbdoSummerANS_mod3
AbdoSummerANS_mod3
 
Projection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsProjection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamics
 
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
 
PF_MAO_2010_Souam
PF_MAO_2010_SouamPF_MAO_2010_Souam
PF_MAO_2010_Souam
 
Development of Multi-Level ROM
Development of Multi-Level ROMDevelopment of Multi-Level ROM
Development of Multi-Level ROM
 

Plus de MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingMLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLMLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 

Plus de MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Dernier

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 

Dernier (20)

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

Steffen Rendle, Research Scientist, Google at MLconf SF

  • 1. Factorization Models & Polynomial Regression Factorization Machines Applications Summary Factorization Machines Steen Rendle Current aliation: Google Inc. Work was done at University of Konstanz MLConf, November 14, 2014 Steen Rendle 1 / 53
  • 2. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Models Linear/ Polynomial Regression Comparison Factorization Machines Applications Summary Steen Rendle 2 / 53
  • 3. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization Example for data: Matrix Factorization: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User ^ Y := W Ht ; W 2 RjUjk ;H 2 RjIjk k is the rank of the reconstruction. Steen Rendle 3 / 53
  • 4. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization Example for data: Matrix Factorization: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User ^ Y := W Ht ; W 2 RjUjk ;H 2 RjIjk ^y(u; i) = ^yu;i = Xk f =1 wu;f hi ;f = hwu; hi i k is the rank of the reconstruction. Steen Rendle 3 / 53
  • 5. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization Extensions Example for data: Examples for models: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User ^yMF(u; i ) := Xk f =1 vu;f vi ;f = hvu; vi i Steen Rendle 4 / 53
  • 6. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization Extensions Example for data: Examples for models: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User ^yMF(u; i ) := Xk f =1 vu;f vi ;f = hvu; vi i ^ySVD++(u; i) := * vu + X j2N(u) vj ; vi + ^yFact-KNN(u; i ) := 1 jR(u)j X j2R(u) ru;j hvi ; vj i Steen Rendle 4 / 53
  • 7. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization Extensions Example for data: Examples for models: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User ^yMF(u; i ) := Xk f =1 vu;f vi ;f = hvu; vi i ^ySVD++(u; i) := * vu + X j2N(u) vj ; vi + ^yFact-KNN(u; i ) := 1 jR(u)j X j2R(u) ru;j hvi ; vj i Rating Matrix time ^ytimeSVD(u; i ; t) := hvu + vu;t ; vi i ^ytimeTF(u; i ; t) := Xk f =1 vu;f vi ;f vt;f : : : Steen Rendle 4 / 53
  • 8. Factorization Models Polynomial Regression Factorization Machines Applications Summary Tensor Factorization Example for data: Examples for models: Triples of Subject, Predicate, Object ^yPARAFAC(s; p; o) := Xk f =1 vs;f vp;f vo;f ^yPITF(s; p; o) := hvs ; vpi + hvs ; voi + hvp; voi : : : Steen Rendle 5 / 53 [illustration from Drumond et al. 2012]
  • 9. Factorization Models Polynomial Regression Factorization Machines Applications Summary Sequential Factorization Models Example for data: Examples for models: Bt Bt­3 b b a b a c User 1 ? c e c c a ? d c e e ? ? User 2 User 3 User 4 Bt­2 Bt­1 a ^yFMC(u; i ; t) := X l2Bt1 hvi ; vl i ^yFPMC(u; i ; t) := hvu; vi i + X l2Bt1 hvi ; vl i : : : Steen Rendle 6 / 53
  • 10. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Models: Discussion I Advantages I Can estimate interactions between two (or more) variables even if the cross is not observed. I E.g. user movie, current product next product, user query url, : : : Steen Rendle 7 / 53
  • 11. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Models: Discussion I Advantages I Can estimate interactions between two (or more) variables even if the cross is not observed. I E.g. user movie, current product next product, user query url, : : : I Downsides I Factorization models are usually build speci
  • 12. cally for each problem. I Learning algorithms and implementations are tailored to individual models. Steen Rendle 7 / 53
  • 13. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Models Linear/ Polynomial Regression Comparison Factorization Machines Applications Summary Steen Rendle 8 / 53
  • 14. Factorization Models Polynomial Regression Factorization Machines Applications Summary Data and Variable Representation Many standard ML approaches work with real valued feature vectors as input. It allows to represent, e.g.: I any number of variables I categorical domains by using dummy indicator variables I numerical domains I set-categorical domains by using dummy indicator variables Using this representation allows to apply a wide variety of standard models (e.g. linear regression, SVM, etc.). Steen Rendle 9 / 53
  • 15. Factorization Models Polynomial Regression Factorization Machines Applications Summary Linear Regression I Let x 2 Rp be an input vector with p predictor variables. I Model equation: ^y(x) := w0 + Xp i=1 wi xi I Model parameters: w0 2 R; w 2 Rp O(p) model parameters. Steen Rendle 10 / 53
  • 16. Factorization Models Polynomial Regression Factorization Machines Applications Summary Polynomial Regression I Let x 2 Rp be an input vector with p predictor variables. I Model equation (degree 2): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji wi ;j xi xj I Model parameters: w0 2 R; w 2 Rp; W 2 Rpp O(p2) model parameters. Steen Rendle 11 / 53
  • 17. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Models Linear/ Polynomial Regression Comparison Factorization Machines Applications Summary Steen Rendle 12 / 53
  • 18. Factorization Models Polynomial Regression Factorization Machines Applications Summary Representation: Matrix/ Tensor vs. Feature Vectors Matrix/ Tensor data can be represented by feature vectors: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User Steen Rendle 13 / 53
  • 19. Factorization Models Polynomial Regression Factorization Machines Applications Summary Representation: Matrix/ Tensor vs. Feature Vectors Matrix/ Tensor data can be represented by feature vectors: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User , # User Movie Rating 1 Alice Titanic 5 2 Alice Notting Hill 3 3 Alice Star Wars 1 4 Bob Star Wars 4 5 Bob Star Trek 5 6 Charlie Titanic 1 7 Charlie Star Wars 5 . . . . . . . . . . . . Steen Rendle 13 / 53
  • 20. Factorization Models Polynomial Regression Factorization Machines Applications Summary Representation: Matrix/ Tensor vs. Feature Vectors Matrix/ Tensor data can be represented by feature vectors: # User Movie Rating 1 Alice Titanic 5 2 Alice Notting Hill 3 3 Alice Star Wars 1 4 Bob Star Wars 4 5 Bob Star Trek 5 6 Charlie Titanic 1 7 Charlie Star Wars 5 . . . . . . . . . . . . ) 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie Target y 5 3 1 y(3) 4 5 1 5 y(1) y(2) y(4) y(5) y(6) y(7) Steen Rendle 13 / 53
  • 21. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie Target y 5 3 1 y(3) 4 5 1 5 y(1) y(2) y(4) y(5) y(6) y(7) Applying regression models to this data leads to: Steen Rendle 14 / 53
  • 22. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie Target y 5 3 1 y(3) 4 5 1 5 y(1) y(2) y(4) y(5) y(6) y(7) Applying regression models to this data leads to: Linear regression: ^y(x) = w0 + wu + wi Steen Rendle 14 / 53
  • 23. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie Target y 5 3 1 y(3) 4 5 1 5 y(1) y(2) y(4) y(5) y(6) y(7) Applying regression models to this data leads to: Linear regression: ^y(x) = w0 + wu + wi Polynomial regression: ^y(x) = w0 + wu + wi + wu;i Steen Rendle 14 / 53
  • 24. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie Target y 5 3 1 y(3) 4 5 1 5 y(1) y(2) y(4) y(5) y(6) y(7) Applying regression models to this data leads to: Linear regression: ^y(x) = w0 + wu + wi Polynomial regression: ^y(x) = w0 + wu + wi + wu;i Matrix factorization: ^y(u; i) = hwu; hi i Steen Rendle 14 / 53
  • 25. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. Steen Rendle 15 / 53
  • 26. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. I ) Linear regression is not expressive enough. Steen Rendle 15 / 53
  • 27. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. I ) Linear regression is not expressive enough. I Polynomial regression includes pairwise interactions but cannot estimate them from the data. Steen Rendle 15 / 53
  • 28. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. I ) Linear regression is not expressive enough. I Polynomial regression includes pairwise interactions but cannot estimate them from the data. I n p2: number of cases is much smaller than number of model parameters. Steen Rendle 15 / 53
  • 29. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. I ) Linear regression is not expressive enough. I Polynomial regression includes pairwise interactions but cannot estimate them from the data. I n p2: number of cases is much smaller than number of model parameters. I Max.-likelihood estimator for a pairwise eect is: wi ;j = ( y w0 wi wu; if (i ; j ; y) 2 S: not de
  • 30. ned; else Steen Rendle 15 / 53
  • 31. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. I ) Linear regression is not expressive enough. I Polynomial regression includes pairwise interactions but cannot estimate them from the data. I n p2: number of cases is much smaller than number of model parameters. I Max.-likelihood estimator for a pairwise eect is: wi ;j = ( y w0 wi wu; if (i ; j ; y) 2 S: not de
  • 32. ned; else I Polynomial regression cannot generalize to any unobserved pairwise eect. Steen Rendle 15 / 53
  • 33. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Model Examples Properties Learning libFM Software Applications Summary Steen Rendle 16 / 53
  • 34. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Machine (FM) I Let x 2 Rp be an input vector with p predictor variables. I Model equation (degree 2): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Model parameters: w0 2 R; w 2 Rp; V 2 Rpk Steen Rendle 17 / 53 [Rendle 2010, Rendle 2012]
  • 35. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Machine (FM) I Let x 2 Rp be an input vector with p predictor variables. I Model equation (degree 2): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Model parameters: w0 2 R; w 2 Rp; V 2 Rpk Compared to Polynomial regression: I Model equation (degree 2): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji wi ;j xi xj I Model parameters: w0 2 R; w 2 Rp; W 2 Rpp Steen Rendle 17 / 53 [Rendle 2010, Rendle 2012]
  • 36. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Machine (FM) I Let x 2 Rp be an input vector with p predictor variables. I Model equation (degree 2): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Model parameters: w0 2 R; w 2 Rp; V 2 Rpk Steen Rendle 17 / 53 [Rendle 2010, Rendle 2012]
  • 37. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Machine (FM) I Let x 2 Rp be an input vector with p predictor variables. I Model equation (degree 3): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj + Xp i=1 Xp ji Xp lj Xk f =1 v(3) i ;f v(3) j ;f v(3) l ;f xi xj xl I Model parameters: w0 2 R; w 2 Rp; V 2 Rpk ; V(3) 2 Rpk Steen Rendle 17 / 53 [Rendle 2010, Rendle 2012]
  • 38. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Machines: Discussion I FMs work with real valued input. I FMs include variable interactions like polynomial regression. I Model parameters for interactions are factorized. I Number of model parameters is O(k p) (instead of O(p2) for poly. regr.). Steen Rendle 18 / 53
  • 39. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Model Examples Properties Learning libFM Software Applications Summary Steen Rendle 19 / 53
  • 40. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization and Factorization Machines Two categorical variables encoded with real valued predictor variables: 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie With this data, the FM is identical to MF with biases1: ^y(x) = w0 + wu + wi + hvu; vi i | {z } MF 1libFM, k = 128, MCMC inference, Net ix RMSE=0.8937 Steen Rendle 20 / 53
  • 41. Factorization Models Polynomial Regression Factorization Machines Applications Summary RDF-Triple Prediction with Factorization Machines Three categorical variables encoded with real valued predictor variables: 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 0 0 1 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... S1 S2 S3 ... P1 P2 P3 P4 ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x 1 0 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 ... ... ... ... ... 0 0 0 1 ... O1 O2 O3 O4 ... Subject Predicate Object With this data, the FM is equivalent to the PITF model: ^y(x) := w0 + ws + wp + wo + hvs ; vpi + hvs ; voi + hvp; voi [PITF: Rendle et al. 2010, WSDM Best Student Paper, ECML 2009 Best DC Award] Steen Rendle 21 / 53
  • 42. Factorization Models Polynomial Regression Factorization Machines Applications Summary Time with Factorization Machines Two categorical variables and time as linear predictor: 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie 0.2 0.6 0.61 0.3 0.5 0.1 0.8 Time The FM model would correspond to: ^y(x) := w0 + wi + wu + t wtime + hvu; vi i + t hvu; vtimei + t hvi ; vtimei Steen Rendle 22 / 53
  • 43. Factorization Models Polynomial Regression Factorization Machines Applications Summary Time with Factorization Machines Two categorical variables and time discretized in bins (b(t)): 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie 1 0 0 1 0 1 0 0 1 1 0 1 0 0 Time 0 0 0 0 0 0 1 T1 T2 T3 The FM model would correspond to:2 ^y(x) := w0 + wi + wu + wb(t) + hvu; vi i + hvu; vb(t)i + hvi ; vb(t)i 2libFM, k = 128, MCMC inference, Net ix RMSE=0.8873 Steen Rendle 22 / 53
  • 44. Factorization Models Polynomial Regression Factorization Machines Applications Summary SVD++ 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0.3 0.3 0.3 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x 0.3 0.3 0 0 0.5 0.3 0.3 0 0 0 0.3 0.3 0.5 0.5 0.5 0 0 0.5 0.5 0 ... ... ... ... ... 0.5 0 0.5 0 ... TI NH SW ST ... User Movie Other Movies rated With this data, the FM3 is identical to: ^y(x) = SVD++ z }| { w0 + wu + wi + hvu; vi i + 1 p jNuj X l2Nu hvi ; vl i + 1 p jNuj X l2Nu 0 @wl + hvu; vl i + 1 p jNuj X l 02Nu ;l 0l hvl ; v0 l i 1 A 3libFM, k = 128, MCMC inference, Net ix RMSE=0.8865 Steen Rendle 23 / 53 [Koren, 2008]
  • 45. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorizing Personalized Markov Chains (FPMC) Two categorical variables (u,i ), one set categorical (Bt1): 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0.5 0.5 0 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... u1 u2 u3 ... A B C D ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 ... ... ... ... ... 1 0 0 0 ... User Product A B C D ... Last Basket Sequential Baskets u1 A,B C u2 C D u3 A C FM is equivalent to ^y(x) := w0 + wu + wi + 1 jBt1j X j2Bt1 wj + hvu; vi i + 1 jBt1j X j2Bt1 hvi ; vj i + ::: Steen Rendle 24 / 53 [Rendle et al. 2010, WWW Best Paper]
  • 46. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Model Examples Properties Learning libFM Software Applications Summary Steen Rendle 25 / 53
  • 47. Factorization Models Polynomial Regression Factorization Machines Applications Summary Computation Complexity Factorization Machine model equation: ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Trivial computation: O(p2 k) Steen Rendle 26 / 53
  • 48. Factorization Models Polynomial Regression Factorization Machines Applications Summary Computation Complexity Factorization Machine model equation: ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Trivial computation: O(p2 k) I Ecient computation can be done in: O(p k) Steen Rendle 26 / 53
  • 49. Factorization Models Polynomial Regression Factorization Machines Applications Summary Computation Complexity Factorization Machine model equation: ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Trivial computation: O(p2 k) I Ecient computation can be done in: O(p k) I Making use of many zeros in x even in: O(Nz (x) k), where Nz (x) is the number of non-zero elements in vector x. Steen Rendle 26 / 53
  • 50. Factorization Models Polynomial Regression Factorization Machines Applications Summary Ecient Computation The model equation of an FM can be computed in O(p k). Steen Rendle 27 / 53
  • 51. Factorization Models Polynomial Regression Factorization Machines Applications Summary Ecient Computation The model equation of an FM can be computed in O(p k). Proof: ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj = w0 + Xp i=1 wi xi + 1 2 Xk f =1 2 4 Xp i=1 xi vi ;f !2 Xp i=1 (xi vi ;f )2 3 5 Steen Rendle 27 / 53
  • 52. Factorization Models Polynomial Regression Factorization Machines Applications Summary Ecient Computation The model equation of an FM can be computed in O(p k). Proof: ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj = w0 + Xp i=1 wi xi + 1 2 Xk f =1 2 4 Xp i=1 xi vi ;f !2 Xp i=1 (xi vi ;f )2 3 5 I In the sums over i , only non-zero xi elements have to be summed up ) O(Nz (x) k). I (The complexity of polynomial regression is O(Nz (x)2).) Steen Rendle 27 / 53
  • 53. Factorization Models Polynomial Regression Factorization Machines Applications Summary Multilinearity FMs are multilinear: 8 2 = fw0;w1; : : : ;wp; v1;1; : : : ; vp;kg : ^y(x; ) = h()(x) + g()(x) where g() and h() do not depend on the value of . Steen Rendle 28 / 53
  • 54. Factorization Models Polynomial Regression Factorization Machines Applications Summary Multilinearity FMs are multilinear: 8 2 = fw0;w1; : : : ;wp; v1;1; : : : ; vp;kg : ^y(x; ) = h()(x) + g()(x) where g() and h() do not depend on the value of . E.g. for second order eects ( = vl ;f ): ^y(x; vl;f ) := g(vl;f )(x) z }| { w0 + Xp i=1 wi xi + Xp i=1 Xp j=i+1 Xk f 0=1 (f 06=f )_(l62fi ;jg) vi ;f 0 vj;f 0 xi xj + vl;f xl X i=1;i6=l vi ;f xi | {z } h(vl;f )(x) Steen Rendle 28 / 53
  • 55. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Model Examples Properties Learning libFM Software Applications Summary Steen Rendle 29 / 53
  • 56. Factorization Models Polynomial Regression Factorization Machines Applications Summary Learning Using these properties, learning algorithms can be developed: I L2-regularized regression and classi
  • 57. cation: I Stochastic gradient descent [Rendle, 2010] I Alternating least squares/ Coordinate Descent [Rendle et al., 2011, Rendle 2012] I Markov Chain Monte Carlo (for Bayesian FMs) [Freudenthaler et al. 2011, Rendle 2012] I L2-regularized ranking: I Stochastic gradient descent [Rendle, 2010] All the proposed learning algorithms have a runtime of O(k Nz (X) i ), where i is the number of iterations and Nz (X) the number of non-zero elements in the design matrix X. Steen Rendle 30 / 53
  • 58. Factorization Models Polynomial Regression Factorization Machines Applications Summary Stochastic Gradient Descent (SGD) I For each training case (x; y) 2 S, SGD updates the FM model parameter using: 0 = (^y(x) y)h()(x) + () I is the learning rate / step size. I () is the regularization value of the parameter . I SGD can easily be applied to other loss functions. Steen Rendle 31 / 53 [Rendle, 2010]
  • 59. Factorization Models Polynomial Regression Factorization Machines Applications Summary Coordinate Descent (CD) I CD updates each FM model parameter using: 0 = P (x;y)2S y g()(x) h()(x) P (x;y)2S h2 ()(x) + () I Using caches of intermediate results, the runtime for updating all model parameters is O(k Nz (X)). I CD can be extended to classi
  • 60. cation [Rendle, 2012]. Steen Rendle 32 / 53 [Rendle et al., 2011]
  • 61. Factorization Models Polynomial Regression Factorization Machines Applications Summary Gibbs Sampling (MCMC) I Gibbs sampling with a block for each FM model parameter : jS; n fg N P (x;y)2S y g()(x) h()(x) P (x;y)2S h2 ()(x) + () ; 1 P (x;y)2S h2 ()(x) + () ! I Mean is the same as for CD ) computational complexity is also O(k Nz (X)). I MCMC can be extended to classi
  • 62. cation using link functions. Steen Rendle 33 / 53 [Freudenthaler et al. 2011, Rendle 2012]
  • 63. Factorization Models Polynomial Regression Factorization Machines Applications Summary Learning Regularization Values  v ,v yi w ,w wj w0 xij i=1,...,n v j  j=1,...,p  , 0 ,0 yi wj w0 w xij i=1,...,n v j  j=1,...,p w0 ,w0 0 ,0 w0 ,w0 w  v v Standard FM with priors. Two level FM with hyperpriors. Steen Rendle 34 / 53 [Freudenthaler et al., 2011]
  • 64. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Model Examples Properties Learning libFM Software Applications Summary Steen Rendle 35 / 53
  • 65. Factorization Models Polynomial Regression Factorization Machines Applications Summary libFM Software libFM is an implementation of FMs I Model: second-order FMs I Learning/ inference: SGD, ALS, MCMC I Classi
  • 66. cation and regression I Uses the same data format as LIBSVM, LIBLINEAR [Lin et. al], SVMlight [Joachims]. I Supports variable grouping. I Open source: GPLv3. Steen Rendle 36 / 53 [http://www.libfm.org/]
  • 67. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 37 / 53
  • 68. Factorization Models Polynomial Regression Factorization Machines Applications Summary (Context-aware) Rating Prediction I Main variables: I User ID (categorical) I Item ID (categorical) I Additional variables: I time I mood I user pro
  • 69. le I item meta data I . . . I Examples: Net ix prize, Movielens, KDDCup 2011 + ♪ + + Song User Time Mood Steen Rendle 38 / 53
  • 70. Factorization Models Polynomial Regression Factorization Machines Applications Summary Net ix Prize Netflix Prize: Prediction Error Public Leaderboard RMS Error 0.86 0.87 0.88 0.89 0.90 user, movie user, movie, day user, movie, impl. user, movie, day, impl. SGD Matrix Factorization user, movie, day, impl., freq, lin. day $1M Prize I k = 128 factors, 512 MCMC samples (no burnin phase, initialization from random) I MCMC inference (no hyperparameters (learning rate, regularization) to specify) Steen Rendle 39 / 53
  • 71. Factorization Models Polynomial Regression Factorization Machines Applications Summary Net ix Prize Method (Name) Ref. Learning Method k Quiz RMSE Models using user ID and item ID Probabilistic Matrix Factorization [14, 13] Batch GD 40 *0.9170 Probabilistic Matrix Factorization [14, 13] Batch GD 150 0.9211 Matrix Factorization [6] Variational Bayes 30 *0.9141 Matchbox [15] Variational Bayes 50 *0.9100 ALS-MF [7] ALS 100 0.9079 ALS-MF [7] ALS 1000 *0.9018 SVD/ MF [3] SGD 100 0.9025 SVD/ MF [3] SGD 200 *0.9009 Bayesian Probablistic Matrix Factorization [13] MCMC 150 0.8965 (BPMF) Bayesian Probablistic Matrix Factorization (BPMF) [13] MCMC 300 *0.8954 FM, pred. var: user ID, movie ID - MCMC 128 0.8937 Models using implicit feedback Probabilistic Matrix Factorization with Cons- traints [14] Batch GD 30 *0.9016 SVD++ [3] SGD 100 0.8924 SVD++ [3] SGD 200 *0.8911 BSRM/F [18] MCMC 100 0.8926 BSRM/F [18] MCMC 400 *0.8874 FM, pred. var: user ID, movie ID, impl. - MCMC 128 0.8865 Steen Rendle 40 / 53
  • 72. Factorization Models Polynomial Regression Factorization Machines Applications Summary Net ix Prize Method (Name) Ref. Learning Method k Quiz RMSE Models using time information Bayesian Probabilistic Tensor Factorization [17] MCMC 30 *0.9044 (BPTF) FM, pred. var: user ID, movie ID, day - MCMC 128 0.8873 Models using time and implicit feedback timeSVD++ [5] SGD 100 0.8805 timeSVD++ [5] SGD 200 *0.8799 FM, pred. var: user ID, movie ID, day, impl. - MCMC 128 0.8809 FM, pred. var: user ID, movie ID, day, impl. - MCMC 256 0.8794 Assorted models BRISMF/UM NB corrected [16] SGD 1000 *0.8904 BMFSI plus side information [8] MCMC 100 *0.8875 timeSVD++ plus frequencies [4] SGD 200 0.8777 timeSVD++ plus frequencies [4] SGD 2000 *0.8762 FM, pred. var: user ID, movie ID, day, impl., - MCMC 128 0.8779 freq., lin. day FM, pred. var: user ID, movie ID, day, impl., freq., lin. day - MCMC 256 0.8771 Steen Rendle 40 / 53
  • 73. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 41 / 53
  • 74. Factorization Models Polynomial Regression Factorization Machines Applications Summary Link Prediction in Social Networks I Main variables: I Actor A ID I Actor B ID I Additional variables: I pro
  • 75. les I actions I . . . + Actor A Actor B Steen Rendle 42 / 53
  • 76. Factorization Models Polynomial Regression Factorization Machines Applications Summary KDDCup 2012: Track 1 KDDCup 2012 Track 1: Prediction Quality Public Leaderboard Private Leaderboard Mean Average Precision @3 0.32 0.34 0.36 0.38 0.40 0.42 none gender, age, ... keywords friends all none gender, age, ... keywords friends all Top 1 Top 5 Top 10 Top 100 I k = 22 factors, 512 MCMC samples (no burnin phase, initialization from random) I MCMC inference (no hyperparameters (learning rate, regularization) to specify) Steen Rendle 43 / 53 [Awarded 2nd place (out of 658 teams)]
  • 77. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 44 / 53
  • 78. Factorization Models Polynomial Regression Factorization Machines Applications Summary Clickthrough Prediction I Main variables: I User ID I Query ID I Ad/ Link ID I Additional variables: I query tokens I user pro
  • 79. le I . . . + keyword... + Link 1 Link 2 Link 3 User Query Ad/ Link Steen Rendle 45 / 53
  • 80. Factorization Models Polynomial Regression Factorization Machines Applications Summary KDDCup 2012: Track 2 Model Inference wAUC (public) wAUC (private) ID-based model (k = 0) SGD 0.78050 0.78086 Attribute-based model (k = 8) MCMC 0.77409 0.77555 Mixed model (k = 8) SGD 0.79011 0.79321 Final ensemble n/a 0.79857 0.80178 Ensemble I Rank positions (not predicted clickthrough rates) are used. I The MCMC attribute-based model and dierent variations of the . SGD models are included. Steen Rendle 46 / 53 [Awarded 3rd place (out of 171 teams)]
  • 81. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 47 / 53
  • 82. Factorization Models Polynomial Regression Factorization Machines Applications Summary ECML/PKDD Discovery Challenge 2013 I Problem: Recommend given names. I Main variables: I User ID I Name ID I Additional variables: I session info I string representation for each name I . . . I FM approach won 1st place (online track) and 2nd (oine track). Steen Rendle 48 / 53
  • 83. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 49 / 53
  • 84. Factorization Models Polynomial Regression Factorization Machines Applications Summary Student Performance Prediction I Main variables: I Student ID I Question ID I Additional variables: I question hierarchy I sequence of questions I skills required I . . . I Examples: KDDCup 2010, Grockit Challenge4 (FM placed 1st/241) + ? Student Question 4http://www.kaggle.com/c/WhatDoYouKnow Steen Rendle 50 / 53
  • 85. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 51 / 53
  • 86. Factorization Models Polynomial Regression Factorization Machines Applications Summary Kaggle Competitions FMs have been successfully applied to several Kaggle competitions: I Criteon Display Advertising Challenge: 1st place (team '3 idiots'). I Blue Book for Bulldozers: 1st place (team 'Leustagos Titericz'). I EMI Music Data Science Hackathon: 2nd place (team 'lns'). Steen Rendle 52 / 53
  • 87. Factorization Models Polynomial Regression Factorization Machines Applications Summary Summary I Factorization machines combine linear/polynomial regression with factorization models. I Feature interactions are learned with a low rank representation. I Estimation of unobserved interactions is possible. I Factorization machines can be computed eciently and have high prediction quality. Steen Rendle 53 / 53
  • 88. Factorization Models Polynomial Regression Factorization Machines Applications Summary L. Drumond, S. Rendle, and L. Schmidt-Thieme. Predicting rdf triples in incomplete knowledge bases with tensor factorization. In Proceedings of the 27th Annual ACM Symposium on Applied Computing, SAC '12, pages 326{331, New York, NY, USA, 2012. ACM. C. Freudenthaler, L. Schmidt-Thieme, and S. Rendle. Bayesian factorization machines. In NIPS workshop on Sparse Representation and Low-rank Approximation, 2011. Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative
  • 89. ltering model. In KDD '08: Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 426{434, New York, NY, USA, 2008. ACM. Y. Koren. The bellkor solution to the net ix grand prize. 2009. Steen Rendle 53 / 53
  • 90. Factorization Models Polynomial Regression Factorization Machines Applications Summary Y. Koren. Collaborative
  • 91. ltering with temporal dynamics. In KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 447{456, New York, NY, USA, 2009. ACM. Y. J. Lim and Y. W. Teh. Variational Bayesian approach to movie rating prediction. In Proceedings of KDD Cup and Workshop, 2007. I. Pilaszy, D. Zibriczky, and D. Tikk. Fast als-based matrix factorization for explicit and implicit feedback datasets. In RecSys '10: Proceedings of the fourth ACM conference on Recommender systems, pages 71{78, New York, NY, USA, 2010. ACM. I. Porteous, A. Asuncion, and M. Welling. Bayesian matrix factorization with side information and dirichlet process mixtures. In Proceedings of the Twenty-Fourth AAAI Conference on Arti
  • 92. cial Intelligence, AAAI 2010, pages 563{568, 2010. Steen Rendle 53 / 53
  • 93. Factorization Models Polynomial Regression Factorization Machines Applications Summary S. Rendle. Factorization machines. In Proceedings of the 2010 IEEE International Conference on Data Mining, ICDM '10, pages 995{1000, Washington, DC, USA, 2010. IEEE Computer Society. S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1{57:22, May 2012. S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme. Factorizing personalized markov chains for next-basket recommendation. In WWW '10: Proceedings of the 19th international conference on World wide web, pages 811{820, New York, NY, USA, 2010. ACM. S. Rendle, Z. Gantner, C. Freudenthaler, and L. Schmidt-Thieme. Fast context-aware recommendations with factorization machines. In Proceedings of the 34th ACM SIGIR Conference on Reasearch and Development in Information Retrieval. ACM, 2011. R. Salakhutdinov and A. Mnih. Steen Rendle 53 / 53
  • 94. Factorization Models Polynomial Regression Factorization Machines Applications Summary Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th international conference on Machine learning, ICML '08, pages 880{887, New York, NY, USA, 2008. ACM. R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 1257{1264, Cambridge, MA, 2008. MIT Press. D. H. Stern, R. Herbrich, and T. Graepel. Matchbox: large scale online bayesian recommendations. In Proceedings of the 18th international conference on World wide web, WWW '09, pages 111{120, New York, NY, USA, 2009. ACM. G. Takacs, I. Pilaszy, B. Nemeth, and D. Tikk. Scalable collaborative
  • 95. ltering approaches for large recommender systems. J. Mach. Learn. Res., 10:623{656, June 2009. Steen Rendle 53 / 53
  • 96. Factorization Models Polynomial Regression Factorization Machines Applications Summary L. Xiong, X. Chen, T.-K. Huang, J. Schneider, and J. G. Carbonell. Temporal collaborative
  • 97. ltering with bayesian probabilistic tensor factorization. In Proceedings of the SIAM International Conference on Data Mining, pages 211{222. SIAM, 2010. S. Zhu, K. Yu, and Y. Gong. Stochastic relational models for large-scale dyadic data using MCMC. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 1993{2000, 2009. Steen Rendle 53 / 53