1. Predicting Performance in
Recommender Systems
Alejandro Bellogín
Supervised by Pablo Castells and Iván Cantador
Escuela Politécnica Superior
Universidad Autónoma de Madrid
@abellogin
alejandro.bellogin@uam.es
ACM Conference on Recommender Systems 2011 – Doctoral Symposium
October 23, Chicago, USA
IRG – IR Group @ UAM
2. Motivation
Is it possible to predict the accuracy of a recommendation?
3. Hypothesis
Data that are commonly available to a
Recommender System could contain
signals that enable an a priori estimation
of the success of the recommendation
4. Research Questions
1. Is it possible to define a performance prediction theory for
recommender systems in a sound, formal way?
2. Is it possible to adapt query performance techniques (from
IR) to the recommendation task?
3. What kind of evaluation should be performed? Is IR
evaluation still valid in our problem?
4. What kind of recommendation problems can these models
be applied to?
5. Predicting Performance in Recommender Systems
RQ1. Is it possible to define a performance prediction theory
for recommender systems in a sound, formal way?
a) Define a predictor of performance γ(u, i, r, …)
b) Agree on a performance metric μ(u, i, r, …)
c) Check predictive power by measuring correlation
corr([γ(x1), …, γ(xn)], [μ(x1), …, μ(xn)])
d) Evaluate final performance: dynamic vs static
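Step (c) above can be sketched in a few lines of Python. The predictor values (γ) and metric values (μ) below are purely illustrative placeholders, not data from the thesis; only the correlation bookkeeping is the point.

```python
# Sketch of step (c): measuring the predictive power of a performance
# predictor gamma against a performance metric mu via Pearson correlation.
# The gamma/mu values below are illustrative, not from the slides.
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# gamma(x_k): predicted performance per user; mu(x_k): measured performance
gamma = [0.2, 0.5, 0.4, 0.8, 0.7]
mu    = [0.1, 0.4, 0.5, 0.9, 0.6]
r = pearson(gamma, mu)  # close to 1 => strong predictive power
```

A correlation near 1 (or -1) would indicate that the predictor orders users much like the metric does; values near 0 indicate no predictive power.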
6. Predicting Performance in Recommender Systems
RQ2. Is it possible to adapt query performance techniques
(from IR) to the recommendation task?
In IR: “Estimation of the system’s performance in response
to a specific query”
Several predictors proposed
We focus on query clarity → user clarity
7. User clarity
It captures uncertainty in user’s data
• Distance between the user’s and the system’s probability model
clarity(u) = \sum_{x \in X} p(x \mid u) \log \frac{p(x \mid u)}{p_c(x)}
where p(x | u) is the user's model and p_c(x) is the system's (background) model
• X may be: users, items, ratings, or a combination
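The rating-based instance of this formula (X = ratings) can be sketched as a KL divergence between the user's rating distribution and the background distribution. The smoothing used here is a simple additive constant, a simplification of the estimators actually used in the thesis.

```python
# Sketch of rating-based user clarity: KL divergence between the user's
# rating distribution p(r|u) and the background distribution p_c(r).
# Additive smoothing is an assumption; the slides' exact estimators differ.
import math
from collections import Counter

def distribution(ratings, vocabulary, smoothing=1e-6):
    """Smoothed empirical distribution over the rating vocabulary."""
    counts = Counter(ratings)
    total = len(ratings) + smoothing * len(vocabulary)
    return {r: (counts[r] + smoothing) / total for r in vocabulary}

def user_clarity(user_ratings, all_ratings, vocabulary=(1, 2, 3, 4, 5)):
    # clarity(u) = sum_r p(r|u) * log( p(r|u) / p_c(r) )
    p_u = distribution(user_ratings, vocabulary)
    p_c = distribution(all_ratings, vocabulary)
    return sum(p_u[r] * math.log(p_u[r] / p_c[r]) for r in vocabulary)

background = [1, 2, 3, 4, 5] * 20   # uniform background ratings
focused = [5] * 10                  # highly peaked user profile
typical = [1, 2, 3, 4, 5] * 2       # profile matching the background
```

A user whose ratings match the background distribution gets clarity near 0 (high uncertainty about what distinguishes them), while a sharply peaked profile diverges from the background and gets high clarity.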
8. User clarity
Three user clarity formulations:
Name                  | Vocabulary              | User model  | Background model
Rating-based          | Ratings                 | p(r | u)    | p_c(r)
Item-based            | Items                   | p(i | u)    | p_c(i)
Item-and-rating-based | Items rated by the user | p(r | i, u) | p_ml(r | i)

clarity(u) = \sum_{x \in X} p(x \mid u) \log \frac{p(x \mid u)}{p_c(x)}
where p(x | u) is the user model and p_c(x) is the background model
9. User clarity
Seven user clarity models implemented:
Name       | Formulation           | User model                 | Background model
RatUser    | Rating-based          | p_U(r | i, u); p_UR(i | u) | p_c(r)
RatItem    | Rating-based          | p_I(r | i, u); p_UR(i | u) | p_c(r)
ItemSimple | Item-based            | p_R(i | u)                 | p_c(i)
ItemUser   | Item-based            | p_UR(i | u)                | p_c(i)
IRUser     | Item-and-rating-based | p_U(r | i, u)              | p_ml(r | i)
IRItem     | Item-and-rating-based | p_I(r | i, u)              | p_ml(r | i)
IRUserItem | Item-and-rating-based | p_UI(r | i, u)             | p_ml(r | i)
10. User clarity
Predictor that captures uncertainty in user’s data
Different formulations capture different nuances
More dimensions in RS than in IR: users, items, ratings, features, …
[Figure: RatItem example – background distribution p_c(x) vs two user models p(x | u1), p(x | u2) over the rating values]
11. Predicting Performance in Recommender Systems
RQ3. What kind of evaluation should be performed? Is IR
evaluation still valid in our problem?
In IR: Mean Average Precision + correlation
50 points (queries) vs 1000+ points (users)
Performance metric is not clear: error-based, precision-based?
What is performance?
It may depend on the final application (e.g., observed correlation r ≈ 0.57)
Possible bias
E.g., towards users or items with larger profiles
13. Predicting Performance in Recommender Systems
RQ4. What kind of recommendation problems can these
models be applied to?
Whenever a combination of strategies is available
Example 1: dynamic neighbor weighting
Example 2: dynamic ensemble recommendation
14. Dynamic neighbor weighting
The user’s neighbors are weighted according to their similarity
Can we take into account the uncertainty in neighbor’s data?
User neighbor weighting [1]
• Static: g(u, i) = C \sum_{v \in N[u]} sim(u, v) \, rat(v, i)
• Dynamic: g(u, i) = C \sum_{v \in N[u]} \gamma(v) \, sim(u, v) \, rat(v, i)
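The static and dynamic scoring functions can be sketched side by side. Here `sim`, `rat` and `gamma` are toy stand-ins for the similarity function, the rating matrix, and the per-neighbor clarity predictor; the values are illustrative only.

```python
# Sketch of static vs clarity-weighted (dynamic) user-based CF scoring.
# sim, rat, gamma and all values below are illustrative stand-ins.

def score_static(u, i, neighbours, sim, rat, C=1.0):
    # g(u,i) = C * sum_{v in N[u]} sim(u,v) * rat(v,i)
    return C * sum(sim[(u, v)] * rat[(v, i)] for v in neighbours)

def score_dynamic(u, i, neighbours, sim, rat, gamma, C=1.0):
    # g(u,i) = C * sum_{v in N[u]} gamma(v) * sim(u,v) * rat(v,i)
    return C * sum(gamma[v] * sim[(u, v)] * rat[(v, i)] for v in neighbours)

sim = {("u", "v1"): 0.9, ("u", "v2"): 0.9}
rat = {("v1", "i"): 5.0, ("v2", "i"): 1.0}
gamma = {"v1": 1.0, "v2": 0.2}  # v1 predicted to be the better neighbour

static_score = score_static("u", "i", ["v1", "v2"], sim, rat)
dynamic_score = score_dynamic("u", "i", ["v1", "v2"], sim, rat, gamma)
```

With equal similarities, the static score treats both neighbors alike, while the dynamic score discounts the neighbor with low predicted goodness.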
15. Dynamic hybrid recommendation
Weight is the same for every item and user (learnt from training)
What about boosting those users predicted to perform better for
some recommender?
Hybrid recommendation [3]
• Static: g(u, i) = \lambda \, g_{R_1}(u, i) + (1 - \lambda) \, g_{R_2}(u, i)
• Dynamic: g(u, i) = \gamma(u) \, g_{R_1}(u, i) + (1 - \gamma(u)) \, g_{R_2}(u, i)
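The two combination rules can be sketched directly. The scores and weights below are illustrative; in the thesis γ(u) comes from a performance predictor such as user clarity.

```python
# Sketch of static vs dynamic linear hybrid combination of two
# recommenders R1 and R2; all numeric values below are illustrative.

def hybrid_static(g1, g2, lam):
    # g(u,i) = lambda * g_R1(u,i) + (1 - lambda) * g_R2(u,i)
    return lam * g1 + (1 - lam) * g2

def hybrid_dynamic(g1, g2, gamma_u):
    # g(u,i) = gamma(u) * g_R1(u,i) + (1 - gamma(u)) * g_R2(u,i)
    return gamma_u * g1 + (1 - gamma_u) * g2

# Static: one weight for every user, learnt from training.
# Dynamic: a user for whom R1 is predicted to perform well leans on R1.
s = hybrid_static(4.0, 2.0, 0.5)
d = hybrid_dynamic(4.0, 2.0, 0.9)
```

The only difference is that the mixing weight becomes a per-user function γ(u) instead of a global constant λ.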
17. Results – Neighbour weighting
Correlation analysis [1]
• With respect to Neighbour Goodness metric: “how good a neighbour is to her vicinity”
Positive, although not very strong correlations
Performance [1] (MAE = Mean Absolute Error; the lower, the better)
Improvement of over 5% w.r.t. the baseline
Plus, it does not degrade performance
[Figure: MAE of Standard CF vs Clarity-enhanced CF, (a) by % of ratings used for training, (b) by neighbourhood size up to 500]
19. Results – Hybrid recommendation
Correlation analysis [2]
• With respect to nDCG@50 (nDCG: normalized Discounted Cumulative Gain)
On average, most of the predictors obtain positive, strong correlations
Performance [3]
Dynamic strategy outperforms the static one for different combinations of recommenders
[Figure: P@50 and nDCG@50 for hybrids H1–H4, Adaptive vs Static]
20. Summary
Inferring user’s performance in a recommender system
Different adaptations of query clarity techniques
Building dynamic recommendation strategies
• Dynamic neighbor weighting: according to expected goodness of neighbor
• Dynamic hybrid recommendation: based on predicted performance
Encouraging results
• Positive predictive power (good correlations between predictors and metrics)
• Dynamic strategies obtain results that are better than (or equal to) the static ones
21. Related publications
[1] A Performance Prediction Approach to Enhance Collaborative
Filtering Performance. A. Bellogín and P. Castells. In ECIR 2010.
[2] Predicting the Performance of Recommender Systems: An
Information Theoretic Approach. A. Bellogín, P. Castells, and I.
Cantador. In ICTIR 2011.
[3] Performance Prediction for Dynamic Ensemble Recommender
Systems. A. Bellogín, P. Castells, and I. Cantador. In press.
22. Future Work
Explore other input sources
• Item predictors
• Social links
• Implicit data (with time)
We need a theoretical background
• Why do some predictors work better?
Larger datasets
23. FW – Other input sources
Item predictors
Social links
Implicit data (with time)
Item predictors could be very useful:
• Different recommender behavior depending on item attributes
• They would allow capturing popularity, diversity, etc.
24. FW – Other input sources
Item predictors
Social links
Implicit data (with time)
First results using social-based predictors
• Combination of social and CF
• Graph-based measures as predictors
• “Indicators” of the user’s strength
25. FW – Theoretical background
We need a theoretical background
• Why do some predictors work better?
26. Thank you!
Predicting Performance in
Recommender Systems
Alejandro Bellogín
Supervised by Pablo Castells and Iván Cantador
Escuela Politécnica Superior
Universidad Autónoma de Madrid
@abellogin
alejandro.bellogin@uam.es
Acknowledgements to the National Science Foundation
for the funding to attend the conference.
27. Reviewer’s comments: Confidence
Other methods to measure self-performance of RS
o Confidence
These methods capture the performance of the RS, not the user’s performance
28. Reviewer’s comments: Neighbor’s goodness
Neighbor goodness seems to be a little bit ad-hoc
We need a measurable definition of neighbor performance
NG(u) ≈ “total MAE reduction by u” = “MAE without u” − “MAE with u”

NG(u) = \frac{1}{|R(U \setminus u)|} \sum_{v \in U \setminus u} CE_{U \setminus u}(v) - \frac{1}{|R(U \setminus u)|} \sum_{v \in U \setminus u} CE_{U}(v)

CE_X(v) = \sum_{i : rat(v, i)} \left| \hat{r}_X(v, i) - r(v, i) \right|
Some attempts in trust research: sign and error deviation [Rafter et al. 2009]
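The Neighbor Goodness idea above can be sketched as simple bookkeeping over prediction errors. The per-user errors below are illustrative; producing the two sets of predictions (with and without u as a neighbor) is left abstract.

```python
# Sketch of Neighbour Goodness: NG(u) is the drop in MAE when user u
# participates as a neighbour. The error values below are illustrative;
# how the with/without-u predictions are produced is left abstract.

def cumulative_error(predictions, truths):
    # CE_X(v) analogue: sum of absolute errors over v's rated items
    return sum(abs(p - t) for p, t in zip(predictions, truths))

def neighbour_goodness(errors_without_u, errors_with_u, n_ratings):
    # NG(u) = MAE without u - MAE with u  (positive => u is a good neighbour)
    return (sum(errors_without_u) - sum(errors_with_u)) / n_ratings

# One affected user v: predictions improve when u is available as a neighbour.
errors_without = [cumulative_error([3.0, 4.5], [4.0, 4.0])]
errors_with = [cumulative_error([3.8, 4.2], [4.0, 4.0])]
ng = neighbour_goodness(errors_without, errors_with, n_ratings=2)
```

A positive NG(u) means predictions for the rest of the community get worse when u is removed, i.e. u is a useful neighbor.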
29. Reviewer’s comments: Neighbor’s weighting issues
Neighbor size vs dynamic neighborhood weighting
So far, only dynamic weighting
• Same training time as static weighting
• Future work: dynamic size
Apply this method to larger datasets
• Current work
Apply this method to other CF methods (e.g., latent factor models, SVD)
• More difficult to identify the combination
• Future work
30. Reviewer’s comments: Dynamic hybrid issues
Other methods to combine recommenders
o Stacking
o Multi-linear weighting
We focus on linear weighted hybrid recommendation
Future work: cascade, stacking