1. Predicting Performance in
Recommender Systems
Alejandro Bellogín
Supervised by Pablo Castells and Iván Cantador
Escuela Politécnica Superior
Universidad Autónoma de Madrid
@abellogin
alejandro.bellogin@uam.es
ACM Conference on Recommender Systems 2011 – Doctoral Symposium
October 23, Chicago, USA
IRG – IR Group @ UAM
2. Motivation
Is it possible to predict the accuracy of a recommendation?
3. Hypothesis
Data that are commonly available to a
Recommender System could contain
signals that enable an a priori estimation
of the success of the recommendation
4. Research Questions
1. Is it possible to define a performance prediction theory for
recommender systems in a sound, formal way?
2. Is it possible to adapt query performance techniques (from
IR) to the recommendation task?
3. What kind of evaluation should be performed? Is IR
evaluation still valid in our problem?
4. What kind of recommendation problems can these models
be applied to?
5. Predicting Performance in Recommender Systems
RQ1. Is it possible to define a performance prediction theory
for recommender systems in a sound, formal way?
a) Define a predictor of performance γ(u, i, r, …)
b) Agree on a performance metric μ(u, i, r, …)
c) Check predictive power by measuring correlation
corr([γ(x1), …, γ(xn)], [μ(x1), …, μ(xn)])
d) Evaluate final performance: dynamic vs static
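Step (c) above can be sketched in a few lines of Python. The predictor values (γ) and metric values (μ) below are purely illustrative placeholders, not data from the thesis; only the correlation bookkeeping is the point.

```python
# Sketch of step (c): measuring the predictive power of a performance
# predictor gamma against a performance metric mu via Pearson correlation.
# The gamma/mu values below are illustrative, not from the slides.
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# gamma(x_k): predicted performance per user; mu(x_k): measured performance
gamma = [0.2, 0.5, 0.4, 0.8, 0.7]
mu    = [0.1, 0.4, 0.5, 0.9, 0.6]
r = pearson(gamma, mu)  # close to 1 => strong predictive power
```

A correlation near 1 (or -1) would indicate that the predictor orders users much like the metric does; values near 0 indicate no predictive power.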
6. Predicting Performance in Recommender Systems
RQ2. Is it possible to adapt query performance techniques
(from IR) to the recommendation task?
In IR: “Estimation of the system’s performance in response
to a specific query”
Several predictors proposed
We focus on query clarity → user clarity
7. User clarity
It captures uncertainty in user’s data
• Distance between the user’s and the system’s probability model
clarity(u) = \sum_{x \in X} p(x \mid u) \log \frac{p(x \mid u)}{p_c(x)}
where p(x | u) is the user's model and p_c(x) is the system's (background) model
• X may be: users, items, ratings, or a combination
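The rating-based instance of this formula (X = ratings) can be sketched as a KL divergence between the user's rating distribution and the background distribution. The smoothing used here is a simple additive constant, a simplification of the estimators actually used in the thesis.

```python
# Sketch of rating-based user clarity: KL divergence between the user's
# rating distribution p(r|u) and the background distribution p_c(r).
# Additive smoothing is an assumption; the slides' exact estimators differ.
import math
from collections import Counter

def distribution(ratings, vocabulary, smoothing=1e-6):
    """Smoothed empirical distribution over the rating vocabulary."""
    counts = Counter(ratings)
    total = len(ratings) + smoothing * len(vocabulary)
    return {r: (counts[r] + smoothing) / total for r in vocabulary}

def user_clarity(user_ratings, all_ratings, vocabulary=(1, 2, 3, 4, 5)):
    # clarity(u) = sum_r p(r|u) * log( p(r|u) / p_c(r) )
    p_u = distribution(user_ratings, vocabulary)
    p_c = distribution(all_ratings, vocabulary)
    return sum(p_u[r] * math.log(p_u[r] / p_c[r]) for r in vocabulary)

background = [1, 2, 3, 4, 5] * 20   # uniform background ratings
focused = [5] * 10                  # highly peaked user profile
typical = [1, 2, 3, 4, 5] * 2       # profile matching the background
```

A user whose ratings match the background distribution gets clarity near 0 (high uncertainty about what distinguishes them), while a sharply peaked profile diverges from the background and gets high clarity.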
8. User clarity
Three user clarity formulations:
Name                  | Vocabulary              | User model  | Background model
Rating-based          | Ratings                 | p(r | u)    | p_c(r)
Item-based            | Items                   | p(i | u)    | p_c(i)
Item-and-rating-based | Items rated by the user | p(r | i, u) | p_ml(r | i)

clarity(u) = \sum_{x \in X} p(x \mid u) \log \frac{p(x \mid u)}{p_c(x)}
where p(x | u) is the user model and p_c(x) is the background model
9. User clarity
Seven user clarity models implemented:
Name       | Formulation           | User model                 | Background model
RatUser    | Rating-based          | p_U(r | i, u); p_UR(i | u) | p_c(r)
RatItem    | Rating-based          | p_I(r | i, u); p_UR(i | u) | p_c(r)
ItemSimple | Item-based            | p_R(i | u)                 | p_c(i)
ItemUser   | Item-based            | p_UR(i | u)                | p_c(i)
IRUser     | Item-and-rating-based | p_U(r | i, u)              | p_ml(r | i)
IRItem     | Item-and-rating-based | p_I(r | i, u)              | p_ml(r | i)
IRUserItem | Item-and-rating-based | p_UI(r | i, u)             | p_ml(r | i)
10. User clarity
Predictor that captures uncertainty in user’s data
Different formulations capture different nuances
More dimensions in RS than in IR: users, items, ratings, features, …
[Figure: RatItem example – background distribution p_c(x) vs two user models p(x | u1), p(x | u2) over the rating values]
11. Predicting Performance in Recommender Systems
RQ3. What kind of evaluation should be performed? Is IR
evaluation still valid in our problem?
In IR: Mean Average Precision + correlation
50 points (queries) vs 1000+ points (users)
Performance metric is not clear: error-based, precision-based?
What is performance?
It may depend on the final application (e.g., observed correlation r ≈ 0.57)
Possible bias
E.g., towards users or items with larger profiles
13. Predicting Performance in Recommender Systems
RQ4. What kind of recommendation problems can these
models be applied to?
Whenever a combination of strategies is available
Example 1: dynamic neighbor weighting
Example 2: dynamic ensemble recommendation
14. Dynamic neighbor weighting
The user’s neighbors are weighted according to their similarity
Can we take into account the uncertainty in neighbor’s data?
User neighbor weighting [1]
• Static: g(u, i) = C \sum_{v \in N[u]} sim(u, v) \, rat(v, i)
• Dynamic: g(u, i) = C \sum_{v \in N[u]} \gamma(v) \, sim(u, v) \, rat(v, i)
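The static and dynamic scoring functions can be sketched side by side. Here `sim`, `rat` and `gamma` are toy stand-ins for the similarity function, the rating matrix, and the per-neighbor clarity predictor; the values are illustrative only.

```python
# Sketch of static vs clarity-weighted (dynamic) user-based CF scoring.
# sim, rat, gamma and all values below are illustrative stand-ins.

def score_static(u, i, neighbours, sim, rat, C=1.0):
    # g(u,i) = C * sum_{v in N[u]} sim(u,v) * rat(v,i)
    return C * sum(sim[(u, v)] * rat[(v, i)] for v in neighbours)

def score_dynamic(u, i, neighbours, sim, rat, gamma, C=1.0):
    # g(u,i) = C * sum_{v in N[u]} gamma(v) * sim(u,v) * rat(v,i)
    return C * sum(gamma[v] * sim[(u, v)] * rat[(v, i)] for v in neighbours)

sim = {("u", "v1"): 0.9, ("u", "v2"): 0.9}
rat = {("v1", "i"): 5.0, ("v2", "i"): 1.0}
gamma = {"v1": 1.0, "v2": 0.2}  # v1 predicted to be the better neighbour

static_score = score_static("u", "i", ["v1", "v2"], sim, rat)
dynamic_score = score_dynamic("u", "i", ["v1", "v2"], sim, rat, gamma)
```

With equal similarities, the static score treats both neighbors alike, while the dynamic score discounts the neighbor with low predicted goodness.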
15. Dynamic hybrid recommendation
Weight is the same for every item and user (learnt from training)
What about boosting those users predicted to perform better for
some recommender?
Hybrid recommendation [3]
• Static: g(u, i) = \lambda \, g_{R_1}(u, i) + (1 - \lambda) \, g_{R_2}(u, i)
• Dynamic: g(u, i) = \gamma(u) \, g_{R_1}(u, i) + (1 - \gamma(u)) \, g_{R_2}(u, i)
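The two combination rules can be sketched directly. The scores and weights below are illustrative; in the thesis γ(u) comes from a performance predictor such as user clarity.

```python
# Sketch of static vs dynamic linear hybrid combination of two
# recommenders R1 and R2; all numeric values below are illustrative.

def hybrid_static(g1, g2, lam):
    # g(u,i) = lambda * g_R1(u,i) + (1 - lambda) * g_R2(u,i)
    return lam * g1 + (1 - lam) * g2

def hybrid_dynamic(g1, g2, gamma_u):
    # g(u,i) = gamma(u) * g_R1(u,i) + (1 - gamma(u)) * g_R2(u,i)
    return gamma_u * g1 + (1 - gamma_u) * g2

# Static: one weight for every user, learnt from training.
# Dynamic: a user for whom R1 is predicted to perform well leans on R1.
s = hybrid_static(4.0, 2.0, 0.5)
d = hybrid_dynamic(4.0, 2.0, 0.9)
```

The only difference is that the mixing weight becomes a per-user function γ(u) instead of a global constant λ.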
17. Results – Neighbour weighting
Correlation analysis [1]
• With respect to Neighbour Goodness metric: “how good a neighbour is to her vicinity”
Positive, although not very strong correlations
Performance [1] (MAE = Mean Absolute Error; the lower, the better)
Improvement of over 5% w.r.t. the baseline
Plus, it does not degrade performance
[Figure: MAE of Standard CF vs Clarity-enhanced CF, (a) by % of ratings used for training, (b) by neighbourhood size up to 500]
19. Results – Hybrid recommendation
Correlation analysis [2]
• With respect to nDCG@50 (nDCG: normalized Discounted Cumulative Gain)
On average, most of the predictors obtain positive, strong correlations
Performance [3]
Dynamic strategy outperforms the static one for different combinations of recommenders
[Figure: P@50 and nDCG@50 for hybrids H1–H4, Adaptive vs Static]
20. Summary
Inferring user’s performance in a recommender system
Different adaptations of query clarity techniques
Building dynamic recommendation strategies
• Dynamic neighbor weighting: according to expected goodness of neighbor
• Dynamic hybrid recommendation: based on predicted performance
Encouraging results
• Positive predictive power (good correlations between predictors and metrics)
• Dynamic strategies obtain results that are better than (or equal to) the static ones
21. Related publications
[1] A Performance Prediction Approach to Enhance Collaborative
Filtering Performance. A. Bellogín and P. Castells. In ECIR 2010.
[2] Predicting the Performance of Recommender Systems: An
Information Theoretic Approach. A. Bellogín, P. Castells, and I.
Cantador. In ICTIR 2011.
[3] Performance Prediction for Dynamic Ensemble Recommender
Systems. A. Bellogín, P. Castells, and I. Cantador. In press.
22. Future Work
Explore other input sources
• Item predictors
• Social links
• Implicit data (with time)
We need a theoretical background
• Why do some predictors work better?
Larger datasets
23. FW – Other input sources
Item predictors
Social links
Implicit data (with time)
Item predictors could be very useful:
• Different recommender behavior depending on item attributes
• They would allow capturing popularity, diversity, etc.
24. FW – Other input sources
Item predictors
Social links
Implicit data (with time)
First results using social-based predictors
• Combination of social and CF
• Graph-based measures as predictors
• “Indicators” of the user’s strength
25. FW – Theoretical background
We need a theoretical background
• Why do some predictors work better?
26. Thank you!
Predicting Performance in
Recommender Systems
Alejandro Bellogín
Supervised by Pablo Castells and Iván Cantador
Escuela Politécnica Superior
Universidad Autónoma de Madrid
@abellogin
alejandro.bellogin@uam.es
Acknowledgements to the National Science Foundation
for the funding to attend the conference.
27. Reviewer’s comments: Confidence
Other methods to measure self-performance of RS
o Confidence
These methods capture the performance of the RS, not the user’s performance
28. Reviewer’s comments: Neighbor’s goodness
Neighbor goodness seems to be a little bit ad-hoc
We need a measurable definition of neighbor performance
NG(u) ≈ “total MAE reduction by u” = “MAE without u” − “MAE with u”

NG(u) = \frac{1}{|R(U \setminus u)|} \sum_{v \in U \setminus u} CE_{U \setminus u}(v) - \frac{1}{|R(U \setminus u)|} \sum_{v \in U \setminus u} CE_{U}(v)

CE_X(v) = \sum_{i : rat(v, i)} \left| \hat{r}_X(v, i) - r(v, i) \right|
Some attempts in trust research: sign and error deviation [Rafter et al. 2009]
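The Neighbor Goodness idea above can be sketched as simple bookkeeping over prediction errors. The per-user errors below are illustrative; producing the two sets of predictions (with and without u as a neighbor) is left abstract.

```python
# Sketch of Neighbour Goodness: NG(u) is the drop in MAE when user u
# participates as a neighbour. The error values below are illustrative;
# how the with/without-u predictions are produced is left abstract.

def cumulative_error(predictions, truths):
    # CE_X(v) analogue: sum of absolute errors over v's rated items
    return sum(abs(p - t) for p, t in zip(predictions, truths))

def neighbour_goodness(errors_without_u, errors_with_u, n_ratings):
    # NG(u) = MAE without u - MAE with u  (positive => u is a good neighbour)
    return (sum(errors_without_u) - sum(errors_with_u)) / n_ratings

# One affected user v: predictions improve when u is available as a neighbour.
errors_without = [cumulative_error([3.0, 4.5], [4.0, 4.0])]
errors_with = [cumulative_error([3.8, 4.2], [4.0, 4.0])]
ng = neighbour_goodness(errors_without, errors_with, n_ratings=2)
```

A positive NG(u) means predictions for the rest of the community get worse when u is removed, i.e. u is a useful neighbor.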
29. Reviewer’s comments: Neighbor’s weighting issues
Neighbor size vs dynamic neighborhood weighting
So far, only dynamic weighting
• Same training time as static weighting
• Future work: dynamic size
Apply this method to larger datasets
• Current work
Apply this method to other CF methods (e.g., latent factor models, SVD)
• More difficult to identify the combination
• Future work
30. Reviewer’s comments: Dynamic hybrid issues
Other methods to combine recommenders
o Stacking
o Multi-linear weighting
We focus on linear weighted hybrid recommendation
Future work: cascade, stacking