Presentation given at the Workshop on Recommendation Utility Evaluation: Beyond RMSE, held in conjunction with the ACM Conference on Recommender Systems (RecSys) on September 9, 2012
1. Modeling Difficulty in Recommender Systems
Benjamin Kille (@bennykille)
Competence Center Information Retrieval & Machine Learning
2. Outline
► Recommender System Evaluation
► Problem definition
► Difficulty in Recommender Systems
► Future work
► Conclusions
3. Recommender Systems Evaluation
► Definition of an evaluation measure:
RMSE (rating prediction scenario)
nDCG (ranking scenario)
Precision@N (top-N scenario)
► Splitting data into training and test partitions
► Reporting results as an average over the full set of users
► Is recommending to all users equally difficult?
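A minimal sketch of why this question matters (simulated data, not from the talk; assumes numpy): computing RMSE once over all users versus once per user shows how the global average can hide users who are served far worse than the mean.

```python
# Sketch: global RMSE vs. per-user RMSE on simulated data.
import numpy as np

rng = np.random.default_rng(0)

# Simulated test set: user ids, true ratings, and predictions whose
# error grows with the user id (so users are *not* equally difficult).
user_ids = rng.integers(0, 100, size=5000)
true_ratings = rng.integers(1, 6, size=5000).astype(float)
predicted = true_ratings + rng.normal(0.0, 0.3 + user_ids / 50.0)

# Global RMSE, as typically reported.
global_rmse = np.sqrt(np.mean((true_ratings - predicted) ** 2))

# RMSE computed separately for each user.
per_user = [
    np.sqrt(np.mean((true_ratings[user_ids == u] - predicted[user_ids == u]) ** 2))
    for u in np.unique(user_ids)
]

print(f"global RMSE: {global_rmse:.3f}")
print(f"per-user RMSE: {min(per_user):.3f} (best) to {max(per_user):.3f} (worst)")
```

The single global figure sits near the middle of a wide per-user range, which is exactly the information the usual evaluation protocol averages away.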
4. Observed Differences
► Users differ with respect to
Demographics (e.g., age, gender and location)
Taste
Needs
Expectations
Consumption patterns
…
► Recommendation algorithms do not perform equally well for each
single user
→ users should not all be evaluated in the same way!
5. Risks of Disregarding Users' Differences
► A subset of users receives worse recommendations than
necessary
► Recommendation algorithm optimization targets all users
equally:
"easy" users → optimization effort could be saved
"difficult" users → insufficient optimization
Control optimization towards those users who really require it!
How to determine difficulty?
6. Problem Formulation
► Measuring how difficult it will be to recommend items to a
user
► Ideally: deriving difficulty directly from user attributes
► Problem: unknown correlation between (combinations of)
attributes and difficulty
► We need a method to estimate the correlation between user
attributes and recommendation difficulty
7. Difficulty in Information Retrieval
► Target object: query
► Method: the query is submitted to several IR systems, each of which
returns a ranked list of documents:

  IR-System 1   IR-System 2   IR-System 3   IR-System 4   IR-System 5
  Doc 1         Doc 1         Doc 1         Doc 2         Doc 1
  Doc 2         Doc 2         Doc 3         Doc 1         Doc 2
  Doc 3         Doc 3         Doc 2         Doc 4         Doc 4
  …             …             …             …             …

► Difficulty = diversity of the returned document lists
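A minimal sketch of this measurement (my own illustration, not code from the talk): diversity is taken as one minus the average pairwise Jaccard overlap of the systems' top-k result lists, so identical lists give difficulty 0 and fully disjoint lists give difficulty 1.

```python
# Sketch: query difficulty as the diversity of ranked lists returned
# by several IR systems. Jaccard overlap on top-k sets is an assumption.
from itertools import combinations

def jaccard(a, b):
    """Set overlap of two top-k result lists."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

def query_difficulty(result_lists):
    """1 - mean pairwise overlap: 0 = full agreement, 1 = disjoint lists."""
    pairs = list(combinations(result_lists, 2))
    return 1.0 - sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# The five hypothetical top-3 lists from the table above.
lists = [
    ["Doc 1", "Doc 2", "Doc 3"],
    ["Doc 1", "Doc 2", "Doc 3"],
    ["Doc 1", "Doc 3", "Doc 2"],
    ["Doc 2", "Doc 1", "Doc 4"],
    ["Doc 1", "Doc 2", "Doc 4"],
]
print(f"difficulty: {query_difficulty(lists):.2f}")
```

Jaccard ignores rank order; a rank-aware list comparison (e.g., Kendall's tau) could be substituted without changing the overall scheme.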
8. Difficulty in Recommender Systems
► Select several (state-of-the-art) recommendation methods
► Measure the diversity of their outputs for a specific user
► Based on the methods' agreement with respect to predicted
rating / ranking / top-N items, we conclude:
high agreement → low difficulty
low agreement → high difficulty
► Target correlation (user attributes ~ difficulty) can be
estimated using the observed difficulties for a sufficiently
large set of users
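The sketch below (simulated data and stand-in recommenders; the talk proposes the scheme, not this code) scores each user's difficulty as the disagreement among several methods' top-N lists and then estimates the correlation between a user attribute and the observed difficulty scores.

```python
# Sketch: per-user difficulty from top-N disagreement across methods,
# then the attribute ~ difficulty correlation. All data is simulated.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n_users, n_items, top_n, n_methods = 200, 50, 5, 4

# Hypothetical user attribute, e.g., the size of the rating profile.
profile_size = rng.integers(5, 200, size=n_users)

def top_n_set(user, method):
    """Stand-in for a real recommender: item scores get noisier for
    users with small profiles, so their lists diverge more."""
    noise = (50.0 / profile_size[user]) * (method + 1)
    scores = np.arange(n_items) + rng.normal(0.0, noise, n_items)
    return set(np.argsort(scores)[-top_n:])

def difficulty(user):
    lists = [top_n_set(user, m) for m in range(n_methods)]
    overlaps = [len(a & b) / top_n for a, b in combinations(lists, 2)]
    return 1.0 - np.mean(overlaps)  # low agreement -> high difficulty

scores = np.array([difficulty(u) for u in range(n_users)])
r = np.corrcoef(profile_size, scores)[0, 1]
print(f"correlation(profile size, difficulty) = {r:.2f}")
```

Given enough users, such correlations are what would allow difficulty to be predicted from attributes alone, without running the whole ensemble for every new user.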
9. Future Work
► Experimentally verify the feasibility of difficulty estimation
► Evaluate observed correlation (user attributes ~ difficulty) on
data sets
► Investigate business rationale (reduced costs through
controlled optimization efforts)
► How to deal with sparsity / cold-start issues?
10. Conclusions
► Users should not be treated equally when evaluating
recommender systems
► Difficulty of recommendation tasks varies between users
► Difficulty will allow us to direct optimization towards those users
who require it
► Diversity metrics could be used to estimate difficulty scores
(analogously to information retrieval)
► Proposed method needs to be evaluated
11. Thank you for your attention!
Questions?