This document summarizes a study of a new recommendation algorithm, K-Furthest Neighbor (KFN), which recommends items disliked by users who are dissimilar to the target user. The study found that:
1) KFN performed worse than the standard K-Nearest Neighbor (KNN) algorithm on offline evaluation metrics but was perceived as more useful by users in online evaluations.
2) Users found KFN's recommendations less obvious and less recognizable than the standard algorithm's, but similarly serendipitous and useful.
3) Recommending items disliked by dissimilar users yields more diverse recommendations while maintaining comparable overall usefulness, even when standard offline metrics suggest otherwise.
User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Recommender Algorithm
1. User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Recommender Algorithm
Alan Said*1, Ben Fields#2, Brijnesh J. Jain*, Sahin Albayrak*
* Technische Universität Berlin
# musicmetric
1 @alansaid
2 @alsothings
CSCW 2013, San Antonio, TX, USA
February 27th, 2013
2. Abstract
• New recommendation algorithm for diverse recommendations
• Based on the k-nearest neighbor algorithm
• Two types of evaluation
o standard offline evaluation
o user-centric online evaluation
• Proposed algorithm performs worse than the baseline in offline evaluation but is perceived as more useful by users in the online evaluation
4. Background and Acknowledgements
Started as a (not very serious) discussion at IJCAI & ICWSM 2011
• Ben Fields - @alsothings
• Òscar Celma - @ocelma
• Markus Schedl - @m_schedl
• Mohamed Sordo - @neomoha
5. Recommendation
• What is it?
o Personalized information filtering
• What is the difference to search?
o Implicit (no explicit query)
o Passively finds the most interesting items
• How?
6. Recommendation - An example (knn)
Recommending a movie to Bert:
[Figure: a 1-5 star rating matrix; movies Toy Story, E.T., Beetlejuice, Shrek, and Zoolander rated by Bert, Ernie, Cookie Monster, Big Bird, Herry Monster, and Elmo; blank cells are unrated]
7. Recommendation - An example (knn)
Recommending a movie to Bert:
[Figure: the same rating matrix, annotated: K-Nearest Neighbor selects the users similar to Bert; unseen movies those neighbors rated highly are potential movies to recommend, movies with a poor rating are not, and Zoolander is marked as the recommendation]
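The nearest-neighbor scheme on this slide can be sketched in a few lines. A minimal illustration, not the study's implementation: the toy ratings, cosine similarity over co-rated items, k=2, and the "liked" threshold of 4 are all assumptions made for the example.

```python
import math

# Toy 1-5 star ratings (illustrative values, not the slide's exact matrix).
ratings = {
    "Bert":           {"Toy Story": 4, "E.T.": 2, "Shrek": 1},
    "Ernie":          {"Toy Story": 4, "E.T.": 3, "Beetlejuice": 4},
    "Big Bird":       {"Toy Story": 5, "E.T.": 5, "Beetlejuice": 5},
    "Cookie Monster": {"Toy Story": 1, "Shrek": 5, "Zoolander": 1},
}

def cosine_sim(u, v):
    """Cosine similarity computed over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def knn_recommend(target, k=2, like_threshold=4):
    """Recommend unseen movies that the k most similar users rated highly."""
    sims = sorted(((cosine_sim(ratings[target], ratings[u]), u)
                   for u in ratings if u != target), reverse=True)
    neighbors = [u for _, u in sims[:k]]
    seen = set(ratings[target])
    scores = {}
    for u in neighbors:
        for item, r in ratings[u].items():
            if item not in seen and r >= like_threshold:
                scores[item] = max(scores.get(item, 0), r)
    return sorted(scores, key=scores.get, reverse=True)

print(knn_recommend("Bert"))  # ['Beetlejuice']
```

Here Ernie and Big Bird are Bert's nearest neighbors, so the movie they both like but Bert has not seen becomes the recommendation.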
8. Recommendation - A counter example
What happens if we flip it?
Can we recommend movies disliked by those who are dissimilar to Bert?
Yes!
9. Recommendation - A counter example (kfn)
Recommending a movie to Bert:
1. Who is dissimilar to Bert?
[Figure: the rating matrix again, now used to identify the users least similar to Bert]
10. Recommendation - A counter example (kfn)
Recommending a movie to Bert:
1. Who is dissimilar to Bert?
2. What do they dislike?
[Figure: K-Furthest Neighbor over the rating matrix: Cookie Monster is dissimilar to Bert; Zoolander, which Cookie Monster dislikes ("Disliked by Cookie Monster - Liked by Bert?"), becomes the recommendation]
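Inverting the neighborhood, as the slide suggests, gives the sketch below. Again illustrative only: the toy ratings, the cosine measure, k=2, and the "dislike" threshold of 2 are assumptions, not the paper's exact formulation.

```python
import math

# Same toy ratings as the knn sketch (illustrative values only).
ratings = {
    "Bert":           {"Toy Story": 4, "E.T.": 2, "Shrek": 1},
    "Ernie":          {"Toy Story": 4, "E.T.": 3, "Beetlejuice": 4},
    "Big Bird":       {"Toy Story": 5, "E.T.": 5, "Beetlejuice": 5},
    "Cookie Monster": {"Toy Story": 1, "Shrek": 5, "Zoolander": 1},
}

def cosine_sim(u, v):
    """Cosine similarity over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    return dot / (math.sqrt(sum(u[i] ** 2 for i in common)) *
                  math.sqrt(sum(v[i] ** 2 for i in common)))

def kfn_recommend(target, k=2, dislike_threshold=2):
    """Recommend unseen movies that the k LEAST similar users rated poorly."""
    sims = sorted((cosine_sim(ratings[target], ratings[u]), u)
                  for u in ratings if u != target)  # ascending: furthest first
    furthest = [u for _, u in sims[:k]]
    seen = set(ratings[target])
    # Items the anti-neighborhood dislikes become candidates for the target.
    return sorted({item for u in furthest
                   for item, r in ratings[u].items()
                   if item not in seen and r <= dislike_threshold})

print(kfn_recommend("Bert"))  # ['Zoolander']
```

With this toy data Cookie Monster is Bert's furthest neighbor, so the movie Cookie Monster dislikes and Bert has not seen surfaces as the candidate, mirroring the slide's Zoolander example.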
11. Evaluation
What are the effects of this?
Diversity:
• Less popular items
• Items the users are not familiar with
• Non-standard items
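One hypothetical way to check the "less popular items" effect (the slides do not specify a metric; average catalog popularity is my assumption here) is to compare the mean number of raters over each algorithm's recommended items:

```python
from collections import Counter

# Toy rating data: popularity of an item = number of users who rated it.
all_ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "c": 2},
    "u3": {"a": 1, "b": 4, "d": 5},
}

def avg_popularity(recommended):
    """Mean number of raters over the recommended items (lower = more niche)."""
    counts = Counter(item for user in all_ratings.values() for item in user)
    return sum(counts[i] for i in recommended) / len(recommended)

print(avg_popularity(["a", "b"]))  # 2.5 (mainstream recommendations)
print(avg_popularity(["c", "d"]))  # 1.0 (long-tail recommendations)
```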
12. Evaluation - Recommendation Accuracy
Traditional - Offline Evaluation
• MovieLens 10M, 70k users
• Precision@N for users with >2N ratings
• Furthest performs at ~60% of nearest neighbor (for N=100, p < 0.001)
However
• the lists of recommended items are practically disjoint
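The two offline quantities on this slide, precision@N and the near-disjointness of the two algorithms' lists, can be computed as below. A sketch only: the list contents are invented, and Jaccard overlap is my assumption for how to measure "practically disjoint".

```python
def precision_at_n(recommended, relevant, n):
    """Fraction of the top-n recommendations found in the held-out test set."""
    return sum(1 for item in recommended[:n] if item in relevant) / n

def list_overlap(a, b):
    """Jaccard overlap between two recommendation lists (0 = disjoint)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

# Hypothetical top-5 lists for one user and a held-out set of liked movies.
knn_top5 = ["m1", "m2", "m3", "m4", "m5"]
kfn_top5 = ["m9", "m8", "m2", "m7", "m6"]
held_out = {"m2", "m5", "m8"}

print(precision_at_n(knn_top5, held_out, 5))  # 0.4
print(precision_at_n(kfn_top5, held_out, 5))  # 0.4
print(list_overlap(knn_top5, kfn_top5))       # ~0.11 (practically disjoint)
```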
15. Evaluation - Online User Study
10 recommended movies
7 questions
16. Evaluation – Recommendation Utility
Data
• 132 users
• 10 recommended movies each
• knn: 47 users
• kfn: 43 users
• random: 42 users
• training set: MovieLens 10M
Questions
• Novel?
• Obvious?
• Recognizable?
• Serendipitous?
• Useful?
• Best movie?
• Worst movie?
• Rate each seen movie
• State whether movie is familiar
• State whether you would see it
20. Evaluation – Recommendation Utility
Likert scale
1: least agree; 5: most agree

         rating  novelty  obviousness  recognizable  serendipity  usefulness
knn      3.64    3.83     2.27         2.69          2.71         2.69
kfn      3.65    3.95     1.79         2.07          2.65         2.63
random   3.07    4.17     1.64         1.81          2.48         2.24

kfn: highest rating; less obvious/recognizable; comparable serendipity and usefulness
remember: knn and kfn recommend different items, yet the experienced quality is similar (or higher)
21. Conclusion
Recommending what your anti-peers do not like creates:
• more diverse recommendations,
• with comparable overall usefulness,
• even though standard offline evaluation says otherwise
22. Questions?
Thank you for listening!
For more RecSys stuff, check out:
www.recsyswiki.com