Letting Users Choose
Recommender Algorithms
Michael Ekstrand
(Texas State University)
Daniel Kluver, Max Harper, and Joe Konstan
(GroupLens Research / University of Minnesota)
Research Objective
If we give users control over the algorithm providing
their recommendations, what happens?
Why User Control?
• Different users, different needs/wants
• Allow users to personalize the recommendation
experience to their needs and preferences.
• Transparency and control may promote trust
Research Questions
• Do users make use of a switching feature?
• How much do they use it?
• What algorithms do they settle on?
• Do algorithm or user properties predict choice?
Relation to Previous Work
Paper you just saw: tweak algorithm output
We change the whole algorithm
Previous study (RecSys 2014): what do users perceive
to be different, and say they want?
We see what their actions say they want
Outline
1. Introduction (just did that)
2. Experimental Setup
3. Findings
4. Conclusion & Future Work
Context: MovieLens
• Let MovieLens users switch between algorithms
• Algorithm produces:
• Recommendations (in sort-by-recommended mode)
• Predictions (everywhere)
• Change is persistent until next tweak
• Switcher integrated into top menu
Algorithms
• Four algorithms
• Peasant: personalized (user-item) mean rating
• Bard: group-based recommender (Chang et al. CSCW
2015)
• Warrior: item-item CF
• Wizard: FunkSVD CF
• Each modified with 10% blend of popularity rank
for top-N recommendation
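The slide does not spell out how the 10% popularity blend works; a minimal sketch of one plausible scheme (a 90/10 linear mix of a min-max-normalized recommender score with a popularity-rank score — the weights, normalization, and function name are assumptions, not the study's actual code) might look like:

```python
def blend_with_popularity(rec_scores, pop_ranks, blend=0.1):
    """Rank items by a blend of recommender score and popularity.

    rec_scores: dict of item -> recommender score
    pop_ranks:  dict of item -> popularity rank (1 = most popular)
    Returns items sorted by the blended score, best first.
    """
    n = len(pop_ranks)
    lo = min(rec_scores.values())
    hi = max(rec_scores.values())
    blended = {}
    for item, score in rec_scores.items():
        # rescale the recommender score to [0, 1]; guard constant lists
        norm = (score - lo) / (hi - lo) if hi > lo else 0.5
        # convert popularity rank to a [0, 1] score (rank 1 -> 1.0)
        pop = 1.0 - (pop_ranks[item] - 1) / (n - 1) if n > 1 else 1.0
        blended[item] = (1 - blend) * norm + blend * pop
    return sorted(blended, key=blended.get, reverse=True)
```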
Experiment Design
• Only consider established users
• Each user randomly assigned an initial algorithm
(not the Bard)
• Allow users to change algorithms
• Interstitial highlighted feature on first login
• Log interactions
Users Switch Algorithms
• 3005 total users
• 25% (748) switched at least once
• 72.1% of switchers (539) settled on different
algorithm
Finding 1: Users do use the control
Ok, so how do they switch?
• Many times or just a few?
• Repeatedly throughout their use, or find an
algorithm and stick with it?
Switching Behavior: Few Times
Transition count histogram (users per number of transitions):
1: 196, 2: 157, 3: 118, 4: 63, 5: 54, 6: 32, 7: 12, 8: 21, 9: 22, 10: 12,
11: 11, 12: 4, 13: 7, 14: 3, 15: 5, 16: 4, 17: 1, 18: 4, 19: 2
Switching Beh.: Few Sessions
• Break sessions at 60 mins of inactivity
• 63% only switched in 1 session, 81% in 2 sessions
• 44% only switched in 1st session
• Few intervening events (switches concentrated)
Finding 2: users use the menu some, then leave it
alone
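The 60-minute inactivity rule used to break interaction logs into sessions can be sketched as follows (an illustration only; the MovieLens logging pipeline itself is not shown in the talk):

```python
def split_sessions(timestamps, gap_minutes=60):
    """Group event timestamps (epoch seconds) into sessions, starting a
    new session whenever the gap since the previous event exceeds
    `gap_minutes` of inactivity."""
    sessions = []
    current = []
    for ts in sorted(timestamps):
        if current and ts - current[-1] > gap_minutes * 60:
            # inactivity gap exceeded: close the session, start a new one
            sessions.append(current)
            current = [ts]
        else:
            current.append(ts)
    if current:
        sessions.append(current)
    return sessions
```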
I’ll just stay here…
Question: do users find some algorithms more
initially satisfactory than others?
Fraction of users switching away from their initial algorithm:
• Baseline: 29.69%
• Item-Item: 22.07%
• SVD: 17.67%
(all differences significant, χ² p < 0.05)
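The significance claim can be illustrated with a hand-rolled 2×2 Pearson chi-square (switched vs. stayed for two initial-algorithm arms, no continuity correction). The example counts below are rough reconstructions from the reported percentages, assuming roughly 1000 users per arm; they are not the study's actual figures:

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 contingency table
    [[a, b], [c, d]], e.g. rows = initial algorithm,
    cols = (switched, stayed). With 1 degree of freedom, a statistic
    above 3.84 corresponds to p < 0.05."""
    n = a + b + c + d
    num = n * (a * d - b * c) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den

# Illustrative: Baseline ~297/703 switched/stayed vs. SVD ~177/823
stat = chi2_2x2(297, 703, 177, 823)
```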
…or go over there…
Question: do users tend to find some algorithms
more finally satisfactory than others?
…by some path
What do users do between initial and final?
• As stated, not many flips
• Most common: change to other personalized,
maybe change back (A -> B, A -> B -> A)
• Users starting w/ baseline usually tried one or both
personalized algorithms
Final choice of algorithm (for users who tried the menu):
• Baseline: 53
• Group: 62
• Item-Item: 292
• SVD: 341
Algorithm Preferences
• Users prefer personalized (more likely to stay
initially or finally)
• Small preference for SVD over item-item
• Caveat: algorithm naming may confound
Interlude: Offline Experiment
• For each user:
• Discarded all ratings after starting experiment
• Use 5 most recent pre-experiment ratings for testing
• Train recommenders
• Measure:
• RMSE for test ratings
• Boolean recall: is a rated movie in first 24 recs?
• Diversity (intra-list similarity over tag genome)
• Mean pop. rank of 24-item list
• Why 24? Size of single page of MovieLens results
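The two accuracy metrics from the offline protocol are straightforward to sketch (function names are illustrative, not from the study's code):

```python
import math

def rmse(predictions, actuals):
    """Root mean squared error over paired predicted/actual ratings."""
    assert len(predictions) == len(actuals) and predictions
    se = sum((p - a) ** 2 for p, a in zip(predictions, actuals))
    return math.sqrt(se / len(predictions))

def boolean_recall(recommended, held_out, n=24):
    """1 if any held-out (rated) item appears in the top-n recs, else 0.
    n defaults to 24, one page of MovieLens results."""
    return int(bool(set(recommended[:n]) & set(held_out)))
```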
Algorithms Made Different Recs
• Average of 53.8 unique items/user (out of 72
possible)
• Baseline and Item-Item most different (Jaccard
similarity)
• Accuracy is another story…
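List overlap by Jaccard similarity, as used for the pairwise list comparison above, can be sketched as:

```python
def jaccard(list_a, list_b):
    """Jaccard similarity between two recommendation lists: size of the
    intersection over size of the union of recommended items.
    0 = disjoint lists, 1 = identical item sets."""
    a, b = set(list_a), set(list_b)
    if not a and not b:
        return 1.0  # two empty lists are trivially identical
    return len(a & b) / len(a | b)
```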
Algorithm Accuracy
[Bar charts: RMSE (axis 0.62–0.74) and Boolean Recall (axis 0–0.3), each comparing Baseline, Item-Item, and SVD]
Diversity and Popularity
Not Predicting User Preference
• Algorithm properties do not directly predict user
preference, or whether users will switch
• Little ability to predict user behavior overall
• If user starts with baseline, diverse baseline recs
increase likelihood of trying another algorithm
• If user starts w/ item-item, novel baseline recs increase
likelihood of trying
• No other significant effects found
• Basic user properties do not predict behavior
What does this mean?
• Users take advantage of the feature
• Users experiment a little bit, then leave it alone
• Observed preference for personalized recs,
especially SVD
• Impact on long-term user satisfaction unknown
Future Work
• Disentangle preference and naming
• More domains
• Understand impact on long-term user satisfaction
and retention
Questions?
This work was supported by the National Science Foundation under grants
IIS 08-08692 and 10-17697.