SlideShare une entreprise Scribd logo
1  sur  59
Télécharger pour lire hors ligne
What recommender
systems can learn from
decision psychology
about preference elicitation
and behavioral change
Martijn Willemsen
Human-Technology Interaction
www.martijnwillemsen.nl
What are recommender systems about
Algorithms
Accuracy:
compare prediction
with actual values
Recommendation:
best predicted items
dataset
user-item rating pairs
user
Choose (prefer?)
ratings
Rating?
Experience!
preferences
Goals &
desires!
User-Centric Framework
Computers Scientists (and marketing researchers) would study
behavior…. (they hate asking the user or just cannot (AB tests))
User-Centric Framework
Psychologists and HCI people are mostly interested in experience…
User-Centric Framework
Though it helps to triangulate experience and behavior…
User-Centric Framework
Our framework adds the intermediate construct of perception that explains
why behavior and experiences changes due to our manipulations
User-Centric Framework
And adds personal
and situational
characteristics
Relations modeled
using factor analysis
and SEM
Knijnenburg, B.P., Willemsen, M.C., Gantner, Z., Soncu, H., Newell, C. (2012). Explaining
the User Experience of Recommender Systems. User Modeling and User-Adapted
Interaction (UMUAI), vol 22, p. 441-504 http://bit.ly/umuai
Providing input to the
recommender:
Preference Elicitation
memory, ratings & choices
Providing input to a recommender
system
how do algorithms get their data?
Preference Elicitation (PE)
PE is a major topic in research on
Decision Making
I even did my PhD thesis on it… ;-)
What can Psychology learn us on
improving this aspect?
Role of memory in ratings
Rating support
Rating versus choice-based elicitation
What does rating entail?
Typical recommender scenario:
First usage: Show a set of (typical) items and ask
people to rate the ones they know (cold start)
Later usage: people go to the recommender when they
have consumed an item to rate it and typically also rate
other aspects
Does it matter if the preference you provide (by
rating) is based on recent experiences or mostly
on your memory?
Psychologist: user knows best, ask her!
In two user experiments, users rated a number of
movies that were aired on Dutch TV in the
previous month (~150 movies)
Rate 15-20 movies from that set that you know
and indicate how long ago you have seen these
movies (last week, last month, last 6 months, last
year, last 3 years, longer ago)
Motivate ratings for two movies (one seen recently
and one seen more than a year ago)
Results
247 users, 4212 ratings
Rating distributions:
Most movies are seen
long ago…
Only 28% seen in the
last year
# Positive ratings
decrease with time
1st timeslot: 60% 4/5
star
Last timeslot: only 36%
*****
****
***
**
*
Modeling the ratings
Coefficient Std. Err. t-value
intercept 2.95 0.15 19.05
time 0.29 0.13 2.31
highrated 1.62 0.22 7.43
time2 -0.09 0.02 -3.55
Time x highrated -0.73 0.18 -4.10
Tine2 x highrated 0.11 0.03 3.26
Multilevel model:
Random intercepts for movies
and users
high-rated versus low-rated
shows a different pattern
Regression towards the mean?
High-rated
Low rated
This is a problem…
How can we train a recommender system..
If ratings depend on our memory this much…
This is new to psychology as well!
Memory effects like this have not been studied…
Problem lies partly in the type of judgment asked:
Rating is separate evaluation on an absolute scale…
Lacks a good reference/comparison
Two solutions we explored:
Rating support
Different elicitation methods: choice!
Joint versus Separate Evaluation
Evaluations of two job candidates for a computer
programmer position expecting the use of a special
language called KY.
Mean WTP (in thousands):
Separate $ 32.7 k $ 26.8 k
Joint $ 31.2 k $ 33.2 k
Candidate A Candidate B
Education B.Sc. computer Sc. B.Sc. computer Sc.
GPA (0-5) 4.8 3.1
KY Experience 10 KY programs 70 KY programs
17
Rating support interfaces
Using movielens!
Can we help users during rating to make
their ratings more stable/accurate?
We can support their memory for the movie using tags
We can help ratings on the scale with previous ratings
Movielens has a tag genome and a history of
ratings so we can give real-time user-specific
feedback!
Nguyen, T. T., Kluver, D., Wang, T.-Y., Hui, P.-M., Ekstrand, M. D., Willemsen, M.
C., & Riedl, J. (2013). Rating Support Interfaces to Improve User Experience
and Recommender Accuracy. RecSys 2013 (pp. 149–156)
Tag Interface
Provide 10 tags
that are relevant
for that user
and that describe
the movie well
Didn’t really help…
Exemplar Interface
Support rating on the
scale by providing
exemplars:
Exemplar: Similar movies
rated before by that user
for that level on the scale
This helps to anchor the
values on the scale better:
more consistent ratings
Bur what are preferences?
Ratings are absolute statements
of preference…
But preference is a relative
statement…
I like Grand Budapest hotel more
than King’s Speech
So why not ask users to choose?
Which do you prefer?
Others also tried different PE methods
Loepp, Hussein & Ziegler
(CHI 2014)
Choose between sets of
movies that differ a lot on a
latent feature
Chang, Harper & Terveen
(CSCW 2015)
Choose between groups
of similar movies
By assigning points per
group (ranking!)
Choice-based preference elicitation
Choices are relative statements that are easier to make
Better fit with final goal: finding a good item rather than making a
good prediction
In Marketing, conjoint-based analysis uses the same idea
to determine attribute weights and utilities based on a
series of (adaptive) choices
Can we use a set of choices in the matrix factorization
space to determine a user vector in a stepwise fashion?
Graus, M.P. & Willemsen, M.C. (2015). Improving the user
experience during cold start through choice-based preference
elicitation. In Proceedings of the 9th ACM conference on
Recommender systems (pp. 273-276)
Dimensions in Matrix Factorization
Dimensionality reduction
Users and items are
represented as vectors on a set
of latent features
item vector: utility of attributes
user vector: weights of attributes
Rating is the dot product of
these vectors (overall utility!)
Gus will like Dumb and
Dumber but hate Color Purple
Koren, Y., Bell, R., and Volinsky, C. 2009. Matrix Factorization
Techniques for Recommender Systems. IEEE Computer 42, 8, 30–37.
How does this work? Step 1
Latent Feature 1
LatentFeature2
Iteration 1a: Diversified choice set is
calculated from a matrix factorization
model (red items)
Iteration 1b: User vector (blue arrow) is
moved towards chosen item (green item),
items with lowest predicted rating are
discarded (greyed out items)
How does this work? Step 2
Iteration 2: New diversified choice set
(blue items)
End of Iteration 2: with updated vector
and more items discarded based on
second choice (green item)
Evaluation of Preference Elicitation
Choice-based PE: choosing 10 times from 10 items
Rating-based PE: rating 15 items
After each PE method they evaluated the interface on
interaction usability in terms of ease of use
e.g., “It was easy to let the system know my preferences”
Effort: e.g., “Using the interface was effortful.”
effort and usability are highly related (r=0.62)
Results: less perceived effort for choice-based PE
perceived effort goes down with completion time
Behavioral data of PE-tasks
Choice-based PE: most users find their perfect item
around the 8th / 9th item and they inspect quite some
unique items along the way
Rating-based: user inspect many
lists (Median = 13), suggesting
high effort in rating task.
Perception of Recommendation List
Participants evaluated each recommendation list
separately on Choice Difficulty and Satisfaction
Satisfaction
with Chosen
Item
Obscurity
Difficulty
Intra List
Similarity
-2.407(.381)
p<.001
-.240 (.145)
p<.1
-.479 (.111)
p<.001
-.257 (.045)
p<.001
14.00 (4.51)
p<.01
Choice-
Based List
+
-
- -
-
Conclusion
Participants experienced reduced effort and increased
satisfaction for choice-based PE over rating-based PE
relative (choice) rather than absolute (rating) PE could alleviate the
cold-start problem for new users
Further research needed:
the parameterization of the choice task
strong effect of choice on the popularity of the resulting list
Using trailers helps to decrease popularity (-> IntRS 2016)
novelty effects might have played a role: fun way of interacting?
Recommending for
Behavioral Change
Energy saving and
Hypertension management
Behavioral change
Behavioral change is hard…
Exercising more, eat healthy, reduce alchohol
consumption (reducing Binge watching on Netflix )
Needs awareness, motivation and commitment
Combi model:
Klein, Mogles, Wissen
Journal of Biomedical Informatics, 2014
What can recommenders do?
Persuasive Technology: focused on how to help
people change their behavior:
personalize the message…
Recommenders systems can help with what to
change and when to act
personalize what to do next…
This requires different models/algorithms
our past behavior/liking is not what we want to do now!
Two illustrations of new approach:
energy saving
hypertension management
How can we help people to save
energy?
Our first (old) recommender system
Recommendations
Selected
measures
Things you
already do or
don’t want to
Attributes
Set
attribute
weights
Show items with highest Uitem,user, where
Uitem,user = ∑ Vitem,attribute • Wattribute,user
Study 3 (AMCIS 2014)
Online lab study
—147 paid pps (79M, 68F, mean age: 40.0)
—Selected pps interacted for at least 2.5 minutes
3 PE-methods, 2 baselines
—Attribute-based PE
—Implicit PE
—Hybrid PE (attribute + implicit)
—Sort (baseline, not personalized)
—Top-N (baseline, not personalized)
http://bit.ly/amcis14
Study 3 — Results
Experts prefer Attribute-based PE and Hybrid PE,
novices prefer Top-N and Sort (baselines)
System satisfaction mediates the effect on choice
satisfaction and behavior!
Systemsatisfaction
Domain knowledge http://bit.ly/amcis14
Tailoring energy-saving
advice using a unidimensional
Rasch scale of conservation
measures
Work with Alain Starke (Ph. D. student)
Towards a better (psychometric) user model
consumers differ in energy-saving capabilities, attitudes,
goals, …
Our prior work did not take that into account…
Energy-saving interventions are more effective when
personalized. But how?
≠
(Cf. Abrahamse et al., 2005)
Single energy saving
dimension/attitude?
Campbell’s Paradigm (Kaiser et al., 2010)
“One’s attitude or ability becomes apparent
through its behavior…”
“Attitude and Behavior are two sides of the
same coin…”
Different from standard psychological
approaches to measure attitudes & intentions
with likert scales…
41
Psychological assumptions
Three assumptions for our user model
(Based on Kaiser et al., 2010)
1. All Energy-saving behaviors form a class serving a
single goal: Saving Energy
2. Less performed behaviors yield higher Behavioral
Costs (i.e. are more difficult)
3. Individuals that execute more energy-saving behaviors
have a higher Energy-saving Ability (i.e. more skilled)
The Rasch model
The Rasch model equates behavioral difficulties and
individual propensities in a probabilistic model
Log-odds of engagement levels (yes/no):
𝜽 = an individual’s propensity/attitude
𝜹 = behavioral difficulty
P = probability of individual n engaging in behavior i
Rasch also determines individual propensities and
item difficulties & fits them onto a single scale
One scale may have lots of different difficulty levels
43
𝐥𝐧
𝑷 𝒏𝒊
𝟏 − 𝑷 𝒏𝒊
= 𝜽 𝒏 − 𝜹𝒊
The resulting Rasch scale
Pictures disclaimer: courtesy of my PhD student
(the guy on the right)
Item difficulty vs. person ability
Probability of a person executing behavior
depends on the Ability - Costs
LED lighting
Unplug chargers
Install PV panels
Using Rasch for tailored advice
Earlier research (Kaiser, Urban) found evidence
for a unidimensional scale, but with few items &
no advice
We set out a Rasch-based, energy
recommender system that:
Shows the measures in order of difficulty (either
ascending or descending)
Provide tailored conservation advice to users (or not)
Include a more extensive set of measures
46
Energy saving ‘Webshop’
Ordered set of measures with rich information and
(in some conditions) recommendations
Order:
Ascending or
descending in
Rasch difficulty
Rasch
recommendation:
start at their ability
(3 items highlighted)
or just at the bottom
(no highlights)
Procedure
1. Determining Ability: Each
user indicated for 13
measures whether
he/she executed them
2. Show webshop in one of 4 conditions (total N=224)
and ask to select a set of measures (to execute in
next 4 weeks)
3. Survey to measure user experience
4. After 4 weeks: report back on which measures they
implemented (to some extent) N=86
Results: Experience (SEM model)
Perceived
Effort
Perceived
Support
Choice
Satisfaction
-0.40*** 0.59***
-0.56**
* p < .05, ** p < .01, *** p < .001.
0.74**
-0.65*
Rasch
recomm.
Low
Ability
Ascending
Order
Results
Recommendations work (esp. for descending
order)
higher perceived support higher choice satisfaction
What did they choose?
Rank-ordered logit with
ability – difficulty/costs
difference as predictor
Without recommendations
they select easy
measures (below their
ability)
With recommendations
they choose measures
around their ability
Measure Follow-up after 4 weeks
Follow-up depends on relative costs (they are
more likely to do the easier measures…)
Lifestyle recommendations
for Hypertension
Management
Joint work with Mustafa Radha (Ph.D. student)
Radha, Willemsen, Boerhof & IJsselsteijn
UMAP 2016
Hypertension
Hypertension occurs in 30% of the population
Hypertension is without symptoms.
Hypertension is a leading cause of death.
Hypertension can be prevented or treated with
lifestyle change, such as:
Salt intake reduction
Physical activity
Weight control
Alcohol moderation
Adherence
Healthy habits have a strong benefit,
but are not always feasible
Study 1: Model construction
Online survey
300 participants between 40 and 60 years old
50% hypertensive
Self-reported engagement in 63 health behaviors
about diet, sodium intake and physical activity
Results:
reliable Rasch scale (for both persons and items)
No strong differences between subgroups
Unidimensional: different categories
mix nicely across the scale
The Rasch scale
Study 2: Coaching strategies
The engagement maximization strategy selects
the easiest behaviors (that she does not do yet)
The motivation maximization strategy selects
behaviors with difficulties that match the
individual’s ability (that she does no do yet)
The random control strategy selects behaviors
(not done yet) at random without regard for
difficulty
Study 2: design
150 hypertensive users invited online
Questionnaire used to measure user ability and
find behaviors that user should be coached on.
Pairwise comparison between the three
intervention strategies through virtual coaches
Medium abilityEntire population
Results
EM
EM
Health Benefit
Personalization
RC
RC
MM RC
Health Benefit
MM
MM
Health Benefit
Appeal
RC
RC
Conclusions
Engagement maximization (easy behaviors not
done yet) outperforms random most of the time
The rasch order helps!
Knowing the ability helps to elicit better
recommendations by tailoring: for medium ability
individuals
But for other groups it does not matter much…
Questions?
Contact:
Martijn Willemsen
@MCWillemsen
M.C.Willemsen@tue.nl
www.martijnwillemsen.nl
Thanks to my co-authors:
Mark Graus, Alain Starke
Mustafa Radha
Bart Knijnenburg, Ron Broeders

Contenu connexe

Tendances

Paper Presentation: Data Mining User Preference in Interactive Multimedia
Paper Presentation: Data Mining User Preference in Interactive MultimediaPaper Presentation: Data Mining User Preference in Interactive Multimedia
Paper Presentation: Data Mining User Preference in Interactive Multimedia
Jeanette Howe
 
Mentor Palokaj - Thesis Final Version v1.1 - Response of Bartle types to a me...
Mentor Palokaj - Thesis Final Version v1.1 - Response of Bartle types to a me...Mentor Palokaj - Thesis Final Version v1.1 - Response of Bartle types to a me...
Mentor Palokaj - Thesis Final Version v1.1 - Response of Bartle types to a me...
Mentor Palokaj
 
Fairness-aware Learning through Regularization Approach
Fairness-aware Learning through Regularization ApproachFairness-aware Learning through Regularization Approach
Fairness-aware Learning through Regularization Approach
Toshihiro Kamishima
 
Consideration on Fairness-aware Data Mining
Consideration on Fairness-aware Data MiningConsideration on Fairness-aware Data Mining
Consideration on Fairness-aware Data Mining
Toshihiro Kamishima
 

Tendances (13)

Btp 3rd Report
Btp 3rd ReportBtp 3rd Report
Btp 3rd Report
 
A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015
A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015
A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015
 
Recommender system
Recommender systemRecommender system
Recommender system
 
Recommender systems to help people move forward
Recommender systems to help people move forwardRecommender systems to help people move forward
Recommender systems to help people move forward
 
Developing Movie Recommendation System
Developing Movie Recommendation SystemDeveloping Movie Recommendation System
Developing Movie Recommendation System
 
Recommender-technology-ReColl08
Recommender-technology-ReColl08Recommender-technology-ReColl08
Recommender-technology-ReColl08
 
Paper Presentation: Data Mining User Preference in Interactive Multimedia
Paper Presentation: Data Mining User Preference in Interactive MultimediaPaper Presentation: Data Mining User Preference in Interactive Multimedia
Paper Presentation: Data Mining User Preference in Interactive Multimedia
 
Jeffrey xu yu large graph processing
Jeffrey xu yu large graph processingJeffrey xu yu large graph processing
Jeffrey xu yu large graph processing
 
Mentor Palokaj - Thesis Final Version v1.1 - Response of Bartle types to a me...
Mentor Palokaj - Thesis Final Version v1.1 - Response of Bartle types to a me...Mentor Palokaj - Thesis Final Version v1.1 - Response of Bartle types to a me...
Mentor Palokaj - Thesis Final Version v1.1 - Response of Bartle types to a me...
 
Stated preference methods and analysis
Stated preference methods and analysisStated preference methods and analysis
Stated preference methods and analysis
 
Fairness-aware Learning through Regularization Approach
Fairness-aware Learning through Regularization ApproachFairness-aware Learning through Regularization Approach
Fairness-aware Learning through Regularization Approach
 
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation Systems
 
Consideration on Fairness-aware Data Mining
Consideration on Fairness-aware Data MiningConsideration on Fairness-aware Data Mining
Consideration on Fairness-aware Data Mining
 

Similaire à What recommender systems can learn from decision psychology about preference elicitation and behavioral change

Social Recommender Systems Tutorial - WWW 2011
Social Recommender Systems Tutorial - WWW 2011Social Recommender Systems Tutorial - WWW 2011
Social Recommender Systems Tutorial - WWW 2011
idoguy
 
Preference Elicitation Interface
Preference Elicitation InterfacePreference Elicitation Interface
Preference Elicitation Interface
晓愚 孟
 

Similaire à What recommender systems can learn from decision psychology about preference elicitation and behavioral change (20)

Explaining recommendations: design implications and lessons learned
Explaining recommendations: design implications and lessons learnedExplaining recommendations: design implications and lessons learned
Explaining recommendations: design implications and lessons learned
 
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation System
 
Evaluating Collaborative Filtering Recommender Systems
Evaluating Collaborative Filtering Recommender SystemsEvaluating Collaborative Filtering Recommender Systems
Evaluating Collaborative Filtering Recommender Systems
 
Social Recommender Systems Tutorial - WWW 2011
Social Recommender Systems Tutorial - WWW 2011Social Recommender Systems Tutorial - WWW 2011
Social Recommender Systems Tutorial - WWW 2011
 
Preference Elicitation Interface
Preference Elicitation InterfacePreference Elicitation Interface
Preference Elicitation Interface
 
2 Studies UX types should know about (Straub UXPA unconference13)
2 Studies UX types should know about (Straub UXPA unconference13)2 Studies UX types should know about (Straub UXPA unconference13)
2 Studies UX types should know about (Straub UXPA unconference13)
 
Collaborative filtering
Collaborative filteringCollaborative filtering
Collaborative filtering
 
Explainable AI for non-expert users
Explainable AI for non-expert usersExplainable AI for non-expert users
Explainable AI for non-expert users
 
A Novel Latent Factor Model For Recommender System
A Novel Latent Factor Model For Recommender SystemA Novel Latent Factor Model For Recommender System
A Novel Latent Factor Model For Recommender System
 
Fairness in Search & RecSys 네이버 검색 콜로키움 김진영
Fairness in Search & RecSys 네이버 검색 콜로키움 김진영Fairness in Search & RecSys 네이버 검색 콜로키움 김진영
Fairness in Search & RecSys 네이버 검색 콜로키움 김진영
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)
 
Demography basedhybridrecommendersystemformovierecommendation
Demography basedhybridrecommendersystemformovierecommendationDemography basedhybridrecommendersystemformovierecommendation
Demography basedhybridrecommendersystemformovierecommendation
 
Towards the next generation of interactive and adaptive explanation methods
Towards the next generation of interactive and adaptive explanation methodsTowards the next generation of interactive and adaptive explanation methods
Towards the next generation of interactive and adaptive explanation methods
 
ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNING
ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNINGENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNING
ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNING
 
Human-centered AI: how can we support lay users to understand AI?
Human-centered AI: how can we support lay users to understand AI?Human-centered AI: how can we support lay users to understand AI?
Human-centered AI: how can we support lay users to understand AI?
 
Presentation: Valuing Ecosystem Services, Methods and Practices
Presentation: Valuing Ecosystem Services, Methods and PracticesPresentation: Valuing Ecosystem Services, Methods and Practices
Presentation: Valuing Ecosystem Services, Methods and Practices
 
Recommender Systems Fairness Evaluation via Generalized Cross Entropy
Recommender Systems Fairness Evaluation via Generalized Cross EntropyRecommender Systems Fairness Evaluation via Generalized Cross Entropy
Recommender Systems Fairness Evaluation via Generalized Cross Entropy
 
Human-centered AI: how can we support end-users to interact with AI?
Human-centered AI: how can we support end-users to interact with AI?Human-centered AI: how can we support end-users to interact with AI?
Human-centered AI: how can we support end-users to interact with AI?
 
HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...
HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...
HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...
 
Modern information Retrieval-Relevance Feedback
Modern information Retrieval-Relevance FeedbackModern information Retrieval-Relevance Feedback
Modern information Retrieval-Relevance Feedback
 

Dernier

If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New Nigeria
Kayode Fayemi
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
Kayode Fayemi
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
raffaeleoman
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
Sheetaleventcompany
 

Dernier (20)

Dreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video TreatmentDreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video Treatment
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New Nigeria
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.
 
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
 
Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubs
 
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdfAWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
 
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
 
Causes of poverty in France presentation.pptx
Causes of poverty in France presentation.pptxCauses of poverty in France presentation.pptx
Causes of poverty in France presentation.pptx
 
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
 
My Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle BaileyMy Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle Bailey
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)
 
Air breathing and respiratory adaptations in diver animals
Air breathing and respiratory adaptations in diver animalsAir breathing and respiratory adaptations in diver animals
Air breathing and respiratory adaptations in diver animals
 
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdfThe workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
 
ICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdfICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdf
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
 
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
 

What recommender systems can learn from decision psychology about preference elicitation and behavioral change

  • 1. What recommender systems can learn from decision psychology about preference elicitation and behavioral change Martijn Willemsen Human-Technology Interaction www.martijnwillemsen.nl
  • 2. What are recommender systems about Algorithms Accuracy: compare prediction with actual values Recommendation: best predicted items dataset user-item rating pairs user Choose (prefer?) ratings Rating? Experience! preferences Goals & desires!
  • 3. User-Centric Framework Computers Scientists (and marketing researchers) would study behavior…. (they hate asking the user or just cannot (AB tests))
  • 4. User-Centric Framework Psychologists and HCI people are mostly interested in experience…
  • 5. User-Centric Framework Though it helps to triangulate experience and behavior…
  • 6. User-Centric Framework Our framework adds the intermediate construct of perception that explains why behavior and experiences changes due to our manipulations
  • 7. User-Centric Framework And adds personal and situational characteristics Relations modeled using factor analysis and SEM Knijnenburg, B.P., Willemsen, M.C., Gantner, Z., Soncu, H., Newell, C. (2012). Explaining the User Experience of Recommender Systems. User Modeling and User-Adapted Interaction (UMUAI), vol 22, p. 441-504 http://bit.ly/umuai
  • 8. Providing input to the recommender: Preference Elicitation memory, ratings & choices
  • 9. Providing input to a recommender system how do algorithms get their data? Preference Elicitation (PE) PE is a major topic in research on Decision Making I even did my PhD thesis on it… ;-) What can Psychology learn us on improving this aspect? Role of memory in ratings Rating support Rating versus choice-based elicitation
  • 10. What does rating entail? Typical recommender scenario: First usage: Show a set of (typical) items and ask people to rate the ones they know (cold start) Later usage: people go to the recommender when they have consumed an item to rate it and typically also rate other aspects Does it matter if the preference you provide (by rating) is based on recent experiences or mostly on your memory?
  • 11. Psychologist: user knows best, ask her! In two user experiments, users rated a number of movies that were aired on Dutch TV in the previous month (~150 movies) Rate 15-20 movies from that set that you know and indicate how long ago you have seen these movies (last week, last month, last 6 months, last year, last 3 years, longer ago) Motivate ratings for two movies (one seen recently and one seen more than a year ago)
  • 12. Results 247 users, 4212 ratings Rating distributions: Most movies are seen long ago… Only 28% seen in the last year # Positive ratings decrease with time 1st timeslot: 60% 4/5 star Last timeslot: only 36% ***** **** *** ** *
  • 13. Modeling the ratings Coefficient Std. Err. t-value intercept 2.95 0.15 19.05 time 0.29 0.13 2.31 highrated 1.62 0.22 7.43 time2 -0.09 0.02 -3.55 Time x highrated -0.73 0.18 -4.10 Tine2 x highrated 0.11 0.03 3.26 Multilevel model: Random intercepts for movies and users high-rated versus low-rated shows a different pattern Regression towards the mean? High-rated Low rated
  • 14. This is a problem… How can we train a recommender system.. If ratings depend on our memory this much… This is new to psychology as well! Memory effects like this have not been studied… Problem lies partly in the type of judgment asked: Rating is separate evaluation on an absolute scale… Lacks a good reference/comparison Two solutions we explored: Rating support Different elicitation methods: choice!
  • 15. Joint versus Separate Evaluation Evaluations of two job candidates for a computer programmer position expecting the use of a special language called KY. Mean WTP (in thousands): Separate $ 32.7 k $ 26.8 k Joint $ 31.2 k $ 33.2 k Candidate A Candidate B Education B.Sc. computer Sc. B.Sc. computer Sc. GPA (0-5) 4.8 3.1 KY Experience 10 KY programs 70 KY programs 17
  • 16. Rating support interfaces Using movielens! Can we help users during rating to make their ratings more stable/accurate? We can support their memory for the movie using tags We can help ratings on the scale with previous ratings Movielens has a tag genome and a history of ratings so we can give real-time user-specific feedback! Nguyen, T. T., Kluver, D., Wang, T.-Y., Hui, P.-M., Ekstrand, M. D., Willemsen, M. C., & Riedl, J. (2013). Rating Support Interfaces to Improve User Experience and Recommender Accuracy. RecSys 2013 (pp. 149–156)
  • 17. Tag Interface Provide 10 tags that are relevant for that user and that describe the movie well Didn’t really help…
  • 18. Exemplar Interface Support rating on the scale by providing exemplars: Exemplar: Similar movies rated before by that user for that level on the scale This helps to anchor the values on the scale better: more consistent ratings
  • 19. Bur what are preferences? Ratings are absolute statements of preference… But preference is a relative statement… I like Grand Budapest hotel more than King’s Speech So why not ask users to choose? Which do you prefer?
  • 20. Others also tried different PE methods Loepp, Hussein & Ziegler (CHI 2014) Choose between sets of movies that differ a lot on a latent feature Chang, Harper & Terveen (CSCW 2015) Choose between groups of similar movies By assigning points per group (ranking!)
  • 21. Choice-based preference elicitation Choices are relative statements that are easier to make Better fit with final goal: finding a good item rather than making a good prediction In Marketing, conjoint-based analysis uses the same idea to determine attribute weights and utilities based on a series of (adaptive) choices Can we use a set of choices in the matrix factorization space to determine a user vector in a stepwise fashion? Graus, M.P. & Willemsen, M.C. (2015). Improving the user experience during cold start through choice-based preference elicitation. In Proceedings of the 9th ACM conference on Recommender systems (pp. 273-276)
  • 22. Dimensions in Matrix Factorization Dimensionality reduction Users and items are represented as vectors on a set of latent features item vector: utility of attributes user vector: weights of attributes Rating is the dot product of these vectors (overall utility!) Gus will like Dumb and Dumber but hate Color Purple Koren, Y., Bell, R., and Volinsky, C. 2009. Matrix Factorization Techniques for Recommender Systems. IEEE Computer 42, 8, 30–37.
  • 23. How does this work? Step 1 Latent Feature 1 LatentFeature2 Iteration 1a: Diversified choice set is calculated from a matrix factorization model (red items) Iteration 1b: User vector (blue arrow) is moved towards chosen item (green item), items with lowest predicted rating are discarded (greyed out items)
  • 24. How does this work? Step 2 Iteration 2: New diversified choice set (blue items) End of Iteration 2: with updated vector and more items discarded based on second choice (green item)
  • 25. Evaluation of Preference Elicitation Choice-based PE: choosing 10 times from 10 items Rating-based PE: rating 15 items After each PE method they evaluated the interface on interaction usability in terms of ease of use e.g., “It was easy to let the system know my preferences” Effort: e.g., “Using the interface was effortful.” effort and usability are highly related (r=0.62) Results: less perceived effort for choice-based PE perceived effort goes down with completion time
  • 26. Behavioral data of PE-tasks Choice-based PE: most users find their perfect item around the 8th / 9th item and they inspect quite some unique items along the way Rating-based: user inspect many lists (Median = 13), suggesting high effort in rating task.
  • 27. Perception of Recommendation List Participants evaluated each recommendation list separately on Choice Difficulty and Satisfaction Satisfaction with Chosen Item Obscurity Difficulty Intra List Similarity -2.407(.381) p<.001 -.240 (.145) p<.1 -.479 (.111) p<.001 -.257 (.045) p<.001 14.00 (4.51) p<.01 Choice- Based List + - - - -
  • 28. Conclusion Participants experienced reduced effort and increased satisfaction for choice-based PE over rating-based PE relative (choice) rather than absolute (rating) PE could alleviate the cold-start problem for new users Further research needed: the parameterization of the choice task strong effect of choice on the popularity of the resulting list Using trailers helps to decrease popularity (-> IntRS 2016) novelty effects might have played a role: fun way of interacting?
  • 29. Recommending for Behavioral Change Energy saving and Hypertension management
  • 30. Behavioral change Behavioral change is hard… Exercising more, eat healthy, reduce alchohol consumption (reducing Binge watching on Netflix ) Needs awareness, motivation and commitment Combi model: Klein, Mogles, Wissen Journal of Biomedical Informatics, 2014
  • 31. What can recommenders do? Persuasive Technology: focused on how to help people change their behavior: personalize the message… Recommenders systems can help with what to change and when to act personalize what to do next… This requires different models/algorithms our past behavior/liking is not what we want to do now! Two illustrations of new approach: energy saving hypertension management
  • 32. How can we help people to save energy?
  • 33. Our first (old) recommender system Recommendations Selected measures Things you already do or don’t want to Attributes Set attribute weights Show items with highest Uitem,user, where Uitem,user = ∑ Vitem,attribute • Wattribute,user
  • 34. Study 3 (AMCIS 2014) Online lab study —147 paid pps (79M, 68F, mean age: 40.0) —Selected pps interacted for at least 2.5 minutes 3 PE-methods, 2 baselines —Attribute-based PE —Implicit PE —Hybrid PE (attribute + implicit) —Sort (baseline, not personalized) —Top-N (baseline, not personalized) http://bit.ly/amcis14
  • 35. Study 3 — Results Experts prefer Attribute-based PE and Hybrid PE, novices prefer Top-N and Sort (baselines) System satisfaction mediates the effect on choice satisfaction and behavior! Systemsatisfaction Domain knowledge http://bit.ly/amcis14
  • 36. Tailoring energy-saving advice using a unidimensional Rasch scale of conservation measures Work with Alain Starke (Ph. D. student)
  • 37. Towards a better (psychometric) user model consumers differ in energy-saving capabilities, attitudes, goals, … Our prior work did not take that into account… Energy-saving interventions are more effective when personalized. But how? ≠ (Cf. Abrahamse et al., 2005)
  • 38. Single energy saving dimension/attitude? Campbell’s Paradigm (Kaiser et al., 2010) “One’s attitude or ability becomes apparent through its behavior…” “Attitude and Behavior are two sides of the same coin…” Different from standard psychological approaches to measure attitudes & intentions with likert scales… 41
  • 39. Psychological assumptions Three assumptions for our user model (Based on Kaiser et al., 2010) 1. All Energy-saving behaviors form a class serving a single goal: Saving Energy 2. Less performed behaviors yield higher Behavioral Costs (i.e. are more difficult) 3. Individuals that execute more energy-saving behaviors have a higher Energy-saving Ability (i.e. more skilled)
  • 40. The Rasch model The Rasch model equates behavioral difficulties and individual propensities in a probabilistic model Log-odds of engagement levels (yes/no): 𝜽 = an individual’s propensity/attitude 𝜹 = behavioral difficulty P = probability of individual n engaging in behavior i Rasch also determines individual propensities and item difficulties & fits them onto a single scale One scale may have lots of different difficulty levels 43 𝐥𝐧 𝑷 𝒏𝒊 𝟏 − 𝑷 𝒏𝒊 = 𝜽 𝒏 − 𝜹𝒊
  • 41. The resulting Rasch scale Pictures disclaimer: courtesy of my PhD student (the guy on the right)
  • 42. Item difficulty vs. person ability Probability of a person executing behavior depends on the Ability - Costs LED lighting Unplug chargers Install PV panels
  • 43. Using Rasch for tailored advice Earlier research (Kaiser, Urban) found evidence for a unidimensional scale, but with few items & no advice We set out a Rasch-based, energy recommender system that: Shows the measures in order of difficulty (either ascending or descending) Provide tailored conservation advice to users (or not) Include a more extensive set of measures 46
  • 44. Energy saving ‘Webshop’ Ordered set of measures with rich information and (in some conditions) recommendations Order: Ascending or descending in Rasch difficulty Rasch recommendation: start at their ability (3 items highlighted) or just at the bottom (no highlights)
  • 45. Procedure 1. Determining Ability: Each user indicated for 13 measures whether he/she executed them 2. Show webshop in one of 4 conditions (total N=224) and ask to select a set of measures (to execute in next 4 weeks) 3. Survey to measure user experience 4. After 4 weeks: report back on which measures they implemented (to some extent) N=86
  • 46. Results: Experience (SEM model) Perceived Effort Perceived Support Choice Satisfaction -0.40*** 0.59*** -0.56** * p < .05, ** p < .01, *** p < .001. 0.74** -0.65* Rasch recomm. Low Ability Ascending Order
  • 47. Results Recommendations work (esp. for descending order) higher perceived support higher choice satisfaction
  • 48. What did they choose? Rank-ordered logit with ability – difficulty/costs difference as predictor Without recommendations they select easy measures (below their ability) With recommendations they choose measures around their ability
  • 49. Measure Follow-up after 4 weeks Follow-up depends on relative costs (they are more likely to do the easier measures…)
  • 50. Lifestyle recommendations for Hypertension Management Joint work with Mustafa Radha (Ph.D. student) Radha, Willemsen, Boerhof & IJsselsteijn UMAP 2016
  • 51. Hypertension Hypertension occurs in 30% of the population Hypertension is without symptoms. Hypertension is a leading cause of death. Hypertension can be prevented or treated with lifestyle change, such as: Salt intake reduction Physical activity Weight control Alcohol moderation
  • 52. Adherence Healthy habits have a strong benefit, but are not always feasible
  • 53. Study 1: Model construction Online survey 300 participants between 40 and 60 years old 50% hypertensive Self-reported engagement in 63 health behaviors about diet, sodium intake and physical activity Results: reliable Rasch scale (for both persons and items) No strong differences between subgroups Unidimensional: different categories mix nicely across the scale
  • 55. Study 2: Coaching strategies The engagement maximization strategy selects the easiest behaviors (that she does not do yet) The motivation maximization strategy selects behaviors with difficulties that match the individual’s ability (that she does no do yet) The random control strategy selects behaviors (not done yet) at random without regard for difficulty
  • 56. Study 2: design 150 hypertensive users invited online Questionnaire used to measure user ability and find behaviors that user should be coached on. Pairwise comparison between the three intervention strategies through virtual coaches
  • 57. Medium abilityEntire population Results EM EM Health Benefit Personalization RC RC MM RC Health Benefit MM MM Health Benefit Appeal RC RC
  • 58. Conclusions Engagement maximization (easy behaviors not done yet) outperforms random most of the time The rasch order helps! Knowing the ability helps to elicit better recommendations by tailoring: for medium ability individuals But for other groups it does not matter much…
  • 59. Questions? Contact: Martijn Willemsen @MCWillemsen M.C.Willemsen@tue.nl www.martijnwillemsen.nl Thanks to my co-authors: Mark Graus, Alain Starke Mustafa Radha Bart Knijnenburg, Ron Broeders