Evaluating the search experience:
from Retrieval Effectiveness to
User Engagement
Mounia Lalmas
Yahoo Labs London
mounia@acm.org
CLEF 2015 – Toulouse
This talk
§ Evaluation in search
(offline evaluation)
(online evaluation)
§  Interpreting the signals
§ Introduction to user engagement
§ From retrieval effectiveness to user engagement
(from intra-session to inter-session evaluation)
The Message of this talk
What you want to optimize for each task, session, query:
M1, M2, M3, …, Mn
What you want to optimize long-term:
LTV1, LTV2, LTV3, …, LTVm
Mi LTVj
System: Models, Features
Evaluation in
search
How to evaluate a search system
§ Coverage
§ Speed
§ Query language
§ User interface
§ User happiness
Users find what they want and return to the search system
§ But let us remember:
In carrying out a search task, search is a means, not an end
Sec. 8.6
(Manning, Raghavan & Schütze, 2008; Baeza-Yates & Ribeiro-Neto, 2011)
Within an online
session
›  July 2012
›  2.5M users
›  785M page views
›  Categorization of the most
frequently accessed sites
•  11 categories (e.g. news), 33
subcategories (e.g. news finance,
news society)
•  760 sites from 70 countries/regions
short sessions: average 3.01 distinct sites visited with revisitation rate 10%
long sessions: average 9.62 distinct sites visited with revisitation rate 22%
(Lehmann et al, 2013)
Measuring user happiness
Most common proxy: relevance of retrieved results
Sec. 8.1
Relevant
Retrieved
all items
§  User information need translated into a query
§  Relevance assessed relative to information need not the query
§  Example:
›  Information need: I am looking for tennis holiday in a country with no rain
›  Query: tennis academy good weather
Evaluation measures:
•  precision, recall, R-precision; precision@n;
average precision; F-measure; …
•  bpref; cumulative gains, rank-biased precision,
expected reciprocal rank, Q-measure, …
precision
recall
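As an illustration of the set-based and rank-based measures listed above, here is a minimal sketch of precision, recall, precision@n and average precision. The document identifiers and relevance judgments are made up for the example; they do not come from the talk.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def precision_at_n(ranking, relevant, n):
    """Fraction of the top-n ranked documents that are relevant."""
    return sum(1 for doc in ranking[:n] if doc in relevant) / n


def average_precision(ranking, relevant):
    """Mean of precision values taken at the rank of each relevant hit."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical ranking and judgments (assumptions for illustration only).
ranking = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d3", "d1", "d8"}

print(precision_recall(ranking, relevant))
print(precision_at_n(ranking, relevant, 2))
print(average_precision(ranking, relevant))
```

Averaging `average_precision` over a set of queries would give MAP, the explicit signal discussed later in the talk.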
Measuring user happiness
Most common proxy: relevance of retrieval results
Sec. 8.1
Explicit signals
Test collection methodology (TREC, CLEF, …)
Human labeled corpora
Implicit signals
User behavior in online settings (clicks, skips, …)
Explicit and implicit signals can be used together
Examples of implicit signals
§  Number of clicks
§  SAT click
§  Quick-back click
§  Click at given position
§  Time to first click
§  Skipping
§  Abandonment rate
§  Number of query reformulations
§  Dwell time
§  Hover
What is a happy user in search
1.  The user information need is satisfied
2.  The user has learned about a topic and even
about other topics
3.  The system was inviting and even fun to use
In-the-moment engagement
Users on a site
Long-term engagement
Users come back frequently
USER ENGAGEMENT
Interpreting the
signals
User variability
(Anderson & Krathwohl, 2001; Bailey et al, 2015)
T: number of documents users (judges) expected to read
Q: number of queries users (judges) expected to issue
Task complexity
Explicit signal: MAP
(Turpin & Scholer, 2006)
Similar results obtained with P@2, P@3, P@4 and P@10
PRECISION-BASED SEARCH
Explicit signal: MAP (2)
(Turpin & Scholer, 2006)
RECALL-BASED SEARCH
top most popular tweets vs. top most popular tweets + geographically diverse
Being from a central or peripheral location makes a difference.
Peripheral users did not perceive the timeline as being diverse
Explicit signal: “Diversity”
It should never be just about the algorithm, but also how users respond to what the
algorithm returns to them
(Graells-Garrido, Lalmas & Baeza-Yates, Under Review)
Implicit signal: Click-through rate
CTR
new ranking algorithm
new design of search result page
…
Multimedia search
activities often
driven by
entertainment
needs, not by
information needs
Relevance in multimedia search
(Slaney, 2011)
Implicit signal: Clicks (I)
(Miliaraki, Blanco & Lalmas, 2015)
Implicit signal: Clicks (II)
Explorative and serendipitous search
I just wanted the phone number … I am totally happy :)
Implicit signal: No click
Information-rich snippet
Implicit signal: No click
Clickthrough rate:
% of clicks when URL
shown (per query)
Hover rate:
% hover over URL
(per query)
Unclicked hover:
Median time user hovers over
URL but no click (per query)
Max hover time:
Maximum time user hovers
over a result (per SERP)
(Huang et al, 2011)
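The four cursor and click signals defined above could be computed along these lines. This is only a sketch: the record layout (`clicked`, `hover_s`) is a hypothetical log format, and the maximum hover time is taken over the given records rather than over a full SERP.

```python
from statistics import median


def cursor_click_signals(impressions):
    """Summarise click and hover behaviour over a list of URL impressions.

    Each impression is a dict with hypothetical keys:
      clicked (bool) and hover_s (seconds the cursor hovered over the URL).
    """
    shown = len(impressions)
    if shown == 0:
        return None
    clicks = sum(1 for i in impressions if i["clicked"])
    hovers = sum(1 for i in impressions if i["hover_s"] > 0)
    unclicked = [i["hover_s"] for i in impressions
                 if i["hover_s"] > 0 and not i["clicked"]]
    return {
        "clickthrough_rate": 100.0 * clicks / shown,   # % clicks when shown
        "hover_rate": 100.0 * hovers / shown,          # % hovers when shown
        "unclicked_hover_s": median(unclicked) if unclicked else None,
        "max_hover_s": max(i["hover_s"] for i in impressions),
    }


# Made-up impressions for one query (illustration only).
impressions = [
    {"clicked": True, "hover_s": 2.0},
    {"clicked": False, "hover_s": 1.0},
    {"clicked": False, "hover_s": 3.0},
    {"clicked": False, "hover_s": 0.0},
]
signals = cursor_click_signals(impressions)
print(signals)
```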
§  Abandonment is when there is no click on the search result page
›  User is dissatisfied (bad abandonment)
›  User found result(s) on the search result page (good abandonment)
§  858 queries manually examined (21% good vs. 79% bad abandonment)
§  Cursor trail length
›  Total distance (pixel) traveled by cursor on SERP
›  Shorter for good abandonment
§  Movement time
›  Total time (second) cursor moved on SERP
›  Longer when answers in snippet (good abandonment)
§  Cursor speed
›  Average cursor speed (pixel/second)
›  Slower when answers in snippet (good abandonment)
(Huang et al, 2011)
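The three cursor features above (trail length, movement time, speed) can be sketched from a sequence of sampled cursor positions. The `(time, x, y)` sample format is an assumption made for this example, and an interval counts as movement only if the position changed.

```python
import math


def cursor_features(samples):
    """Return (trail length in pixels, movement time in seconds, speed).

    samples: list of (t_seconds, x_pixels, y_pixels), ordered by time.
    """
    trail = 0.0
    moving = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        if step > 0:
            trail += step        # distance travelled in this interval
            moving += t1 - t0    # only intervals with movement count
    speed = trail / moving if moving > 0 else 0.0
    return trail, moving, speed


# Made-up samples: move, pause for one second, move again.
samples = [(0.0, 0, 0), (1.0, 3, 4), (2.0, 3, 4), (4.0, 9, 12)]
print(cursor_features(samples))
```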
Implicit signal: Abandonment rate
“reading” cursor heatmap of relevant document vs “scanning” cursor heatmap
of non-relevant document (both dwell time of 30s)
(Guo & Agichtein, 2012)
Implicit signal: Dwell time
Implicit signal: Dwell time
“reading” a relevant long document vs “scanning” a long non-relevant
document
(Guo & Agichtein, 2012)
Implicit signal: Dwell time
DWELL TIME
used as a proxy of
user experience
Publisher
click on
an ad on
mobile
device
Dwell time on non-optimized landing pages
comparable and even higher than on mobile-
optimized ones
… when mobile optimized, users realize quickly
whether they “like” the ad or not?
(Lalmas et al, 2015)
non-mobile optimized mobile optimized
User engagement
What is user engagement?
“User engagement is a quality of the
user experience that emphasizes the
phenomena associated with wanting to
use a technological resource longer and
frequently” (Attfield et al, 2011)
Characteristics of user engagement
Novelty
(Webster & Ho, 1997; O’Brien,
2008)
Richness and control
(Jacques et al, 1995; Webster &
Ho, 1997)
Aesthetics
(Jacques et al, 1995; O’Brien,
2008)
Endurability
(Read, MacFarlane, & Casey,
2002; O’Brien, 2008)
Focused attention
(Webster & Ho, 1997; O’Brien,
2008)
Reputation, trust and
expectation
(Attfield et al, 2011)
Positive Affect
(O’Brien & Toms, 2008)
Motivation, interests,
incentives, and benefits
(Jacques et al., 1995; O’Brien & Toms,
2008)
(O’Brien, Lalmas & Yom-Tov, 2014)
Measuring user engagement
Measures | Attributes
Self-report (questionnaire, interview, think-aloud and think-after protocols) | Subjective; short- and long-term; lab and field; small scale
Physiology (EEG, SCL, fMRI, eye tracking, mouse-tracking) | Objective; short-term; lab and field; small and large scale
Analytics (intra- and inter-session metrics, data science) | Objective; short- and long-term; field; large scale
Attributes of user engagement
§ Scale (small versus large)
§ Setting (laboratory versus field)
§ Objective versus subjective
§ Temporality (in-the-moment versus long-term)
What you want
to optimize for
each task,
session, query
What you want
to optimize long-
term
Mi LTVj
User engagement metrics
User engagement metrics
Kendall’s tau with p-value < 0.05
('-' insignificant correlations; diagonal cells left empty)
High correlation between metrics in same group
Low correlation between metrics in different groups

Metric | #Users | #Visits | #Clicks | PageViewsV | DwellTimeV | ActiveDays | ReturnRate
#Users [POP] | | 0.82 | 0.75 | - | - | 0.43 | 0.34
#Visits [POP] | 0.82 | | 0.85 | - | - | 0.60 | 0.52
#Clicks [POP] | 0.75 | 0.85 | | 0.16 | 0.18 | 0.59 | 0.51
PageViewsV [ACT] | - | - | 0.16 | | 0.33 | - | -
DwellTimeV [ACT] | - | - | 0.18 | 0.33 | | - | -
ActiveDays [LOY] | 0.43 | 0.60 | 0.59 | - | - | | 0.79
ReturnRate [LOY] | 0.34 | 0.52 | 0.51 | - | - | 0.79 |
0.69
(Lehmann et al, 2012)
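For readers unfamiliar with the correlation measure used here, a minimal sketch of Kendall's tau between two engagement metrics follows. It implements the simple tau-a variant (no tie correction) and does not compute the p-value; the per-site numbers are invented for illustration and are not from the study.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both metrics order the two sites the same way
        elif sign < 0:
            discordant += 1      # the metrics disagree on the order
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical per-site values of two metrics (assumptions for illustration).
visits = [120, 45, 300, 80, 210]
active_days = [10, 4, 25, 12, 9]
print(kendall_tau(visits, active_days))
```

A value near 1 means the two metrics rank sites almost identically, which is the pattern reported within a group such as [POP].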
in-the-moment
long-term
Online sites differ with respect to
their engagement pattern
Games
Users spend
much time per
visit
Search
Users come
frequently and
do not stay long
Social media
Users come
frequently and
stay long
Niche
Users come on
average once
a week e.g. weekly
post
News
Users come
periodically,
e.g. morning and
evening
Service
Users visit site,
when needed,
e.g. to renew
subscription
(Lehmann et al, 2012)
in-the-moment: at each visit
long-term: visit frequency
From intra- to
inter-session
evaluation
1.  Search
2.  Mobile advertising
happy users
come back
The Message: From intra- to inter-
session evaluation
What you want to optimize for each task, session, query:
M1, M2, M3, …, Mn
What you want to optimize long-term:
LTV1, LTV2, LTV3, …, LTVm
Mi LTVj
System: Models, Features
Search
Search experience
What you want to optimize for each task, session, query:
search metrics (signals)
What you want to optimize long-term:
absence time (revisit the site)
Mi LTVj
Search system: Models, Features
intra-session search
metrics
•  Dwell time
•  Number of clicks
•  Time to 1st click
•  Skipping
•  Click through rate
•  Abandonment rate
•  Number of query
reformulations
•  …
Dwell time as a proxy of user interest
Dwell time as a proxy of relevance
Dwell time as a proxy of conversion
Dwell time as a proxy of post-click ad
quality
…
User engagement metrics for search
(Proxy: relevance of search results)
intra-session
inter-session
Dwell time (I)
§ Definition
The contiguous time spent on
a site or web page
§ Cons
Not clear that the user was
actually looking at the site
while there → blur/focus
Distribution of dwell times on 50
websites
(O’Brien, Lalmas & Yom-Tov, 2014)
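The blur/focus caveat above can be made concrete with a small sketch that contrasts contiguous dwell time with the time the page actually had focus. The event format (a timestamp plus "focus" or "blur") is a hypothetical logging scheme, not something described in the talk.

```python
def dwell_times(events):
    """Return (contiguous dwell time, focused dwell time) in seconds.

    events: list of (t_seconds, kind) ordered by time, kind is "focus" or "blur".
    """
    total = events[-1][0] - events[0][0]
    focused = 0.0
    focus_start = None
    for t, kind in events:
        if kind == "focus" and focus_start is None:
            focus_start = t
        elif kind == "blur" and focus_start is not None:
            focused += t - focus_start
            focus_start = None
    if focus_start is not None:
        # Page still had focus at the last logged event.
        focused += events[-1][0] - focus_start
    return total, focused


# Made-up visit: 20s reading, 30s in another tab, 10s reading again.
events = [(0.0, "focus"), (20.0, "blur"), (50.0, "focus"), (60.0, "blur")]
print(dwell_times(events))
```

In this example the contiguous dwell time is twice the focused time, which is the kind of gap the "Cons" bullet warns about.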
Dwell time (II)
Dwell time varies by
site type:
•  leisure sites tend to have
longer dwell times than
news, e-commerce, etc.
Dwell time has a
relatively large variance
even for the same site
Dwell time on 50 websites
(tourists, active, VIP …
users)
(O’Brien, Lalmas & Yom-Tov, 2014)
Search result page for “asparagus” (I)
Search result page for “asparagus” (II)
Absence time and survival analysis
[Figure: survival curves for story 1 to story 9; x-axis: hours (0 to 20); y-axis: 0.0 to 1.0]
Users (%) who did come back
Users (%) who read story 2 but did not come back after 10 hours
SURVIVE
DIE
DIE = RETURN TO SITE → SHORT ABSENCE TIME
Absence time applied to search
Ranking function on Yahoo Answer Japan
Two weeks of click data on Yahoo Answer Japan: search
One million users
Six ranking functions
30-minute session boundary
survival analysis: high hazard rate (die quickly) = short absence
5 clicks
control=noclick
Absence time and number of clicks on
search result page
3 clicks
§  No click means a bad user experience
§  Clicking between 3-5 results leads to same user experience
§  Clicking on more than 5 results reflects poorer user experience; users cannot
find what they are looking for
(Dupret & Lalmas, 2013)
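A rough sketch of how absence time might be derived from a user's activity timestamps, using the 30-minute session boundary mentioned earlier. The empirical survival estimate below ignores censoring (users who never return within the observation window), which a proper survival analysis would handle; the timestamps are invented.

```python
SESSION_GAP_S = 30 * 60  # 30-minute session boundary, in seconds


def absence_times(timestamps, gap=SESSION_GAP_S):
    """Gaps between consecutive sessions of one user, in seconds."""
    ts = sorted(timestamps)
    return [b - a for a, b in zip(ts, ts[1:]) if b - a > gap]


def survival(absences, horizon):
    """Fraction of absences longer than horizon (user has not yet returned)."""
    if not absences:
        return None
    return sum(1 for a in absences if a > horizon) / len(absences)


# Made-up activity timestamps for one user (seconds since first event).
timestamps = [0, 600, 1200, 9000, 9300, 100000]
absences = absence_times(timestamps)
print(absences)
print(survival(absences, 10 * 3600))
```

Comparing `survival` curves between a control bucket and a treatment bucket is one simple way to see which one brings users back sooner.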
Using DCG versus absence to evaluate
five ranking functions
DCG@1: Ranking Alg 1, Ranking Alg 2, Ranking Alg 3, Ranking Alg 4, Ranking Alg 5
DCG@5: Ranking Alg 1, Ranking Alg 3, Ranking Alg 2, Ranking Alg 4, Ranking Alg 5
Absence time: Ranking Alg 1, Ranking Alg 2, Ranking Alg 5, Ranking Alg 3, Ranking Alg 4
(Dupret & Lalmas, 2013)
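For reference, DCG@k can be sketched as below. This uses the common gain / log2(rank + 1) formulation; the slides do not say which DCG variant was used, so treat that choice and the graded relevance values as assumptions.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the top-k results.

    gains: graded relevance of each result, in ranked order.
    """
    return sum(gain / math.log2(rank + 1)
               for rank, gain in enumerate(gains[:k], start=1))


# Hypothetical graded relevance for one ranked list (illustration only).
gains = [3, 2, 0, 1, 2]
print(dcg_at_k(gains, 1))
print(dcg_at_k(gains, 5))
```

DCG@1 looks only at the top result while DCG@5 rewards relevant results further down, which is why the two can order ranking functions differently.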
Absence time and search experience
§  Clicking lower in the ranking (2nd, 3rd) suggests more careful choice
from the user (compared to 1st)
§  Clicking at bottom is a sign of low quality overall ranking
§  Users finding their answers quickly (time to 1st click) return sooner to
the search application
§  Returning to the same search result page is a worse user experience
than reformulating the query
search session metrics → absence time
(Dupret & Lalmas, 2013)
Absence time – search experience
In 21 experiments carried out through A/B testing, absence time agrees
with the outcome (which one is better) in 14 of them
(Chakraborty et al, 2014)
Positive signals
•  One more query in session
•  One more click in session
•  SAT clicks
•  Query reformulation
Negative signals
•  Abandoned session
•  Quick-back clicks
search session metrics → absence time
Native advertising
The context — Post-click experience
on mobile advertising
What you want to optimize for each task, session, query:
dwell time on landing page
What you want to optimize long-term:
absence time (next ad click)
Mi LTVj
Native ad serving: Models, Features
Native Advertising
…
Mobile Desktop
Estimating the quality of the post-click
experience
Best experience is when conversion happens
Estimating the probability of conversion is hard!
- Conversion data is not available for all advertisers
- Conversion data is not missing at random
Proxy metric of post-click quality:
dwell time on the ad landing page
- No conversion does not mean a bad experience
t_ad-click, t_back-to-publisher
dwell time = t_back-to-publisher – t_ad-click
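The dwell-time definition above amounts to a timestamp difference. A minimal sketch, assuming both events are logged as `datetime` values and that a missing return event leaves the dwell time undefined (the function and argument names are made up for this example):

```python
from datetime import datetime


def landing_page_dwell(t_ad_click, t_back_to_publisher):
    """Dwell time on the ad landing page, in seconds.

    Returns None when the user never returned to the publisher,
    since the dwell time cannot be observed in that case.
    """
    if t_back_to_publisher is None:
        return None
    return (t_back_to_publisher - t_ad_click).total_seconds()


# Made-up click and return times (illustration only).
clicked = datetime(2015, 9, 8, 10, 0, 0)
returned = datetime(2015, 9, 8, 10, 1, 30)
print(landing_page_dwell(clicked, returned))
```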
Dwell time as a proxy of the post-click
experience
mobile
200K ad clicks
Ø  It needs less time to
get the same
probability of a
second click
desktop (toolbar)
30K ad clicks
Ø  23.3% of users visit other websites
than the ad landing page before
returning to publisher
Ø  this goes down to 7.4% for dwell time
up to 3 mins.
Probability of a second click
increases with dwell time
Dwell time and
absence time
[Figure: ad click difference (0% to 600%) for short ad clicks vs. long ad clicks]
Dwell time → ad click
Positive post-click
experience (“long” clicks)
has an effect on users
clicking on ads again
(mobile)
(Lalmas et al, 2015)
Absence time:
•  return to publisher
•  click on an ad
From intra- to inter-
session evaluation
Absence time
1.  Search
2.  Mobile advertising
happy users
come back
What’s next?
Large-scale online measurement
Decide the in-
the-moment
metric(s)
Decide the long-
term-value
metric(s)
System
Models
Features
Which in-the-moment metric(s)
are good predictors of
long-term value metric(s)
Optimize for the
identified in-the-
moment
metric(s)
Lots of data
required to
remove noise
What is a
signal?
What is a
metric?
O’Brien & Toms User
Engagement Scale
31 items and six sub-scales:
aesthetic appeal, novelty,
felt involvement,
focused attention,
perceived usability,
endurability
(O’Brien & Toms, 2010; Arguello et al, 2012;
Bordino et al, Under Review)
Small-scale measurement
Towards User Engagement
happy users
come back
we need to
properly identify
that a user is
happy
Thank you

Contenu connexe

Tendances

Mobile advertising: The preclick experience
Mobile advertising: The preclick experienceMobile advertising: The preclick experience
Mobile advertising: The preclick experienceMounia Lalmas-Roelleke
 
Social Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the usersSocial Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the usersMounia Lalmas-Roelleke
 
Story-focused Reading in Online News and its Potential for User Engagement
Story-focused Reading in Online News and its Potential for User EngagementStory-focused Reading in Online News and its Potential for User Engagement
Story-focused Reading in Online News and its Potential for User EngagementMounia Lalmas-Roelleke
 
User Engagement - A Scientific Challenge
User Engagement - A Scientific ChallengeUser Engagement - A Scientific Challenge
User Engagement - A Scientific ChallengeMounia Lalmas-Roelleke
 
Metrics, Engagement & Personalization
Metrics, Engagement & Personalization Metrics, Engagement & Personalization
Metrics, Engagement & Personalization Mounia Lalmas-Roelleke
 
Tutorial on metrics of user engagement -- Applications to Search & E- commerce
Tutorial on metrics of user engagement -- Applications to Search & E- commerceTutorial on metrics of user engagement -- Applications to Search & E- commerce
Tutorial on metrics of user engagement -- Applications to Search & E- commerceMounia Lalmas-Roelleke
 
Describing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage DataDescribing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage DataMounia Lalmas-Roelleke
 
To be or not be engaged: What are the questions (to ask)?
To be or not be engaged: What are the questions (to ask)?To be or not be engaged: What are the questions (to ask)?
To be or not be engaged: What are the questions (to ask)?Mounia Lalmas-Roelleke
 
Tutorial on Online User Engagement: Metrics and Optimization
Tutorial on Online User Engagement: Metrics and OptimizationTutorial on Online User Engagement: Metrics and Optimization
Tutorial on Online User Engagement: Metrics and OptimizationMounia Lalmas-Roelleke
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Recent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix PerspectiveRecent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix PerspectiveJustin Basilico
 
Less is More: An Empirical Investigation of the Relationship Between Amount o...
Less is More: An Empirical Investigation of the Relationship Between Amount o...Less is More: An Empirical Investigation of the Relationship Between Amount o...
Less is More: An Empirical Investigation of the Relationship Between Amount o...UXPA International
 
21 UX Research Methods
21 UX Research Methods21 UX Research Methods
21 UX Research MethodsTushar Patil
 
Brightfind world usability day 2016 full deck final
Brightfind world usability day 2016   full deck finalBrightfind world usability day 2016   full deck final
Brightfind world usability day 2016 full deck finalBrightfind
 
Engaging with Users on Public Social Media
Engaging with Users on Public Social MediaEngaging with Users on Public Social Media
Engaging with Users on Public Social MediaJeffrey Nichols
 

Tendances (20)

Mobile advertising: The preclick experience
Mobile advertising: The preclick experienceMobile advertising: The preclick experience
Mobile advertising: The preclick experience
 
User engagement in the digital world
User engagement in the digital worldUser engagement in the digital world
User engagement in the digital world
 
Social Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the usersSocial Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the users
 
Story-focused Reading in Online News and its Potential for User Engagement
Story-focused Reading in Online News and its Potential for User EngagementStory-focused Reading in Online News and its Potential for User Engagement
Story-focused Reading in Online News and its Potential for User Engagement
 
User Engagement - A Scientific Challenge
User Engagement - A Scientific ChallengeUser Engagement - A Scientific Challenge
User Engagement - A Scientific Challenge
 
Metrics, Engagement & Personalization
Metrics, Engagement & Personalization Metrics, Engagement & Personalization
Metrics, Engagement & Personalization
 
Tutorial on metrics of user engagement -- Applications to Search & E- commerce
Tutorial on metrics of user engagement -- Applications to Search & E- commerceTutorial on metrics of user engagement -- Applications to Search & E- commerce
Tutorial on metrics of user engagement -- Applications to Search & E- commerce
 
Describing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage DataDescribing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage Data
 
An engaging click
An engaging clickAn engaging click
An engaging click
 
To be or not be engaged: What are the questions (to ask)?
To be or not be engaged: What are the questions (to ask)?To be or not be engaged: What are the questions (to ask)?
To be or not be engaged: What are the questions (to ask)?
 
Tutorial on Online User Engagement: Metrics and Optimization
Tutorial on Online User Engagement: Metrics and OptimizationTutorial on Online User Engagement: Metrics and Optimization
Tutorial on Online User Engagement: Metrics and Optimization
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Recent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix PerspectiveRecent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix Perspective
 
Less is More: An Empirical Investigation of the Relationship Between Amount o...
Less is More: An Empirical Investigation of the Relationship Between Amount o...Less is More: An Empirical Investigation of the Relationship Between Amount o...
Less is More: An Empirical Investigation of the Relationship Between Amount o...
 
Ijcatr04061001
Ijcatr04061001Ijcatr04061001
Ijcatr04061001
 
Exploring the roles of hosts' attachment and psychological ownership in an Ai...
Exploring the roles of hosts' attachment and psychological ownership in an Ai...Exploring the roles of hosts' attachment and psychological ownership in an Ai...
Exploring the roles of hosts' attachment and psychological ownership in an Ai...
 
Chinese Adoption of Travel Information on Social Media: Moderating Effects of...
Chinese Adoption of Travel Information on Social Media: Moderating Effects of...Chinese Adoption of Travel Information on Social Media: Moderating Effects of...
Chinese Adoption of Travel Information on Social Media: Moderating Effects of...
 
21 UX Research Methods
21 UX Research Methods21 UX Research Methods
21 UX Research Methods
 
Brightfind world usability day 2016 full deck final
Brightfind world usability day 2016   full deck finalBrightfind world usability day 2016   full deck final
Brightfind world usability day 2016 full deck final
 
Engaging with Users on Public Social Media
Engaging with Users on Public Social MediaEngaging with Users on Public Social Media
Engaging with Users on Public Social Media
 

Similaire à Evaluating the search experience: from Retrieval Effectiveness to User Engagement

Optimizing Mobile UX Design Webinar Presentation Slides
Optimizing Mobile UX Design Webinar Presentation SlidesOptimizing Mobile UX Design Webinar Presentation Slides
Optimizing Mobile UX Design Webinar Presentation SlidesUserZoom
 
User Zoom Webinar Monster Aug09 Vf
User Zoom Webinar Monster Aug09 VfUser Zoom Webinar Monster Aug09 Vf
User Zoom Webinar Monster Aug09 VfUserZoom
 
Search analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis GospodneticSearch analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis Gospodneticlucenerevolution
 
Detecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile SearchDetecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile SearchJulia Kiseleva
 
Search analytics what why how - By Otis Gospodnetic
 Search analytics what why how - By Otis Gospodnetic  Search analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis Gospodnetic lucenerevolution
 
Usability Testing Basics: What's it All About? at Web SIG Cleveland
Usability Testing Basics: What's it All About? at Web SIG ClevelandUsability Testing Basics: What's it All About? at Web SIG Cleveland
Usability Testing Basics: What's it All About? at Web SIG ClevelandCarol Smith
 
Mobile First to AI First: How User Signals Change SEO | SMX19
Mobile First to AI First: How User Signals Change SEO | SMX19Mobile First to AI First: How User Signals Change SEO | SMX19
Mobile First to AI First: How User Signals Change SEO | SMX19Philipp Klöckner
 
Self-Organized, Autonomous UX | SoCal UX Camp | May 31, 2014
Self-Organized, Autonomous UX  |  SoCal UX Camp  |  May 31, 2014Self-Organized, Autonomous UX  |  SoCal UX Camp  |  May 31, 2014
Self-Organized, Autonomous UX | SoCal UX Camp | May 31, 2014Jaimi Kercher
 
Know and Delight Your Users: UX Analytics
Know and Delight Your Users: UX AnalyticsKnow and Delight Your Users: UX Analytics
Know and Delight Your Users: UX AnalyticsCemal Buyukgokcesu
 
How User Testing Can Inform Content - 03/19/12 Content Strategy - Minneapolis...
How User Testing Can Inform Content - 03/19/12 Content Strategy - Minneapolis...How User Testing Can Inform Content - 03/19/12 Content Strategy - Minneapolis...
How User Testing Can Inform Content - 03/19/12 Content Strategy - Minneapolis...Link Positive, Inc.
 
Corso Interazione Uomo Macchina e Sviluppo Applicazioni Mobile - GoBus
Corso Interazione Uomo Macchina e Sviluppo Applicazioni Mobile - GoBusCorso Interazione Uomo Macchina e Sviluppo Applicazioni Mobile - GoBus
Corso Interazione Uomo Macchina e Sviluppo Applicazioni Mobile - GoBusAlessandro Longo
 
UserZoom Webinar: How to Conduct Web Customer Experience Benchmarking
UserZoom Webinar: How to Conduct Web Customer Experience BenchmarkingUserZoom Webinar: How to Conduct Web Customer Experience Benchmarking
UserZoom Webinar: How to Conduct Web Customer Experience BenchmarkingUserZoom
 
MeasureWorks - Social Mentions as a Performance KPI
MeasureWorks - Social Mentions as a Performance KPIMeasureWorks - Social Mentions as a Performance KPI
MeasureWorks - Social Mentions as a Performance KPIMeasureWorks
 
Combining Methods: Web Analytics and User Testing
Combining Methods: Web Analytics and User TestingCombining Methods: Web Analytics and User Testing
Combining Methods: Web Analytics and User TestingUser Intelligence
 
Usability in product development
Usability in product developmentUsability in product development
Usability in product developmentRavi Shyam
 
Www tutorial2013 userengagement
Www tutorial2013 userengagementWww tutorial2013 userengagement
Www tutorial2013 userengagementGabriela Agustini
 
User Interface and User Experience - A Process and Strategy for Small Teams
User Interface and User Experience - A Process and Strategy for Small TeamsUser Interface and User Experience - A Process and Strategy for Small Teams
User Interface and User Experience - A Process and Strategy for Small TeamsDamon Sanchez
 

Similaire à Evaluating the search experience: from Retrieval Effectiveness to User Engagement (20)

Optimizing Mobile UX Design Webinar Presentation Slides
Optimizing Mobile UX Design Webinar Presentation SlidesOptimizing Mobile UX Design Webinar Presentation Slides
Optimizing Mobile UX Design Webinar Presentation Slides
 
User Zoom Webinar Monster Aug09 Vf
User Zoom Webinar Monster Aug09 VfUser Zoom Webinar Monster Aug09 Vf
User Zoom Webinar Monster Aug09 Vf
 
Search analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis GospodneticSearch analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis Gospodnetic
 
Detecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile SearchDetecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile Search
 
Search analytics what why how - By Otis Gospodnetic
 Search analytics what why how - By Otis Gospodnetic  Search analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis Gospodnetic
 
Usability Testing Basics: What's it All About? at Web SIG Cleveland
Usability Testing Basics: What's it All About? at Web SIG ClevelandUsability Testing Basics: What's it All About? at Web SIG Cleveland
Usability Testing Basics: What's it All About? at Web SIG Cleveland
 
Mobile First to AI First: How User Signals Change SEO | SMX19
Mobile First to AI First: How User Signals Change SEO | SMX19Mobile First to AI First: How User Signals Change SEO | SMX19
Mobile First to AI First: How User Signals Change SEO | SMX19
 
Self-Organized, Autonomous UX | SoCal UX Camp | May 31, 2014
Self-Organized, Autonomous UX  |  SoCal UX Camp  |  May 31, 2014Self-Organized, Autonomous UX  |  SoCal UX Camp  |  May 31, 2014
Self-Organized, Autonomous UX | SoCal UX Camp | May 31, 2014
 
Know and Delight Your Users: UX Analytics
Know and Delight Your Users: UX AnalyticsKnow and Delight Your Users: UX Analytics
Know and Delight Your Users: UX Analytics
 
How User Testing Can Inform Content - 03/19/12 Content Strategy - Minneapolis...
How User Testing Can Inform Content - 03/19/12 Content Strategy - Minneapolis...How User Testing Can Inform Content - 03/19/12 Content Strategy - Minneapolis...
How User Testing Can Inform Content - 03/19/12 Content Strategy - Minneapolis...
 
Corso Interazione Uomo Macchina e Sviluppo Applicazioni Mobile - GoBus
Corso Interazione Uomo Macchina e Sviluppo Applicazioni Mobile - GoBusCorso Interazione Uomo Macchina e Sviluppo Applicazioni Mobile - GoBus
Corso Interazione Uomo Macchina e Sviluppo Applicazioni Mobile - GoBus
 
UserZoom Webinar: How to Conduct Web Customer Experience Benchmarking
UserZoom Webinar: How to Conduct Web Customer Experience BenchmarkingUserZoom Webinar: How to Conduct Web Customer Experience Benchmarking
UserZoom Webinar: How to Conduct Web Customer Experience Benchmarking
 
MeasureWorks - Social Mentions as a Performance KPI
MeasureWorks - Social Mentions as a Performance KPIMeasureWorks - Social Mentions as a Performance KPI
MeasureWorks - Social Mentions as a Performance KPI
 
Combining Methods: Web Analytics and User Testing
Combining Methods: Web Analytics and User TestingCombining Methods: Web Analytics and User Testing
Combining Methods: Web Analytics and User Testing
 
Usability in product development
Usability in product developmentUsability in product development
Usability in product development
 
Www tutorial2013 userengagement
Www tutorial2013 userengagementWww tutorial2013 userengagement
Www tutorial2013 userengagement
 
ASML UX Event
ASML UX EventASML UX Event
ASML UX Event
 
161121 ASML UX Event
161121 ASML UX Event161121 ASML UX Event
161121 ASML UX Event
 
ASML UX Event
ASML UX EventASML UX Event
ASML UX Event
 
User Interface and User Experience - A Process and Strategy for Small Teams
User Interface and User Experience - A Process and Strategy for Small TeamsUser Interface and User Experience - A Process and Strategy for Small Teams
User Interface and User Experience - A Process and Strategy for Small Teams
 

Evaluating the search experience: from Retrieval Effectiveness to User Engagement

  • 1. Evaluating the search experience: from Retrieval Effectiveness to User Engagement. Mounia Lalmas, Yahoo Labs London, mounia@acm.org. CLEF 2015 – Toulouse
  • 2. This talk: Evaluation in search (offline evaluation, online evaluation); Interpreting the signals; Introduction to user engagement; From retrieval effectiveness to user engagement (from intra-session to inter-session evaluation)
  • 3. The Message of this talk. [Diagram] System (models, features). M1, M2, M3, …, Mn: what you want to optimize for each task, session, query. LTV1, LTV2, LTV3, …, LTVm: what you want to optimize long-term. Mi → LTVj
  • 5. How to evaluate a search system: Coverage; Speed; Query language; User interface; User happiness (users find what they want and return to the search system). But let us remember: in carrying out a search task, search is a means, not an end. Sec. 8.6 (Manning, Raghavan & Schütze, 2008; Baeza-Yates & Ribeiro-Neto, 2011)
  • 6. Within an online session. Data: July 2012; 2.5M users; 785M page views; categorization of the most frequently accessed sites (11 categories, e.g. news; 33 subcategories, e.g. news finance, news society); 760 sites from 70 countries/regions. Short sessions: average 3.01 distinct sites visited, with revisitation rate 10%. Long sessions: average 9.62 distinct sites visited, with revisitation rate 22%. (Lehmann et al., 2013)
  • 7. Measuring user happiness. Most common proxy: relevance of retrieved results (Sec. 8.1). [Diagram: relevant and retrieved items within all items; precision, recall] The user information need is translated into a query. Relevance is assessed relative to the information need, not the query. Example: information need: I am looking for a tennis holiday in a country with no rain; query: tennis academy good weather. Evaluation measures: precision, recall, R-precision; precision@n; average precision; F-measure; …; bpref; cumulative gains, rank-biased precision, expected reciprocal rank, Q-measure, …
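A minimal sketch of three of the measures named on slide 7 (set-based precision and recall, precision@n, and average precision), assuming binary relevance judgments. The document ids and the function names are illustrative and do not come from the talk.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def precision_at(ranking, relevant, n):
    """Fraction of the top-n ranked documents that are relevant."""
    return sum(1 for doc in ranking[:n] if doc in relevant) / n


def average_precision(ranking, relevant):
    """Mean of precision@rank over the ranks where a relevant document appears,
    normalized by the total number of relevant documents."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical ranking and judgments for a single query.
ranking = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d5"}

print(precision_recall(ranking, relevant))
print(precision_at(ranking, relevant, 2))
print(average_precision(ranking, relevant))
```

Mean average precision (MAP), used on later slides, is the mean of `average_precision` over a set of queries.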
  • 8. Measuring user happiness. Most common proxy: relevance of retrieval results (Sec. 8.1). Explicit signals: test collection methodology (TREC, CLEF, …); human-labeled corpora. Implicit signals: user behavior in online settings (clicks, skips, …). Explicit and implicit signals can be used together
  • 9. Examples of implicit signals: Number of clicks; SAT click; Quick-back click; Click at given position; Time to first click; Skipping; Abandonment rate; Number of query reformulations; Dwell time; Hover
  • 10. What is a happy user in search: (1) The user information need is satisfied; (2) The user has learned about a topic and even about other topics; (3) The system was inviting and even fun to use. USER ENGAGEMENT: in-the-moment engagement (users on a site); long-term engagement (users come back frequently)
  • 12. User variability (Anderson & Krathwohl, 2001; Bailey et al., 2015). T: number of documents users (judges) expected to read. Q: number of queries users (judges) expected to issue. [Plots of T and Q against task complexity]
  • 13. Explicit signal: MAP (Turpin & Scholer, 2006) Similar results obtained with P@2, P@3, P@4 and P@10 PRECISION-BASED SEARCH
  • 14. Explicit signal: MAP (2) (Turpin & Scholer, 2006) RECALL-BASED SEARCH
  • 15. Explicit signal: “Diversity”. Top most popular tweets vs. top most popular tweets + geographically diverse. Being from a central or peripheral location makes a difference: peripheral users did not perceive the timeline as being diverse. It should never be just about the algorithm, but also how users respond to what the algorithm returns to them. (Graells-Garrido, Lalmas & Baeza-Yates, Under Review)
  • 16. Implicit signal: Click-through rate (CTR). New ranking algorithm; new design of search result page; …
  • 17. Implicit signal: Clicks (I). Relevance in multimedia search: multimedia search activities are often driven by entertainment needs, not by information needs (Slaney, 2011)
  • 18. Implicit signal: Clicks (II). Explorative and serendipitous search (Miliaraki, Blanco & Lalmas, 2015)
  • 19. Implicit signal: No click. Information-rich snippet. “I just wanted the phone number … I am totally happy :)”
  • 20. Implicit signal: No click. Clickthrough rate: % of clicks when URL shown (per query). Hover rate: % hover over URL (per query). Unclicked hover: median time user hovers over URL but no click (per query). Max hover time: maximum time user hovers over a result (per SERP). (Huang et al., 2011)
  • 21. Implicit signal: Abandonment rate. Abandonment is when there is no click on the search result page: the user is dissatisfied (bad abandonment), or the user found result(s) on the search result page (good abandonment). 858 queries manually examined (21% good vs. 79% bad abandonment). Cursor trail length: total distance (pixels) traveled by the cursor on the SERP; shorter for good abandonment. Movement time: total time (seconds) the cursor moved on the SERP; longer when answers are in the snippet (good abandonment). Cursor speed: average cursor speed (pixels/second); slower when answers are in the snippet (good abandonment). (Huang et al., 2011)
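The three cursor features on slide 21 can be sketched as follows. This assumes a log of `(time_in_seconds, x, y)` cursor samples for one SERP view, and it counts an interval as movement only when the position changed. Both the log format and that rule are assumptions made for illustration, not details taken from Huang et al.

```python
import math


def cursor_features(samples):
    """Return (trail_length_px, movement_time_s, speed_px_per_s).

    samples: list of (t_seconds, x, y) tuples ordered by time (assumed format).
    """
    trail = 0.0
    moving = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dist = math.hypot(x1 - x0, y1 - y0)
        if dist > 0:
            # Only intervals in which the cursor actually moved count
            # towards trail length and movement time.
            trail += dist
            moving += t1 - t0
    speed = trail / moving if moving > 0 else 0.0
    return trail, moving, speed


# Hypothetical cursor log: the cursor rests between t=1s and t=3s.
samples = [(0.0, 0, 0), (1.0, 30, 40), (3.0, 30, 40), (4.0, 30, 100)]
print(cursor_features(samples))
```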
  • 22. Implicit signal: Dwell time. “Reading” cursor heatmap of a relevant document vs. “scanning” cursor heatmap of a non-relevant document (both with dwell time of 30s) (Guo & Agichtein, 2012)
  • 23. Implicit signal: Dwell time. “Reading” a relevant long document vs. “scanning” a long non-relevant document (Guo & Agichtein, 2012)
  • 24. Implicit signal: Dwell time. Dwell time used as a proxy of user experience. Publisher: click on an ad on a mobile device (non-mobile-optimized vs. mobile-optimized landing pages). Dwell time on non-optimized landing pages is comparable to, and even higher than, on mobile-optimized ones … when mobile optimized, users realize quickly whether they “like” the ad or not? (Lalmas et al., 2015)
  • 26. What is user engagement? “User engagement is a quality of the user experience that emphasizes the phenomena associated with wanting to use a technological resource longer and frequently” (Attfield et al, 2011)
  • 27. Characteristics of user engagement: Novelty (Webster & Ho, 1997; O’Brien, 2008); Richness and control (Jacques et al, 1995; Webster & Ho, 1997); Aesthetics (Jacques et al, 1995; O’Brien, 2008); Endurability (Read, MacFarlane, & Casey, 2002; O’Brien, 2008); Focused attention (Webster & Ho, 1997; O’Brien, 2008); Reputation, trust and expectation (Attfield et al, 2011); Positive affect (O’Brien & Toms, 2008); Motivation, interests, incentives, and benefits (Jacques et al., 1995; O’Brien & Toms, 2008). (O’Brien, Lalmas & Yom-Tov, 2014)
  • 28. Measuring user engagement (measures and their attributes). Self-report (questionnaire, interview, think-aloud and think-after protocols): subjective; short- and long-term; lab and field; small scale. Physiology (EEG, SCL, fMRI, eye tracking, mouse-tracking): objective; short-term; lab and field; small and large scale. Analytics (intra- and inter-session metrics, data science): objective; short- and long-term; field; large scale
  • 29. Attributes of user engagement: Scale (small versus large); Setting (laboratory versus field); Objective versus subjective; Temporality (in-the-moment versus long-term). Mi: what you want to optimize for each task, session, query. LTVj: what you want to optimize long-term
  • 31. User engagement metrics. Kendall’s tau with p-value < 0.05 ('-' marks insignificant correlations). High correlation between metrics in the same group; low correlation between metrics in different groups. Metric order for rows and columns: #Users [POP], #Visits [POP], #Clicks [POP], PageViewsV [ACT], DwellTimeV [ACT], ActiveDays [LOY], ReturnRate [LOY]. Rows (diagonal omitted): #Users [POP]: 0.82, 0.75, -, -, 0.43, 0.34. #Visits [POP]: 0.82, 0.85, -, -, 0.60, 0.52. #Clicks [POP]: 0.75, 0.85, 0.16, 0.18, 0.59, 0.51. PageViewsV [ACT]: -, -, 0.16, 0.33, -, -. DwellTimeV [ACT]: -, -, 0.18, 0.33, -, -. ActiveDays [LOY]: 0.43, 0.60, 0.59, -, -, 0.79. ReturnRate [LOY]: 0.34, 0.52, 0.51, -, -, 0.79, 0.69. Labels: in-the-moment, long-term. (Lehmann et al., 2012)
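For readers unfamiliar with the correlation used on slide 31, here is a minimal sketch of Kendall's tau between two engagement metrics measured over the same sites. It implements the simple tau-a variant, which does not correct for ties, and it omits the significance test behind the p-values on the slide. The per-site numbers are made up for illustration.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-site values for two metrics (one entry per site).
visits = [120, 300, 80, 500]
active_days = [3, 5, 2, 9]
dwell_time = [40, 35, 90, 20]

print(kendall_tau_a(visits, active_days))  # sites are ordered identically
print(kendall_tau_a(visits, dwell_time))   # sites are ordered in reverse
```

In practice a library routine that handles ties and reports a p-value would normally be preferred over this hand-rolled version.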
  • 32. Online sites differ with respect to their engagement pattern. Games: users spend much time per visit. Search: users come frequently and do not stay long. Social media: users come frequently and stay long. Niche: users come on average once a week, e.g. weekly post. News: users come periodically, e.g. morning and evening. Service: users visit the site when needed, e.g. to renew a subscription. In-the-moment: at each visit; long-term: visit frequency. (Lehmann et al., 2012)
  • 34. 1.  Search 2.  Mobile advertising happy users come back
  • 35. The Message: From intra- to inter-session evaluation. [Diagram] System (models, features). M1, M2, M3, …, Mn: what you want to optimize for each task, session, query. LTV1, LTV2, LTV3, …, LTVm: what you want to optimize long-term. Mi → LTVj
  • 37. Search experience. [Diagram] Search system (models, features). Mi, what you want to optimize for each task, session, query: search metrics (signals). LTVj, what you want to optimize long-term: absence time (revisit the site)
  • 38. User engagement metrics for search (proxy: relevance of search results): intra-session vs. inter-session. Intra-session search metrics: dwell time; number of clicks; time to 1st click; skipping; click-through rate; abandonment rate; number of query reformulations; … Dwell time as a proxy of user interest; of relevance; of conversion; of post-click ad quality; …
  • 39. Dwell time (I). Definition: the contiguous time spent on a site or web page. Cons: not clear that the user was actually looking at the site while there → blur/focus. [Figure: distribution of dwell times on 50 websites] (O’Brien, Lalmas & Yom-Tov, 2014)
  • 40. Dwell time (II). Dwell time varies by site type: leisure sites tend to have longer dwell times than news, e-commerce, etc. Dwell time has a relatively large variance even for the same site. [Figure: dwell time on 50 websites (tourists, active, VIP … users)] (O’Brien, Lalmas & Yom-Tov, 2014)
  • 41. Search result page for “asparagus” (I)
  • 42. Search result page for “asparagus” (II)
  • 43. Absence time and survival analysis. [Figure: survival curves for story 1 to story 9; x-axis: hours (0 to 20); y-axis: 0.0 to 1.0; annotations: users (%) who did come back; users (%) who read story 2 but did not come back after 10 hours] SURVIVE vs. DIE, where DIE = RETURN TO SITE → SHORT ABSENCE TIME
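One common way to draw survival curves like those on slide 43 is the Kaplan-Meier estimator. The sketch below is an illustration under stated assumptions, not necessarily the estimator used in the talk. It follows the slide's framing: a user "dies" when they return to the site, and a user who has not returned by the end of the observation window is treated as censored. The absence times are invented.

```python
def kaplan_meier(durations, returned):
    """Kaplan-Meier estimate of S(t), the share of users still absent at time t.

    durations: absence time per user (hours).
    returned:  True if the user came back after that time ("died"),
               False if observation ended first (censored).
    Returns a list of (time, survival_probability) at each return time.
    """
    event_times = sorted({d for d, r in zip(durations, returned) if r})
    curve = []
    survival = 1.0
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, r in zip(durations, returned) if r and d == t)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical absence times (hours) for five users.
durations = [2, 5, 5, 8, 12]
returned = [True, True, False, True, False]

for t, s in kaplan_meier(durations, returned):
    print(f"after {t:>2} hours: {s:.2f} of users have not yet returned")
```

A curve that drops quickly corresponds to a high hazard rate, that is, users who come back after a short absence.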
  • 44. Absence time applied to search. Ranking function on Yahoo Answer Japan. Two weeks of click data on Yahoo Answer Japan search. One million users. Six ranking functions. 30-minute session boundary
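A minimal sketch of how a 30-minute session boundary can be applied to one user's event timestamps, and how absence times then fall out as the gaps between consecutive sessions. Treating a gap of exactly 30 minutes as still inside the session, and measuring absence from the last event of one session to the first event of the next, are assumptions made for this example.

```python
SESSION_GAP_SECONDS = 30 * 60  # the 30-minute boundary from the slide


def split_sessions(timestamps, gap=SESSION_GAP_SECONDS):
    """Group one user's event timestamps (seconds) into sessions."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= gap:
            sessions[-1].append(t)  # continues the current session
        else:
            sessions.append([t])    # inactivity exceeded the gap: new session
    return sessions


def absence_times(sessions):
    """Seconds between the end of each session and the start of the next."""
    return [nxt[0] - prev[-1] for prev, nxt in zip(sessions, sessions[1:])]


# Hypothetical click timestamps (seconds) for a single user.
events = [0, 600, 1200, 10000, 10300, 50000]
sessions = split_sessions(events)
print(sessions)
print(absence_times(sessions))
```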
  • 45. Absence time and number of clicks on search result page. Survival analysis: high hazard rate (die quickly) = short absence. [Plot curves: 5 clicks, 3 clicks; control = no click] No click means a bad user experience. Clicking between 3-5 results leads to the same user experience. Clicking on more than 5 results reflects a poorer user experience; users cannot find what they are looking for. (Dupret & Lalmas, 2013)
  • 46. Using DCG versus absence to evaluate five ranking functions. Order by DCG@1: Ranking Alg 1, Alg 2, Alg 3, Alg 4, Alg 5. Order by DCG@5: Ranking Alg 1, Alg 3, Alg 2, Alg 4, Alg 5. Order by absence time: Ranking Alg 1, Alg 2, Alg 5, Alg 3, Alg 4. (Dupret & Lalmas, 2013)
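For reference, a minimal sketch of DCG@k as it is commonly defined, with the graded gain of each result discounted by log2(rank + 1). The slide does not state which gain and discount variant was used, so this form is an assumption, and the relevance grades are invented.

```python
import math


def dcg_at(gains, k):
    """Discounted cumulative gain over the top-k results.

    gains: graded relevance of each result, in ranked order.
    """
    return sum(
        gain / math.log2(rank + 1)
        for rank, gain in enumerate(gains[:k], start=1)
    )


# Hypothetical graded judgments for two rankings of the same results.
ranking_a = [3, 2, 0, 1]
ranking_b = [0, 1, 2, 3]

print(dcg_at(ranking_a, 1), dcg_at(ranking_a, 4))
print(dcg_at(ranking_b, 1), dcg_at(ranking_b, 4))
```

Because DCG@1 looks only at the top result while DCG@5 looks further down the list, the two cutoffs can order ranking functions differently, as on the slide.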
  • 47. Absence time and search experience (search session metrics → absence time). Clicking lower in the ranking (2nd, 3rd) suggests a more careful choice from the user (compared to 1st). Clicking at the bottom is a sign of a low-quality overall ranking. Users finding their answers quickly (time to 1st click) return sooner to the search application. Returning to the same search result page is a worse user experience than reformulating the query. (Dupret & Lalmas, 2013)
  • 48. Absence time and search experience (search session metrics → absence time). Of 21 experiments carried out through A/B testing, absence time agrees with the outcome (which variant is better) in 14. Positive signals: one more query in session; one more click in session; SAT clicks; query reformulation. Negative signals: abandoned session; quick-back clicks. (Chakraborty et al., 2014)
  • 50. The context: post-click experience on mobile advertising. [Diagram] Native ad serving (models, features). Mi, what you want to optimize for each task, session, query: dwell time on landing page. LTVj, what you want to optimize long-term: absence time (next ad click)
  • 52. Estimating the quality of the post-click experience. Best experience is when conversion happens. Estimating the probability of conversion is hard: conversion data is not available for all advertisers; conversion data is not missing at random. Proxy metric of post-click quality: dwell time on the ad landing page (no conversion does not mean a bad experience). dwell time = t_back-to-publisher − t_ad-click
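The dwell-time definition on slide 52 translates directly into code. The sketch below assumes each ad click is logged as a pair of timestamps in seconds, with `None` when the user never returned to the publisher. That log format and the sample values are hypothetical.

```python
from statistics import median


def dwell_times(clicks):
    """dwell time = t_back_to_publisher - t_ad_click, for clicks with a return.

    clicks: iterable of (t_ad_click, t_back_to_publisher) in seconds;
            t_back_to_publisher is None when no return was observed.
    """
    return [
        back - click
        for click, back in clicks
        if back is not None and back >= click
    ]


# Hypothetical ad-click log for one publisher page.
clicks = [(100.0, 130.0), (200.0, None), (400.0, 405.0), (900.0, 1020.0)]
times = dwell_times(clicks)
print(times)
print(median(times))
```

Clicks without an observed return are dropped here for simplicity; a fuller treatment would handle them as censored observations rather than discard them.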
  • 53. Dwell time as a proxy of the post-click experience. Probability of a second click increases with dwell time. Mobile (200K ad clicks): it needs less time to get the same probability of a second click. Desktop (toolbar, 30K ad clicks): 23.3% of users visit other websites than the ad landing page before returning to the publisher; this goes down to 7.4% for dwell time up to 3 mins.
  • 54. Dwell time and absence time. Dwell time → ad click. [Bar chart: ad click difference (0% to 600%) for short ad clicks vs. long ad clicks] Positive post-click experience (“long” clicks) has an effect on users clicking on ads again (mobile). Absence time: return to publisher; click on an ad. (Lalmas et al., 2015)
  • 55. From intra- to inter- session evaluation Absence time 1.  Search 2.  Mobile advertising happy users come back
  • 57. Large-scale online measurement. System (models, features). Decide the in-the-moment metric(s). Decide the long-term-value metric(s). Which in-the-moment metric(s) are good predictors of the long-term-value metric(s)? Optimize for the identified in-the-moment metric(s). Lots of data required to remove noise. What is a signal? What is a metric?
  • 58. Small-scale measurement. O’Brien & Toms User Engagement Scale: 31 items and six sub-scales: aesthetic appeal, novelty, felt involvement, focused attention, perceived usability, endurability (O’Brien & Toms, 2010; Arguello et al., 2012; Bordino et al., Under Review)
  • 59. Towards User Engagement: happy users come back; we need to properly identify that a user is happy
  • 60. Thank you