SlideShare une entreprise Scribd logo
1  sur  47
Télécharger pour lire hors ligne
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Search Tasks, Proactive Search & Digital
Assistants
Rishabh Mehrotra
University College London
r.mehrotra@cs.ucl.ac.uk
Machine Learning Meetup, Gurgaon
Saturday 25th June, 2016
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Modelling Search Tasks & Behaviors
1 Motivation
2 From Sessions to Tasks
3 Extracting Tasks & Subtasks
4 Hierarchies of Tasks & Subtasks
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Evolution of Web Search
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Search is Everywhere
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Web Search Volume
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Next Generation Systems
Towards Task-based Search
Systems
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Slight Detour - Proactive IR & Digital Assistants
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Slight Detour - Proactive IR & Digital Assistants
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Slight Detour - Proactive IR & Digital Assistants
Information Cards:
From Queries to Cards (Shokouhi et al. SIGIR’15)
Modelling User Interests for Zero-query Ranking (Yang et
al. ECIR’16)
Intelligent Assistants:
Understanding User Satisfaction with Intelligent Assistants
(Kiseleva et al. CHIIR’16)
Automatic Online Evaluation of Intelligent Assistants
(Jiang et al. WWW’15)
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Slight Detour - Proactive IR & Digital Assistants
Modelling user interaction on Information Cards:
Click-based (pseudo) relevance labels may not be
appropriate for evaluating all types of cards.
Features
Reactive History
Proactive History
Lexical/Topical Features
Local/Temporal Features
Constant Features
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Slight Detour - Proactive IR & Digital Assistants
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Slight Detour - Proactive IR & Digital Assistants1
1
Understanding User Satisfaction with Intelligent Assistants, Kiseleva et
al. ACM CHIIR 2016
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Putting it together
Digital Assistants - entry points for all web activities
Goal is to move towards Task Completion
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Understanding Search Tasks
Search Tasks
An (atomic?) information need that may result in issuing of
one or more queries.
Image from Verma et al. 2014
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Why Search Tasks
Image from Wang et al. 2014
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Outline
Part 1 Understanding Search Sessions
Part 2 Extracting Tasks & Subtasks
Part 3 Hierarchies of Tasks & Subtasks
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Understanding Search Behavior
IR research = sessions, topics, queries
We focus on:
1 Multitasking
2 User groups based on multitasking
3 Topical heterogeneity
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Research Questions
1 RQ1: Extent of multi-tasking in search sessions?
2 RQ2: Evidence of user-level heterogeneity based on task
behavior?
1 RQ2.1: Does search task effort vary across user groups?
2 RQ2.2: Association between users’ interests and task
multiplicity?
3 RQ3: Characterizing various heterogeneities in search
behavior.
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Data Context
No of Queries 620M
No of Sessions 190M
No of Users 2M
Avg No of queries per session 3.18
Avg no of sessions per user 76.01
Avg no of tasks per session 2.08
Table: Data summary
Time Query SessionID TaskID Topic
05/29/2012 14:06:04 adele songs 1 1 Arts
05/29/2012 14:11:49 wedding venue 1 2 Society
05/29/2012 14:12:01 video download 1 3 Arts
05/29/2012 14:06:04 Obama care 2 4 News
05/29/2012 14:11:49 running shoes 2 5 Shopping
05/29/2012 14:12:01 sports shoes 2 5 Shopping
05/29/2012 14:22:12 wedding cards 2 2 Society
Table: Sample search sessions
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Extent of Multitasking
Quantifying the extent of multi-tasking
0%	
  
20%	
  
40%	
  
60%	
  
80%	
  
100%	
  
%	
  sessions	
  
No	
  of	
  tasks	
  in	
  a	
  session	
  
EXTENT	
  OF	
  MULTITASKING	
  IN	
  SESSIONS	
  
2	
   3	
   4	
   5	
   6	
   7	
   8	
   9	
   10	
   10+	
   1	
  110629692	
  
0	
  
10	
  
20	
  
30	
  
40	
  
50	
  
60	
  
No	
  of	
  Users	
  (In	
  %)	
  
Avg	
  Number	
  of	
  Tasks	
  Performed	
  in	
  a	
  Session	
  
1	
   2	
   3	
   4	
   5	
   6	
   7	
   8	
   9	
   10	
   10+	
  
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
User Groups
Focussed	
  Users	
  
Mul--­‐taskers	
  
Supertaskers	
  
0%	
  
20%	
  
40%	
  
60%	
  
80%	
  
100%	
  
%	
  of	
  User	
  Popula-on	
  
Types	
  of	
  User	
  Groups	
  
TASK	
  CENTRIC	
  USER	
  GROUPS	
  
Focussed	
   Mul--­‐Taskers	
   Supertaskers	
  
Figure: User groups based on multi-tasking behaviors
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
User Groups: Example Sessions
User Type Example Queries from a Typical Session
Focussed User
”test guide.com”, ”CNA Practice Test”, ”CNA State
Board Exam”, ”CNA Testing Schedule and Loca-
tions”, ”CNA State Board Practice Test”, ”CNA
Practice Test 2014”, CNA 50 Questions Test”, ”Free
GED Practice Test 2014”
Multi-Tasking User
”Gravity FSX 2.0”, ”Full Suspension Mountain
Bikes”, ”Walmart Cards”, ”Walmart Instant Card
Application”, ”Gravity FSX 2.0 price”, ”full suspen-
sion bike sale”
Supertasking User
”hairstyles for women over 50”, ”thin wavy hairstyles
for women”, ”facebook”, ”fb sign in”, pulled
pork crock pot recipe easy”, ”Slow-Roasted Pulled
Pork”, ”barefoot contessa”, ”miley cyrus hair styles”,
”hairstyler.com”
Table: Example query sessions from the different user groups.
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Effort Metrics across User Groups
TTFC: Time to First Click TTLC: Time to Last Click
PCC: Page Click Count PgCC: Pagination Click Count
-3
-2.5
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
TTFC TTLC PCC PgCC
Focussed
Multi-taskers
Super-taskers
Standardisedscores
Figure: User groups based on multi-tasking behaviors
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Topical Heterogeneity
0	
   0.01	
   0.02	
   0.03	
   0.04	
   0.05	
   0.06	
   0.07	
   0.08	
   0.09	
  
Shopping	
  
Home	
  
Kids	
  
Health	
  
Extent	
  of	
  Mul,tasking	
  
(a) Focused
0.55	
   0.56	
   0.57	
   0.58	
   0.59	
   0.6	
   0.61	
   0.62	
   0.63	
   0.64	
   0.65	
   0.66	
  
News	
  
Sports	
  
Arts	
  
Kids	
  
Extent	
  of	
  Mul,tasking	
  
(b) Multitasker
0.88	
   0.9	
   0.92	
   0.94	
   0.96	
  
News	
  
Sports	
  
Arts	
  
Society	
  
Extent	
  of	
  Mul,tasking	
  
(c) Supertasker
0	
   0.1	
   0.2	
   0.3	
   0.4	
   0.5	
  
Arts	
  
Adult	
  
Games	
  
Computers	
  
Extent	
  of	
  Single-­‐Tasking	
  
(d) Focused
-­‐0.6	
   -­‐0.5	
   -­‐0.4	
   -­‐0.3	
   -­‐0.2	
   -­‐0.1	
   0	
  
Business	
  
Adult	
  
Computers	
  
Games	
  
Extent	
  of	
  Single-­‐Tasking	
  
(e) Multitasker
-­‐0.86	
   -­‐0.84	
   -­‐0.82	
   -­‐0.8	
   -­‐0.78	
   -­‐0.76	
   -­‐0.74	
  
Business	
  
Home	
  
Computers	
  
Games	
  
Extent	
  of	
  Single-­‐Tasking	
  
(f) Supertasker
Figure: Top topics prone to multi-tasking (Top) and single-tasking
(Bottom) across different user groups.
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Takeaways2
Sessions should NOT be the focal units in IR
More than 70% of sessions are multi-task
Users differ based on their multi-tasking habits
Search effort varies across user groups
Some topics prone to multi-tasking than others
2
R. Mehrotra, P. Bhattacharya, E. Yilmaz; Characterizing Users
Multi-Tasking Behavior in Web Search, at ACM SIGIR Conference
CHIIR 2016
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Outline
Part 1 Understanding Search Sessions
Part 2 Extracting Tasks & Subtasks
Part 3 Hierarchies of Tasks & Subtasks
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Extracting Tasks & Subtasks
1 Task extraction
2 Subtask extraction
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Extracting Search Tasks
Various proposed strategies:
Clustering session based queries [Lucchese, WSDM’11]
Structured Learning Approach [Wang, WWW’13]
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Extracting Search Tasks
Various proposed strategies:
Hawkes Process based Task Extraction [Li, KDD’14]
Entity-based Task Extraction [Verma, CIKM’14][White,
CIKM’14]
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Extracting Search Tasks
Problems abound:
Link query to on-going task = long chains
Impure tasks
Pairwise predictions need extra post-processing
Rely on large corpus of pre-tagged queries
Do not aggregate across users
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Random Fields based Task Extraction3
Propose a latent variable model for extracting tasks:
Assumes k-underlying search tasks
Places a Markov Random Field over the latent task layer
Assumes access to a task-oracle
Each latent task is a multinomial distribution over query
terms
3
R. Mehrotra, E. Yilmaz; Extracting Search Tasks via Task Random
Field Model, SIGIR 2016 (under review)
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Random Fields based Task Extraction
Figure: The proposed task-RF model for extracting search tasks.
The generative process:
Draw task proportions vector θ ∼ Dir(α)
Draw task labels z for all queries from the joint
distribution defined in Eq.(1)
For each query (qi ) in the session:
For each query term (wqi ,j )
draw wqi ,j ∼ multi(βzi
)
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Random Fields based Task Extraction
A query marginal factorizes into its query terms:
p(q|z, β) =
Mq
j=1
p(wj |z, β) (1)
The joint distribution of θ, z, q and w can be written as:
p(θ, z, w|α, β, λ) = p(θ|α)p(z|θ, λ)
N
i=1
Mqi
j=1
p(wqi ,j |zi , β) (2)
Under the MRF model, the joint probability of all task
assignments z = zi
N
i=1 can be written as:
p(z|θ, λ) =
1
A(θ, λ)
N
i=1
p(zi |θ) exp{λ
(m,n)∈E
1(zm = zn)} (3)
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Random Fields based Task Extraction
Task Oracle: classifier which predicts whether or not 2 queries
belong to the same task.
Query Based Features
cosine cosine similarity between the term sets of the queries
edit norm edit distance between query strings
Jac Jaccard coeff between the term sets of the queries
Term proportion of common terms between the queries
url-term-match-sum u∈URLi
m(qj , u) + u∈URLj
m(qi , u)
url-term-match-max maxu∈URLi
m(qj , u) + maxu∈URLj
m(qi , u)
Embedding cosine distance between embedding vectors of the two queries
URL Based Features
dom-match the number of domain matches between URLs from both queries
Min-edit-U Minimum edit distance between all URL pairs from the queries
Avg-edit-U Average edit distance between all URL pairs from the queries
Jac-U-min Minimum Jaccard coefficient between all URL pairs from the queries
Jac-U-avg Average Jaccard coefficient between all URL pairs from the queries
Table: Features used by the classifier.
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Extracting Tasks & Subtasks
1 Task extraction
2 Subtask extraction
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Subtask Extraction4
Complex task decompose to more focussed subtasks
Number of subtasks is unknown
Given: collection of on-task queries
Couple Bayesian Nonparametrics & Word Embeddings
4
Subtask Extraction: R. Mehrotra, P. Bhattacharya, E. Yilmaz; Exploiting
Distributional Representations with Distance Dependent CRPs for Extracting
Sub-Tasks. Accepted at NAACL 2016
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Chinese Restaurant Process
p(zi = j D, α) ∝
f (dij ) if j = i
α if j = i
1 For each query i ∈ [1, N], draw assignment
zi ∼ dist − CRP(α, f , D).
2 For each sub-task, k ∈ {1, ....}, draw a parameter θ∗
k ∼ G0.
3 For each query i ∈ [1, N],
1 If ci /∈ z∗
q1:N
, set the parameter for the ith
query to θi = θqi
.
Otherwise draw the parameter from the base distribution,
θi ∼ Dirichlet(λ).
2 Draw the ith query, wi ∼ Mult(M, θi ).
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Quantifying Task based Distances
Task: plan a wedding
Sample queries: wedding planning, wedding checklist, bridal
dresses, wedding cards
Leverage word embeddings.
Classify each word as background word or
subtask-specific word.
Use a weighted combination of their embedding vectors to
encode a query’s vector:
Vq =
1
nterms
i
nqti
Σqnq
Vti (4)
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Evaluating Quality of Subtasks
Qualitative Analysis:
Proposed Approach LDA
sub-task 1 sub-task 2 sub-task 3 sub-task 1 sub-task 2 sub-task 3
wedding hairstyles used wedding dresses wedding card holders wedding insurance christian wedding vows make your own wedding invitations
wedding hair dos colorful bridal gowns indian wedding program cards destination wedding dresses brides wedding cakes pictures
curly wedding hairstyles preowned wedding dresses wedding program paper wedding planning book cheap wedding dresses planners
pictures of wedding hair wedding attire regency wedding cards party supply stores tea length wedding dresses wedding colors
CRP TW
sub-task 1 sub-task 2 sub-task 3 sub-task 1 sub-task 2 sub-task 3
wedding planning kit wedding theme wedding insurance wedding insurance christian wedding vows cheap dresses
destination wedding wedding guide weddings in vegas destination wedding plus size bridesmaid wedding cakes pictures
wedding table decorations save the date ideas wedding cakes pictures financing wedding rings wedding colors pricing weddings
1938 wedding pictures wedding vacation planning a wedding party supply stores tea length wedding dresses macys wedding dresses
User Study:
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Takeaways
Latent variable model for task extraction based on Task
oracle
Extracts more coherent tasks
Nonparametric model for extracting subtasks
Couple nonparametric approaches with word embeddings
Human judgements & coherence estimates show improved
results
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Outline
Part 1 Understanding Search Sessions
Part 2 Extracting Tasks & Subtasks
Part 3 Hierarchies of Tasks & Subtasks
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Hierarchies of Tasks & Subtasks
Complex search tasks places intense cognitive burden on
the user
Users need to explore domain-space
Flat structure-less tasks lack insights about demarcation
of subtasks
Provides tremendous opportunity to contextually target
the user (recommendations/advertisements)
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Constructing Bayesian Hierarchies
Tree as partitions & mixtures:
Tree: partition over the group of queries (Q) where probability
of a query group is defined via Dynamic Programming as
p(Q|T) = πT f (Q) + (1 − πt)
Ti ∈ch(T)
p(leaves(Ti )|Ti ) (5)
Gamma-Poisson Model of Query Affinities:
Query-Term Based Affinity (r1)
cosine cosine similarity between the term sets of the queries
edit norm edit distance between query strings
Jac Jaccard coeff between the term sets of the queries
Term proportion of common terms between the queries
URL Based Affinity (r2)
Min-edit-U Minimum edit distance between all URL pairs from the queries
Avg-edit-U Average edit distance between all URL pairs from the queries
Jac-U-min Minimum Jaccard coefficient between all URL pairs from the queries
Jac-U-avg Average Jaccard coefficient between all URL pairs from the queries
Session/User Based Affinity (r3)
Same-U if the two queries belong to the same user
Same-S if the two queries belong to the same session
Embedding Based Affinity (r4)
Embedding cosine distance between embedding vectors of the two queries
Table: Query-Query Affinities.
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Constructing Bayesian Hierarchies
Forming Arbitrary Tree Structure:
Bayesian Rose Trees (NIPS 2012)
Agglomerative Model Selection:
Bottom-up greedy agglomerative fashion
At each iteration: merges two of the trees in the forest
Which pair of tree to merge(& how) - merger that yields
the largest Bayes factor improvement
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Takeaways5
We can extract recursive hierarchies of tasks & subtasks
Nonparametric tree partition model for hierarchies
Allows for arbitrary structures
Improves term prediction & task extraction
Human labelled judgements show:
we extract coherent tasks
Subtasks are valid
Subtasks are useful for completing the overall task
5
R. Mehrotra, E. Yilmaz; Towards Hierarchies of Search Tasks & Subtasks,
In Proceedings of the Poster Track WWW 2015
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
Thank You!
ML Meetup
Rishabh
Mehrotra
Motivation
Search Tasks
From Sessions
to Tasks
Multitasking
User Groups
Topical
Heterogeneity
Extracting
Tasks &
Subtasks
Task Extraction
Subtask
Extraction
Hierarchies of
Tasks &
Subtasks
References
R. Mehrotra, P. Bhattacharya, and E. Yilmaz.
Characterizing users’ multi-tasking behavior in web search.
In Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval,
2016.
R. Mehrotra, P. Bhattacharya, and E. Yilmaz.
Deconstructing complex search tasks: A bayesian nonparametric approach for extracting sub-tasks.
In Proceedings of NAACL-HLT, pages 599–605, 2016.
R. Mehrotra, P. Bhattacharya, and E. Yilmaz.
Sessions; tasks & topics - uncovering behavioral heterogeneities in online search behavior.
In Proceedings of the 2016 39th International ACM SIGIR Conference on Research and Development
in Information Retrieval. ACM, 2016.
R. Mehrotra and E. Yilmaz.
Towards hierarchies of search tasks & subtasks.
In WWW 2015.
R. Mehrotra and E. Yilmaz.
Terms, topics & tasks: Enhanced user modelling for better personalization.
In Proceedings of the 2015 International Conference on The Theory of Information Retrieval, pages
131–140. ACM, 2015.
R. Mehrotra, E. Yilmaz, and M. Verma.
Task-based user modelling for personalization via probabilistic matrix factorization.
In RecSys Posters, 2014.
D. Odijk, R. W. White, A. Hassan Awadallah, and S. T. Dumais.
Struggling and success in web search.
In CIKM 2015.

Contenu connexe

Similaire à Search Tasks, Proactive Search & Digital Assistants

From Exploration to Construction
 - How to Support the Complex Dynamics of In...
From Exploration to Construction
 - How to Support the Complex Dynamics of In...From Exploration to Construction
 - How to Support the Complex Dynamics of In...
From Exploration to Construction
 - How to Support the Complex Dynamics of In...TimelessFuture
 
Multi-turn QA: A RNN Contextual Approach to Intent Classification for Goal-or...
Multi-turn QA: A RNN Contextual Approach to Intent Classification for Goal-or...Multi-turn QA: A RNN Contextual Approach to Intent Classification for Goal-or...
Multi-turn QA: A RNN Contextual Approach to Intent Classification for Goal-or...Martino Mensio
 
Parts 1 & 2: WWW 2018 Tutorial: Understanding User Needs & Tasks
Parts 1 & 2: WWW 2018 Tutorial: Understanding User Needs & TasksParts 1 & 2: WWW 2018 Tutorial: Understanding User Needs & Tasks
Parts 1 & 2: WWW 2018 Tutorial: Understanding User Needs & TasksRishabh Mehrotra
 
Structure, Personalization, Scale: A Deep Dive into LinkedIn Search
Structure, Personalization, Scale: A Deep Dive into LinkedIn SearchStructure, Personalization, Scale: A Deep Dive into LinkedIn Search
Structure, Personalization, Scale: A Deep Dive into LinkedIn SearchC4Media
 
Detecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile SearchDetecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile SearchJulia Kiseleva
 
Unit 5.1-Basics of Hierarchical Task Analysis (HTA).pptx
Unit 5.1-Basics of Hierarchical Task Analysis (HTA).pptxUnit 5.1-Basics of Hierarchical Task Analysis (HTA).pptx
Unit 5.1-Basics of Hierarchical Task Analysis (HTA).pptxNeetuBairwa
 
Acm productivity-webinar-2016-slides
Acm productivity-webinar-2016-slidesAcm productivity-webinar-2016-slides
Acm productivity-webinar-2016-slidesGail Murphy
 
Recommender Systems @ Scale - PyData 2019
Recommender Systems @ Scale - PyData 2019Recommender Systems @ Scale - PyData 2019
Recommender Systems @ Scale - PyData 2019Sonya Liberman
 
Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...
Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...
Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...Gabriel Moreira
 
Data-driven UX: What it really takes and how to get there
Data-driven UX: What it really takes and how to get thereData-driven UX: What it really takes and how to get there
Data-driven UX: What it really takes and how to get thereEmmanuelle Boloix, Ph.D.
 
Requirements-Collector: Automating Requirements Specification from Elicitatio...
Requirements-Collector: Automating Requirements Specification from Elicitatio...Requirements-Collector: Automating Requirements Specification from Elicitatio...
Requirements-Collector: Automating Requirements Specification from Elicitatio...Sebastiano Panichella
 
Advances in Exploratory Data Analysis, Visualisation and Quality for Data Cen...
Advances in Exploratory Data Analysis, Visualisation and Quality for Data Cen...Advances in Exploratory Data Analysis, Visualisation and Quality for Data Cen...
Advances in Exploratory Data Analysis, Visualisation and Quality for Data Cen...Hima Patel
 
Where Cognitive Science, Interaction Design and Data Dwells: The Competencies...
Where Cognitive Science, Interaction Design and Data Dwells: The Competencies...Where Cognitive Science, Interaction Design and Data Dwells: The Competencies...
Where Cognitive Science, Interaction Design and Data Dwells: The Competencies...Aaron Silvers
 
A Personalized Assistant Framework for Service Recommendation
A Personalized Assistant Framework for Service RecommendationA Personalized Assistant Framework for Service Recommendation
A Personalized Assistant Framework for Service RecommendationPradeep K. Venkatesh
 
Software Development Analytics Intro. Twitter OSS workshop
Software Development Analytics Intro. Twitter OSS workshopSoftware Development Analytics Intro. Twitter OSS workshop
Software Development Analytics Intro. Twitter OSS workshopManrique Lopez
 
The LinkedIn Effect: A new way of learning? OU Conference presentation
The LinkedIn Effect: A new way of learning?  OU Conference presentationThe LinkedIn Effect: A new way of learning?  OU Conference presentation
The LinkedIn Effect: A new way of learning? OU Conference presentationLouise Worsley
 
User Research 101: DIY Quick Course - CodeMash 2.0.1.1.
User Research 101: DIY Quick Course - CodeMash 2.0.1.1.User Research 101: DIY Quick Course - CodeMash 2.0.1.1.
User Research 101: DIY Quick Course - CodeMash 2.0.1.1.Carol Smith
 
Social Network Analysis based on MOOC's (Massive Open Online Classes)
Social Network Analysis based on MOOC's (Massive Open Online Classes)Social Network Analysis based on MOOC's (Massive Open Online Classes)
Social Network Analysis based on MOOC's (Massive Open Online Classes)ShankarPrasaadRajama
 

Similaire à Search Tasks, Proactive Search & Digital Assistants (20)

From Exploration to Construction
 - How to Support the Complex Dynamics of In...
From Exploration to Construction
 - How to Support the Complex Dynamics of In...From Exploration to Construction
 - How to Support the Complex Dynamics of In...
From Exploration to Construction
 - How to Support the Complex Dynamics of In...
 
Multi-turn QA: A RNN Contextual Approach to Intent Classification for Goal-or...
Multi-turn QA: A RNN Contextual Approach to Intent Classification for Goal-or...Multi-turn QA: A RNN Contextual Approach to Intent Classification for Goal-or...
Multi-turn QA: A RNN Contextual Approach to Intent Classification for Goal-or...
 
Parts 1 & 2: WWW 2018 Tutorial: Understanding User Needs & Tasks
Parts 1 & 2: WWW 2018 Tutorial: Understanding User Needs & TasksParts 1 & 2: WWW 2018 Tutorial: Understanding User Needs & Tasks
Parts 1 & 2: WWW 2018 Tutorial: Understanding User Needs & Tasks
 
Structure, Personalization, Scale: A Deep Dive into LinkedIn Search
Structure, Personalization, Scale: A Deep Dive into LinkedIn SearchStructure, Personalization, Scale: A Deep Dive into LinkedIn Search
Structure, Personalization, Scale: A Deep Dive into LinkedIn Search
 
Detecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile SearchDetecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile Search
 
Unit 5.1-Basics of Hierarchical Task Analysis (HTA).pptx
Unit 5.1-Basics of Hierarchical Task Analysis (HTA).pptxUnit 5.1-Basics of Hierarchical Task Analysis (HTA).pptx
Unit 5.1-Basics of Hierarchical Task Analysis (HTA).pptx
 
Acm productivity-webinar-2016-slides
Acm productivity-webinar-2016-slidesAcm productivity-webinar-2016-slides
Acm productivity-webinar-2016-slides
 
Social Semantic Search and Browsing
Social Semantic Search and BrowsingSocial Semantic Search and Browsing
Social Semantic Search and Browsing
 
Recommender Systems @ Scale - PyData 2019
Recommender Systems @ Scale - PyData 2019Recommender Systems @ Scale - PyData 2019
Recommender Systems @ Scale - PyData 2019
 
Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...
Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...
Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine...
 
Data-driven UX: What it really takes and how to get there
Data-driven UX: What it really takes and how to get thereData-driven UX: What it really takes and how to get there
Data-driven UX: What it really takes and how to get there
 
Requirements-Collector: Automating Requirements Specification from Elicitatio...
Requirements-Collector: Automating Requirements Specification from Elicitatio...Requirements-Collector: Automating Requirements Specification from Elicitatio...
Requirements-Collector: Automating Requirements Specification from Elicitatio...
 
Beyond User Research
Beyond User ResearchBeyond User Research
Beyond User Research
 
Advances in Exploratory Data Analysis, Visualisation and Quality for Data Cen...
Advances in Exploratory Data Analysis, Visualisation and Quality for Data Cen...Advances in Exploratory Data Analysis, Visualisation and Quality for Data Cen...
Advances in Exploratory Data Analysis, Visualisation and Quality for Data Cen...
 
Where Cognitive Science, Interaction Design and Data Dwells: The Competencies...
Where Cognitive Science, Interaction Design and Data Dwells: The Competencies...Where Cognitive Science, Interaction Design and Data Dwells: The Competencies...
Where Cognitive Science, Interaction Design and Data Dwells: The Competencies...
 
A Personalized Assistant Framework for Service Recommendation
A Personalized Assistant Framework for Service RecommendationA Personalized Assistant Framework for Service Recommendation
A Personalized Assistant Framework for Service Recommendation
 
Software Development Analytics Intro. Twitter OSS workshop
Software Development Analytics Intro. Twitter OSS workshopSoftware Development Analytics Intro. Twitter OSS workshop
Software Development Analytics Intro. Twitter OSS workshop
 
The LinkedIn Effect: A new way of learning? OU Conference presentation
The LinkedIn Effect: A new way of learning?  OU Conference presentationThe LinkedIn Effect: A new way of learning?  OU Conference presentation
The LinkedIn Effect: A new way of learning? OU Conference presentation
 
User Research 101: DIY Quick Course - CodeMash 2.0.1.1.
User Research 101: DIY Quick Course - CodeMash 2.0.1.1.User Research 101: DIY Quick Course - CodeMash 2.0.1.1.
User Research 101: DIY Quick Course - CodeMash 2.0.1.1.
 
Social Network Analysis based on MOOC's (Massive Open Online Classes)
Social Network Analysis based on MOOC's (Massive Open Online Classes)Social Network Analysis based on MOOC's (Massive Open Online Classes)
Social Network Analysis based on MOOC's (Massive Open Online Classes)
 

Plus de Rishabh Mehrotra

Parts 5 & 6: WWW 2018 tutorial on Understanding User Needs & Tasks
Parts 5 & 6: WWW 2018 tutorial on Understanding User Needs & TasksParts 5 & 6: WWW 2018 tutorial on Understanding User Needs & Tasks
Parts 5 & 6: WWW 2018 tutorial on Understanding User Needs & TasksRishabh Mehrotra
 
Part 4: WWW 2018 tutorial on Understanding User Needs & Tasks
Part 4: WWW 2018 tutorial on Understanding User Needs & Tasks Part 4: WWW 2018 tutorial on Understanding User Needs & Tasks
Part 4: WWW 2018 tutorial on Understanding User Needs & Tasks Rishabh Mehrotra
 
Part 3: WWW 2018 tutorial on Understanding User Needs & Tasks
Part 3: WWW 2018 tutorial on Understanding User Needs & TasksPart 3: WWW 2018 tutorial on Understanding User Needs & Tasks
Part 3: WWW 2018 tutorial on Understanding User Needs & TasksRishabh Mehrotra
 
Facebook London - Learning from User Interactions
Facebook London - Learning from User InteractionsFacebook London - Learning from User Interactions
Facebook London - Learning from User InteractionsRishabh Mehrotra
 
Predicting Supply-side Engagement on Video Sharing Platforms
Predicting Supply-side Engagement on Video Sharing PlatformsPredicting Supply-side Engagement on Video Sharing Platforms
Predicting Supply-side Engagement on Video Sharing PlatformsRishabh Mehrotra
 
SIGIR 2017: Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian ...
SIGIR 2017: Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian ...SIGIR 2017: Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian ...
SIGIR 2017: Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian ...Rishabh Mehrotra
 
Fairness in Web Search: Auditing Search Engines for Differential Satisfaction
Fairness in Web Search: Auditing Search Engines for Differential SatisfactionFairness in Web Search: Auditing Search Engines for Differential Satisfaction
Fairness in Web Search: Auditing Search Engines for Differential SatisfactionRishabh Mehrotra
 
Characterizing Cross Domain Search Behavior
Characterizing Cross Domain Search BehaviorCharacterizing Cross Domain Search Behavior
Characterizing Cross Domain Search BehaviorRishabh Mehrotra
 

Plus de Rishabh Mehrotra (8)

Parts 5 & 6: WWW 2018 tutorial on Understanding User Needs & Tasks
Parts 5 & 6: WWW 2018 tutorial on Understanding User Needs & TasksParts 5 & 6: WWW 2018 tutorial on Understanding User Needs & Tasks
Parts 5 & 6: WWW 2018 tutorial on Understanding User Needs & Tasks
 
Part 4: WWW 2018 tutorial on Understanding User Needs & Tasks
Part 4: WWW 2018 tutorial on Understanding User Needs & Tasks Part 4: WWW 2018 tutorial on Understanding User Needs & Tasks
Part 4: WWW 2018 tutorial on Understanding User Needs & Tasks
 
Part 3: WWW 2018 tutorial on Understanding User Needs & Tasks
Part 3: WWW 2018 tutorial on Understanding User Needs & TasksPart 3: WWW 2018 tutorial on Understanding User Needs & Tasks
Part 3: WWW 2018 tutorial on Understanding User Needs & Tasks
 
Facebook London - Learning from User Interactions
Facebook London - Learning from User InteractionsFacebook London - Learning from User Interactions
Facebook London - Learning from User Interactions
 
Predicting Supply-side Engagement on Video Sharing Platforms
Predicting Supply-side Engagement on Video Sharing PlatformsPredicting Supply-side Engagement on Video Sharing Platforms
Predicting Supply-side Engagement on Video Sharing Platforms
 
SIGIR 2017: Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian ...
SIGIR 2017: Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian ...SIGIR 2017: Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian ...
SIGIR 2017: Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian ...
 
Fairness in Web Search: Auditing Search Engines for Differential Satisfaction
Fairness in Web Search: Auditing Search Engines for Differential SatisfactionFairness in Web Search: Auditing Search Engines for Differential Satisfaction
Fairness in Web Search: Auditing Search Engines for Differential Satisfaction
 
Characterizing Cross Domain Search Behavior
Characterizing Cross Domain Search BehaviorCharacterizing Cross Domain Search Behavior
Characterizing Cross Domain Search Behavior
 

Dernier

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 

Dernier (20)

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

Search Tasks, Proactive Search & Digital Assistants

  • 1. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Search Tasks, Proactive Search & Digital Assistants Rishabh Mehrotra University College London r.mehrotra@cs.ucl.ac.uk Machine Learning Meetup, Gurgaon Saturday 25th June, 2016
  • 2. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Modelling Search Tasks & Behaviors 1 Motivation 2 From Sessions to Tasks 3 Extracting Tasks & Subtasks 4 Hierarchies of Tasks & Subtasks
  • 3. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Evolution of Web Search
  • 4. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Search is Everywhere
  • 5. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Web Search Volume
  • 6. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Next Generation Systems Towards Task-based Search Systems
  • 7. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Slight Detour - Proactive IR & Digital Assistants
  • 8. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Slight Detour - Proactive IR & Digital Assistants
  • 9. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Slight Detour - Proactive IR & Digital Assistants Information Cards: From Queries to Cards (Shokouhi et al. SIGIR’15) Modelling User Interests for Zero-query Ranking (Yang et al. ECIR’16) Intelligent Assistants: Understanding User Satisfaction with Intelligent Assistants (Kiseleva et al. CHIIR’16) Automatic Online Evaluation of Intelligent Assistants (Jiang et al. WWW’15)
  • 10. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Slight Detour - Proactive IR & Digital Assistants Modelling user interaction on Information Cards: Click-based (pseudo) relevance labels may not be appropriate for evaluating all types of cards. Features Reactive History Proactive History Lexical/Topical Features Local/Temporal Features Constant Features
  • 11. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Slight Detour - Proactive IR & Digital Assistants
  • 12. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Slight Detour - Proactive IR & Digital Assistants1 1 Understanding User Satisfaction with Intelligent Assistants, Kiseleva et al. ACM CHIIR 2016
  • 13. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Putting it together Digital Assistants - entry points for all web activities Goal is to move towards Task Completion
  • 14. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Understanding Search Tasks Search Tasks An (atomic?) information need that may result in issuing of one or more queries. Image from Verma et al. 2014
  • 15. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Why Search Tasks Image from Wang et al. 2014
  • 16. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Outline Part 1 Understanding Search Sessions Part 2 Extracting Tasks & Subtasks Part 3 Hierarchies of Tasks & Subtasks
  • 17. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Understanding Search Behavior IR research = sessions, topics, queries We focus on: 1 Multitasking 2 User groups based on multitasking 3 Topical heterogeneity
  • 18. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Research Questions 1 RQ1: Extent of multi-tasking in search sessions? 2 RQ2: Evidence of user-level heterogeneity based on task behavior? 1 RQ2.1: Does search task effort vary across user groups? 2 RQ2.2: Association between users’ interests and task multiplicity? 3 RQ3: Characterizing various heterogeneities in search behavior.
  • 19. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Data Context No of Queries 620M No of Sessions 190M No of Users 2M Avg No of queries per session 3.18 Avg no of sessions per user 76.01 Avg no of tasks per session 2.08 Table: Data summary Time Query SessionID TaskID Topic 05/29/2012 14:06:04 adele songs 1 1 Arts 05/29/2012 14:11:49 wedding venue 1 2 Society 05/29/2012 14:12:01 video download 1 3 Arts 05/29/2012 14:06:04 Obama care 2 4 News 05/29/2012 14:11:49 running shoes 2 5 Shopping 05/29/2012 14:12:01 sports shoes 2 5 Shopping 05/29/2012 14:22:12 wedding cards 2 2 Society Table: Sample search sessions
  • 20. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Extent of Multitasking Quantifying the extent of multi-tasking 0%   20%   40%   60%   80%   100%   %  sessions   No  of  tasks  in  a  session   EXTENT  OF  MULTITASKING  IN  SESSIONS   2   3   4   5   6   7   8   9   10   10+   1  110629692   0   10   20   30   40   50   60   No  of  Users  (In  %)   Avg  Number  of  Tasks  Performed  in  a  Session   1   2   3   4   5   6   7   8   9   10   10+  
  • 21. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks User Groups Focussed  Users   Mul--­‐taskers   Supertaskers   0%   20%   40%   60%   80%   100%   %  of  User  Popula-on   Types  of  User  Groups   TASK  CENTRIC  USER  GROUPS   Focussed   Mul--­‐Taskers   Supertaskers   Figure: User groups based on multi-tasking behaviors
  • 22. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks User Groups: Example Sessions User Type Example Queries from a Typical Session Focussed User ”test guide.com”, ”CNA Practice Test”, ”CNA State Board Exam”, ”CNA Testing Schedule and Loca- tions”, ”CNA State Board Practice Test”, ”CNA Practice Test 2014”, CNA 50 Questions Test”, ”Free GED Practice Test 2014” Multi-Tasking User ”Gravity FSX 2.0”, ”Full Suspension Mountain Bikes”, ”Walmart Cards”, ”Walmart Instant Card Application”, ”Gravity FSX 2.0 price”, ”full suspen- sion bike sale” Supertasking User ”hairstyles for women over 50”, ”thin wavy hairstyles for women”, ”facebook”, ”fb sign in”, pulled pork crock pot recipe easy”, ”Slow-Roasted Pulled Pork”, ”barefoot contessa”, ”miley cyrus hair styles”, ”hairstyler.com” Table: Example query sessions from the different user groups.
  • 23. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Effort Metrics across User Groups TTFC: Time to First Click TTLC: Time to Last Click PCC: Page Click Count PgCC: Pagination Click Count -3 -2.5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 TTFC TTLC PCC PgCC Focussed Multi-taskers Super-taskers Standardisedscores Figure: User groups based on multi-tasking behaviors
  • 24. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Topical Heterogeneity 0   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09   Shopping   Home   Kids   Health   Extent  of  Mul,tasking   (a) Focused 0.55   0.56   0.57   0.58   0.59   0.6   0.61   0.62   0.63   0.64   0.65   0.66   News   Sports   Arts   Kids   Extent  of  Mul,tasking   (b) Multitasker 0.88   0.9   0.92   0.94   0.96   News   Sports   Arts   Society   Extent  of  Mul,tasking   (c) Supertasker 0   0.1   0.2   0.3   0.4   0.5   Arts   Adult   Games   Computers   Extent  of  Single-­‐Tasking   (d) Focused -­‐0.6   -­‐0.5   -­‐0.4   -­‐0.3   -­‐0.2   -­‐0.1   0   Business   Adult   Computers   Games   Extent  of  Single-­‐Tasking   (e) Multitasker -­‐0.86   -­‐0.84   -­‐0.82   -­‐0.8   -­‐0.78   -­‐0.76   -­‐0.74   Business   Home   Computers   Games   Extent  of  Single-­‐Tasking   (f) Supertasker Figure: Top topics prone to multi-tasking (Top) and single-tasking (Bottom) across different user groups.
  • 25. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Takeaways2 Sessions should NOT be the focal units in IR More than 70% of sessions are multi-task Users differ based on their multi-tasking habits Search effort varies across user groups Some topics prone to multi-tasking than others 2 R. Mehrotra, P. Bhattacharya, E. Yilmaz; Characterizing Users Multi-Tasking Behavior in Web Search, at ACM SIGIR Conference CHIIR 2016
  • 26. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Outline Part 1 Understanding Search Sessions Part 2 Extracting Tasks & Subtasks Part 3 Hierarchies of Tasks & Subtasks
  • 27. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Extracting Tasks & Subtasks 1 Task extraction 2 Subtask extraction
  • 28. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Extracting Search Tasks Various proposed strategies: Clustering session based queries [Lucchese, WSDM’11] Structured Learning Approach [Wang, WWW’13]
  • 29. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Extracting Search Tasks Various proposed strategies: Hawkes Process based Task Extraction [Li, KDD’14] Entity-based Task Extraction [Verma, CIKM’14][White, CIKM’14]
  • 30. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Extracting Search Tasks Problems abound: Link query to on-going task = long chains Impure tasks Pairwise predictions need extra post-processing Rely on large corpus of pre-tagged queries Do not aggregate across users
  • 31. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Random Fields based Task Extraction3 Propose a latent variable model for extracting tasks: Assumes k-underlying search tasks Places a Markov Random Field over the latent task layer Assumes access to a task-oracle Each latent task is a multinomial distribution over query terms 3 R. Mehrotra, E. Yilmaz; Extracting Search Tasks via Task Random Field Model, SIGIR 2016 (under review)
  • 32. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Random Fields based Task Extraction Figure: The proposed task-RF model for extracting search tasks. The generative process: Draw task proportions vector θ ∼ Dir(α) Draw task labels z for all queries from the joint distribution defined in Eq.(1) For each query (qi ) in the session: For each query term (wqi ,j ) draw wqi ,j ∼ multi(βzi )
  • 33. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Random Fields based Task Extraction A query marginal factorizes into its query terms: p(q|z, β) = Mq j=1 p(wj |z, β) (1) The joint distribution of θ, z, q and w can be written as: p(θ, z, w|α, β, λ) = p(θ|α)p(z|θ, λ) N i=1 Mqi j=1 p(wqi ,j |zi , β) (2) Under the MRF model, the joint probability of all task assignments z = zi N i=1 can be written as: p(z|θ, λ) = 1 A(θ, λ) N i=1 p(zi |θ) exp{λ (m,n)∈E 1(zm = zn)} (3)
  • 34. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Random Fields based Task Extraction Task Oracle: classifier which predicts whether or not 2 queries belong to the same task. Query Based Features cosine cosine similarity between the term sets of the queries edit norm edit distance between query strings Jac Jaccard coeff between the term sets of the queries Term proportion of common terms between the queries url-term-match-sum u∈URLi m(qj , u) + u∈URLj m(qi , u) url-term-match-max maxu∈URLi m(qj , u) + maxu∈URLj m(qi , u) Embedding cosine distance between embedding vectors of the two queries URL Based Features dom-match the number of domain matches between URLs from both queries Min-edit-U Minimum edit distance between all URL pairs from the queries Avg-edit-U Average edit distance between all URL pairs from the queries Jac-U-min Minimum Jaccard coefficient between all URL pairs from the queries Jac-U-avg Average Jaccard coefficient between all URL pairs from the queries Table: Features used by the classifier.
  • 35. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Extracting Tasks & Subtasks 1 Task extraction 2 Subtask extraction
  • 36. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Subtask Extraction4 Complex task decompose to more focussed subtasks Number of subtasks is unknown Given: collection of on-task queries Couple Bayesian Nonparametrics & Word Embeddings 4 Subtask Extraction: R. Mehrotra, P. Bhattacharya, E. Yilmaz; Exploiting Distributional Representations with Distance Dependent CRPs for Extracting Sub-Tasks. Accepted at NAACL 2016
  • 37. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Chinese Restaurant Process p(zi = j D, α) ∝ f (dij ) if j = i α if j = i 1 For each query i ∈ [1, N], draw assignment zi ∼ dist − CRP(α, f , D). 2 For each sub-task, k ∈ {1, ....}, draw a parameter θ∗ k ∼ G0. 3 For each query i ∈ [1, N], 1 If ci /∈ z∗ q1:N , set the parameter for the ith query to θi = θqi . Otherwise draw the parameter from the base distribution, θi ∼ Dirichlet(λ). 2 Draw the ith query, wi ∼ Mult(M, θi ).
  • 38. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Quantifying Task based Distances Task: plan a wedding Sample queries: wedding planning, wedding checklist, bridal dresses, wedding cards Leverage word embeddings. Classify each word as background word or subtask-specific word. Use a weighted combination of their embedding vectors to encode a query’s vector: Vq = 1 nterms i nqti Σqnq Vti (4)
  • 39. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Evaluating Quality of Subtasks Qualitative Analysis: Proposed Approach LDA sub-task 1 sub-task 2 sub-task 3 sub-task 1 sub-task 2 sub-task 3 wedding hairstyles used wedding dresses wedding card holders wedding insurance christian wedding vows make your own wedding invitations wedding hair dos colorful bridal gowns indian wedding program cards destination wedding dresses brides wedding cakes pictures curly wedding hairstyles preowned wedding dresses wedding program paper wedding planning book cheap wedding dresses planners pictures of wedding hair wedding attire regency wedding cards party supply stores tea length wedding dresses wedding colors CRP TW sub-task 1 sub-task 2 sub-task 3 sub-task 1 sub-task 2 sub-task 3 wedding planning kit wedding theme wedding insurance wedding insurance christian wedding vows cheap dresses destination wedding wedding guide weddings in vegas destination wedding plus size bridesmaid wedding cakes pictures wedding table decorations save the date ideas wedding cakes pictures financing wedding rings wedding colors pricing weddings 1938 wedding pictures wedding vacation planning a wedding party supply stores tea length wedding dresses macys wedding dresses User Study:
  • 40. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Takeaways Latent variable model for task extraction based on Task oracle Extracts more coherent tasks Nonparametric model for extracting subtasks Couple nonparametric approaches with word embeddings Human judgements & coherence estimates show improved results
  • 41. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Outline Part 1 Understanding Search Sessions Part 2 Extracting Tasks & Subtasks Part 3 Hierarchies of Tasks & Subtasks
  • 42. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Hierarchies of Tasks & Subtasks Complex search tasks places intense cognitive burden on the user Users need to explore domain-space Flat structure-less tasks lack insights about demarcation of subtasks Provides tremendous opportunity to contextually target the user (recommendations/advertisements)
  • 43. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Constructing Bayesian Hierarchies Tree as partitions & mixtures: Tree: partition over the group of queries (Q) where probability of a query group is defined via Dynamic Programming as p(Q|T) = πT f (Q) + (1 − πt) Ti ∈ch(T) p(leaves(Ti )|Ti ) (5) Gamma-Poisson Model of Query Affinities: Query-Term Based Affinity (r1) cosine cosine similarity between the term sets of the queries edit norm edit distance between query strings Jac Jaccard coeff between the term sets of the queries Term proportion of common terms between the queries URL Based Affinity (r2) Min-edit-U Minimum edit distance between all URL pairs from the queries Avg-edit-U Average edit distance between all URL pairs from the queries Jac-U-min Minimum Jaccard coefficient between all URL pairs from the queries Jac-U-avg Average Jaccard coefficient between all URL pairs from the queries Session/User Based Affinity (r3) Same-U if the two queries belong to the same user Same-S if the two queries belong to the same session Embedding Based Affinity (r4) Embedding cosine distance between embedding vectors of the two queries Table: Query-Query Affinities.
  • 44. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Constructing Bayesian Hierarchies Forming Arbitrary Tree Structure: Bayesian Rose Trees (NIPS 2012) Agglomerative Model Selection: Bottom-up greedy agglomerative fashion At each iteration: merges two of the trees in the forest Which pair of tree to merge(& how) - merger that yields the largest Bayes factor improvement
  • 45. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Takeaways5 We can extract recursive hierarchies of tasks & subtasks Nonparametric tree partition model for hierarchies Allows for arbitrary structures Improves term prediction & task extraction Human labelled judgements show: we extract coherent tasks Subtasks are valid Subtasks are useful for completing the overall task 5 R. Mehrotra, E. Yilmaz; Towards Hierarchies of Search Tasks & Subtasks, In Proceedings of the Poster Track WWW 2015
  • 46. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks Thank You!
  • 47. ML Meetup Rishabh Mehrotra Motivation Search Tasks From Sessions to Tasks Multitasking User Groups Topical Heterogeneity Extracting Tasks & Subtasks Task Extraction Subtask Extraction Hierarchies of Tasks & Subtasks References R. Mehrotra, P. Bhattacharya, and E. Yilmaz. Characterizing users’ multi-tasking behavior in web search. In Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval, 2016. R. Mehrotra, P. Bhattacharya, and E. Yilmaz. Deconstructing complex search tasks: A bayesian nonparametric approach for extracting sub-tasks. In Proceedings of NAACL-HLT, pages 599–605, 2016. R. Mehrotra, P. Bhattacharya, and E. Yilmaz. Sessions; tasks & topics - uncovering behavioral heterogeneities in online search behavior. In Proceedings of the 2016 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2016. R. Mehrotra and E. Yilmaz. Towards hierarchies of search tasks & subtasks. In WWW 2015. R. Mehrotra and E. Yilmaz. Terms, topics & tasks: Enhanced user modelling for better personalization. In Proceedings of the 2015 International Conference on The Theory of Information Retrieval, pages 131–140. ACM, 2015. R. Mehrotra, E. Yilmaz, and M. Verma. Task-based user modelling for personalization via probabilistic matrix factorization. In RecSys Posters, 2014. D. Odijk, R. W. White, A. Hassan Awadallah, and S. T. Dumais. Struggling and success in web search. In CIKM 2015.