Optimizing Search User Interfaces and Interactions within Professional Social Networks
1. Optimizing Search User Interfaces and Interactions within Professional Social Networks
PhD Candidate: Nikita V Spirin (UIUC)
PhD Committee: Prof. Karrie G Karahalios (UIUC, co-adviser)
Prof. ChengXiang Zhai (UIUC, co-adviser)
Prof. Jiawei Han (UIUC)
Dr. Daniel Tunkelang (LinkedIn, Google, Endeca)
2. Imagine that you are looking for a
Software Engineer job in New York
3. Keyword search for entities
(e.g. people, jobs, groups)
Faceted search to filter
entities based on attributes
To help users cope with the immense scale and
influx of new information, professional social
networks provide search functionality
4. Search within PSNs is fundamentally different
from web search and traditional IR
• The units of retrieval are structured and typed entities
rather than documents.
• The entities aren't independent of each other but form an entity graph; users themselves are part of this graph.
• Sorting by relevance, typical for web search, is not the only way to order search results; there are many other orderings, e.g. sort by date or by salary.
• Rather than serving the mass market, PSNs target knowledge workers.
5. “...it is clearly the case that the new models and
associated representation and ranking techniques
lead to only incremental (if that) improvement in
performance over previous models and techniques,
which is generally not statistically significant (e.g.
Sparck Jones, 2005); and, that such improvement,
as determined in TREC-style evaluation, rarely, if
ever, leads to improved performance by human
searchers in interactive IR systems...”
Nicholas Belkin
Keynote at ECIR 2008
6. How can we optimize search user
interfaces (SUI) and interactions
within professional social networks?
7. How can we optimize SUIs and interactions
within professional social networks?
Filters
Query formulation, suggestions… Resorting
Snippets for jobs/people
Snippets for jobs/people
Snippets for jobs/people
Breadcrumbs Breadcrumbs Breadcrumbs
15. • Interactive free-text queries (e.g. “Stephen Robertson“,
“SIGIR”, “Chinese Buffet”)
• Interactive structured queries (e.g. “Photos of people
who visited Beijing“)
• One-shot free-text queries (e.g. “big data”, “query log
mining“, “Shanghai”) limited to users' status updates
17. We explore the way people search for
people on Facebook
• RQ1: How does search behavior differ for NEQs and SQs?
• RQ2: How does search behavior depend on the graph search
distance (friend vs. non-friend)?
• RQ3: How does search behavior depend on demographic
attributes (age, gender, number of friends, celebrity status)?
• RQ4: How are structured querying capabilities used by the users of Graph Search?
18. We use four interconnected data sets provided by Facebook
Anonymized Named Entity Query Log
• 3M non-novice users
• 58.5M queries
• Sept 2013 – Oct 2013
Anonymized Structured Query Log
• 3M non-novice users
• 10.9M queries
• Sept 2013 – Oct 2013
Anonymized Social Graph
• 858M vertices
• 270B edges
• Oct 2013 snapshot
Anonymized User Profiles
• 858M vertices
• Age, gender, # of friends
• en_US (English + USA)
19. Definitions: graph search distance
Named Entity Query
Use a traditional graph-theoretical
definition of the graph distance
Structured Query
1. If one entity, use a traditional
graph-theoretical definition
2. If 2+ entities, compute the
distance to each one and
renormalize by computing a bit
vector with three components
(one for each of the three classes
of the graph distance).
RQ1, RQ2
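The distance-class definition above can be sketched in code. This is a minimal illustration under stated assumptions: three distance classes (self, friend at distance 1, non-friend at distance 2 or more, including unreachable), plain BFS for the graph distance, and a 3-component bit vector for structured queries touching 2+ entities. All names are hypothetical, not Facebook's implementation.

```python
from collections import deque

# Classes: 0 = self, 1 = friend (distance 1), 2 = non-friend (distance >= 2).
# Hypothetical sketch of the definition above, not the production system.

def graph_distance(adj, src, dst):
    """Plain BFS distance between two vertices; None if unreachable."""
    if src == dst:
        return 0
    seen, frontier, d = {src}, deque([src]), 0
    while frontier:
        d += 1
        for _ in range(len(frontier)):
            u = frontier.popleft()
            for v in adj.get(u, ()):
                if v == dst:
                    return d
                if v not in seen:
                    seen.add(v)
                    frontier.append(v)
    return None

def distance_class(d):
    if d == 0:
        return 0          # self
    if d == 1:
        return 1          # friend
    return 2              # non-friend (d >= 2 or unreachable)

def sq_distance_vector(adj, searcher, entities):
    """For an SQ with 2+ entities, renormalize into a bit vector with
    three components (one per distance class), as described above."""
    bits = [0, 0, 0]
    for e in entities:
        bits[distance_class(graph_distance(adj, searcher, e))] = 1
    return bits
```

For a path graph a–b–c, a query from a touching all three vertices covers all three classes.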
20. NEQs and SQs complement each other, enabling more effective exploration of the network
• Users search for friends using NEQs and for non-friends using SQs.
• Self queries account for a small share of the overall query volume.
• Users search for themselves more using SQs.
RQ1, RQ2
22. Graph search distance vs. Age (10-year bins)
Users write NEQs for friends more often compared to NEQs
for non-friends across all age bins.
[Chart: NEQ 1st/user and NEQ 2nd+/user by age bin (10–80)]
RQ2, RQ3
23. Graph search distance vs. Age (10-year bins)
The graph for SQs is bi-modal. Non-friend SQs prevail for
the younger users. Friend SQs prevail for the older users.
[Chart: SQ 1st/user and SQ 2nd+/user by age bin (10–80)]
RQ2, RQ3
24. Graph search distance vs. Age (10-year bins)
Younger users search more actively for non-friends and older users for friends, relative to the average user.
[Chart: NEQ and SQ 1st/(1st + 2nd+) ratios by age bin]
RQ2, RQ3
25. Graph search distance vs. Gender
Females write more queries than males, consistently across query types (both NEQs and SQs).
[Charts: NEQ and SQ 1st/user, 2nd+/user, and total/user by gender]
RQ2, RQ3
26. Graph search distance vs. Number of friends
(100-friend bins, from 0 to 1500)
The more friends a user has, the more friend NEQs the user
writes. The trend for non-friend NEQs slightly declines.
[Chart: NEQ 1st/user and NEQ 2nd+/user by number-of-friends bin]
RQ2, RQ3
27. Graph search distance vs. Number of friends (100-friend bins, from 0 to 1500)
Users with more friends write fewer non-friend SQs.
[Chart: SQ 1st/user, SQ 2nd+/user, SQ/user by number-of-friends bin]
RQ2, RQ3
28. Graph search distance vs. Number of friends (100-friend bins, from 0 to 1500)
The trend for non-friend NEQs is flat, while friend NEQs contribute to the growth of the query volume.
[Chart: NEQ 1st/user, NEQ 2nd+/user, NEQ/user by number-of-friends bin]
RQ2, RQ3
29. Graph search distance vs. Number of friends (100-friend bins, from 0 to 1500)
The trend for friend SQs is flat, while the volume of non-friend SQs changes with the number of friends.
[Chart: SQ 1st/user, SQ 2nd+/user, SQ/user by number-of-friends bin]
RQ2, RQ3
30. Graph search distance vs. Celebrity status
Celebrity users submit more NEQs and fewer SQs than typical users.
[Charts: NEQ and SQ 1st/user, 2nd+/user, and total/user for typical vs. celebrity users]
RQ2, RQ3
31. Graph search distance vs. Celebrity status
Celebrities search more for other celebrities than typical users.
Both user groups are more likely to search for a celebrity when
they write a non-friend query relative to a friend query.
RQ3
36. Structured query popularity vs. Length,
measured as # of functional predicates
RQ4
• Shorter SQs are more popular.
• Users write shorter grammar queries when they search for first-degree connections.
RQ4
39. Grammar usage for name disambiguation
RQ4
Top-5 groups of
disambiguation
predicates used in SQs
1. Location
2. Affiliation (e.g. Company)
3. Interest
4. Gender
5. Relationship
40. Key take-aways and design implications
• Both NEQs and SQs are important to facilitate navigation
and exploration within the social network
– Users search for friends with NEQs
– Users search for non-friends and explore the graph using SQs
• Personalized search query suggestions are very promising
– Focus on SQs if time or resources are limited, since SQ usage has higher variance across demographic groups
– Don’t limit query suggestions to friends only; include some
interesting distant network vertices
– Take into account a predicate degree preference distribution, i.e.
ranking entities for a predicate using its graph distance distribution
44. Search for “product manager” sort by “relevance”
45. Search for “product manager” sort by “date desc”
46. Search for “table” sort by “relevance”
47. Search for “table” sort by “time desc”
48. Search for “chocolate” sort by “relevance”
49. Search for “chocolate” sort by “price asc”
50. Problems with the existing SUIs supporting
result re-sorting by an attribute value
• When results are sorted by relevance, the output is good
– Average Precision@10 is 0.86
– Results are personalized for the user
• When sorting by the attribute value, e.g. salary high-to-
low, price low-to-high, or date recent-to-old, there are
many irrelevant results at the top of the SERP
– Average Precision@1 is 0.44
– Average Precision@5 is 0.45
– 61% of queries have Precision@10 below 0.5
– Personalization is gone
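For concreteness, the Precision@k numbers above follow the standard definition, the fraction of relevant results among the top k; a minimal sketch:

```python
# Precision@k: the fraction of relevant results among the top-k retrieved.
# Standard definition, shown for concreteness.
def precision_at_k(relevant_flags, k):
    """relevant_flags: 1/0 relevance of results in ranked order."""
    return sum(relevant_flags[:k]) / k
```

A SERP whose top 10 contains four relevant results has Precision@10 = 0.4.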
52. We explore how to improve relevance of
search results sorted by an attribute value
• RQ5: Can the quality be improved by incorporating
relevance into the ranking process?
• RQ6: What is the best way to accomplish it?
55. Evaluation trace for a toy example problem
{(0, 0); (1, 3); (2, 1); (3, 2); (4, 1); (5, 3)}
Dependencies between problems
in the memoization matrix and
proper evaluation order
Reconstruction of the optimal
path using the intermediate
values in the memoization matrix
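The memoization-matrix idea above can be sketched as a small dynamic program. A hedged illustration under stated assumptions: DCG is used as the quality metric being optimized (the thesis' actual metric may differ), results arrive pre-sorted by the attribute value with predicted relevance labels, `dp[i][j]` holds the best score achievable using the first `i` results with exactly `j` kept, and the optimal kept subset is recovered by walking the matrix back:

```python
import math

# Hedged sketch: given results pre-sorted by an attribute value, choose
# which to keep (order preserved) so that DCG of the kept list is maximal.
# dp[i][j] = best DCG using the first i results with exactly j kept.

def dcg_gain(label, rank):          # rank is the 1-based position in the kept list
    return (2 ** label - 1) / math.log2(rank + 1)

def filter_by_dp(labels, k):
    n = len(labels)
    NEG = float("-inf")
    dp = [[NEG] * (k + 1) for _ in range(n + 1)]
    keep = [[False] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(0, min(i, k) + 1):
            dp[i][j] = dp[i - 1][j]                 # skip result i
            if j > 0 and dp[i - 1][j - 1] != NEG:   # or keep it at rank j
                cand = dp[i - 1][j - 1] + dcg_gain(labels[i - 1], j)
                if cand > dp[i][j]:
                    dp[i][j] = cand
                    keep[i][j] = True
    # pick the best kept-count, then walk the matrix back to recover the path
    best_j = max(range(k + 1), key=lambda j: dp[n][j])
    chosen, i, j = [], n, best_j
    while i > 0:
        if keep[i][j]:
            chosen.append(i - 1)
            j -= 1
        i -= 1
    return dp[n][best_j], sorted(chosen)
```

On the toy instance from the slide, {(0, 0); (1, 3); (2, 1); (3, 2); (4, 1); (5, 3)} with k = 3, this DP keeps results 1, 3, and 5.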
56. Experiments with the real L2R data sets (MSR LETOR collections MQ2007 and MSLR-WEB10K)
• Predict relevance labels with Gradient Boosted Regression Trees (5-fold cross-validation partitioning)
• Extend the MQ2007 and MSLR-WEB10K data sets by assigning a random timestamp to each document to model sorting by the attribute value
• Apply filtering as the final step in the query processing pipelines for the following baselines:
– B1: sort by the attribute value and do nothing else (weak)
– B2: predict relevance labels, take all above the threshold, re-sort by the attribute value (somewhat strong)
– B3: sort by relevance, take top-k results, re-sort by the attribute value (strong)
• Average the results from 1,000 simulation runs
RQ5
57. Our approach outperforms all baselines (including top-k re-ranking) and leads to a ~2-4% lift in NDCG
MQ2007: 1,600 queries; {0,1,2} labels; 40 docs/query; 46 features.
MSLR-WEB10K: 10,000 queries; {0,1,2,3,4} labels; 120 docs/query; 136 features.
RQ5
58. The behavior of the algorithm for different
input sizes and relevance label distributions
59. The algorithm can process 1,000 results in under 100 ms using our reference C++ implementation on a workstation with 4GB RAM and two 2.5GHz CPU cores
60. Key take-aways and design implications
• The quality of search results sorted by an attribute value could
be improved using relevance-aware filtering. The proposed
algorithm consistently outperforms all known baselines and
increases search quality by 2-4%
• Assuming that users scan the results sequentially, the proposed
algorithm is theoretically optimal as it directly optimizes search
quality metrics within the dynamic programming framework
• Gains are higher for relevance label distributions where relevant results are more probable, and for medium-length result sets (20-100 tuples)
• The algorithm can process 1,000 results in under 100 ms using our reference C++ implementation on a machine with 4GB RAM and two 2.5GHz cores
67. The problem is that search snippets are either absent or generated with very naive heuristics
• Titles on the SERP are not informative, since in job search queries are often the same as titles; e.g. for the query “Software Engineer”, relevant results will have “Software Engineer” in their titles.
• Titles on the SERP are not discriminative and barely help users make click decisions. Users play the “lottery”, trying to find a relevant link among 10 similar-looking links.
• A title and a (query-biased) snippet are redundant, which makes users expend cognitive energy on the SERP without extra gains.
• Often the content of a snippet doesn’t provide useful information about the job posting behind the link; for example, snippets contain irrelevant numbers, names, etc.
• For jobs not directly related to the query, title-only SERPs don’t help users make click decisions. For example, a software engineer at a data-driven company might do data science, but this is not the common belief, so users will ignore such a job posting.
68. RQ7: What job attributes do people consider
important when deciding whether they want
to apply for a job?
69. Method: mixed-method need elicitation study
“Think-aloud” comments while conducting job search
Job posting annotation
using Diigo plugin
Two surveys asking to score
attributes (SERP + job page)
RQ7
70. Participants have diverse backgrounds
(gender and professional interests / job title)
• Required criteria:
– At least a 3-month internship experience
– At least 18 years old
– Has used an online job search engine (e.g. LinkedIn Jobs, Indeed)
• Gender: 13 females, 13 males
• Job title: Software Engineer (8), Data Scientist (3),
Healthcare Consultant (2), Research Scientist (2),
Personal Trainer (1), Genetics Counselor (1), Product
Manager (1), Translator (1), Occupational Therapist (1),
Marketing Manager (1), Business Analyst (1), Foreign
Policy Representative (1), Consultant (1), Biomedical
Product Developer (1), Pharmacist (1).
RQ7
71. Participants have diverse backgrounds (age,
education, and years of work experience)
What is your current
student (work) status?
How many years of work
experience do you have?
RQ7
72. Findings from the need elicitation user
study about relative attribute importance
• Think-aloud comments:
– Company (36); skills (34); job title (29); responsibilities (25);
years of work experience (22); degree (15); location (14)
• Job posting annotation:
– Qualifications (83) including required skills (72), degree (59),
years of work experience (53); responsibilities (71), location (24),
job type (24), job title (16), work authorization (13)
• Surveys:
– Job type (9.39/10); company (8.99/10); job title (8.84/10);
required skills (8.65/10); responsibilities (8.41/10); educational
requirements including degree and major (8.29/10); location
including city and country (8.27/10); years of experience (8.26/10)
RQ7
73. “Think-aloud” comments made by participants
while searching for and annotating jobs
“I stopped once I saw SQL and other coding technologies [skills]. I am a different kind of analyst.” [P2]
“I try to count the number of required skills I cover. If a lot of the skills don't match my background, I go for another job.” [P14]
“When I search I try to follow the following strategy: if I am sure that I fit, I will open the job posting [from the SERP]; if it is 50/50, I will still open it since I am exploring more options; if I am sure that I don't qualify, I will skip. Basically, I look for must-have criteria and if they aren't satisfied, I skip. For me these are title, skill, major, and degree.” [P22]
RQ7
74. Top ranked job attributes based on the survey responses (scored on a 5-point Likert scale)
RQ7
75. The proposal is to standardize job postings using
information extraction and show responsibilities
and requirements in the snippets on the SERP
Generate extended snippets for job search. Optimize detailed page views.
76. We explore the feasibility of generating structured snippets and their effectiveness
• RQ8: How can unstructured job postings be converted into a structured representation with minimal supervision?
• RQ9: Do extended structured snippets improve the job search user experience? How do users behave when such structured snippets are used?
77. Jobs are quite regular, and one word per section is enough to prepare the training set for an ML model
RQ8
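The “one word per section” idea can be sketched as a weak-labeling pass: because job postings are regular, a single seed word per section can locate section headers, and every line until the next header inherits that section's label. The seed words, tags, and header heuristic below are illustrative assumptions, not the thesis' actual seed set:

```python
# Hedged sketch of weak labeling for job-posting sections. The seed words
# and section tags are assumptions for illustration only.
SEEDS = {
    "responsibilities": "RESP",
    "requirements": "REQ",
    "benefits": "COND",
}

def weak_label(posting_lines):
    """Assign a weak section label to every line of a job posting."""
    labeled, current = [], "OTHER"
    for line in posting_lines:
        low = line.lower()
        for seed, tag in SEEDS.items():
            # a short line containing a seed word is treated as a section header
            if seed in low and len(low.split()) <= 3:
                current = tag
                break
        labeled.append((line, current))
    return labeled
```

The resulting (line, label) pairs can then serve as weakly-supervised training data for a sentence classifier.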
80. Weakly-supervised vs. Supervised (English)
Our weakly-supervised approach achieves Precision as good as the supervised model trained on a corpus of 1,000 labeled job postings. At the same time, our approach is easily deployable for many languages and has higher Page Coverage.
RQ8
81. Extraction quality across job titles using the
proposed weakly-supervised approach (English)
Extraction quality is consistently high across a randomly selected sample of job titles, which implies generalizability of the model to the entire job search vertical (for English).
RQ8
82. Tuning for a specific language (Russian) boosts information extraction quality
• An active learning pipeline bootstraps more accurate section detection rules, which minimizes human intervention and effort and increases model precision
• A hybrid two-stage algorithm based on rules with machine learning as a back-off yields 97-98% Precision:
– Do high-accuracy classification using manually defined rules
– Classify the remaining sentences with the machine learning model
[Chart: % of pages covered vs. number of rules]
RQ9
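The two-stage hybrid can be sketched as rules-first classification with an ML back-off: high-precision hand-written rules fire first, and sentences no rule matches fall back to a learned model. The rule patterns and the stand-in fallback model are illustrative assumptions (the tags echo the deck's RESP/REQ/COND naming), not the production system:

```python
import re

# Hedged sketch of the two-stage hybrid: manually defined rules first,
# machine-learning model as a back-off. Patterns are illustrative only.
RULES = [
    (re.compile(r"^(обязанности|responsibilities)\b", re.I), "RESP"),
    (re.compile(r"^(требования|requirements)\b", re.I), "REQ"),
    (re.compile(r"^(условия|conditions|benefits)\b", re.I), "COND"),
]

def classify(sentence, ml_model):
    # Stage 1: high-accuracy classification using manually defined rules
    for pattern, tag in RULES:
        if pattern.search(sentence):
            return tag
    # Stage 2: back off to the machine-learning model for everything else
    return ml_model(sentence)
```

In production the back-off would be a trained classifier; here any callable taking a sentence and returning a tag fits the interface.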
83. Before (#1 job search engine in Russia)
Hard to differentiate
similar job titles +
no textual snippets
(only company,
location, date posted)
RQ9
84. After (tested in production A/B tests with #1 job search
engine in Russia): DEFAULT vs. RESP+REQ+COND
RQ9
86. The ratio of SERP clicks per query is lower
[Chart: SERP clicks per query by day since the start of the experiment (11 days), two experiment arms; less is better]
RQ9
87. The ratio of job applications over job views is higher
[Chart: job applications per job view by day since the start of the experiment (15 days), two experiment arms; more is better]
RQ9
88. Other relevant metrics from the A/B test
• Extraction quality: 97% precision at 100% coverage
• Decreased number of queries per session by 8%
• Decreased number of detailed page views by 1.4X
• Increased number of applications overall by 1.6%
• Increased application rate conditioned on click by 13%
• Decreased number of short clicks by 5.5%
• Decreased number of wasted views by 1.25X
• Decreased click entropy by 1.98X
RQ9
89. Key take-aways and design implications
• In addition to the attributes currently shown on the SERP,
users pay attention to responsibilities and requirements
• By leveraging big data redundancy, we can generate large
scale annotated data sets with minimum supervision and use
them to train highly accurate ML models to extract
responsibilities and requirements sections from job postings
• The proposed weakly-supervised approach for information
extraction can be easily adapted to new languages
• Extended structured snippets improve search user experience:
– Minimize irrelevant clicks and click entropy
– Standardize job posting representation
– Eliminate title-snippet redundancy
98. In the case of structured search, the longer
the query, the more redundant and less
informative are the query-biased snippets.
Query-snippet duality
99. Prior work (DB community): from query-biased to non-redundant snippets for structured search
• M. Das et al., Generating Informative Snippet to Maximize Item Visibility, CIKM ’2013
• A. Kashyap and V. Hristidis, Comprehension-Based Result Snippets, CIKM ’2012
• Z. Liu et al., Structured Search Result Differentiation, VLDB ’2009
• M. Miah et al., Standing Out in a Crowd: Selecting Attributes for Maximum Visibility, ICDE ’2008
• G. Das et al., Ordering the Attributes of Query Results, SIGMOD ’2006
106. • RQ10: What kind of snippets make users more
productive and successful when performing
structured search on mobile devices?
• RQ11: What kind of snippets do users prefer
based on their subjective feelings?
107. Method: laboratory interactive user study
Built a new structured
search mobile app
Invited 39 (12 + 24 + 3) participants to the lab
Had each participant do four search tasks
110. Participants have diverse backgrounds (mostly
UIUC students + a few working professionals)
• Gender: 21 females, 18 males (we recruited 3
extra people to make up for outlier sessions)
• Age: 22-34 years old, mean 26.3 years old
• Degree: BS/BA (24), MS/MFA/MA (10), PhD (5)
• Major / Field of Study: Computer Science (12),
Psychology (3), Biology (3), Mathematics (2),
MBA (2), Nutrition (2), Agriculture (2),
Mechanical Engineering (2), Civil Engineering (1),
Political Science (1), Kinesiology (1), European
Union Studies (1), Chemistry (1), Supply Chain
Management (1), Accounting (1), Linguistics (1),
Marketing (1), Medicine (1), Education (1)
111. Participants are active users of various social
networking sites and regularly do people search
How often do you use
LinkedIn?
How often do you search
for people online?
112. Tasks are inspired by real people search needs
• Task 1: Find five people to ask for
professional/career advice
• Task 2: Find five potential keynote speakers
for a conference
• Task 3: Help a recruiter find and evaluate five
candidates to interview
• Task 4: Find five potential candidates to
collaborate with you on a project
114. Measurements collected from each participant
• Before (5-10 min)
– Pre-study survey (quantitative)
• During (4 x 10 min/task + 4 x 5 min/survey)
– User “think-aloud” comments (qualitative)
– Post-task system satisfaction surveys (Likert scale + semantic differentials) (quantitative)
– Post-task subjective relevance judgments for top-5 retrieved results (quantitative)
– Search usage behavior logs (quantitative)
– Task completion times (quantitative)
• After (15-20 min)
– Post-study semi-structured interview (qualitative)
115. A pilot study with 12 participants and 4 systems. Each participant does one task using one system. The task/system order is randomized following the Graeco-Latin square experiment design.
117. Key findings from the pilot user study
• Users want to see more information about each
result on the SERP (longer snippets)
• Users don’t notice the extra scrolling cost
• Users tend to like non-redundant snippets more
than query-biased ones given a fixed snippet length
• Non-redundant snippets help users find generally
more relevant results
System rank (lower is better): query-biased (2) 3.5 +/- 0.7; non-redundant (2) 2.8 +/- 1.0; query-biased (4) 2.2 +/- 0.8; non-redundant (4) 1.5 +/- 0.8
119. A formal study with 24 participants and 2 systems. Each participant does two tasks per system. The task/system order is randomized following the Graeco-Latin square experiment design.
120. Participants find the tasks realistic (4+ on a 5-point Likert scale), suggesting high ecological validity of the study
Q8: I can see myself doing this
task in the real world
121. Users feel that the version of the SERP with the
query-biased snippets is easier to use
Q1: The system is easy to use
RQ11
122. Users consider non-redundant snippets more useful based on the post-task surveys
Q5: The display of each profile
on the SERP is useful
Q7: The summaries/attributes presented
for each result on the SERP are useful
RQ11
123. Non-redundant snippets help users find more relevant people based on personal judgments
Q3: The system helps me find relevant candidates
RQ10, RQ11
126. Users seem to be more effective when using the system with non-redundant snippets
Metric: System 1 (query-biased) vs. System 2 (non-redundant)
• Ave. number of queries per search session: 5.91 +/- 5.60 vs. 4.96 +/- 3.90
• Ave. number of SERP clicks (profile views) per search session: 18.51 +/- 5.80 vs. 16.00 +/- 4.60 (*)
• Time between consecutive queries within a session (sec.): 63.80 +/- 39.59 vs. 56.77 +/- 34.88
• Ave. query length (filters used): 3.03 +/- 0.97 vs. 3.14 +/- 0.95
• Ave. SERP click position: 6.48 +/- 3.97 vs. 6.18 +/- 4.77
• Ave. max SERP click position: 28.13 +/- 21.14 vs. 24.39 +/- 19.78
RQ10
127. For non-redundant snippets the first three results
(one screen) get about the same percentage of clicks
RQ10
128. With non-redundant snippets users engage with the SERP more and make more informed decisions
Metric: System 1 (query-biased) vs. System 2 (non-redundant)
• Time to the first SERP click (sec.): 11.36 +/- 4.71 vs. 13.17 +/- 6.96
• Ave. number of candidates added to favorites (SERP): 1.00 +/- 2.13 vs. 1.44 +/- 2.77
• Ave. number of candidates removed from favorites (SERP): 0.43 +/- 1.11 vs. 0.65 +/- 1.40
• Ave. number of candidates added to favorites (Profile): 4.85 +/- 2.11 vs. 4.48 +/- 2.03
• Ave. number of candidates removed from favorites (Profile): 0.45 +/- 0.78 vs. 0.39 +/- 0.74
RQ10
129. Users prefer non-redundant snippets (19/27) over query-biased snippets (8/27). The result is statistically significant based on the exact two-tailed binomial test at alpha=0.05 (p=0.0357).
RQ11
130. Why do users prefer non-redundant snippets?
• Shows new non-redundant information (16 users)
• Helps discriminate the results on the SERP (12 users)
• Shows more relevant attributes (6 users)
• Helps accomplish the task faster (3 users)
• Requires less scrolling (3 users)
• Returns more relevant results (1 user)
RQ11
131. Why do users prefer non-redundant snippets?
“System 2 reduces repetition of information displayed
on the screen. Therefore, it can show more results per
screen and I have to do less scrolling.” [P7]
“It [System 2] is better since there is no extra line of
information. I know they are all from Chicago and it is
good that I don't have to see it here [on SERP].” [P17]
“System 2 shows less information but is not losing any information since it also shows search criteria compactly at the top.” [P22]
RQ11
132. Why do users prefer query-biased snippets?
• Has a more regular layout (7 users)
• Shows more relevant attributes (6 users)
• More predictable and reassuring (6 users)
• Demands less cognitive load and effort (5 users)
• Good balance of selected and new information (2 users)
• Works faster (2 users)
• Returns more relevant results (1 user)
• Forces users to check individual profiles (1 user)
RQ11
133. Why do users prefer query-biased snippets?
“To be honest, I get distracted easily. If you show so
much novel information, I don't know what to focus
on. System 1 [query-biased] is more moderate in
that sense. It either shows all info as in the query or
only a few attributes are new.” [P6]
“If I searched for a PhD, it will obviously have all
results matching this filter. From that point of view,
this information is redundant. But I still feel better
psychologically when I see what I searched for. That’s
why I prefer System 1 more.” [P13]
RQ11
134. Query formulation strategies
• Submit a specific search, then generalize (18 users)
– Users pay attention to the number of search results and make their queries
more specific if they see 100+ results.
• Submit a general search with only minimal requirements, then check
several result sets by playing with the optional attributes (6 users)
• Submit a general search, add many results to favorites, finally pick the
top-5 from favorites (4 users)
• Create a search “design space”, then methodically check all possible
combinations (6 users)
• Change the query if they see that the quality of results decreases with rank on the SERP (5 users)
• Users want to select several values per attribute (e.g. “Lyft or Uber”).
135. For non-redundant snippets we see more new queries, while query-biased snippets lead to more reformulations
136. Result examination strategies on the SERP
• Remember the query if it has 1-2 constraints, but find the
“breadcrumbs” useful for cases with 3+ query constraints
• Mostly top-to-bottom (position bias effect) with some exceptions
– “The order of examination is actually random, not top-to-bottom.
Sometime I just scroll and see if some word catches my eye and I
click on it.”
• Look at the names
– More distinct than other elements on the SERP since they are shown in bold
– Sometimes help decide whether the candidate is appropriate
• Look at the first and the last lines of the structured snippet
• Align one result to the next one and see the difference in attribute
values shown
137. Result selection / bookmarking strategies
• Users tend to select results that have familiar attribute values (e.g. big companies, well-known schools)
• Try to assess the social similarity based on the attributes on the SERP
(e.g. distinctive names, alumni from the same university, people from
the same age group, people with the same levels of qualification)
• Add to favorites (mostly) from the detailed page since they don't have enough info on the SERP to decide whether the candidate is good
• Some add to favorites from the SERP if:
– want to compare with the other results (can do it for System 2)
– there are too many results
– want to have candidates that have something remarkable to sell themselves
even from the SERP
• Skip “N/A”, “Self-employed”, “Intern” in the job title
• Want to specify attributes shown / CV layout depending on the task
138. Key take-aways and design implications
• Eliminate redundancy from the SERP
– Use non-redundant snippets (lead to effective and efficient search)
– Eliminate redundancy via interaction design (swipe instead of star
buttons, hovering navigation bar with the query breadcrumbs)
• Provide more control and transparency
– Show on the SERP more specific information to help assess the
relevance of results, e.g. “87% (Both you and John went to MIT)”
– Let users specify which attributes to show on the SERP
– Let users decide what kind of snippets to show or predict with ML
• Show more information per result on the SERP, yet show at
least several results to help users do result comparison
• Direct users to use better search strategies
– “Steer” users towards writing longer queries
– Explore new ways to encourage users to reformulate their queries
139. Summary: Scope × User Understanding × Technique
• Scopes: Structured Search; Professional Social Networks
• User understanding: Large-scale Query Log Analysis Study of People Search; Mixed-method User Study of Job Attribute Importance; Interactive Task-oriented User Study of Query-biased vs. Non-redundant Structured Snippets
• Techniques: Weakly-supervised Approach for Information Extraction from Job Postings; Relevance-aware Structured Search Results Filtering
141. Thank you for your time!
Reach out if you seriously want to collaborate.
Nik Spirin
Email: spirinus@gmail.com
Skype: @spirinus
Twitter: @spirinus
Facebook: @spirinus
Instagram: @_spirinus_