Balancing diversity against relevance is a common challenge when building search applications. Most of our efforts focus on improving relevance, and even techniques like Learning to Rank can easily decrease the diversity of search results in favor of relevance. In the context of e-commerce, diversifying the result set is a very successful approach for dealing with broad user queries, managing the liquidity of the item inventory, and covering business rules.
This talk covers techniques I have worked with to get started on diversifying search results in the e-commerce discovery flow. Based on real use cases of search-based recommendations and autocomplete features, I will discuss how web-search diversity techniques can be applied in e-commerce and how information theory can be used to measure search diversity.
4. All steps are interconnected!
● Users have different intents
● What can break the dialogue with the user?
○ Broad queries (Autocomplete and Search)
○ Ambiguity (Query understanding)
○ Bad Interactions (Recommendations)
5. Diversifying search results
Strengthen the dialogue with the user
● Dealing with broad queries
○ Autocomplete
○ Search
● Item showcase for new or exploring users
● Gathering more interactions to improve recommendations
6. What will be covered and how?
● Broad queries problem in autocomplete
● Techniques to promote diversification
● Our use case:
○ Autocomplete at OLX Europe
7. What is Autocomplete?
A tool to talk directly to the user
● Guide users to good queries
● Help with query understanding
● Fast response/reaction
● Help tackle search relevance as early as possible
8. Autocomplete at OLX Europe
● Suggest popular searches with category filters
● Covers 7 different countries
● > 50 million requests per day
● Responsible for 40% of total searches
● Ranks suggestions by popularity and narrowness
○ but ...
9. Broad query problem ...
What is my intent?
● What if I don't know any Vespa model?
● What if I have a Vespa and want some accessory?
[Screenshot: suggestions for "vespa" ranked by popularity alone.]
10. Broad query effect ...
[Diagram: autocomplete suggestions for "Gucci" span many different topics across the category tree. Level 1 (L1): Fashion, Health and Beauty; Level 2 (L2): Bags and accessories, Footwear, Clothing, Watches and Jewelry, Notions, Perfumes, Medical care; Level 3 (L3): Other bags and accessories, Sunglasses, Watches, Jewelry, Wallets, Handbags, Woman, Man.]
11. Breaks in the dialogue with the user
● We jumped to premature conclusions
○ Show very specific popular suggestions (Vespa models)
● We could have asked more
○ Show more possibilities (like accessories)
● Maybe we will never have the chance to ask more
○ Popularity feedback loop ("rich get richer")
12. Diversifying autocomplete suggestions
Improve user experience on broad queries
● Minimize overspecialization of suggestions
● Give an overview of different available item categories
● Break popularity feedback loop
● Refine the query (user intents)
13. The goal
Diversifying autocomplete category suggestions for broad queries
Broad queries =
popular queries
AND contain categories with many search results
AND those categories are not yet suggested!
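This definition can be sketched as a simple predicate. The thresholds, field names, and counts below are hypothetical placeholders for illustration, not the values used in production:

```python
def is_broad_query(query_stats, suggested_categories,
                   min_popularity=1000, min_category_results=500):
    """Hypothetical check following the slide's definition of a broad query:
    popular AND has categories with many results that are not yet suggested."""
    if query_stats["searches"] < min_popularity:
        return False  # not popular enough
    uncovered = [
        cat for cat, n_results in query_stats["category_counts"].items()
        if n_results >= min_category_results and cat not in suggested_categories
    ]
    return len(uncovered) > 0

# Example: the query is popular, and "Clothing" has many results
# but is not covered by the current suggestions -> broad query
stats = {"searches": 50_000,
         "category_counts": {"Handbags": 900, "Clothing": 800, "Watches": 40}}
print(is_broad_query(stats, suggested_categories={"Handbags"}))  # True
```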
14. How to apply diversification?
Inspiration from Web Search and Information Retrieval
Explicit diversification
○ From query (information needs)
○ Increase Coverage
○ Broad queries
Based on Search result diversification: http://www.dcs.gla.ac.uk/~craigm/publications/santos2015ftir.pdf
15. How can we measure coverage?
Step 1: Clustering documents into topics
○ Facets, categories, colors, word embeddings, ...
[Histogram: per-topic document counts (e.g. 891, 36, 37, 903) normalized into a topic probability distribution.]
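Step 1 can be sketched in a few lines: take per-topic document counts (e.g. from a category facet) and normalize them into a probability distribution. The topic labels and counts here are illustrative:

```python
def topic_distribution(counts):
    """Normalize per-topic document counts into a probability distribution."""
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}

# Illustrative document counts per topic, e.g. from a category facet
probs = topic_distribution({"t1": 891, "t2": 36, "t3": 37, "t4": 903})
print(round(probs["t1"], 3))  # 0.477
```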
16. How can we measure coverage?
Step 2: Measure the dispersion of the topic distribution
GINI Coefficient: https://opensourceconnections.com/blog/2019/09/05/diversity-vs-relevance
[Charts: two dispersion measures over the topic probability distribution: the Gini coefficient and Shannon entropy.]
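Both dispersion measures are easy to compute over the topic probabilities. A minimal sketch (the example distributions are assumptions for illustration): an even distribution has maximum entropy and zero Gini, a skewed one the opposite.

```python
import math

def shannon_entropy(probs):
    """H(p) = -sum(p * log2(p)); higher means topics are more evenly spread."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def gini(probs):
    """Gini coefficient: 0 = perfectly even, approaching 1 = concentrated."""
    sorted_p = sorted(probs)
    n = len(sorted_p)
    # Standard rank-weighted formula over the sorted values
    cum = sum((i + 1) * p for i, p in enumerate(sorted_p))
    return (2 * cum) / (n * sum(sorted_p)) - (n + 1) / n

even = [0.25, 0.25, 0.25, 0.25]
skewed = [0.85, 0.05, 0.05, 0.05]
print(shannon_entropy(even), gini(even))      # 2.0 0.0
print(shannon_entropy(skewed), gini(skewed))  # lower entropy, higher Gini
```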
17. Shannon Entropy
Measures the level of information in a probability distribution
● A: High Knowledge, Low Surprise (entropy = 0)
● B: Medium Knowledge, Medium Surprise (entropy = 0.81)
● C: Low Knowledge, High Surprise (entropy = 1.5)
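The three entropy values above can be reproduced with plausible distributions (these exact splits are my assumption, chosen to match the slide's numbers): a single certain outcome, a 75/25 split, and a 50/25/25 split.

```python
import math

def shannon_entropy(probs):
    """H(p) = -sum(p * log2(p)) over the distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed distributions matching the slide's values
print(round(shannon_entropy([1.0]), 2))              # 0.0  (A: no surprise)
print(round(shannon_entropy([0.75, 0.25]), 2))       # 0.81 (B)
print(round(shannon_entropy([0.5, 0.25, 0.25]), 2))  # 1.5  (C)
```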
18. Shannon Entropy for e-commerce
1. Cluster documents into categories (or any other criteria)
2. Compute each category's probability
[Examples: a diverse category distribution with entropy 2.38 vs a concentrated one with entropy 0.52.]
19. Entropy from another perspective
Extracted from: https://medium.com/udacity/shannon-entropy-information-gain-and-picking-balls-from-buckets-5810d35d54b4
On average, how many questions do we need to ask to find out what letter it is?
Entropy = 0
Bucket 1
Entropy = 1.75
Bucket 2
Entropy = 2.0
Bucket 3
Akinator: https://en.wikipedia.org/wiki/Akinator
20. Entropy from another perspective
Bucket 3 (2 questions on overage)
Bucket 2 (1.75 questions on average)
21. Coming back to the autocomplete
On average, how many questions can we ask to make sure we cover all user intents?
Each suggestion we give = a different question we ask
○ 0 questions for very specific queries (low entropy)
○ n questions for broad queries
■ How many is n ?
■ How can we define these questions ?
22. How many questions can we ask?
[Diagram: with 10 suggestion slots, each category's entropy marks a possible question; the total entropy corresponds to the number of different questions we can ask.]
23. Maximum diversity is 10 different suggestions!
● Each category has p(x) = 0.1 and contributes e(x) = 0.33 bits
How do we pick each suggestion?
[Diagram: narrow queries with too few results are filtered out of the candidate set.]
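The slide's numbers follow directly from the entropy formula: with 10 suggestion slots filled by 10 distinct categories, each category has p(x) = 0.1 and contributes about 0.33 bits, and the maximum achievable entropy is log2(10) ≈ 3.32.

```python
import math

n_slots = 10
p = 1 / n_slots                    # each suggested category equally likely
per_category = -p * math.log2(p)   # entropy contribution of one category
max_entropy = n_slots * per_category

print(round(per_category, 2))  # 0.33
print(round(max_entropy, 2))   # 3.32  (= log2(10))
```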
28. Experiment Scope
● 2 countries (C1 and C2)
● Expansions for less than 5% of suggested queries but covered:
○ 26% of total searches for C1
○ 17% of total searches for C2
● Compared the performance of both groups
○ broad queries: expanded vs not expanded
29. Experiment Results
Primary metrics:
● suggest_search_rate (Autocomplete usage: # suggested searches / # total searches): C1 +10.41%, C2 +0.72%
● pos_filter_rate (Search filters applied after picking expanded suggestions): C1 -3.14%, C2 -5.14%
● Diversification impacted user behaviour in autocomplete
● C1 users interacted more with autocomplete suggestions
● Did C2 users pick fewer suggestions, but better ones?
30. Experiment Results
Query metrics*:
● suggest_ctr (Uplift in ad clicks from the expanded query): C1 +3.64%, C2 -3.86%
● suggest_reply_rate (Uplift in ad replies from the expanded query): C1 +1.81%, C2 +0.26%
Suggestion metrics*:
● suggest_cat_ctr (Uplift in ad clicks from expanded suggestions, by category): C1 +2.24%, C2 +9.48%
● suggest_cat_reply_rate (Uplift in ad replies from expanded suggestions, by category): C1 +6.13%, C2 +13.01%
● Promising for C1 users in general
● In C2, we might have replaced relevant suggestions
● In both countries, new suggested categories look relevant
31. Considerations and Future
● Early stage: first and simple iteration
● Extend experiment
○ Affect more queries and add more countries
● Impact short vs long term
○ Consider rank (top n results)
○ Explore more clustering dimensions
○ Define entropy and popularity thresholds (prior and observed)