Embeddings! Embeddings everywhere!
Maciej Arciuch, Karol Grzegorczyk
How to build a recommender system using representation learning
● Online marketplace
● Over 380k sellers
● The largest e-commerce platform in Poland
● 6th largest e-commerce platform in the EU
● ~ 15 million transactions monthly
● ~ 20 million accounts
● ~ 110 million offers (~ 20 million unique products)
Recommender systems
Collaborative-filtering (CF)
● Similarity models based on users’ past behaviour
● Two main subtypes:
○ item-to-item
○ user-to-item
● Models are built on either explicit or implicit feedback
Content-based (CB)
● Based on product content
○ Description
○ Meta-data
○ Images
Which is better?
● In general, CF works better
● Sometimes we explicitly want content-based recommendations
○ Fashion
● CF cannot always be applied
○ Cold-start scenarios
○ Data sparsity
Hybrid recommender systems
● Where possible, use CF
● Otherwise, use CB
● Can we do better?
Recommendations @ allegro
● Automatic recommendations generate 10% of Allegro’s GMV (total sales)
● 33 different recommendation scenarios / placements
● Available on desktop 🖥, mobile 📱, and e-mail ✉
Recommendations - offer page
[Figure: offer page recommendations, labelled “All sellers” and “Same seller”]
Recommendations - main page
[Figure: main page with separate shelves for different recently visited categories]
Generic recommendation framework
Generic framework
Main components
● Training data preparation
● Representation learning module
● Nearest-neighbour search in latent space
● Serving
Training data preparation
● Different scenario -> different logic -> different data
● Collaborative filtering
○ Sequences of visited product ids from user sessions
○ Purchases or carts
● Content-based recommendations
○ Product title with concatenated category path
○ Product images
● We use Scio, a Scala wrapper around the Apache Beam Java API
○ Concise syntax
○ Good testability
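As a rough illustration of the collaborative-filtering input, the sketch below groups page-view events into per-session sequences of product ids. The event layout `(session_id, timestamp, product_id)` and the name `build_sessions` are assumptions made for this example. The pipeline described in the talk is written in Scio, not Python.

```python
from collections import defaultdict


def build_sessions(events, min_length=2):
    """Group (session_id, timestamp, product_id) events into ordered id sequences."""
    by_session = defaultdict(list)
    for session_id, timestamp, product_id in events:
        by_session[session_id].append((timestamp, product_id))

    sessions = []
    for session_id in sorted(by_session):
        # Order the views of one session by timestamp.
        ordered = [pid for _, pid in sorted(by_session[session_id])]
        # A session with a single view carries no co-occurrence signal.
        if len(ordered) >= min_length:
            sessions.append(ordered)
    return sessions


# Hypothetical events, deliberately out of order.
events = [
    ("s1", 3, "id5"), ("s1", 1, "id8"), ("s1", 2, "id4"),
    ("s2", 1, "id9"),
    ("s3", 1, "id9"), ("s3", 2, "id1"),
]
print(build_sessions(events))
```

Each resulting list plays the role of a "sentence" whose "words" are product ids.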
Representation learning
Latent space product representations
How to obtain word embeddings?
[Figure: word2vec diagram for the example sentence “The cat sat on the mat”. Labels: embedding matrix W (dict_size, emb_dims), Sum/Average/Concatenate, Prediction (Classifier)]
[Mikolov et al., Distributed Representations of Words and Phrases and their Compositionality, NIPS ’13]
How to obtain product embeddings?
[Figure: the same diagram with product ids in place of words, for the example session “id8 id4 id5 id9 id1”. Labels: embedding matrix W (dict_size, emb_dims), Sum/Average/Concatenate, Prediction (Classifier)]
[Grbovic et al., E-commerce in Your Inbox: Product Recommendations at Scale, KDD ’15]
Training CF-based representations
● Our model learns to predict the surrounding (context) products given the central product
● We rely on system-wide implicit feedback
○ We reduce the bias from past recommendations
○ We have much more training data
● Product embeddings are, in fact, a by-product of the training process
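To make the "predict context products from the central product" objective concrete, here is a minimal sketch of how skip-gram training pairs could be generated from one session. The function name `skipgram_pairs` and the window size are assumptions for illustration; a word2vec implementation generates such pairs internally and learns one embedding per id from them.

```python
def skipgram_pairs(session, window=2):
    """Yield (center, context) product-id pairs from one session."""
    for i, center in enumerate(session):
        lo = max(0, i - window)
        hi = min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, session[j]


# The example session from the product-embedding slide.
session = ["id8", "id4", "id5", "id9", "id1"]
pairs = list(skipgram_pairs(session, window=1))
print(pairs)
```

With `window=1` each product is paired only with its immediate neighbours in the session, so `("id5", "id9")` is a training pair while `("id8", "id5")` is not.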
Product embeddings - similarity
[Figure: 100-dimensional product embeddings, trained with skip-gram (word2vec) on view sessions and reduced to 2D with t-SNE]
Product embeddings - relations
Training CF-based representations
● Different placements require different recommender logic
● On product pages, we want to show similar products
○ We train on sequences of product page views (ids)
● In the cart and in post-purchase e-mails, we want to show complementary products
○ We train on cart data
Training content-based representations
● Textual content
○ Word embeddings trained using fastText
○ Products are represented as a weighted average of the embeddings of their title words
● Product images
○ We use Inception v3, pretrained on ImageNet, fine-tuned on Allegro’s products
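The textual representation can be sketched as follows. This is a simplified illustration, not the authors' code: the tiny `word_vectors` table stands in for fastText vectors, the weights in `word_weights` are made up (the slides do not say which weighting scheme is used), and unknown words are simply skipped.

```python
def title_embedding(title, word_vectors, word_weights, dims):
    """Weighted average of the vectors of the known words in a title."""
    total = [0.0] * dims
    weight_sum = 0.0
    for word in title.lower().split():
        if word not in word_vectors:
            continue  # unknown words are skipped in this sketch
        w = word_weights.get(word, 1.0)
        for k, value in enumerate(word_vectors[word]):
            total[k] += w * value
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no known words: fall back to the zero vector
    return [value / weight_sum for value in total]


# Hypothetical 2-D word vectors and weights.
word_vectors = {"red": [1.0, 0.0], "dress": [0.0, 1.0]}
word_weights = {"red": 1.0, "dress": 3.0}
vec = title_embedding("Red dress", word_vectors, word_weights, dims=2)
print(vec)
```

A higher weight pulls the product vector towards that word, so in this toy example the title ends up closer to "dress" than to "red".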
From embeddings to recommendations
Nearest neighbour search
● We could compute the distance between a query product and every other product, then sort the products by distance
● This is prohibitively expensive
● We need some approximation
● When searching, we often want high precision but are willing to compromise on recall
● False negatives are not a problem
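For reference, the exhaustive baseline described above could look like the sketch below. The toy `embeddings` table and the choice of cosine similarity are assumptions for illustration. Each query scores every other product, so precomputing neighbours for all N products costs on the order of N² comparisons.

```python
import heapq
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def exact_neighbours(query_id, embeddings, k):
    """Score every other product against the query: O(N) work per query."""
    query = embeddings[query_id]
    scored = (
        (cosine(query, vec), pid)
        for pid, vec in embeddings.items()
        if pid != query_id
    )
    return [pid for _, pid in heapq.nlargest(k, scored)]


# Hypothetical 2-D embeddings.
embeddings = {
    "id1": [1.0, 0.0],
    "id4": [0.9, 0.1],
    "id5": [0.0, 1.0],
    "id8": [0.7, 0.7],
}
print(exact_neighbours("id1", embeddings, k=2))
```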
● Approximate NN search
○ Searching in almost constant time
○ Random hyperplane projections (aka SimHash or LSH)
● Two popular libraries
○ Annoy
○ Faiss
[Charikar, Similarity Estimation Techniques from Rounding Algorithms, STOC ’02]
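The random-hyperplane idea can be sketched in a few lines. This is an illustrative toy, not how Annoy or Faiss are implemented: each random hyperplane contributes one bit (the side of the plane the vector falls on), and vectors separated by a small angle tend to agree on most bits. The helper names and the 16-bit signature length are assumptions.

```python
import random


def make_hyperplanes(n_bits, dims, seed=0):
    """Draw n_bits random hyperplane normals from a Gaussian distribution."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(n_bits)]


def simhash(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


planes = make_hyperplanes(n_bits=16, dims=3)
a = simhash([1.0, 2.0, 3.0], planes)
b = simhash([2.0, 4.0, 6.0], planes)     # same direction as a
c = simhash([-1.0, -2.0, -3.0], planes)  # opposite direction
print(hamming(a, b), hamming(a, c))
```

Products whose signatures collide, or differ in only a few bits, become candidate neighbours, which avoids comparing the query against the whole catalogue.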
● FAISS implements a two-stage search strategy:
○ Vectors are clustered into a relatively small number (e.g. 1000) of partitions (aka Voronoi cells)
○ Each cluster is represented by a centroid
○ During search, a small number (e.g. 5) of cells closest to the query product are selected
○ An exact search is run within those cells
● FAISS provides further approximations with more advanced quantization techniques
[Jégou et al., Product Quantization for Nearest Neighbor Search, 2011]
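The two-stage strategy can be illustrated with a small pure-Python sketch. It does not use the FAISS library: the centroids are assumed to be given (in practice they would come from a clustering step such as k-means), and only the parameter name `nprobe` is borrowed from FAISS.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_cells(vectors, centroids):
    """Offline step: assign every vector to the cell of its nearest centroid."""
    cells = {i: [] for i in range(len(centroids))}
    for pid, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))
        cells[nearest].append(pid)
    return cells


def ivf_search(query, vectors, centroids, cells, k, nprobe):
    # Stage 1: pick the nprobe cells whose centroids are closest to the query.
    order = sorted(range(len(centroids)), key=lambda i: sq_dist(query, centroids[i]))
    probed = order[:nprobe]
    # Stage 2: exact search, but only among the vectors in the probed cells.
    candidates = [pid for i in probed for pid in cells[i]]
    return sorted(candidates, key=lambda pid: sq_dist(query, vectors[pid]))[:k]


# Hypothetical centroids and product vectors.
centroids = [[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]]
vectors = {
    "id1": [0.5, 0.5], "id4": [1.0, 0.0],
    "id5": [9.0, 9.5], "id8": [10.0, 11.0],
    "id9": [0.0, 9.0],
}
cells = build_cells(vectors, centroids)
result = ivf_search([0.2, 0.1], vectors, centroids, cells, k=2, nprobe=1)
print(result)
```

Raising `nprobe` trades speed for recall: more cells are scanned, so fewer true neighbours are missed.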
● For CF-based embeddings we use a single index for all products
● For content-based embeddings we use a separate index for each top-level product category
● Filtering out neighbours that are too close
○ Both too close to the source and too close to each other
○ How far is too far (from source)?
● Eagerly precomputing neighbours for all the products
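One possible way to express the filtering step is sketched below. The thresholds `min_dist` and `max_dist` are invented for the example, since the slides leave "how far is too far?" as an open question, and Euclidean distance is an assumption.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filter_neighbours(source, ranked, vectors, min_dist, max_dist):
    """Drop near-duplicates of the source, near-duplicates of already kept
    neighbours, and neighbours that are too far from the source."""
    kept = []
    for pid in ranked:  # ranked is ordered from nearest to farthest
        d = dist(vectors[source], vectors[pid])
        if d < min_dist or d > max_dist:
            continue
        if any(dist(vectors[pid], vectors[other]) < min_dist for other in kept):
            continue
        kept.append(pid)
    return kept


# Hypothetical 1-D embeddings to keep the arithmetic easy to follow.
vectors = {"src": [0.0], "a": [0.01], "b": [1.0], "c": [1.02], "d": [2.0], "e": [9.0]}
ranked = ["a", "b", "c", "d", "e"]
print(filter_neighbours("src", ranked, vectors, min_dist=0.1, max_dist=5.0))
```

Here `a` is dropped as a near-duplicate of the source, `c` as a near-duplicate of `b`, and `e` as too far away.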
Orchestration
● For orchestration we use Spotify Luigi
○ Built around the concept of a task
○ A task has one output and multiple inputs, defined by targets
● Custom Luigi tasks and targets
○ Google ML Engine submit task
■ FAISS and gensim tasks
○ Google Dataflow submit task
Recommendation serving
Serving
● We serve pre-calculated item-to-item recommendations
○ We use key-value stores for recommendations
○ source_id -> [target_id_1, …, target_id_N]
● Performance at serving-time
○ Enrichment with product meta-data
○ No-longer-available products are excluded at serving-time
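A minimal sketch of the serving path, under stated assumptions: a plain dictionary stands in for the key-value store, and the `catalog` structure with `title` and `available` fields is hypothetical.

```python
def serve(source_id, recommendations, catalog, limit=3):
    """Look up precomputed targets, drop unavailable ones, attach meta-data."""
    results = []
    for target_id in recommendations.get(source_id, []):
        product = catalog.get(target_id)
        if product is None or not product["available"]:
            continue  # no-longer-available products are excluded at serving time
        results.append({"id": target_id, "title": product["title"]})
        if len(results) == limit:
            break
    return results


# source_id -> [target_id_1, ..., target_id_N], as stored in the key-value store
recommendations = {"id8": ["id4", "id5", "id9", "id1"]}
catalog = {
    "id4": {"title": "Phone case", "available": True},
    "id5": {"title": "Charger", "available": False},
    "id9": {"title": "Screen protector", "available": True},
}
print(serve("id8", recommendations, catalog))
```

Because some targets are filtered out at request time, it helps to store more targets per source than the placement actually displays.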
Re-ranking
● Learning-to-rank re-ranking is common in search engines
● LambdaMART re-ranking improved Allegro’s search results by 9% (NDCG)
● We experiment with learning-to-rank algorithms to improve the quality of recommendations
[Burges et al., From RankNet to LambdaRank to LambdaMART: An Overview, 2010]
Hybrid recommenders
Hybrid recommendations
● For the majority of products we are unable to serve CF recommendations
○ Item cold-start
○ Data sparsity
● We can recommend products based on content similarity. But is there a better way?
Hybrid recommendations - our approach
● Find the nearest neighbour (NN) in “content” space, restricted to “popular” items
● Serve CF-based recommendations for the neighbour
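The two steps above can be sketched as follows. This is an illustration under assumptions: "popular" is approximated as "already has CF recommendations", and the nearest neighbour is found by a brute-force scan rather than an ANN index.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hybrid_recommend(product_id, content_vectors, cf_recommendations):
    """Serve CF results if present; otherwise borrow them from the closest
    popular item in content space."""
    if product_id in cf_recommendations:
        return cf_recommendations[product_id]
    query = content_vectors[product_id]
    # Assumption: an item counts as "popular" if it has CF recommendations.
    popular = [pid for pid in cf_recommendations if pid in content_vectors]
    if not popular:
        return []
    neighbour = min(popular, key=lambda pid: sq_dist(query, content_vectors[pid]))
    return cf_recommendations[neighbour]


# Hypothetical content embeddings and CF results; "new" is a cold-start item.
content_vectors = {"new": [0.9, 0.1], "id1": [1.0, 0.0], "id5": [0.0, 1.0]}
cf_recommendations = {"id1": ["id4", "id8"], "id5": ["id9"]}
print(hybrid_recommend("new", content_vectors, cf_recommendations))
```

The cold-start item inherits the behaviour-based recommendations of its closest popular look-alike instead of falling back to pure content similarity.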
Hybrid recommendations - meta-prod2vec
● In classic prod2vec we simply use product ids as words
○ id1 id2 … idN
● In meta-prod2vec we interleave ids and meta-data:
○ id1 meta1 id2 meta2 … idN metaN
○ We also need to increase window size
● Improves quality of product representations
● We obtain metadata embeddings
○ Both product embeddings and metadata embeddings are in the same vector space
○ We can represent a product as a combination of its metadata embeddings
[Vasile et al., Meta-Prod2Vec - Product Embeddings Using Side-Information for Recommendation, RecSys ’16]
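The interleaving described on this slide could be produced as in the sketch below. The `metadata` mapping and the `cat:` token prefix are assumptions for illustration; any metadata token that cannot collide with a product id would do.

```python
def interleave_metadata(session, metadata):
    """Turn [id1, id2, ...] into [id1, meta1, id2, meta2, ...]."""
    tokens = []
    for product_id in session:
        tokens.append(product_id)
        tokens.append(metadata[product_id])
    return tokens


session = ["id8", "id4", "id5"]
metadata = {"id8": "cat:phones", "id4": "cat:phones", "id5": "cat:chargers"}
tokens = interleave_metadata(session, metadata)
print(tokens)
```

Since every product now occupies two tokens, a context window covering w products needs roughly 2w tokens, which is why the window size has to grow.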
What about the user?
User-to-item recommendations
● Item-to-item recommendations are crucial for Allegro
○ Offer pages - most visited part of the system
○ The aim is to shorten the Path-to-Purchase
● However, on the main page we need user-to-item recommendations
○ No explicit product context
○ The aim is to inspire users to make new purchases
● We also need user-to-item recommendations in e-mail campaigns
User-to-item recommendations - approaches
1. Learn a latent representation of the user
○ Obtained from a user-to-item interaction matrix
○ Impossible to retrain the user representation in real time
○ Good for e-mail campaigns
2. User representation is an aggregation of the representations of (e.g. visited) products
○ Requires online NN search
○ Challenge: latency under heavy traffic
3. No user representation
○ Interleaved item-to-item recommendations of visited items
○ Easy to implement, fast to serve at runtime
○ Challenge: how to mix recommendations?
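For the third approach, one simple mixing strategy is a round-robin over the item-to-item lists of the visited items, skipping duplicates. This is only one possible answer to the "how to mix" question, and the function name and data are hypothetical.

```python
from itertools import zip_longest


def mix_recommendations(visited, item_to_item, limit):
    """Round-robin over the recommendation lists of recently visited items."""
    lists = [item_to_item.get(pid, []) for pid in visited]
    seen = set(visited)  # do not recommend what the user has just viewed
    mixed = []
    for candidates in zip_longest(*lists):
        for candidate in candidates:
            if candidate is None or candidate in seen:
                continue
            seen.add(candidate)
            mixed.append(candidate)
            if len(mixed) == limit:
                return mixed
    return mixed


item_to_item = {"id8": ["id4", "id5", "id9"], "id1": ["id4", "id7"]}
print(mix_recommendations(["id8", "id1"], item_to_item, limit=4))
```

The second approach would instead average the embeddings of the visited products and run an online nearest-neighbour query, which is where the latency concern comes from.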
Other issues
Hyperparameter tuning
● “Hyperparameters Matter”
● We hold out some of the training sessions
○ We predict the last item in the session based on the penultimate item
● Metric: MRR@25
● Google ML Engine
○ Convenient, but expensive
● Do-it-yourself with the hyperopt library
○ Much cheaper
[Caselles-Dupré et al., Word2vec applied to Recommendation: Hyperparameters Matter, RecSys ’18]
[Golovin et al., Google Vizier: A Service for Black-Box Optimization, KDD ’17]
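The evaluation protocol above can be sketched as follows. The `recommend` callable and the toy `neighbours` table are assumptions standing in for a trained model. MRR@k averages 1/rank of the held-out last item over all sessions, counting 0 when the item is not in the top k.

```python
def mrr_at_k(held_out_sessions, recommend, k=25):
    """Mean reciprocal rank of the last item, predicted from the penultimate one."""
    total = 0.0
    count = 0
    for session in held_out_sessions:
        if len(session) < 2:
            continue  # need both a query item and a target item
        query, target = session[-2], session[-1]
        ranked = recommend(query)[:k]
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
        count += 1
    return total / count if count else 0.0


# Hypothetical model output: product id -> ranked neighbours.
neighbours = {"id4": ["id5", "id9"], "id9": ["id8", "id1"], "id5": ["id8"]}
held_out = [["id8", "id4", "id5"], ["id5", "id9", "id1"], ["id4", "id5", "id9"]]
score = mrr_at_k(held_out, lambda pid: neighbours.get(pid, []))
print(score)
```

In the example the three sessions contribute 1, 1/2 and 0, giving an MRR of 0.5. A tuning loop would retrain the model for each hyperparameter setting and keep the one with the highest score.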
Data challenges
● Allegro is an online marketplace, not an online store
● There are many sellers
● The same product can be offered by many sellers
● Huge effort to match offers to products
○ Still, most of the offers do not have a product id
● 6x more offers than users
● Calculating recommendations on a “per offer” basis introduces “noise”
○ “Accidental” co-occurrences become relatively strong signal
● Id dictionary gets very big (110 M offers)
○ Memory consumption, size of the model
● Interaction matrices get very sparse
● We try to cluster unmatched offers into pseudo-products
● By title, by category, by attributes
● We want fewer “objects” and a denser interaction matrix
● Product-based recommendations give us better results
○ +44% CTR
○ +34% GMV
● The choice of the product “representative” becomes important
○ How to choose the “best” offer for a product (cheapest, free delivery, etc.) - ranking algorithms
Summary
● Simple yet robust framework
● Well-defined module contracts - easily replaceable modules
● Can run in the cloud and on premises
Thank you!
