Integrating Clickstream Data in SOLR
for Ranking and Dynamic Facet Optimization
Ilayaraja Prabakaran
Lead Engineer, Search
Target
Agenda
•  Search Ranking using clickstream
     Compiling relevance feedback
     Indexing click signals to Solr
     Re-ranking using click signals
•  Facet Optimization using clickstream
     Compiling facet engagement data
     Re-ordering of facets based on engagement
     Discovery of relevant facets per query
Learning From Implicit Feedback
Query   Product list
Q1      Px, Py, Pz, ….
Q2      Pi, Pj, Pk, ….
Qn      …....

Events
UserId  SessionId  Timestamp        Action   Meta-Data
xyz     ab2n2n..   20161008-233554  Search   Search term
xyz     ab2n2n..   20161008-233601  Click    Query, ProdId, pos
xyz     ab2n2n..   20161008-233801  CartAdd  Query, ProdId, pos

MapReduce jobs (aggregation over sessions and actions)

Item    List of user queries and importance
X       <Q1,score>, <Q2,score>, ….
Y       <Qi,score>, <Qj,score>, ….

Solr Search Service
Doc: Title, Brand, Click_term1/Click_val1, Click_term2/Click_val2, ….
Boosting for matching click_terms …
Rank Score
The final weighted score is calculated as follows:

Weighted Score = ( W1 * ClickRate ) + ( W2 * CartRate )

where
•  ClickRate: query-to-item clicks
•  CartRate: query-to-item cart adds

Current weights:
W1 = 0.25
W2 = 0.75
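The formula can be sketched in a few lines of Java. This is a minimal illustration, not the production code: the class name `RankScore`, the method name and the sample rates are assumptions, while the two weights are the values stated on the slide.

```java
/** Minimal sketch of the weighted rank score; class and method names are illustrative. */
final class RankScore {
    static final double W1 = 0.25; // weight for ClickRate (value from the slide)
    static final double W2 = 0.75; // weight for CartRate (value from the slide)

    /** Combines the click and cart-add rates of one <query, item> pair. */
    static double weightedScore(double clickRate, double cartRate) {
        return (W1 * clickRate) + (W2 * cartRate);
    }

    public static void main(String[] args) {
        // Hypothetical rates for one <query, item> pair.
        System.out.println(weightedScore(0.40, 0.10));
    }
}
```

With these weights a cart add counts three times as much as a click, so items that convert rank above items that are only clicked.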
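// Weight for a click at a given result position (24 results per page).
// The logistic curve rises towards 1 for deeper slots.
public static double slotweight(int page, int slotinpage) {
    // Absolute slot across pages: page 1 covers slots 1..24, page 2 covers 25..48.
    int expo = (page - 1) * 24 + slotinpage;
    return 1.0 / (1.0 + Math.exp(-0.2 * expo));
}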
  
	
  
[Chart: Weight (y-axis, 0 to 1.2) against Slot (x-axis, 1 to 48)]
Positional Bias
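To make the shape of the slot weight concrete, the sketch below evaluates the same logistic function at a few positions. The class name `SlotWeightDemo` and the constant name `PAGE_SIZE` are assumptions; the page size of 24 and the factor 0.2 come from the slide.

```java
/** Minimal sketch: evaluates the logistic slot weight at a few positions. */
final class SlotWeightDemo {
    static final int PAGE_SIZE = 24; // results per page, as hard-coded on the slide

    /** Logistic weight of a result slot; deeper slots get weights closer to 1. */
    static double slotWeight(int page, int slotInPage) {
        int absoluteSlot = (page - 1) * PAGE_SIZE + slotInPage;
        return 1.0 / (1.0 + Math.exp(-0.2 * absoluteSlot));
    }

    public static void main(String[] args) {
        for (int slot : new int[] {1, 6, 12, 24}) {
            System.out.printf("page 1, slot %d -> %.3f%n", slot, slotWeight(1, slot));
        }
        System.out.printf("page 2, slot 1 -> %.3f%n", slotWeight(2, 1));
    }
}
```

The weight starts near 0.55 for the first slot and saturates close to 1 well before the end of the first page, so a click on a top slot, which users reach easily, counts for less than a click further down.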
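import org.joda.time.DateTime;
import org.joda.time.Days;

// Events older than 365 days contribute nothing (decay stays 0).
Days dayDiff = Days.daysBetween(eventDate, DateTime.now());
double decay = 0;
if (dayDiff.getDays() <= 365) {
    // Hyperbolic decay: 1.0 for today's events, 0.5 after 60 days.
    decay = 1 / ((dayDiff.getDays() / 60.0) + 1);
}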
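A self-contained variant of the same decay is sketched below. It swaps Joda-Time for the JDK's `java.time`, which is a substitution and not what the slide uses; the class name `TimeDecay`, the constant names and the helper `decayForAge` are assumptions. It also assumes the event date is not in the future.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

/** Minimal sketch of the time decay using java.time instead of Joda-Time. */
final class TimeDecay {
    static final long MAX_AGE_DAYS = 365;   // cutoff from the slide
    static final double SCALE_DAYS = 60.0;  // divisor from the slide

    /** Decay for an event that is ageDays old (assumes ageDays >= 0). */
    static double decayForAge(long ageDays) {
        if (ageDays > MAX_AGE_DAYS) {
            return 0.0; // too old: the event is ignored
        }
        return 1.0 / ((ageDays / SCALE_DAYS) + 1.0);
    }

    /** Decay for an event on eventDate, evaluated on the given day. */
    static double decay(LocalDate eventDate, LocalDate today) {
        return decayForAge(ChronoUnit.DAYS.between(eventDate, today));
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2016, 10, 8);
        System.out.println(decay(today, today));                // same-day event
        System.out.println(decay(today.minusDays(60), today));  // 60 days old
        System.out.println(decay(today.minusDays(400), today)); // past the cutoff
    }
}
```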
	
  
	
  
Time Decay
[Chart: Time Decay, y-axis 0 to 1.2, x-axis 1 to 361]
Data File
ITEM_ID   SEARCH_QUERY           SCORE
16981621  bandit belly           14.82
16981621  band belly             13.87
16981621  clothing maternity     13.60
16981621  maternity              13.45
16981621  belt postpartum        12.58
15179653  patrol paw              5.83
15179653  patrol paw toys         5.36
15179653  patrol paw skye         3.78
15179653  patrol paw sky          2.77
15179653  patrol paw skye toys    2.58
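A file in this layout can be grouped by item and trimmed to the top N queries before indexing. The sketch below assumes whitespace-separated rows with the item id first and the score last; the class `ClickData`, the nested `TermScore` type and the method name are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Minimal sketch: groups data-file rows by item and keeps the top N queries. */
final class ClickData {
    /** One search query and its score for an item. */
    static final class TermScore {
        final String query;
        final double score;
        TermScore(String query, double score) {
            this.query = query;
            this.score = score;
        }
    }

    /** Parses rows of the form "ITEM_ID query words SCORE"; skips the header and blank rows. */
    static Map<String, List<TermScore>> topTermsPerItem(List<String> lines, int topN) {
        Map<String, List<TermScore>> byItem = new LinkedHashMap<>();
        for (String line : lines) {
            String row = line.trim();
            int first = row.indexOf(' ');
            int last = row.lastIndexOf(' ');
            if (first < 0 || first == last || row.startsWith("ITEM_ID")) {
                continue; // header, blank, or malformed row
            }
            String itemId = row.substring(0, first);
            String query = row.substring(first + 1, last).trim();
            double score = Double.parseDouble(row.substring(last + 1));
            byItem.computeIfAbsent(itemId, k -> new ArrayList<>()).add(new TermScore(query, score));
        }
        for (List<TermScore> terms : byItem.values()) {
            // Highest score first, then drop everything past the top N.
            terms.sort(Comparator.comparingDouble((TermScore t) -> t.score).reversed());
            if (terms.size() > topN) {
                terms.subList(topN, terms.size()).clear();
            }
        }
        return byItem;
    }
}
```

For example, `ClickData.topTermsPerItem(rows, 3).get("15179653")` would hold the three highest-scoring queries for that item, in descending score order.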
Integration with Solr – Indexing Time
•  schema.xml: Include the dynamic fields below
     <dynamicField name="click_term*" type="string" indexed="true" multiValued="false" />
     <dynamicField name="click_val*" type="double" indexed="true" multiValued="false" />
Index the search terms (click_term1, click_term2, ...) as new fields in the Solr index,
along with the corresponding click scores (click_val1, click_val2, ...).
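The numbered field pairs can be generated from an item's ranked terms. The sketch below builds them as a plain map so that it has no Solr dependency; the class and method names are hypothetical. With SolrJ, each entry would typically be passed to `SolrInputDocument.addField(name, value)`, which is an assumption about how the indexer is written.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Minimal sketch: builds click_term1..N and click_val1..N for one product document. */
final class ClickFields {
    /** terms and scores are parallel lists, ordered from the highest score down. */
    static Map<String, Object> clickFields(List<String> terms, List<Double> scores) {
        if (terms.size() != scores.size()) {
            throw new IllegalArgumentException("terms and scores must have the same length");
        }
        Map<String, Object> fields = new LinkedHashMap<>();
        for (int i = 0; i < terms.size(); i++) {
            int n = i + 1; // field suffixes are 1-based, as on the slide
            fields.put("click_term" + n, terms.get(i));
            fields.put("click_val" + n, scores.get(i));
        }
        return fields;
    }
}
```

Because both fields are single-valued, the suffix is what ties a term to its score: `click_val2` always belongs to `click_term2`.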
Integration with Solr – Search Time
Boost matches on click_term(s):
    Add a boost query for each of the top N terms to the original Solr query, for
example:
 bq=(click_term1:"nespresso") AND _val_:"max(1,product(click_val1,1000))" &
 bq=(click_term2:"nespresso") AND _val_:"max(1,product(click_val2,1000))" & ….
Reference: Implementing Click-through Relevance, Andrzej Białecki
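The boost queries follow a fixed pattern, so they can be generated per request. The sketch below produces the raw `bq` values for the first N field pairs; the class name `ClickBoost` and the escaping of quotes and backslashes are assumptions, and the values still need URL encoding before they are sent over HTTP. Note that `bq` is a parameter of the DisMax and eDisMax query parsers.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch: builds one bq value per click field pair for the user's query. */
final class ClickBoost {
    static List<String> boostQueries(String userQuery, int topN) {
        // Escape backslashes and double quotes so the phrase stays well-formed.
        String escaped = userQuery.replace("\\", "\\\\").replace("\"", "\\\"");
        List<String> bqs = new ArrayList<>();
        for (int n = 1; n <= topN; n++) {
            bqs.add("(click_term" + n + ":\"" + escaped + "\")"
                    + " AND _val_:\"max(1,product(click_val" + n + ",1000))\"");
        }
        return bqs;
    }

    public static void main(String[] args) {
        for (String bq : boostQueries("nespresso", 2)) {
            System.out.println("bq=" + bq);
        }
    }
}
```

The `max(1, ...)` wrapper keeps the function value at 1 or above, and the factor 1000 lifts the small click scores into a range where they can influence the final ranking.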
Impact
[Two figure slides: Before / After comparison]
Metrics & Measurement
Offline:
NDCG/Human judgments
Click-Rank
Cart-Rank
Online: A/B test
Primary metrics: CTR, Conversion
Secondary metric: Demand Sales
Overall Impact: 350K queries
10% up in CTR-Top10
6% up in CTR-Top5
2% up in $search demand
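For the offline NDCG metric, a minimal sketch is given below. It uses the common gain form (2^rel - 1) with a log2 position discount and takes the ideal ordering from the same judged list; the slides do not say which NDCG variant was used, so these choices and the class name `Ndcg` are assumptions.

```java
import java.util.Arrays;

/** Minimal sketch of NDCG@k over graded relevance judgments. */
final class Ndcg {
    /** Discounted cumulative gain of the first k results. */
    static double dcg(double[] relevance, int k) {
        double sum = 0.0;
        for (int i = 0; i < Math.min(k, relevance.length); i++) {
            double discount = Math.log(i + 2) / Math.log(2); // log2(rank + 1), rank is 1-based
            sum += (Math.pow(2, relevance[i]) - 1) / discount;
        }
        return sum;
    }

    /** DCG of the ranking divided by the DCG of the best possible ordering. */
    static double ndcg(double[] relevance, int k) {
        double[] ideal = relevance.clone();
        Arrays.sort(ideal); // ascending
        for (int i = 0, j = ideal.length - 1; i < j; i++, j--) {
            double tmp = ideal[i]; // reverse into descending order
            ideal[i] = ideal[j];
            ideal[j] = tmp;
        }
        double idealDcg = dcg(ideal, k);
        return idealDcg == 0.0 ? 0.0 : dcg(relevance, k) / idealDcg;
    }
}
```

A ranking that already lists results in descending relevance scores 1.0; any other order scores lower.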
WIP - Learning To Rank
TargetFeatureStore:
OriginalScoreFeature – originalScore
SolrFeature
productTitleMatchQuery
productBrandMatchQuery
productItemTypeMatchQuery
productCategoryNamesMatchQuery
productRecencyFeature
…..
Learned Model: (prototype)
weights: {
"originalScore":0.0,
"productTitleMatchQuery":-0.022,
"productBrandMatchQuery":0.0241,
"productItemTypeMatchQuery":0.022,
"productCategoryNamesMatchQuery":0.0182,
"productKeywordsMatchQuery":0.030,
"productMetaKeywordsMatchQuery":0.017
}
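A weight map like this can be read as a linear model: the score of a document is the sum of each weight times the matching feature value. The sketch below assumes that interpretation; the class `LinearScorer`, its methods and the treatment of missing features as 0 are hypothetical, while the weights are copied from the prototype above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal sketch: scores a document as a weighted sum of its feature values. */
final class LinearScorer {
    /** Weights copied from the prototype model on the slide. */
    static Map<String, Double> prototypeWeights() {
        Map<String, Double> w = new LinkedHashMap<>();
        w.put("originalScore", 0.0);
        w.put("productTitleMatchQuery", -0.022);
        w.put("productBrandMatchQuery", 0.0241);
        w.put("productItemTypeMatchQuery", 0.022);
        w.put("productCategoryNamesMatchQuery", 0.0182);
        w.put("productKeywordsMatchQuery", 0.030);
        w.put("productMetaKeywordsMatchQuery", 0.017);
        return w;
    }

    /** Features missing from the map are treated as 0 (an assumption). */
    static double score(Map<String, Double> weights, Map<String, Double> features) {
        double sum = 0.0;
        for (Map.Entry<String, Double> w : weights.entrySet()) {
            sum += w.getValue() * features.getOrDefault(w.getKey(), 0.0);
        }
        return sum;
    }
}
```

With a weight of 0.0 on `originalScore`, this prototype ignores the original Solr score and ranks on the match features alone.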
Facet Optimization
• Problem: Rule-based facet treatment is static and limited.
• Solution: Build a learning model to drive facets dynamically.
    • Solution 1: Facet Ranking
    • Solution 2: Facet Discovery
• How will this affect our guests?
    • Bring relevant refinements for a given search query or category (browse)
    • Help guests narrow down their product discovery through “smart” facets
Ranking Implementation
• Computing the query-to-facet association:
    • Go through each session record and count the clicks on each facet.
    • Calculate search impressions by counting the distinct search terms for a facet type and summing them.
    • Facet Engagement Rate = facet clicks / facet impressions.
• Computing the query-to-product click association:
    • For each session, attribute product clicks to each facet.
    • Sum the clicks for each <query, facet> combination.
    • Facet Click Rate = total clicks / search impressions.
• Computing the query-to-product cart add association:
    • For each session, attribute product cart adds to each facet.
    • Sum the cart adds for each <query, facet> combination.
    • Facet Cart Rate = total cart adds / search impressions.
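The counting and normalization steps can be sketched as follows for the engagement rate; the click rate and cart rate follow the same pattern with different counts. The key format "query|facet", the class `FacetEngagement` and the method names are assumptions, and impressions are taken as a precomputed input because the slides do not spell out how they are stored.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Minimal sketch: facet clicks per <query, facet>, normalized by impressions. */
final class FacetEngagement {
    /** Each list element is one facet click from a session, keyed as "query|facet". */
    static Map<String, Long> countClicks(List<String> clickedKeys) {
        Map<String, Long> counts = new HashMap<>();
        for (String key : clickedKeys) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    /** Rate = clicks / impressions; keys without impressions are left out. */
    static Map<String, Double> engagementRates(Map<String, Long> clicks,
                                               Map<String, Long> impressions) {
        Map<String, Double> rates = new HashMap<>();
        for (Map.Entry<String, Long> e : clicks.entrySet()) {
            long shown = impressions.getOrDefault(e.getKey(), 0L);
            if (shown > 0) {
                rates.put(e.getKey(), (double) e.getValue() / shown);
            }
        }
        return rates;
    }
}
```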
Facet Ranking
The final weighted score is calculated as below:
 
Score = ( W1 * Facet Engagement Rate ) + ( W2 * Facet Click Rate ) + ( W3 * Facet Cart Rate )
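Ranking facets for a query then amounts to sorting by this score. In the sketch below the three weights are placeholder values, since the slides do not give W1 to W3; the class `FacetRanking` and its nested `Facet` type are also hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Minimal sketch: orders the facets of one query by their weighted score. */
final class FacetRanking {
    // Placeholder weights: the slides do not state the values of W1, W2 and W3.
    static final double W1 = 0.2;
    static final double W2 = 0.3;
    static final double W3 = 0.5;

    static final class Facet {
        final String name;
        final double engagementRate;
        final double clickRate;
        final double cartRate;

        Facet(String name, double engagementRate, double clickRate, double cartRate) {
            this.name = name;
            this.engagementRate = engagementRate;
            this.clickRate = clickRate;
            this.cartRate = cartRate;
        }

        double score() {
            return (W1 * engagementRate) + (W2 * clickRate) + (W3 * cartRate);
        }
    }

    /** Returns a new list with the highest-scoring facet first. */
    static List<Facet> rank(List<Facet> facets) {
        List<Facet> sorted = new ArrayList<>(facets);
        sorted.sort(Comparator.comparingDouble(Facet::score).reversed());
        return sorted;
    }
}
```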
Facet Discovery
• Map each query to a category (breadcrumb).
• Compute a weighted score from the click/cart signals for browse, with the breadcrumb as the key.
• For each search term, rank its breadcrumbs by this score and select the best one(s).
• Aggregate the search facets and the browse facets.
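The selection and aggregation steps might look like the sketch below for a single search term. It assumes the breadcrumb scores are already computed and that aggregation means an order-preserving union; the class `BreadcrumbSelector` and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Minimal sketch: picks the best breadcrumbs for a query and merges facet lists. */
final class BreadcrumbSelector {
    /** Returns up to k breadcrumbs with the highest weighted scores. */
    static List<String> topBreadcrumbs(Map<String, Double> scoreByBreadcrumb, int k) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scoreByBreadcrumb.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    /** Search facets first, then any browse facets not already present. */
    static List<String> mergeFacets(List<String> searchFacets, List<String> browseFacets) {
        Set<String> merged = new LinkedHashSet<>(searchFacets);
        merged.addAll(browseFacets);
        return new ArrayList<>(merged);
    }
}
```

The browse facets of the selected breadcrumbs are the newly discovered filters; merging them after the search facets keeps the existing facet order intact.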
Facet Discovery & Ranking
Discovery of “Search Filters”: Impact on Search
[Chart comparing Before Discovery and After Discovery; axis labels: No. of “Unique Guest Queries” (0 to 80000) and No. of “newly discovered filters” (1 to 17)]
Thank You
