LexisNexisLegalSearchGuild
Learning to Rank v2.0
Sujit Pal, Elsevier Labs
October 11, 2018
What it is, how it’s done, and what it can do for you
Outline
• History
• Overview
• Practical Considerations
• Case Studies
• Wrap Up
History
• 1992: Idea of LTR (or Machine Learned Ranking) first proposed
• 2003: Altavista (later acquired by Yahoo!) using LTR in its engine
• 2005: Microsoft invents RankNet, deploys in its search engine (later Bing)
• 2008: In contrast, Google’s engine is hand-tuned and relies on up to ~200 signals
• 2009: Yandex invents and deploys MatrixNet in its engine
• 2016: Google says RankBrain is #3 signal to its search engine
• 2016: Bloomberg contributes LTR plugin to Solr
• 2017: Open Source Connections contributes LTR plugin for Elasticsearch
Overview
LTR Pipeline
Image Credit: https://towardsdatascience.com/when-to-use-a-machine-learned-vs-score-based-search-ranker-aa8762cd9aa9
• Training: Build LTR model using training data (query, document, label) triples
• Label is rank
• Inference: Use model to predict label ŷ = h(x) from unseen (query, document) pairs
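The two stages above can be sketched as follows. This is a minimal illustration only, assuming a pointwise linear model fit by least squares with NumPy; the feature values and the function names `fit_linear_ranker`, `predict` and `rank_documents` are hypothetical and do not come from the slides.

```python
import numpy as np

def fit_linear_ranker(X, y):
    """Training: learn weights w so that [X, 1] @ w approximates the labels y."""
    # Append a constant column so the model also learns a bias term.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict(w, X):
    """Inference: y_hat = h(x) for unseen (query, document) feature rows."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

def rank_documents(w, X):
    """Return row indices ordered from highest to lowest predicted label."""
    return np.argsort(-predict(w, X), kind="stable")

# Hypothetical training triples: each row of X_train is the feature vector
# of one (query, document) pair and y_train holds its relevance label.
X_train = np.array([[0.9, 0.1], [0.5, 0.4], [0.1, 0.8], [0.7, 0.2]])
y_train = np.array([3.0, 2.0, 0.0, 2.5])
w = fit_linear_ranker(X_train, y_train)

# Unseen (query, document) pairs for one query.
X_new = np.array([[0.2, 0.7], [0.8, 0.1]])
order = rank_documents(w, X_new)
```

Here `order` lists the unseen documents from most to least relevant according to the learned h(x). A production model would normally be one of the LTR algorithms discussed later rather than plain least squares.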
LTR used as re-ranker
• LTR models are usually more computationally expensive than search engines.
• Search engine is used to pull out matched documents.
• The top N of these documents are fed into the LTR model, and the top n results are replaced with the output of the model, for N >> n (typically 50-100x).
[Figure: Index and Query → Matched (10k) → Scored (10k) → Top 1000 retrieved → Ranking Model → Re-ranked Top 10]
Image Credit: https://lucidworks.com/2016/08/17/learning-to-rank-solr/
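The re-ranking step on this slide can be sketched as below. This is an illustrative sketch, not the Solr or Elasticsearch implementation; the function `rerank`, its parameters `big_n` and `small_n`, and the toy hit list are all hypothetical.

```python
def rerank(hits, model_score, big_n, small_n):
    """Re-rank the top `big_n` search hits and return the best `small_n` doc ids.

    hits: list of (doc_id, features) already ordered by the search engine.
    model_score: callable mapping a feature list to a relevance score.
    """
    # Only the top-N candidates are scored by the (expensive) LTR model.
    candidates = hits[:big_n]
    # sorted() is stable, so ties keep the search engine's original order.
    reranked = sorted(candidates, key=lambda hit: model_score(hit[1]), reverse=True)
    return [doc_id for doc_id, _ in reranked[:small_n]]

# Toy example: four hits in search-engine order, one feature each.
hits = [("d1", [0.2]), ("d2", [0.9]), ("d3", [0.5]), ("d4", [0.99])]
top = rerank(hits, model_score=lambda f: f[0], big_n=3, small_n=2)
```

In this toy run `d4` is never scored because it falls outside the top-N window, which is the trade-off the slide describes: the model only sees what the search engine retrieved.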
Types of LTR Algorithms
• Pointwise – documents ranked by relevance of each (query, document) pair
• Pairwise – documents ranked by considering the relative preference between two (query, document) pairs
• Listwise – documents ranked by considering the entire relevance ordering of all (query, document) tuples per query
Image Credit: https://www.hiit.fi/wp-content/uploads/2018/04/Learning-to-Rank-S11.pdf
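One way to see the difference between the three families is to compare the loss each one would compute for a single query. The sketch below is an assumption-laden illustration: it picks squared error for pointwise, a logistic pair loss for pairwise, and a ListNet-style top-one cross entropy for listwise. Other choices exist for each family, and the function names are hypothetical.

```python
import math
from itertools import combinations

def pointwise_loss(scores, labels):
    """Mean squared error between each score and its own label."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(labels)

def pairwise_loss(scores, labels):
    """Mean logistic loss over document pairs that have different labels."""
    losses = []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            continue  # no preference between equally relevant documents
        # Orient the pair so that `hi` is the more relevant document.
        hi, lo = (i, j) if labels[i] > labels[j] else (j, i)
        losses.append(math.log(1.0 + math.exp(-(scores[hi] - scores[lo]))))
    return sum(losses) / len(losses) if losses else 0.0

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def listwise_loss(scores, labels):
    """Cross entropy between top-one distributions of labels and scores."""
    p_true, p_pred = softmax(labels), softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))

labels = [2.0, 1.0, 0.0]      # one query, three documents
good = [3.0, 2.0, 1.0]        # scores that agree with the label order
bad = [1.0, 2.0, 3.0]         # scores that reverse it
```

With these toy inputs, both the pairwise and the listwise loss are lower for `good` than for `bad`, even though `good` does not match the label values exactly; only the pointwise loss penalizes the offset.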
Popular LTR Libraries
• RankLib (Java) – from Lemur Project (UMass, CMU), provides Coordinate Ascent, Random Forest (pointwise), MART, RankNet, RankBoost (pairwise), LambdaMART (pair/listwise), AdaRank and ListNet (listwise)
• SVMRank (C++) – from Cornell, provides SVMRank (pairwise)
• XGBoost (Python/C++) – LambdaRank (pairwise)
• PyLTR (Python) – LambdaMART (pairwise)
• Michael Alcorn (Python) – RankNet and LambdaMART (pairwise)
Common Algorithms
• Linear Model
• Predicted rank is linear combination of input features
• RankNet
• Neural network based
• Good for binary (relevant/irrelevant) labels
• Weight matrix transforms input features into rank probabilities
• LambdaMART
• Tree (forest) based
• Good for multi-class labels
• Feature splits with thresholds
Image Credits: https://slideplayer.com/slide/9347431/, https://dimensionless.in/introduction-to-random-forest/
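The scoring functions behind the three algorithms on this slide can be sketched in a few lines. This is a simplified illustration under stated assumptions: the RankNet function only shows how two scores become a rank probability (the scores would come from a neural network in practice), and the tree ensemble uses single-split stumps rather than full boosted trees. All names are hypothetical.

```python
import math

def linear_score(weights, features):
    """Linear model: the score is a weighted sum of the input features."""
    return sum(w * x for w, x in zip(weights, features))

def ranknet_probability(score_i, score_j):
    """RankNet: probability that document i should rank above document j."""
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))

def stump_score(features, feature_index, threshold, left_value, right_value):
    """One tree split: compare a single feature against a threshold."""
    return left_value if features[feature_index] <= threshold else right_value

def ensemble_score(features, stumps):
    """Tree ensembles such as LambdaMART sum the outputs of many trees."""
    return sum(stump_score(features, *stump) for stump in stumps)

# Each stump is (feature_index, threshold, left_value, right_value).
stumps = [(0, 0.5, 1.0, 2.0), (1, 4.0, 0.0, 0.5)]
score = ensemble_score([0.3, 5.0], stumps)
```

The first stump takes its left branch (0.3 <= 0.5) and the second its right branch (5.0 > 4.0), so the ensemble score is the sum of those two leaf values.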
Practical Considerations
Reasons to consider LTR
• Too many parameters to tune manually without overfitting to a particular query set.
• Enough good training data (implicit or explicit) is available to train an LTR model.
Acquiring labels
• Implicit
• Intrinsic features (words, phrases)
• Document metadata
• User Clicks
• Time spent on document
• Purchases (if applicable)
• Cheap to build but noisy
• Explicit
• Human expert rates relevancy of each document against query
• Cleaner but expensive to build
Image Credit: https://www.datasciencecentral.com/profiles/blogs/handling-imbalanced-data-sets-in-supervised-learning-using-family
Traditional Ranking Models
• Vector Space Models
• Boolean – predicts if document relevant to query or not
• TF-IDF – rank documents by cosine similarity between document and query
• Probabilistic Models
• BM25 – rank documents by log odds of relevance to query
• LMIR – probability of document’s LM generating terms in query
• Importance based Models
• HITS – rank documents by hubness/authority (inlinks/outlinks).
• PageRank – rank document by probability of random surfer arriving on page
• Impact Factor – rank documents by number of citations
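As an example of one of the traditional models listed above, here is a minimal sketch of BM25 scoring over pre-tokenized documents. It assumes the common Okapi form with k1 = 1.2 and b = 0.75 and a smoothed IDF of log(1 + (N - n + 0.5) / (n + 0.5)); search engines differ in the exact variant, and the function name `bm25_scores` is hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # term absent from this document
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization dampens the term frequency of long documents.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores

docs = [["learning", "to", "rank"], ["rank", "rank", "rank"], ["hello", "world"]]
scores = bm25_scores(["rank"], docs)
```

Scores like these are typically not the final ranking in an LTR setup; they become input features for the learned model.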
Feature Selection
• Document Features
• Document Length
• URL Length
• Publication Date
• Number of outlinks
• PageRank
• Query Features
• Number of words
• PER or ORG in query
• Query-Document Features
• TF-IDF, BM25 similarity
• Frequency of query in anchor text
• Document contains query words in title
• User Dependent Features
• Star ratings
• Age, gender
• Device
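A few of the features listed above can be computed with a small helper like the one below. This is a sketch only: the document is assumed to be a dict with hypothetical keys `title`, `body` and `url`, tokenization is plain whitespace splitting, and the choice and order of features are illustrative.

```python
def extract_features(query, doc):
    """Build one feature vector for a (query, document) pair.

    `doc` is a dict with hypothetical keys "title", "body" and "url".
    """
    q_terms = query.lower().split()
    title_terms = doc["title"].lower().split()
    body_terms = doc["body"].lower().split()
    return [
        float(len(body_terms)),                             # document length
        float(len(doc["url"])),                             # URL length
        float(len(q_terms)),                                # number of query words
        float(sum(t in title_terms for t in q_terms)),      # query words in title
        float(sum(body_terms.count(t) for t in q_terms)),   # query term count in body
    ]

doc = {
    "title": "Learning to Rank",
    "body": "rank documents by learned rank",
    "url": "http://x.org/a",
}
features = extract_features("rank models", doc)
```

The first two entries are document features, the third is a query feature, and the last two are query-document features; user-dependent features would need a user profile as an extra input.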
Evaluation Metrics
• Mean Average Precision (MAP @k)
• Mean Reciprocal Rank (MRR)
• Normalized Discounted Cumulative Gain (NDCG @k)
• Rank Correlation
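The per-query quantities behind the first three metrics can be sketched as follows. MAP and MRR are the means of average precision and reciprocal rank over all queries. Two assumptions are made here: average precision is normalized by the number of relevant results found in the top k (other normalizations exist), and DCG uses the 2^gain - 1 form with a log2 discount.

```python
import math

def reciprocal_rank(relevant_flags):
    """1 / rank of the first relevant result, or 0.0 if there is none."""
    for rank, rel in enumerate(relevant_flags, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def average_precision_at_k(relevant_flags, k):
    """Mean of precision@i over the relevant positions i within the top k."""
    hits, total = 0, 0.0
    for i, rel in enumerate(relevant_flags[:k], start=1):
        if rel:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0

def dcg_at_k(gains, k):
    """Discounted cumulative gain of the first k graded results."""
    return sum((2 ** g - 1) / math.log2(i + 1)
               for i, g in enumerate(gains[:k], start=1))

def ndcg_at_k(gains, k):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0

def mean(values):
    """Average a per-query metric over all queries (gives MAP, MRR, ...)."""
    return sum(values) / len(values)

# Two hypothetical queries with binary relevance of the returned results.
mrr = mean([reciprocal_rank([0, 0, 1]), reciprocal_rank([1, 0, 0])])
```

For graded labels such as the 0-2 labels in LETOR data, `ndcg_at_k` takes the labels in ranked order directly; a perfectly ordered list scores 1.0.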
LETOR Data Format
2 qid:1 1:3 2:3 3:0 4:0 5:3 6:1 7:1 8:0 9:0 10:1 11:156... # 11
2 qid:1 1:3 2:0 3:3 4:0 5:3 6:1 7:0 8:1 9:0 10:1 11:406... # 23
0 qid:1 1:3 2:0 3:2 4:0 5:3 6:1 7:0 8:0.666667 9:0 10:1 ... # 44
2 qid:1 1:3 2:0 3:3 4:0 5:3 6:1 7:0 8:1 9:0 10:1 11:287 ... # 57
1 qid:1 1:3 2:0 3:3 4:0 5:3 6:1 7:0 8:1 9:0 10:1 11:2009 ... # 89
Line layout: label, query ID (qid), features for query, document, query/document and other (sparse or dense format), then comments after # (ex: docID)
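A line in this format can be read with a small parser like the one below. It is a minimal sketch that assumes integer labels, `index:value` feature tokens and an optional `#` comment; the function name `parse_letor_line` is hypothetical, and the example line is a shortened, complete line rather than one of the truncated lines shown above.

```python
def parse_letor_line(line):
    """Parse one LETOR line into (label, qid, features, comment)."""
    # Everything after the first '#' is a free-form comment (e.g. a docID).
    body, _, comment = line.partition("#")
    tokens = body.split()
    label = int(tokens[0])
    qid = tokens[1].split(":", 1)[1]
    features = {}
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, qid, features, comment.strip()

label, qid, features, doc_id = parse_letor_line("2 qid:1 1:3 2:0 3:0.666667 # 11")
```

Storing features in a dict keyed by feature index handles both dense lines and sparse lines where zero-valued features are omitted.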
Case Studies
Dataset
• We use The Movie Database (TMDB) dataset from Kaggle.
• 45k movies, 20 genres, 31k unique keywords
• We extract the following fields: (docID, title, description, popularity, release date, running time, rating (0-10), keywords, genres)
• Categorical labels 1-5 created from rating
• Objective is to build an LTR model that learns the ordering implied by rating and to re-rank the top 10 results using this model
• Features chosen: (query-title and query-description similarity using TF-IDF and BM25, document recency, original score, and boolean 0/1 for each genre)
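The slides do not say how the 1-5 labels were derived from the 0-10 rating. One plausible scheme, shown here purely as an assumption, is equal-width binning in steps of 2; the function name `rating_to_label` is hypothetical.

```python
import math

def rating_to_label(rating):
    """Map a 0-10 rating to a label in 1..5 using equal-width bins of 2.

    Assumed scheme, not taken from the slides:
    [0, 2] -> 1, (2, 4] -> 2, (4, 6] -> 3, (6, 8] -> 4, (8, 10] -> 5.
    """
    if not 0.0 <= rating <= 10.0:
        raise ValueError("rating must be in [0, 10]")
    # ceil(rating / 2) gives 1..5 for ratings in (0, 10]; clamp 0 up to 1.
    return max(1, math.ceil(rating / 2.0))

labels = [rating_to_label(r) for r in (0.0, 2.0, 2.1, 7.3, 10.0)]
```

Quantile-based bins would be an alternative if ratings cluster in a narrow range, since equal-width bins can then leave some labels almost empty.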
General Flow
• Set up plugin if applicable
• Load data
• Define LTR features to index (if applicable)
• Generate LTR features for query set in LETOR format
• Train LTR model using 3rd party library (RankLib, etc).
• Upload trained model definition to index (if applicable)
• Run re-rank query using trained LTR model
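The feature-generation step of this flow ends with a file of LETOR lines that a library such as RankLib can consume. A minimal sketch of that serialization follows; the function names and the example values are hypothetical, and feature values are written with Python's `g` format, which keeps at most six significant digits.

```python
def format_letor_line(label, qid, features, doc_id):
    """Render one training example as a LETOR line with 1-based feature ids."""
    feats = " ".join(f"{i}:{value:g}" for i, value in enumerate(features, start=1))
    return f"{label} qid:{qid} {feats} # {doc_id}"

def letor_lines(examples):
    """Yield one LETOR line per (label, qid, features, doc_id) tuple."""
    for label, qid, features, doc_id in examples:
        yield format_letor_line(label, qid, features, doc_id)

# Hypothetical examples for one query: (label, qid, features, docID).
examples = [
    (2, 1, [3.0, 0.0, 0.666667], 11),
    (0, 1, [1.0, 2.5, 0.0], 44),
]
text = "\n".join(letor_lines(examples))
```

Rows for the same query should be written together and share the same `qid`, because pairwise and listwise trainers group documents by query when building their losses.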
Examples
• Solr Case Study
• Elasticsearch Case Study
• DIY LTR Case Study
• Project: https://github.com/sujitpal/ltr-examples
Wrap Up
Resources
• Book – Learning to Rank for Information Retrieval, by Tie-Yan Liu.
• Paper – From RankNet to LambdaRank to LambdaMART: An Overview, by Christopher J. C. Burges
• Tutorials
• Solr - https://github.com/airalcorn2/Solr-LTR
• Elasticsearch – Learning to Rank 101 by Pere Urbon-Bayes, ES-LTR Demo by Doug Turnbull.
• Product Centric LTR Documentation
• Solr Learning To Rank Docs
• Elasticsearch Learning to Rank Docs
Thank you!
• Contact: sujit.pal@elsevier.com
• Twitter: @palsujit
Learning a Joint Embedding Representation for Image Search using Self-supervi...
 
The power of community: training a Transformer Language Model on a shoestring
The power of community: training a Transformer Language Model on a shoestringThe power of community: training a Transformer Language Model on a shoestring
The power of community: training a Transformer Language Model on a shoestring
 
Backprop Visualization
Backprop VisualizationBackprop Visualization
Backprop Visualization
 
Accelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn CloudAccelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn Cloud
 
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
 
Leslie Smith's Papers discussion for DL Journal Club
Leslie Smith's Papers discussion for DL Journal ClubLeslie Smith's Papers discussion for DL Journal Club
Leslie Smith's Papers discussion for DL Journal Club
 
Using Graph and Transformer Embeddings for Vector Based Retrieval
Using Graph and Transformer Embeddings for Vector Based RetrievalUsing Graph and Transformer Embeddings for Vector Based Retrieval
Using Graph and Transformer Embeddings for Vector Based Retrieval
 
Transformer Mods for Document Length Inputs
Transformer Mods for Document Length InputsTransformer Mods for Document Length Inputs
Transformer Mods for Document Length Inputs
 
Question Answering as Search - the Anserini Pipeline and Other Stories
Question Answering as Search - the Anserini Pipeline and Other StoriesQuestion Answering as Search - the Anserini Pipeline and Other Stories
Question Answering as Search - the Anserini Pipeline and Other Stories
 
Building Named Entity Recognition Models Efficiently using NERDS
Building Named Entity Recognition Models Efficiently using NERDSBuilding Named Entity Recognition Models Efficiently using NERDS
Building Named Entity Recognition Models Efficiently using NERDS
 
Graph Techniques for Natural Language Processing
Graph Techniques for Natural Language ProcessingGraph Techniques for Natural Language Processing
Graph Techniques for Natural Language Processing
 
Search summit-2018-content-engineering-slides
Search summit-2018-content-engineering-slidesSearch summit-2018-content-engineering-slides
Search summit-2018-content-engineering-slides
 
SoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming textSoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming text
 
Evolving a Medical Image Similarity Search
Evolving a Medical Image Similarity SearchEvolving a Medical Image Similarity Search
Evolving a Medical Image Similarity Search
 
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...
 

Dernier

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 

Dernier (20)

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 

Learning to Rank Presentation (v2) at LexisNexis Search Guild

  • 1. LexisNexisLegalSearchGuild: Learning to Rank v2.0. What it is, how it's done, and what it can do for you. Sujit Pal, Elsevier Labs, October 11, 2018
  • 2. Outline
      • History
      • Overview
      • Practical Considerations
      • Case Studies
      • Wrap Up
  • 3. History
      • 1992: Idea of LTR (or Machine Learned Ranking) first proposed
      • 2003: Altavista (later acquired by Yahoo!) uses LTR in its engine
      • 2005: Microsoft invents RankNet, deploys it in its web search engine (later Bing)
      • 2008: In contrast, Google's engine is hand-tuned and relies on up to ~200 signals
      • 2009: Yandex invents and deploys MatrixNet in its engine
      • 2016: Google says RankBrain is the #3 signal in its search engine
      • 2016: Bloomberg contributes LTR plugin to Solr
      • 2017: Open Source Connections contributes LTR plugin for Elasticsearch
  • 4. Overview
  • 5. LTR Pipeline
      • Training: Build LTR model using training data (query, document, label) triples
          • Label is rank
      • Inference: Use model to predict label ŷ = h(x) from unseen (query, document) pairs
      Image Credit: https://towardsdatascience.com/when-to-use-a-machine-learned-vs-score-based-search-ranker-aa8762cd9aa9
  • 6. LTR used as re-ranker
      • LTR models are usually more computationally expensive than search engines.
      • The search engine is used to pull out matched documents.
      • The top N of these documents are fed into the LTR model, and the top n results are replaced with the output of the model, for N >> n (typically 50-100x).
      Figure labels: Index, Query, Matched (10k), Scored (10k), Top 1000 retrieved, Ranking Model, Re-ranked Top 10
      Image Credit: https://lucidworks.com/2016/08/17/learning-to-rank-solr/
  • 7. Types of LTR Algorithms
      • Pointwise – documents ranked by relevance of each (query, document) pair
      • Pairwise – documents ranked by considering priority between pairs of (query, document) pairs
      • Listwise – documents ranked by considering the entire relevance ordering of all (query, document) tuples per query
      Image Credit: https://www.hiit.fi/wp-content/uploads/2018/04/Learning-to-Rank-S11.pdf
  • 8. Popular LTR Libraries
      • RankLib (Java) – from Lemur Project (UMass, CMU), provides Coordinate Ascent, Random Forest (pointwise), MART, RankNet, RankBoost (pairwise), LambdaMART (pair/listwise), AdaRank and ListNet (listwise)
      • SVMRank (C++) – from Cornell, provides SVMRank (pairwise)
      • XGBoost (Python/C++) – LambdaRank (pairwise)
      • PyLTR (Python) – LambdaMART (pairwise)
      • Michael Alcorn (Python) – RankNet and LambdaMART (pairwise)
  • 9. Common Algorithms
      • Linear Model
          • Predicted rank is linear combination of input features
      • RankNet
          • Neural network based
          • Good for binary (relevant/irrelevant) labels
          • Weight matrix transforms input features into rank probabilities
      • LambdaMART
          • Tree (forest) based
          • Good for multi-class labels
          • Feature splits with thresholds
      Image Credits: https://slideplayer.com/slide/9347431/, https://dimensionless.in/introduction-to-random-forest/
  • 10. Practical Considerations
  • 11. Reasons to consider LTR
      • Too many parameters to tune manually without overfitting to a particular query set.
      • Availability of enough good (implicit or explicit) training data to train an LTR model.
  • 12. Acquiring labels
      • Implicit
          • Intrinsic features (words, phrases)
          • Document metadata
          • User Clicks
          • Time spent on document
          • Purchases (if applicable)
          • Cheap to build but noisy
      • Explicit
          • Human expert rates relevancy of each document against query
          • Cleaner but expensive to build
      Image Credit: https://www.datasciencecentral.com/profiles/blogs/handling-imbalanced-data-sets-in-supervised-learning-using-family
  • 13. Traditional Ranking Models
      • Vector Space Models
          • Boolean – predicts whether a document is relevant to the query or not
          • TF-IDF – rank documents by cosine similarity between document and query
      • Probabilistic Models
          • BM25 – rank documents by log odds of relevance to query
          • LMIR – probability of document's LM generating terms in query
      • Importance based Models
          • HITS – rank documents by hubness/authority (inlinks/outlinks)
          • PageRank – rank documents by probability of a random surfer arriving on the page
          • Impact Factor – rank documents by number of citations
  • 14. Feature Selection
      • Document Features
          • Document Length
          • URL Length
          • Publication Date
          • Number of outlinks
          • PageRank
      • Query Features
          • Number of words
          • PER or ORG in query
      • Query-Document Features
          • TF-IDF, BM25 similarity
          • Frequency of query in anchor text
          • Document contains query words in title
      • User Dependent Features
          • Star ratings
          • Age, gender
          • Device
  • 15. Evaluation Metrics
      • Mean Average Precision (MAP@k)
      • Mean Reciprocal Rank (MRR)
      • Normalized Discounted Cumulative Gain (NDCG@k)
      • Rank Correlation
  • 16. LETOR Data Format
        2 qid:1 1:3 2:3 3:0 4:0 5:3 6:1 7:1 8:0 9:0 10:1 11:156... # 11
        2 qid:1 1:3 2:0 3:3 4:0 5:3 6:1 7:0 8:1 9:0 10:1 11:406... # 23
        0 qid:1 1:3 2:0 3:2 4:0 5:3 6:1 7:0 8:0.666667 9:0 10:1 ... # 44
        2 qid:1 1:3 2:0 3:3 4:0 5:3 6:1 7:0 8:1 9:0 10:1 11:287 ... # 57
        1 qid:1 1:3 2:0 3:3 4:0 5:3 6:1 7:0 8:1 9:0 10:1 11:2009 ... # 89
      Fields, left to right:
      • Label
      • Query ID
      • Features: query, document, query/document, other (sparse or dense format)
      • Comments (ex: docID)
  • 17. Case Studies
  • 18. Dataset
      • We use The Movie Database (TMDB) from Kaggle.
      • 45k movies, 20 genres, 31k unique keywords
      • We extract the following fields: (docID, title, description, popularity, release date, running time, rating (0-10), keywords, genres)
      • Categorical labels 1-5 created from rating
      • Objective is to build an LTR model that learns the ordering implied by rating and to re-rank the top 10 results using this model
      • Features chosen: (query-title and query-description similarity using TF-IDF and BM25, document recency, original score, and boolean 0/1 for each genre)
  • 19. General Flow
      • Set up plugin if applicable
      • Load data
      • Define LTR features to index (if applicable)
      • Generate LTR features for query set in LETOR format
      • Train LTR model using 3rd party library (RankLib, etc.)
      • Upload trained model definition to index (if applicable)
      • Run re-rank query using trained LTR model
  • 20. Examples
      • Solr Case Study
      • Elasticsearch Case Study
      • DIY LTR Case Study
      • Project: https://github.com/sujitpal/ltr-examples
  • 21. Wrap Up
  • 22. Resources
      • Book – Learning to Rank for Information Retrieval, by Tie-Yan Liu.
      • Paper – From RankNet to LambdaRank to LambdaMART: An Overview, by Christopher J. C. Burges
      • Tutorials
          • Solr – https://github.com/airalcorn2/Solr-LTR
          • Elasticsearch – Learning to Rank 101 by Pere Urbon-Bayes, ES-LTR Demo by Doug Turnbull.
      • Product Centric LTR Documentation
          • Solr Learning To Rank Docs
          • Elasticsearch Learning to Rank Docs
  • 23. Thank you!
      • Contact: sujit.pal@elsevier.com
      • Twitter: @palsujit
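
The LETOR rows on slide 16 can be read with a few lines of Python. The sketch below is illustrative only: `LetorRow` and `parse_letor_line` are hypothetical names, the sample row is a shortened version of the first row on the slide (the elided features are left out), and it assumes sparse `index:value` features with an optional trailing `#` comment.

```python
from typing import Dict, NamedTuple, Optional


class LetorRow(NamedTuple):
    """One parsed row (hypothetical container, not part of any library)."""
    label: float
    qid: str
    features: Dict[int, float]
    comment: Optional[str]


def parse_letor_line(line: str) -> LetorRow:
    """Parse '<label> qid:<id> <idx>:<val> ... # <comment>'."""
    # Everything after the first '#' is a free-text comment (ex: docID).
    body, sep, comment = line.partition("#")
    tokens = body.split()
    label = float(tokens[0])
    if not tokens[1].startswith("qid:"):
        raise ValueError("second token must be qid:<id>")
    qid = tokens[1][len("qid:"):]
    features: Dict[int, float] = {}
    for token in tokens[2:]:
        idx, value = token.split(":", 1)
        features[int(idx)] = float(value)
    return LetorRow(label, qid, features, comment.strip() if sep else None)


# Shortened sample row; feature indices missing from a row are simply absent.
row = parse_letor_line("2 qid:1 1:3 2:3 3:0 8:0.666667 # 11")
print(row.label, row.qid, row.features, row.comment)
```

Keeping features in a dict preserves the sparse form; a dense vector can be built afterwards once the total number of features is known.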

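The re-ranking step on slide 6 might be sketched as follows. This is a minimal sketch, not any search engine's plugin API: `rerank`, the toy hit list, and the toy model scores are all made up for illustration, and it assumes the search engine returns hits already sorted by its own score.

```python
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[str, float]  # (doc_id, search-engine score)


def rerank(hits: Sequence[Hit],
           model_score: Callable[[str], float],
           top_n: int) -> List[str]:
    """Re-order only the first top_n hits by model score; keep the tail as-is."""
    head = sorted(hits[:top_n], key=lambda hit: model_score(hit[0]), reverse=True)
    tail = hits[top_n:]
    return [doc_id for doc_id, _ in head] + [doc_id for doc_id, _ in tail]


# Toy data: the engine ranks a > b > c > d, the (hypothetical) model disagrees.
hits = [("a", 9.0), ("b", 8.0), ("c", 7.0), ("d", 6.0)]
toy_model_scores = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.99}

reranked = rerank(hits, toy_model_scores.__getitem__, top_n=3)
print(reranked)  # ['b', 'c', 'a', 'd']
```

Document "d" has the highest model score but stays last because it falls outside the re-ranking window, which is the cost of running the expensive model on only the top N hits.
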
Editor's notes

  1. Most of the key work was done between 2008 and 2011, with competitions sponsored by Microsoft, Yahoo and Yandex. Bloomberg LTR meetup: Michael Nilsson, Erick Erickson. OSC LTR: presented at Haystack earlier this year.
  2. In all cases you need a judgment list (i.e., relevant vs. irrelevant). For MRR you need the position of the first good result; for DCG you need graded results; and for NDCG and rank correlation you also need the ideal ordering.
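
A rough sketch of two of the metrics from note 2 (and slide 15), assuming a binary judgment list for the reciprocal rank and graded labels for NDCG. The function names are hypothetical, and the DCG here uses the common 2^grade - 1 gain with a log2 position discount, which is only one of several variants in use.

```python
import math
from typing import Sequence


def reciprocal_rank(relevant: Sequence[bool]) -> float:
    """1 / position of the first relevant result, or 0.0 if there is none."""
    for position, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            return 1.0 / position
    return 0.0


def dcg_at_k(grades: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k graded results."""
    return sum((2 ** grade - 1) / math.log2(position + 1)
               for position, grade in enumerate(grades[:k], start=1))


def ndcg_at_k(grades: Sequence[float], k: int) -> float:
    """DCG divided by the DCG of the ideal (descending-grade) ordering."""
    ideal = dcg_at_k(sorted(grades, reverse=True), k)
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0


# Grades in ranked order for one query (values chosen for illustration).
ranked_grades = [2, 2, 0, 2, 1]
print(reciprocal_rank([g > 0 for g in ranked_grades]))
print(ndcg_at_k(ranked_grades, k=5))
```

MRR is then the mean of `reciprocal_rank` over all queries, and NDCG@k is usually reported as the mean of `ndcg_at_k` over all queries.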