OCTOBER 11-14, 2016 • BOSTON, MA
Evolving The Optimal Relevancy Scoring Model at Dice.com
Simon Hughes
Chief Data Scientist, Dice.com
•  Chief Data Scientist at Dice.com and DHI, under Yuri Bykov
•  Dice.com – leading US job board for IT professionals
•  Twitter handle: https://twitter.com/hughes_meister
Who Am I?
•  Dice Skills pages - http://www.dice.com/skills
•  New Dice Careers Mobile App
Key Projects
•  PhD candidate at DePaul University, studying NLP and machine learning
•  Thesis topic – Detecting causality in scientific explanatory essays
PhD
•  Look under https://github.com/DiceTechJobs
•  Set of Solr plugins https://github.com/DiceTechJobs/SolrPlugins
•  Tutorial for this talk: https://github.com/DiceTechJobs/RelevancyTuning
Open Source GitHub Repositories
1.  Approaches to Relevancy Tuning
2.  Automated Relevancy Tuning – using Reinforcement Learning
3.  Feedback Loops - Dangers of Closed Loop Learning Systems
Overview
•  Last year I talked about conceptual search and how that could be used to improve recall
•  This year I want to focus on techniques to improve precision
•  Novelty
Motivations for Talk
Finding the Optimal Search Engine Configuration
•  Most companies initially approach this with a very ad hoc, manual process:
•  Follow ‘best practices’ and make some initial educated guesses as to the best settings
•  Manually tune the parameters on a number of key user queries
•  The search engine parameters should be tuned to reflect how your users search
•  Relevancy is hard to define, but it is ultimately whatever your users consider an optimal search experience, so it should be informed by their search behavior
Relevancy Tuning
What Solr Configuration Options Influence Relevancy?
Solr and Lucene provide many configuration options that impact search relevancy, including:
•  Which query parser – dismax, edismax, LuceneParser, etc
•  Field boosts – qf parameter
•  Phrase boosts – pf, pf2, pf3 parameters
•  Minimum should match - mm parameter
•  Similarity Class – default similarity, BM25, Tf.Idf, custom or one of many others
•  Boost queries – boost, bf, bq, etc
•  Edismax tie parameter – recommended value ≈ 0.1
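As a rough illustration of how these options appear on a single request, the sketch below assembles edismax parameters in Python. The parameter names (`defType`, `qf`, `pf`, `pf2`, `mm`, `tie`) are the standard edismax ones; the field names and boost values are hypothetical placeholders, not Dice.com's settings.

```python
from urllib.parse import urlencode


def format_boosts(boosts):
    """Render {'title': 3.0} as the 'title^3.0' syntax used by qf/pf."""
    return " ".join(f"{field}^{boost}" for field, boost in boosts.items())


def build_edismax_params(user_query, field_boosts, phrase_boosts,
                         mm="2<75%", tie=0.1):
    """Return request parameters for an edismax query (values are examples)."""
    return {
        "defType": "edismax",               # which query parser
        "q": user_query,
        "qf": format_boosts(field_boosts),  # field boosts
        "pf": format_boosts(phrase_boosts), # whole-query phrase boost
        "pf2": format_boosts(phrase_boosts),# bigram phrase boost
        "mm": mm,                           # minimum should match
        "tie": tie,                         # tie breaker across fields
    }


params = build_edismax_params(
    "java developer",
    field_boosts={"title": 3.0, "skills": 2.0, "description": 1.0},
    phrase_boosts={"title": 5.0},
)
query_string = urlencode(params)  # ready to append to a /select URL
```

Every value in `params` is a candidate for tuning, which is what makes the configuration space large.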
Remove Noise Chars
•  Ensure punctuation characters and plurality are removed from each field using the analysis chain
Ø  ‘q=developer’ should match ‘developer,’ ,’developer.’, ‘developer’s’ and ‘developers’
When using Stemming and Synonyms – use Copy Fields + Edismax
•  Use copy fields to apply stemming and synonyms to existing fields
•  Allows different boosts to be applied to stemmed and synonym matches
•  Set the field boosts lower on the stemmed and synonym copy fields than on the original fields
Some General Tips on Relevancy Tuning
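One possible way to express the copy-field tip is to derive the `qf` string from the boosts on the original fields. This is only a sketch: the `_stemmed` and `_syn` suffixes and the discount factors are assumptions made for illustration, and the copy fields themselves would still have to be defined in the Solr schema.

```python
def qf_with_copy_fields(base_boosts, stem_factor=0.5, syn_factor=0.3):
    """Build a qf string where stemmed/synonym copy fields get lower boosts.

    base_boosts maps an exact-match field to its boost. The suffixes
    '_stemmed' and '_syn' are hypothetical copy-field naming conventions.
    """
    parts = []
    for field, boost in base_boosts.items():
        parts.append(f"{field}^{boost:g}")                         # exact match
        parts.append(f"{field}_stemmed^{boost * stem_factor:g}")   # stemmed match
        parts.append(f"{field}_syn^{boost * syn_factor:g}")        # synonym match
    return " ".join(parts)


qf = qf_with_copy_fields({"title": 4.0, "description": 1.0})
```

With this layout an exact match on `title` outscores a stemmed match, which in turn outscores a synonym match, and the two factors become tunable parameters.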
Use Boost Queries for Specific Query Use Cases
•  Edismax bq parameter – allows boosting of matches to nested queries
•  See chapter 7 of Relevant Search - good coverage of this strategy
Make Good Use of Phrase Query Boosts
•  Use pf, pf2 and pf3 parameters in edismax to give preference for multi-term matches
•  pf2 and pf3 often give better performance than pf, which requires an exact match for all query terms
Caveat Emptor: Monitor impact of these changes on query performance (QTime) and index size
Some General Tips on Relevancy Tuning
•  To tune your search parameters, you can gather a dataset of relevancy judgements
•  For a set of important queries, the dataset contains the top results returned, each annotated for relevancy
•  This dataset can be collected using domain experts and a user interface designed for this task
•  Commercial Examples:
•  Quepid – developed by OpenSource Connections
•  Fusion UI Relevancy Workbench – part of the Fusion offering from Lucidworks
The ‘Golden’ Test Collection
•  An alternative to manually collecting relevancy judgements is to collect them directly from your users
•  For each user search on the site, capture:
•  User’s query, and timestamp
•  Any filters applied
•  Result impressions and clicks
•  You can then turn this into a test collection by assuming that the results that people click on are more relevant
than those they don’t
•  The time spent on the results page is also a great indication of how relevant that result was to the original search
Search Log Capture
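A minimal sketch of turning captured impressions and clicks into a test collection, assuming each log event has already been reduced to a `(query, doc_id, clicked)` tuple. The event layout, the `min_impressions` threshold and the use of raw click-through rate as the relevancy grade are all assumptions for illustration.

```python
from collections import defaultdict


def judgments_from_logs(events, min_impressions=2):
    """Aggregate (query, doc_id, clicked) events into graded judgments.

    Returns {query: {doc_id: click_through_rate}}. Pairs seen fewer than
    min_impressions times are dropped as too noisy to judge.
    """
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for query, doc_id, clicked in events:
        key = (query.strip().lower(), doc_id)  # normalise the query text
        impressions[key] += 1
        clicks[key] += int(clicked)

    judgments = defaultdict(dict)
    for (query, doc_id), shown in impressions.items():
        if shown >= min_impressions:
            judgments[query][doc_id] = clicks[(query, doc_id)] / shown
    return dict(judgments)


log = [
    ("Java Developer", "job1", True),
    ("java developer", "job1", True),
    ("java developer", "job2", False),
    ("java developer", "job2", True),
    ("java developer", "job3", False),  # only one impression: dropped
]
judgments = judgments_from_logs(log)
```

Raw click-through rate ignores the position at which a result was shown, so a production version would likely need to correct for that.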
•  Now you have a test collection, you can use that to tune your search engine configuration
•  Using the test collection, you can measure the relevancy of a set of searches on that collection using some IR metrics, such as:
•  MAP (Mean Average Precision)
•  Precision at K (precision computed over the top k documents retrieved)
•  NDCG (Normalized Discounted Cumulative Gain)
•  Regression testing – this allows you to build a set of regression tests to ensure configuration changes both improve relevancy
and don’t break certain queries
•  Manually tuning search configurations is still a time consuming and inefficient process
•  Is there a better way?
Relevancy Tuning with a Test Collection
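The three metrics named above can be sketched in a few lines of Python. These follow the common textbook definitions (binary relevance for precision and MAP, graded gains with a log2 discount for NDCG); the function names and input shapes are choices made for this example.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant doc appears."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """runs: list of (ranked_doc_ids, relevant_doc_id_set), one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(ranked, gains, k):
    """gains: doc_id -> graded relevance; unjudged docs count as 0."""
    dcg = sum(gains.get(doc, 0) / math.log2(rank + 1)
              for rank, doc in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


ranked = ["a", "b", "c", "d"]
p2 = precision_at_k(ranked, {"a", "c"}, 2)
ap = average_precision(ranked, {"a", "c"})
ndcg = ndcg_at_k(ranked, {"a": 3, "c": 1}, 4)
```

Running the same functions before and after a configuration change, over a fixed set of queries, gives the regression test described above.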
1.  Supervised Machine Learning?
•  No - cannot optimize your search configuration without a computable gradient
2.  Grid Search?
•  Perform a brute force search over the range of possible configuration parameters
•  Very slow and inefficient – is not able to learn which ranges of settings work best
3.  Black Box Optimization Algorithms?
•  Optimization algorithms exist that attempt to find the optimum value of an unknown function in as few iterations as
possible
•  Perform a much smarter search of the parameter space than grid search
Automated Relevancy Tuning Approaches
•  Use an optimization algorithm to optimize a ‘black box’ function
•  Black box function – provide the optimization algorithm with a function that takes a set of parameters as inputs
and computes a score
•  The black box algorithm will then try to choose parameter settings that optimize the score
•  This can be thought of as a form of reinforcement learning
•  These algorithms will intelligently search the space of possible search configurations to arrive at a solution
•  Example algorithms include Bayesian Optimization, Simulated Annealing, and Genetic Algorithms (hence talk
title)
Black Box Optimization Algorithms
Example Black Box Function for Search Relevancy
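A minimal sketch of what such a black box function could look like, not the code shown in the talk. It wraps a search call and a test collection into a function of the parameters that returns MAP at k. Here `toy_search` and `DOCS` are stand-ins invented for this example so that it runs without a Solr instance; in practice `search_fn` would issue the query to the search engine with the given settings.

```python
def average_precision(ranked, relevant):
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def make_black_box(search_fn, judgments, k=5):
    """Return f(params) -> MAP@k over the test collection.

    search_fn(query, params) -> ranked doc ids (hypothetical interface).
    judgments: query -> set of relevant doc ids.
    """
    def black_box(params):
        scores = [average_precision(search_fn(query, params)[:k], relevant)
                  for query, relevant in judgments.items()]
        return sum(scores) / len(scores)
    return black_box


# Toy in-memory index standing in for the search engine (made-up data).
DOCS = {
    "job1": {"title": "java developer", "description": "build services"},
    "job2": {"title": "office manager", "description": "java coffee for the developer team"},
    "job3": {"title": "python developer", "description": "java is a plus"},
}


def toy_search(query, params):
    """Score = sum over fields of boost * number of query terms matched."""
    terms = query.split()

    def score(doc):
        return sum(boost * sum(t in doc[field].split() for t in terms)
                   for field, boost in params.items())

    return sorted(DOCS, key=lambda d: score(DOCS[d]), reverse=True)


black_box = make_black_box(toy_search, {"java developer": {"job1"}})
good = black_box({"title": 2.0, "description": 1.0})
bad = black_box({"title": 0.1, "description": 1.0})
```

The optimizer only ever sees `black_box`: parameters go in, a single score comes out.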
•  There are some excellent mature libraries for doing this sort of thing e.g.
•  DEAP – Distributed Evolutionary Algorithms in Python (hence talk title)
•  Scikit Optimize – General optimization library built by a team at CERN headed by Tim Head
•  These libraries are very easy to use, however getting them to optimize your search configuration is a little trickier
•  They tend to work better when optimizing a small set of parameters at a time – 1 to 4 works well
•  Achieved an improvement of 5% in MAP @ 5 for our MLT configuration. A/B testing of changes to search is planned before the end of the year
Making it Work
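To show the shape of the optimization loop without depending on either library, here is a small elitist evolutionary search written with only the Python standard library. It is a simplified stand-in, not DEAP's or Scikit Optimize's API and not the configuration used at Dice.com. The objective `toy_score` and the parameter names are invented; in practice the score would come from a relevancy metric computed over a test collection.

```python
import random


def evolve(score_fn, bounds, generations=30, population=20, elite=5,
           sigma=0.2, seed=0):
    """Maximise score_fn over a box. bounds: name -> (low, high)."""
    rng = random.Random(seed)

    def clip(name, value):
        low, high = bounds[name]
        return min(high, max(low, value))

    def random_individual():
        return {n: rng.uniform(lo, hi) for n, (lo, hi) in bounds.items()}

    def mutate(parent):
        # Gaussian noise scaled to each parameter's range, kept inside bounds.
        return {n: clip(n, v + rng.gauss(0, sigma * (bounds[n][1] - bounds[n][0])))
                for n, v in parent.items()}

    pop = [random_individual() for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=score_fn, reverse=True)
        parents = pop[:elite]  # the best survive unchanged (elitism)
        children = [mutate(rng.choice(parents)) for _ in range(population - elite)]
        pop = parents + children
    return max(pop, key=score_fn)


BOUNDS = {"title_boost": (0.0, 10.0), "mm": (0.0, 1.0)}


def toy_score(params):
    """Made-up smooth objective with its peak at title_boost=3, mm=0.75."""
    return -(params["title_boost"] - 3.0) ** 2 - (params["mm"] - 0.75) ** 2


best = evolve(toy_score, BOUNDS)
```

Because the best individuals are carried over unchanged, the best score never decreases from one generation to the next. Each call to `score_fn` corresponds to a full pass over the test queries, which is why the number of evaluations matters.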
•  To optimize a large set of search parameters – start with the most important ones and optimize those while
keeping the rest fixed
•  If you are using search logs to optimize the search configuration, use a large number of searches (at least a few
thousand) to ensure you are performing a robust enough test
•  For most search collections of a reasonable size, running these optimizations over your search collection will
take time – set it up on a server, parallelize where possible and leave running overnight
•  Typically you will want to allow the algorithm to try at least a few hundred variations of each parameter set to find a good range of settings
•  Ideally – first optimize your search configuration against a set of relevancy judgements acquired from domain experts, deploy to production and use the search logs to further tune against your users’ search behavior
Making it Work
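The first tip, optimizing a few parameters at a time while the rest stay fixed, might be sketched as below. Plain random search is used inside each stage purely to keep the example short; any black box optimizer could take its place. The parameter names, ranges and `toy_score` objective are hypothetical.

```python
import random


def tune_in_stages(score_fn, defaults, stages, trials=200, seed=0):
    """Tune groups of parameters in order, holding all others fixed.

    stages: list of {name: (low, high)} dicts, most important group first.
    Returns (best_params, best_score).
    """
    rng = random.Random(seed)
    best = dict(defaults)
    best_score = score_fn(best)
    for bounds in stages:
        for _ in range(trials):
            candidate = dict(best)  # everything outside this stage stays fixed
            candidate.update({n: rng.uniform(lo, hi)
                              for n, (lo, hi) in bounds.items()})
            score = score_fn(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score


DEFAULTS = {"title_boost": 1.0, "mm": 1.0, "tie": 0.0}


def toy_score(p):
    """Made-up objective standing in for a relevancy metric."""
    return (-(p["title_boost"] - 3.0) ** 2
            - (p["mm"] - 0.75) ** 2
            - (p["tie"] - 0.1) ** 2)


best, best_score = tune_in_stages(
    toy_score,
    DEFAULTS,
    stages=[{"title_boost": (0.0, 10.0)},          # stage 1: field boost
            {"mm": (0.0, 1.0), "tie": (0.0, 1.0)}],  # stage 2: mm and tie
)
```

A candidate is only kept when it beats the current best, so the result is never worse than the starting configuration on the data it was tuned on.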
•  As with any machine learning problem, it is essential to use one dataset to learn from, and a second separate dataset to
validate your results – prevents ‘overfitting’
•  Overfitting in this context means that the search parameters are so over-tuned on your initial dataset that the search engine performs worse on new data than it does with the current configuration
•  Once you have a set of configuration parameters that you are happy with, evaluate them on a second set of relevancy judgements to ensure the same performance gains are seen there also
•  This applies to both manual and automatic tuning of the search engine configuration. Humans can overfit a dataset just as
easily as an algorithm can
Use a Separate Testing Dataset to Validate Improvements
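A minimal sketch of this validation step, assuming the test collection is keyed by query. The split fraction, the `min_gain` threshold and the `score_fn(params, queries)` interface are assumptions made for the example.

```python
import random


def split_queries(queries, validation_fraction=0.3, seed=0):
    """Shuffle the queries reproducibly and split into (tuning, validation)."""
    shuffled = sorted(queries)
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


def accept_new_config(score_fn, current, candidate, validation_queries,
                      min_gain=0.0):
    """Accept the candidate only if it beats the current config on held-out queries.

    score_fn(params, queries) -> metric value (hypothetical interface).
    """
    gain = (score_fn(candidate, validation_queries)
            - score_fn(current, validation_queries))
    return gain > min_gain


tuning, validation = split_queries([f"q{i}" for i in range(10)])
```

Only the `tuning` queries should be shown to the optimizer or to a person tuning by hand; the `validation` queries are used once, at the end, to decide whether the change ships.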
•  Auto-tune other solr parameters – phrase slop, mm settings, similarity class used
•  You can evolve a more optimal ranking function:
•  Either tweak the settings of the existing ranking functions (see SweetSpotSimilarityFactory class)
•  Or use Genetic Programming to evolve a better ranking function for your dataset
•  Genetic Programming is an evolutionary algorithm that can evolve programs and equations
•  Some relevant papers, good introductory paper (but not very recent)
Some Other Things to Try
•  Building a Machine Learned Ranking system is a premature optimization if you haven’t first optimized
your search configuration
•  Relevancy tuning and MLR both primarily optimize for precision over recall, due to the nature of the training data
•  For techniques to improve recall, see conceptual and semantic search:
•  Simon Hughes - “Conceptual Search” (Revolution 2015)
•  Trey Grainger - “Enhancing Relevancy Through Personalization and Semantic Search” (Revolution 2013)
•  Doug Turnbull and John Berryman - Chapter 11 of Relevant Search
Things to Consider
Feedback Loops – Dangers of Closed Loop Learning Systems
[Diagram: users interact with the system and produce data; machine learning turns that data into a model]
Building a Machine Learning System
1.  Users interact with the system to produce data
2.  Machine learning algorithms turn that data into a model
What happens if the model’s predictions influence the user’s behavior?
[Diagram: the same loop, with the model now changing user behavior]
Positive Feedback Loop
1.  Users interact with the system to produce data
2.  Machine learning algorithms turn that data into a model
3.  Model changes user behavior, modifying its own future training data
1.  Isolate a subset of data from being influenced by the model, and use this data to train the system
•  E.g. leave a small proportion of user searches un-ranked by the MLR model
•  E.g. generate a subset of recommendations at random, or by using an unsupervised model
2.  Use a reinforcement learning model instead (such as a multi-armed bandit) - the system will
dynamically adapt to the users’ behavior, balancing exploring different hypotheses with
exploiting what it’s learned to produce accurate predictions
Preventing Positive Feedback Loops
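The second option can be illustrated with epsilon-greedy, one of the simplest multi-armed bandit strategies. This is a sketch under stated assumptions: the two "arms" are hypothetical search configurations, the click rates are made-up numbers used only to simulate users, and a real system would likely use a more refined bandit algorithm.

```python
import random


class EpsilonGreedyBandit:
    """Mostly serve the best-known arm, but explore at random with prob. epsilon."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.pulls = {arm: 0 for arm in arms}
        self.rewards = {arm: 0.0 for arm in arms}

    def mean(self, arm):
        return self.rewards[arm] / self.pulls[arm] if self.pulls[arm] else 0.0

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.pulls))  # explore
        return max(self.pulls, key=self.mean)         # exploit

    def update(self, arm, reward):
        self.pulls[arm] += 1
        self.rewards[arm] += reward


# Simulated users: made-up click probabilities per configuration.
true_click_rate = {"config_a": 0.10, "config_b": 0.30}
bandit = EpsilonGreedyBandit(true_click_rate, epsilon=0.1)
users = random.Random(1)
for _ in range(2000):
    arm = bandit.choose()
    clicked = users.random() < true_click_rate[arm]
    bandit.update(arm, 1.0 if clicked else 0.0)
```

The exploration traffic also serves the purpose of the first option: a small share of results is always chosen independently of what the system currently believes is best.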
THE END
•  Thank you for listening
•  Any questions?

Contenu connexe

Tendances

H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...Lucidworks
 
Cross Data Center Replication for the Enterprise: Presented by Adam Williams,...
Cross Data Center Replication for the Enterprise: Presented by Adam Williams,...Cross Data Center Replication for the Enterprise: Presented by Adam Williams,...
Cross Data Center Replication for the Enterprise: Presented by Adam Williams,...Lucidworks
 
Measuring Search Engine Quality using Spark and Python
Measuring Search Engine Quality using Spark and PythonMeasuring Search Engine Quality using Spark and Python
Measuring Search Engine Quality using Spark and PythonSujit Pal
 
Solr Distributed Indexing in WalmartLabs: Presented by Shengua Wan, WalmartLabs
Solr Distributed Indexing in WalmartLabs: Presented by Shengua Wan, WalmartLabsSolr Distributed Indexing in WalmartLabs: Presented by Shengua Wan, WalmartLabs
Solr Distributed Indexing in WalmartLabs: Presented by Shengua Wan, WalmartLabsLucidworks
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Lucidworks
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Lucidworks
 
Understanding Lucene Search Performance
Understanding Lucene Search PerformanceUnderstanding Lucene Search Performance
Understanding Lucene Search PerformanceLucidworks (Archived)
 
Data Science with Solr and Spark
Data Science with Solr and SparkData Science with Solr and Spark
Data Science with Solr and SparkLucidworks
 
The Evolution of Streaming Expressions - Joel Bernstein, Alfresco & Dennis Go...
The Evolution of Streaming Expressions - Joel Bernstein, Alfresco & Dennis Go...The Evolution of Streaming Expressions - Joel Bernstein, Alfresco & Dennis Go...
The Evolution of Streaming Expressions - Joel Bernstein, Alfresco & Dennis Go...Lucidworks
 
Exploring Direct Concept Search - Steve Rowe, Lucidworks
Exploring Direct Concept Search - Steve Rowe, LucidworksExploring Direct Concept Search - Steve Rowe, Lucidworks
Exploring Direct Concept Search - Steve Rowe, LucidworksLucidworks
 
Personalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.comPersonalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.comLucidworks
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
Solr JDBC: Presented by Kevin Risden, Avalon Consulting
Solr JDBC: Presented by Kevin Risden, Avalon ConsultingSolr JDBC: Presented by Kevin Risden, Avalon Consulting
Solr JDBC: Presented by Kevin Risden, Avalon ConsultingLucidworks
 
Twitter Search Architecture
Twitter Search Architecture Twitter Search Architecture
Twitter Search Architecture Ramez Al-Fayez
 
Our Tale from the Trail of Shadows at REI Co-op - Chris Phillips & Dale Smith...
Our Tale from the Trail of Shadows at REI Co-op - Chris Phillips & Dale Smith...Our Tale from the Trail of Shadows at REI Co-op - Chris Phillips & Dale Smith...
Our Tale from the Trail of Shadows at REI Co-op - Chris Phillips & Dale Smith...Lucidworks
 
Data Engineering with Solr and Spark
Data Engineering with Solr and SparkData Engineering with Solr and Spark
Data Engineering with Solr and SparkLucidworks
 
Structured Streaming Use-Cases at Apple
Structured Streaming Use-Cases at AppleStructured Streaming Use-Cases at Apple
Structured Streaming Use-Cases at AppleDatabricks
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucenelucenerevolution
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucenelucenerevolution
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesRahul Jain
 

Tendances (20)

H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
 
Cross Data Center Replication for the Enterprise: Presented by Adam Williams,...
Cross Data Center Replication for the Enterprise: Presented by Adam Williams,...Cross Data Center Replication for the Enterprise: Presented by Adam Williams,...
Cross Data Center Replication for the Enterprise: Presented by Adam Williams,...
 
Measuring Search Engine Quality using Spark and Python
Measuring Search Engine Quality using Spark and PythonMeasuring Search Engine Quality using Spark and Python
Measuring Search Engine Quality using Spark and Python
 
Solr Distributed Indexing in WalmartLabs: Presented by Shengua Wan, WalmartLabs
Solr Distributed Indexing in WalmartLabs: Presented by Shengua Wan, WalmartLabsSolr Distributed Indexing in WalmartLabs: Presented by Shengua Wan, WalmartLabs
Solr Distributed Indexing in WalmartLabs: Presented by Shengua Wan, WalmartLabs
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
 
Understanding Lucene Search Performance
Understanding Lucene Search PerformanceUnderstanding Lucene Search Performance
Understanding Lucene Search Performance
 
Data Science with Solr and Spark
Data Science with Solr and SparkData Science with Solr and Spark
Data Science with Solr and Spark
 
The Evolution of Streaming Expressions - Joel Bernstein, Alfresco & Dennis Go...
The Evolution of Streaming Expressions - Joel Bernstein, Alfresco & Dennis Go...The Evolution of Streaming Expressions - Joel Bernstein, Alfresco & Dennis Go...
The Evolution of Streaming Expressions - Joel Bernstein, Alfresco & Dennis Go...
 
Exploring Direct Concept Search - Steve Rowe, Lucidworks
Exploring Direct Concept Search - Steve Rowe, LucidworksExploring Direct Concept Search - Steve Rowe, Lucidworks
Exploring Direct Concept Search - Steve Rowe, Lucidworks
 
Personalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.comPersonalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.com
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Solr JDBC: Presented by Kevin Risden, Avalon Consulting
Solr JDBC: Presented by Kevin Risden, Avalon ConsultingSolr JDBC: Presented by Kevin Risden, Avalon Consulting
Solr JDBC: Presented by Kevin Risden, Avalon Consulting
 
Twitter Search Architecture
Twitter Search Architecture Twitter Search Architecture
Twitter Search Architecture
 
Our Tale from the Trail of Shadows at REI Co-op - Chris Phillips & Dale Smith...
Our Tale from the Trail of Shadows at REI Co-op - Chris Phillips & Dale Smith...Our Tale from the Trail of Shadows at REI Co-op - Chris Phillips & Dale Smith...
Our Tale from the Trail of Shadows at REI Co-op - Chris Phillips & Dale Smith...
 
Data Engineering with Solr and Spark
Data Engineering with Solr and SparkData Engineering with Solr and Spark
Data Engineering with Solr and Spark
 
Structured Streaming Use-Cases at Apple
Structured Streaming Use-Cases at AppleStructured Streaming Use-Cases at Apple
Structured Streaming Use-Cases at Apple
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucene
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and Usecases
 

En vedette

Expand Your Perspective
Expand Your PerspectiveExpand Your Perspective
Expand Your PerspectiveGeorge Hutton
 
Emotional Fitness Gym - Exercise Your Emotional Muscles
Emotional Fitness Gym - Exercise Your Emotional MusclesEmotional Fitness Gym - Exercise Your Emotional Muscles
Emotional Fitness Gym - Exercise Your Emotional MusclesAnil Dagia
 
Strengths Finder Presentation
Strengths Finder PresentationStrengths Finder Presentation
Strengths Finder PresentationBrittany Venzon
 
The top-10-secrets-of-nlp-coaching-language
The top-10-secrets-of-nlp-coaching-languageThe top-10-secrets-of-nlp-coaching-language
The top-10-secrets-of-nlp-coaching-languagehazaharb
 
Language of Influence and Persuasion - introduction to the NLP Milton Model
Language of Influence and Persuasion - introduction to the NLP Milton ModelLanguage of Influence and Persuasion - introduction to the NLP Milton Model
Language of Influence and Persuasion - introduction to the NLP Milton ModelFiona Campbell
 
Introduction to the NLP Meta Model - NLP Business Coaching Series
Introduction to the NLP Meta Model - NLP Business Coaching SeriesIntroduction to the NLP Meta Model - NLP Business Coaching Series
Introduction to the NLP Meta Model - NLP Business Coaching SeriesFiona Campbell
 
Strengths Based Leadership
Strengths Based LeadershipStrengths Based Leadership
Strengths Based LeadershipEric Kaufman
 
Leading With Your Strengths - Crash Course in Gallup StrengthsFinder
Leading With Your Strengths - Crash Course in Gallup StrengthsFinder Leading With Your Strengths - Crash Course in Gallup StrengthsFinder
Leading With Your Strengths - Crash Course in Gallup StrengthsFinder nrbenner
 
Four ‘Magic’ Questions that Help Resolve Most Problems - Introduction to The ...
Four ‘Magic’ Questions that Help Resolve Most Problems - Introduction to The ...Four ‘Magic’ Questions that Help Resolve Most Problems - Introduction to The ...
Four ‘Magic’ Questions that Help Resolve Most Problems - Introduction to The ...Fiona Campbell
 
StrengthsEngage - How to understand your Clifton StrengthsFinder results
StrengthsEngage -  How to understand your Clifton StrengthsFinder resultsStrengthsEngage -  How to understand your Clifton StrengthsFinder results
StrengthsEngage - How to understand your Clifton StrengthsFinder resultsPatrick Kayton
 
neuro-linguistic programming
neuro-linguistic programmingneuro-linguistic programming
neuro-linguistic programmingMichael Buckley
 
Slides For Nlp(Anchoring)
Slides For Nlp(Anchoring)Slides For Nlp(Anchoring)
Slides For Nlp(Anchoring)Alwyn Lau
 
How to use NLP in Business
How to use NLP in BusinessHow to use NLP in Business
How to use NLP in BusinessMorgan PR
 
What is Neuro Linguistic Programming (NLP)
What is Neuro Linguistic Programming (NLP)What is Neuro Linguistic Programming (NLP)
What is Neuro Linguistic Programming (NLP)Fiona Campbell
 

En vedette (20)

Expand Your Perspective
Expand Your PerspectiveExpand Your Perspective
Expand Your Perspective
 
Strength Finder 2
Strength Finder 2Strength Finder 2
Strength Finder 2
 
Emotional Fitness Gym - Exercise Your Emotional Muscles
Emotional Fitness Gym - Exercise Your Emotional MusclesEmotional Fitness Gym - Exercise Your Emotional Muscles
Emotional Fitness Gym - Exercise Your Emotional Muscles
 
Meta NLP Day 2
Meta NLP Day 2Meta NLP Day 2
Meta NLP Day 2
 
Coaching by question
Coaching by questionCoaching by question
Coaching by question
 
Strengths Finder Presentation
Strengths Finder PresentationStrengths Finder Presentation
Strengths Finder Presentation
 
The top-10-secrets-of-nlp-coaching-language
The top-10-secrets-of-nlp-coaching-languageThe top-10-secrets-of-nlp-coaching-language
The top-10-secrets-of-nlp-coaching-language
 
NLP
NLPNLP
NLP
 
Language of Influence and Persuasion - introduction to the NLP Milton Model
Language of Influence and Persuasion - introduction to the NLP Milton ModelLanguage of Influence and Persuasion - introduction to the NLP Milton Model
Language of Influence and Persuasion - introduction to the NLP Milton Model
 
Strengths-Based Leadership Handout
Strengths-Based Leadership HandoutStrengths-Based Leadership Handout
Strengths-Based Leadership Handout
 
Introduction to the NLP Meta Model - NLP Business Coaching Series
Introduction to the NLP Meta Model - NLP Business Coaching SeriesIntroduction to the NLP Meta Model - NLP Business Coaching Series
Introduction to the NLP Meta Model - NLP Business Coaching Series
 
Strengths Based Leadership
Strengths Based LeadershipStrengths Based Leadership
Strengths Based Leadership
 
Leading With Your Strengths - Crash Course in Gallup StrengthsFinder
Leading With Your Strengths - Crash Course in Gallup StrengthsFinder Leading With Your Strengths - Crash Course in Gallup StrengthsFinder
Leading With Your Strengths - Crash Course in Gallup StrengthsFinder
 
Four ‘Magic’ Questions that Help Resolve Most Problems - Introduction to The ...
Four ‘Magic’ Questions that Help Resolve Most Problems - Introduction to The ...Four ‘Magic’ Questions that Help Resolve Most Problems - Introduction to The ...
Four ‘Magic’ Questions that Help Resolve Most Problems - Introduction to The ...
 
Neuro linguistic programming(nlp)
Neuro linguistic programming(nlp)Neuro linguistic programming(nlp)
Neuro linguistic programming(nlp)
 
StrengthsEngage - How to understand your Clifton StrengthsFinder results
StrengthsEngage -  How to understand your Clifton StrengthsFinder resultsStrengthsEngage -  How to understand your Clifton StrengthsFinder results
StrengthsEngage - How to understand your Clifton StrengthsFinder results
 
neuro-linguistic programming
neuro-linguistic programmingneuro-linguistic programming
neuro-linguistic programming
 
Slides For Nlp(Anchoring)
Slides For Nlp(Anchoring)Slides For Nlp(Anchoring)
Slides For Nlp(Anchoring)
 
How to use NLP in Business
How to use NLP in BusinessHow to use NLP in Business
How to use NLP in Business
 
What is Neuro Linguistic Programming (NLP)
What is Neuro Linguistic Programming (NLP)What is Neuro Linguistic Programming (NLP)
What is Neuro Linguistic Programming (NLP)
 

Similaire à Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon Hughes, Dice.com

Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comSimon Hughes
 
Dice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkDice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkSimon Hughes
 
Best Practices for Hyperparameter Tuning with MLflow
Best Practices for Hyperparameter Tuning with MLflowBest Practices for Hyperparameter Tuning with MLflow
Best Practices for Hyperparameter Tuning with MLflowDatabricks
 
Optimising Queries - Series 1 Query Optimiser Architecture
Optimising Queries - Series 1 Query Optimiser ArchitectureOptimising Queries - Series 1 Query Optimiser Architecture
Optimising Queries - Series 1 Query Optimiser ArchitectureDAGEOP LTD
 
Net campus2015 antimomusone
Net campus2015 antimomusoneNet campus2015 antimomusone
Net campus2015 antimomusoneDotNetCampus
 
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATAPREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATADotNetCampus
 
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.comEnhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.comSimon Hughes
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Lucidworks
 
Lucene Bootcamp - 2
Lucene Bootcamp - 2Lucene Bootcamp - 2
Lucene Bootcamp - 2GokulD
 
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Lucidworks
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsYalçın Yenigün
 
Personalized Re-Ranking of Documents
Personalized Re-Ranking of DocumentsPersonalized Re-Ranking of Documents
Personalized Re-Ranking of Documentskswapna9
 
Enabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceEnabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceLionel Briand
 
Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015 Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015 antimo musone
 
SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!
SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!
SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!Richard Robinson
 
Data structures and algorithms Module-1.pdf
Data structures and algorithms Module-1.pdfData structures and algorithms Module-1.pdf
Data structures and algorithms Module-1.pdfDukeCalvin
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine Learningsafa cimenli
 
Lec 4 expert systems
Lec 4  expert systemsLec 4  expert systems
Lec 4 expert systemsEyob Sisay
 
'A critique of testing' UK TMF forum January 2015
'A critique of testing' UK TMF forum January 2015 'A critique of testing' UK TMF forum January 2015
'A critique of testing' UK TMF forum January 2015 Georgina Tilby
 

Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon Hughes, Dice.com

  • 1. October 11-14, 2016 • Boston, MA
  • 2. Evolving The Optimal Relevancy Scoring Model at Dice.com Simon Hughes Chief Data Scientist, Dice.com
  • 3. 3 •  Chief Data Scientist at Dice.com and DHI, under Yuri Bykov •  Dice.com – leading US job board for IT professionals •  Twitter handle: https://twitter.com/hughes_meister Who Am I? •  Dice Skills pages - http://www.dice.com/skills •  New Dice Careers Mobile App Key Projects •  PhD candidate at DePaul University, studying NLP and machine learning •  Thesis topic – Detecting causality in scientific explanatory essays PhD
  • 4. 4 •  Look under https://github.com/DiceTechJobs •  Set of Solr plugins https://github.com/DiceTechJobs/SolrPlugins •  Tutorial for this talk: https://github.com/DiceTechJobs/RelevancyTuning Open Source GitHub Repositories
  • 5. 5 1.  Approaches to Relevancy Tuning 2.  Automated Relevancy Tuning – using Reinforcement Learning 3.  Feedback Loops - Dangers of Closed Loop Learning Systems Overview
  • 6. 6 •  Last year I talked about conceptual search and how that could be used to improve recall •  This year I want to focus on techniques to improve precision •  Novelty Motivations for Talk
  • 7. 7 Finding the Optimal Search Engine Configuration •  Most companies initially approach this in a very ad hoc and manual process: •  Follow ‘best practices’ and make some initial educated guesses as to the best settings •  Manually tune the parameters on a number of key user queries •  The search engine parameters should be tuned to reflect how your users search •  Relevancy is a hard to define concept, but it’s what your users consider provides them with an optimal search experience. So it should be informed by their search behavior Relevancy Tuning
  • 8. 8 What Solr Configuration Options Influence Relevancy? Solr and Lucene provide many configuration options that impact search relevancy, including: •  Which query parser – dismax, edismax, LuceneParser, etc •  Field boosts – qf parameter •  Phrase boosts – pf, pf2, pf3 parameters •  Minimum should match - mm parameter •  Similarity Class – default similarity, BM25, Tf.Idf, custom or one of many others •  Boost queries – boost, bf, bq, etc •  Edismax tie parameter – recommended value ≈ 0.1
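
As a rough illustration of how the options on this slide fit together in a single request, the sketch below builds an edismax query string. The parameter names (`defType`, `qf`, `pf`, `pf2`, `pf3`, `mm`, `tie`, `bq`) are standard edismax parameters; the field names (`title`, `skills`, `description`, `employmentType`) and every boost value are hypothetical placeholders, not Dice.com's settings.

```python
from urllib.parse import urlencode

# Field names and boost values are hypothetical, for illustration only.
params = {
    "defType": "edismax",                  # which query parser to use
    "q": "java developer",
    "qf": "title^3 skills^2 description",  # field boosts
    "pf": "title^5",                       # boost when the whole query matches as a phrase
    "pf2": "title^3",                      # boost for matching word pairs
    "pf3": "title^4",                      # boost for matching word triples
    "mm": "2<75%",                         # minimum should match
    "tie": "0.1",                          # tie breaker between field scores
    "bq": "employmentType:fulltime^1.5",   # additive boost query
}

query_string = urlencode(params)
print("/select?" + query_string)
```

Each of these values is a candidate for tuning, which is what makes the configuration space large enough to be worth searching automatically.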
  • 9. 9 Remove Noise Chars •  Ensure punctuation characters and plurality are removed from each field using the analysis chain •  ‘q=developer’ should match ‘developer,’, ‘developer.’, ‘developer’s’ and ‘developers’ When using Stemming and Synonyms – use Copy Fields + Edismax •  Use copy fields to apply stemming and synonyms to existing fields •  Allows different boosts to be applied to stemmed and synonym matches •  Set the field boosts lower on the stemmed and synonym copy fields Some General Tips on Relevancy Tuning
  • 10. 10 Use Boost Queries for Specific Query Use Cases •  Edismax bq parameter – allows boosting of matches to nested queries •  See chapter 7 of Relevant Search - good coverage of this strategy Make Good Use of Phrase Query Boosts •  Use pf, pf2 and pf3 parameters in edismax to give preference for multi-term matches •  pf2 and pf3 often give better performance than pf, which requires an exact match for all query terms Caveat Emptor: Monitor impact of these changes on query performance (QTime) and index size Some General Tips on Relevancy Tuning
  • 11. 11 •  To tune your search parameters, you can gather a dataset of relevancy judgements •  For a set of important queries, the dataset will contain a set of relevancy judgements with the top results returned annotated for relevancy •  This dataset can be collected using domain experts and a user interface designed for this task •  Commercial Examples: •  Quepid – developed by OpenSource Connections •  Fusion UI Relevancy Workbench – part of the Fusion offering from Lucidworks The ‘Golden’ Test Collection
  • 12. 12
  • 13. 13 •  An alternative to manually collecting relevancy judgements is to collect them directly from your users •  For each user search on the site, capture: •  User’s query, and timestamp •  Any filters applied •  Result impressions and clicks •  You can then turn this into a test collection by assuming that the results that people click on are more relevant than those they don’t •  The time spent on the results page is also a great indication of how relevant that result was to the original search Search Log Capture
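
One possible way to turn captured search logs into a test collection is sketched below. The log record layout (`query`, `impressions`, `clicks`) is an assumption made for this example, and treating any clicked result as relevant is the simplest reading of the slide; dwell time and filters are ignored here.

```python
from collections import defaultdict

def build_test_collection(log_records, min_impressions=1):
    """Map each normalized query to the set of documents users clicked on.

    log_records: iterable of dicts with hypothetical keys
      'query' (str), 'impressions' (list of doc ids), 'clicks' (list of doc ids).
    """
    impressions = defaultdict(lambda: defaultdict(int))
    clicks = defaultdict(lambda: defaultdict(int))
    for rec in log_records:
        query = rec["query"].strip().lower()
        for doc in rec["impressions"]:
            impressions[query][doc] += 1
        for doc in rec["clicks"]:
            clicks[query][doc] += 1

    collection = {}
    for query, shown in impressions.items():
        relevant = {
            doc for doc, count in shown.items()
            if count >= min_impressions and clicks[query][doc] > 0
        }
        if relevant:  # queries with no clicks carry no positive signal
            collection[query] = relevant
    return collection
```

Raising `min_impressions` is one way to discard documents that were shown too rarely for their click counts to mean much.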
  • 14. 14 •  Now you have a test collection, you can use that to tune your search engine configuration •  Using the test collection, you can measure the relevancy of a set of searches on that collection using some IR metrics, such as: •  MAP (Mean Average Precision) •  Precision at K (compute precision at the k’th document retrieved) •  NDCG (Normalized Discounted Cumulative Gain) •  Regression testing – this allows you to build a set of regression tests to ensure configuration changes both improve relevancy and don’t break certain queries •  Manually tuning search configurations is still a time consuming and inefficient process •  Is there a better way? Relevancy Tuning with a Test Collection
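
The metrics named on this slide can be sketched in a few lines. This is a minimal version using the textbook definitions, with binary judgements for precision and MAP and graded gains for NDCG; the function and argument names are chosen for this example.

```python
import math

def precision_at_k(relevant, ranked, k):
    """Fraction of the top k results that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def average_precision(relevant, ranked):
    """Mean of the precision values at each rank where a relevant doc appears."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(judgements, results):
    """judgements: query -> set of relevant docs; results: query -> ranked docs."""
    scores = [average_precision(rel, results.get(q, [])) for q, rel in judgements.items()]
    return sum(scores) / len(scores)

def ndcg_at_k(gains, ranked, k):
    """gains: doc -> graded relevance. DCG of the ranking divided by the ideal DCG."""
    dcg = sum(gains.get(doc, 0) / math.log2(rank + 1)
              for rank, doc in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

Running these over a fixed test collection before and after a configuration change gives the regression test described above.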
  • 15. 15 1.  Supervised Machine Learning? •  No - cannot optimize your search configuration without a computable gradient 2.  Grid Search? •  Perform a brute force search over the range of possible configuration parameters •  Very slow and inefficient – is not able to learn which ranges of settings work best 3.  Black Box Optimization Algorithms? •  Optimization algorithms exist that attempt to find the optimum value of an unknown function in as few iterations as possible •  Perform a much smarter search of the parameter space than grid search Automated Relevancy Tuning Approaches
  • 16. 16 •  Use an optimization algorithm to optimize a ‘black box’ function •  Black box function – provide the optimization algorithm with a function that takes a set of parameters as inputs and computes a score •  The black box algorithm will then try and choose parameter settings to optimize the score •  This can be thought of as a form of reinforcement learning •  These algorithms will intelligently search the space of possible search configurations to arrive at a solution •  Example algorithms include Bayesian Optimization, Simulated Annealing, and Genetic Algorithms (hence talk title) Black Box Optimization Algorithms
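
To make the idea concrete, here is a minimal evolutionary search written with only the standard library. It is a simplified stand-in for the algorithms listed on this slide, not the implementation used in the talk; `score_fn` is any black box that maps a parameter list to a score, and the population size, mutation width and selection scheme are arbitrary choices for the sketch.

```python
import random

def evolve(score_fn, bounds, pop_size=20, generations=30, sigma=0.1, seed=0):
    """Maximize score_fn over box-bounded parameters with mutation and elitism."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def mutate(individual):
        # Gaussian noise scaled to each parameter's range, clipped to the bounds.
        child = []
        for value, (lo, hi) in zip(individual, bounds):
            value += rng.gauss(0, sigma * (hi - lo))
            child.append(min(hi, max(lo, value)))
        return child

    scored = sorted(((score_fn(p), p) for p in (random_individual() for _ in range(pop_size))),
                    key=lambda pair: pair[0], reverse=True)
    history = [scored[0][0]]
    for _ in range(generations):
        survivors = scored[: pop_size // 2]            # keep the best half (elitism)
        parents = [p for _, p in survivors]
        children = [mutate(rng.choice(parents)) for _ in range(pop_size)]
        candidates = survivors + [(score_fn(c), c) for c in children]
        scored = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:pop_size]
        history.append(scored[0][0])

    best_score, best_params = scored[0]
    return best_params, best_score, history
```

Because the best half of each generation is carried over unchanged, the best score never gets worse from one generation to the next. Each call to `score_fn` would be a full run over the test collection, so the evaluation budget (`pop_size` times `generations`) is the main cost to watch.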
  • 17. 17 Example Black Box Function for Search Relevancy
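
The slide's original example is not preserved in this transcript. A black box function of the kind described might look like the sketch below: it takes a parameter set, runs every query in the test collection, and returns MAP at k. The `run_query` callable is a hypothetical hook that would issue the search (for example against Solr) and return a ranked list of document ids; normalizing by `min(len(relevant), k)` is one common convention for MAP at k, assumed here.

```python
def make_black_box(run_query, test_collection, k=5):
    """Build a function params -> MAP@k for use by an optimization algorithm.

    run_query(query, params) -> ranked list of doc ids (hypothetical hook).
    test_collection: dict mapping query -> set of relevant doc ids.
    """
    def black_box(params):
        total = 0.0
        for query, relevant in test_collection.items():
            ranked = run_query(query, params)[:k]
            hits, precision_sum = 0, 0.0
            for rank, doc in enumerate(ranked, start=1):
                if doc in relevant:
                    hits += 1
                    precision_sum += hits / rank
            if relevant:
                total += precision_sum / min(len(relevant), k)
        return total / len(test_collection)
    return black_box


# Toy stand-in for a search engine: each doc has a title and body match signal.
TOY_INDEX = {"d1": (1.0, 0.0), "d2": (0.0, 1.0)}

def toy_run_query(query, params):
    def score(doc):
        title, body = TOY_INDEX[doc]
        return params["title_boost"] * title + params["body_boost"] * body
    return sorted(TOY_INDEX, key=score, reverse=True)

black_box = make_black_box(toy_run_query, {"java": {"d1"}})
print(black_box({"title_boost": 2.0, "body_boost": 1.0}))
print(black_box({"title_boost": 1.0, "body_boost": 2.0}))
```

Keeping the search call behind `run_query` means the same scoring code can be pointed at a test index, a production replica, or a stub like the one above.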
  • 18. 18 •  There are some excellent mature libraries for doing this sort of thing e.g. •  DEAP - Distributed Evolutionary Algorithms in Python (hence talk title) •  Scikit Optimize – General optimization library built by a team at CERN headed by Tim Head •  These libraries are very easy to use, however getting them to optimize your search configuration is a little trickier •  They tend to work better when optimizing a small set of parameters at a time – 1 to 4 works well •  Achieved an improvement of 5% in MAP @ 5 for our MLT configuration. AB testing changes to search before EOY Making it Work
  • 19. 19 •  To optimize a large set of search parameters – start with the most important ones and optimize those while keeping the rest fixed •  If you are using search logs to optimize the search configuration, use a large number of searches (at least a few thousand) to ensure you are performing a robust enough test •  For most search collections of a reasonable size, running these optimizations over your search collection will take time – set it up on a server, parallelize where possible and leave it running overnight •  Typically you will want to allow the algorithm to try at least a few hundred variations of each parameter set to find a good range of settings •  Ideally – first optimize your search configuration against a set of relevancy judgements acquired from domain experts, deploy to production and use the search logs to further tune against your users’ search behavior Making it Work
  • 20. Use a Separate Testing Dataset to Validate Improvements
    •  As with any machine learning problem, it is essential to use one dataset to learn from and a second, separate dataset to validate your results – this guards against ‘overfitting’
    •  Overfitting in this context means that the search parameters are so over-tuned to your initial dataset that the search engine performs worse on new data than it did with the current configuration
    •  Once you have a set of configuration parameters that you are happy with, evaluate them on a second set of relevancy judgements to ensure the same performance gains are seen there too
    •  This applies to both manual and automatic tuning of the search engine configuration. Humans can overfit a dataset just as easily as an algorithm can
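The validation step described on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `evaluate(config, queries)` is a hypothetical scoring callback (for example, MAP over those queries), and the function names and the 30% test fraction are made up for the example.

```python
import random

def split_queries(queries, test_fraction=0.3, seed=42):
    """Shuffle the judged queries and split them into (train, test) lists."""
    shuffled = list(queries)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def tune_and_validate(evaluate, current, candidates, queries,
                      test_fraction=0.3, seed=42):
    """Pick the best candidate on the training queries, then keep it only
    if it also beats the current configuration on the held-out queries."""
    train, test = split_queries(queries, test_fraction, seed)
    # Tuning only ever sees the training queries.
    best = max(candidates, key=lambda config: evaluate(config, train))
    # An over-tuned candidate wins on train but not on test, so it is rejected.
    if evaluate(best, test) > evaluate(current, test):
        return best
    return current
```

The same check applies when the candidates come from manual tuning rather than from an optimizer.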
  • 21. Some Other Things to Try
    •  Auto-tune other Solr parameters – phrase slop, mm settings, similarity class used
    •  You can evolve an improved ranking function:
      •  Either tweak the settings of the existing ranking functions (see the SweetSpotSimilarityFactory class)
      •  Or use Genetic Programming to evolve a better ranking function for your dataset
    •  Genetic Programming is an evolutionary algorithm that can evolve programs and equations
    •  Some relevant papers, good introductory paper (but not very recent)
  • 22. Things to Consider
    •  Building a Machine Learned Ranking (MLR) system is a premature optimization if you haven’t first optimized your search configuration
    •  Relevancy tuning and MLR both primarily optimize for precision over recall, due to the nature of the training data**
    •  For techniques to improve recall, see conceptual / semantic search:
      •  Simon Hughes - “Conceptual Search” (Revolution 2015)
      •  Trey Grainger - “Enhancing Relevancy Through Personalization and Semantic Search” (Revolution 2013)
      •  Doug Turnbull and John Berryman - Chapter 11 of Relevant Search
  • 23. Feedback Loops – Dangers of Closed Loop Learning Systems
  • 24. Building a Machine Learning System
    [Diagram labels: Users, Interact with the System, Produce, Machine Learning, Model]
    1.  Users interact with the system to produce data
    2.  Machine learning algorithms turn that data into a model
    What happens if the model’s predictions influence the user’s behavior?
  • 25. Positive Feedback Loop
    [Diagram labels: Users, Interact with the System, Produce, Machine Learning, Model, Model changes behavior]
    1.  Users interact with the system to produce data
    2.  Machine learning algorithms turn that data into a model
    3.  Model changes user behavior, modifying its own future training data
  • 26. Preventing Positive Feedback Loops
    1.  Isolate a subset of data from being influenced by the model, and use this data to train the system
      •  E.g. leave a small proportion of user searches un-ranked by the MLR model
      •  E.g. generate a subset of recommendations at random, or by using an unsupervised model
    2.  Use a reinforcement learning model instead (such as a multi-armed bandit) - the system will dynamically adapt to the users’ behavior, balancing exploring different hypotheses with exploiting what it has learned to produce accurate predictions
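To make the second option more concrete, here is a minimal sketch of one simple multi-armed bandit strategy, epsilon-greedy. It is an illustration only, not the approach used at Dice: the class name, the two hypothetical ranking configurations and their simulated click-through rates are all assumptions.

```python
import random

class EpsilonGreedyBandit:
    """Explore a random arm with probability epsilon, otherwise exploit the best so far."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.pulls = {arm: 0 for arm in self.arms}
        self.mean_reward = {arm: 0.0 for arm in self.arms}

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)  # explore a different hypothesis
        # Exploit: pick the arm with the highest observed mean reward.
        return max(self.arms, key=lambda arm: self.mean_reward[arm])

    def update(self, arm, reward):
        self.pulls[arm] += 1
        # Incremental update of the running mean reward for this arm.
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / self.pulls[arm]

# Simulation with hypothetical click-through rates for two ranking configurations.
true_ctr = {"config_a": 0.1, "config_b": 0.3}
bandit = EpsilonGreedyBandit(true_ctr, epsilon=0.1, seed=0)
simulated_users = random.Random(1)
for _ in range(5000):
    arm = bandit.choose()
    reward = 1.0 if simulated_users.random() < true_ctr[arm] else 0.0
    bandit.update(arm, reward)
```

Because a fixed share of traffic is always served at random, the system keeps collecting data that its own current preference did not shape, which is what limits the feedback loop.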
  • 27. THE END
    •  Thank you for listening
    •  Any questions?