SlideShare une entreprise Scribd logo
1  sur  20
Boosting Documents in Solr by Recency, Popularity, and User Preferences Timothy Potter [email_address] , May 25, 2011
What I Will Cover ,[object Object],[object Object],[object Object]
My Background ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Boost documents by age ,[object Object],[object Object],[object Object]
Solr: Indexing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
FunctionQuery Basics ,[object Object],[object Object],[object Object],constant literal fieldvalue ord rord sum sub product pow abs log sqrt map scale query linear recip max min ms sqedist - Squared Euclidean Dist hsin, ghhsin - Haversine Formula geohash - Convert to geohash strdist
Solr: Query Time Boost ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Tune Solr recip function
Tips and Tricks ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],Boost by Popularity
Popularity Illustrated
Solr: ExternalFileField ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Popularity Boost: Nuts & Bolts Logs Solr Server User activity logged View Counting Job solr-home/data/ external_popularity a=1.114 b=1.05 c=1.111 … commit
Popularity Tips & Tricks ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Filtering By User Preferences ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Preferences Component ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Preferences Filter ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Preferences Filter in Action User Preferences Db Solr Server LRU Cache Preferences Component Update Preferences Query with pref.id=123 and pref.mod = TS pref.id & pref.mod If cached mod == pref.mod read from cache SQL to compute excluded categories sources and types
Wrap Up ,[object Object],[object Object],[object Object]
Contact ,[object Object],[object Object],[object Object],[object Object]

Contenu connexe

Tendances

Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...Sease
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash courseTommaso Teofili
 
Learning to Rank: From Theory to Production - Malvina Josephidou & Diego Cecc...
Learning to Rank: From Theory to Production - Malvina Josephidou & Diego Cecc...Learning to Rank: From Theory to Production - Malvina Josephidou & Diego Cecc...
Learning to Rank: From Theory to Production - Malvina Josephidou & Diego Cecc...Lucidworks
 
Sentiment Analysis Using Solr
Sentiment Analysis Using SolrSentiment Analysis Using Solr
Sentiment Analysis Using SolrPradeep Pujari
 
Search summit-2018-ltr-presentation
Search summit-2018-ltr-presentationSearch summit-2018-ltr-presentation
Search summit-2018-ltr-presentationSujit Pal
 
Near RealTime search @Flipkart
Near RealTime search @FlipkartNear RealTime search @Flipkart
Near RealTime search @FlipkartUmesh Prasad
 
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...Sease
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.jsHeeJung Hwang
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxData
 
Japanese Linguistics in Lucene and Solr
Japanese Linguistics in Lucene and Solr Japanese Linguistics in Lucene and Solr
Japanese Linguistics in Lucene and Solr lucenerevolution
 
Building a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and OntologiesBuilding a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and OntologiesNeo4j
 
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingApplied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingDatabricks
 
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScalePinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScaleSeunghyun Lee
 
SPARQL 사용법
SPARQL 사용법SPARQL 사용법
SPARQL 사용법홍수 허
 
Netflix Global Search - Lucene Revolution
Netflix Global Search - Lucene RevolutionNetflix Global Search - Lucene Revolution
Netflix Global Search - Lucene Revolutionivan provalov
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLDatabricks
 

Tendances (20)

Semantic search
Semantic searchSemantic search
Semantic search
 
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash course
 
Learning to Rank: From Theory to Production - Malvina Josephidou & Diego Cecc...
Learning to Rank: From Theory to Production - Malvina Josephidou & Diego Cecc...Learning to Rank: From Theory to Production - Malvina Josephidou & Diego Cecc...
Learning to Rank: From Theory to Production - Malvina Josephidou & Diego Cecc...
 
JSON in Solr: from top to bottom
JSON in Solr: from top to bottomJSON in Solr: from top to bottom
JSON in Solr: from top to bottom
 
Sentiment Analysis Using Solr
Sentiment Analysis Using SolrSentiment Analysis Using Solr
Sentiment Analysis Using Solr
 
Search summit-2018-ltr-presentation
Search summit-2018-ltr-presentationSearch summit-2018-ltr-presentation
Search summit-2018-ltr-presentation
 
Near RealTime search @Flipkart
Near RealTime search @FlipkartNear RealTime search @Flipkart
Near RealTime search @Flipkart
 
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js
차곡차곡 쉽게 알아가는 Elasticsearch와 Node.js
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
 
Japanese Linguistics in Lucene and Solr
Japanese Linguistics in Lucene and Solr Japanese Linguistics in Lucene and Solr
Japanese Linguistics in Lucene and Solr
 
Building a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and OntologiesBuilding a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and Ontologies
 
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingApplied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce Setting
 
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScalePinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
 
SPARQL 사용법
SPARQL 사용법SPARQL 사용법
SPARQL 사용법
 
Netflix Global Search - Lucene Revolution
Netflix Global Search - Lucene RevolutionNetflix Global Search - Lucene Revolution
Netflix Global Search - Lucene Revolution
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQL
 

En vedette

Implementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks EnterpriseImplementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks EnterpriseLucidworks (Archived)
 
Semantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrSemantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrTrey Grainger
 
Implementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks EnterpriseImplementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks EnterpriseLucidworks (Archived)
 
Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...
 Click-through relevance ranking in solr &  lucid works enterprise - By Andrz... Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...
Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...lucenerevolution
 
네이버 지식쇼핑과 아마존의 검색결과 페이지네비게이션 유형분석
네이버 지식쇼핑과 아마존의 검색결과 페이지네비게이션 유형분석네이버 지식쇼핑과 아마존의 검색결과 페이지네비게이션 유형분석
네이버 지식쇼핑과 아마존의 검색결과 페이지네비게이션 유형분석상욱 송
 
Apache Solr 4 Part 1 - Introduction, Features, Recency Ranking and Popularity...
Apache Solr 4 Part 1 - Introduction, Features, Recency Ranking and Popularity...Apache Solr 4 Part 1 - Introduction, Features, Recency Ranking and Popularity...
Apache Solr 4 Part 1 - Introduction, Features, Recency Ranking and Popularity...Ramzi Alqrainy
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Trey Grainger
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and TricksErik Hatcher
 
Twitter Search Architecture
Twitter Search Architecture Twitter Search Architecture
Twitter Search Architecture Ramez Al-Fayez
 
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJPYahoo!デベロッパーネットワーク
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Lucidworks
 
Building a Real-time Solr-powered Recommendation Engine
Building a Real-time Solr-powered Recommendation EngineBuilding a Real-time Solr-powered Recommendation Engine
Building a Real-time Solr-powered Recommendation Enginelucenerevolution
 
Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systemsTrey Grainger
 
Language support and linguistics in lucene solr & its eco system
Language support and linguistics in lucene solr & its eco systemLanguage support and linguistics in lucene solr & its eco system
Language support and linguistics in lucene solr & its eco systemlucenerevolution
 
Hierarchical data models in Relational Databases
Hierarchical data models in Relational DatabasesHierarchical data models in Relational Databases
Hierarchical data models in Relational Databasesnavicorevn
 
SearchLeeds, Tom Anthony 'The next trilion searches: Intelligent personal ass...
SearchLeeds, Tom Anthony 'The next trilion searches: Intelligent personal ass...SearchLeeds, Tom Anthony 'The next trilion searches: Intelligent personal ass...
SearchLeeds, Tom Anthony 'The next trilion searches: Intelligent personal ass...Branded3
 
South Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelSouth Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelTrey Grainger
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge GraphTrey Grainger
 
Reflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemReflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemTrey Grainger
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemTrey Grainger
 

En vedette (20)

Implementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks EnterpriseImplementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
 
Semantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrSemantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/Solr
 
Implementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks EnterpriseImplementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
 
Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...
 Click-through relevance ranking in solr &  lucid works enterprise - By Andrz... Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...
Click-through relevance ranking in solr &  lucid works enterprise - By Andrz...
 
네이버 지식쇼핑과 아마존의 검색결과 페이지네비게이션 유형분석
네이버 지식쇼핑과 아마존의 검색결과 페이지네비게이션 유형분석네이버 지식쇼핑과 아마존의 검색결과 페이지네비게이션 유형분석
네이버 지식쇼핑과 아마존의 검색결과 페이지네비게이션 유형분석
 
Apache Solr 4 Part 1 - Introduction, Features, Recency Ranking and Popularity...
Apache Solr 4 Part 1 - Introduction, Features, Recency Ranking and Popularity...Apache Solr 4 Part 1 - Introduction, Features, Recency Ranking and Popularity...
Apache Solr 4 Part 1 - Introduction, Features, Recency Ranking and Popularity...
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and Tricks
 
Twitter Search Architecture
Twitter Search Architecture Twitter Search Architecture
Twitter Search Architecture
 
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
 
Building a Real-time Solr-powered Recommendation Engine
Building a Real-time Solr-powered Recommendation EngineBuilding a Real-time Solr-powered Recommendation Engine
Building a Real-time Solr-powered Recommendation Engine
 
Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systems
 
Language support and linguistics in lucene solr & its eco system
Language support and linguistics in lucene solr & its eco systemLanguage support and linguistics in lucene solr & its eco system
Language support and linguistics in lucene solr & its eco system
 
Hierarchical data models in Relational Databases
Hierarchical data models in Relational DatabasesHierarchical data models in Relational Databases
Hierarchical data models in Relational Databases
 
SearchLeeds, Tom Anthony 'The next trilion searches: Intelligent personal ass...
SearchLeeds, Tom Anthony 'The next trilion searches: Intelligent personal ass...SearchLeeds, Tom Anthony 'The next trilion searches: Intelligent personal ass...
SearchLeeds, Tom Anthony 'The next trilion searches: Intelligent personal ass...
 
South Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelSouth Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis Panel
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge Graph
 
Reflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemReflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data system
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data Ecosystem
 

Similaire à Boosting Documents in Solr by Recency, Popularity and Personal Preferences - By Timothy Potter

Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)thelabdude
 
Dev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialDev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialSourcesense
 
Letting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search ComponentLetting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search ComponentJay Luker
 
Performance Tuning for Visualforce and Apex
Performance Tuning for Visualforce and ApexPerformance Tuning for Visualforce and Apex
Performance Tuning for Visualforce and ApexSalesforce Developers
 
Sumo Logic Cert Jam - Fundamentals
Sumo Logic Cert Jam - FundamentalsSumo Logic Cert Jam - Fundamentals
Sumo Logic Cert Jam - FundamentalsSumo Logic
 
Cloudera Movies Data Science Project On Big Data
Cloudera Movies Data Science Project On Big DataCloudera Movies Data Science Project On Big Data
Cloudera Movies Data Science Project On Big DataAbhishek M Shivalingaiah
 
Solr JDBC: Presented by Kevin Risden, Avalon Consulting
Solr JDBC: Presented by Kevin Risden, Avalon ConsultingSolr JDBC: Presented by Kevin Risden, Avalon Consulting
Solr JDBC: Presented by Kevin Risden, Avalon ConsultingLucidworks
 
Presentation Moss 2007 Usman
Presentation Moss 2007 UsmanPresentation Moss 2007 Usman
Presentation Moss 2007 UsmanUsman Zafar Malik
 
Solr JDBC - Lucene/Solr Revolution 2016
Solr JDBC - Lucene/Solr Revolution 2016Solr JDBC - Lucene/Solr Revolution 2016
Solr JDBC - Lucene/Solr Revolution 2016Kevin Risden
 
Level 2 Certification: Using Sumo Logic - Oct 2018
Level 2 Certification: Using Sumo Logic - Oct 2018Level 2 Certification: Using Sumo Logic - Oct 2018
Level 2 Certification: Using Sumo Logic - Oct 2018Sumo Logic
 
Portfolio Oversight With eazyBI
Portfolio Oversight With eazyBIPortfolio Oversight With eazyBI
Portfolio Oversight With eazyBIeazyBI
 
CrossRef How-to: A Technical Introduction to the Basics of CrossRef, Chuck Ko...
CrossRef How-to: A Technical Introduction to the Basics of CrossRef, Chuck Ko...CrossRef How-to: A Technical Introduction to the Basics of CrossRef, Chuck Ko...
CrossRef How-to: A Technical Introduction to the Basics of CrossRef, Chuck Ko...Crossref
 
AWS re:Invent 2016: Zillow Group: Developing Classification and Recommendatio...
AWS re:Invent 2016: Zillow Group: Developing Classification and Recommendatio...AWS re:Invent 2016: Zillow Group: Developing Classification and Recommendatio...
AWS re:Invent 2016: Zillow Group: Developing Classification and Recommendatio...Amazon Web Services
 
Nose Dive into Apache Spark ML
Nose Dive into Apache Spark MLNose Dive into Apache Spark ML
Nose Dive into Apache Spark MLAhmet Bulut
 
2018 data warehouse features in spark
2018   data warehouse features in spark2018   data warehouse features in spark
2018 data warehouse features in sparkChester Chen
 
Welcome Webinar Slides
Welcome Webinar SlidesWelcome Webinar Slides
Welcome Webinar SlidesSumo Logic
 
ImageSemantics : User-Generated Metadata, Content-Based Retrieval & Beyond
ImageSemantics : User-Generated Metadata, Content-Based Retrieval & BeyondImageSemantics : User-Generated Metadata, Content-Based Retrieval & Beyond
ImageSemantics : User-Generated Metadata, Content-Based Retrieval & BeyondRalf Klamma
 
Reproducible AI Using PyTorch and MLflow
Reproducible AI Using PyTorch and MLflowReproducible AI Using PyTorch and MLflow
Reproducible AI Using PyTorch and MLflowDatabricks
 
TestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingTestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingRTTS
 

Similaire à Boosting Documents in Solr by Recency, Popularity and Personal Preferences - By Timothy Potter (20)

Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)
 
Dev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialDev8d Apache Solr Tutorial
Dev8d Apache Solr Tutorial
 
Letting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search ComponentLetting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search Component
 
Performance Tuning for Visualforce and Apex
Performance Tuning for Visualforce and ApexPerformance Tuning for Visualforce and Apex
Performance Tuning for Visualforce and Apex
 
Sumo Logic Cert Jam - Fundamentals
Sumo Logic Cert Jam - FundamentalsSumo Logic Cert Jam - Fundamentals
Sumo Logic Cert Jam - Fundamentals
 
Cloudera Movies Data Science Project On Big Data
Cloudera Movies Data Science Project On Big DataCloudera Movies Data Science Project On Big Data
Cloudera Movies Data Science Project On Big Data
 
Solr JDBC: Presented by Kevin Risden, Avalon Consulting
Solr JDBC: Presented by Kevin Risden, Avalon ConsultingSolr JDBC: Presented by Kevin Risden, Avalon Consulting
Solr JDBC: Presented by Kevin Risden, Avalon Consulting
 
Presentation Moss 2007 Usman
Presentation Moss 2007 UsmanPresentation Moss 2007 Usman
Presentation Moss 2007 Usman
 
Solr Presentation
Solr PresentationSolr Presentation
Solr Presentation
 
Solr JDBC - Lucene/Solr Revolution 2016
Solr JDBC - Lucene/Solr Revolution 2016Solr JDBC - Lucene/Solr Revolution 2016
Solr JDBC - Lucene/Solr Revolution 2016
 
Level 2 Certification: Using Sumo Logic - Oct 2018
Level 2 Certification: Using Sumo Logic - Oct 2018Level 2 Certification: Using Sumo Logic - Oct 2018
Level 2 Certification: Using Sumo Logic - Oct 2018
 
Portfolio Oversight With eazyBI
Portfolio Oversight With eazyBIPortfolio Oversight With eazyBI
Portfolio Oversight With eazyBI
 
CrossRef How-to: A Technical Introduction to the Basics of CrossRef, Chuck Ko...
CrossRef How-to: A Technical Introduction to the Basics of CrossRef, Chuck Ko...CrossRef How-to: A Technical Introduction to the Basics of CrossRef, Chuck Ko...
CrossRef How-to: A Technical Introduction to the Basics of CrossRef, Chuck Ko...
 
AWS re:Invent 2016: Zillow Group: Developing Classification and Recommendatio...
AWS re:Invent 2016: Zillow Group: Developing Classification and Recommendatio...AWS re:Invent 2016: Zillow Group: Developing Classification and Recommendatio...
AWS re:Invent 2016: Zillow Group: Developing Classification and Recommendatio...
 
Nose Dive into Apache Spark ML
Nose Dive into Apache Spark MLNose Dive into Apache Spark ML
Nose Dive into Apache Spark ML
 
2018 data warehouse features in spark
2018   data warehouse features in spark2018   data warehouse features in spark
2018 data warehouse features in spark
 
Welcome Webinar Slides
Welcome Webinar SlidesWelcome Webinar Slides
Welcome Webinar Slides
 
ImageSemantics : User-Generated Metadata, Content-Based Retrieval & Beyond
ImageSemantics : User-Generated Metadata, Content-Based Retrieval & BeyondImageSemantics : User-Generated Metadata, Content-Based Retrieval & Beyond
ImageSemantics : User-Generated Metadata, Content-Based Retrieval & Beyond
 
Reproducible AI Using PyTorch and MLflow
Reproducible AI Using PyTorch and MLflowReproducible AI Using PyTorch and MLflow
Reproducible AI Using PyTorch and MLflow
 
TestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingTestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data Testing
 

Plus de lucenerevolution

Text Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and LuceneText Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and Lucenelucenerevolution
 
State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! lucenerevolution
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solrlucenerevolution
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationslucenerevolution
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloudlucenerevolution
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusterslucenerevolution
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiledlucenerevolution
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs lucenerevolution
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchlucenerevolution
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Stormlucenerevolution
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?lucenerevolution
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APIlucenerevolution
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucenelucenerevolution
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMlucenerevolution
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucenelucenerevolution
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenallucenerevolution
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside downlucenerevolution
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...lucenerevolution
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - finallucenerevolution
 

Plus de lucenerevolution (20)

Text Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and LuceneText Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and Lucene
 
State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here!
 
Search at Twitter
Search at TwitterSearch at Twitter
Search at Twitter
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solr
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloud
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST API
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucene
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenal
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside down
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - final
 

Dernier

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 

Dernier (20)

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 

Boosting Documents in Solr by Recency, Popularity and Personal Preferences - By Timothy Potter

  • 1. Boosting Documents in Solr by Recency, Popularity, and User Preferences Timothy Potter [email_address] , May 25, 2011
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8. Tune Solr recip function
  • 9.
  • 10.
  • 12.
  • 13. Popularity Boost: Nuts & Bolts Logs Solr Server User activity logged View Counting Job solr-home/data/ external_popularity a=1.114 b=1.05 c=1.111 … commit
  • 14.
  • 15.
  • 16.
  • 17.
  • 18. Preferences Filter in Action User Preferences Db Solr Server LRU Cache Preferences Component Update Preferences Query with pref.id=123 and pref.mod = TS pref.id & pref.mod If cached mod == pref.mod read from cache SQL to compute excluded categories sources and types
  • 19.
  • 20.

Notes de l'éditeur

  1. Attendees with come away from this presentation with a good understanding and access to source code for boosting and/or filtering documents by recency, popularity, and personal preferences. My solution improves upon the common "recip" based solution for boosting by document age. The framework also supports boosting documents by a popularity score, which is calculated and managed outside the index. I will present a few different ways to calculate popularity in a scalable manner. Lastly, my solution supports the concept of a personal document collection, where each user is only interested in a subset of the total number of documents in the index. My presentation will provide a good example of how to filter and/or boost results based on user preferences, which is a very common requirement of many Web applications.
  2. The one thing I’d like you to come away with today is confidence that Solr has powerful boosting capabilities built-in, but they require some fine-tuning and experimentation. Some simple recipes for complementing core Solr functionality to do: I. Boost documents by age (recency / freshness boost) II. Boost documents by popularity III. Filter results based on User Preferences (Personalized collection)
  3. Currently working at the National Renewable Energy Laboratory on building an infrastructure for storing and analyzing large volumes of smart grid related energy data using Hadoop technologies. Been doing search work for the past 5 years including a Lucene based search solution of eLearning content, Solr based solution for online magazine content and a FAST to Solr migration for a real estate portal. My other area of interest is in Mahout; I've contributed a few bug fixes and several pages on the wiki including working with Grant Ingersoll on benchmarking Mahout's distributed clustering algorithms in the Amazon cloud. Technical Blog: http://thelabdude.blogspot.com/ Currently working on JSF2 components for Solr.
  4. All other things being equal, more recent documents are better What’s not covered is how to determine if you should apply the boost. That’s a more in-depth topic that is the focus of academic research, especially in relation to Web search. News and most magazine articles Business documents – perhaps a less aggressive boost function identification of recency sensitive queries before ranking. see: http://technicallypossible.wordpress.com/2011/03/13/identifying-queries-which-demand-recency-sensitive-results-in-web-search/
  5. Careful! TrieFields make it more efficient to do range searches on numeric fields indexed at full precision, but it doesn't actually do anything to round the fields for people who genuinely want their stored and index values to only have second/minute/hour/day precision regardless of what the initial raw data looks like. Currently, Solr doesn't have anything built-in to round a date down to a different precision, such as minute / hour. Thus, you may need to do this yourself prior to indexing a document. see SOLR-741 // from commons DateUtils Date published = DateUtils.round(item.getPublishedOnDate(), Calendar.HOUR);
  6. Solr 1.4+ the recommended approach is to use the recip function with the ms function: There are approximately 3.16e10 milliseconds in a year, so one can scale dates to fractions of a year with the inverse, or 3.16e-11 recip(ms(NOW/HOUR,pubdate),3.16e-11,1,1) For standard query parser, you could do: q={!boost b=recip(ms(NOW/HOUR,pubdate),3.16e-11,1,1)}wine This uses the built-in boost function query. This uses a Lucene FieldCache under the covers on the pubdate field (stored in the index as long). The ms(NOW/HOUR) uses less precise measure of document age (rounding clause), which helps reduce memory consumption. Lessons: 1 - {!boost b=} syntax breaks spell-checking so you need to use spellcheck.q to be explicit 2 - Use edismax because it multiplies the boost whereas dismax adds "bf" 3 - Use a tdate field when indexing 4 - Use ms(NOW/HOUR) and less precision when indexing 5 - Use max(boost,0.20) - to bottom out the age penalty
  7. A reciprocal function with recip(x,m,a,b) implementing a/(m*x+b). m,a,b are constants, x is any numeric field or arbitrarily complex function. When a and b are equal, and x>=0, this function has a maximum value of 1 that drops as x increases. Increasing the value of a and b together results in a movement of the entire function to a flatter part of the curve. These properties can make this an ideal function for boosting more recent documents – see http://wiki.apache.org/solr/FunctionQuery
  8. identification of recency sensitive queries before ranking. see: http://technicallypossible.wordpress.com/2011/03/13/identifying-queries-which-demand-recency-sensitive-results-in-web-search/
  9. Score made of number of unique views in a time slot + avg rating / # of comments, etc. Must be computed outside of the index; refreshed periodically Probably don’t want to mix this with age boost as an older document might be really popular for some weird reason; think of old videos that become popular on YouTube Age – probably not as an old doc might get popular identification of recency sensitive queries before ranking. see: http://technicallypossible.wordpress.com/2011/03/13/identifying-queries-which-demand-recency-sensitive-results-in-web-search/
  10. Bar chart illustrates time slots Popularity score favors more recent content Document A is most popular; B was popular but is now on the decline and C has enjoyed consistent interest for a longer period but scores a little lower than A because of the recent interest in A
  11. Most likely use case would be to use log-file analysis > Ideal problem for MapReduce Question the audience – who has heard of MapReduce?