4. Mining user behaviour/preferences
Predict document relevance
Re-rank the search results
Compare different ranking functions (train/test)
Optimize ad performance
Query suggestions
How big are these logs?
◦ 10+ terabytes of entries each day
◦ Composed of billions of distinct (query, URL) pairs
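As a rough sketch of what one such log entry might contain (the field names here are hypothetical, not any engine's actual schema):

```python
from dataclasses import dataclass
from typing import List

# Hypothetical sketch of a single click-log record; real engines log
# richer (and differently named) fields.
@dataclass
class ClickLogEntry:
    session_id: str     # groups entries from one search session
    query: str          # the query string the user issued
    urls: List[str]     # ranked result URLs shown on the page
    clicked: List[int]  # result positions (0-based) that were clicked
    timestamp: float    # Unix time of the impression

entry = ClickLogEntry(
    session_id="s42",
    query="click models",
    urls=["a.example", "b.example", "c.example"],
    clicked=[0, 2],
    timestamp=1320969600.0,
)
print(entry.query, entry.clicked)
```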
5. Documents/results ranking factors
Many ranking factors are considered when ranking these results
Results are presented in order of their relevance to the query
Ranking factors depend on the query, the document, and the query-document pair
Improving ranking based on user preferences (likes/dislikes)
Personalized search + social search
Recency (temporal) ranking
7. # of clicks received
[CIKM'09 Tutorial]
8. Trust factor: preference for certain URLs over others,
e.g., wikipedia.com, stackoverflow.com, Yahoo Answers, about.com
What is missing (in previous models)?
Modelling the trust factor
Clicks on sponsored results
Related queries/searches (sidebars)
Realistic and flexible assumptions about user behaviour
11. [Flowchart of the per-result browsing decisions: "Snippet examine?" → "Snippet attractive?" (a click follows if yes) → "Enough utility?"; a Yes on utility ends the session, any No sends the user on to the next result or out of the session]
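A minimal simulation of this examine/attract/satisfy cascade, in the spirit of the flow above; the probability parameters and function names are illustrative assumptions, not values from the slides:

```python
import random

def simulate_session(p_examine, p_attract, p_satisfied, n_results=10, seed=None):
    """Simulate one user session over a ranked result list.

    p_examine:   probability of examining the next snippet
    p_attract:   per-position probability that a snippet attracts a click
    p_satisfied: per-position probability a clicked result gives enough utility
    Returns the list of clicked positions (0-based).
    """
    rng = random.Random(seed)
    clicks = []
    for pos in range(n_results):
        if rng.random() > p_examine:             # Examine snippet? No -> stop
            break
        if rng.random() < p_attract[pos]:        # Snippet attractive? Yes -> click
            clicks.append(pos)
            if rng.random() < p_satisfied[pos]:  # Enough utility? Yes -> end
                break
    return clicks

# Illustrative parameters: attractiveness decays with rank position.
attract = [0.6 / (i + 1) for i in range(10)]
satisfied = [0.5] * 10
print(simulate_session(0.9, attract, satisfied, seed=1))
```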
12. Realistic and flexible assumptions on user behaviour (session modelling)
Consider trust bias (trust factor)
Order the results for a particular query by the relevance scores predicted by the model
Compare this order to the editorial ranking
Is it a good model? Yes, if the two orderings agree to a considerable extent (one way to check this is sketched below)
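One plausible way to quantify this agreement is a rank correlation such as Kendall's tau between the model's ordering and the editorial ordering; the scores below are invented for illustration:

```python
from scipy.stats import kendalltau

# Hypothetical relevance scores predicted by a click model for 5 results,
# and the editorial (human-judged) grades for the same 5 results.
model_scores = [0.91, 0.40, 0.75, 0.20, 0.55]
editorial_grades = [3, 1, 2, 0, 2]

# Kendall's tau is 1.0 for identical orderings, -1.0 for reversed ones.
tau, p_value = kendalltau(model_scores, editorial_grades)
print(f"Kendall's tau = {tau:.3f} (p = {p_value:.3f})")
```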
13. Deploy this model as a feature/factor for predicting relevance in a learning-to-rank algorithm
Derive a retrieval/ranking function
Does it give metric gains over the baseline ranking function? If so, the model's insights can be used as a feature in the ranking function
Test the ranking function on different classes of queries for metric gains (a sketch follows below)
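A hedged sketch of this deployment step, appending the click-model score as one extra feature column before training a pointwise learning-to-rank model; the feature layout and the choice of scikit-learn's GradientBoostingRegressor are assumptions for illustration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical training data: 6 query-document pairs with 3 baseline
# features each, plus editorial relevance labels (0-3).
X_base = np.random.rand(6, 3)
labels = np.array([3, 1, 2, 0, 2, 1])

# Relevance scores predicted by the click model for the same pairs
# (made-up numbers), appended as an extra feature column.
click_model_score = np.array([0.9, 0.3, 0.7, 0.1, 0.6, 0.4])
X = np.hstack([X_base, click_model_score.reshape(-1, 1)])

# Pointwise learning to rank: regress the editorial label; documents
# are then ranked by the predicted score.
ranker = GradientBoostingRegressor().fit(X, labels)
print(ranker.predict(X))
```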
14. Metrics
• Discounted Cumulative Gain (DCG)
• Normalized DCG (NDCG) (see the sketch after this list)
• Precision
• Recall
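As a sketch, DCG@k is commonly computed as the sum of (2^rel_i - 1) / log2(i + 1) over the top k results, and NDCG divides by the DCG of the ideal ordering; the grades below are invented:

```python
import math

def dcg(relevances, k):
    """DCG@k with the (2^rel - 1) / log2(i + 1) gain/discount form (i is 1-based)."""
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(relevances[:k], start=1))

def ndcg(relevances, k):
    """NDCG@k: DCG normalized by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

# Invented editorial grades (0-3) for a ranked list of 5 results.
print(ndcg([3, 2, 3, 0, 1], k=5))
```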
Two types of data
1. Search click logs (from real or meta search engines)
2. The LETOR (LEarning TO Rank) benchmark dataset for information retrieval
15. [Figures: evaluation results from Guo et al., 2009 and Chapelle and Zhang, 2009]
David Green Blog. http://davidgreen.com/comparative-value-of-google-search-rankings (accessed 20 April 2011)
Fan Guo and Chao Liu. Statistical Models for Web Search Click Log Analysis. Tutorial at CIKM'09, 2009
Fan Guo, Chao Liu, and Yi-Min Wang. Efficient multiple-click models in web search. In Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM), Barcelona, Spain, pages 124-131. ACM, 9-11 February 2009
Olivier Chapelle and Ya Zhang. A dynamic Bayesian network click model for web search ranking. In Proceedings of the 18th International Conference on World Wide Web (WWW), Madrid, Spain, pages 1-10. ACM, 20-24 April 2009
Georges Dupret and Benjamin Piwowarski. A user browsing model to predict search engine click data from past observations. In Proceedings of the 31st International ACM SIGIR Conference (SIGIR), Singapore. ACM, 2008
Georges Dupret and Ciya Liao. A model to estimate intrinsic document relevance from the clickthrough logs of a web search engine. In Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM), New York, NY, USA. ACM, 2010
Click models discussed:
User Browsing Model (UBM) [Dupret and Piwowarski, 2008]
Dynamic Bayesian Network model (DBN) [Chapelle and Zhang, 2009]
Session Utility Model (SUM) [Dupret and Liao, 2010]
Independent Click Model (ICM) [Guo et al., 2009]
Dependent Click Model (DCM) [Guo et al., 2009]