Rank all the (geo) things!
@jsuchal
@SynopsiTV
How do you learn things?
Blogs, newsletters
Courses, training
Conferences
Work
Research papers?
WHY NOT?
“It’s not useful for the real-world.”
“I wouldn’t understand any of that.”
About me
PhD dropout FIIT STU Bratislava
foaf.sk, otvorenezmluvy.sk, govdata.sk
sme.sk news recommender
developer @ SynopsiTV
My workflow
MAGIC!
Search vs. recommender engine
Search engine
input: query
output: list of results
Recommendation engine
input: movie
output: list of similar movies
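
The two engines above differ mainly in what they take as input; both return a ranked list. A minimal sketch of that idea, assuming a toy in-memory catalog and tag-overlap (Jaccard) scoring. The catalog, the tags, and the function names are hypothetical and only illustrate the input/output contract, not how any production engine scores results.

```python
from typing import Dict, List, Set

# Hypothetical toy catalog: movie title -> set of descriptive tags.
CATALOG: Dict[str, Set[str]] = {
    "Alien": {"space", "horror", "monster"},
    "Aliens": {"space", "action", "monster"},
    "Notting Hill": {"romance", "comedy", "london"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(scored: List[tuple]) -> List[str]:
    """Order (score, title) pairs best first and drop zero-score items."""
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in ordered if score > 0]


def search(query: str, catalog: Dict[str, Set[str]]) -> List[str]:
    """Search engine: input is a query, output is a ranked list of results."""
    terms = set(query.lower().split())
    return rank([(jaccard(terms, tags), title) for title, tags in catalog.items()])


def recommend(movie: str, catalog: Dict[str, Set[str]]) -> List[str]:
    """Recommender: input is a movie, output is a ranked list of similar movies."""
    seed = catalog[movie]
    return rank([(jaccard(seed, tags), title)
                 for title, tags in catalog.items() if title != movie])
```

Calling `search("space monster", CATALOG)` and `recommend("Alien", CATALOG)` goes through the same ranking step; only the source of the "query" changes.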
Academic Mode
Accurately interpreting clickthrough
data as implicit feedback
Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. Accurately interpreting
clickthrough data as implicit feedback. In Proceedings of the 28th annual international ACM SIGIR conference on
Research and development in Information retrieval, SIGIR ’05, pages 154–161, New York, NY, USA, 2005. ACM.
Significant on two-tailed tests at a 95% confidence level!!!
Learning to Rank for Spatiotemporal
Search
Blake Shaw, Jon Shea, Siddhartha Sinha, and Andrew Hogue. 2013. Learning to rank for
spatiotemporal search. In Proceedings of the sixth ACM international conference on Web search and
data mining (WSDM '13). ACM, New York, NY, USA, 717-726.
Evaluation Metrics
● Mean Average Precision @ N
○ probability of the target result being in the top N items
● Mean Reciprocal Rank
○ 1 / rank of the target result, averaged over all queries
● Normalized Discounted Cumulative Gain
● Expected Reciprocal Rank
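
A minimal sketch of how three of the metrics listed above could be computed, assuming each query has exactly one target result whose 1-based rank is known. The function names are my own. `hit_at_n` implements the "target in top N" description given on the slide, and the NDCG variant uses linear gains with a log2 discount, which is only one of several common formulations. Expected Reciprocal Rank is left out because it needs a click-probability model.

```python
import math
from typing import Sequence


def hit_at_n(target_ranks: Sequence[int], n: int) -> float:
    """Fraction of queries whose target result appears in the top n items."""
    return sum(1 for rank in target_ranks if rank <= n) / len(target_ranks)


def mean_reciprocal_rank(target_ranks: Sequence[int]) -> float:
    """Average of 1 / rank of the target result over all queries."""
    return sum(1.0 / rank for rank in target_ranks) / len(target_ranks)


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain: position i (0-based) is discounted by log2(i + 2)."""
    return sum(gain / math.log2(i + 2) for i, gain in enumerate(gains))


def ndcg(gains: Sequence[float]) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0
```

For example, `target_ranks = [1, 2, 4]` describes three queries whose targets were shown at positions 1, 2, and 4; `gains` is the list of relevance grades of one result list in the order it was displayed.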
Optimizing search engines using
clickthrough data
Thorsten Joachims. Optimizing search engines using clickthrough data. In
Proceedings of the eighth ACM SIGKDD international conference on
Knowledge discovery and data mining, KDD ’02, pages 133–142, New York,
NY, USA, 2002. ACM.
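
The paper cited above turns click logs into pairwise preferences: a clicked result is treated as preferred over every unclicked result that was ranked above it. A minimal sketch of that extraction step, assuming one result page and a set of clicked items. The function name and data shapes are hypothetical, and the subsequent step of training a ranker on these pairs is not shown.

```python
from typing import List, Set, Tuple


def click_skip_above(ranking: List[str], clicked: Set[str]) -> List[Tuple[str, str]]:
    """Return (preferred, less_preferred) pairs derived from one result page.

    ranking: results in the order they were shown, best position first.
    clicked: the results the user clicked on.
    """
    pairs = []
    for position, doc in enumerate(ranking):
        if doc not in clicked:
            continue
        # Every unclicked result shown above a clicked one was "skipped".
        for above in ranking[:position]:
            if above not in clicked:
                pairs.append((doc, above))
    return pairs
```

Note that the output contains only relative judgments ("d3 is better than d2"), never absolute relevance labels, which is why it fits pairwise learning-to-rank methods.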
Query chains: learning to rank from
implicit feedback
Filip Radlinski and Thorsten
Joachims. Query chains: learning
to rank from implicit feedback. In
KDD ’05: Proceeding of the eleventh
ACM SIGKDD international
conference on Knowledge discovery
in data mining, pages 239–248,
New York, NY, USA, 2005. ACM.
On Caption Bias in Interleaving
Experiments
Katja Hofmann, Fritz Behr, and Filip Radlinski. On caption bias in interleaving
experiments. In Proceedings of the ACM Conference on Information and
Knowledge Management (CIKM), 2012.
Fighting Search Engine Amnesia:
Reranking Repeated Results
Milad Shokouhi, Ryen W. White, Paul Bennett, and Filip Radlinski. Fighting
search engine amnesia: reranking repeated results. In Proceedings of the
36th international ACM SIGIR conference on Research and development in
information retrieval, SIGIR ’13, pages 273–282, New York, NY, USA, 2013.
ACM.
In this paper, we observed that the same results are often shown to
users multiple times during search sessions. We showed that there are
a number of effects at play, which can be leveraged to improve information
retrieval performance. In particular, previously skipped results are much
less likely to be clicked, and previously clicked results may or may not
be re-clicked depending on other factors of the session.
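
One way the quoted observation could be used: demote results that the user has already seen and skipped earlier in the session. A minimal sketch, assuming the original ranker exposes a numeric score per result. The function name and the `skip_penalty` multiplier are hypothetical and not taken from the paper. Previously clicked results are left untouched, since the quote says their behaviour depends on other session factors.

```python
from typing import List, Set, Tuple


def rerank_repeated(
    results: List[Tuple[str, float]],
    skipped_before: Set[str],
    skip_penalty: float = 0.5,
) -> List[str]:
    """Demote results the user already saw and skipped earlier in the session.

    results: (url, base_score) pairs from the original ranker, best first.
    skipped_before: urls shown earlier in the session that were not clicked.
    skip_penalty: hypothetical multiplier in (0, 1]; an assumed value.
    """
    adjusted = [
        (score * skip_penalty if url in skipped_before else score, url)
        for url, score in results
    ]
    # sorted() is stable, so results with equal scores keep their original order.
    return [url for _, url in sorted(adjusted, key=lambda pair: -pair[0])]
```

In practice the penalty would be learned from logged sessions rather than fixed by hand.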
Challenges
Diversification
Group recommendations
Context-aware recommendations
● Time of day
● Device
● Mood
● Season
● Location
Serious recommenders and search?
Get in touch!
@synopsitv @jsuchal
