Recommender Systems
    Challenges
      Best Practices
     Tutorial & Panel


       ACM RecSys 2012
           Dublin
         September 10, 2012
About us
•   Alan Said - PhD Student @ TU-Berlin
    o   Topics: RecSys Evaluation
    o   @alansaid
    o   URL: www.alansaid.com


•   Domonkos Tikk - CEO @ Gravity R&D
    o   Topics: Machine Learning methods for RecSys
    o   @domonkostikk
    o   http://www.tmit.bme.hu/tikk.domonkos


•   Andreas Hotho - Prof. @ Uni. Würzburg
    o   Topics: Data Mining, Information Retrieval, Web Science
    o   http://www.is.informatik.uni-wuerzburg.de/staff/hotho
General Motivation
"RecSys is nobody's home conference. We
  come from CHI, IUI, SIGIR, etc."
  Joe Konstan - RecSys 2010


RecSys is our home conference - we
should evaluate accordingly!
Outline
•   Tutorial
    o Introduction to concepts in challenges
    o Execution of a challenge
    o Conclusion

•   Panel
      Experiences of participating in and
      organizing challenges
         Yehuda Koren
         Darren Vengroff
         Torben Brodt
What is the motivation
for RecSys Challenges?
          Part 1
Setup - information overload
[Diagram: a recommender sits between the users and the content of the service provider]
Motivation of stakeholders
•   User
    o   find relevant content
    o   easy navigation
    o   serendipity, discovery
•   Service
    o   increase revenue
    o   target user with the right content
    o   engage users
•   Recommender
    o   facilitate goals of stakeholders
    o   get recognized
Evaluation in terms of the business
[Diagram labels: business reporting; online evaluation (A/B test); casting into a research problem]
Context of the contest
•   Selection of metrics
•   Domain dependent
•   Offline vs. online evaluation


•   IR centric evaluation
     o RMSE
     o MAP
     o F1
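Of the measures listed above, MAP is the one whose definition is most often left implicit. The following is a minimal sketch of MAP@k, not the scoring code of any particular challenge. The function names and the input layout (a dict of ranked lists and a dict of relevant item sets) are assumptions made for illustration, as is the choice to normalize by min(number of relevant items, k); some challenges normalize differently.

```python
from typing import Dict, Hashable, Iterable, Sequence


def average_precision_at_k(ranked: Sequence[Hashable],
                           relevant: Iterable[Hashable],
                           k: int) -> float:
    """AP@k for one user: average of precision@rank over the ranks that hit."""
    relevant_set = set(relevant)
    if not relevant_set or k <= 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    seen = set()
    for rank, item in enumerate(ranked[:k], start=1):
        # Count each relevant item at most once, even if it is repeated.
        if item in relevant_set and item not in seen:
            hits += 1
            precision_sum += hits / rank
        seen.add(item)
    # Assumption: normalize by the best achievable number of hits in the top k.
    return precision_sum / min(len(relevant_set), k)


def mean_average_precision_at_k(rankings: Dict[Hashable, Sequence[Hashable]],
                                ground_truth: Dict[Hashable, Iterable[Hashable]],
                                k: int) -> float:
    """MAP@k: mean of AP@k over all users that have ground truth."""
    users = list(ground_truth)
    if not users:
        return 0.0
    total = sum(average_precision_at_k(rankings.get(u, []), ground_truth[u], k)
                for u in users)
    return total / len(users)
```

Users without any submitted ranking contribute an AP of zero here, which penalizes incomplete submissions; an organizer may prefer a different rule and should state it up front.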
Latent user needs
Recsys Competition Highlights
                          •   Large scale
                          •   Organization
                          •   RMSE
•   3-stage setup         •   Prize
•   selection by review
•   runtime limits
•   real traffic
•   revenue increase
                          •   offline
                          •   MAP@500
                          •   metadata available
                          •   larger in dimensions
                          •   no ratings
Recurring Competitions
•   ACM KDD Cup (2007, 2011, 2012)
•   ECML/PKDD Discovery Challenge (2008
    onwards)
    o 2008 and 2009: tag recommendation in social
      bookmarking (incl. online evaluation task)
    o 2011: video lectures
•   CAMRa (2010, 2011, 2012)
Does size matter?
•   Yes! – real world users
•   In research – to some extent
Research & Industry
Important for both
• Industry has the data and research needs
  data
• Industry needs better approaches, but developing
  them is costly
• Research has ideas but has no systems
  and/or data to do the evaluation

Don't exploit participants
Don't be too greedy
Running a Challenge
       Part 2
Standard Challenge Setting
•   organizer defines the recommender setting e.g.
    tag recommendation in BibSonomy
•   provide data
    o   with features or
    o   raw data
    o   construct your own data
•   fix the way to do the evaluation
•   define the goal e.g. reach a certain
    improvement (F1)
•   motivate people to participate:
    e.g. promise a lot of money ;-)
Typical contest settings
 •   offline
     o   everyone gets access to the dataset
     o   in principle it is a prediction task; the user can't be influenced
     o   privacy of the user within the data is a big issue
     o   results from offline experimentation have limited predictive power
         for online user behavior

 •   online
     o   after a first learning phase the recommender is plugged into a real
         system
     o   user can be influenced but only by the selected system
     o   comparison of different systems is not completely fair

 •   further ways
     o   user study
Example online setting
(BibSonomy)




BALBY MARINHO, L. ; HOTHO, A. ; JÄSCHKE, R. ; NANOPOULOS, A. ; RENDLE, S. ; SCHMIDT-THIEME, L. ; STUMME, G. ; SYMEONIDIS, P.:
Recommender Systems for Social Tagging Systems : SPRINGER, 2012 (SpringerBriefs in Electrical and Computer Engineering). - ISBN 978-1-4614-1893-1
Which evaluation measures?
•   Root Mean Squared Error (RMSE)
•   Mean Absolute Error (MAE)
•   Typical IR measures
    o   precision @ n-items
    o   recall @ n-items
    o   False Positive Rate
    o   F1 @ n-items
    o   Area Under the ROC Curve (AUC)
•   non-quality measures
    o   server answer time
    o   understandability of the results
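As a rough illustration of the error-based and top-n measures listed above, here is a minimal Python sketch. The function names and input shapes are assumptions for this example only; the ranked list is assumed to contain no duplicate items, and precision is computed against the cut-off n even when fewer than n items were recommended.

```python
import math
from typing import Hashable, Iterable, Sequence, Tuple


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root Mean Squared Error over paired rating predictions."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def mae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean Absolute Error over paired rating predictions."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def precision_recall_f1_at_n(ranked: Sequence[Hashable],
                             relevant: Iterable[Hashable],
                             n: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 computed on the first n recommended items."""
    relevant_set = set(relevant)
    hits = sum(1 for item in ranked[:n] if item in relevant_set)
    precision = hits / n if n > 0 else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

RMSE and MAE need explicit ratings for every test pair, while the top-n measures only need a set of relevant items per user, which is one reason the two families can rank the same systems differently.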
Discussion of measures?
    RMSE - Precision
• RMSE is not necessarily the king of metrics
    as RMSE is easy to optimize on
•   What about Top-n?
•   but RMSE is not influenced by item popularity
    the way top-n measures are

• What about user-centric stuff?
• Ranking-based measure in KDD Cup 2011,
    Track 2
Results influenced by ...

•   target of the recommendation (user, resources, etc...)
•   evaluation methodology (leave-one-out, time based split, random
    sample, cross validation)
•   evaluation measure
•   design of the application (online setting)
•   the selected part of the data and its preprocessing (e.g.
    p-core vs. long tail)
•   scalability vs. quality of the model
•   features and content accessible and usable for the
    recommendation
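Two of the factors above, the evaluation methodology and the preprocessing of the data, can be sketched in a few lines. This is an illustrative sketch only: it assumes interactions are (user, item, timestamp) tuples, the function names are made up for this example, and the p-core is simplified to user-item pairs (in tagging data it is usually defined over users, resources and tags).

```python
from collections import Counter
from typing import List, Tuple

Interaction = Tuple[str, str, int]  # assumed layout: (user, item, timestamp)


def time_based_split(interactions: List[Interaction],
                     split_time: int) -> Tuple[List[Interaction], List[Interaction]]:
    """Train on everything before split_time, test on everything from it on."""
    train = [row for row in interactions if row[2] < split_time]
    test = [row for row in interactions if row[2] >= split_time]
    return train, test


def leave_one_out_split(interactions: List[Interaction]
                        ) -> Tuple[List[Interaction], List[Interaction]]:
    """Hold out the most recent interaction of every user as the test set."""
    latest_index = {}
    for i, (user, _item, ts) in enumerate(interactions):
        if user not in latest_index or ts > interactions[latest_index[user]][2]:
            latest_index[user] = i
    held_out = set(latest_index.values())
    train = [row for i, row in enumerate(interactions) if i not in held_out]
    test = [row for i, row in enumerate(interactions) if i in held_out]
    return train, test


def p_core(interactions: List[Interaction], p: int) -> List[Interaction]:
    """Repeatedly drop users and items with fewer than p interactions."""
    current = list(interactions)
    while True:
        user_counts = Counter(user for user, _item, _ts in current)
        item_counts = Counter(item for _user, item, _ts in current)
        kept = [row for row in current
                if user_counts[row[0]] >= p and item_counts[row[1]] >= p]
        if len(kept) == len(current):
            return kept  # fixed point reached: every user and item has >= p rows
        current = kept
```

Running the same recommender on the full data and on its p-core, or under a time-based split and under leave-one-out, will typically give different scores, so a challenge should fix these choices in its rules.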
Don't forget..
• organizing a challenge takes a lot of effort
• preparing data takes time
• answering questions takes even more time
• participants are creative, so organizers need to react
• time to compute the evaluation and check the
    results
•   prepare proceedings with the outcome
•   ...
What have we learnt?
    Conclusion
        Part 3
Challenges are good since they...
•   ... are focused on solving a single problem
•   ... have many participants
•   ... create common evaluation criteria
•   ... have comparable results
•   ... bring real-world problems to research
•   ... make it easy to crown a winner
•   ... are cheap (even with a $1M prize)
Is that the complete truth?
No!
•   Why?
Because using standard information retrieval metrics we
cannot evaluate recommender system concepts like:
    • user interaction
    • perception
    • satisfaction
    • usefulness
    • any metric not based on accuracy/rating prediction
      and negative predictions
    • scalability
    • engineering
We can't catch everything offline
•   Scalability
•   Presentation
•   Interaction
The difference between IR and RS
Information retrieval systems answer a need: a query
Recommender systems identify the user's needs
Should we organize more
challenges?
•   Yes - but before we do that, think about:
    o What is the utility of Yet Another Dataset - aren't
      there enough already?
    o How do we create a real-world-like challenge?
    o How do we get real user feedback?
Take home message
•   Real needs of users and content providers are better
    reflected in online evaluation

•   Consider technical limitations as well

•   Challenges advance the field a lot
    o Matrix factorization & ensemble methods in the
      Netflix Prize
    o Evaluation measure and objective in the KDD Cup
      2011
Related events at RecSys
•   Workshops
    o   Recommender Utility Evaluation
    o   RecSys Data Challenge
•   Paper Sessions
    o Multi-Objective Recommendation and Human
      Factors - Mon. 14:30
    o Implicit Feedback and User Preference - Tue. 11:00
    o Top-N Recommendation - Wed. 14:30

•   More challenges:
    o   www.recsyswiki.com/wiki/Category:Competition
Panel
Part 4
Panel
•   Torben Brodt
    o   Plista
    o   Organizing Plista Contest

•   Yehuda Koren
    o   Google
    o   Member of winning team of the Netflix Prize


•   Darren Vengroff
    o   RichRelevance
    o   Organizer of RecLab Prize
Questions
•   How does recommendation influence the
    user and system?
•   How can we quantify the effects of the UI?
•   How should we translate what we've
    presented into an actual challenge?
•   Should we focus on the long tail or the short
    head?
•   Evaluation measures, click rate, wtf@k
•   How to evaluate conversion rate?

Contenu connexe

Tendances

Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Xavier Amatriain
 
Information Retrieval Models for Recommender Systems - PhD slides
Information Retrieval Models for Recommender Systems - PhD slidesInformation Retrieval Models for Recommender Systems - PhD slides
Information Retrieval Models for Recommender Systems - PhD slidesDaniel Valcarce
 
An Example of Predictive Analytics: Building a Recommendation Engine Using Py...
An Example of Predictive Analytics: Building a Recommendation Engine Using Py...An Example of Predictive Analytics: Building a Recommendation Engine Using Py...
An Example of Predictive Analytics: Building a Recommendation Engine Using Py...PyData
 
Kdd 2014 Tutorial - the recommender problem revisited
Kdd 2014 Tutorial -  the recommender problem revisitedKdd 2014 Tutorial -  the recommender problem revisited
Kdd 2014 Tutorial - the recommender problem revisitedXavier Amatriain
 
Aiinpractice2017deepaklongversion
Aiinpractice2017deepaklongversionAiinpractice2017deepaklongversion
Aiinpractice2017deepaklongversionDeepak Agarwal
 
Social Recommender Systems
Social Recommender SystemsSocial Recommender Systems
Social Recommender Systemsguest77b0cd12
 
Product Recommendations Enhanced with Reviews
Product Recommendations Enhanced with ReviewsProduct Recommendations Enhanced with Reviews
Product Recommendations Enhanced with Reviewsmaranlar
 
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix ScaleQcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix ScaleXavier Amatriain
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringViet-Trung TRAN
 
Recsys2016 Tutorial by Xavier and Deepak
Recsys2016 Tutorial by Xavier and DeepakRecsys2016 Tutorial by Xavier and Deepak
Recsys2016 Tutorial by Xavier and DeepakDeepak Agarwal
 
Collaborative Filtering using KNN
Collaborative Filtering using KNNCollaborative Filtering using KNN
Collaborative Filtering using KNNŞeyda Hatipoğlu
 
Data Mining and Recommendation Systems
Data Mining and Recommendation SystemsData Mining and Recommendation Systems
Data Mining and Recommendation SystemsSalil Navgire
 
Survey of Recommendation Systems
Survey of Recommendation SystemsSurvey of Recommendation Systems
Survey of Recommendation Systemsyoualab
 
[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systemsFalitokiniaina Rabearison
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender systemStanley Wang
 
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender SystemsDavid Zibriczky
 
Recommendation System for Design Patterns in Software Development
Recommendation System for Design Patterns in Software DevelopmentRecommendation System for Design Patterns in Software Development
Recommendation System for Design Patterns in Software DevelopmentFrancis Palma
 

Tendances (20)

Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
 
Information Retrieval Models for Recommender Systems - PhD slides
Information Retrieval Models for Recommender Systems - PhD slidesInformation Retrieval Models for Recommender Systems - PhD slides
Information Retrieval Models for Recommender Systems - PhD slides
 
An Example of Predictive Analytics: Building a Recommendation Engine Using Py...
An Example of Predictive Analytics: Building a Recommendation Engine Using Py...An Example of Predictive Analytics: Building a Recommendation Engine Using Py...
An Example of Predictive Analytics: Building a Recommendation Engine Using Py...
 
Kdd 2014 Tutorial - the recommender problem revisited
Kdd 2014 Tutorial -  the recommender problem revisitedKdd 2014 Tutorial -  the recommender problem revisited
Kdd 2014 Tutorial - the recommender problem revisited
 
Aiinpractice2017deepaklongversion
Aiinpractice2017deepaklongversionAiinpractice2017deepaklongversion
Aiinpractice2017deepaklongversion
 
Social Recommender Systems
Social Recommender SystemsSocial Recommender Systems
Social Recommender Systems
 
Product Recommendations Enhanced with Reviews
Product Recommendations Enhanced with ReviewsProduct Recommendations Enhanced with Reviews
Product Recommendations Enhanced with Reviews
 
Recommender system
Recommender systemRecommender system
Recommender system
 
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix ScaleQcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filtering
 
Recsys2016 Tutorial by Xavier and Deepak
Recsys2016 Tutorial by Xavier and DeepakRecsys2016 Tutorial by Xavier and Deepak
Recsys2016 Tutorial by Xavier and Deepak
 
Collaborative Filtering using KNN
Collaborative Filtering using KNNCollaborative Filtering using KNN
Collaborative Filtering using KNN
 
Data Mining and Recommendation Systems
Data Mining and Recommendation SystemsData Mining and Recommendation Systems
Data Mining and Recommendation Systems
 
Survey of Recommendation Systems
Survey of Recommendation SystemsSurvey of Recommendation Systems
Survey of Recommendation Systems
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender system
 
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender Systems
 
Recommendation System for Design Patterns in Software Development
Recommendation System for Design Patterns in Software DevelopmentRecommendation System for Design Patterns in Software Development
Recommendation System for Design Patterns in Software Development
 

Similaire à Best Practices in Recommender System Challenges

PAS: The Planning Quality Framework
PAS: The Planning Quality FrameworkPAS: The Planning Quality Framework
PAS: The Planning Quality FrameworkPAS_Team
 
Establishing best practices to improve usefulness and usability of web interf...
Establishing best practices to improve usefulness and usability of web interf...Establishing best practices to improve usefulness and usability of web interf...
Establishing best practices to improve usefulness and usability of web interf...DRIscience
 
ECIR Recommendation Challenges
ECIR Recommendation ChallengesECIR Recommendation Challenges
ECIR Recommendation ChallengesDaniel Kohlsdorf
 
Value stream mapping for complex processes (innovation, Lean, service design)
Value stream mapping for complex processes (innovation, Lean, service design) Value stream mapping for complex processes (innovation, Lean, service design)
Value stream mapping for complex processes (innovation, Lean, service design) Teemu Toivonen
 
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Alan Said
 
Building Search and Personalization at Nordstrom Rack | Hautelook
Building Search and Personalization at Nordstrom Rack | HautelookBuilding Search and Personalization at Nordstrom Rack | Hautelook
Building Search and Personalization at Nordstrom Rack | HautelookLucidworks
 
Unlocking the value of customer data
Unlocking the value of customer dataUnlocking the value of customer data
Unlocking the value of customer dataJanessa Lantz
 
7.1 Mapping Your Processes to Deliver an Exceptional Student Experience
7.1 Mapping Your Processes to Deliver an Exceptional Student Experience7.1 Mapping Your Processes to Deliver an Exceptional Student Experience
7.1 Mapping Your Processes to Deliver an Exceptional Student ExperienceTargetX
 
Downloads abc 2006 presentation downloads-ramesh_babu
Downloads abc 2006   presentation downloads-ramesh_babuDownloads abc 2006   presentation downloads-ramesh_babu
Downloads abc 2006 presentation downloads-ramesh_babuHem Rana
 
Advanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryAdvanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryMark Constable
 
Human computation, crowdsourcing and social: An industrial perspective
Human computation, crowdsourcing and social: An industrial perspectiveHuman computation, crowdsourcing and social: An industrial perspective
Human computation, crowdsourcing and social: An industrial perspectiveoralonso
 
Knowledge Management in Healthcare Analytics
Knowledge Management in Healthcare AnalyticsKnowledge Management in Healthcare Analytics
Knowledge Management in Healthcare AnalyticsGregory Nelson
 
Crafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeCrafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeArushi Prakash, Ph.D.
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyMaya Hristakeva
 
Group 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptxGroup 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptxellamangapis2003
 
Usability Testing for Qualitative Researchers - QRCA NYC Chapter event
Usability Testing for Qualitative Researchers - QRCA NYC Chapter eventUsability Testing for Qualitative Researchers - QRCA NYC Chapter event
Usability Testing for Qualitative Researchers - QRCA NYC Chapter eventKay Aubrey
 

Similaire à Best Practices in Recommender System Challenges (20)

PAS: The Planning Quality Framework
PAS: The Planning Quality FrameworkPAS: The Planning Quality Framework
PAS: The Planning Quality Framework
 
Establishing best practices to improve usefulness and usability of web interf...
Establishing best practices to improve usefulness and usability of web interf...Establishing best practices to improve usefulness and usability of web interf...
Establishing best practices to improve usefulness and usability of web interf...
 
ECIR Recommendation Challenges
ECIR Recommendation ChallengesECIR Recommendation Challenges
ECIR Recommendation Challenges
 
Value stream mapping for complex processes (innovation, Lean, service design)
Value stream mapping for complex processes (innovation, Lean, service design) Value stream mapping for complex processes (innovation, Lean, service design)
Value stream mapping for complex processes (innovation, Lean, service design)
 
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
 
Building Search and Personalization at Nordstrom Rack | Hautelook
Building Search and Personalization at Nordstrom Rack | HautelookBuilding Search and Personalization at Nordstrom Rack | Hautelook
Building Search and Personalization at Nordstrom Rack | Hautelook
 
Unlocking the value of customer data
Unlocking the value of customer dataUnlocking the value of customer data
Unlocking the value of customer data
 
Dlf 2012
Dlf 2012Dlf 2012
Dlf 2012
 
7.1 Mapping Your Processes to Deliver an Exceptional Student Experience
7.1 Mapping Your Processes to Deliver an Exceptional Student Experience7.1 Mapping Your Processes to Deliver an Exceptional Student Experience
7.1 Mapping Your Processes to Deliver an Exceptional Student Experience
 
Downloads abc 2006 presentation downloads-ramesh_babu
Downloads abc 2006   presentation downloads-ramesh_babuDownloads abc 2006   presentation downloads-ramesh_babu
Downloads abc 2006 presentation downloads-ramesh_babu
 
Advanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryAdvanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project Delivery
 
Human computation, crowdsourcing and social: An industrial perspective
Human computation, crowdsourcing and social: An industrial perspectiveHuman computation, crowdsourcing and social: An industrial perspective
Human computation, crowdsourcing and social: An industrial perspective
 
Knowledge Management in Healthcare Analytics
Knowledge Management in Healthcare AnalyticsKnowledge Management in Healthcare Analytics
Knowledge Management in Healthcare Analytics
 
The art of project estimation
The art of project estimationThe art of project estimation
The art of project estimation
 
PQF Overview
PQF OverviewPQF Overview
PQF Overview
 
Ch 3
Ch   3Ch   3
Ch 3
 
Crafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeCrafting a Compelling Data Science Resume
Crafting a Compelling Data Science Resume
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in Mendeley
 
Group 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptxGroup 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptx
 
Usability Testing for Qualitative Researchers - QRCA NYC Chapter event
Usability Testing for Qualitative Researchers - QRCA NYC Chapter eventUsability Testing for Qualitative Researchers - QRCA NYC Chapter event
Usability Testing for Qualitative Researchers - QRCA NYC Chapter event
 

Plus de Alan Said

Replication of Recommender Systems Research
Replication of Recommender Systems ResearchReplication of Recommender Systems Research
Replication of Recommender Systems ResearchAlan Said
 
The Magic Barrier of Recommender Systems - No Magic, Just Ratings
The Magic Barrier of Recommender Systems - No Magic, Just RatingsThe Magic Barrier of Recommender Systems - No Magic, Just Ratings
The Magic Barrier of Recommender Systems - No Magic, Just RatingsAlan Said
 
A Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
A Top-N Recommender System Evaluation Protocol Inspired by Deployed SystemsA Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
A Top-N Recommender System Evaluation Protocol Inspired by Deployed SystemsAlan Said
 
Information Retrieval and User-centric Recommender System Evaluation
Information Retrieval and User-centric Recommender System EvaluationInformation Retrieval and User-centric Recommender System Evaluation
Information Retrieval and User-centric Recommender System EvaluationAlan Said
 
User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Reco...
User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Reco...User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Reco...
User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Reco...Alan Said
 
A 3D Approach to Recommender System Evaluation
A 3D Approach to Recommender System EvaluationA 3D Approach to Recommender System Evaluation
A 3D Approach to Recommender System EvaluationAlan Said
 
State of RecSys: Recap of RecSys 2012
State of RecSys: Recap of RecSys 2012State of RecSys: Recap of RecSys 2012
State of RecSys: Recap of RecSys 2012Alan Said
 
RecSysChallenge Opening
RecSysChallenge OpeningRecSysChallenge Opening
RecSysChallenge OpeningAlan Said
 
Estimating the Magic Barrier of Recommender Systems: A User Study
Estimating the Magic Barrier of Recommender Systems: A User StudyEstimating the Magic Barrier of Recommender Systems: A User Study
Estimating the Magic Barrier of Recommender Systems: A User StudyAlan Said
 
Users and Noise: The Magic Barrier of Recommender Systems
Users and Noise: The Magic Barrier of Recommender SystemsUsers and Noise: The Magic Barrier of Recommender Systems
Users and Noise: The Magic Barrier of Recommender SystemsAlan Said
 
Analyzing Weighting Schemes in Collaborative Filtering: Cold Start, Post Cold...
Analyzing Weighting Schemes in Collaborative Filtering: Cold Start, Post Cold...Analyzing Weighting Schemes in Collaborative Filtering: Cold Start, Post Cold...
Analyzing Weighting Schemes in Collaborative Filtering: Cold Start, Post Cold...Alan Said
 
CaRR 2012 Opening Presentation
CaRR 2012 Opening PresentationCaRR 2012 Opening Presentation
CaRR 2012 Opening PresentationAlan Said
 
Personalizing Tags: A Folksonomy-like Approach for Recommending Movies
Personalizing Tags: A Folksonomy-like Approach for Recommending MoviesPersonalizing Tags: A Folksonomy-like Approach for Recommending Movies
Personalizing Tags: A Folksonomy-like Approach for Recommending MoviesAlan Said
 
Inferring Contextual User Profiles - Improving Recommender Performance
Inferring Contextual User Profiles - Improving Recommender PerformanceInferring Contextual User Profiles - Improving Recommender Performance
Inferring Contextual User Profiles - Improving Recommender PerformanceAlan Said
 
Using Social- and Pseudo-Social Networks to Improve Recommendation Quality
Using Social- and Pseudo-Social Networks to Improve Recommendation QualityUsing Social- and Pseudo-Social Networks to Improve Recommendation Quality
Using Social- and Pseudo-Social Networks to Improve Recommendation QualityAlan Said
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender SystemsAlan Said
 

Plus de Alan Said (16)

Replication of Recommender Systems Research
Replication of Recommender Systems ResearchReplication of Recommender Systems Research
Replication of Recommender Systems Research
 
The Magic Barrier of Recommender Systems - No Magic, Just Ratings
The Magic Barrier of Recommender Systems - No Magic, Just RatingsThe Magic Barrier of Recommender Systems - No Magic, Just Ratings
The Magic Barrier of Recommender Systems - No Magic, Just Ratings
 
A Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
A Top-N Recommender System Evaluation Protocol Inspired by Deployed SystemsA Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
A Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
 
Information Retrieval and User-centric Recommender System Evaluation
Information Retrieval and User-centric Recommender System EvaluationInformation Retrieval and User-centric Recommender System Evaluation
Information Retrieval and User-centric Recommender System Evaluation
 
User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Reco...
User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Reco...User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Reco...
User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Reco...
 
A 3D Approach to Recommender System Evaluation
A 3D Approach to Recommender System EvaluationA 3D Approach to Recommender System Evaluation
A 3D Approach to Recommender System Evaluation
 
State of RecSys: Recap of RecSys 2012
State of RecSys: Recap of RecSys 2012State of RecSys: Recap of RecSys 2012
State of RecSys: Recap of RecSys 2012
 
RecSysChallenge Opening
RecSysChallenge OpeningRecSysChallenge Opening
RecSysChallenge Opening
 
Estimating the Magic Barrier of Recommender Systems: A User Study
Estimating the Magic Barrier of Recommender Systems: A User StudyEstimating the Magic Barrier of Recommender Systems: A User Study
Estimating the Magic Barrier of Recommender Systems: A User Study
 
Users and Noise: The Magic Barrier of Recommender Systems
Users and Noise: The Magic Barrier of Recommender SystemsUsers and Noise: The Magic Barrier of Recommender Systems
Users and Noise: The Magic Barrier of Recommender Systems
 
Analyzing Weighting Schemes in Collaborative Filtering: Cold Start, Post Cold...
Analyzing Weighting Schemes in Collaborative Filtering: Cold Start, Post Cold...Analyzing Weighting Schemes in Collaborative Filtering: Cold Start, Post Cold...
Analyzing Weighting Schemes in Collaborative Filtering: Cold Start, Post Cold...
 
CaRR 2012 Opening Presentation
CaRR 2012 Opening PresentationCaRR 2012 Opening Presentation
CaRR 2012 Opening Presentation
 
Personalizing Tags: A Folksonomy-like Approach for Recommending Movies
Personalizing Tags: A Folksonomy-like Approach for Recommending MoviesPersonalizing Tags: A Folksonomy-like Approach for Recommending Movies
Personalizing Tags: A Folksonomy-like Approach for Recommending Movies
 
Inferring Contextual User Profiles - Improving Recommender Performance
Inferring Contextual User Profiles - Improving Recommender PerformanceInferring Contextual User Profiles - Improving Recommender Performance
Inferring Contextual User Profiles - Improving Recommender Performance
 
Using Social- and Pseudo-Social Networks to Improve Recommendation Quality
Using Social- and Pseudo-Social Networks to Improve Recommendation QualityUsing Social- and Pseudo-Social Networks to Improve Recommendation Quality
Using Social- and Pseudo-Social Networks to Improve Recommendation Quality
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 

Best Practices in Recommender System Challenges

  • 1. Recommender Systems Challenges Best Practices Tutorial & Panel ACM RecSys 2012 Dublin September 10, 2012
  • 2. About us • Alan Said - PhD Student @ TU-Berlin o Topics: RecSys Evaluation o @alansaid o URL: www.alansaid.com • Domonkos Tikk - CEO @ Gravity R&D o Topics: Machine Learning methods for RecSys o @domonkostikk o http://www.tmit.bme.hu/tikk.domonkos • Andreas Hotho - Prof. @ Uni. Würzburg o Topics: Data Mining, Information Retrieval, Web Science o http://www.is.informatik.uni-wuerzburg.de/staff/hotho
  • 3. General Motivation "RecSys is nobody's home conference. We come from CHI, IUI, SIGIR, etc." Joe Konstan - RecSys 2010 RecSys is our home conference - we should evaluate accordingly!
  • 4. Outline • Tutorial o Introduction to concepts in challenges o Execution of a challenge o Conclusion • Panel o Experiences of participating in and organizing challenges o Yehuda Koren o Darren Vengroff o Torben Brodt
  • 5. What is the motivation for RecSys Challenges? Part 1
  • 6. Setup - information overload users content of service provider recommender
  • 7. Motivation of stakeholders • user o find relevant content o easy navigation o serendipity, discovery • service o increase revenue o target user with the right content o engage users • recommender o facilitate goals of stakeholders o get recognized
  • 8. Evaluation in terms of the business business reporting Online evaluation (A/B test) Casting into a research problem
  • 9. Context of the contest • Selection of metrics • Domain dependent • Offline vs. online evaluation • IR centric evaluation o RMSE o MAP o F1
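MAP is one of the IR-centric measures named on this slide. As a minimal sketch, assuming each user has one ranked recommendation list (without duplicates) and a set of relevant items, it could be computed as below. The function names and the normalisation by min(|relevant|, k) are illustrative assumptions; the slides do not state which variant any particular challenge used.

```python
def average_precision_at_k(recommended, relevant, k):
    """AP@k for one user: sum of precision values at each hit rank, normalised."""
    relevant = set(relevant)
    if not relevant or k <= 0:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    # Assumed convention: normalise by the best achievable number of hits.
    return score / min(len(relevant), k)


def mean_average_precision_at_k(recs_by_user, relevant_by_user, k):
    """MAP@k: mean of AP@k over all users that have relevant items listed."""
    users = list(relevant_by_user)
    if not users:
        return 0.0
    total = sum(
        average_precision_at_k(recs_by_user.get(u, []), relevant_by_user[u], k)
        for u in users
    )
    return total / len(users)
```

Users with no recommendation list contribute an AP of 0 here, which penalises missing submissions; a challenge could equally decide to exclude them.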
  • 11. Recsys Competition Highlights • Large scale • Organization • RMSE • 3-stage setup • Prize • selection by review • runtime limits • real traffic • revenue increase • offline • MAP@500 • metadata available • larger in dimensions • no ratings
  • 12. Recurring Competitions • ACM KDD Cup (2007, 2011, 2012) • ECML/PKDD Discovery Challenge (2008 onwards) o 2008 and 09: tag recommendation in social bookmarking (incl. online evaluation task) o 2011: video lectures • CAMRa (2010, 2011, 2012)
  • 13. Does size matter? • Yes! – real world users • In research – to some extent
  • 14. Research & Industry Important for both • Industry has the data and research needs data • Industry needs better approaches but this costs • Research has ideas but has no systems and/or data to do the evaluation Don't exploit participants Don't be too greedy
  • 16. Standard Challenge Setting • organizer defines the recommender setting e.g. tag recommendation in BibSonomy • provide data o with features or o raw data o construct your own data • fix the way to do the evaluation • define the goal e.g. reach a certain improvement (F1) • motivate people to participate: e.g. promise a lot of money ;-)
  • 17. Typical contest settings • offline o everyone gets access to the dataset o in principle it is a prediction task, the user can't be influenced o privacy of the user within the data is a big issue o results from offline experimentation have limited predictive power for online user behavior • online o after a first learning phase the recommender is plugged into a real system o user can be influenced but only by the selected system o comparison of different systems is not completely fair • further ways o user study
  • 18. Example online setting (BibSonomy) Balby Marinho, L.; Hotho, A.; Jäschke, R.; Nanopoulos, A.; Rendle, S.; Schmidt-Thieme, L.; Stumme, G.; Symeonidis, P.: Recommender Systems for Social Tagging Systems. Springer, 2012 (SpringerBriefs in Electrical and Computer Engineering). ISBN 978-1-4614-1893-1
  • 19. Which evaluation measures? • Root Mean Squared Error (RMSE) • Mean Absolute Error (MAE) • Typical IR measures o precision @ n-items o recall @ n-items o False Positive Rate o F1 @ n-items o Area Under the ROC Curve (AUC) • non-quality measures o server answer time o understandability of the results
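The error-based and top-n measures listed on this slide can be sketched in a few lines. This is a minimal illustration, assuming ratings arrive as two parallel lists and recommendations as one ranked list per user; the function names are hypothetical and not taken from any challenge toolkit.

```python
import math


def rmse(predicted, actual):
    """Root Mean Squared Error over paired rating predictions."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def mae(predicted, actual):
    """Mean Absolute Error over paired rating predictions."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def precision_recall_f1_at_n(recommended, relevant, n):
    """Precision, recall and F1 computed on the top-n recommended items."""
    relevant = set(relevant)
    hits = sum(1 for item in recommended[:n] if item in relevant)
    precision = hits / n if n > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Precision here divides by n even when fewer than n items were recommended, so short lists are penalised; dividing by the actual list length is an equally common choice.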
  • 20. Discussion of measures? RMSE - Precision • RMSE is not necessarily the king of metrics, as RMSE is easy to optimize for • What about Top-n? • but RMSE is not influenced by popularity the way top-n is • What about user-centric stuff? • Ranking-based measure in KDD Cup 2011, Track 2
  • 21. Results influenced by ... • target of the recommendation (user, resources, etc.) • evaluation methodology (leave-one-out, time-based split, random sample, cross-validation) • evaluation measure • design of the application (online setting) • the selected part of the data and its preprocessing (e.g. p-core vs. long tail) • scalability vs. quality of the model • features and content accessible and usable for the recommendation
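Two of the evaluation methodologies named on this slide, the time-based split and leave-one-out, can be sketched as follows. The sketch assumes interactions are (user, item, timestamp) tuples, and the leave-one-out variant holds out each user's most recent interaction; both the data layout and that choice are assumptions for illustration, since other variants hold out a random item instead.

```python
from collections import Counter


def time_based_split(interactions, split_time):
    """Train on everything before split_time, test on everything from it onward."""
    train = [x for x in interactions if x[2] < split_time]
    test = [x for x in interactions if x[2] >= split_time]
    return train, test


def leave_one_out_split(interactions):
    """Hold out the most recent interaction of every user with more than one."""
    counts = Counter(user for user, _item, _ts in interactions)
    latest = {}  # user -> index of that user's most recent interaction
    for idx, (user, _item, ts) in enumerate(interactions):
        if user not in latest or ts > interactions[latest[user]][2]:
            latest[user] = idx
    # Users with a single interaction stay in training; otherwise the model
    # would have nothing to learn from for them.
    held_out = {idx for user, idx in latest.items() if counts[user] > 1}
    train = [x for i, x in enumerate(interactions) if i not in held_out]
    test = [x for i, x in enumerate(interactions) if i in held_out]
    return train, test
```

The two splits generally produce different test sets from the same log, which is one concrete way the methodology can change the reported result.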
  • 22. Don't forget... • organizing a challenge takes a lot of effort • preparing data takes time • answering questions takes even more time • participants are creative, so organizers need to react • computing the evaluation and checking the results takes time • proceedings with the outcome have to be prepared • ...
  • 23. What have we learnt? Conclusion Part 3
  • 24. Challenges are good since they... • ... are focused on solving a single problem • ... have many participants • ... create common evaluation criteria • ... have comparable results • ... bring real-world problems to research • ... make it easy to crown a winner • ... are cheap (even with a 1M$ prize)
  • 25. Is that the complete truth? No!
  • 26. Is that the complete truth? • Why? Because using standard information retrieval metrics we cannot evaluate recommender system concepts like: • user interaction • perception • satisfaction • usefulness • any metric not based on accuracy/rating prediction and negative predictions • scalability • engineering
  • 27. We can't catch everything offline Scalability Presentation Interaction
  • 28. The difference between IR and RS • Information retrieval systems answer a need (a query) • Recommender systems identify the user's needs
  • 29. Should we organize more challenges? • Yes - but before we do that, think of o What is the utility of Yet Another Dataset - aren't there enough already? o How do we create a real-world-like challenge? o How do we get real user feedback?
  • 30. Take home message • Real needs of users and content providers are better reflected in online evaluation • Consider technical limitations as well • Challenges advance the field a lot o Matrix factorization & ensemble methods in the Netflix Prize o Evaluation measure and objective in the KDD Cup 2011
  • 31. Related events at RecSys • Workshops o Recommender Utility Evaluation o RecSys Data Challenge • Paper Sessions o Multi-Objective Recommendation and Human Factors - Mon. 14:30 o Implicit Feedback and User Preference - Tue. 11:00 o Top-N Recommendation - Wed. 14:30 • More challenges: o www.recsyswiki.com/wiki/Category:Competition
  • 33. Panel • Torben Brodt o Plista o Organizing Plista Contest • Yehuda Koren o Google o Member of winning team of the Netflix Prize • Darren Vengroff o RichRelevance o Organizer of RecLab Prize
  • 34. Questions • How does recommendation influence the user and system? • How can we quantify the effects of the UI? • How should we translate what we've presented into an actual challenge? • should we focus on the long tail or the short head? • Evaluation measures, click rate, wtf@k • How to evaluate conversion rate?