Recommendation Modeling with Impression Data at Netflix

Jiangwei Pan
Research Scientist at Netflix
LERI workshop, RecSys’23
What did I show?
Definition of impressions
An item appears in the viewport of the application
● for at least x milliseconds
● being partially visible can be OK
Impressions can be logged for different entities on screen
● shows, rows, boxart images, etc.
Goal of this presentation
Impression data is critical for building recommender models at Netflix
● and at other industry recommenders
How do we incorporate impression data into recommender models?
● Impressions for label definition (training objectives)
● Impressions for feature definition
Share interesting learnings and challenges
Impressions for label definition
What do recommenders do?
Recommenders choose items and display them to the user as impressions
A simplified recommendation algorithm
Given a user:
● for every item, predict p(engage | user impression of item)
● then choose the item with the highest prediction
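To make this loop concrete, here is a minimal Python sketch (not Netflix's actual code): `predict_engage` is a hypothetical stand-in for any model that estimates p(engage | impression), and the recommender simply scores every item and returns the argmax.

```python
from typing import Callable, Sequence

def recommend(user: str,
              items: Sequence[str],
              predict_engage: Callable[[str, str], float]) -> str:
    """Score every item for the user and return the one with the
    highest predicted engagement probability."""
    return max(items, key=lambda item: predict_engage(user, item))

# Toy usage with a hard-coded scorer standing in for a trained model.
toy_scores = {("u1", "show_a"): 0.12, ("u1", "show_b"): 0.35}
print(recommend("u1", ["show_a", "show_b"],
                lambda u, i: toy_scores.get((u, i), 0.0)))  # -> show_b
```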
How to train p(engage | impression)
Binary classification model: engage or no-engage?
Training data: take all user-item impressions
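As an illustration of this label definition, a hedged sketch of turning an impression log into binary examples; the field names ("user_id", "item_id", "engaged") are assumed for illustration and are not Netflix's actual schema.

```python
def build_training_examples(impression_log):
    """Each user-item impression becomes one example; the label is 1 if the
    user engaged (e.g. played the title) after the impression, else 0."""
    for event in impression_log:
        features = {"user_id": event["user_id"], "item_id": event["item_id"]}
        label = 1 if event["engaged"] else 0
        yield features, label
```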
If only “relevant” items are impressed
Training data concentrates on the most relevant part of the item space
If we train a classifier using this data
● relevance is not the main difference between positives and negatives
● so it may be ignored by the model
The classifier will not generalize well to the whole item space
● it may over-predict for many non-relevant items
Solution 1: Add item exploration
Display random items to each user
The user still can’t receive impressions of every single item
● there can be millions of items
But the user can receive impressions of most “types” of items
The model generalizes better!
Too much exploration may hurt user experience or ads revenue
Exploration volume needs to be limited
Solution 2: Add random negatives
Pseudo-impressions with no engagement
May incorrectly mark a relevant item as negative
● risk is small when the item space is large
Random negatives are easy to classify
● little connection to user interests
But they help a lot with model generalization
Challenges (see the sketch below)
● what distribution to sample negatives from?
● how to mix random negatives with impressed negatives?
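A minimal sketch of one way to add random negatives, assuming uniform sampling over the catalog and a fixed negatives-per-positive ratio; both choices are illustrative, not the distribution or mixing strategy Netflix uses.

```python
import random

def add_random_negatives(impressed_examples, all_item_ids,
                         negatives_per_positive=4, seed=0):
    """Augment (user_id, item_id, label) impression examples with
    uniformly sampled pseudo-negative impressions."""
    rng = random.Random(seed)
    augmented = list(impressed_examples)
    for user_id, _item_id, label in impressed_examples:
        if label != 1:
            continue  # sample pseudo-negatives only for engaged examples
        for _ in range(negatives_per_positive):
            # Uniform over the catalog; a popularity-weighted distribution
            # is another option (see the popularity-bias discussion below).
            augmented.append((user_id, rng.choice(all_item_ids), 0))
    return augmented
```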
Popularity bias
Definition: popular items get higher predictions than they should
Model trained only using impressions (exploit + explore)
● no popularity bias, as popular items get both more positives and more negatives in the training data
● some items can suffer from high variance if there is not enough exploration
Adding uniform random negatives
● may increase popularity bias, as we add the same number of negatives for popular and non-popular items
When the item space is large (millions)
Too costly to compute p(engage | impression) for every item
Efficiency optimization: use 2 passes
● both predict p(engage | impression)
● with different focuses
Candidate generation pass
● efficient model architectures (e.g. two-tower; see the sketch below)
● millions → hundreds (loosely relevant)
● care more about recall @ hundreds
Fine-grained ranking pass
● more sophisticated model architectures
● distinguish between good and excellent
● often trained only on impressed negatives, as it is applied to already relevant candidates
More passes can be used, e.g.
● adjusting the ranking for diversity
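A hedged sketch of the retrieval step in a two-tower candidate generator: once each tower has produced embeddings, the candidate pass reduces to a dot-product top-k search. The embedding size, k, and random embeddings are illustrative; this is one common architecture, not Netflix's actual model.

```python
import numpy as np

def top_k_candidates(user_embedding: np.ndarray,
                     item_embeddings: np.ndarray,
                     k: int = 200) -> np.ndarray:
    """Return indices of the k items with the highest dot-product score."""
    scores = item_embeddings @ user_embedding          # (num_items,)
    top_k = np.argpartition(-scores, k)[:k]            # unordered top-k
    return top_k[np.argsort(-scores[top_k])]           # sorted by score

# Toy usage: random embeddings stand in for the learned user/item towers.
rng = np.random.default_rng(0)
user_emb = rng.normal(size=64)
item_embs = rng.normal(size=(100_000, 64))
candidates = top_k_candidates(user_emb, item_embs, k=200)
```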
Repeated impressions
The user scrolls back and forth multiple times
Items at the top get repeated impressions
Need to deduplicate impressions per session in the training data (see the sketch below)
Otherwise, top items get unfairly penalized by the model, as they accumulate more repeated impressions
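A minimal deduplication sketch, assuming each logged impression carries a session id (field names are illustrative): keeping one impression per (session, user, item) prevents items at the top of the page from collecting extra negative examples just because the user scrolled past them repeatedly.

```python
def dedupe_impressions_per_session(impressions):
    """Keep only the first impression for each (session_id, user_id, item_id)."""
    seen = set()
    deduped = []
    for imp in impressions:
        key = (imp["session_id"], imp["user_id"], imp["item_id"])
        if key not in seen:
            seen.add(key)
            deduped.append(imp)
    return deduped
```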
Noisy impressions
Many items are on screen at the same time
It is not clear whether the user saw the item
If there is no engagement, is it because
● the user is not interested?
● the user didn’t see it?
Impressions may have long-term value
An impression of a Netflix show makes it more familiar to the user
● even if the user did not play it
The user may become more or less likely to play the show at the next impression
Impressions for feature definition
Typical features
Frequency counts: number of past impressions of the item
● can add different variations
Engagement rate: #engagements / #impressions (see the sketch below)
● how to set the value if #impressions = 0?
● 0, average, 1, adding a prior?
● this could affect cold-start performance
● we can also skip this feature to let the model learn directly from raw counts
Categorical features: the user’s impressed item ids
● can help the model generalize better via id embeddings
But a user can have hundreds of impressions even in a single day
Need to reduce the noise
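One way to handle the #impressions = 0 case is an additive prior, sketched below; the prior rate and strength are illustrative values, not Netflix's settings.

```python
def engagement_rate(num_engagements: int,
                    num_impressions: int,
                    prior_rate: float = 0.05,
                    prior_strength: float = 10.0) -> float:
    """Smoothed #engagements / #impressions.

    With zero impressions the feature falls back to prior_rate, avoiding an
    arbitrary choice between 0, 1, or a global average for cold-start items.
    """
    return ((num_engagements + prior_rate * prior_strength) /
            (num_impressions + prior_strength))

# Example: a cold-start item vs. an item with 3 engagements in 50 impressions.
print(engagement_rate(0, 0))   # -> 0.05 (the prior)
print(engagement_rate(3, 50))  # -> ~0.058
```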
Impression data volume
Impression data volume is huge
Logging is challenging
● heterogeneous client devices (TV, mobile, web)
● need to process, sessionize, and summarize in real time
● need to be available via multiple channels (table, stream, API) for different purposes
Handle volume in feature definition (see the sketch below)
● summary counts
● focus on the most recent impressions
● increase the minimum impression duration requirement
● random sampling
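A hedged sketch combining the last three volume-reduction ideas above (recency window, minimum duration, random sampling); the window, threshold, and sample rate are illustrative values.

```python
import random

def reduce_impressions(impressions, now_ms,
                       window_ms=7 * 24 * 3600 * 1000,  # keep the last 7 days
                       min_duration_ms=1000,            # stricter duration requirement
                       sample_rate=0.2, seed=0):
    """Keep recent, sufficiently long impressions, then randomly sample them."""
    rng = random.Random(seed)
    recent = [imp for imp in impressions
              if now_ms - imp["timestamp_ms"] <= window_ms
              and imp["duration_ms"] >= min_duration_ms]
    return [imp for imp in recent if rng.random() < sample_rate]
```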
How do impression features help?
Correlation: items with many prior impressions from the user tend to have a higher average label
Should we then recommend more items with many prior impressions from the user? No
Correlation does not imply causation
● highly-impressed items probably have higher quality and thus a higher average label
In an A/B test, after adding impression features
● the model recommends more lowly-impressed items
Conclusion
● Overview of using impression data to build an unbiased recommendation model at Netflix
● Label definition: we may need exploration and random negative sampling to enrich the training data
● Feature definition: various ways to summarize and denoise impression data
● Long-term value: impressions can have different long-term values for different users/items
Challenges
● How to do efficient exploration that maximizes signal collection and minimizes user experience impact?
● How to sample random negatives? How to mix random negatives with impressed negatives?
● How to model the long-term value of impressions?
Thank you!
Questions?
Contact: Jiangwei Pan, jpan@netflix.com