Recommending What Video to Watch Next: A Multitask Ranking System
Google Inc, 2019
Zhe Zhao, Lichan Hong, Li Wei, Jilin Chen, Aniruddh Nath, Shawn Andrews, Aditee Kumthekar, Maheswaran Sathiamoorthy, Xinyang Yi, Ed Chi
November 10, 2021
Presenter: Hongkyu Lim
Introduction
• YouTube’s recommendation system works in two stages
Candidate Generation
Ranking (the focus of this work)
• Ranking
What to optimize? Be careful when setting objectives
• Do not consider only ‘Watches’ and ‘Clicks’
• Also consider ‘Durations’, ‘Shares’, and ‘Preferences’ when optimizing the ranking
Do not fall into the trap of ‘Selection Bias’, which causes a ‘Feedback loop’
• An efficient method is needed to deal with ‘Selection Bias’ (here, mainly ‘Position Bias’)
The solution is to use a “Multitask Neural Network”.
Introduction
• Architecture
Builds on the Wide & Deep model
Applies Multi-gate Mixture-of-Experts (MMoE)
• Why?
Performs multitask learning by splitting the objectives into groups
• What to click? (biased toward clickbait)
• Query and candidate video features
• content, topic, title, upload time
-------------------------------
• How long do you watch?
• How much do you like it?
• User and context features
• Time, user profile
<WIDE> <DEEP>
<Main model>
Introduction
• Objectives
1. Engagement objective
• Users' ‘Clicks’
• How engaged are users?
2. Satisfaction objective
• How much do users like the video?
• Users’ Ratings
<Main model>
Introduction
• Model
1. Unbiased user utility
• Query and candidate video features
• User and context features
2. Estimated propensity score
• Input: selection-bias features (e.g. position)
• Learned by the wide part of Wide & Deep
<User Utility>
<Main model>
<Propensity>
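The split into user utility and propensity can be written as a sum of two logits. This is a sketch only: the symbols x (query, candidate, user and context features), b (selection-bias features such as position), f (main model logit) and g (wide part logit) are notation assumed here, not taken from the slides.

```latex
\[
\hat{p}(\text{click} \mid x, b) = \sigma\bigl(f(x) + g(b)\bigr),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
\]
```

Under this reading, f(x) carries the bias-free utility used for ranking, while g(b) absorbs the part of the click signal explained by where the video was shown.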
Related Work
• Industrial Recommendation Systems
Implicit feedback is relied on more than explicit feedback
Divided into stages
• Candidate Generation
• Association rule
• co-occurrence
• collaborative filtering(preference)
• Ranking
• Learning-to-rank
• point-wise: efficient to serve at scale
• pair-wise & list-wise
• Modeling Biases in Training Data
Feedback Loop
• Solution: pass the bias feature (e.g. position) as a missing value at serving
Problem Description
• Multimodal Feature Space
User utility is learned at the candidate level from several modalities
• Video content, thumbnail, audio, title, user demographics
• Scalability
Massive numbers of users and amounts of data
Model Architecture
• Why is MMoE used?
Task correlation issue
• When the correlation between tasks is low, hard parameter sharing harms the learning of multiple objectives.
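A minimal NumPy sketch of an MMoE forward pass, to show how each task gets its own softmax gate over shared experts instead of one hard-shared layer. The class name, layer sizes, single-layer ReLU experts and linear task heads are illustrative assumptions, not the production model.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class MMoE:
    """Toy multi-gate mixture-of-experts layer (hypothetical sizes, random weights)."""

    def __init__(self, input_dim, expert_dim, num_experts, num_tasks, seed=0):
        rng = np.random.default_rng(seed)
        # Shared experts: one weight matrix per expert.
        self.expert_w = rng.normal(0.0, 0.1, (num_experts, input_dim, expert_dim))
        # One gating network per task, each scoring every expert.
        self.gate_w = rng.normal(0.0, 0.1, (num_tasks, input_dim, num_experts))
        # One linear head per task, producing a single logit.
        self.head_w = rng.normal(0.0, 0.1, (num_tasks, expert_dim))

    def forward(self, x):
        # x: (batch, input_dim)
        experts = np.maximum(np.einsum("bi,eih->beh", x, self.expert_w), 0.0)  # ReLU experts
        gates = softmax(np.einsum("bi,tie->bte", x, self.gate_w), axis=-1)     # per-task weights
        mixed = np.einsum("bte,beh->bth", gates, experts)                      # task-specific mix
        logits = np.einsum("bth,th->bt", mixed, self.head_w)                   # one logit per task
        return logits, gates

# Usage: two tasks (e.g. one engagement, one satisfaction) over three experts.
model = MMoE(input_dim=8, expert_dim=4, num_experts=3, num_tasks=2)
x = np.random.default_rng(1).normal(size=(5, 8))
logits, gates = model.forward(x)
```

Because the gates are learned per task, weakly correlated tasks can lean on different experts, which is the property hard parameter sharing lacks.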
Model Architecture
• Selection Bias
Linear combination of
• Position feature
+
• Other features (e.g. device info)
Penalties on higher-ranked videos
10% feature dropout during training, so the model does not rely too much on the Wide part
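The handling of the position feature can be sketched as follows: a linear wide part over position and device features, 10% position dropout in training, and position set to "missing" at serving. The number of positions, the one-hot encoding, the reserved "missing" slot and all function names are assumptions made for illustration.

```python
import numpy as np

NUM_POSITIONS = 10        # hypothetical number of ranked slots
MISSING = NUM_POSITIONS   # extra index reserved for "position unknown"

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode_bias_features(position, device, num_devices):
    # One-hot position (including the MISSING slot) concatenated with one-hot device.
    n = len(position)
    pos = np.zeros((n, NUM_POSITIONS + 1))
    pos[np.arange(n), position] = 1.0
    dev = np.zeros((n, num_devices))
    dev[np.arange(n), device] = 1.0
    return np.concatenate([pos, dev], axis=1)

def drop_position(position, rate, rng):
    # Training only: with probability `rate`, hide the position from the model.
    mask = rng.random(len(position)) < rate
    return np.where(mask, MISSING, position)

def click_probability(main_logit, bias_features, wide_w):
    # The wide part is linear; its logit is added to the main model's logit.
    return sigmoid(main_logit + bias_features @ wide_w)

rng = np.random.default_rng(0)
position = np.array([0, 3, 9, 1])
device = np.array([0, 1, 1, 0])
wide_w = rng.normal(0.0, 0.1, NUM_POSITIONS + 1 + 2)
main_logit = np.zeros(4)  # stand-in for the main model's output

# Training: 10% of positions are replaced by MISSING.
train_feats = encode_bias_features(drop_position(position, 0.10, rng), device, 2)
p_train = click_probability(main_logit, train_feats, wide_w)

# Serving: position is always passed as missing.
serve_feats = encode_bias_features(np.full(4, MISSING), device, 2)
p_serve = click_probability(main_logit, serve_feats, wide_w)
```

The dropout step gives the "missing" slot some training signal, so the serving-time input is not entirely unseen.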
Experiment
• Model
TensorFlow
TPU
• Deployed and monitored on YouTube
Kept up to date
Offline experiments: monitoring AUC (area under the ROC, i.e. receiver operating characteristic, curve)
A/B test
Used to tune hyperparameters
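For reference, the offline AUC metric can be computed as the fraction of (positive, negative) pairs that the model orders correctly. This is a small NumPy sketch with a hypothetical function name; it compares all pairs, so it only suits small samples.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Difference between every positive score and every negative score.
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

# Three of the four (positive, negative) pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```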
Experiment Result
• Model performance with and without MMoE
• Visualization of expert utilization (gating network distribution)
• CTR comparison related to the Wide feature (position bias)