SlideShare une entreprise Scribd logo
1  sur  15
Télécharger pour lire hors ligne
A Flexible Recommendation
System for Cable TV
Diogo Gonçalves, Miguel Costa, Francisco Couto
LaSIGE @ Faculty of Sciences, University of Lisbon
RecSysTV 2016, Boston, USA
September 15, 2016
• Information overflow
• too many video contents from which to choose
• too much time exploring video contents
(thousands of programs broadcast in hundreds of TV channels, plus thousands of movies & series on VOD)
• Impact
• dissatisfaction
• change to other systems with recommendations (e.g. Netflix, Youtube)
• less visualization time
• less revenue
• churn
Source: The Netflix Recommender System: Algorithms, Business Value, and Innovation, 2016
2
• Characterization of user behaviour in cable TV (previous study).
• showed that Live TV and Catch-up TV have more users and views
• Improvement of state-of-the-art algorithms using the learning-to-rank (L2R)
framework with contextual information and implicit feedback.
1. Extraction of implicit feedback for each pair <user, program>
2. Feature engineering on the contextual information
3. Creation of a large-scale dataset for learning & evaluation
4. Creation of a L2R model (cast recommendations as a supervised learning problem)
5. Evaluation against state-of-the-art algorithms
3
goal
• 12 weeks, between October and December of 2015
• 10K clients randomly chosen that watched at least 10 programs per week
• 21K programs available to the users during that period
• 4.5M of user visualizations (users saw more than 50%)
• we assumed users liked these ones
• 78.5M of user visualizations inferior to 50% (most are 0%)
• In average, each user watched 454 programs and each program was watched by 216 users
4
implicit
feedback
(ground truth)
• Users tend to see the same programs & channels
• Rankings of programs and channels & the relative visualization in each one (per user and for all users)
• Users tend to see the same categories & subcategories of contents
• Rankings of categories and subcategories & the relative visualization in each one (per user and for all users)
• Time influences user preferences
• Weekend information, broadcast period, time passed since broadcast, last days information
• Users tend to see programs with similar textual description of their content
• Textual similarity functions (e.g. Jaccard, TFxIDF) between metadata (e.g. title, description)
• Users tend to see programs with similar characteristics
• Number of episodes, content age & duration, category & subcategory
• Different types of contents require different strategies
• Information about who saw the content and whether is new for the user or for all
• Similar users watch the same programs
• Colaborative-filtering algorithms, such as WRMF (Collaborative Filtering for Implicit Feedback Datasets, ICDM 2008)
(60 features created based on the insights of previous characterization)
5
Users …… ……
)1(
q
)(
1
i
d
)2(
q )(i
q )(m
q
)(
2
i
d
)(i
jd )(
)(
i
n id
…… ……
Programs
)(
1
i
y )(
2
i
y )(i
jy )(
)(
i
n iyPreference scores
)(
1
i
x )(
2
i
x
)(i
jx )(
)(
i
n ixFeature vectors
)(
1
i
z )(
2
i
z )(i
jz
)(
)(
i
n izModel-generated
scores
……
…… ……
……
 
m
i
ii
zyL
1
)()(
,
Listwise loss
function
LambdaRank [Burges et al. NIPS’06]
LambdaMART [Wu et al. MSR-TR-2008-109]
Algorithms work differently for different types of programs. For instance, new programs tend to
work better with content-based approaches, while the programs never seen by a user tend to
work well with collaborative-filtering approaches. How to complement them?
6
• Random: returns random recommendations.
• Popular: recommends the most popular programs watched by everyone.
• UserPopular: recommends the programs most watched by the user.
• WRMF: a matrix factorization technique for collaborative filtering where
user and item latent vectors are inferred from implicit feedback.
• Content-based: recommends programs based on similar content of
programs watched by the user (e.g. category, actors, directors).
• Learning to Rank: LambdaMART is a type of Gradient Boosted Regression
Trees algorithm that was used to combine recommendation features and
algorithm outputs.
7
client visualizations
(training data)
t0 t1
client visualizations
(test data)
Learning System model h2
Learning System model h1
Predicting
System
Predicting
System
prediction
prediction
evaluation
10
Time-series 5-fold cross validation
• Accuracy – How well the items meet the users’ information need?
NDCG@k - match between recommendations and what the client watched after
• Diversity – How different are the items with respect to each other?
Intra-list diversity@k (ILD@k)
• Novelty – How novel is the item for all users?
Mean self-information (MSI@k)
• Serendipitious – How different is the item with respect to the user history?
Unexpecteness@k
Only the top-k items in the recommendation lists are considered.
We used a k of 5 and 10, because these are the typical sizes of recommendation lists shown in Cable TV.
9
1. Compute a recommendation list optimized for accuracy
2. Rerank recommendations, usually trading-off accuracy for other metrics (e.g. diversity
and novelty)
GreedyRecommendations(programs,objective_function) {
ranking = Ø
repeat
programsi = argmax( i ) objective_function(ranking ∪ programsk), for all k ∈ [1..#programs]
ranking = ranking ∪ programsi
programs = programs  programsi
until programs = Ø
return ranking
}
objective_function() calculates a score for a ranking list
e.g. 50% NDCG@5 + 25% ILD@5 + 25% MSI@5 10
accuracy accuracy for new diversity novelty serendipitious global
nDCG@
5
nDCG@
10
nDCG@
5
nDCG@
10
ILD@
5
ILD@
10
MSI@
5
MSI@
10
Seren@
5
Seren@
10
objective
@5
objective
@10
Random 0.002 0.002 0.001 0.002 0.919 0.920 0.677 0.679 0.935 0.935 0.400 0.401
Popular 0.293 0.247 0.021 0.018 0.600 0.541 0.058 0.080 0.888 0.887 0.311 0.279
UserPopular 0.694 0.600 0.000 0.000 0.708 0.727 0.200 0.213 0.872 0.872 0.574 0.535
WRMF 0.306 0.277 0.034 0.030 0.523 0.588 0.153 0.162 0.856 0.858 0.322 0.326
Content-
based 0.345 0.316 0.059 0.052 0.503 0.526 0.346 0.354 0.837 0.831 0.385 0.378
L2R 0.726 0.630 0.110 0.093 0.683 0.692 0.199 0.214 0.870 0.868 0.583 0.541
Greedy + L2R 0.556 0.440 0.027 0.010 0.753 0.784 0.606 0.654 0.850 0.857 0.618 0.580
Greedy trade-off accuracy for diversity and novelty
Random provides the best results in these 3 metrics.
L2R provides the best accurracy.
Used as example: objective = 50% nDCG + 25% ILD + 25% MSI
11
accuracy accuracy for new diversity novelty serendipitious global
nDCG@
5
nDCG@
10
nDCG@
5
nDCG@
10
ILD@
5
ILD@
10
MSI@
5
MSI@
10
Seren@
5
Seren@
10
objective
@5
objective
@10
Random 0.002 0.002 0.003 0.003 0.919 0.920 0.677 0.679 0.935 0.935 0.400 0.401
Popular 0.295 0.252 0.007 0.006 0.600 0.550 0.058 0.081 0.888 0.887 0.312 0.283
UserPopular 0.699 0.603 0.000 0.000 0.708 0.727 0.186 0.199 0.872 0.872 0.573 0.533
WRMF 0.318 0.289 0.020 0.017 0.534 0.598 0.129 0.138 0.862 0.864 0.325 0.329
Content-
based 0.435 0.396 0.047 0.038 0.525 0.550 0.238 0.242 0.840 0.836 0.408 0.396
L2R 0.726 0.631 0.115 0.081 0.683 0.692 0.193 0.206 0.870 0.869 0.582 0.540
Greedy + L2R 0.524 0.434 0.030 0.011 0.735 0.793 0.701 0.678 0.843 0.843 0.621 0.585
12
• Typical approaches used in VOD, such as collaborative-filtering and content-based
filtering, are not effective for Live TV and Catch-up TV.
• Our learning to rank approach with contextual information and implicit feedback is
superior in accuracy and accuracy for programs never seen before, while
maintaining high scores of diversity and serendipity.
• still, recommending programs never seen before requires further investigation.
• A solution to optimize recommendations for multiple metrics were proposed, which
enables to easily adapt recommendations to user preferences.
• still, it is necessary to understand what clients value (e.g. accuracy vs. diversity).
13
Thank you.
migcosta@gmail.com
Novelty
Serendipity
Accuracy
Diversity
IDCG is the value for the perfect ranking
reli is 1 if the user watched the program at rank i
or 0 otherwise
𝑑 𝑖, 𝑗 = ⅓ 𝑐𝑎𝑡𝑒𝑔𝑜𝑟𝑦 𝑖=𝑗 +
⅓ 𝑠𝑢𝑏𝑐𝑎𝑡𝑒𝑔𝑜𝑟𝑦 𝑖=𝑗 +
⅓ 𝑐ℎ𝑎𝑛𝑛𝑒𝑙(𝑖=𝑗)
𝐼𝐿𝐷(𝑢) =
1
)|𝑅|( 𝑅 − 1
𝑖∈𝑅 𝑗∈𝑅
𝑑 𝑖, 𝑗 𝐼𝐿𝐷 =
1
𝑈
𝑢∈𝑈
)𝐼𝐿𝐷(𝑢
Ru = list of recommended items for user u
Wu = watching history for user u
U = list of users
𝑈𝑛𝑒𝑥𝑝 𝑢 =
1
𝑅 𝑊𝑢
𝑖∈𝑅 𝑗∈𝑊𝑢
𝑑 𝑖, 𝑗 𝑈𝑛𝑒𝑥𝑝 =
1
𝑈
𝑢∈𝑈
)𝑈𝑛𝑒𝑥𝑝(𝑢
𝑀𝑆𝐼 𝑢 =
1
𝑅 𝑢
𝑖∈𝑅
| 𝑣 ∈ 𝑈 𝑖 ∈ 𝑊𝑣}|
𝑈
𝑀𝑆𝐼 =
1
𝑈
𝑢∈𝑈
)𝑀𝑆𝐼(𝑢
15

Contenu connexe

En vedette

Power GaN 2016: Epitaxy and Devices, Applications, and Technology Trends - 20...
Power GaN 2016: Epitaxy and Devices, Applications, and Technology Trends - 20...Power GaN 2016: Epitaxy and Devices, Applications, and Technology Trends - 20...
Power GaN 2016: Epitaxy and Devices, Applications, and Technology Trends - 20...
Yole Developpement
 
Apple iPhone 7 Plus: Rear-Facing Dual Camera Module 2016 teardown reverse cos...
Apple iPhone 7 Plus: Rear-Facing Dual Camera Module 2016 teardown reverse cos...Apple iPhone 7 Plus: Rear-Facing Dual Camera Module 2016 teardown reverse cos...
Apple iPhone 7 Plus: Rear-Facing Dual Camera Module 2016 teardown reverse cos...
Yole Developpement
 
Status of the Power Electronics Industry - 2016 Report by Yole Developpement
Status of the Power Electronics Industry - 2016 Report by Yole DeveloppementStatus of the Power Electronics Industry - 2016 Report by Yole Developpement
Status of the Power Electronics Industry - 2016 Report by Yole Developpement
Yole Developpement
 

En vedette (8)

Power GaN 2016: Epitaxy and Devices, Applications, and Technology Trends - 20...
Power GaN 2016: Epitaxy and Devices, Applications, and Technology Trends - 20...Power GaN 2016: Epitaxy and Devices, Applications, and Technology Trends - 20...
Power GaN 2016: Epitaxy and Devices, Applications, and Technology Trends - 20...
 
Apple iPhone 7 Plus: Rear-Facing Dual Camera Module 2016 teardown reverse cos...
Apple iPhone 7 Plus: Rear-Facing Dual Camera Module 2016 teardown reverse cos...Apple iPhone 7 Plus: Rear-Facing Dual Camera Module 2016 teardown reverse cos...
Apple iPhone 7 Plus: Rear-Facing Dual Camera Module 2016 teardown reverse cos...
 
Filosofía renacentista
Filosofía renacentistaFilosofía renacentista
Filosofía renacentista
 
Status of the Power Electronics Industry - 2016 Report by Yole Developpement
Status of the Power Electronics Industry - 2016 Report by Yole DeveloppementStatus of the Power Electronics Industry - 2016 Report by Yole Developpement
Status of the Power Electronics Industry - 2016 Report by Yole Developpement
 
Stakeholder Engagement
Stakeholder EngagementStakeholder Engagement
Stakeholder Engagement
 
Change communication strategy
Change communication strategy Change communication strategy
Change communication strategy
 
actividad 3
actividad 3actividad 3
actividad 3
 
Stakeholder Communication
Stakeholder CommunicationStakeholder Communication
Stakeholder Communication
 

Similaire à A Flexible Recommendation System for Cable TV

FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at Humana
Databricks
 
RS in the context of Big Data-v4
RS in the context of Big Data-v4RS in the context of Big Data-v4
RS in the context of Big Data-v4
Khadija Atiya
 

Similaire à A Flexible Recommendation System for Cable TV (20)

Teacher training material
Teacher training materialTeacher training material
Teacher training material
 
Labaiik Marketing Research Plan 2021 (1!).pptx
Labaiik Marketing Research Plan 2021 (1!).pptxLabaiik Marketing Research Plan 2021 (1!).pptx
Labaiik Marketing Research Plan 2021 (1!).pptx
 
Final .pptx
Final .pptxFinal .pptx
Final .pptx
 
Machine Learning Applications in Credit Risk
Machine Learning Applications in Credit RiskMachine Learning Applications in Credit Risk
Machine Learning Applications in Credit Risk
 
BESDUI: Benchmark for End-User Structured Data User Interfaces
BESDUI: Benchmark for End-User Structured Data User InterfacesBESDUI: Benchmark for End-User Structured Data User Interfaces
BESDUI: Benchmark for End-User Structured Data User Interfaces
 
IRJET- Hybrid Recommendation System for Movies
IRJET-  	  Hybrid Recommendation System for MoviesIRJET-  	  Hybrid Recommendation System for Movies
IRJET- Hybrid Recommendation System for Movies
 
Optimizing Observability Spend: Metrics
Optimizing Observability Spend: MetricsOptimizing Observability Spend: Metrics
Optimizing Observability Spend: Metrics
 
Predictive Solutions and Analytics for TV & Entertainment Businesses
Predictive Solutions and Analytics for TV & Entertainment BusinessesPredictive Solutions and Analytics for TV & Entertainment Businesses
Predictive Solutions and Analytics for TV & Entertainment Businesses
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at Humana
 
Thesis_presentation_arda_tasci
Thesis_presentation_arda_tasciThesis_presentation_arda_tasci
Thesis_presentation_arda_tasci
 
productionising-recommenders
productionising-recommendersproductionising-recommenders
productionising-recommenders
 
Building an recommendation system for IPTV on a fast streaming architecture -...
Building an recommendation system for IPTV on a fast streaming architecture -...Building an recommendation system for IPTV on a fast streaming architecture -...
Building an recommendation system for IPTV on a fast streaming architecture -...
 
RS in the context of Big Data-v4
RS in the context of Big Data-v4RS in the context of Big Data-v4
RS in the context of Big Data-v4
 
Movie recommendation Engine using Artificial Intelligence
Movie recommendation Engine using Artificial IntelligenceMovie recommendation Engine using Artificial Intelligence
Movie recommendation Engine using Artificial Intelligence
 
Recommender system
Recommender system Recommender system
Recommender system
 
User Behavior Hashing for Audience Expansion
User Behavior Hashing for Audience ExpansionUser Behavior Hashing for Audience Expansion
User Behavior Hashing for Audience Expansion
 
The recommendations system for source code components retrieval
The recommendations system for source code components retrievalThe recommendations system for source code components retrieval
The recommendations system for source code components retrieval
 
Supersede overview presentation
Supersede overview presentationSupersede overview presentation
Supersede overview presentation
 
VINOD_6yrs
VINOD_6yrsVINOD_6yrs
VINOD_6yrs
 
Movie Recommender System Using Artificial Intelligence
Movie Recommender System Using Artificial Intelligence Movie Recommender System Using Artificial Intelligence
Movie Recommender System Using Artificial Intelligence
 

Plus de Francisco Couto

Plus de Francisco Couto (9)

Master's Theses in Bioinformatics and Computational Biology
Master's Theses in Bioinformatics and Computational BiologyMaster's Theses in Bioinformatics and Computational Biology
Master's Theses in Bioinformatics and Computational Biology
 
Linked Data – challenges for Imagiology and Radiology
Linked Data – challenges for Imagiology and RadiologyLinked Data – challenges for Imagiology and Radiology
Linked Data – challenges for Imagiology and Radiology
 
Metadata Analyser: measuring metadata quality
Metadata Analyser: measuring metadata qualityMetadata Analyser: measuring metadata quality
Metadata Analyser: measuring metadata quality
 
MER: a Minimal Named-Entity Recognition Tagger and Annotation Server
MER: a Minimal Named-Entity Recognition Tagger and Annotation ServerMER: a Minimal Named-Entity Recognition Tagger and Annotation Server
MER: a Minimal Named-Entity Recognition Tagger and Annotation Server
 
Master in Bioinformatics and Computational Biology
Master in Bioinformatics and Computational BiologyMaster in Bioinformatics and Computational Biology
Master in Bioinformatics and Computational Biology
 
KnowledgeCoin : recognizing and rewarding metadata integration and sharing ...
KnowledgeCoin: recognizing and rewarding metadata integration and sharing ...KnowledgeCoin: recognizing and rewarding metadata integration and sharing ...
KnowledgeCoin : recognizing and rewarding metadata integration and sharing ...
 
Bioinf2Bio Oportunidades
Bioinf2Bio OportunidadesBioinf2Bio Oportunidades
Bioinf2Bio Oportunidades
 
Stabvida oportunidades profissionais
Stabvida oportunidades profissionaisStabvida oportunidades profissionais
Stabvida oportunidades profissionais
 
Mestrado em Bioinformática e Biologia Computacional da FCUL
Mestrado em Bioinformática e Biologia Computacional da FCULMestrado em Bioinformática e Biologia Computacional da FCUL
Mestrado em Bioinformática e Biologia Computacional da FCUL
 

Dernier

Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Christo Ananth
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
Tonystark477637
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
dollysharma2066
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
rknatarajan
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Christo Ananth
 

Dernier (20)

NFPA 5000 2024 standard .
NFPA 5000 2024 standard                                  .NFPA 5000 2024 standard                                  .
NFPA 5000 2024 standard .
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 

A Flexible Recommendation System for Cable TV

  • 1. A Flexible Recommendation System for Cable TV Diogo Gonçalves, Miguel Costa, Francisco Couto LaSIGE @ Faculty of Sciences, University of Lisbon RecSysTV 2016, Boston, USA September 15, 2016
  • 2. • Information overflow • too many video contents from which to choose • too much time exploring video contents (thousands of programs broadcast in hundreds of TV channels, plus thousands of movies & series on VOD) • Impact • dissatisfaction • change to other systems with recommendations (e.g. Netflix, Youtube) • less visualization time • less revenue • churn Source: The Netflix Recommender System: Algorithms, Business Value, and Innovation, 2016 2
  • 3. • Characterization of user behaviour in cable TV (previous study). • showed that Live TV and Catch-up TV have more users and views • Improvement of state-of-the-art algorithms using the learning-to-rank (L2R) framework with contextual information and implicit feedback. 1. Extraction of implicit feedback for each pair <user, program> 2. Feature engineering on the contextual information 3. Creation of a large-scale dataset for learning & evaluation 4. Creation of a L2R model (cast recommendations as a supervised learning problem) 5. Evaluation against state-of-the-art algorithms 3 goal
  • 4. • 12 weeks, between October and December of 2015 • 10K clients randomly chosen that watched at least 10 programs per week • 21K programs available to the users during that period • 4.5M of user visualizations (users saw more than 50%) • we assumed users liked these ones • 78.5M of user visualizations inferior to 50% (most are 0%) • In average, each user watched 454 programs and each program was watched by 216 users 4 implicit feedback (ground truth)
  • 5. • Users tend to see the same programs & channels • Rankings of programs and channels & the relative visualization in each one (per user and for all users) • Users tend to see the same categories & subcategories of contents • Rankings of categories and subcategories & the relative visualization in each one (per user and for all users) • Time influences user preferences • Weekend information, broadcast period, time passed since broadcast, last days information • Users tend to see programs with similar textual description of their content • Textual similarity functions (e.g. Jaccard, TFxIDF) between metadata (e.g. title, description) • Users tend to see programs with similar characteristics • Number of episodes, content age & duration, category & subcategory • Different types of contents require different strategies • Information about who saw the content and whether is new for the user or for all • Similar users watch the same programs • Colaborative-filtering algorithms, such as WRMF (Collaborative Filtering for Implicit Feedback Datasets, ICDM 2008) (60 features created based on the insights of previous characterization) 5
  • 6. Users …… …… )1( q )( 1 i d )2( q )(i q )(m q )( 2 i d )(i jd )( )( i n id …… …… Programs )( 1 i y )( 2 i y )(i jy )( )( i n iyPreference scores )( 1 i x )( 2 i x )(i jx )( )( i n ixFeature vectors )( 1 i z )( 2 i z )(i jz )( )( i n izModel-generated scores …… …… …… ……   m i ii zyL 1 )()( , Listwise loss function LambdaRank [Burges et al. NIPS’06] LambdaMART [Wu et al. MSR-TR-2008-109] Algorithms work differently for different types of programs. For instance, new programs tend to work better with content-based approaches, while the programs never seen by a user tend to work well with collaborative-filtering approaches. How to complement them? 6
  • 7. • Random: returns random recommendations. • Popular: recommends the most popular programs watched by everyone. • UserPopular: recommends the programs most watched by the user. • WRMF: a matrix factorization technique for collaborative filtering where user and item latent vectors are inferred from implicit feedback. • Content-based: recommends programs based on similar content of programs watched by the user (e.g. category, actors, directors). • Learning to Rank: LambdaMART is a type of Gradient Boosted Regression Trees algorithm that was used to combine recommendation features and algorithm outputs. 7
  • 8. client visualizations (training data) t0 t1 client visualizations (test data) Learning System model h2 Learning System model h1 Predicting System Predicting System prediction prediction evaluation 10 Time-series 5-fold cross validation
  • 9. • Accuracy – How well the items meet the users’ information need? NDCG@k - match between recommendations and what the client watched after • Diversity – How different are the items with respect to each other? Intra-list diversity@k (ILD@k) • Novelty – How novel is the item for all users? Mean self-information (MSI@k) • Serendipitious – How different is the item with respect to the user history? Unexpecteness@k Only the top-k items in the recommendation lists are considered. We used a k of 5 and 10, because these are the typical sizes of recommendation lists shown in Cable TV. 9
  • 10. 1. Compute a recommendation list optimized for accuracy 2. Rerank recommendations, usually trading-off accuracy for other metrics (e.g. diversity and novelty) GreedyRecommendations(programs,objective_function) { ranking = Ø repeat programsi = argmax( i ) objective_function(ranking ∪ programsk), for all k ∈ [1..#programs] ranking = ranking ∪ programsi programs = programs programsi until programs = Ø return ranking } objective_function() calculates a score for a ranking list e.g. 50% NDCG@5 + 25% ILD@5 + 25% MSI@5 10
  • 11. accuracy accuracy for new diversity novelty serendipitious global nDCG@ 5 nDCG@ 10 nDCG@ 5 nDCG@ 10 ILD@ 5 ILD@ 10 MSI@ 5 MSI@ 10 Seren@ 5 Seren@ 10 objective @5 objective @10 Random 0.002 0.002 0.001 0.002 0.919 0.920 0.677 0.679 0.935 0.935 0.400 0.401 Popular 0.293 0.247 0.021 0.018 0.600 0.541 0.058 0.080 0.888 0.887 0.311 0.279 UserPopular 0.694 0.600 0.000 0.000 0.708 0.727 0.200 0.213 0.872 0.872 0.574 0.535 WRMF 0.306 0.277 0.034 0.030 0.523 0.588 0.153 0.162 0.856 0.858 0.322 0.326 Content- based 0.345 0.316 0.059 0.052 0.503 0.526 0.346 0.354 0.837 0.831 0.385 0.378 L2R 0.726 0.630 0.110 0.093 0.683 0.692 0.199 0.214 0.870 0.868 0.583 0.541 Greedy + L2R 0.556 0.440 0.027 0.010 0.753 0.784 0.606 0.654 0.850 0.857 0.618 0.580 Greedy trade-off accuracy for diversity and novelty Random provides the best results in these 3 metrics. L2R provides the best accurracy. Used as example: objective = 50% nDCG + 25% ILD + 25% MSI 11
  • 12. accuracy accuracy for new diversity novelty serendipitious global nDCG@ 5 nDCG@ 10 nDCG@ 5 nDCG@ 10 ILD@ 5 ILD@ 10 MSI@ 5 MSI@ 10 Seren@ 5 Seren@ 10 objective @5 objective @10 Random 0.002 0.002 0.003 0.003 0.919 0.920 0.677 0.679 0.935 0.935 0.400 0.401 Popular 0.295 0.252 0.007 0.006 0.600 0.550 0.058 0.081 0.888 0.887 0.312 0.283 UserPopular 0.699 0.603 0.000 0.000 0.708 0.727 0.186 0.199 0.872 0.872 0.573 0.533 WRMF 0.318 0.289 0.020 0.017 0.534 0.598 0.129 0.138 0.862 0.864 0.325 0.329 Content- based 0.435 0.396 0.047 0.038 0.525 0.550 0.238 0.242 0.840 0.836 0.408 0.396 L2R 0.726 0.631 0.115 0.081 0.683 0.692 0.193 0.206 0.870 0.869 0.582 0.540 Greedy + L2R 0.524 0.434 0.030 0.011 0.735 0.793 0.701 0.678 0.843 0.843 0.621 0.585 12
  • 13. • Typical approaches used in VOD, such as collaborative-filtering and content-based filtering, are not effective for Live TV and Catch-up TV. • Our learning to rank approach with contextual information and implicit feedback is superior in accuracy and accuracy for programs never seen before, while maintaining high scores of diversity and serendipity. • still, recommending programs never seen before requires further investigation. • A solution to optimize recommendations for multiple metrics were proposed, which enables to easily adapt recommendations to user preferences. • still, it is necessary to understand what clients value (e.g. accuracy vs. diversity). 13
  • 15. Novelty Serendipity Accuracy Diversity IDCG is the value for the perfect ranking reli is 1 if the user watched the program at rank i or 0 otherwise 𝑑 𝑖, 𝑗 = ⅓ 𝑐𝑎𝑡𝑒𝑔𝑜𝑟𝑦 𝑖=𝑗 + ⅓ 𝑠𝑢𝑏𝑐𝑎𝑡𝑒𝑔𝑜𝑟𝑦 𝑖=𝑗 + ⅓ 𝑐ℎ𝑎𝑛𝑛𝑒𝑙(𝑖=𝑗) 𝐼𝐿𝐷(𝑢) = 1 )|𝑅|( 𝑅 − 1 𝑖∈𝑅 𝑗∈𝑅 𝑑 𝑖, 𝑗 𝐼𝐿𝐷 = 1 𝑈 𝑢∈𝑈 )𝐼𝐿𝐷(𝑢 Ru = list of recommended items for user u Wu = watching history for user u U = list of users 𝑈𝑛𝑒𝑥𝑝 𝑢 = 1 𝑅 𝑊𝑢 𝑖∈𝑅 𝑗∈𝑊𝑢 𝑑 𝑖, 𝑗 𝑈𝑛𝑒𝑥𝑝 = 1 𝑈 𝑢∈𝑈 )𝑈𝑛𝑒𝑥𝑝(𝑢 𝑀𝑆𝐼 𝑢 = 1 𝑅 𝑢 𝑖∈𝑅 | 𝑣 ∈ 𝑈 𝑖 ∈ 𝑊𝑣}| 𝑈 𝑀𝑆𝐼 = 1 𝑈 𝑢∈𝑈 )𝑀𝑆𝐼(𝑢 15