Recommender Systems in the Media Industry: Relevance, Recency, Popularity, and Diversity
David Graus
✉ david.graus@fdmediagroep.nl
🐦 @dvdgrs
David Graus / Recsys Summer School / 11/09/2019
whoami
• 🎓 Academia
• BA Media Studies @ UvA (2008)
• MSc Media Technology @ Universiteit Leiden (2012)
• PhD Information Retrieval @ UvA (2017)
• 🏢 Industry
• Editor radio/online public broadcaster NTR (between BA & MSc)
• Research Intern @ Microsoft Research, US
• Data Scientist @ FD Mediagroep
• Lead Data Scientist @ FD SMART Journalism / BNR SMART Radio
In what is to follow…
• An introduction of FD Mediagroep
• Personalization & RecSys at FD Mediagroep:
• SMART Radio
• SMART Journalism
• Lessons learnt building/productizing a RecSys;
• Relevance
• Usefulness
• Trust
Part 1: Introduction
FD Mediagroup
The leading information provider in the financial-economic domain in the Netherlands
AI @ FD Mediagroup
Team
Dung Bahadir Anca Philippe
Maya David Feng Li’ao
Klaus Oberon Manon Azamat
AI @ FDMG: Academia/Industry
Part 2: SMART Radio
SMART Radio
• (Transcribe)
• Segment
• Tag
• Serve
Transcribe
Segment
• Based on metadata, text, and audio.
Tag
• Simple multilabel text classifier
• Trained on transcripts of segments + associated tags from website
Serve
• iOS/Android app
Part 3: SMART Journalism
SMART Journalism
• Moonshot; personalized summarization
• How to get there:
• Content Understanding
• Content-based Recommender System; <user, article>
• Personalized snippet retrieval; <user, snippet-in-article>
• Snippet-to-summary abstractor (?)
Tech
[Diagram: the RecSys matches a user (reader profile) against articles (article profiles) and produces a matching score per user-article pair, e.g., 0.352, 0.795, 0.125, 0.643.]
Article Representation
Article: 'Meer regelgeving cryptogeld noodzakelijk' ('More cryptocurrency regulation needed')
Article profile:
Tags: Blockchain, Cryptocurrency, Regelgeving
Section (Rubriek): Economie & Politiek
Stylometry: CharLen=2424, WordLen=486
Entities: -
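A minimal sketch of how an article profile like the one above could be assembled. The `ArticleProfile` class and the `build_profile` helper are hypothetical names, and the slides do not define CharLen and WordLen, so this assumes they are the character count and the whitespace-separated word count of the article body.

```python
from dataclasses import dataclass, field


@dataclass
class ArticleProfile:
    title: str
    tags: list
    section: str                 # "Rubriek" on the slide
    char_len: int                # assumed: number of characters in the body
    word_len: int                # assumed: number of whitespace-separated words
    entities: list = field(default_factory=list)


def build_profile(title, body, tags, section, entities=()):
    """Combine editorial metadata with simple stylometric counts."""
    return ArticleProfile(
        title=title,
        tags=list(tags),
        section=section,
        char_len=len(body),
        word_len=len(body.split()),
        entities=list(entities),
    )


profile = build_profile(
    title="Meer regelgeving cryptogeld noodzakelijk",
    body="Placeholder body text, not the real article.",  # illustrative only
    tags=["Blockchain", "Cryptocurrency", "Regelgeving"],
    section="Economie & Politiek",
)
```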
User Profile
Article read: 'Qualcomm krijgt bijna €1 mrd boete van Brussel' ('Qualcomm gets a fine of nearly €1 bn from Brussels')
Tags: Boete, Chips, EU, Mededinging
Section (Rubriek): Ondernemen
Stylometry: CharLen=3491, WordLen=635
Entities: Qualcomm, Apple, NXP, Intel, Google
Resulting user profile:
Tags: Boete, Chips, EU, Mededinging
Section (Rubriek): Ondernemen
Stylometry: CharLen=3491, WordLen=635
Entities: Qualcomm, Apple, NXP, Intel, Google
Articles read:
'Qualcomm krijgt bijna €1 mrd boete van Brussel'
Tags: Boete, Chips, EU, Mededinging
Section (Rubriek): Ondernemen
Stylometry: CharLen=3491, WordLen=635
Entities: Qualcomm, Apple, NXP, Intel, Google
'Topman van softwaremaker Salesforce kraakt grote techbedrijven' ('Chief of software maker Salesforce slams big tech companies')
Tags: Big Data, Blog, Davos, Google, Technologie
Section (Rubriek): Davos
Stylometry: CharLen=2856, WordLen=524
Entities: Google, Apple, Microsoft, Salesforce
Aggregated user profile:
Tags: Boete, Chips, EU, Mededinging, Big Data, Blog, Davos, Google, Technologie
Section (Rubriek): Ondernemen, Davos
Stylometry: CharLen=3491, WordLen=635, CharLen=2856, WordLen=524
Entities: Qualcomm, Apple (2), NXP, Intel, Google (2), Microsoft, Salesforce
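The aggregation in this example (features of the read articles merged into one profile, with repeated entities counted) could be sketched as follows. This is only an illustration: `build_user_profile` and the dictionary layout are hypothetical, and the production system may weight or decay features differently.

```python
from collections import Counter

# Feature sets copied from the two example articles on the slide.
articles = [
    {
        "tags": ["Boete", "Chips", "EU", "Mededinging"],
        "section": "Ondernemen",
        "entities": ["Qualcomm", "Apple", "NXP", "Intel", "Google"],
    },
    {
        "tags": ["Big Data", "Blog", "Davos", "Google", "Technologie"],
        "section": "Davos",
        "entities": ["Google", "Apple", "Microsoft", "Salesforce"],
    },
]


def build_user_profile(read_articles):
    """Aggregate the features of every article a user has read into counts."""
    profile = {"tags": Counter(), "sections": Counter(), "entities": Counter()}
    for article in read_articles:
        profile["tags"].update(article["tags"])
        profile["sections"][article["section"]] += 1
        profile["entities"].update(article["entities"])
    return profile


user_profile = build_user_profile(articles)
```

Entities that occur in both articles end up with a count of 2, matching the "Apple (2)" and "Google (2)" entries in the example.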
Model
• Content-based RecSys
• Using point-wise LTR
• Model: GBDT (xgboost)
• Labels: clicks (i.e., click = 1, non-click = 0)
• Implicit feedback (dangers and assumptions ahead)
• Features: user, article, user-article features (~14k)
• Trained nightly
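A rough sketch of what "point-wise" means here: every (user, article) impression becomes one independent training row with a click label, and at serving time each candidate is scored on its own and the list is sorted by score. The three features, the toy data, and `toy_score` are hypothetical stand-ins; in the setup described above the score would come from the trained GBDT (xgboost) over roughly 14k features, which is left out of this sketch.

```python
def pair_features(user, article):
    """Hypothetical user-article features (the real system uses ~14k)."""
    user_tags, article_tags = set(user["tags"]), set(article["tags"])
    union = user_tags | article_tags
    tag_jaccard = len(user_tags & article_tags) / len(union) if union else 0.0
    section_match = 1.0 if article["section"] in user["sections"] else 0.0
    return [tag_jaccard, section_match, float(article["word_len"])]


def pointwise_examples(impressions, users, articles):
    """One (features, label) row per impression: click = 1, non-click = 0."""
    X, y = [], []
    for user_id, article_id, clicked in impressions:
        X.append(pair_features(users[user_id], articles[article_id]))
        y.append(1 if clicked else 0)
    return X, y


def rank(user, candidates, score):
    """Score each candidate independently, then sort by descending score."""
    scored = [(score(pair_features(user, a)), a["id"]) for a in candidates]
    return [article_id for _, article_id in sorted(scored, reverse=True)]


# Toy data, for illustration only.
users = {"u1": {"tags": ["EU", "Chips"], "sections": ["Ondernemen"]}}
articles = {
    "a1": {"id": "a1", "tags": ["EU", "Boete"], "section": "Ondernemen", "word_len": 635},
    "a2": {"id": "a2", "tags": ["Davos"], "section": "Davos", "word_len": 524},
}
impressions = [("u1", "a1", True), ("u1", "a2", False)]

X, y = pointwise_examples(impressions, users, articles)


def toy_score(features):
    """Stand-in for the trained model's predicted click probability."""
    return features[0] + features[1]


ranking = rank(users["u1"], list(articles.values()), toy_score)
```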
Product
Goal
• Help users to discover content [1]
[1] Meyer, F. Recommender systems in industrial contexts (2012)
Building a RecSys for FD
• Definition of Done: Our RecSys needs to provide relevant and useful results, and be trustworthy.
• Relevance: IR evaluation metrics
• Usefulness: What we want our RecSys to achieve
• Trust: Whether we see that people like/use our RecSys
1. Relevance
• Using different (ranking) metrics, e.g.:
• NDCG
• MAP
• RPrec
• Evaluating models offline
• Comparing offline vs. online performance
Relevance: NDCG
• Item’s score: Gain
• Sum scores: Cumulative Gain
• Penalize scores at lower ranks: Discounted Cumulative Gain
• (e.g., divide each item's gain by the log of its rank position)
• Normalized by “ideal” ranking: NDCG
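The steps above can be written out in a few lines. This sketch uses the common log2(rank + 1) discount and normalizes against the best ordering of the same list, which assumes every relevant item is already somewhere in the ranking; other NDCG variants exist.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG normalized by the DCG of the ideal (descending-gain) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Gains in ranked order, e.g. 1 = clicked, 0 = not clicked.
perfect = ndcg([1, 1, 0])    # relevant items on top
imperfect = ndcg([1, 0, 1])  # one relevant item pushed down
```

A ranking with all relevant items on top scores 1.0; pushing a relevant item down lowers the score.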
http://fastml.com/evaluating-recommender-systems/
MAP
• Precision: TP / (TP + FP)
• Average Precision (AP): average of the precision values at each rank where a relevant item appears
• Mean Average Precision (MAP): mean of AP across queries
https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html
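A small sketch of AP and MAP over binary relevance lists. The function names are illustrative; by default AP is normalized by the number of relevant items found in the list, and `num_relevant` can be passed when the total number of relevant items is known.

```python
def average_precision(ranked_relevance, num_relevant=None):
    """Mean of precision@k over the ranks k that hold a relevant item."""
    hits, precisions = 0, []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    total = num_relevant if num_relevant is not None else hits
    return sum(precisions) / total if total else 0.0


def mean_average_precision(runs):
    """Mean of AP over all queries (here: one ranked list per user)."""
    return sum(average_precision(r) for r in runs) / len(runs) if runs else 0.0


ap_first = average_precision([1, 0, 1])   # (1/1 + 2/3) / 2
ap_second = average_precision([0, 1])     # (1/2) / 1
map_score = mean_average_precision([[1, 0, 1], [0, 1]])
```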
R-Precision
• Precision@k
• where k = R (= number of relevant documents)
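As a sketch, R-Precision only needs the ranked list and the set of relevant documents (the function and variable names are illustrative):

```python
def r_precision(ranked_ids, relevant_ids):
    """Precision at rank R, where R is the number of relevant documents."""
    relevant = set(relevant_ids)
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for doc in ranked_ids[:r] if doc in relevant) / r


# Three relevant documents, two of them in the top three.
score = r_precision(["a", "b", "c", "d"], {"a", "c", "e"})
```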
https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html
https://github.com/cvangysel/pytrec_eval
But…
• It’s all we can do…
Garcin et al., Offline and Online Evaluation of News Recommender Systems at swissinfo.ch (RecSys 2014)
2. Usefulness
• 🤖 Part algorithmic
• What can we build?
• 💰 Part business/values
• What do we want our RecSys to do?
🤖 Algorithmic: Tailor-made RecSys
• Diversity
• Popularity
• Coverage
• Surprise
• Speed/Recency
• …
Popularity
• “It is generally not useful to recommend very popular items as they are generally already known by the user” [1]
• Presentation bias
• Skewed clicks
• Issue in historic data and online evaluation
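One simple way to discount for popularity is to subtract a penalty that grows with an item's click count before sorting. This is an illustrative heuristic, not necessarily what runs in production; the function name, the `strength` parameter, and the numbers are made up for the example.

```python
import math


def discount_popularity(scores, click_counts, strength=0.1):
    """Re-rank item ids by score - strength * log(1 + clicks)."""
    adjusted = {
        item: score - strength * math.log1p(click_counts.get(item, 0))
        for item, score in scores.items()
    }
    return sorted(adjusted, key=adjusted.get, reverse=True)


scores = {"most_read": 0.80, "niche": 0.75}         # model scores (toy values)
click_counts = {"most_read": 5000, "niche": 40}     # historic clicks (toy values)

with_discount = discount_popularity(scores, click_counts)
without_discount = discount_popularity(scores, click_counts, strength=0.0)
```

With the penalty switched off the most-read article stays on top; with it, the less popular article is ranked first.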
[1] Meyer, F. Recommender systems in industrial contexts (2012)
Coverage
• Unleash the long tail?
Most read: same 6 articles for everyone
Personalized: 178 articles
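Counts like these come from collecting the distinct articles shown across all users. A minimal sketch, with hypothetical names and toy data:

```python
def catalog_coverage(recommendations, catalog_size):
    """Return (number of distinct recommended items, share of the catalog)."""
    distinct = set()
    for items in recommendations.values():
        distinct.update(items)
    return len(distinct), len(distinct) / catalog_size


# Toy example: a non-personalized list versus personalized lists.
most_read = {"u1": ["a", "b"], "u2": ["a", "b"]}
personalized = {"u1": ["a", "c"], "u2": ["b", "d"]}

most_read_coverage = catalog_coverage(most_read, catalog_size=10)
personalized_coverage = catalog_coverage(personalized, catalog_size=10)
```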
💰 Business: What do we want/expect?
• ⚠ WIP: Study in which journalists, data scientists, product, and developers were interviewed to identify shared perceptions, expectations, and attitudes towards algorithms.
Business: perception of algorithms
• Distinguish between audience- and journalism-related values.
• Results TBD, but think of aspects such as:
• broadness vs. personalizedness (audience)
• usability (audience)
• objectivity/neutrality (journalism)
• speed/immediacy (journalism)
• etc.
But what about the filter bubble?
Short story
• Tailor-made RecSys: we are in control
• RecSys as “yet another ranking”
• The slightly longer story…
1. Should we worry?
[We] focus on empirical evidence of the spread of personalised
news services and its likely effects on political polarisation and
political information.
[Zuiderveen Borgesius et al., 2016]
• It’s difficult to bubble yourself
• Both offline:
• “Those who use a lot of partisan information also use an above-average
amount of mainstream news.”
• “[M]ost people by far still get their news via traditional sources, most
notably public-service television.”
• And online:
• “People who choose personalisation are more likely to use an above-average
amount of general-interest news as well.”
• “A recent study suggests that the influence of [the Facebook] algorithm is
lower than the influence of the user’s choices.”
[Zuiderveen Borgesius et al., 2016]
1. ⚠ Take homes
• “[T]here is no empirical evidence that warrants any strong
worries about filter bubbles.”
• “One lesson we should have learned from the past is that panic
does not lead to sane policies. More evidence is needed on the
process and effects of personalisation, so we can shift the basis of
policy discussions from fear to insight.”
[Zuiderveen Borgesius et al., 2016]
2. Some empirical evidence…
• On average, 11.7% of results show differences due to
personalization on Google.
• Varies widely by search query and by result ranking.
• Only found measurable personalization as a result of searching
with a logged in account and the IP address of the searching user.
2. Method
1. 👤
1. Get 200 volunteers with Google accounts
2. Have them issue the same set of queries
3. Compare results
2. 🤖
1. Construct Google bot accounts
• Vary aspects such as location, demographics, click behavior, browsing + search
history, etc.
2. Have them issue the same set of queries
3. Compare results
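The "compare results" step boils down to measuring how much two result lists for the same query differ. The sketch below uses set overlap (Jaccard) and a position-by-position comparison; these are illustrative choices and not necessarily the exact measures used in the study, and the URLs are placeholders.

```python
def jaccard(results_a, results_b):
    """Overlap of two result lists, ignoring order (1.0 = same set of URLs)."""
    a, b = set(results_a), set(results_b)
    return len(a & b) / len(a | b) if a | b else 1.0


def fraction_changed(results_a, results_b):
    """Share of rank positions at which two result lists differ."""
    pairs = list(zip(results_a, results_b))
    return sum(x != y for x, y in pairs) / len(pairs) if pairs else 0.0


# Same query issued from two accounts (placeholder URLs).
account_1 = ["url1", "url2", "url3", "url4"]
account_2 = ["url1", "url3", "url2", "url5"]

overlap = jaccard(account_1, account_2)
changed = fraction_changed(account_1, account_2)
```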
[Hannák et al., 2013]
2. Findings 👤
• On average, 11.7% of results show differences due to
personalization on Google.
• Top ranks tend to be less personalized than bottom ranks.
[Hannák et al., 2013]
• ✅ A great deal of personalization based on location (especially for company names, where users received different store locations).
• ❌ The least personalized results tend to be factual and health related queries.
[Hannák et al., 2013]
2. Findings 🤖
✅ Logged in vs. “cleared cookies” account
✅ Geolocation
❌ Gender
❌ Age
❌ Search history
❌ Click history
❌ Browsing history
[Hannák et al., 2013]
3. Bursting bubbles
3. Aim
Increase exposure to varied political opinions with a goal of improving civil discourse
[Yom-Tov et al., 2014]
3. Method
• Classify searchers into political leaning (using geo data)
[Yom-Tov et al., 2014]
• Infer political leaning of news sources from user behavior.
• Identify polarized search queries (with strong political leanings —
in both directions).
[Yom-Tov et al., 2014]
• Treatment group: Insert red results for blue users, and blue
results for red users
• Control group: Do not adjust results
[Yom-Tov et al., 2014]
1. Short term: Compare clicks/behavior between control &
treatment.
2. Long term: Measure during two weeks, per user;
1. Polarization: Difference of user’s leaning-score compared to
average leaning across all sources.
2. Engagement: Average number of queries + average read
articles.
3. Findings 1
• Fewer clicks on inserted opposing sources.
• But: “Results pages of the opposing viewpoint which had a similarity higher than the average tended to be clicked 38% more than those below the average.”
[Yom-Tov et al., 2014]
3. Findings 2
• Polarization:
• Treatment: Average leaning ‘moves’ ~25% to centre
• Control: Negligible difference (~1%)
• Engagement:
• Treatment: Number of queries: +9% / articles read: +4%
• Control: Small reduction in both (~2.5%)
[Yom-Tov et al., 2014]
4. How do algorithmic recommendations 🤖 compare to human 👤?
4. Method
• 🤖 Generate article recommendations for news articles using
different (off the shelf) recommender systems (CF & CB).
• 👤 Compare to hand-picked article recommendations.
• Measure “diversity” of recommended articles:
• At content level
• At tag level
• At category level
• At sentiment/subjectivity level
[Möller et al., 2018]
4. Findings
“Conventional recommendation algorithms at least preserve the
topic/sentiment diversity of the article supply.”
[Möller et al., 2018]
4. ⚠ Take home
• Diversity is preserved with conventional recommender systems.
[Möller et al., 2018]
/end of rant
⚠ Besides
• It is trivial to model and incorporate aspects such as diversity (as
you’ll (hopefully) learn during our hands-on 🙌…)
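To give an idea of how little code this takes, here is a sketch of one well-known diversification technique, maximal marginal relevance (MMR) re-ranking, using tag overlap as the similarity between articles. It is offered as an example only, not as the method used in the hands-on or in production; the names, the `trade_off` parameter, and the toy candidates are assumptions.

```python
def jaccard(a, b):
    """Tag overlap between two articles (0.0 = no shared tags)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def mmr_rerank(candidates, k, trade_off=0.7):
    """Greedy MMR. candidates: list of (item_id, relevance, tags) tuples.

    trade_off = 1.0 ranks purely by relevance; lower values penalize items
    that resemble what has already been selected.
    """
    remaining = list(candidates)
    selected = []

    def mmr(candidate):
        _, relevance, tags = candidate
        max_sim = max((jaccard(tags, s[2]) for s in selected), default=0.0)
        return trade_off * relevance - (1 - trade_off) * max_sim

    while remaining and len(selected) < k:
        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return [item_id for item_id, _, _ in selected]


candidates = [
    ("a", 0.90, ["EU", "Chips"]),
    ("b", 0.85, ["EU", "Chips"]),   # near-duplicate of "a" in terms of tags
    ("c", 0.60, ["Davos"]),
]

diverse = mmr_rerank(candidates, k=2, trade_off=0.5)
relevance_only = mmr_rerank(candidates, k=2, trade_off=1.0)
```

With an even trade-off the near-duplicate is skipped in favor of the article on a different topic; with `trade_off=1.0` the list is ordered by relevance alone.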
3. Trust
• Returning visitors / users
• Measure CTR
• Questionnaires [1]
[1] https://hci.epfl.ch/research-projects/resque/
Explaining user profiles
Why fair and transparent?
• User studies have shown:
• Our users want personalized content
• Our users care for transparency
• FD
• Verifiability is one of the core values for FD Mediagroup
• Transparency enables verifiability
The Context
[Diagram with the components: user data, rec engine, inferred data, recommendations]
Step 1: Discourse. Understanding the problem
Step 2: Framework. Systematic layering of explanations
Step 3: Evaluation. Data exploration and evaluation
Discourse
• Who is the target audience of the explanations?
• What is the goal?
• What purpose do the explanations serve?
Framework
• I want to be an expert
• I want to stay informed
• I want to broaden my horizon
• I want to discover the unexplored
Values: broadness, diversity, autonomy, objectivity, match with the user needs, controllability
Data exploration
User study
Your goal is to Broaden your Horizons. There may be topics you do not normally read about, but you may actually find interesting. Exploring this helps to build a broad perspective on the issues that matter to you.
Your goal is to Discover the Unexplored. There may be topics that you haven’t explored before that may actually become new interests. Exploring new topics can promote creativity and objectivity.
Aim: Study whether being offered an explanation of reading behavior and a particular goal would influence the user’s intended reading behavior.
[Flow diagram of the study procedure: pick a persona from four data-driven profiles (Persona 1 to Persona 4); randomly assign goal order (Goal A, Goal B); explain the goals (Broaden Horizons, Discover the Unexplored); show visualization; questionnaire.]
Reading Behavior Explanation
[Visualization of reading behavior along two dimensions: similarity and familiarity]
Hypotheses
1. Goal framework: topics chosen under Broaden Horizon are more similar and more familiar than those chosen under Discover the Unexplored.
2. Broaden horizon: users choose topics that have high similarity and high familiarity compared to the non-selected topics.
3. Discover the unexplored: users choose topics that have low similarity and low familiarity compared to the non-selected topics.
Results (executive summary)
Partial support
What have we done
• Novel, scalable and generalizable framework of user-profile
explanations
• Exploration of reader data; domain-knowledge and data-driven ontology
• Interface mockup
• User-study to evaluate the value-driven explanations
• “Explainability” as a product
Future
• Further design and implementation of our framework
• Formalize the epistemic goals together with editors
• Extend user studies/focus groups to all value-driven goals
In summary
• Shown you where/how we personalize @ FD Mediagroep
• Shown you how we evaluate and can determine usefulness of
RecSys using IR metrics
• Told you that these metrics are not all there is to it
• Told you it is not so difficult to incorporate diversity, discount for
popularity, etc.
• What we’ll work on after the break
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 

Dernier (20)

Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptx
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 

RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.

  • 1. Recommender Systems in the Media Industry: Relevance, Recency, Popularity, and Diversity. David Graus ✉ david.graus@fdmediagroep.nl 🐦 @dvdgrs
  • 2. David Graus / Recsys Summer School / 11/09/2019 whoami • 🎓 Academia • BA Media Studies @ UvA (2008) • MSc Media Technology @ Universiteit Leiden (2012) • PhD Information Retrieval @ UvA (2017) • 🏢 Industry • Editor radio/online public broadcaster NTR (between BA & MSc) • Research Intern @ Microsoft Research, US • Data Scientist @ FD Mediagroep • Lead Data Scientist @ FD SMART Journalism / BNR SMART Radio
  • 3. In what is to follow… • An introduction to FD Mediagroep • Personalization & RecSys at FD Mediagroep: • SMART Radio • SMART Journalism • Lessons learnt building/productizing a RecSys: • Relevance • Usefulness • Trust
  • 5.
  • 6. FD Mediagroup: the leading information provider in the financial-economic domain in the Netherlands
  • 7.
  • 8.
  • 9. AI @ FD Mediagroup
  • 10. AI @ FD Mediagroup
  • 11. Team: Dung, Bahadir, Anca, Philippe, Maya, David, Feng, Li’ao, Klaus, Oberon, Manon, Azamat
  • 12. AI @ FDMG: Academia/Industry
  • 13. Part 2: SMART Radio
  • 14.
  • 15. SMART Radio • (Transcribe) • Segment • Tag • Serve
  • 16. Transcribe
  • 17. Segment • Based on metadata, text, and audio.
  • 18. Tag • Simple multilabel text classifier • Trained on transcripts of segments + associated tags from website
  • 19. Serve • iOS/Android app
  • 20. Part 3: SMART Journalism
  • 21.
  • 22. SMART Journalism • Moonshot: personalized summarization • How to get there: • Content Understanding • Content-based Recommender System: <user, article> • Personalized snippet retrieval: <user, snippet-in-article> • Snippet-to-summary abstractor (?)
  • 23. Tech
  • 24. [Diagram: the RecSys matches a user against articles and outputs one matching score per article, e.g. 0.352, 0.795, 0.125, 0.643]
  • 25. [Diagram: the RecSys matches a reader profile (built for the user) against article profiles (built for the articles)]
  • 26. Article Representation • Article: 'Meer regelgeving cryptogeld noodzakelijk' (“More cryptocurrency regulation needed”) • Article Profile: Tags: Blockchain, Cryptocurrency, Regelgeving; Section: Economie & Politiek; Stylometry: CharLen=2424, WordLen=486; Entities: -
  • 27. User Profile • Article read: Qualcomm krijgt bijna €1 mrd boete van Brussel (“Qualcomm gets a fine of almost €1 bn from Brussels”): Tags: Boete, Chips, EU, Mededinging; Section: Ondernemen; Stylometry: CharLen=3491, WordLen=635; Entities: Qualcomm, Apple, NXP, Intel, Google • User Profile after this read: Tags: Boete, Chips, EU, Mededinging; Section: Ondernemen; Stylometry: CharLen=3491, WordLen=635; Entities: Qualcomm, Apple, NXP, Intel, Google
  • 28. User Profile over two reads • Article 1: Qualcomm krijgt bijna €1 mrd boete van Brussel: Tags: Boete, Chips, EU, Mededinging; Section: Ondernemen; Stylometry: CharLen=3491, WordLen=635; Entities: Qualcomm, Apple, NXP, Intel, Google • Article 2: Topman van softwaremaker Salesforce kraakt grote techbedrijven (“CEO of software maker Salesforce slams big tech companies”): Tags: Big Data, Blog, Davos, Google, Technologie; Section: Davos; Stylometry: CharLen=2856, WordLen=524; Entities: Google, Apple, Microsoft, Salesforce • Reader's User Profile (both articles combined): Tags: Boete, Chips, EU, Mededinging, Big Data, Blog, Davos, Google, Technologie; Section: Ondernemen, Davos; Stylometry: CharLen=3491, WordLen=635, CharLen=2856, WordLen=524; Entities: Qualcomm, Apple (2), NXP, Intel, Google (2), Microsoft, Salesforce
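The profile slides above suggest that a user profile is built by accumulating the profiles of the articles a user has read, with counts for repeated values such as "Apple (2)". A minimal sketch of that idea follows. The class and field names are hypothetical and are not taken from the FD codebase; only the example values come from the slides.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ArticleProfile:
    # Hypothetical container mirroring the fields shown on the slides.
    title: str
    tags: list
    section: str
    entities: list
    char_len: int
    word_len: int


@dataclass
class UserProfile:
    # Counters keep track of how often each value occurred in read articles.
    tags: Counter = field(default_factory=Counter)
    sections: Counter = field(default_factory=Counter)
    entities: Counter = field(default_factory=Counter)
    char_lens: list = field(default_factory=list)
    word_lens: list = field(default_factory=list)

    def add_read(self, article: ArticleProfile) -> None:
        """Fold one read article into the profile."""
        self.tags.update(article.tags)
        self.sections[article.section] += 1
        self.entities.update(article.entities)
        self.char_lens.append(article.char_len)
        self.word_lens.append(article.word_len)


qualcomm = ArticleProfile(
    title="Qualcomm krijgt bijna €1 mrd boete van Brussel",
    tags=["Boete", "Chips", "EU", "Mededinging"],
    section="Ondernemen",
    entities=["Qualcomm", "Apple", "NXP", "Intel", "Google"],
    char_len=3491,
    word_len=635,
)
salesforce = ArticleProfile(
    title="Topman van softwaremaker Salesforce kraakt grote techbedrijven",
    tags=["Big Data", "Blog", "Davos", "Google", "Technologie"],
    section="Davos",
    entities=["Google", "Apple", "Microsoft", "Salesforce"],
    char_len=2856,
    word_len=524,
)

profile = UserProfile()
for article in (qualcomm, salesforce):
    profile.add_read(article)
```

After both reads, `profile.entities` counts Apple and Google twice and the other entities once, which matches the aggregated profile on slide 28.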
  • 29. Model • Content-based RecSys • Using point-wise LTR • Model: GBDT (xgboost) • Labels: clicks (i.e., click = 1, non-click = 0) • Implicit feedback (dangers and assumptions ahead) • Features: user, article, user-article features (~14k) • Trained nightly
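In a point-wise learning-to-rank setup, each <user, article> impression becomes one training row with a binary click label. The sketch below shows how such rows might be assembled. The three features and the impression log are invented for illustration (the deck only states that roughly 14k user, article, and user-article features are used), and the resulting rows would still need to be passed to a GBDT library such as xgboost for training.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of tags."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def make_row(user_tags, article, clicked):
    """Build one point-wise training row: (feature dict, click label)."""
    features = {
        # user-article features (hypothetical examples)
        "tag_jaccard": jaccard(user_tags, article["tags"]),
        "n_shared_tags": len(set(user_tags) & set(article["tags"])),
        # article feature
        "article_word_len": article["word_len"],
    }
    return features, int(clicked)


user_tags = ["Boete", "Chips", "EU", "Mededinging"]

# Hypothetical impression log: (article shown, whether the user clicked).
impressions = [
    ({"tags": ["Chips", "EU", "Technologie"], "word_len": 500}, True),
    ({"tags": ["Blockchain", "Cryptocurrency", "Regelgeving"], "word_len": 486}, False),
]

rows = [make_row(user_tags, article, clicked) for article, clicked in impressions]
```

Treating every non-click as a negative label is the implicit-feedback assumption the slide warns about: an article may be unclicked simply because it was never seen.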
  • 31. Goal • Help users to discover content [1] [1] Meyer, F. Recommender systems in industrial contexts (2012)
  • 32.
  • 33.
  • 34. Building a RecSys for FD • Definition of Done: Our RecSys needs to provide relevant and useful results, and be trustworthy. • Relevance: IR evaluation metrics • Usefulness: What we want our RecSys to achieve • Trust: Whether we see that people like/use our RecSys
  • 35. 1. Relevance • Using different (ranking) metrics, e.g.: • NDCG • MAP • RPrec • Evaluating models offline • Comparing offline vs. online performance
  • 36. Relevance: NDCG • Item’s score: Gain • Sum scores: Cumulative Gain • Penalize scores at lower ranks: Discounted Cumulative Gain • (e.g., divide each item’s gain by the log of its rank position) • Normalized by “ideal” ranking: NDCG http://fastml.com/evaluating-recommender-systems/
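The NDCG steps above (gain, cumulative gain, discount, normalization) can be sketched in a few lines. This version uses the common log2(rank + 1) discount and treats clicks as binary gains; both are assumptions, since the slide only says "log of item position".

```python
import math


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking.

    `gains` lists the gain of each item in ranked order; `k` optionally
    truncates the evaluation to the top-k positions.
    """
    ideal = sorted(gains, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    if ideal_dcg == 0:
        return 0.0  # no relevant items at all
    return dcg(gains[:k]) / ideal_dcg


perfect = ndcg([1, 0, 0])   # clicked item ranked first
buried = ndcg([0, 0, 1])    # clicked item ranked third
```

Here `perfect` is 1.0 and `buried` is 0.5, because a gain at rank 3 is discounted by log2(4) = 2.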
  • 37. MAP • Precision: TP / (TP + FP) • Average Precision: average of the precision values at each rank where a relevant item appears • Mean Average Precision: average AP across queries https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html
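As a small worked sketch of the MAP definition above: average precision is computed per query (or, in a RecSys setting, per user) and then averaged. The item identifiers below are made up, and AP is normalized by the number of relevant items, following the convention in the linked IR book chapter.

```python
def average_precision(ranking, relevant):
    """Mean of precision@rank over the ranks where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant items that were never retrieved contribute a precision of 0.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """`runs` is a list of (ranking, relevant set) pairs, one per query/user."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),                 # AP = (1/2) / 1
]
map_score = mean_average_precision(runs)
```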
  • 38. R-Precision • Precision@k • where k = R (= number of relevant documents) https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html
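R-Precision can be sketched the same way: cut the ranking at R, the number of relevant items, and measure precision at that cutoff. The identifiers are illustrative only.

```python
def r_precision(ranking, relevant):
    """Precision at cutoff R, where R is the number of relevant items."""
    r = len(relevant)
    if r == 0:
        return 0.0
    top_r = ranking[:r]
    return sum(1 for item in top_r if item in relevant) / r


score = r_precision(["a", "b", "c", "d"], {"a", "c"})  # top-2 is ["a", "b"]
```

With two relevant items, only the top two positions count, so `score` is 0.5 even though the second relevant item appears at rank 3.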
  • 39. But… • It’s all we can do… • https://github.com/cvangysel/pytrec_eval [Garcin et al., Offline and Online Evaluation of News Recommender Systems at swissinfo.ch (RecSys 2014)]
  • 40. 2. Usefulness • 🤖 Part algorithmic • What can we build? • 💰 Part business/values • What do we want our RecSys to do?
  • 41. 🤖 Algorithmic: Tailor-made RecSys • Diversity • Popularity • Coverage • Surprise • Speed/Recency • …
  • 42. Popularity • “It is generally not useful to recommend very popular items as they are generally already known by the user” [1] • Presentation bias • Skewed clicks • Issue in historic data and online evaluation [1] Meyer, F. Recommender systems in industrial contexts (2012)
  • 43. Coverage • Unleash the long tail? • Most read: same 6 articles for everyone • Personalized: 178 articles
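The coverage comparison above (6 articles for everyone versus 178 personalized articles) boils down to counting distinct recommended items. A minimal sketch with invented users and article IDs, expressing coverage as a fraction of the catalog:

```python
def catalog_coverage(recommendations, catalog):
    """Fraction of catalog items that appear in at least one user's list.

    `recommendations` maps a user ID to that user's recommended item IDs.
    """
    catalog = set(catalog)
    if not catalog:
        return 0.0
    recommended = set()
    for items in recommendations.values():
        recommended.update(items)
    return len(recommended & catalog) / len(catalog)


catalog = [f"a{i}" for i in range(10)]
users = ["u1", "u2", "u3"]

# A "most read" list is identical for every user.
most_read = {user: ["a0", "a1"] for user in users}

# Personalized lists differ per user and reach further into the catalog.
personalized = {
    "u1": ["a0", "a3"],
    "u2": ["a4", "a7"],
    "u3": ["a1", "a3"],
}

most_read_coverage = catalog_coverage(most_read, catalog)
personalized_coverage = catalog_coverage(personalized, catalog)
```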
  • 44. 💰 Business: What do we want/expect? • ⚠ WIP: study in which journalists, data scientists, product, and developers were interviewed to identify shared perceptions, expectations, and attitudes towards algorithms.
  • 45. Business: perception of algorithms • Distinguish between audience- and journalism-related values. • Results TBD, but think of aspects such as: • broadness vs. personalization (audience) • usability (audience) • objectivity/neutrality (journalism) • speed/immediacy (journalism) • etc.
  • 46.
  • 47. But what about the filter bubble?
  • 48. Short story • Tailor-made RecSys: we are in control • RecSys as “yet another ranking” • The slightly longer story…
  • 49.
  • 50. 1. Should we worry? [We] focus on empirical evidence of the spread of personalised news services and its likely effects on political polarisation and political information. [Zuiderveen Borgesius et al., 2016]
  • 51. 1. Should we worry? • It’s difficult to bubble yourself • Both offline: • “Those who use a lot of partisan information also use an above-average amount of mainstream news.” • “[M]ost people by far still get their news via traditional sources, most notably public-service television.” • And online: • “People who choose personalisation are more likely to use an above-average amount of general-interest news as well.” • “A recent study suggests that the influence of [the Facebook] algorithm is lower than the influence of the user’s choices.” [Zuiderveen Borgesius et al., 2016]
  • 52. 1. ⚠ Take homes • “[T]here is no empirical evidence that warrants any strong worries about filter bubbles.” • “One lesson we should have learned from the past is that panic does not lead to sane policies. More evidence is needed on the process and effects of personalisation, so we can shift the basis of policy discussions from fear to insight.” [Zuiderveen Borgesius et al., 2016]
  • 53. 2. Some empirical evidence… • On average, 11.7% of results show differences due to personalization on Google. • Varies widely by search query and by result ranking. • Only found measurable personalization as a result of searching with a logged in account and the IP address of the searching user.
  • 54. 2. Method • 👤 (1) Get 200 volunteers with Google accounts; (2) have them issue the same set of queries; (3) compare results • 🤖 (1) Construct Google bot accounts, varying aspects such as location, demographics, click behavior, browsing + search history, etc.; (2) have them issue the same set of queries; (3) compare results [Hannák et al., 2013]
  • 55. 2. Findings 👤 • On average, 11.7% of results show differences due to personalization on Google. • Top ranks tend to be less personalized than bottom ranks. [Hannák et al., 2013]
  • 56. 2. Findings 👤 • ✅ A great deal of personalization based on location (especially for company names, where users received different store locations). • ❌ The least personalized results tend to be factual and health related queries. [Hannák et al., 2013]
  • 57. 2. Findings 🤖 • ✅ Logged in vs. “cleared cookies” account • ✅ Geolocation • ❌ Gender • ❌ Age • ❌ Search history • ❌ Click history • ❌ Browsing history [Hannák et al., 2013]
  • 58. 3. Bursting bubbles
  • 59. 3. Aim • Increase exposure to varied political opinions with a goal of improving civil discourse [Yom-Tov et al. 2014]
  • 60. 3. Method • Classify searchers into political leaning (using geo data) [Yom-Tov et al. 2014]
  • 61. 3. Method • Infer political leaning of news sources from user behavior. • Identify polarized search queries (with strong political leanings, in both directions). [Yom-Tov et al. 2014]
  • 62. 3. Method • Treatment group: Insert red results for blue users, and blue results for red users • Control group: Do not adjust results [Yom-Tov et al. 2014]
  • 63. 3. Method • 1. Short term: Compare clicks/behavior between control & treatment. • 2. Long term: Measure during two weeks, per user: (a) Polarization: difference between the user’s leaning score and the average leaning across all sources; (b) Engagement: average number of queries + average number of articles read.
  • 64. 3. Findings 1 • Fewer clicks on inserted opposing sources. • But: “Results pages of the opposing viewpoint which had a similarity higher than the average tended to be clicked 38% more than those below the average.” [Yom-Tov et al. 2014]
  • 65. 3. Findings 2 • Polarization: • Treatment: Average leaning ‘moves’ ~25% to centre • Control: Negligible difference (~1%) • Engagement: • Treatment: Number of queries: +9% / articles read: +4% • Control: Small reduction in both (~2.5%) [Yom-Tov et al. 2014]
  • 66. 4. How do algorithmic recommendations 🤖 compare to human 👤?
  • 67. 4. Method • 🤖 Generate article recommendations for news articles using different (off the shelf) recommender systems (CF & CB). • 👤 Compare to hand-picked article recommendations. • Measure “diversity” of recommended articles: • At content level • At tag level • At category level • At sentiment/subjectivity level [Möller et al. 2018]
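One common way to put a number on tag-level "diversity" of a recommendation list is the average pairwise distance between the recommended items. The sketch below uses Jaccard distance over tag sets; this is an illustrative choice and not necessarily the measure used by Möller et al., and the tag sets are made up.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b|; two empty sets are treated as identical."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def intra_list_diversity(tag_sets):
    """Average pairwise Jaccard distance between the items in one list."""
    pairs = list(combinations(tag_sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


same_topic = intra_list_diversity([{"EU", "Chips"}, {"EU", "Chips"}])
mixed = intra_list_diversity([{"EU", "Chips"}, {"EU"}, {"Davos"}])
```

The same function applies at category level by passing single-element sets of categories instead of tag sets.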
  • 68. 4. Findings “Conventional recommendation algorithms at least preserve the topic/sentiment diversity of the article supply.” [Möller et al. 2018]
  • 69. 4. ⚠ Take home • Diversity is preserved with conventional recommender systems. [Möller et al. 2018]
  • 71. ⚠ Besides • It is trivial to model and incorporate aspects such as diversity (as you’ll (hopefully) learn during our hands-on 🙌…)
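One well-known way to incorporate diversity is greedy re-ranking in the style of Maximal Marginal Relevance (MMR): repeatedly pick the item with the best trade-off between its relevance score and its similarity to what has already been selected. This is a sketch under assumptions, not the approach used in the hands-on session; the scores, tags, and the `lam` trade-off parameter are invented.

```python
def jaccard(a, b):
    """Jaccard similarity between two tag sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_rerank(scores, tags, k, lam=0.7):
    """Greedy MMR-style re-ranking.

    scores: item -> relevance score from the recommender
    tags:   item -> set of tags, used to measure similarity
    lam:    1.0 means pure relevance; lower values favour diversity
    """
    selected = []
    candidates = set(scores)
    while candidates and len(selected) < k:

        def marginal(item):
            # Similarity to the closest already-selected item (0 if none yet).
            max_sim = max(
                (jaccard(tags[item], tags[s]) for s in selected), default=0.0
            )
            return lam * scores[item] - (1 - lam) * max_sim

        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates), key=marginal)
        selected.append(best)
        candidates.remove(best)
    return selected


scores = {"a": 0.9, "b": 0.85, "c": 0.6}
tags = {"a": {"EU", "Chips"}, "b": {"EU", "Chips"}, "c": {"Davos"}}

by_relevance = mmr_rerank(scores, tags, k=2, lam=1.0)
diversified = mmr_rerank(scores, tags, k=2, lam=0.5)
```

With `lam=1.0` the two near-duplicate articles are returned; with `lam=0.5` the second slot goes to the lower-scored article on a different topic.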
  • 72. 3. Trust • Returning visitors / users • Measure CTR • Questionnaires [1] [1] https://hci.epfl.ch/research-projects/resque/
  • 74. Why fair and transparent? • User studies have shown: • Our users want personalized content • Our users care for transparency • FD • Verifiability is one of the core values for FD Mediagroup • Transparency enables verifiability
  • 75. The Context [Diagram labels: User data, Rec engine, Inferred data, Recommendations]
  • 76. The Context
  • 77. STEP 1 DISCOURSE: Understanding the problem • STEP 2 FRAMEWORK: Systematic layering of explanations • STEP 3 EVALUATION: Data exploration and evaluation
  • 78. STEP 1 DISCOURSE: Understanding the problem • STEP 2 FRAMEWORK: Systematic layering of explanations • STEP 3 EVALUATION: Data exploration and evaluation
  • 79. Discourse • Who is the target audience of the explanations? • What is the goal? • What purpose do the explanations serve?
  • 80. STEP 1 DISCOURSE: Understanding the problem • STEP 2 FRAMEWORK: Systematic layering of explanations • STEP 3 EVALUATION: Data exploration and evaluation
  • 81. Framework
  • 82. Framework • Goals: “I want to be an expert”, “I want to stay informed”, “I want to broaden my horizon”, “I want to discover the unexplored” • Values: broadness, diversity, autonomy, objectivity, match with the user needs, controllability
  • 83. STEP 1 DISCOURSE: Understanding the problem • STEP 2 FRAMEWORK: Systematic layering of explanations • STEP 3 EVALUATION: Data exploration and evaluation
  • 85. Data exploration
  • 86. Data exploration
  • 87. Data exploration
  • 89. User study (System Evaluation) • “Your goal is to Broaden your Horizons. There may be topics you do not normally read about, but you may actually find interesting. Exploring this helps to build a broad perspective on the issues that matter to you.” • “Your goal is to Discover the Unexplored. There may be topics that you haven’t explored before that may actually become new interests. Exploring new topics can promote creativity and objectivity.”
  • 90. User study • Aim: Study whether being offered an explanation of reading behavior and a particular goal would influence the user’s intended reading behavior
  • 91. User study [Flow diagram labels: Objective; Pick a persona from four data-driven profiles (Persona 1 to Persona 4); Randomly assign goal order; Explain the goals: Broaden Horizons, Discover the Unexplored; Goal A, Goal B; Show visualization; Questionnaire]
  • 92. Reading Behavior Explanation [Figure labels: Similarity, Familiarity]
  • 93. Hypotheses • 1. Goal Framework: Broaden Horizon chosen topics are more similar and more familiar than Discover the unexplored • 2. Broaden horizon: Users choose topics that have high similarity and high familiarity compared to the non-selected topics. • 3. Discover the unexplored: Users choose topics that have low similarity and low familiarity compared to the non-selected topics.
  • 95. Results • 1. Goal Framework: Broaden Horizon chosen topics are more similar and more familiar than Discover the unexplored • 2. Broaden horizon: Users choose topics that have high similarity and high familiarity compared to the non-selected topics. • 3. Discover the unexplored: Users choose topics that have low similarity and low familiarity compared to the non-selected topics.
  • 101. Results • 1. Goal Framework: Broaden Horizon chosen topics are more similar and more familiar than Discover the unexplored • 2. Broaden horizon: Users choose topics that have high similarity and high familiarity compared to the non-selected topics. • 3. Discover the unexplored: Users choose topics that have low similarity and low familiarity compared to the non-selected topics. • Partial support
  • 102. What have we done • Novel, scalable and generalizable framework of user-profile explanations • Exploration of reader data; domain-knowledge and data-driven ontology • Interface mockup • User study to evaluate the value-driven explanations • “Explainability” as a product
  • 103. Future • Further design the implementation of our framework • Formalize the epistemic goals together with editors • Extend user studies/focus groups to all value-driven goals
  • 104. In summary • Shown you where/how we personalize @ FD Mediagroep • Shown you how we evaluate and can determine usefulness of RecSys using IR metrics • Told you that these metrics are not all there is to it • Told you it is not so difficult to incorporate diversity, discount for popularity, etc. • What we’ll work on after the break