Paper the plista dataset
1. The plista Dataset
ACM RecSys 2013, Hong Kong
Authors:
Kille, Benjamin
and Hopfgartner, Frank
and Brodt, Torben
and Heintz, Tobias
Speaker:
Brodt, Torben
International News Recommender
Systems Workshop and Challenge
October 13th, 2013
3. Introduction and Motivation
● Do we need another recommendation data set?
we have
...
● What features are those data sets missing?
● What requirements do news articles impose on
recommendation?
4. Introduction and Motivation
● Features that had not been available in
existing data sets:
○ contextual features: device, operating system,
browser, etc.
○ cross-domain features: 13 different news providers
included
○ different interaction types: interactions with
recommendations (clicks), as well as news items
(impressions)
○ content features: headline, URL, images, text
snippets, etc.
5. Introduction and Motivation
● Additional requirements for recommending news articles
○ real-time → recommendations must be provided within a
short time interval (< 200 ms)
○ changing relevancy → items’ relevancy decreases with
time
○ dynamics → new news items are being continuously
added
● Requirements inherent to existing recommender systems:
○ sparsity → users typically read only a few news articles
○ cold start → systems refrain from asking users to
create profiles, so most user profiles are small
9. Dataset usage
● Evaluation based on
Click-Through-Rate
(CTR)
● ~ 84 million
impressions
● ~ 1 million clicks
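As a rough illustration of the metric on this slide, CTR is the number of clicks divided by the number of impressions. The sketch below plugs in the rounded totals quoted above; the function name `click_through_rate` is hypothetical and not part of any plista tooling.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return clicks / impressions, or 0.0 when there are no impressions."""
    if impressions <= 0:
        return 0.0
    return clicks / impressions


# Rounded totals taken from the slide (approximations, not exact counts).
approx_impressions = 84_000_000
approx_clicks = 1_000_000

ctr = click_through_rate(approx_clicks, approx_impressions)
print(f"CTR ~ {ctr:.2%}")  # prints "CTR ~ 1.19%"
```

With these rounded figures the overall CTR comes out at roughly 1.2 %, which is only as precise as the approximate totals on the slide.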
10. Dataset usage
● evaluating cross-portal news
recommenders
● 10-36 % user overlap
between different news
portals
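The slide does not say how user overlap is defined. One plausible reading, used here purely as an assumption, is the share of one portal's users who also appear on another portal. The sketch below computes that share from hypothetical `(portal, user_id)` pairs; the portal names, user IDs, and function names are made up for illustration.

```python
from collections import defaultdict
from itertools import permutations


def users_by_portal(events):
    """Group user IDs into one set per portal."""
    portal_users = defaultdict(set)
    for portal, user_id in events:
        portal_users[portal].add(user_id)
    return portal_users


def overlap_share(users_a, users_b):
    """Share of portal A's users that were also seen on portal B (assumed definition)."""
    if not users_a:
        return 0.0
    return len(users_a & users_b) / len(users_a)


# Hypothetical toy events, not taken from the data set.
events = [
    ("portal_a", "u1"), ("portal_a", "u2"), ("portal_a", "u3"), ("portal_a", "u4"),
    ("portal_b", "u3"), ("portal_b", "u4"), ("portal_b", "u5"),
]

portals = users_by_portal(events)
for a, b in permutations(sorted(portals), 2):
    print(f"{a} -> {b}: {overlap_share(portals[a], portals[b]):.0%}")
```

Note that this measure is asymmetric: the overlap from a large portal to a small one can differ from the reverse direction. Because users are identified by session IDs, any such figure is an estimate.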
11. Dataset usage
● news portal comparisons
● do we observe similar user
behaviour on news portals
offering similar content?
12. Dataset usage
● evaluating contextual
recommendation algorithms
● sensitive to
○ weekday
○ hour of day
○ ...
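To make the context sensitivity concrete, one simple check is to compute CTR separately per (weekday, hour) bucket. The sketch below does this for hypothetical `(timestamp, kind)` events, where `kind` is either `"impression"` or `"click"`; this event layout is an assumption and not the data set's actual schema.

```python
from collections import Counter
from datetime import datetime


def ctr_by_context(events):
    """Return CTR per (weekday, hour) bucket; weekday 0 is Monday, 6 is Sunday."""
    impressions = Counter()
    clicks = Counter()
    for timestamp, kind in events:
        key = (timestamp.weekday(), timestamp.hour)
        if kind == "impression":
            impressions[key] += 1
        elif kind == "click":
            clicks[key] += 1
    # Only buckets with at least one impression get a rate.
    return {key: clicks[key] / count for key, count in impressions.items()}


# Hypothetical toy events, not taken from the data set.
sunday_morning = datetime(2013, 10, 13, 9, 5)
monday_evening = datetime(2013, 10, 14, 18, 30)
events = (
    [(sunday_morning, "impression")] * 4
    + [(sunday_morning, "click")]
    + [(monday_evening, "impression")] * 2
)

rates = ctr_by_context(events)
for (weekday, hour), rate in sorted(rates.items()):
    print(f"weekday={weekday} hour={hour:02d} CTR={rate:.2f}")
```

Comparing these per-bucket rates against the overall CTR gives a first impression of whether a contextual recommender has anything to exploit.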
13. Dataset usage
When using the data set you may consider…
● … we identify users by session IDs
○ individual users may have several IDs
○ users sharing their device might be mapped to one ID
● … interactions (clicks, impressions) and content
dynamics (creates, updates) differ between news
portals
● … content is restricted to German
● … preferences are represented on a binary scale (user
read article, user clicked recommendation)
● … clicking on recommendations might not reveal the
actual relevancy of an item
14. Conclusions
● we introduced a new data set intended to
support recommender systems research
● we outlined novel features which existing
data sets lacked
● we presented scenarios which can be
evaluated using the data set
● we pointed to critical aspects which ought
to be considered when working with the data
set
15. Summary
● news articles
○ from ~13 publishers
● transactional data
○ Impressions
○ Clicks
● contextual data
○ of ~50 attributes
● cross domain application
16. The plista Dataset
@inproceedings{Kille:2013,
title = {The plista Dataset},
author = {
Kille, Benjamin
and Hopfgartner, Frank
and Brodt, Torben
and Heintz, Tobias
},
booktitle = {
NRS'13: Proceedings of
the International Workshop and
Challenge on News Recommender Systems
},
year = {2013},
month = {10},
location = {Hong Kong, China},
publisher = {ACM},
pages={14--21}
}