In this paper, we propose user modeling strategies that use
Concept Frequency - Inverse Document Frequency (CF-IDF) as a
weighting scheme and incorporate either or both of the dynamics
and semantics of user interests. To this end, we first present a
comparative study of user modeling strategies from previous
literature that consider the dynamics of user interests.
In addition, we investigate different types of information (i.e.,
categories, classes and entities connected via various properties)
for entities from DBpedia, and combinations of them, for
extending user interest profiles. Finally, we build our user
modeling strategies by incorporating either or both of the
best-performing methods in each dimension. Results show that
our strategies significantly outperform two baseline strategies
in the context of link recommendations on Twitter.
SEMANTiCS2016 - Exploring Dynamics and Semantics of User Interests for User Modeling on Twitter for Link Recommendations
1. Guangyuan Piao, John G. Breslin
Unit for Social Semantics
12th International Conference on Semantic Systems
Leipzig, Germany, 12-15 September 2016
Exploring Dynamics and Semantics of
User Interests for User Modeling on Twitter for
Link Recommendations
2.
1/3 of users seek medical information
and over 50% of users consume news
on social networks
Facebook and Twitter together generate
more than 5 billion microblogs / day
[SOURCE] Semantic Filtering for Social Data, Amit et al., Internet Computing’16
3. Background – User Modeling
content enrichment
analysis & user modeling
interest profile ?
personalized content recommendations
(How) can we infer user interest profiles that support the content recommender?
[SOURCE] Analyzing User Modeling on Twitter for Personalized News Recommendations, UMAP’11
4. Background – User Modeling
Representation of User Interest
Bag of Words: users' interests are represented as a set of words
Topic Modeling: topics are co-occurring words; a document is a mixture of topics
• focus on words
• assumption: a single doc contains rich information
• cannot provide semantic relationships among words
Bag of Concepts: users' interests are represented as a set of concepts
• can exploit background knowledge about concepts for interest propagation
6. Background – User Modeling
Weighting Scheme: importance of a concept for a user
Concept Frequency (CF)
dbpedia:The_Black_Keys (3)
dbpedia:Eagles_of_Death_Metal (5)
dbpedia:The_Wombats (2)
7. Related Work – Semantics
Semantic Interest Propagation
• different structures of DBpedia beyond category information are not fully explored
[Figure: example DBpedia graph with the nodes dbpedia:The_Black_Keys, dbpedia:The_Wombats, dbc:Rock_music_duos and dbpedia:Indie_rock, connected by the subject and genre properties]
8. Related Work – Dynamics
Temporal Dynamics of User Interests
• assumption: user interests might change over time
• no comparative evaluation over different methods
Approaches over historical user-generated content (UGC):
• long-term user profile
• short-term user profile (e.g., the last two weeks of UGC)
• interest decay function
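The idea of an interest decay function can be illustrated with a minimal sketch. The slides do not give a formula, so the exponential half-life decay below is an assumption chosen for illustration; it is not Ahmed's or Orlandi's method, and the function name, the `half_life_days` parameter and the timestamps are hypothetical.

```python
def decayed_profile(observations, now, half_life_days=30.0):
    """Build a concept profile in which older mentions count less.

    observations: iterable of (concept, day) pairs, where day is the time
    of the mention measured in days (hypothetical input format).
    Each mention contributes 0.5 ** (age / half_life_days).
    """
    profile = {}
    for concept, day in observations:
        age = now - day
        weight = 0.5 ** (age / half_life_days)  # halves every half_life_days
        profile[concept] = profile.get(concept, 0.0) + weight
    return profile


# Hypothetical mentions: two old ones for one band, one fresh one for another.
observations = [
    ("dbpedia:The_Black_Keys", 0.0),
    ("dbpedia:The_Black_Keys", 60.0),
    ("dbpedia:The_Wombats", 90.0),
]
profile = decayed_profile(observations, now=90.0, half_life_days=30.0)
```

With this choice, a single recent mention can outweigh several old ones. A long-term profile corresponds to no decay at all, and a short-term profile to a hard cutoff instead of a smooth one.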
9. Definition
Concepts
• entities, categories and classes from DBpedia which can be used for representing user interests
entity: dbpedia:The_Black_Keys
category: dbc:Rock_music_duos (via subject)
class: yago:BluesRockMusicians (via type)
10. Aim of Work
CF-IDF: Concept Frequency – Inverse Document Frequency
• Weighting Scheme: CF-IDF vs. CF
• Semantics: explore different structures of DBpedia
• Dynamics: comparative study on different methods
We propose and evaluate user modeling strategies
using the best-performing strategies in the three dimensions
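The difference between CF and CF-IDF can be sketched as follows. This assumes the usual TF-IDF analogy in which each user's set of concepts plays the role of a document, so a concept's weight is its frequency times log(number of users / number of users mentioning it); the slides do not spell out the formula, and the user names and function name are hypothetical.

```python
import math
from collections import Counter


def cf_idf_profiles(user_concepts):
    """user_concepts: dict mapping a user to a list of concept mentions.

    Returns a dict mapping each user to {concept: CF-IDF weight}.
    Assumption: IDF is log(N / n_c), with N the number of users and
    n_c the number of users who mention concept c at least once.
    """
    n_users = len(user_concepts)
    users_per_concept = Counter()
    for mentions in user_concepts.values():
        users_per_concept.update(set(mentions))  # count each user once

    profiles = {}
    for user, mentions in user_concepts.items():
        cf = Counter(mentions)  # plain Concept Frequency
        profiles[user] = {
            concept: freq * math.log(n_users / users_per_concept[concept])
            for concept, freq in cf.items()
        }
    return profiles


# Hypothetical users; the counts for the first user mirror the CF example.
mentions = {
    "alice": ["dbpedia:The_Black_Keys"] * 3
    + ["dbpedia:Eagles_of_Death_Metal"] * 5
    + ["dbpedia:The_Wombats"] * 2,
    "bob": ["dbpedia:The_Wombats"],
}
profiles = cf_idf_profiles(mentions)
```

Under this sketch, a concept that every user mentions gets weight zero regardless of its frequency, which is the intended effect of the IDF term.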
11. User Modeling Framework
semantic interest propagation
• category
• …
• category & property
temporal dynamics
• Ahmed
• …
• Orlandi
weighting scheme
entity-based user profiles
User Profiles P(u)
a concept-based profile P(u):
Google (0.09), Category:Smartphones (0.12), …, iPhone (0.08)
12. Semantic Interest Propagation
Core propagation strategies
• category-based
SP: sub-pages of the category
SC: sub-categories of the category
• class-based
SP’: sub-pages of the class
SC’: sub-classes of the class
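One way SP and SC could be used in category-based propagation is sketched below. The slides only define SP and SC, so the discount formula here (a propagated weight shrinks as the category has more sub-pages and sub-categories) is an assumption for illustration, not the formula from the talk; the SP and SC values in the example are made up.

```python
import math


def propagate_to_categories(entity_profile, entity_categories, category_stats):
    """Extend an entity-based profile with discounted category weights.

    entity_profile: {entity: weight}
    entity_categories: {entity: [category, ...]}
    category_stats: {category: (SP, SC)}, i.e. number of sub-pages and
    sub-categories of the category.
    Assumed discount: 1 / ((1 + log(1 + SP)) * (1 + log(1 + SC))),
    so broad categories receive less weight than specific ones.
    """
    extended = dict(entity_profile)  # keep the original entity weights
    for entity, weight in entity_profile.items():
        for category in entity_categories.get(entity, []):
            sp, sc = category_stats[category]
            discount = 1.0 / ((1.0 + math.log(1 + sp)) * (1.0 + math.log(1 + sc)))
            extended[category] = extended.get(category, 0.0) + weight * discount
    return extended


# Hypothetical statistics: the SP and SC numbers are illustrative only.
entity_profile = {"dbpedia:The_Black_Keys": 3.0}
entity_categories = {"dbpedia:The_Black_Keys": ["dbc:Rock_music_duos"]}
category_stats = {"dbc:Rock_music_duos": (200, 3)}
extended = propagate_to_categories(entity_profile, entity_categories, category_stats)
```

The class-based variant would follow the same shape with SP’ and SC’ in place of SP and SC.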
15. Experiment Setup
Dataset
• 322 users: shared at least one link in the last two weeks
• 247,676 tweets in total
Experiment
• task: recommending 10 links (URLs)
• recommendation algorithm: cosine similarity(P(u), P(i))
P(i): item (link) profile using the same modeling strategy as for P(u)
• ground truth links: links shared in the last two weeks
• candidate links: 15,440 links
[Figure: timeline split at the recommendation time; tweets before it are used for user modeling, links (URLs) shared after it are the ground truth]
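The recommendation step described in this setup, ranking candidate links by cosine similarity between P(u) and P(i), can be sketched as follows. Profiles are assumed to be sparse dictionaries of concept weights; the function names, concept keys and example URLs are hypothetical.

```python
import math


def cosine(p, q):
    """Cosine similarity between two sparse profiles {concept: weight}."""
    dot = sum(weight * q.get(concept, 0.0) for concept, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an empty profile is similar to nothing
    return dot / (norm_p * norm_q)


def recommend(user_profile, item_profiles, n=10):
    """Return the n candidate links most similar to the user profile."""
    ranked = sorted(
        item_profiles,
        key=lambda link: cosine(user_profile, item_profiles[link]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical user profile P(u) and candidate link profiles P(i).
user_profile = {"dbpedia:The_Black_Keys": 1.0, "dbc:Rock_music_duos": 1.0}
item_profiles = {
    "http://example.org/1": {"dbpedia:The_Black_Keys": 1.0, "dbc:Rock_music_duos": 1.0},
    "http://example.org/2": {"dbpedia:The_Black_Keys": 1.0},
    "http://example.org/3": {"dbpedia:Indie_rock": 1.0},
}
top_links = recommend(user_profile, item_profiles, n=2)
```

Because P(i) is built with the same modeling strategy as P(u), both profiles live in the same concept space and the similarity is well defined.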
17. Semantic Interest Propagation
# of concepts after propagation
• on average, 224 concepts before extension
• 1,865, 1,317 and 1,152 respectively after extension
19. A Comparative Study of Dynamics
Results
• Ahmed’s and Orlandi’s methods provide competitive performance
• in line with previous studies, using interest decay functions improves the performance
20. Our User Modeling Strategies
um(weighting scheme, temporal dynamics, propagation strategy)
extension strategy using DBpedia
• category & property
temporal dynamics
• Ahmedα
weighting scheme (CF-IDF)
entity-based user profiles
User Profiles P(u)
a concept-based profile P(u):
Google (0.09), Category:Smartphones (0.12), …, iPhone (0.08)
22. Conclusions & Future Work
• CF-IDF & combining different structures of DBpedia are beneficial
- um(CF-IDF, none, category & property) performs best
• Ahmed’s and Orlandi’s methods provide competitive performance
for capturing dynamics of user interests
• investigation of combining different dimensions of user modeling
• richer interest representation beyond concepts for users
23.
Thank you for your attention!
Guangyuan Piao
homepage: http://parklize.github.io
e-mail: guangyuan.piao@insight-centre.org
twitter: https://twitter.com/parklize
slideshare: http://www.slideshare.net/parklize
Editor's notes
- focus on words, which cannot provide semantic information and relationships among them