Hypertext2017-Leveraging Followee List Memberships for Inferring User Interests for Passive Users on Twitter
1. Guangyuan Piao, John G. Breslin
Unit for Social Semantics
28th ACM Conference on Hypertext and Social Media
Prague, Czech Republic, 4-7 July 2017
Leveraging Followee List Memberships for Inferring
User Interests for Passive Users on Twitter
2. 1/3 of users seek medical information
and over 50% of users consume news
on social networks
Facebook and Twitter together generate
more than 5 billion microblogs / day
[SOURCE] Semantic Filtering for Social Data, Amit et al., Internet Computing’16
3. According to research by Twopcharts,
44% of Twitter users have never sent a tweet
[SOURCE] http://guardianlv.com/2014/04/twitter-users-are-not-tweeting/
4. How can we infer user interests for passive
users based on the info. of their followees?
5. Related Work
! user modeling for active users
• analyzing users’ tweets
• representing user interests using different approaches
• bag-of-words
• topic modeling
• bag-of-concepts
[Figure: bag-of-concepts example listing interest (frequency) pairs: dbr:Eagles_of_Death_Metal (5), dbr:The_Wombats (2); related nodes shown: dbc:Hard_rock, dbp:genre]
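A bag-of-concepts profile like the one in the figure can be sketched as a count of linked entities. This is only an illustration: the hard-coded `mentions` list stands in for the output of an entity linker, which is an assumption and not something the slides specify.

```python
from collections import Counter

# Hypothetical output of an entity linker run over a user's tweets:
# one DBpedia URI per detected mention (values mirror the slide's figure).
mentions = ["dbr:Eagles_of_Death_Metal"] * 5 + ["dbr:The_Wombats"] * 2

def bag_of_concepts(entity_mentions):
    """Map each linked entity to how often it was mentioned."""
    return Counter(entity_mentions)

profile = bag_of_concepts(mentions)
```

The resulting mapping from interest to frequency is the representation that the later weighting and propagation steps operate on.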
6. Related Work
! user modeling for passive users
• analyzing information of users’ followees
• HIW(followees_tweet) [Chen et al. SIGCHI’10]
• a great amount of data, but also noisy
• SA(followees_name) [Besel et al. SAC’16, Faralli et al. SNAM’16]
• link names to entities, construct category-based user profiles
• spreading activation + WiBi-taxonomy (Wikipedia categories)
[Figure: spreading activation example with nodes dbr:Cristiano_Ronaldo (5), dbc:Real_Madrid_C.F._players, dbr:2014_FIFA_World_Cup_players, Category A, Category B, …]
7. Related Work
! user modeling for passive users
• analyzing information of users’ followees
• IP(followees_bio) [Piao et al. ECIR’17]
• exploring related categories & entities (1-hop)
• performed better than HIW(followees_tweet) and SA(followees_name)
[Figure: followee Bob Horry (@bob) with the bio “Android developer, educator”; dbr:Android_(operating_system) is linked to dbc:Smartphones and dbc:Tablet_operating_systems via dc:subject, and to dbr:Java_(programming_language) via dbp:programmedIn]
8. Different Views of Followees
! user modeling for passive users
[Figure: two views of the followee Bob Horry (@bob): biographies (self-description), e.g. “Android developer, educator”, and list memberships (others-descriptions)]
9. Aim of Work
! user modeling for passive users
• we aim to investigate
• whether we can leverage the list memberships
of followees for inferring user interest profiles,
• whether two different views of followees complement
each other to improve the quality of user profiles
10. Our Approach
! user modeling leveraging list memberships of followees
[Figure: pipeline from a Twitter user (@alice) to an interest profile]
• step 1: fetch user’s followees (Twitter API)
• step 2: fetch list memberships of followees (Twitter API)
• step 3: extract entities from followees’ list memberships (Tag.me)
• step 4: constructing primary interests (weighting scheme)
• step 5: interest propagation (DBpedia graph)
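The five steps on this slide can be sketched as one function that chains them. Every callable passed in (`fetch_followees`, `fetch_list_names`, `extract_entities`, `weigh`, `propagate`) is a hypothetical stand-in supplied by the caller, not a real Twitter, Tag.me, or DBpedia client; the toy lambdas below only show the data flow.

```python
from collections import Counter

def build_interest_profile(user, fetch_followees, fetch_list_names,
                           extract_entities, weigh, propagate,
                           max_followees=200):
    """Chain the five steps; all callables are caller-supplied stand-ins."""
    followees = fetch_followees(user)[:max_followees]       # step 1
    followee_profiles = []
    for followee in followees:
        entities = Counter()
        for list_name in fetch_list_names(followee):        # step 2
            entities.update(extract_entities(list_name))    # step 3
        followee_profiles.append(entities)
    primary = weigh(followee_profiles)                      # step 4
    return propagate(primary)                               # step 5

# Toy data in place of real API calls (purely illustrative).
lists = {"bob": ["android devs"], "carol": ["android news", "java"]}
known = {"android", "java"}
profile = build_interest_profile(
    "alice",
    fetch_followees=lambda u: ["bob", "carol"],
    fetch_list_names=lambda f: lists[f],
    extract_entities=lambda name: [w for w in name.split() if w in known],
    weigh=lambda profiles: sum(profiles, Counter()),
    propagate=lambda primary: dict(primary),
)
```

Passing the steps in as functions keeps the sketch runnable offline and makes it easy to swap in a different weighting scheme or propagation strategy.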
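11. Constructing Primary Interests
! Weighting Scheme 1 (WS1)
• profile of a followee f in Fu: entities with normalized weights
• weight of an entity with respect to the target user: aggregated over the followee profiles in Fu
[Figure: normalized followee profiles in Fu, e.g. {A (0.1), B (0.2), F (0.1), …} and {B (0.3), F (0.2), C (0.2), …}, are aggregated into {B (0.5), …, F (0.3), …}; in this example B = 0.2 + 0.3 and F = 0.1 + 0.2]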
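A minimal sketch of WS1, assuming the target user's weight for an entity is the sum of that entity's normalized weights across followee profiles. The slide's formulas are not preserved here, so both the summation and the sum-to-one normalization in `normalize` are assumptions; the summation is at least consistent with the example values on the slide.

```python
from collections import defaultdict

def normalize(counts):
    """Scale raw entity counts so that a followee's weights sum to 1 (assumed)."""
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()} if total else {}

def ws1(followee_profiles):
    """Sum each entity's normalized weight over all followee profiles."""
    weights = defaultdict(float)
    for profile in followee_profiles:
        for entity, weight in profile.items():
            weights[entity] += weight
    return dict(weights)

# Partial followee profiles from the slide's example (the "…" entries omitted).
followees = [
    {"A": 0.1, "B": 0.2, "F": 0.1},
    {"B": 0.3, "F": 0.2, "C": 0.2},
]
user_weights = ws1(followees)
```

Because each followee profile is normalized first, a followee with many list memberships does not dominate the aggregated profile.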
12. Constructing Primary Interests
! Weighting Scheme 2 (WS2)
• based on the idea of HIW (Chen et al. CHI’10)
• excluding entities extracted from only a single followee
• w(u, cj) = the number of followees who have cj in their list
memberships.
[Figure: followee profiles {A, B, F, …} and {B, F, C, …} yield {B (2), …, F (2), …}]
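WS2 as described above can be sketched as a count of followees per entity with a minimum-support filter. The function name and the `min_followees` parameter are illustrative choices, not names from the slides.

```python
from collections import Counter

def ws2(followee_entity_sets, min_followees=2):
    """w(u, c) = number of followees whose list memberships contain c.

    Entities seen for fewer than `min_followees` followees are dropped,
    which with the default of 2 excludes single-followee entities.
    """
    counts = Counter()
    for entities in followee_entity_sets:
        counts.update(set(entities))  # count each followee at most once
    return {e: n for e, n in counts.items() if n >= min_followees}

# The slide's example (the "…" entries omitted).
user_weights = ws2([{"A", "B", "F"}, {"B", "F", "C"}])
```

Here A and C each appear for only one followee and are removed, leaving B and F with weight 2.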
13. Interest Propagation
! interest propagation using DBpedia (SEMANTiCS’16)
• SP: # of subpages
• SC: # of subcategories
• P: # of occurrences of a property in the whole DBpedia graph
• intuition 1: discount common categories
• intuition 2: discount related entities connected with common
properties
[Figure: dbr:Android_(operating_system) linked to dbc:Smartphones and dbc:Tablet_operating_systems via dc:subject, and to dbr:Java_(programming_language) via dbp:programmedIn]
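The two intuitions can be illustrated with a small propagation sketch. The slide names SP, SC and P but its formulas are not preserved, so the inverse-log discounts, the `alpha` parameter, and all counts in the toy graph below are assumptions chosen only to show the idea: common categories and common properties pass on less weight.

```python
import math

def category_discount(sp, sc, alpha=2.0):
    """Assumed form: smaller for categories with more subpages/subcategories."""
    return 1.0 / (alpha * math.log(math.e + sp) * math.log(math.e + sc))

def property_discount(p, alpha=2.0):
    """Assumed form: smaller for properties that occur more often in the graph."""
    return 1.0 / (alpha * math.log(math.e + p))

def propagate(primary, categories, properties, cat_stats, prop_counts):
    """Add discounted weight to 1-hop categories and property-linked entities."""
    profile = dict(primary)
    for entity, weight in primary.items():
        for cat in categories.get(entity, []):
            sp, sc = cat_stats[cat]
            gain = weight * category_discount(sp, sc)
            profile[cat] = profile.get(cat, 0.0) + gain
        for prop, target in properties.get(entity, []):
            gain = weight * property_discount(prop_counts[prop])
            profile[target] = profile.get(target, 0.0) + gain
    return profile

# Toy graph mirroring the slide's figure; every count here is made up.
android = "dbr:Android_(operating_system)"
primary = {android: 1.0}
categories = {android: ["dbc:Smartphones", "dbc:Tablet_operating_systems"]}
properties = {android: [("dbp:programmedIn", "dbr:Java_(programming_language)")]}
cat_stats = {"dbc:Smartphones": (500, 20), "dbc:Tablet_operating_systems": (30, 2)}
prop_counts = {"dbp:programmedIn": 4000}
profile = propagate(primary, categories, properties, cat_stats, prop_counts)
```

With these made-up counts, the broader category receives less propagated weight than the narrower one, and every propagated weight stays below the weight of the primary interest.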
14. Interest Propagation
! interest propagation using DBpedia (SEMANTiCS’16)
• same as previous approach but with DBpedia refinement
• extracting sub-graph of dbc:Main_topic_classifications
• merging categories and entities with the same names
[Figure: merging example]
before: dbc:Apple_Inc. (0.25), dbr:Apple_Inc. (5), dbr:Steve_Jobs (2)
after: Apple_Inc. (5.25), Steve_Jobs (2)
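The merging step can be sketched as stripping the `dbc:`/`dbr:` prefix and adding up the weights of items that end up with the same name. The function name is illustrative; the input values are the ones from the slide's before/after example.

```python
def merge_same_names(profile):
    """Merge categories (dbc:) and entities (dbr:) that share a name."""
    merged = {}
    for uri, weight in profile.items():
        if uri.startswith(("dbc:", "dbr:")):
            name = uri.split(":", 1)[1]
        else:
            name = uri
        merged[name] = merged.get(name, 0.0) + weight
    return merged

before = {"dbc:Apple_Inc.": 0.25, "dbr:Apple_Inc.": 5, "dbr:Steve_Jobs": 2}
after = merge_same_names(before)
```

The merged profile has fewer keys than the original whenever a category and an entity share a name, which is where the reduction in profile size comes from.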
15. Experiment Setup
! main goal
• analyze & compare different user modeling strategies in the
context of link (URL) recommendations
! link (URL) profile
• same representation model as for users, based on the link’s content
! ground truth
• links shared by users in their timeline in the last two weeks
[Figure: user models (UM#1, UM#2) and candidate links (URLs) feed a recommendation algorithm (cosine similarity), which outputs top-N recommendations]
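A minimal sketch of the recommendation step, assuming user and link profiles are sparse entity-to-weight dictionaries ranked by cosine similarity. The example profiles and the `example.org` URLs are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse entity -> weight profiles."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(user_profile, link_profiles, n=10):
    """Return the top-n candidate links, most similar first."""
    scored = [(cosine(user_profile, p), url) for url, p in link_profiles.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by URL
    return [url for _, url in scored[:n]]

user = {"Android": 2.0, "Java": 1.0}
links = {
    "http://example.org/a": {"Android": 1.0},
    "http://example.org/b": {"Football": 1.0},
    "http://example.org/c": {"Android": 1.0, "Java": 1.0},
}
top2 = recommend(user, links, n=2)
```

Because the same representation is used for users and links, any user modeling strategy can be plugged in without changing the recommender.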
16. Experiment Setup
! Twitter dataset
• 439 random users
• 2,771 followees on average
• considered up to 200 followees for each user due to the
Twitter API limit for crawling list memberships
! dataset for experiment
• 439 users
• 74,488 followees in total, 170 followees on average
• 15,053 candidate links for recommendations
17. Experiment Setup
! evaluation metrics
• MRR (Mean Reciprocal Rank)
• at which rank the 1st relevant item occurs, on average, in the recommendations
• S@N (Success rate)
• mean probability that a relevant item occurs in the top-N list
• P@N (Precision)
• mean probability that retrieved items in the top-N are relevant
• R@N (Recall)
• mean probability that relevant items are retrieved in the top-N
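The four metrics can be sketched per user as follows, using their standard definitions; the slide describes them only informally, so details such as returning 0 when no relevant item is retrieved are assumptions. The reported figures would be the mean of each per-user value over all users.

```python
def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def success_at(ranked, relevant, n):
    """1.0 if at least one relevant item is in the top-n, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:n]) else 0.0

def precision_at(ranked, relevant, n):
    """Fraction of the top-n slots that hold relevant items."""
    return sum(item in relevant for item in ranked[:n]) / n

def recall_at(ranked, relevant, n):
    """Fraction of all relevant items that appear in the top-n."""
    if not relevant:
        return 0.0
    return sum(item in relevant for item in ranked[:n]) / len(relevant)

def mean(values):
    """Average a per-user metric over users."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0

ranked = ["a", "b", "c", "d"]
relevant = {"b", "d", "x"}
```

For this toy ranking, the first relevant item is at rank 2, one of the top-2 items is relevant, and one of the three relevant items is retrieved in the top-2.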
18. Info. of List Memberships
• over 90% of users have at least 1 list membership
• 173 list memberships, on average
• 3,047 entities from list memberships vs. 23 entities from
bios, considering up to 50 followees
20. Results
[Figure: chart of # of entities (0 to 50,000) by # of followees (50, 100, 150, 200), without refinement vs. with refinement]
9% compression of profile size,
while remaining at a similar performance level
21. Results – combining two views
• combining two views of followees
The final rank of an item is determined by
the average of its rank positions
based on the two user models
(Ryen et al. SIGIR’09)
score =
x : rank position based on 1st user model
y : rank position based on 2nd user model
β : importance control parameter
• combining two views improves the
performance significantly
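A hedged sketch of combining two rankings by rank position. The score formula on this slide is not preserved, so the linear form `beta * x + (1 - beta) * y` used below is an assumption; it reduces to the plain average rank when `beta = 0.5`. Which user model is the "1st" and which the "2nd" is also left open here.

```python
def combine_ranks(ranking_1, ranking_2, beta=0.5):
    """Re-rank items by a weighted sum of their two rank positions.

    Assumed score: beta * x + (1 - beta) * y, where x and y are the
    1-based rank positions under each user model; lower is better.
    Only items present in both rankings are scored.
    """
    pos_1 = {item: rank for rank, item in enumerate(ranking_1, start=1)}
    pos_2 = {item: rank for rank, item in enumerate(ranking_2, start=1)}
    shared = pos_1.keys() & pos_2.keys()
    score = {i: beta * pos_1[i] + (1 - beta) * pos_2[i] for i in shared}
    return sorted(shared, key=lambda i: (score[i], i))

r1 = ["a", "b", "c"]  # ranking from the 1st user model (toy data)
r2 = ["c", "a", "b"]  # ranking from the 2nd user model (toy data)
balanced = combine_ranks(r1, r2, beta=0.5)
second_heavy = combine_ranks(r1, r2, beta=0.1)
```

Under this assumed form, a small `beta` lets the second ranking dominate, and `beta = 1` reproduces the first ranking unchanged.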
22. Results – combining two views
• combining two views of followees
[Figure: line chart of R@10 (0.0400 to 0.0800) against beta (0 to 1) for 50, 100, 150 and 200 followees]
best performance when β = 0.1,
similar results for MRR, P@10, S@10
• list memberships play a more
important role in the combination
23. Conclusions
• leveraging list memberships of followees outperforms exploiting biographies,
especially when a user has a small number of followees
• combining the two different views of followees can improve the quality
of user modeling significantly,
• and the list memberships of followees play a more important role in the
combination
24. Thank you for your attention!
Guangyuan Piao
homepage: http://parklize.github.io
e-mail: guangyuan.piao@insight-centre.org
twitter: https://twitter.com/parklize
slideshare: http://www.slideshare.net/parklize