From Mining to Understanding: The Evolution of Social Web Users

FROM MINING TO
UNDERSTANDING:
THE EVOLUTION OF
SOCIAL WEB USERS
DR. MATTHEW ROWE
SCHOOL OF COMPUTING AND COMMUNICATIONS
@MROWEBOT | M.ROWE@LANCASTER.AC.UK
Faculty of Science and Technology Christmas Conference
Lancaster University, UK

Our interests develop ‘Offline’

Primary School → High School → University → Postgrad → Postdoc → Lecturing
(Time)

And so too do our social networks…

Offline, we develop in terms of both our interests and social networks
Primary School → High School → University → Postgrad → Postdoc → Lecturing
(Time)

This also happens ‘online’, on the ‘Social Web’…


First, Web 1.0


Then, Web 2.0…
the ‘Social Web’


Why study user evolution?

…to understand how people behave online
…to learn how people shape their identities
…to predict churners (from social networks and online communities)
…to build better recommender systems

Talk Outline

User Lifecycles, Properties & Evolution Measures
Predicting Churners
Recommending Items
Conclusions

User Lifecycles


Modelling User Evolution: Lifecycles

Offline lifecycle periods:
Primary School → High School → University → Postgrad → Postdoc → Lecturing
(Time)

Lifecycle periods of a potential Question-Answering System user, from First Action to Last Action (conjecture!):
Novice Users → Asking Questions → Asking & Answering Questions → Answering Questions

In reality we do not know the labels; however, we can split the lifetime into equal time intervals: 1, 2, 3, …, n

Yet users distribute their activity non-uniformly across those intervals.

User Properties in Lifecycle Stages

We divide the lifetime into equal activity periods (1, 2, 3, …, n), so that each period contains the same number of actions (#actions).

Within each period we:
•  Model the actions to user u by other users
•  Model the actions by user u to other users
•  Model the terms used by user u
   [Table: Term / Count. Terms shown: Web, Mining, Semantic, Statistics; counts shown: 17, 5, 4, 3]
•  Model the tastes of the user

   Item          Rating
   Alien         4*
   Bladerunner   5*
   Star Wars     4*

How can we track the evolution of a user's properties?
Solution: use measures from information theory

Evolution measure 1: Cross-Entropy
How do the properties differ between time steps?
Decrease = similarity between properties
P: user properties in period s; Q: user properties in period s-1

… by computing the cross-entropy of one probability distribution with respect to another distribution from an earlier lifecycle period, and then selecting the distribution that minimises cross-entropy. Assuming we have a probability distribution (P) formed from a given lifecycle period ([t, t']), and a probability distribution (Q) from an earlier lifecycle period, then we define the cross-entropy between the distributions as follows:

H(P, Q) = -\sum_{x} p(x) \log q(x)    (5)

In the same vein as the earlier entropy analysis, we derived the period cross-entropy for each platform's users throughout their lifecycles and then derived the mean cross-entropy for the 20 lifecycle periods. Figure 2 presents the cross-entropies derived for the different platforms and user properties. We observe that for each distribution and each …
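The cross-entropy measure above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the term counts are made up, and the add-alpha smoothing in `to_distribution` is an assumption introduced here so that q(x) is never zero inside the logarithm.

```python
import math
from collections import Counter

def to_distribution(counts, support, alpha=1.0):
    """Turn raw counts into probabilities over `support` (add-alpha smoothing is an assumption)."""
    total = sum(counts.get(x, 0) for x in support) + alpha * len(support)
    return {x: (counts.get(x, 0) + alpha) / total for x in support}

def cross_entropy(p, q):
    """H(P, Q) = -sum_x p(x) log q(x), using the natural logarithm."""
    return -sum(p[x] * math.log(q[x]) for x in p if p[x] > 0)

# Hypothetical term counts for one user in two consecutive lifecycle periods.
prev_terms = Counter({"web": 5, "mining": 4, "semantic": 1})
curr_terms = Counter({"web": 4, "mining": 4, "statistics": 2})
support = sorted(set(prev_terms) | set(curr_terms))

q = to_distribution(prev_terms, support)  # period s-1
p = to_distribution(curr_terms, support)  # period s

h_pq = cross_entropy(p, q)  # dissimilarity of period s with respect to period s-1
h_pp = cross_entropy(p, p)  # entropy of P, a lower bound on H(P, Q)
```

A smaller gap between `h_pq` and `h_pp` means the current period looks more like the earlier one, which matches the slide's reading that a decrease indicates similarity.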

Evolution measure 2: Conditional Entropy
How much information is transferred from one stage to the next?
Decrease = more information is transferred
P: user properties in period s-1; Q: user properties in period s

By using conditional entropy we can assess the information needed to describe the taste profile of a user at one time step (Q) using his taste profile from the previous period (P). A reduction in conditional entropy indicates that the user's taste profile is similar to that of his previous stage's profile, while an increase indicates the converse. We define the conditional entropy of two discrete probability distributions, representing taste profiles, as:

H(Q|P) = \sum_{x \in P, y \in Q} p(x, y) \log \frac{p(x)}{p(x, y)}    (5)

We derived the conditional entropy over the 5 lifecycle periods in a pairwise fashion, i.e. H(P2|P1), …, H(P5|P4), and plotted the curve of the mean conditional entropy in Figure 5 over each dataset's users in the training split, also including the 95% confidence intervals to show the variation in the conditional entropies. Figure 5 indicates that …
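As a rough sketch of the conditional entropy formula above, the snippet below estimates the joint and marginal probabilities from observed pairs. How the source pairs values from the two periods is not stated here, so representing the data as a list of (previous-period value, current-period value) pairs is an assumption for illustration only.

```python
import math
from collections import Counter

def conditional_entropy(pairs):
    """H(Y|X) = sum_{x,y} p(x,y) log(p(x) / p(x,y)), estimated from (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    marginal_x = Counter(x for x, _ in pairs)
    return sum(
        (c / n) * math.log((marginal_x[x] / n) / (c / n))
        for (x, _), c in joint.items()
    )

# Hypothetical pairs: (category in period s-1, category in period s).
deterministic = [("scifi", "scifi"), ("horror", "horror")] * 3
mixed = [("scifi", "scifi"), ("scifi", "horror"),
         ("horror", "scifi"), ("horror", "horror")]

h_det = conditional_entropy(deterministic)  # earlier period fully determines the later one
h_mix = conditional_entropy(mixed)          # earlier period tells us nothing about the later one
```

In the first case no extra information is needed to describe the later period; in the second the conditional entropy equals that of a fair coin.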

Evolution measure 3: Transfer Entropy
How do global dynamics influence the user's properties?
Decrease = more susceptible to global influence
Increase = less susceptible to global influence

… examine the information transfer from a prior lifecycle stage (s-1) to the current lifecycle stage (s) of the user. Now, assume that we have a random variable that describes the local categories that have been reviewed at the current stage (Y_s), a random variable of local categories at the previous stage (Y_{s-1}), and a third random variable of global categories at the previous stage (X_{s-1}); we then define the transfer entropy of one lifecycle stage to another as follows, based on the work of Schreiber [8]:

T_{X \to Y} = H(Y_s | Y_{s-1}) - H(Y_s | Y_{s-1}, X_{s-1})    (6)

H(Y_s | Y_{s-1}): surprise in user properties from s-1 to s
H(Y_s | Y_{s-1}, X_{s-1}): surprise in user properties in s when we consider all users' properties from s-1

Using the above probability distributions we can calculate the transfer entropy based on the joint and conditional probability distributions given the values of the random variables …
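The transfer entropy definition above is a difference of two conditional entropies, which the following sketch computes from plain frequency counts. The sample format, a list of (y_s, y_prev, x_prev) tuples, and the toy data are assumptions made for this illustration; the source does not describe its estimation procedure in this excerpt.

```python
import math
from collections import Counter

def cond_entropy(samples, target, given):
    """H(target | given) from tuples; `target` and `given` are tuples of column indices."""
    n = len(samples)
    joint = Counter(tuple(s[i] for i in target + given) for s in samples)
    cond = Counter(tuple(s[i] for i in given) for s in samples)
    k = len(target)
    return sum(
        (c / n) * math.log((cond[key[k:]] / n) / (c / n))
        for key, c in joint.items()
    )

def transfer_entropy(samples):
    """T_{X->Y} = H(Y_s | Y_{s-1}) - H(Y_s | Y_{s-1}, X_{s-1})."""
    return cond_entropy(samples, (0,), (1,)) - cond_entropy(samples, (0,), (1, 2))

# Toy case 1: the user's current category copies the global category of the previous stage.
follows_global = [(x, y, x) for x in "ab" for y in "ab"]
# Toy case 2: the user's current category copies their own previous category.
follows_self = [(y, y, x) for x in "ab" for y in "ab"]

t_global = transfer_entropy(follows_global)
t_self = transfer_entropy(follows_self)
```

In the first toy case knowing the global variable removes all remaining uncertainty, so the transfer entropy is log 2; in the second it adds nothing and the value is zero.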

Predicting Churners via Evolution Signals
...from Online Communities


Datasets: Online Communities

Platform       Time Span                  Post Count   User Count
Facebook       [18-08-2007,24-01-2013]    118,432      4,745
SAP            [15-12-2003,20-07-2011]    427,221      32,926
Server Fault   [01-08-2008,31-03-2011]    234,790      33,285

Table 1. Statistics of the online community platform datasets.

… and testing, using the former in this section to examine user development and the latter split for our later detection experiments.

Churner 'Cutoff'

Defining Lifecycle Periods

[Figure 2: Posts per-day for the datasets: (a) Facebook, (b) SAP, (c) Server Fault. x-axis: Time; y-axis: Posts Frequency.]

In order to examine how users develop over time we needed some … segment a user's lifetime (i.e. from the first date at which they post to the … their final post) into discrete intervals. Prior work [6, 2, 5] has demonstrated the extent to which users develop at their own pace and thus evolve according to their own 'personal clock' [5]. Hence, for deriving the lifecycle periods of users within the platforms we adopted an activity-slicing approach that divides a user's lifetime into 20 discrete time intervals, emulating the approach in [2], … with an equal proportion of activity within each period. This approach … follows: we derive the set of interval tuples ({[t_i, t_j]} ∈ T) by first deri…
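The activity-slicing idea described above, intervals that each hold an equal share of a user's actions rather than an equal span of time, might look like the sketch below. The function name `activity_slices`, the integer-index chunking, and the example timestamps are all assumptions for illustration; the source's exact derivation of the interval tuples is truncated.

```python
from typing import List, Tuple

def activity_slices(timestamps: List[float], n_periods: int = 20) -> List[Tuple[float, float]]:
    """Split a lifetime into n_periods intervals holding near-equal numbers of actions."""
    ordered = sorted(timestamps)
    if len(ordered) < n_periods:
        raise ValueError("need at least one action per period")
    intervals = []
    for k in range(n_periods):
        # Integer arithmetic spreads any remainder evenly across the chunks.
        start = k * len(ordered) // n_periods
        end = (k + 1) * len(ordered) // n_periods
        chunk = ordered[start:end]
        intervals.append((chunk[0], chunk[-1]))
    return intervals

# Hypothetical post times (days since the first post) for one user.
posts = [0, 1, 2, 3, 10, 11, 40, 41, 42, 100, 150, 151]
periods = activity_slices(posts, n_periods=4)
```

Note how the resulting intervals differ widely in duration (the last one spans 51 days, the first only 2) while each contains three posts, which is the sense in which users follow their own 'personal clock'.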

Cross-Entropy: dissimilarity with prior in-degree information
I.e. how do users who contact a given user differ from before?

[Figure: Time-period cross-entropy of in-degree distributions for churners and non-churners: (a) In-degree - Facebook, (b) In-degree - SAP, (c) In-degree - Server Fault. x-axis: Lifecycle Stages; y-axis: Time-period Cross Entropy.]

… than churners. For the cross-entropy of users' lexical term distributions we find the signals of churners and non-churners to follow a similar curvature (converging on a limit with a decaying rate) but with different magnitudes.

Cross-Entropy: dissimilarity with community out-degree information
I.e. how do the users that a user contacted differ from the community?

[Figure: Community cross-entropy of out-degree distributions: (d) Out-degree - Facebook, (e) Out-degree - SAP, (f) Out-degree - Server Fault. x-axis: Lifecycle Stages; y-axis: Community Cross Entropy.]

Predicting Churners

1. Extract Features from Users' Evolution
•  Magnitude of the signal @ period s
•  Change in the magnitude from period s to s+1

2. Build the prediction model
•  Define the objective function
•  Learn the model by minimising the objective (goal: learn w)

3. Apply the model
•  Over 'held-out' data
•  Evaluate performance: how accurate is our predictor?

… feature definition and model specification, we alter the lifecycle period notation from the existing interval tuple set ([t, t'] ∈ T) to use a set of discrete single elements: s ∈ S where S = {1, 2, …, 20}. Magnitude features are defined as a given user's measure taken at a given lifecycle period: m(u, s), where the measure for user u is taken at lifecycle period s. Rates are defined as changes in measures from one lifecycle period to the next:

m'(u, s) = \frac{dm}{ds} = \frac{m(u, s+1) - m(u, s)}{m(u, s)}

Where m is indexed by the given measure (i.e. in-degree period cross-entropy), to return the magnitude of a given measure (m) at the allotted lifecycle period. Thus a feature vector is formed for a single user using these rate and magnitude features:

x = [m_1(u, 2), …, m_1(u, 19), m_2(u, 2), …, m_2(u, 19), …, m'_1(u, 2), …, m'_1(u, 18), m'_2(u, 1), …, m'_2(u, 18)]

As a result of using both rates and magnitudes from each of the 20 lifecycle periods, aside from the first and last o…

… the standard linear model: f(x; w) = w^{\top} x. We include the L2-regulariser within the model to control for overfitting on the training splits and test different λ-indexed models. In learning the model's weight vector w, our goal is to minimise the cost function (C(w)) with respect to the weight vector:

C(w) = \frac{1}{2|D_{train}|} \sum_{i=1}^{|D_{train}|} (f(x_i; w) - y_i)^2 + \lambda \|w\|_2^2    (6)

Where the latter term (\|w\|_2^2) defines the L2-regulariser and λ defines the weight of this regulariser's effect on the model, and thus controls for overfitting on the training split:

\|w\|_2 = \left( \sum_{j=0}^{|w|} |w_j|^2 \right)^{1/2}    (7)

For learning the parameter weight vector (w) we use gra…
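The three steps above (extract magnitude and rate features, minimise an L2-regularised squared-error cost, apply the model) can be sketched as follows. Everything concrete here is hypothetical: the four toy users, the four-period lifetimes, the constant bias feature, the learning rate, and the use of plain batch gradient descent are assumptions for illustration, since the source's optimiser description is truncated.

```python
def rate_features(magnitudes):
    """m'(u, s) = (m(u, s+1) - m(u, s)) / m(u, s) for consecutive lifecycle periods."""
    return [(nxt - cur) / cur for cur, nxt in zip(magnitudes, magnitudes[1:])]

def build_vector(magnitudes):
    """Feature vector: constant bias term (an assumption), then magnitudes, then rates."""
    return [1.0] + list(magnitudes) + rate_features(magnitudes)

def predict(w, x):
    """Linear model f(x; w) = w^T x."""
    return sum(wj * xj for wj, xj in zip(w, x))

def cost(w, data, lam):
    """C(w) = 1/(2n) * sum (f(x) - y)^2 + lam * ||w||_2^2."""
    n = len(data)
    squared_error = sum((predict(w, x) - y) ** 2 for x, y in data) / (2 * n)
    return squared_error + lam * sum(wj ** 2 for wj in w)

def fit(data, lam, lr=0.1, epochs=1000):
    """Batch gradient descent on C(w), starting from zero weights."""
    n = len(data)
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        grad = [2 * lam * wj for wj in w]  # gradient of the regulariser
        for x, y in data:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n    # gradient of the squared-error term
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Hypothetical cross-entropy magnitudes over 4 periods; label 1 = churner, 0 = non-churner.
users = [
    ([0.20, 0.30, 0.45, 0.60], 1),
    ([0.25, 0.35, 0.50, 0.70], 1),
    ([0.20, 0.22, 0.23, 0.23], 0),
    ([0.30, 0.31, 0.31, 0.32], 0),
]
data = [(build_vector(m), y) for m, y in users]
lam = 0.01
w = fit(data, lam)
```

Varying `lam` over a small grid and comparing held-out performance mirrors the comparison of differently regularised models reported in the evaluation.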

Evaluation: Results

Higher = better
Min = 0, Max = 1!
J48 = state of the art baseline

Area Under the receiver operator characteristic Curve (AUC) scores for the different regularisation models and the J48 baseline model from the state of the art (denoted by J48). Best model is in bold and significance of improvement over the random model baseline is indicated.

Platform        Feature Set       λ = 0      λ = 1      λ = 2      λ = 5      λ = 10
Facebook        In-degree         0.535.     0.543.     0.538.     0.556*     0.549**
(J48 = 0.586)   Out-degree        0.674***   0.666***   0.676***   0.696***   0.690***
                Lexical           0.633***   0.630***   0.639***   0.637***   0.641***
                Cross-period      0.649***   0.642***   0.649***   0.652***   0.651***
                Cross-community   0.684***   0.693***   0.691***   0.699***   0.701***
                All               0.811***   0.804***   0.816***   0.817***   0.819***
SAP             In-degree         0.652***   0.651***   0.651***   0.652***   0.654***
(J48 = 0.759)   Out-degree        0.741***   0.742***   0.742***   0.742***   0.743***
                Lexical           0.501      0.501      0.501      0.499      0.497
                Cross-period      0.614***   0.614***   0.614***   0.613***   0.612***
                Cross-community   0.765***   0.765***   0.765***   0.765***   0.765***
                All               0.816***   0.817***   0.817***   0.817***   0.818***
Server Fault    In-degree         0.659***   0.658***   0.662***   0.663***   0.663***
(J48 = 0.796)   Out-degree        0.618***   0.617***   0.616***   0.619***   0.626***
                Lexical           0.680***   0.682***   0.687***   0.686***   0.684***
                Cross-period      0.671***   0.675***   0.680***   0.691***   0.689***
                Cross-community   0.778***   0.779***   0.780***   0.778***   0.779***
                All               0.858***   0.860***   0.861***   0.861***   0.860***

Significance codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 . 1
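For readers unfamiliar with the AUC scores reported above, the measure can be computed directly as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The sketch below uses that pairwise definition; the scores and labels are invented for illustration and are unrelated to the reported results.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model scores; label 1 = churner, 0 = non-churner.
perfect = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
partial = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
chance = auc([0.5, 0.5, 0.5, 0.5], [1, 1, 0, 0])
```

A score of 0.5 corresponds to random ranking, which is why the SAP lexical features, at roughly 0.50, carry essentially no signal.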

… to late lifecycle periods. Across all three platforms we find that performance improves as additional information is added into the models. There are differences, … in the gradient in performance between the plat…

Our churn prediction approach makes use of the development signals that users exhibit along both social an… dimensions in order to differentiate between who will churn and who will remain within the online community p…

By mining users’ evolution signals we can accurately predict
who will churn, and who will not…
…this enables the early application of retention strategies


Recommending Items from Taste Evolution


Recommender Systems aim to either:
(i)  Predict item adoptions
(ii)  Predict item ratings

Recommendation Datasets: Item-Ratings

Each rating is a quadruple (u, i, r, t): user u… rated item i… with score r… at time t.

Dataset                     #Users    #Items    #Ratings    Time Span                  Ratings Scale
MovieLens                   6,024     3,678     902,585     [26-04-2000,31-12-2000]    [1,5]
Movie Tweetings             19,043    11,451    117,206     [28-02-2013,23-09-2013]    [1,10]
Amazon Movie & TV Reviews   889,173   253,059   7,880,387   [20-08-1997,25-10-2012]    [0,5]
Total                       914,240   268,188   8,900,178

Table 1: Statistics of the review datasets used for our analysis and experiments.

Average rating per dataset: MovieLens µ=3.7, Movie Tweetings µ=7.7, Amazon µ=4.1.

4.2 Amazon Movies
For the Amazon Movie and TV Reviews dataset we were provided with Amazon Standard Identification Numbers (ASINs) as identifiers of items. We looked up the ASINs for each item in the dataset by querying the Amazon Product Advertising API and returning the item information including: title, actors, and directors. Unlike MovieLens and Movie Tweetings, we were not provided with the release year information from the API, therefore to perform the disambiguation of semantic URIs we used the actor information from each movie: our intuition being that each movie would have a unique set of actors starring in it. Therefore we stored the actors associated with each item as additional background information and performed disambiguation in a similar vein as above: we first identified candidate URIs for a given movie item by performing fuzzy matches between the item title and semantic URIs' titles. We then derived the correct URI by comparing the set of actors associated with the item (A_a) and the set of actors associated with each candidate URI …

… the statistics of these datasets shown in Table 2 demonstrate the extent to which the items, users and ratings have been reduced. Figure 2 presents the distribution of reviews per user within each of the reduced datasets; we note that the collection strategy of MovieLens (concentrating on users who have reviewed more than 20 items) skews the distribution towards users who have produced many reviews, while for Movie Tweetings and the Amazon datasets we see heavy-tailed distributions. Table 2 also indicates that there is a large reduction in the number of both users and items overall, however the reduction in the number of ratings is not as great. This suggests two things: (i) mapped items are popular, and thus dominate the ratings; and (ii) obscure items are present within the data. In particular for the Amazon dataset, despite our alignment covering only 10.6% of items we only have a reduction of …% of total ratings, suggesting that we cover the 'head' of the distribution in terms of popularity.

[Figure 2: Distribution of reviews per user within each of the reduced datasets: (a) Lens, (b) Tweetings, (c) Amazon. Log-scaled axes; y-axis: p(x). Panel annotations: µ=5.8, µ=139.7, µ=12.5.]

We first define two sets, the former (D_{train}^{u,s,c}) corresponding to the ratings by u during interval s for items from category c, and the latter (D_{train}^{u,s}) corresponding to ratings by u during s, hence D_{train}^{u,s,c} \subseteq D_{train}^{u,s}. These sets are formed as follows:

D_{train}^{u,s,c} = \{(u, i, r, t) : (u, i, r, t) \in D_{train}, t \in s, c \in cats(i)\}    (1)

D_{train}^{u,s} = \{(u, i, r, t) : (u, i, r, t) \in D_{train}, t \in s\}    (2)

We then define the function ave_rating to derive the average rating value from all of the rating quadruples in the given set:

ave\_rating(D_{train}^{u,s}) = \frac{1}{|D_{train}^{u,s}|} \sum_{(u,i,r,t) \in D_{train}^{u,s}} r    (3)
[Figure 3: Average ratings of the top-2 most frequently rated categories, derived using a 7-day moving average: (a) Lens, (b) Tweetings, (c) Amazon. x-axis: Time; y-axis: Average Rating. Legend entries include Independent Films, Directorial Debut Films, 1990s Comedy Films, American Films and Black and White Films.]

Forming Taste Profiles

… information returned. For instance, for the movie 'Alien' released in 1979, which we shall now use as a running example, the following categories are found:

    <http://dbpedia.org/resource/Alien_(film)>
        dcterms:subject category:Alien_(franchise)_films ;
        dcterms:subject category:1979_horror_films ;
        dcterms:subject category:Space_adventure_films ;
        dcterms:subject category:Films_set_in_the_future .

Subject categories form a hierarchical structure such that parent categories define more general subjects. For instance the category category:Films_set_in_the_future is linked to category:Science_fiction_films_by_genre by the predicate skos:broader, thus providing a general taxonomic classification of the film. The advantage of such a structure is that we can explicitly identify a given user's tastes at a given point in time via the categories of films that they have consumed, and thus rated. In order to provide such information, however, we require a link between a given item within one of our datasets and the semantic URI that denotes that movie item. However, in deriving semantic web URIs for films we may encounter ambiguity issues where multiple films share the same title; this often happens with film remakes. Therefore we use available information from each of our datasets to disambiguate the semantic URIs and thus return the correct alignment. In this section we describe this disambiguation procedure across the three datasets using two methods: one based on title and year of the movie …

Item          Rating          Category           Rating
Alien         4*              Space_adventure    (4+4)/2 = 4
Bladerunner   5*              Science Fiction    (4+5+4)/3 = 4.3
Star Wars     4*

I.e. using the mean user score per category for a given user and lifecycle period.

From these definitions we then derive the discrete probability distribution of the user's ratings per category. Probability of user u rating category c in lifecycle period s:

Pr(c | D_{train}^{u,s}) = \frac{ave\_rating(D_{train}^{u,s,c})}{\sum_{c' \in C_{train}^{u,s}} ave\_rating(D_{train}^{u,s,c'})}    (4)

Based on this formalisation we can assess the relative …

5. ANALYSING TASTE EVOLUTION
Analysing the evolution and development of users' tastes allows one to understand how a given … is likely to rate …
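The taste-profile construction above can be worked through with the running example. In the sketch below, the mapping of items to categories is inferred from the slide's arithmetic ((4+4)/2 for Space_adventure, (4+5+4)/3 for Science Fiction) and is therefore an assumption, as is the function name `taste_profile`.

```python
from collections import defaultdict

def taste_profile(ratings, cats):
    """Return per-category average ratings and Pr(c) = ave_rating(c) / sum_c' ave_rating(c')."""
    by_cat = defaultdict(list)
    for item, r in ratings.items():
        for c in cats[item]:
            by_cat[c].append(r)
    averages = {c: sum(rs) / len(rs) for c, rs in by_cat.items()}
    total = sum(averages.values())
    profile = {c: a / total for c, a in averages.items()}
    return averages, profile

# Ratings by one user within one lifecycle period (from the slide's example).
ratings = {"Alien": 4, "Bladerunner": 5, "Star Wars": 4}
# Assumed item-to-category mapping, inferred from the slide's averages.
cats = {
    "Alien": ["Space_adventure", "Science_Fiction"],
    "Bladerunner": ["Science_Fiction"],
    "Star Wars": ["Space_adventure", "Science_Fiction"],
}

averages, profile = taste_profile(ratings, cats)
```

The resulting `profile` is a proper probability distribution over categories, which is what allows the entropy-based measures to be applied to it across lifecycle periods.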

Conditional-Entropy: relative information difference
I.e. how dissimilar are the user's ratings in period s from period s-1?

… rate items in the future given their category information. Conversely, for MovieLens and Movie Tweetings we see an opposite effect: users' taste profiles become less predictable as they develop; users rate items in a way that renders uncertainty in profiling from previous information.

[Figure 5: Parent category conditional entropy between consecutive lifecycle stages (e.g. H(P2|P3)) across the datasets, together with the bounds of the …: (a) Lens, (b) Tweetings, (c) Amazon. x-axis: Lifecycle Stages; y-axis: Conditional Entropy.]

Transfer-Entropy: influence of global behaviour on the user
I.e. how does collective user behaviour influence the user's tastes?

… Movie Tweetings and Amazon we find a different effect: users' transfer entropy actually increases over time, indicating that users are less influenced by global taste preferences, and therefore the ratings of other users, and instead concentrate on their own tastes.

[Figure 6: Parent category transfer entropy between consecutive lifecycle stages (e.g. H(P2|P3)) across the datasets, together with the bounds of the 95% con…: (a) Lens, (b) Tweetings, (c) Amazon. x-axis: Lifecycle Stages; y-axis: Transfer Entropy.]

Including Taste Evolution in a Recommender System (Current Work!)

Personalisation component: f latent factors
•  Predict rating for user u for item i:

\hat{r}_{ui} = b_{ui} + p_u^{\top} q_i    (8)

Bias component
•  Modify bias component to include taste evolution signal:

b_{ui} = \underbrace{\mu + b_i + b_u}_{\text{Static}} + \underbrace{b_{i,cats(i)} + b_{u,cats(i)}}_{\text{Evolving}}    (9)

   b_{i,cats(i)}: how global tastes for the categories of item i have evolved
   b_{u,cats(i)}: how the tastes of user u have evolved for categories of item i
•  Interpolate categories into the personalisation component: too much maths to be shown here!

6.1 Recommendation Model Formulation
We use the following model for our recommender system based upon matrix factorisation:

\hat{r}_{ui} = b_{ui} + q_i^{\top} \left( p_u + |R(u)|^{-1/2} \sum_{j \in R(u)} y_j \right)    (19)

Above, we have three latent factor vectors: q_i ∈ R^f denotes the f latent factors associated with the item i; p_u ∈ R^f denotes the f latent factors associated with the user u; and y_j ∈ R^f denotes the f-dimension latent factor vector for item j from the set of rated items by user u: R(u). The latent factors are derived during learning, as we shall explain …, while the number of factors to capture (f) is set a priori; this is often set to 50 across the literature. The factors … unifying attributes across the items, for instance Romantic Comedies or Action Films within the movies domain. We … Equation 19 to incorporate latent factors for each se… category that a user has rated an item from. Our in… behind this inclusion is that certain categories have a …

6.2 Biases
6.2.1 Static Biases
The bias component of our model contains static biases induced from the training segment. These include the general bias of the given dataset (µ), which is shown in Figure 7 as the mean rating score across all ratings within the training segment. The use of the mean on its own is insufficient (note the variance in ratings scores for the … dataset), therefore we also include the item bias (b_i) and the user bias (b_u). The former bias is the average deviation from the mean bias for the item i within the training segment, while the latter bias is the average deviation from the mean bias from the training segment's ratings by user u.
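To make the rating prediction concrete, the sketch below evaluates a prediction of the form r̂_ui = b_ui + q_i^T (p_u + |R(u)|^(-1/2) Σ y_j) with the bias b_ui assembled from static and evolving parts. All numeric values, the two-factor dimensionality, and the function name `predict_rating` are hypothetical; in practice the factors and biases would be learned from training data.

```python
import math

def predict_rating(b_ui, q_i, p_u, y_rated):
    """r_hat = b_ui + q_i^T (p_u + |R(u)|^(-1/2) * sum_j y_j)."""
    f = len(q_i)
    norm = 1.0 / math.sqrt(len(y_rated)) if y_rated else 0.0
    # Normalised sum of the factor vectors of the items the user has already rated.
    implicit = [norm * sum(y[k] for y in y_rated) for k in range(f)]
    return b_ui + sum(q_i[k] * (p_u[k] + implicit[k]) for k in range(f))

# Hypothetical bias terms: static (mu, b_i, b_u) plus evolving category biases.
mu, b_i, b_u = 3.7, 0.3, -0.1
b_i_cats, b_u_cats = 0.05, 0.1
b_ui = mu + b_i + b_u + b_i_cats + b_u_cats

# Hypothetical latent factors with f = 2.
q_i = [0.5, -0.2]
p_u = [0.4, 0.1]
y_rated = [[0.1, 0.0], [0.1, 0.2], [0.0, 0.1], [0.2, 0.1]]  # one vector per item in R(u)

r_hat = predict_rating(b_ui, q_i, p_u, y_rated)
```

With these made-up values the bias contributes 4.05 and the personalisation term 0.24, so most of the predicted score comes from the bias component, which is where the taste-evolution signals are injected.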

By modelling taste evolution we can capture…

(i)  the influence of global dynamics on the user
(ii)  how the user's preferences for categories change
(iii)  how global tastes are evolving

… and then selecting the top-2 most frequent. In Figure 3 we plotted the development of the average rating score across these two categories, derived using a 7-day moving average to smooth the variance in rating fluctuations. We find that there are peaks and troughs in the reviewing of the items that belong to the categories; in particular one can note that for MovieLens the scores remain relatively stable, while for Movie Tweetings 'Independent Films' reduce in their average rating and 'Directorial Debut films' increase in their average rating over time. Such information can be encoded within the biases of the recommendation models and consider the stability of a given bias in light of when the rating is being made: i.e. considering the fluctuation of the rating signal and how this relates to previous fluctuations.

[Figure 3: Average ratings derived using a 7-day moving average: (a) Lens, (b) Tweetings, (c) Amazon. x-axis: Time; y-axis: Average Rating. Legend entries include 1990s Comedy Films, Independent Films, American Films and Black and White Films.]

Conclusions


p(y|y )

salient di↵erentiating feature.
y2Ys ,

2.5

Stat
m(u, s), where the mea
z
}|
period s. Rates= µ + bi
are defi
bui
lifecycle period to the n

6.2.1

Static Biases

●

●

●

0.4

●

●
●●

●

●

●

●

●

●

●

●●●

3.0

●

●

●●●

●

●

●

●●

●●

●

●

1.5

m

●

●

●●

●

1.0

●

●

●

●

●●

●

●

●●●

●

●

●

●

0.5

0.6

●●

4.0

●

●

●

3.5

0.8

dm
User evolution can be captured using lifecycle models component

[Figure residue: panels (a) Lens, (b) Tweetings, (c) Amazon; axis labels: Transfer Entropy, Lifecycle Stages]
[Figure residue: panels (g) Lexical - Facebook, (h) Lexical - SAP, (i) Lexical - Server Fault; x-axis: Lifecycle Stages]

Users’ tastes are susceptible to global taste influence
Churners and non-churners exhibit divergent signals

[Figure residue: churners vs. non-churners across lifecycle stages; panels (a) In-degree - Facebook, (b) In-degree - SAP, (c) In-degree - Server Fault, (d) Out-degree - Facebook, (e) Out-degree - SAP, (f) Out-degree - Server Fault]

We derived the transfer entropy between consecutive lifecycle periods, as with the conditional entropy above, to examine how the influence of global and local dynamics on users’ taste profiles developed over time. Figure 6 plots the means of these values across the lifecycle periods together with the 95% confidence intervals. We find that the transfer entropy of MovieLens users decreases over time, indicating that global dynamics have a stronger influence on users’ taste profiles towards later lifecycle stages. Such an effect is characteristic of users becoming more involved and familiar with the review system, and as a consequence paying attention to more information from other users. With Movie Tweetings and Amazon we find a different effect: users’ transfer entropy actually increases over time, indicating that users are less influenced by global taste preferences, and therefore the ratings of other users, and instead concentrate on their own tastes.
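To make the transfer entropy between consecutive lifecycle periods more concrete, here is a minimal sketch of a history-length-1 estimate for two discrete symbol sequences. It uses the standard plug-in (frequency count) definition, which is an assumption: the exact estimator, variables and sign convention used in the talk may differ, so the direction of interpretation given in the text should not be read off this code. The name `transfer_entropy` and the toy sequences are illustrative only.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in transfer entropy (bits) from `source` to `target`, history 1.

    T = sum over (y_next, y_prev, x_prev) of
        p(y_next, y_prev, x_prev)
        * log2( p(y_next | y_prev, x_prev) / p(y_next | y_prev) )
    """
    if len(source) != len(target) or len(target) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    n = len(target) - 1  # number of observed transitions
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    prev_pairs = Counter(zip(target[:-1], source[:-1]))  # (y_prev, x_prev)
    self_pairs = Counter(zip(target[1:], target[:-1]))   # (y_next, y_prev)
    prev_y = Counter(target[:-1])                        # y_prev
    total = 0.0
    for (y_next, y_prev, x_prev), count in triples.items():
        p_joint = count / n
        p_given_both = count / prev_pairs[(y_prev, x_prev)]
        p_given_self = self_pairs[(y_next, y_prev)] / prev_y[y_prev]
        total += p_joint * log2(p_given_both / p_given_self)
    return total


# Hypothetical toy sequences: `follower` copies `leader` with a lag of one step.
leader = [0, 0, 1, 1, 0, 0, 1, 1, 0]
follower = [0] + leader[:-1]
te_leader_to_follower = transfer_entropy(leader, follower)
te_to_constant = transfer_entropy(leader, [1] * len(leader))
```

Under this definition a target that simply repeats itself receives zero transfer entropy from any source, while the lagged copy receives a positive value, because the source's previous symbol removes uncertainty that the target's own history leaves open.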

33

Questions?
@mrowebot
m.rowe@lancaster.ac.uk
http://www.lancaster.ac.uk/staff/rowem/

Changing with Time: Modelling and Detecting User Lifecycle Periods in Online Community Platforms. M Rowe. International Conference on Social Informatics. Kyoto, Japan (2013)
Mining User Lifecycles from Online Community Platforms and their Application to Churn Prediction. M Rowe. International Conference on Data Mining. Dallas, US (2013)
