Importance of Individual Events in Temporal Networks
1. Importance of individual events in temporal networks
Taro Takaguchi¹, Nobuo Sato², Kazuo Yano², and Naoki Masuda¹
¹ Department of Mathematical Informatics, The University of Tokyo
² Central Research Laboratory, Hitachi, Ltd.
2. Interests: patterns in human communication behavior
[Figure: photos from flickr by garryknight, opacity, and infomatique; twitter.com/#!/duncanjwatts]
3. More extensive data, more detailed analysis
• Huge populations (~millions)
• High temporal resolution (~minute)
• Additional information (e.g., locations, history of purchases)
Examples:
• Cell-phone calling network (Onnela et al., NJP 2007)
• Business Microscope system (Hitachi, Ltd., Japan): name tag with an infrared module
http://www.hitachi-hitec.com/jyouhou/business-microscope/
4. Temporal networks
Reviewed by Holme and Saramäki, Phys. Rep. 2012
Represented by sequences of events with time stamps
[Figure: event sequence among nodes 1 to 4 along the time axis, next to the static (aggregated) network. A temporal path exists from node 1 to node 4 (✓) but not from node 4 to node 1 (-).]
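As a minimal sketch of the temporal-path idea on this slide, the following Python function finds which nodes can be reached from a source through events taken in chronological order. The event list and the name `reachable_from` are illustrative assumptions, not taken from the slides; events are treated as undirected contacts, and information is assumed to move at most one hop per time stamp.

```python
def reachable_from(events, source):
    """Return {node: earliest arrival time} for nodes reachable from `source`.

    `events` is an iterable of (time, u, v) undirected contacts.
    Events sharing a time stamp form one snapshot, and information moves
    at most one hop per snapshot (an assumption of this sketch).
    """
    arrival = {source: float("-inf")}
    by_time = {}
    for t, u, v in events:
        by_time.setdefault(t, []).append((u, v))
    for t in sorted(by_time):
        informed_before = set(arrival)  # freeze the set: one hop per snapshot
        for u, v in by_time[t]:
            if u in informed_before and v not in arrival:
                arrival[v] = t
            if v in informed_before and u not in arrival:
                arrival[u] = t
    return arrival

# Hypothetical event sequence: 1-2 at t=1, 2-3 at t=2, 3-4 at t=3.
events = [(1, 1, 2), (2, 2, 3), (3, 3, 4)]
forward = reachable_from(events, 1)   # node 1 -> node 4 exists
backward = reachable_from(events, 4)  # node 4 -> node 1 does not
print(4 in forward, 1 in backward)
```

In this hypothetical sequence the static network is a connected chain, yet node 4 cannot reach node 1 because the events occur in the wrong order.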
5. Impact of interevent intervals
Different temporal paths from node 2 to node 3
may have different impacts on
epidemics, information propagation, etc.
[Figure: timelines of nodes 1, 2, and 3 showing different temporal paths from node 2 to node 3]
6. Question: which events are important?
Evaluate the importance of each event
• time-dependent centrality of links
[Figure: three snapshots of the network among nodes 1 to 4 along the time axis]
7. Importance of events
Defined by the amount of new information about others
Note:
“information” ≠ contents of conversation
[Figure: timelines of nodes 1 to 4]
8. Importance of events
Before the event:
[Figure: timelines of nodes 1 to 4, marking the latest information held before the event]
10. Importance of events
After the event:
[Figure: timelines of nodes 1 to 4, marking the latest information held after the event]
12. Concept (1): vector clock and latency
Lamport, Commun. ACM 1978; Mattern, 1988
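Vector clock of node $i$: $\boldsymbol{\phi}_i^t = (\phi_{i,1}^t, \ldots, \phi_{i,N}^t)$
At time $t$, $i$ has the latest information about $j$ at time $\phi_{i,j}^t$.
Latency: $t - \phi_{i,j}^t$
[Figure: example of a vector clock along the time axis]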
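The vector clock bookkeeping can be sketched in a few lines of Python. This is a simplified illustration, assuming pairwise events processed one at a time and clocks that start at minus infinity (nothing known); the function names `update_clocks` and `latency` are hypothetical.

```python
import math

def update_clocks(phi, i, j, t):
    """Merge the vector clocks of i and j at an event at time t (in place)."""
    phi[i][i] = t  # each node always knows its own current state
    phi[j][j] = t
    merged = [max(a, b) for a, b in zip(phi[i], phi[j])]
    phi[i] = merged[:]
    phi[j] = merged[:]

def latency(phi, i, k, t):
    """Age of i's latest information about k at time t."""
    return t - phi[i][k]

n = 3
phi = [[-math.inf] * n for _ in range(n)]
update_clocks(phi, 0, 1, t=1)  # nodes 0 and 1 meet at t = 1
update_clocks(phi, 1, 2, t=3)  # nodes 1 and 2 meet at t = 3
lat_2_about_0 = latency(phi, 2, 0, t=3)  # node 2 knows node 0 as of t = 1
print(lat_2_about_0)
```

Node 2 never met node 0 directly; its information about node 0 was relayed by node 1, which is why the latency is measured from the time of the first event.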
13. Concept (2): advance of event
Kossinets et al., Proc. 14th ACM SIGKDD 2008
Advance for $i$ owing to an event between $i$ and $j$
[Figure: vector clocks before ⇒ after the event, along the time axis]
14. Calculation of importance
Assumptions:
• Individuals can be involved in multiple events in a single snapshot.
• Information can spread up to $h$ hops within a snapshot.
($h$ is called the “horizon” in Tang et al., Proc. 2nd ACM SIGCOMM WOSN 2009)
Read the given event sequence in chronological order. For each snapshot at time $t$:
1. Update every node's information about the other nodes.
2. Calculate the advance in both directions ($i \to j$ and $j \to i$) for all the events at $t$.
3. Importance = symmetrized advance
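The loop over the event sequence might look like the sketch below. It is not the authors' implementation: it assumes a one-hop horizon with each event handled on its own, clocks that start at 0 for every pair, an advance equal to the total gain in a node's clock entries about third parties, and symmetrization by the mean of the two advances. The function name `event_importance` is hypothetical.

```python
def event_importance(events, n):
    """Return a list of (t, i, j, importance) for (t, i, j) events.

    Assumptions of this sketch (not from the slides): clocks start at 0,
    the advance for i is the summed gain in i's clock entries about nodes
    other than i and j, and importance is the mean of the two advances.
    """
    phi = [[0] * n for _ in range(n)]
    out = []
    for t, i, j in sorted(events):  # chronological order
        others = [k for k in range(n) if k not in (i, j)]
        adv_i = sum(max(phi[j][k] - phi[i][k], 0) for k in others)
        adv_j = sum(max(phi[i][k] - phi[j][k], 0) for k in others)
        out.append((t, i, j, (adv_i + adv_j) / 2))
        # After the event both nodes share the element-wise latest clock.
        merged = [max(a, b) for a, b in zip(phi[i], phi[j])]
        merged[i] = merged[j] = t
        phi[i], phi[j] = merged[:], merged[:]
    return out

events = [(1, 0, 1), (2, 1, 2), (3, 0, 1)]
importance = event_importance(events, n=3)
for row in importance:
    print(row)
```

In this toy sequence the second event between nodes 0 and 1 scores higher than the first, because by then node 1 carries fresher information about node 2.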
16. Calculation of advance (1)
Source node (for node $i$, with respect to information about node $k$):
an $h$-neighbor of $i$ that has the latest information about $k$ and is at the shortest distance from $i$.
[Figure: snapshot at time $t$ with the source node marked]
17. Calculation of advance (2)
Contributing neighbors:
$i$'s neighbors that are on a shortest path from a nearest source node (about $k$) to $i$.
[Figure: snapshot at time $t$ marking the source node, the contributing neighbors, and the amount each contributes]
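The definitions of source nodes and contributing neighbors can be sketched with a breadth-first search on one snapshot. The adjacency dictionary, the clock values, and the function names below are hypothetical; the sketch also assumes that a node counts as a source only if its information about $k$ is strictly newer than what $i$ already has.

```python
from collections import deque

def bfs_distances(adj, start, h):
    """Hop distances from `start` to all nodes within h hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == h:
            continue  # do not expand beyond the horizon
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def sources_and_contributors(adj, clock_about_k, i, h):
    """Nearest source nodes for i and i's contributing neighbors."""
    dist = bfs_distances(adj, i, h)
    candidates = [u for u in dist if u != i]
    if not candidates:
        return set(), set()
    latest = max(clock_about_k[u] for u in candidates)
    if latest <= clock_about_k[i]:
        return set(), set()  # nobody within h hops knows more than i
    holders = [u for u in candidates if clock_about_k[u] == latest]
    d_min = min(dist[u] for u in holders)
    sources = {u for u in holders if dist[u] == d_min}
    contributors = set()
    for s in sources:
        dist_s = bfs_distances(adj, s, h)
        # A neighbor of i lies on a shortest path iff it is d_min - 1 from s.
        contributors |= {v for v in adj[i] if dist_s.get(v) == d_min - 1}
    return sources, contributors

# Hypothetical snapshot: chain 0-1-3 and chain 0-2-4-5; nodes 3 and 5 hold
# the latest information about k, but node 3 is closer to node 0.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2, 5], 5: [4]}
clock_about_k = {0: 0, 1: 0, 2: 0, 3: 5, 4: 0, 5: 5}
sources, contributors = sources_and_contributors(adj, clock_about_k, 0, float("inf"))
print(sources, contributors)
```

In this example only node 3 is a source for node 0, so node 1 is the only contributing neighbor; node 2 leads to the more distant holder and does not contribute.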
18. Case 1: multiple source nodes with different distances
Assumption:
Only the closest source nodes convey the information.
A neighbor that leads only to a more distant source node is not a contributing neighbor.
[Figure: snapshot at time $t$ marking the source nodes and the contributing neighbor]
19. Case 2: multiple source nodes with the same distance
Assumption:
Contributing neighbors contribute equally,
regardless of the number of shortest paths they bridge.
[Figure: snapshot at time $t$ marking the source nodes, the contributing neighbors, and the amount each contributes]
21. Research questions
1. How is the importance distributed? Broadly?
2. Is the advance asymmetric? (i → j versus j → i)
3. Is the importance “valid”?
Data set
Situation: Company office in Japan
Participants: 163
Period / resolution: 73 days / 1 min
Total events: 118,546
Data was collected by World Signal Center, Hitachi, Ltd.
22. Parameter
We set $h = \infty$.
Information can spread to all nodes in the connected component
within a snapshot.
23. 1,2. Importance is broadly distributed & asymmetric
[Figure: frequency of events; max = min on the diagonal]
24. 3. Is the importance of event “valid”?
Event removal test
Hypothesis: Removal of events with large importance values
1. makes “temporal distance” longer.
2. makes node pairs disconnected.
25. Two measures to characterize the connectivity
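Reachability ratio (Holme, PRE 2005)
Fraction of ordered node pairs $(i, j)$ with at least one temporal path from $i$ to $j$.
Small when the network is disconnected, large when it is fully connected.
Network efficiency (Tang et al., Proc. 2nd ACM SIGCOMM WOSN 2009)
Based on the time average of latency for each node pair.
Small when the network is disconnected or has large latency, large when it is fully connected with small latency.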
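Both measures can be sketched from a matrix of time-averaged latencies. The sketch below assumes that an unreachable pair is stored as infinity, that latencies are positive, and that efficiency takes the common form of an average reciprocal latency over ordered pairs; the function names and the example matrix are hypothetical.

```python
import math

def ordered_pairs(n):
    """All ordered pairs (i, j) with i != j."""
    return [(i, j) for i in range(n) for j in range(n) if i != j]

def reachability_ratio(latency):
    """Fraction of ordered pairs with a finite (time-averaged) latency."""
    pairs = ordered_pairs(len(latency))
    return sum(math.isfinite(latency[i][j]) for i, j in pairs) / len(pairs)

def network_efficiency(latency):
    """Mean reciprocal latency; unreachable pairs contribute zero."""
    pairs = ordered_pairs(len(latency))
    return sum(1.0 / latency[i][j] for i, j in pairs) / len(pairs)

inf = math.inf
# Hypothetical latency matrix for three nodes; nodes 0 and 2 never connect.
latency_matrix = [
    [0, 2, inf],
    [2, 0, 4],
    [inf, 4, 0],
]
ratio = reachability_ratio(latency_matrix)
efficiency = network_efficiency(latency_matrix)
print(ratio, efficiency)
```

The reachability ratio only counts whether a pair is connected, whereas the efficiency also drops when connected pairs have long latency.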
26. Time average of latency
Pan & Saramäki, PRE 2011
Problem: the latency is not defined over the whole observation period.
Solution: a periodic boundary condition
[Figure: construction of the time average of latency]
27. Five schemes of event removal
• ascending/descending orders of the importance
• ascending/descending orders of the link weight (link weight = number of events on the link)
• random order
[Figure: schematic plot of connectivity (fraction of connected pairs, shortness of temporal paths) against the fraction of removed events]
28. Ascending/descending orders of the link weight
1. Choose a link with the smallest/largest weight (weight = number of events on the link).
2. Remove an event on the link at random and decrease the weight of the link by one.
[Figure: static (aggregated) network]
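The two-step procedure can be sketched as follows. The function name `removal_order_by_weight` and the example events are hypothetical, and breaking ties between links of equal weight at random is an assumption of this sketch.

```python
import random

def removal_order_by_weight(events, ascending=True, seed=0):
    """Order events for removal by repeatedly picking an extreme-weight link.

    `events` is a list of (t, u, v). The weight of a link is the number of
    events still remaining on it. Ties between links are broken at random.
    """
    rng = random.Random(seed)
    on_link = {}
    for t, u, v in events:
        on_link.setdefault(frozenset((u, v)), []).append((t, u, v))
    order = []
    while on_link:
        weights = [len(evs) for evs in on_link.values()]
        target = min(weights) if ascending else max(weights)
        link = rng.choice([l for l, evs in on_link.items() if len(evs) == target])
        evs = on_link[link]
        order.append(evs.pop(rng.randrange(len(evs))))  # weight drops by one
        if not evs:
            del on_link[link]
    return order

events = [(1, "a", "b"), (2, "a", "b"), (3, "a", "b"), (4, "b", "c")]
asc = removal_order_by_weight(events, ascending=True)
desc = removal_order_by_weight(events, ascending=False)
print(asc[0], desc[0])
```

Because the weight is recomputed after every removal, the descending scheme does not exhaust the heaviest link first; it moves on once another link has become heavier.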
29. Event removal tests based on the importance
1. Removing the 80% least important events has little influence (robustness).
2. Removing the 20% most important events considerably decreases connectivity.
[Figure: network efficiency and reachability ratio versus the fraction of removed events; curves: ascending $I_{ij}$, descending $I_{ij}$]
30. Comparison with the results based on the link weight
Event removals based on temporal and on static information give similar but not identical results.
[Figure: network efficiency and reachability ratio versus the fraction of removed events; curves: ascending $I_{ij}$, descending $I_{ij}$, ascending weight, descending weight]
31. Removal of weak links fragments static network
“Strength of weak ties” property
(Granovetter, AJS 1973; Onnela et al., PNAS 2007)
Weak links connect different communities mainly
composed of strong links.
Takaguchi et al., PRX 2011
32. Do we need to consider the importance?
A criticism
Ascending-link-weight removal efficiently cuts off temporal paths.
Information about the importance is not necessary.
YES, we do need to consider the importance, because:
1. Events on weak links are necessary but NOT sufficient for
connecting efficient temporal paths.
2. Events with large importance are necessary and sufficient
for connecting efficient temporal paths.
33. Correlates of the importance value
Spearman’s rank correlation coefficient
between the importance value and
Length of the # total events # total events # partners of
IEI involving i or j i or j
0.819 0.701 0.701 0.630
IEI: interevent interval
34. Latest IEI approximates the importance
[Figure: (a) network efficiency and (b) reachability ratio versus the fraction of removed events; curves: ascending $I_{ij}$, descending $I_{ij}$, ascending IEI, descending IEI]
35. Origin of the robustness
Bursty activity patterns (Barabási, Nature 2005)
[Figure: event times of a typical individual (Takaguchi et al., PRX 2011)]
36. Exploration of the effect of burstiness
Carry out the event removal tests for the temporal networks generated by
(i) Shuffled IEIs (interevent intervals)
For each pair, randomly shuffle the order of its IEIs.
(ii) Poissonized IEIs
Reassign a random time to each event, so that events follow a Poisson process.
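The two randomizations can be sketched per link as below. This is an illustration under stated assumptions: the shuffle keeps the time of the first event on the link, and Poissonization draws times uniformly over a given observation window (event times of a Poisson process are uniform once their number is fixed). The function names and example times are hypothetical.

```python
import random

def shuffle_ieis(times, rng):
    """Shuffle the interevent intervals of one link, keeping the first event time."""
    times = sorted(times)
    ieis = [b - a for a, b in zip(times, times[1:])]
    rng.shuffle(ieis)
    out = [times[0]] if times else []
    for gap in ieis:
        out.append(out[-1] + gap)
    return out

def poissonize(times, t_min, t_max, rng):
    """Give each event a time drawn uniformly at random from [t_min, t_max]."""
    return sorted(rng.uniform(t_min, t_max) for _ in times)

rng = random.Random(1)
times = [0, 1, 2, 10]  # hypothetical bursty event times on one link
shuffled = shuffle_ieis(times, rng)
poisson = poissonize(times, 0, 10, rng)
print(shuffled, poisson)
```

Shuffling keeps the set of IEIs on each link, and therefore the burstiness, while Poissonization keeps only the number of events on the link.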
37. Characteristics conserved / lost by the randomizations
Characteristic                Original   Shuffled IEIs   Poissonized IEIs
Weighted network structure    ✓          ✓               ✓
Burstiness                    ✓          ✓               -
Temporal correlations, etc.   ✓          -               -
38. 1. Temporal correlation is not necessary
Results for Shuffled IEIs are similar to the results for the original data.
[Figure: (a) network efficiency and reachability ratio versus the fraction of removed events; curves: ascending $I_{ij}$, descending $I_{ij}$, ascending weight, descending weight, random order]
39. 2. Burstiness (long-tailed IEIs) is essential
Results for Poissonized IEIs ≠ Results for the original data & Shuffled IEIs
Removal of unimportant events rapidly spoils network efficiency.
[Figure: (b) network efficiency and reachability ratio versus the fraction of removed events]
40. Effect of the weighted network structure
(iii) Rewiring
1. Make an Erdős–Rényi random graph
with the same number of nodes and links as the original data.
2. Move the event sequence on each original link
onto a link in the random graph.
[Figure: event sequences on the links of the original network and of the rewired network]
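A minimal sketch of the rewiring step follows. It assumes the fixed-size variant of the random graph (exactly as many links as in the original data, chosen uniformly among all node pairs) and a random one-to-one assignment of event sequences to the new links; the function name `rewire` and the example data are hypothetical.

```python
import itertools
import random

def rewire(events_on_link, nodes, seed=0):
    """Place each link's event sequence on a link of a random graph.

    `events_on_link` maps an original link (u, v) to its list of event times.
    The random graph has the same nodes and the same number of links.
    """
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(sorted(nodes), 2))
    new_links = rng.sample(all_pairs, len(events_on_link))
    sequences = list(events_on_link.values())
    rng.shuffle(sequences)  # random assignment of sequences to new links
    return dict(zip(new_links, sequences))

events_on_link = {(0, 1): [1, 2, 3], (1, 2): [5]}
rewired = rewire(events_on_link, nodes=range(5))
print(rewired)
```

Each sequence is moved as a whole, so the link weight distribution and the IEIs on each link are kept while the network structure is randomized.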
41. Characteristics conserved / lost by the randomizations
Characteristic                Original   Shuffled IEIs   Poissonized IEIs   Rewiring
Weighted network structure    ✓          ✓               ✓                  -
Burstiness                    ✓          ✓               -                  ✓
Temporal correlation, etc.    ✓          -               -                  △
Link weight distribution      ✓          ✓               ✓                  ✓
42. 3. Heterogeneity in link weights is sufficient
Results for Rewiring are similar to the results for the original data.
Skewed degree distribution, community structure, structure-weight correlation, etc.
are irrelevant.
[Figure: (c) network efficiency and reachability ratio versus the fraction of removed events]
43. Effect of network structure
Can burstiness explain the heterogeneity in the importance
even without the heterogeneity in the link weight?
• Network: regular random graph (RRG) with 60 events on each link
• IEI distributions (i.i.d.): power-law + cutoff, or exponential (Poisson process)
44. Burstiness is a main cause of the robustness
[Figure: network efficiency versus the fraction of removed events for (a) power-law IEIs on the RRG and (b) exponential IEIs on the RRG; curves: ascending $I_{ij}$, descending $I_{ij}$, random order]
45. Summary
• Importance of events in temporal networks
- Based on advance of vector clocks in an event
• Heterogeneity in the importance
- Long-tailed distribution and strong asymmetry
• Robustness of empirical temporal networks
- Connectivity conserved after removing the 80% least important events
• Origin of the robustness
- Bursty activity patterns (i.e., long-tailed IEIs)
- Heterogeneity in the link weight
Reference
Taro Takaguchi, Nobuo Sato, Kazuo Yano, and Naoki Masuda,
“Importance of individual events in temporal networks”,
New Journal of Physics 14, 093003 (2012). [Open Access]