A presentation slides of Kyuhan Lee, Hyeonsoo Jo, Jihoon Ko, Sungsu Lim, Kijung Shin, "SSumM: Sparse Summarization of Massive Graphs", KDD 2020.
Given a graph G and the desired size k in bits, how can we summarize G within k bits, while minimizing the information loss?
Large-scale graphs have become omnipresent, posing considerable computational challenges. Analyzing such large graphs can be fast and easy if they are compressed sufficiently to fit in main memory or even cache. Graph summarization, which yields a coarse-grained summary graph with merged nodes, stands out with several advantages among graph compression techniques. Thus, a number of algorithms have been developed for obtaining a concise summary graph with little information loss or equivalently small reconstruction error. However, the existing methods focus solely on reducing the number of nodes, and they often yield dense summary graphs, failing to achieve better compression rates. Moreover, due to their limited scalability, they can be applied only to moderate-size graphs.
In this work, we propose SSumM, a scalable and effective graph-summarization algorithm that yields a sparse summary graph. SSumM not only merges nodes together but also sparsifies the summary graph, and the two strategies are carefully balanced based on the minimum description length principle. Compared with state-of-the-art competitors, SSumM is (a) Concise: yields up to 11.2X smaller summary graphs with similar reconstruction error, (b) Accurate: achieves up to 4.2X smaller reconstruction error with similarly concise outputs, and (c) Scalable: summarizes 26X larger graphs while exhibiting linear scalability. We validate these advantages through extensive experiments on 10 real-world graphs.
"SSumM: Sparse Summarization of Massive Graphs", KDD 2020
1. SSumM : Sparse Summarization
of Massive Graphs
Kyuhan Lee* Hyeonsoo Jo* Jihoon Ko Sungsu Lim Kijung Shin
2. Graphs are Everywhere
Citation networksSubway networks Internet topologies
Introduction Algorithms Experiments ConclusionProblem
Designed by Freepik from Flaticon
3. Massive Graphs Appeared
Social networks
2.49 Billion active users
Purchase histories
0.5 Billion products
World Wide Web
5.49 Billion web pages
Introduction Algorithms Experiments ConclusionProblem
4. Difficulties in Analyzing Massive graphs
Computational cost
(number of nodes & edges)
Introduction Algorithms Experiments ConclusionProblem
5. Difficulties in Analyzing Massive graphs
>
Cannot fit
Introduction Algorithms Experiments ConclusionProblem
Input Graph
7. Advantages of Graph Summarization
Introduction Algorithms Experiments ConclusionProblem
• Many graph compression techniques are available
• TheWebGraph Framework [BV04]
• BFS encoding [AD09]
• SlashBurn [KF11]
• VoG [KKVF14]
• Graph summarization stands out because
• Elastic: reduce size of outputs as much as we want
• Analyzable: existing graph analysis and tools can be applied
• Combinable for Additional Compression: can be further compressed
16. Problem Definition: Graph Summarization
𝑮𝑮𝑮𝑮𝑮𝑮𝑮𝑮𝑮𝑮:
a graph 𝑮𝑮 and the target number of node 𝑲𝑲
𝑭𝑭𝑭𝑭𝑭𝑭𝑭𝑭:
a summary graph �𝑮𝑮
𝑻𝑻𝑻𝑻 𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴:
the difference between graph 𝑮𝑮and the restored graph �𝑮𝑮
𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺 𝒕𝒕𝒕𝒕:
the number of supernodes in �𝑮𝑮 ≤ 𝑲𝑲
Introduction Algorithms Experiments ConclusionProblem
17. Problem Definition: Graph Summarization
𝑮𝑮𝑮𝑮𝑮𝑮𝑮𝑮𝑮𝑮:
a graph 𝑮𝑮 and the target number of node 𝑲𝑲
𝑭𝑭𝑭𝑭𝑭𝑭𝑭𝑭:
a summary graph �𝑮𝑮
𝑻𝑻𝑻𝑻 𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴:
the difference between graph 𝑮𝑮and the restored graph �𝑮𝑮
𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺 𝒕𝒕𝒕𝒕:
the number of supernodes in �𝑮𝑮 ≤ 𝑲𝑲
Shouldn’t we
consider sizes?
Introduction Algorithms Experiments ConclusionProblem
18. Problem Definition: Graph Summarization
𝑮𝑮𝑮𝑮𝑮𝑮𝑮𝑮𝑮𝑮:
a graph 𝑮𝑮 and the desired size 𝑲𝑲 (in bits)
𝑭𝑭𝑭𝑭𝑭𝑭𝑭𝑭:
a summary graph �𝑮𝑮
𝑻𝑻𝑻𝑻 𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴𝑴:
the difference with graph graph 𝑮𝑮and the restored graph �𝑮𝑮
𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺 𝒕𝒕𝒕𝒕:
size of �𝑮𝑮 in bits ≤ 𝑲𝑲
Introduction Algorithms Experiments ConclusionProblem
19. Details: Size in Bits of a Graph
𝐸𝐸 : set of edges
𝑉𝑉 : set of nodes
Encoded using log2|𝑉𝑉| bits
Introduction Algorithms Experiments ConclusionProblem
𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺 𝒐𝒐𝒐𝒐 𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈: 2 𝐸𝐸 log2 𝑉𝑉
Input graph 𝑮𝑮
20. Details: Size in Bits of a Summary Graph
4
5
1
4
1
5
Introduction Algorithms Experiments ConclusionProblem
𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺 𝒐𝒐𝒐𝒐 𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔 𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈: 𝑃𝑃 2 log2 𝑆𝑆 + log2 𝜔𝜔𝑚𝑚𝑚𝑚𝑚𝑚 + 𝑉𝑉 log2 𝑆𝑆
𝑆𝑆 : set of supernodes
𝑃𝑃 : set of superedges
𝑤𝑤𝑚𝑚𝑚𝑚𝑚𝑚 : maximum superedge weight
Summary graph �𝑮𝑮
21. Details: Size in Bits of a Summary Graph
4
5
1
4
1
5
Introduction Algorithms Experiments ConclusionProblem
𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺 𝒐𝒐𝒐𝒐 𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔 𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈: 𝑃𝑃 2 log2 𝑆𝑆 + log2 𝜔𝜔𝑚𝑚𝑚𝑚𝑚𝑚 + 𝑉𝑉 log2 𝑆𝑆
𝑆𝑆 : set of supernodes
𝑃𝑃 : set of superedges
𝑤𝑤𝑚𝑚𝑚𝑚𝑚𝑚 : maximum superedge weight
Summary graph �𝑮𝑮
22. Details: Size in Bits of a Summary Graph
4
5
1
4
1
5
Introduction Algorithms Experiments ConclusionProblem
𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺 𝒐𝒐𝒐𝒐 𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔 𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈: 𝑃𝑃 2 log2 𝑆𝑆 + log2 𝜔𝜔𝑚𝑚𝑚𝑚𝑚𝑚 + 𝑉𝑉 log2 𝑆𝑆
𝑆𝑆 : set of supernodes
𝑃𝑃 : set of superedges
𝑤𝑤𝑚𝑚𝑚𝑚𝑚𝑚 : maximum superedge weight
Summary graph �𝑮𝑮
23. Details: Size in Bits of a Summary Graph
4
5
1
4
1
5
Introduction Algorithms Experiments ConclusionProblem
𝑺𝑺𝑺𝑺𝑺𝑺𝑺𝑺 𝒐𝒐𝒐𝒐 𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔𝒔 𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈𝒈: 𝑃𝑃 2 log2 𝑆𝑆 + log2 𝜔𝜔𝑚𝑚𝑚𝑚𝑚𝑚 + 𝑉𝑉 log2 𝑆𝑆
𝑆𝑆 : set of supernodes
𝑃𝑃 : set of superedges
𝑤𝑤𝑚𝑚𝑚𝑚𝑚𝑚 : maximum superedge weight
Summary graph �𝑮𝑮
26. Main ideas of SSumM
Introduction Algorithms Experiments ConclusionProblem
Combines node grouping and edge sparsification
Prunes search space
Balances error and size of the summary graph using MDL principle
• Practical graph summarization problem
◦ Given: a graph 𝑮𝑮
◦ Find: a summary graph �𝑮𝑮
◦ To minimize: the difference between 𝑮𝑮and the restored graph �𝑮𝑮
◦ Subject to: 𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆 𝑜𝑜𝑜𝑜 �𝑮𝑮 in bits ≤ 𝑲𝑲
27. Main Idea: Combining Two Strategies
Introduction Algorithms Experiments ConclusionProblem
Node Grouping Sparsification
28. Main Idea: Combining Two Strategies
Introduction Algorithms Experiments ConclusionProblem
Node Grouping Sparsification
29. Main Idea: Combining Two Strategies
Introduction Algorithms Experiments ConclusionProblem
Node Grouping Sparsification
30. Main Idea: Combining Two Strategies
Introduction Algorithms Experiments ConclusionProblem
Node Grouping Sparsification
38. Merging and Sparsification Phase
For each candidate set 𝑪𝑪 Among possible candidate pairs
Introduction Algorithms Experiments ConclusionProblem
(A, B) (B, C) (C, D) (D, E)
(A, C) (B, D) (C, E)
(A, D) (B, E)
(A, E)
𝐵𝐵 = {𝑏𝑏}
𝐶𝐶 = {𝑐𝑐}
D = {𝑑𝑑}
𝐴𝐴 = {𝑎𝑎}
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
𝐸𝐸 = {𝑒𝑒}
39. Merging and Sparsification Phase
For each candidate set 𝑪𝑪 Among possible candidate pairs
Introduction Algorithms Experiments ConclusionProblem
(A, B) (B, C) (C, D) (D, E)
(A, C) (B, D) (C, E)
(A, D) (B, E)
(A, E)
𝐵𝐵 = {𝑏𝑏}
𝐶𝐶 = {𝑐𝑐}
D = {𝑑𝑑}
𝐴𝐴 = {𝑎𝑎}
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
Sample log2 |𝑪𝑪| pairs𝐸𝐸 = {𝑒𝑒}
40. Merging and Sparsification Phase
Select the pair with
the greatest (relative) reduction
in the cost function
𝒊𝒊𝒊𝒊 reduction(C, D) > 𝜽𝜽:
merge(C, D)
𝒆𝒆𝒆𝒆𝒆𝒆𝒆𝒆
sample log2 |𝑪𝑪| pairs again
Introduction Algorithms Experiments ConclusionProblem
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
(A, B) (A, D) (C, D)
41. Merging and Sparsification Phase
Select the pair with
the greatest (relative) reduction
in the cost function
𝒊𝒊𝒊𝒊 reduction(C, D) > 𝜽𝜽:
merge(C, D)
𝒆𝒆𝒆𝒆𝒆𝒆𝒆𝒆
sample log2 |𝑪𝑪| pairs again
Introduction Algorithms Experiments ConclusionProblem
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
(A, B) (A, D) (C, D)
42. Merging and Sparsification Phase (cont.)
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}𝐶𝐶 = {𝑐𝑐}
𝐷𝐷 = {𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
43. Merging and Sparsification Phase (cont.)
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
44. Merging and Sparsification Phase (cont.)
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
C = {𝑐𝑐, 𝑑𝑑}
45. Merging and Sparsification Phase (cont.)
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
C = {𝑐𝑐, 𝑑𝑑}
46. Merging and Sparsification Phase (cont.)
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
Sparsify or not according
to total description cost
{𝑎𝑎}
{𝑏𝑏}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
C = {𝑐𝑐, 𝑑𝑑}
47. Merging and Sparsification Phase (cont.)
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
Sparsify or not according
to total description cost
{𝑎𝑎}
{𝑏𝑏}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
C = {𝑐𝑐, 𝑑𝑑}
48. Merging and Sparsification Phase (cont.)
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase <<
Further sparsification phase
Sparsify or not according
to total description cost
{𝑎𝑎}
{𝑏𝑏}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
C = {𝑐𝑐, 𝑑𝑑}
49. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
50. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Different
candidate sets
and decreasing
threshold 𝜃𝜃
over iteration
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
51. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Summary graph �𝑮𝑮
Different
candidate sets
and decreasing
threshold 𝜃𝜃
over iteration
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
52. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Summary graph �𝑮𝑮
Different
candidate sets
and decreasing
threshold 𝜃𝜃
over iteration
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓} {ℎ}
{𝑔𝑔, 𝑖𝑖}
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
53. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Summary graph �𝑮𝑮
Different
candidate sets
and decreasing
threshold 𝜃𝜃
over iteration
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓} {ℎ}
{𝑔𝑔, 𝑖𝑖}
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
54. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Summary graph �𝑮𝑮
Different
candidate sets
and decreasing
threshold 𝜃𝜃
over iteration
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓} {ℎ}
{𝑔𝑔, 𝑖𝑖}
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
55. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Summary graph �𝑮𝑮
Different
candidate sets
and decreasing
threshold 𝜃𝜃
over iteration
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓} {𝑔𝑔, ℎ, 𝑖𝑖}
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
56. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Summary graph �𝑮𝑮
Different
candidate sets
and decreasing
threshold 𝜃𝜃
over iteration
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓} {𝑔𝑔, ℎ, 𝑖𝑖}
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
57. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Summary graph �𝑮𝑮
Different
candidate sets
and decreasing
threshold 𝜃𝜃
over iteration
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔, ℎ, 𝑖𝑖}
Summary graph �𝑮𝑮
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓} {𝑔𝑔, ℎ, 𝑖𝑖}
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
58. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Summary graph �𝑮𝑮
Different
candidate sets
and decreasing
threshold 𝜃𝜃
over iteration
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑓𝑓}
{𝑔𝑔, ℎ, 𝑖𝑖}
Summary graph �𝑮𝑮
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓} {𝑔𝑔, ℎ, 𝑖𝑖}
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
{𝑎𝑎, 𝑒𝑒}
59. Repetition
Summary graph �𝑮𝑮
Introduction Algorithms Experiments ConclusionProblem
Summary graph �𝑮𝑮
Different
candidate sets
and decreasing
threshold 𝜃𝜃
over iteration
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝒕𝒕++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑓𝑓}
{𝑔𝑔, ℎ, 𝑖𝑖}
Summary graph �𝑮𝑮
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓} {𝑔𝑔, ℎ, 𝑖𝑖}
{𝑎𝑎}
{𝑏𝑏}
{𝑐𝑐, 𝑑𝑑}
{𝑒𝑒}
{𝑓𝑓}
{𝑔𝑔}
{ℎ}
{𝑖𝑖}
{𝑎𝑎, 𝑒𝑒}
60. Introduction Algorithms Experiments ConclusionProblem
Further Sparsification Phase
Summary graph �𝑮𝑮
𝐴𝐴 𝐴𝐴
𝐵𝐵 𝐴𝐴
𝐹𝐹 𝐴𝐴
𝐺𝐺 𝐺𝐺
Superedges sorted by ∆𝑅𝑅𝐸𝐸𝑝𝑝
𝐶𝐶 𝐶𝐶
Size of �𝑮𝑮 in bits ≤ 𝑲𝑲
Procedure
Initialization phase
𝑡𝑡 = 1
While 𝑡𝑡++ ≤ 𝑻𝑻 and 𝑲𝑲 < size of �𝑮𝑮 in bits
Candidate generation phase
Merge and sparsification phase
Further sparsification phase <<
C = {𝑐𝑐, 𝑑𝑑}
𝐴𝐴 = {𝑎𝑎, 𝑒𝑒}
𝐵𝐵 = {𝑏𝑏}
𝐹𝐹 = {𝑓𝑓}
𝐺𝐺 = {𝑔𝑔, ℎ, 𝑖𝑖}
70. Code available at https://github.com/KyuhanLee/SSumM
Concise & Accurate Fast Scalable
Introduction Algorithms Experiments ConclusionProblem
Practical Problem Formulation
Extensive Experiments on 10 real world graphs
Scalable and Effective Algorithm Design
Conclusions
71. SSumM : Sparse Summarization
of Massive Graphs
Kyuhan Lee* Hyeonsoo Jo* Jihoon Ko Sungsu Lim Kijung Shin