SlideShare une entreprise Scribd logo
1  sur  37
Télécharger pour lire hors ligne
Higher-order graph clustering
Austin Benson
Stanford University
AMS Spring Western Sectional
Pullman, WA
April 22, 2017
Joint work with
David Gleich, Purdue
Jure Leskovec, Stanford
A C
B
A C
B 2
Unifying two themes of network analysis
Co-author network
1. Real-world networks have
modular organization.
[Newman04, Newman06]
2. Higher-order connectivity patterns, or
network motifs, drive complex systems.
[Milo+02, Yaveroğlu+14].
Edge-based clustering and
community detection sometimes
expose this structure.
Signed feed-forward loops in
genetic transcription.
[Mangan+03, Alon07]
Triangles in social networks.
[Rapoport53, Granovetter73]
Bi-directed length-2 paths in
brain networks.
[Sporns-Kötter04, Sporns+07]
A CA C
3
Motifs are the fundamental units of
complex networks.
We should look for structure
(clusters) in terms of them.
Network
Motif
Different motifs give different clusters.
4
Higher-order graph clustering is our technique
for finding clusters based on motifs
5
§ We will generalize spectral clustering, a classical technique to find
clusters or communities in a graph, to use motifs as the fundamental
unit to partition.
§ Based on a higher-order (motif-based) conductance metric that
generalizes the traditional conductance.
§ Comes with theoretical guarantees.
§ We’ll first briefly review how spectral clustering works.
§ Then we’ll see how to adapt it to work with network motifs.
§ Then we’ll see the impact of this approach on various real-world data.
Higher-order graph clustering
Main points and overview
Graph Laplacian Eigenvector
6
Background spectral clustering is a classic technique to
partition graphs by looking at eigenvectors
Cluster
[Fiedler73]
Conductance is one of the most important cluster quality scores [Schaeffer07]
used in Markov chain theory, spectral clustering, bioinformatics, vision, etc.
The conductance of a set of vertices S is the ratio of
edges leaving to total edges
small conductance ó good cluster
(edges leaving S)
(edge end points in S)
7
S
cut(S) = 7
vol(S) = 85
vol( ¯S) = 151
ffi(S) = 7/85
Background spectral clustering works based on conductance
S
(S) =
cut(S)
min(vol(S), vol(S))
Cheeger Inequality
Finding the best conductance set is NP-hard. L
§ Cheeger realized the eigenvalues of the Laplacian
provided surface area to volume bounds in manifolds.
§ Alon and Milman independently realized the same
thing for a graph (conductance)!
Laplacian
2
⇤/2  2  2 ⇤
0 = 1  2  ...  n  2
⇤ = set of smallest conductance
8
Background spectral clustering has theoretical guarantees
[Cheeger70, Alon-Milman85]
D = diag(Ae)
L = D−1/2
(A − D)D−1/2
Eigenvalues of the Laplacian L
We can find a set S that achieves the Cheeger
bound.
1. Compute the eigenvector z associated with λ2
and scale to f = D-1/2z
2. Sort the vertices by their values in f:
σ1, σ2, …, σn
3. Let Sr = {σ1, …, σr} and compute the
conductance of φ(Sr) of each Sr.
4. Pick the set Sm with minimum conductance.
9
Background the sweep cut algorithm realizes the guarantee
[Mihail89, Chung92]
0 20 40
0
0.2
0.4
0.6
0.8
1
S
i
φi
(Sm) 2
Sr
0 20 40
0
0.2
0.4
0.6
0.8
1
Si
φ
i
10
Background the sweep cut visualized
[Mihail89, Chung92]
Sr
(S) =
cut(S)
min(vol(S), vol(S))
We want to cluster with richer data
Motifs that may be directed, signed, colored, feature-valued, etc.
11
Spectral clustering is theoretically justified for finding
edge-based clusters in undirected, simple graphs.
Signed feed-forward loops in genetic
transcription [Mangan+03]
Gene X activates transcription in gene Y.
Gene X suppresses transcription in gene Z.
Gene Y suppresses transcription in gene Z.X
Y
Z
Benson-Gleich-Leskovec, Science, 2016.
12
Our contributions
§ A generalized conductance metric for motifs.
§ A new spectral clustering algorithm to minimize
the generalized conductance.
§ AND an associated motif Cheeger inequality
guarantee.
§ Naturally handles directed, signed, colored,
weighted, and combinations of motifs.
§ Scales to networks with billions of edges.
§ Applications in ecology, biology, and
transportation.
A
B
wijk
13
How do we find clusters
based on motifs?
Motif-based conductance
Need new notions of cut and volume
14
φM (S) =
cutM (S)
min(volM (S), volM (S))
(S) =
cut(S)
min(vol(S), vol(S))
vol(S) = #(edge end points in S)
cut(S) = #(edges cut) cutM (S) = #(motifs cut)
volM (S) = #(motif end points in S)
S
S
S
S S
M = triangle motif
Motif-based conductance
Motif M
M (S) =
motifs cut
motif volume
=
1
8
15
9
10
3
5
6
7
8
1
2
4
9
10
3
5
6
7
8
1
2
4
S
S
Problem Given a motif M and a graph G, we want to
find a set of nodes S that minimizes motif conductance
This is NP-hard. [Wagner-Wagner93]
Our solution Generalize spectral clustering for motifs
1. Form new weighted, undirected graph W(M) based on M and G
2. Compute Fiedler vector of Laplacian matrix of W(M) [Fiedler73, Alon-Milman85]
3. Use “sweep cut” procedure to output clusters [Mihail89, Chung92]
Theorem (motif Cheeger inequality)
resulting clusters will obtain near optimal motif conductance
16
Higher-order clustering
Motif-based spectral clustering
1
2
3
4
5
6
7
8
9
10
1
11
1
1
1
3
1
1
1
1
1
2
1
1
17
9
10
3
5
6
7
8
1
2
4
W(M)
ij = #{instances of motif M that contain nodes i and j}
motif M
Step 1. Given directed graph G and motif M, form a weighted graph W(M).
graph G
weighted graph W(M)
Key insight
Classical spectral clustering on
weighted graph W(M) finds clusters
of low motif conductance.
M (S) =
motifs cut
motif volume
=
1
10
1
2
3
4
5
6
7
8
9
10
1
11
1
1
1
3
1
1
1
1
1
2
1
1
18
Motif-based spectral clustering
Step 1. Given directed graph G and motif M, form a weighted graph W(M).
motif M
W(M)
ij = #{instances of motif M that contain nodes i and j}
W(M)
Motif-based spectral clustering
19
Step 2. Compute the eigenvector f(M) associated with λ2 of the
normalized Laplacian matrix of W(M)
1
2
3
4
5
6
7
8
9
10
-0.25
-0.2
-0.15
-0.1
-0.05
0
0.05
0.1
0.15
Takes roughly O(# edges) time.
D = diag(W(M)
e)
L(M)
= D−1/2
(D − W(M)
)D−1/2
L(M)
z = –2z
f (M)
= D−1/2
z
f(M)
Motif-based spectral clustering
20
Step 3 (motif sweep cut) [Mihail89,Chung92]
§ Sort nodes by values in f(M) → σ1, σ2, …σn.
§ Pick set Sr = {σ1, …, σr} with smallest motif conductance.
9
10
3
5
6
7
8
1
2
4
Motif Cheeger inequality
Theorem If the motif has three
nodes, then the sweep procedure
on the weighted graph finds a set
S of nodes for which
For 4+ nodes, need slightly different
notion of conductance.
21
M(G) = {instances of M in G}
Key Proof Step
cutM (S, G) =
X
{i,j,k}2M(G)
Indicator[xi , xj , xk not the same]
= 1
4 (x2
i + x2
j + x2
k xi xj xj xk xi xk )
= quadratic in x
M (S) 2 M
Applications
22
1. We do not know the motif of interest.
food webs and new applications
2. We know the motif of interest from domain knowledge.
yeast transcription regulation networks, connectome, social networks
3. We seek richer information from our data.
transportation networks and new applications
23
Application 1
We do not know the motif of interest.
Application 1 Food webs
Florida bay food web
§ Nodes are species
§ Edges represent carbon exchange
i → j if j eats i
§ Motifs represent energy flow patterns
24
http://marinebio.org/oceans/marine-zones/
M6M5
Application 1 Food webs
Which motif clusters the food web?
Our approach
§ Run motif spectral clustering for all 3-node
motifs as well as for just edges.
§ Examine the sweep profile to see which motif
gives the best clusters.
25
Application 1 Food webs
26
M6
M5
edge
Our finding motif M6
organizes the food web
into good clusters
M6M5
edge
B D
Micronutrient
sources
Pelagic fishes
and benthic
prey
Benthic macro-
invertebrates
Benthic Fishes
Motif M6 reveals
aquatic layers
A
61% accuracy vs.
48% with edge-
based methods
27
Application 1 Food webs
28
Application 2
We know the motif of interest
from domain knowledge.
Application 2 Yeast transcription regulation networks
§ Nodes are groups of genes
§ Edge i → j means i regulates transcription to j
§ Sign + / - denotes activation / suppression
§ Coherent feedforward loops encode biological function
[Mangan+03, Alon07]
A CC A
29
Clustering based on coherent
feedforward loops identifies
functions studied individually
by biologists [Mangan+03]
97% accuracy vs.
68–82% with edge-based methods
A
B
A CA C
A
B
30
Application 2 Yeast transcription regulation networks
A
B
31
Structure of the found modules
(all edge signs are positive)
Application 2 Yeast transcription regulation networks
32
Application 3
We seek richer information
from our data.
Application 3 Transportation networks
§ North American air
transport network.
§ Nodes are cites.
§ i → j if you can travel
from i to j in < 8 hours.
[Frey-Dueck07]
33
Accepted pending m
	
B
A
Important motifs from literature
[Rosvall+2014]
W
(M)
ij = #{bi-directional length-2 paths from i to j}
Weighted adjacency matrix already reveals hub-like structure
34
Application 3 Transportation networks
Top 8
U.S. hubs
East coast non-hubs
West coast non-hubs
Primary spectral coordinate
Atlanta, the top hub, is
next to Salina, a non-hub.
MOTIF SPECTRAL EMBEDDING EDGE SPECTRAL EMBEDDING
35
Application 3 Transportation networks
36
§Generalization of graph clustering to higher-order structures
(motifs) through a new objective (motif conductance).
§Generalizing old ideas from spectral graph theory admits a new
algorithm and a motif Cheeger inequality.
§Applications in ecology, biology, transportation.
Recap Higher-order graph clustering
# form W0 … W4
sc0 = spectral_cut(W0)
sc1 = spectral_cut(W1)
sc2 = spectral_cut(W2)
sc3 = spectral_cut(W3)
sc4 = spectral_cut(W4)
plt = x -> semilogx(
x.sweepcut_profile.
conductance)
plt(sc0)
plt(sc1)
plt(sc2)
plt(sc3)
plt(sc4)
Find me at…
http://stanford.edu/~arbenson
@austinbenson
arbenson@stanford.edu
Thanks!
Austin Benson
Stanford University
For fun reproduce some results from this talk!
Jupyter notebooks at bit.ly/higher-order-julia
Higher-order graph clustering
Paper Benson-Gleich-Leskovec, Higher-order organization of complex networks, Science, 2016
Code + data http://snap.stanford.edu/higher-order

Contenu connexe

Tendances

CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...
CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...
CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...
The Statistical and Applied Mathematical Sciences Institute
 
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
Yandex
 
Fuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networksFuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networks
mourya chandra
 
International Journal of Computer Science and Security Volume (2) Issue (5)
International Journal of Computer Science and Security Volume (2) Issue (5)International Journal of Computer Science and Security Volume (2) Issue (5)
International Journal of Computer Science and Security Volume (2) Issue (5)
CSCJournals
 
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
Kolja Kleineberg
 

Tendances (20)

CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...
CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...
CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...
 
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
 
A NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKS
A NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKSA NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKS
A NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKS
 
CSMR11b.ppt
CSMR11b.pptCSMR11b.ppt
CSMR11b.ppt
 
Deep learning ensembles loss landscape
Deep learning ensembles loss landscapeDeep learning ensembles loss landscape
Deep learning ensembles loss landscape
 
Mean shift and Hierarchical clustering
Mean shift and Hierarchical clustering Mean shift and Hierarchical clustering
Mean shift and Hierarchical clustering
 
Fuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networksFuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networks
 
08 clustering
08 clustering08 clustering
08 clustering
 
26 spanning
26 spanning26 spanning
26 spanning
 
Hierarchical clustering
Hierarchical clusteringHierarchical clustering
Hierarchical clustering
 
International Journal of Computer Science and Security Volume (2) Issue (5)
International Journal of Computer Science and Security Volume (2) Issue (5)International Journal of Computer Science and Security Volume (2) Issue (5)
International Journal of Computer Science and Security Volume (2) Issue (5)
 
Finding Neighbors in Images Represented By Quadtree
Finding Neighbors in Images Represented By QuadtreeFinding Neighbors in Images Represented By Quadtree
Finding Neighbors in Images Represented By Quadtree
 
Hierarchical clustering techniques
Hierarchical clustering techniquesHierarchical clustering techniques
Hierarchical clustering techniques
 
Bistablecamnets
BistablecamnetsBistablecamnets
Bistablecamnets
 
H010223640
H010223640H010223640
H010223640
 
K mean-clustering algorithm
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithm
 
Hierarchical Clustering
Hierarchical ClusteringHierarchical Clustering
Hierarchical Clustering
 
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
 
Data miningpresentation
Data miningpresentationData miningpresentation
Data miningpresentation
 
Rough K Means - Numerical Example
Rough K Means - Numerical ExampleRough K Means - Numerical Example
Rough K Means - Numerical Example
 

Similaire à Higher-order graph clustering at AMS Spring Western Sectional

3 d active meshes for cell tracking
3 d active meshes for cell tracking3 d active meshes for cell tracking
3 d active meshes for cell tracking
Prashant Pal
 

Similaire à Higher-order graph clustering at AMS Spring Western Sectional (20)

Higher-order spectral graph clustering with motifs
Higher-order spectral graph clustering with motifsHigher-order spectral graph clustering with motifs
Higher-order spectral graph clustering with motifs
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networks
 
Spectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresSpectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structures
 
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
 
Suppression of grating lobes
Suppression of grating lobesSuppression of grating lobes
Suppression of grating lobes
 
Tensor Spectral Clustering
Tensor Spectral ClusteringTensor Spectral Clustering
Tensor Spectral Clustering
 
Minimizing cost in distributed multiquery processing applications
Minimizing cost in distributed multiquery processing applicationsMinimizing cost in distributed multiquery processing applications
Minimizing cost in distributed multiquery processing applications
 
Wereszczynski Molecular Dynamics
Wereszczynski Molecular DynamicsWereszczynski Molecular Dynamics
Wereszczynski Molecular Dynamics
 
Csmr11b.ppt
Csmr11b.pptCsmr11b.ppt
Csmr11b.ppt
 
Learning multifractal structure in large networks (Purdue ML Seminar)
Learning multifractal structure in large networks (Purdue ML Seminar)Learning multifractal structure in large networks (Purdue ML Seminar)
Learning multifractal structure in large networks (Purdue ML Seminar)
 
2009 asilomar
2009 asilomar2009 asilomar
2009 asilomar
 
3 d active meshes for cell tracking
3 d active meshes for cell tracking3 d active meshes for cell tracking
3 d active meshes for cell tracking
 
Lausanne 2019 #4
Lausanne 2019 #4Lausanne 2019 #4
Lausanne 2019 #4
 
Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...
 
Answers withexplanations
Answers withexplanationsAnswers withexplanations
Answers withexplanations
 
11 clusadvanced
11 clusadvanced11 clusadvanced
11 clusadvanced
 
11ClusAdvanced.ppt
11ClusAdvanced.ppt11ClusAdvanced.ppt
11ClusAdvanced.ppt
 
Chapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.pptChapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.ppt
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Paper id 26201482
Paper id 26201482Paper id 26201482
Paper id 26201482
 

Plus de Austin Benson

Plus de Austin Benson (20)

Hypergraph Cuts with General Splitting Functions (JMM)
Hypergraph Cuts with General Splitting Functions (JMM)Hypergraph Cuts with General Splitting Functions (JMM)
Hypergraph Cuts with General Splitting Functions (JMM)
 
Spectral embeddings and evolving networks
Spectral embeddings and evolving networksSpectral embeddings and evolving networks
Spectral embeddings and evolving networks
 
Computational Frameworks for Higher-order Network Data Analysis
Computational Frameworks for Higher-order Network Data AnalysisComputational Frameworks for Higher-order Network Data Analysis
Computational Frameworks for Higher-order Network Data Analysis
 
Higher-order link prediction and other hypergraph modeling
Higher-order link prediction and other hypergraph modelingHigher-order link prediction and other hypergraph modeling
Higher-order link prediction and other hypergraph modeling
 
Hypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting FunctionsHypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting Functions
 
Hypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting FunctionsHypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting Functions
 
Higher-order link prediction
Higher-order link predictionHigher-order link prediction
Higher-order link prediction
 
Simplicial closure & higher-order link prediction
Simplicial closure & higher-order link predictionSimplicial closure & higher-order link prediction
Simplicial closure & higher-order link prediction
 
Three hypergraph eigenvector centralities
Three hypergraph eigenvector centralitiesThree hypergraph eigenvector centralities
Three hypergraph eigenvector centralities
 
Semi-supervised learning of edge flows
Semi-supervised learning of edge flowsSemi-supervised learning of edge flows
Semi-supervised learning of edge flows
 
Choosing to grow a graph
Choosing to grow a graphChoosing to grow a graph
Choosing to grow a graph
 
Link prediction in networks with core-fringe structure
Link prediction in networks with core-fringe structureLink prediction in networks with core-fringe structure
Link prediction in networks with core-fringe structure
 
Higher-order Link Prediction GraphEx
Higher-order Link Prediction GraphExHigher-order Link Prediction GraphEx
Higher-order Link Prediction GraphEx
 
Higher-order Link Prediction Syracuse
Higher-order Link Prediction SyracuseHigher-order Link Prediction Syracuse
Higher-order Link Prediction Syracuse
 
Random spatial network models for core-periphery structure
Random spatial network models for core-periphery structureRandom spatial network models for core-periphery structure
Random spatial network models for core-periphery structure
 
Random spatial network models for core-periphery structure.
Random spatial network models for core-periphery structure.Random spatial network models for core-periphery structure.
Random spatial network models for core-periphery structure.
 
Simplicial closure & higher-order link prediction
Simplicial closure & higher-order link predictionSimplicial closure & higher-order link prediction
Simplicial closure & higher-order link prediction
 
Simplicial closure and simplicial diffusions
Simplicial closure and simplicial diffusionsSimplicial closure and simplicial diffusions
Simplicial closure and simplicial diffusions
 
Sampling methods for counting temporal motifs
Sampling methods for counting temporal motifsSampling methods for counting temporal motifs
Sampling methods for counting temporal motifs
 
Set prediction three ways
Set prediction three waysSet prediction three ways
Set prediction three ways
 

Dernier

Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 

Dernier (20)

Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptx
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 

Higher-order graph clustering at AMS Spring Western Sectional

  • 1. Higher-order graph clustering Austin Benson Stanford University AMS Spring Western Sectional Pullman, WA April 22, 2017 Joint work with David Gleich, Purdue Jure Leskovec, Stanford
  • 2. A C B A C B 2 Unifying two themes of network analysis Co-author network 1. Real-world networks have modular organization. [Newman04, Newman06] 2. Higher-order connectivity patterns, or network motifs, drive complex systems. [Milo+02, Yaveroğlu+14]. Edge-based clustering and community detection sometimes expose this structure. Signed feed-forward loops in genetic transcription. [Mangan+03, Alon07] Triangles in social networks. [Rapoport53, Granovetter73] Bi-directed length-2 paths in brain networks. [Sporns-Kötter04, Sporns+07] A CA C
  • 3. 3 Motifs are the fundamental units of complex networks. We should look for structure (clusters) in terms of them.
  • 4. Network Motif Different motifs give different clusters. 4 Higher-order graph clustering is our technique for finding clusters based on motifs
  • 5. 5 § We will generalize spectral clustering, a classical technique to find clusters or communities in a graph, to use motifs as the fundamental unit to partition. § Based on a higher-order (motif-based) conductance metric that generalizes the traditional conductance. § Comes with theoretical guarantees. § We’ll first briefly review how spectral clustering works. § Then we’ll see how to adapt it to work with network motifs. § Then we’ll see the impact of this approach on various real-world data. Higher-order graph clustering Main points and overview
  • 6. Graph Laplacian Eigenvector 6 Background spectral clustering is a classic technique to partition graphs by looking at eigenvectors Cluster [Fiedler73]
  • 7. Conductance is one of the most important cluster quality scores [Schaeffer07] used in Markov chain theory, spectral clustering, bioinformatics, vision, etc. The conductance of a set of vertices S is the ratio of edges leaving to total edges small conductance ó good cluster (edges leaving S) (edge end points in S) 7 S cut(S) = 7 vol(S) = 85 vol( ¯S) = 151 ffi(S) = 7/85 Background spectral clustering works based on conductance S (S) = cut(S) min(vol(S), vol(S))
  • 8. Cheeger Inequality Finding the best conductance set is NP-hard. L § Cheeger realized the eigenvalues of the Laplacian provided surface area to volume bounds in manifolds. § Alon and Milman independently realized the same thing for a graph (conductance)! Laplacian 2 ⇤/2  2  2 ⇤ 0 = 1  2  ...  n  2 ⇤ = set of smallest conductance 8 Background spectral clustering has theoretical guarantees [Cheeger70, Alon-Milman85] D = diag(Ae) L = D−1/2 (A − D)D−1/2 Eigenvalues of the Laplacian L
  • 9. We can find a set S that achieves the Cheeger bound. 1. Compute the eigenvector z associated with λ2 and scale to f = D-1/2z 2. Sort the vertices by their values in f: σ1, σ2, …, σn 3. Let Sr = {σ1, …, σr} and compute the conductance of φ(Sr) of each Sr. 4. Pick the set Sm with minimum conductance. 9 Background the sweep cut algorithm realizes the guarantee [Mihail89, Chung92] 0 20 40 0 0.2 0.4 0.6 0.8 1 S i φi (Sm) 2 Sr
  • 10. 0 20 40 0 0.2 0.4 0.6 0.8 1 Si φ i 10 Background the sweep cut visualized [Mihail89, Chung92] Sr (S) = cut(S) min(vol(S), vol(S))
  • 11. We want to cluster with richer data Motifs that may be directed, signed, colored, feature-valued, etc. 11 Spectral clustering is theoretically justified for finding edge-based clusters in undirected, simple graphs. Signed feed-forward loops in genetic transcription [Mangan+03] Gene X activates transcription in gene Y. Gene X suppresses transcription in gene Z. Gene Y suppresses transcription in gene Z.X Y Z
  • 12. Benson-Gleich-Leskovec, Science, 2016. 12 Our contributions § A generalized conductance metric for motifs. § A new spectral clustering algorithm to minimize the generalized conductance. § AND an associated motif Cheeger inequality guarantee. § Naturally handles directed, signed, colored, weighted, and combinations of motifs. § Scales to networks with billions of edges. § Applications in ecology, biology, and transportation. A B wijk
  • 13. 13 How do we find clusters based on motifs?
  • 14. Motif-based conductance Need new notions of cut and volume 14 φM (S) = cutM (S) min(volM (S), volM (S)) (S) = cut(S) min(vol(S), vol(S)) vol(S) = #(edge end points in S) cut(S) = #(edges cut) cutM (S) = #(motifs cut) volM (S) = #(motif end points in S) S S S S S M = triangle motif
  • 15. Motif-based conductance Motif M M (S) = motifs cut motif volume = 1 8 15 9 10 3 5 6 7 8 1 2 4 9 10 3 5 6 7 8 1 2 4 S S
  • 16. Problem Given a motif M and a graph G, we want to find a set of nodes S that minimizes motif conductance This is NP-hard. [Wagner-Wagner93] Our solution Generalize spectral clustering for motifs 1. Form new weighted, undirected graph W(M) based on M and G 2. Compute Fiedler vector of Laplacian matrix of W(M) [Fiedler73, Alon-Milman85] 3. Use “sweep cut” procedure to output clusters [Mihail89, Chung92] Theorem (motif Cheeger inequality) resulting clusters will obtain near optimal motif conductance 16 Higher-order clustering
  • 17. Motif-based spectral clustering 1 2 3 4 5 6 7 8 9 10 1 11 1 1 1 3 1 1 1 1 1 2 1 1 17 9 10 3 5 6 7 8 1 2 4 W(M) ij = #{instances of motif M that contain nodes i and j} motif M Step 1. Given directed graph G and motif M, form a weighted graph W(M). graph G weighted graph W(M)
  • 18. Key insight Classical spectral clustering on weighted graph W(M) finds clusters of low motif conductance. M (S) = motifs cut motif volume = 1 10 1 2 3 4 5 6 7 8 9 10 1 11 1 1 1 3 1 1 1 1 1 2 1 1 18 Motif-based spectral clustering Step 1. Given directed graph G and motif M, form a weighted graph W(M). motif M W(M) ij = #{instances of motif M that contain nodes i and j} W(M)
  • 19. Motif-based spectral clustering 19 Step 2. Compute the eigenvector f(M) associated with λ2 of the normalized Laplacian matrix of W(M) 1 2 3 4 5 6 7 8 9 10 -0.25 -0.2 -0.15 -0.1 -0.05 0 0.05 0.1 0.15 Takes roughly O(# edges) time. D = diag(W(M) e) L(M) = D−1/2 (D − W(M) )D−1/2 L(M) z = –2z f (M) = D−1/2 z f(M)
  • 20. Motif-based spectral clustering 20 Step 3 (motif sweep cut) [Mihail89,Chung92] § Sort nodes by values in f(M) → σ1, σ2, …σn. § Pick set Sr = {σ1, …, σr} with smallest motif conductance. 9 10 3 5 6 7 8 1 2 4
  • 21. Motif Cheeger inequality Theorem If the motif has three nodes, then the sweep procedure on the weighted graph finds a set S of nodes for which For 4+ nodes, need slightly different notion of conductance. 21 M(G) = {instances of M in G} Key Proof Step cutM (S, G) = X {i,j,k}2M(G) Indicator[xi , xj , xk not the same] = 1 4 (x2 i + x2 j + x2 k xi xj xj xk xi xk ) = quadratic in x M (S) 2 M
  • 22. Applications 22 1. We do not know the motif of interest. food webs and new applications 2. We know the motif of interest from domain knowledge. yeast transcription regulation networks, connectome, social networks 3. We seek richer information from our data. transportation networks and new applications
  • 23. 23 Application 1 We do not know the motif of interest.
  • 24. Application 1 Food webs Florida bay food web § Nodes are species § Edges represent carbon exchange i → j if j eats i § Motifs represent energy flow patterns 24 http://marinebio.org/oceans/marine-zones/ M6M5
  • 25. Application 1 Food webs Which motif clusters the food web? Our approach § Run motif spectral clustering for all 3-node motifs as well as for just edges. § Examine the sweep profile to see which motif gives the best clusters. 25
  • 26. Application 1 Food webs 26 M6 M5 edge Our finding motif M6 organizes the food web into good clusters M6M5 edge
  • 27. B D Micronutrient sources Pelagic fishes and benthic prey Benthic macro- invertebrates Benthic Fishes Motif M6 reveals aquatic layers A 61% accuracy vs. 48% with edge- based methods 27 Application 1 Food webs
  • 28. 28 Application 2 We know the motif of interest from domain knowledge.
  • 29. Application 2 Yeast transcription regulation networks § Nodes are groups of genes § Edge i → j means i regulates transcription to j § Sign + / - denotes activation / suppression § Coherent feedforward loops encode biological function [Mangan+03, Alon07] A CC A 29
  • 30. Clustering based on coherent feedforward loops identifies functions studied individually by biologists [Mangan+03] 97% accuracy vs. 68–82% with edge-based methods A B A CA C A B 30 Application 2 Yeast transcription regulation networks
  • 31. A B 31 Structure of the found modules (all edge signs are positive) Application 2 Yeast transcription regulation networks
  • 32. 32 Application 3 We seek richer information from our data.
  • 33. Application 3 Transportation networks § North American air transport network. § Nodes are cites. § i → j if you can travel from i to j in < 8 hours. [Frey-Dueck07] 33
  • 34. Accepted pending m B A Important motifs from literature [Rosvall+2014] W (M) ij = #{bi-directional length-2 paths from i to j} Weighted adjacency matrix already reveals hub-like structure 34 Application 3 Transportation networks
  • 35. Top 8 U.S. hubs East coast non-hubs West coast non-hubs Primary spectral coordinate Atlanta, the top hub, is next to Salina, a non-hub. MOTIF SPECTRAL EMBEDDING EDGE SPECTRAL EMBEDDING 35 Application 3 Transportation networks
  • 36. 36 §Generalization of graph clustering to higher-order structures (motifs) through a new objective (motif conductance). §Generalizing old ideas from spectral graph theory admits a new algorithm and a motif Cheeger inequality. §Applications in ecology, biology, transportation. Recap Higher-order graph clustering
  • 37. # form W0 … W4 sc0 = spectral_cut(W0) sc1 = spectral_cut(W1) sc2 = spectral_cut(W2) sc3 = spectral_cut(W3) sc4 = spectral_cut(W4) plt = x -> semilogx( x.sweepcut_profile. conductance) plt(sc0) plt(sc1) plt(sc2) plt(sc3) plt(sc4) Find me at… http://stanford.edu/~arbenson @austinbenson arbenson@stanford.edu Thanks! Austin Benson Stanford University For fun reproduce some results from this talk! Jupyter notebooks at bit.ly/higher-order-julia Higher-order graph clustering Paper Benson-Gleich-Leskovec, Higher-order organization of complex networks, Science, 2016 Code + data http://snap.stanford.edu/higher-order