Communities in Networks
Peter J. Mucha, UNC–Chapel Hill
[Title-slide figure; labels are U.S. House committees: Agriculture, Appropriations, International Relations, Budget, House Administration, Energy/Commerce, Financial Services, Veterans’ Affairs, Education, Armed Services, Judiciary, Resources, Rules, Science, Small Business, Official Conduct, Transportation, Government Reform, Ways and Means, Intelligence, Homeland Security]
Outline & Acknowledgements
1. What is community detection and
why is it useful?
2. How do you calculate communities?
– Software links
– Importance of resolution parameters
3. Multilayer networks
– If time permits (I’ll leave you slides)
▪ Skyler Cranmer, James Fowler, Jeff Henderson, Jim Moody, J.-P. Onnela, Mason Porter
▪ Dani Bassett, Kaveri Chaturvedi, Saray Shai, Dane Taylor
▪ Natalie Stanley, Mandi Traud, Andrew Waugh, William Weir, James Wilson
▪ Scott Emmons, Ryan Gibson, Eric Kelsic, Kevin Macon, Thomas Richardson
▪ JSMF, UCRF (UNC), ARO, CDC, NICHD, NIDDK, NIGMS, NSF
Apologies that this presentation will seriously err on the self-absorbed side.
It’s a big field, and I do not promise to cover even a small piece of it here.
Philosophical Disclaimer
▪ Jim Moody (paraphrased): “I’ve been accused of turning everything into a network.”
▪ PJM (in response): “I’m accused of turning everything into a network and a graph partitioning problem.”
▪ “Structure ⇔ Function”
Images by Aaron Clauset
Karate Club Example
This partition optimizes modularity, which measures the
number of intra-community ties (relative to a random model)
“If your method doesn’t work on this network, then go home.”
Karate Club Club
“Cris Moore (left) is the
inaugural recipient of the
Zachary Karate Club Club prize,
awarded on behalf of the
community by Aric Hagberg
(right). (9 May 2013)”
Facebook
Traud et al., “Comparing community structure to characteristics in
online collegiate social networks” (2011)
Traud et al., “Social structure of Facebook networks” (2012)
Caltech 2005:
Colors indicate residential
“House” affiliations
Purple = Not provided
Community Detection Firehose Overview
▪ “Hard/rigid” v. “soft/overlapping” clusters
▪ cf. biclustering methods and mathematics of expander graphs
▪ A community should describe a “cohesive group”: varying formulations/algorithms
• Linkage clustering (average, single), local clustering coefficients, betweenness (geodesic, random walk), spectral, conductance, …
▪ Classic approach in CS: Spectral Graph Partitioning
• Need to specify number of communities sought
▪ Conductance
▪ MDL, Infomap, OSLOM, … (many other things I’ve missed) …
▪ Stochastic Block Models: generative with in/out probabilities between labeled groups
▪ Modularity: a good partition has more total intra-community edge weight than one would expect at random (but according to what model?)
“Communities in Networks,” M. A. Porter, J.-P. Onnela & P. J. Mucha,
Notices of the American Mathematical Society 56, 1082-97 & 1164-6 (2009).
“Community Detection in Graphs,” S. Fortunato, Physics Reports 486, 75-174 (2010).
“Community detection in networks: A user guide,” S. Fortunato & D. Hric, Physics Reports 659, 1-44 (2016).
“Case studies in network community detection,” S. Shai, N. Stanley, C. Granell, D. Taylor & P. J. Mucha, arXiv:1705.02305.
Modularity (see Newman & Girvan and other Newman papers)
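$$Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - P_{ij} \right) \delta(c_i, c_j)$$
▪ $m$: total edge weight
▪ $B_{ij} = A_{ij} - P_{ij}$: modularity matrix
▪ $\delta(c_i, c_j)$: indicator on nodes i & j in same community
▪ $A_{ij}$: your data (edge from i to j?)
▪ $P_{ij}$: random “null model” for expected edge weight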
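As a minimal sketch of this definition (not the code of any package named in this talk), the function below evaluates Q = (1/2m) Σ_ij (A_ij − P_ij) δ(c_i, c_j) directly for a given null model P. The toy graph (two triangles joined by one edge), the helper names, and the choice P_ij = k_i k_j / (2m) are illustrative assumptions.

```python
def adjacency(n, edges):
    """Dense symmetric 0/1 adjacency matrix for an undirected graph."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A


def configuration_null(A):
    """Assumed null model: expected weight P_ij = k_i * k_j / (2m)."""
    k = [sum(row) for row in A]
    two_m = sum(k)
    return [[ki * kj / two_m for kj in k] for ki in k]


def modularity(A, P, labels):
    """Q = (1/2m) * sum over pairs (i, j) in the same community of A_ij - P_ij."""
    n = len(A)
    two_m = sum(sum(row) for row in A)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                total += A[i][j] - P[i][j]
    return total / two_m


# Hypothetical toy data: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = adjacency(6, EDGES)
P = configuration_null(A)
q_two = modularity(A, P, [0, 0, 0, 1, 1, 1])  # one community per triangle
q_one = modularity(A, P, [0] * 6)             # everything in one community
print(round(q_two, 4), abs(round(q_one, 4)))
```

Splitting the toy graph into its two triangles scores Q = 5/14, while lumping all nodes together scores zero under this null model, which is the sense in which a partition is judged "relative to a random model".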
Modularity (see Newman & Girvan and other Newman papers)
▪ GOAL: Assign nodes to communities in order to maximize quality function Q
▪ NP-hard to maximize (decision version NP-complete) [Brandes et al. 2008] ~ enumerate possible partitions
▪ Numerous packages developed/developing
• e.g. igraph library (R, Python), NetworkX, Louvain
• Need appropriate null model
▪ ER degree distribution (binomial/Poisson) is not a good model for many real-world data sets
▪ Independent edges, constrained to expected degree sequence same as observed.
▪ Requires $P_{ij} = f(k_i) f(k_j)$, quickly yielding $P_{ij} = \frac{k_i k_j}{2m}$ and
$$Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)$$
▪ γ resolution parameter ad hoc (default = 1) [Reichardt & Bornholdt, 2006; Lambiotte et al., 2008 & 2015]
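To make the role of the resolution parameter concrete: with the null model P_ij = k_i k_j / (2m), the quality of a fixed partition is linear in γ, Q(γ) = a − γ b, where a is the fraction of edges inside communities and b = Σ_c (d_c / 2m)² with d_c the total degree of community c. The sketch below computes (a, b) and reports which of three candidate partitions scores best at a few values of γ. The toy graph, the candidate partitions, and the function names are illustrative assumptions.

```python
def line_coefficients(n, edges, labels):
    """Return (a, b) such that Q(gamma) = a - gamma * b for an unweighted,
    undirected graph under the null model P_ij = k_i k_j / (2m)."""
    m = len(edges)
    degree = [0] * n
    inside = 0
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
        if labels[i] == labels[j]:
            inside += 1
    community_degree = {}
    for node, c in enumerate(labels):
        community_degree[c] = community_degree.get(c, 0) + degree[node]
    a = inside / m
    b = sum((d / (2 * m)) ** 2 for d in community_degree.values())
    return a, b


# Hypothetical toy data: two triangles joined by the edge 2-3.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
CANDIDATES = {
    "one block": [0, 0, 0, 0, 0, 0],
    "two triangles": [0, 0, 0, 1, 1, 1],
    "singletons": [0, 1, 2, 3, 4, 5],
}


def best_partition(gamma):
    """Name of the candidate partition with the largest Q at this gamma."""
    def quality(name):
        a, b = line_coefficients(6, EDGES, CANDIDATES[name])
        return a - gamma * b
    return max(CANDIDATES, key=quality)


for gamma in (0.1, 1.0, 3.0):
    print(gamma, best_partition(gamma))
```

On this toy graph small γ favors one large community, the default γ = 1 favors the two triangles, and large γ favors singletons, so the "right" answer depends on a parameter the data do not fix by themselves.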
Null Models for Modularity Quality Functions
▪ Erdős–Rényi (Bernoulli) → Newman-Girvan*
• Leicht-Newman* (directed) • Barber* (bipartite)
Louvain Method (Blondel et al., “Fast unfolding of communities in large networks”, 2008)
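The sketch below illustrates only the first phase of the Louvain method, greedy local moving: each node is moved to the neighboring community that most increases modularity, until no single move helps. It is deliberately naive (it recomputes Q from scratch for every trial move, where real implementations use an incremental gain formula) and it omits the second phase, in which communities are aggregated into super-nodes and the process repeats. The function names and toy graph are illustrative assumptions, not the API of any Louvain package.

```python
def partition_quality(n, edges, labels, gamma=1.0):
    """Newman-Girvan modularity with resolution gamma (unweighted graph)."""
    m = len(edges)
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    inside = sum(1 for i, j in edges if labels[i] == labels[j])
    totals = {}
    for node in range(n):
        totals[labels[node]] = totals.get(labels[node], 0) + degree[node]
    return inside / m - gamma * sum((d / (2 * m)) ** 2 for d in totals.values())


def local_moving(n, edges, gamma=1.0, tol=1e-12):
    """Greedy single-node moves until no move improves modularity."""
    neighbors = [set() for _ in range(n)]
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    labels = list(range(n))  # start with every node in its own community
    improved = True
    while improved:
        improved = False
        for node in range(n):
            best_label = labels[node]
            best_q = partition_quality(n, edges, labels, gamma)
            # Only communities of adjacent nodes are candidate destinations.
            for target in sorted({labels[v] for v in neighbors[node]}):
                if target == labels[node]:
                    continue
                trial = labels[:]
                trial[node] = target
                q = partition_quality(n, edges, trial, gamma)
                if q > best_q + tol:  # accept strict improvements only
                    best_label, best_q = target, q
            if best_label != labels[node]:
                labels[node] = best_label
                improved = True
    return labels


# Hypothetical toy data: two triangles joined by the edge 2-3.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
found = local_moving(6, EDGES)
print(found, round(partition_quality(6, EDGES, found), 4))
```

Because every accepted move strictly increases Q and there are finitely many partitions, the loop terminates; the result is a local optimum that can depend on the order in which nodes are visited.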
“Virality Prediction and Community Structure in
Social Networks”, Weng, Menczer & Ahn (2013)
Melnik et al., “Dynamics on modular networks with
heterogeneous correlations” (2014)
Fraction of active nodes
Watts threshold model
Multi-university Facebook network
Modularity from Laplacian Dynamics
Lambiotte, Delvenne & Barahona [arXiv:0812.1770] showed a way to derive modularity from normalized Laplacian dynamics, defining partition quality in terms of stability (autocovariance in a Markov process).
Expanding the matrix exponential to first order in t recovers Newman-Girvan modularity with resolution γ = 1/t.
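A sketch of that first-order expansion, under the assumption of a continuous-time random walk on an undirected network with transition matrix M_ij = A_ij / k_j and stationary distribution π_j = k_j / (2m) (the symbols R, M, and π are notation introduced here for illustration):

```latex
\begin{align*}
R(t) &= \sum_{ij} \left[ \left( e^{t(M - I)} \right)_{ij} \pi_j - \pi_i \pi_j \right] \delta(c_i, c_j),
  \qquad M_{ij} = \frac{A_{ij}}{k_j}, \quad \pi_j = \frac{k_j}{2m} \\
 &\approx \sum_{ij} \left[ (1 - t)\, \delta_{ij} \pi_j + t\, \frac{A_{ij}}{2m} - \frac{k_i k_j}{(2m)^2} \right] \delta(c_i, c_j) \\
 &= (1 - t) + t \cdot \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{1}{t} \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
\end{align*}
```

The second line uses e^{t(M − I)} ≈ I + t(M − I); the third uses Σ_j π_j = 1. Since the term (1 − t) does not depend on the partition, maximizing the linearized stability at time t is the same as maximizing modularity with γ = 1/t.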
(This is going to be important again for multilayer networks)
U.S. Congressional Roll Call as a similarity network
Waugh et al., “Party polarization in Congress: a network science approach” (2009)
[Figure: U.S. House committees, labeled as on the title slide]
Adjacency matrix of similarities is dense
and weighted, cf. other typical networks
(see committees: weighted but sparse)
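One simple way to build such a similarity network from roll call votes is sketched below, assuming that the similarity of two legislators is the fraction of bills on which both voted and agreed. The vote coding (+1 yea, −1 nay, 0 did not vote), the toy data, and the function name are hypothetical; the exact definition used in the cited paper should be checked there.

```python
def agreement_matrix(votes):
    """votes[i][b] is +1 (yea), -1 (nay), or 0 (did not vote) for
    legislator i on bill b. Returns S, where S[i][j] is the fraction of
    bills both i and j voted on for which they cast the same vote."""
    n = len(votes)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # leave the diagonal at zero (no self-similarity)
            shared = [(a, b) for a, b in zip(votes[i], votes[j])
                      if a != 0 and b != 0]
            if shared:
                S[i][j] = sum(1 for a, b in shared if a == b) / len(shared)
    return S


# Hypothetical toy data: 3 legislators, 4 bills.
VOTES = [
    [+1, +1, -1, +1],
    [+1, +1, -1, -1],
    [-1, +1, +1, 0],
]
S = agreement_matrix(VOTES)
for row in S:
    print([round(x, 3) for x in row])
```

Almost every pair of legislators agrees on at least some votes, so nearly every off-diagonal entry is nonzero. That is why the resulting adjacency matrix is dense and weighted rather than sparse.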
85th Senate
U.S. Congressional Roll Call as a similarity network
Waugh et al., “Party polarization in Congress: a network science approach” (2009)
85th Senate 108th Senate
Moody & Mucha, “Portrait of political party polarization” (2013)
Parker et al., “Network Analysis Reveals Sex- and Antibiotic Resistance-
Associated Antivirulence Targets in Clinical Uropathogens” (2015)
Outline & Summary
1. What is community detection and
why is it useful?
2. How do you calculate communities?
– Software links
– Importance of resolution parameters
3. Multilayer networks
Software
Other great codes to know:
http://www.mapequation.org/
https://graph-tool.skewed.de/
https://github.com/vtraag/louvain-igraph
http://netwiki.amath.unc.edu/GenLouvain
Recall the (pesky) resolution parameter?
Fenn et al., “Dynamic Communities in
Multichannel Data: An Application to the
Foreign Exchange Market During the
2007-2008 Credit Crisis” (2009)
Picking resolution parameters is still active research
https://github.com/wweir827/CHAMP
“Division I-A” College Football
50,000 Louvain calls
384 unique partitions
But $Q_s(\gamma)$ isn’t a point; it’s a line for partition s
19 admissible partitions
Pairwise compare admissible partitions
Human Protein Reactome
20,000 calls
19,980 unique
39 admissible
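The idea that each partition s contributes a line Q_s(γ) = a_s − γ b_s, and that only partitions on the upper envelope of those lines can ever be optimal, can be sketched with a brute-force search. This is not the CHAMP package's API or algorithm, only an illustration of the concept; the function name and the four example lines (taken from a hypothetical six-node graph of two triangles joined by one edge) are assumptions.

```python
def admissible(lines, gamma_min, gamma_max, tol=1e-12):
    """lines maps a partition name to (a, b), meaning Q(gamma) = a - gamma * b.
    Returns the names that attain the maximum Q on some sub-interval of
    [gamma_min, gamma_max]. Brute force, O(n^3) in the number of lines."""
    # Breakpoints: the interval ends plus every pairwise intersection inside.
    points = {gamma_min, gamma_max}
    items = list(lines.values())
    for x, (a1, b1) in enumerate(items):
        for a2, b2 in items[x + 1:]:
            if abs(b1 - b2) > tol:
                g = (a1 - a2) / (b1 - b2)
                if gamma_min < g < gamma_max:
                    points.add(g)
    grid = sorted(points)
    winners = set()
    # The ranking of the lines cannot change between consecutive breakpoints,
    # so checking each midpoint finds every partition that is ever optimal.
    for left, right in zip(grid, grid[1:]):
        mid = (left + right) / 2
        best = max(a - mid * b for a, b in lines.values())
        winners.update(name for name, (a, b) in lines.items()
                       if a - mid * b >= best - tol)
    return winners


# Hypothetical (a, b) pairs for four partitions of the toy graph.
LINES = {
    "one block": (1.0, 1.0),
    "two triangles": (6 / 7, 1 / 2),
    "three pairs": (3 / 7, 68 / 196),
    "singletons": (0.0, 34 / 196),
}
print(sorted(admissible(LINES, 0.0, 4.0)))
```

Here "three pairs" is never the best partition for any γ in the range, so it is pruned, mirroring how a large set of unique partitions shrinks to a much smaller admissible set.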
Self loops of weight r as a form of resolution parameter
Arenas et al., “Analysis of the structure of complex networks at different resolution levels” (2008)
(see also Shai et al., “Case studies in network community detection,” 2017)
Outline & Summary
1. What is community detection and
why is it useful?
2. How do you calculate communities?
– Software links
– Importance of resolution parameters
3. Multilayer networks
– We are surely out of time… If we had
more time, we would talk a lot about
the refs in the following slides
▪ Networks appear in many disciplines
▪ Network representations provide a flexible framework for studying general data types, leveraging methods of social network analysis and network science.
▪ Community detection is a powerful tool for exploring and understanding network structures, including multilayer networks.
▪ Network structures identify essential features for modeling and understanding data in applications.
Multilayer Networks
[Figure panels: Ordered, Categorical]
Mucha et al., “Community structure in time-dependent,
multiscale, and multiplex networks” (2010)
Kivelä et al., “Multilayer Networks” (2014)
Multilayer Modularity
Mucha et al., “Community structure in time-dependent, multiscale, and multiplex networks” (2010)
How to count the expected weights of interlayer arcs given that they are definitional to the data structure?
Generalized Lambiotte et al. (2008) connection between modularity and autocorrelation under Laplacian dynamics
to re-derive null models for bipartite (Barber), directed (Leicht-Newman), and signed (Traag et al.) networks,
specified in terms of one-step conditional probabilities
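$$Q_{\text{multilayer}} = \frac{1}{2\mu} \sum_{ijsr} \left[ \left( A_{ijs} - \gamma_s \frac{k_{is} k_{js}}{2 m_s} \right) \delta_{sr} + \delta_{ij} C_{jsr} \right] \delta(c_{is}, c_{jr})$$
▪ $A_{ijs} - \gamma_s k_{is} k_{js} / (2 m_s)$: intra-layer adjacency data and null
▪ $\delta_{ij} C_{jsr}$: inter-layer identity arcs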
Same formalism works for more general multilayer networks,
with sum over inter-layer connections within same community
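A minimal sketch of evaluating multilayer modularity in the form given by Mucha et al. (2010), Q = (1/2μ) Σ_ijsr [(A_ijs − γ k_is k_js / 2m_s) δ_sr + δ_ij C_jsr] δ(c_is, c_jr), under simplifying assumptions that are mine: ordered layers, the same node set in every layer, one shared γ, and a uniform coupling ω between each node and itself in adjacent layers. The function name and toy data are hypothetical.

```python
def multilayer_modularity(layers, labels, omega=1.0, gamma=1.0):
    """layers[s] is a symmetric adjacency matrix for layer s (each layer
    must contain at least one edge); labels[s][i] is the community of
    node i in layer s. Node i in layer s is coupled to itself in layers
    s - 1 and s + 1 with weight omega."""
    n_layers, n = len(layers), len(layers[0])
    total = 0.0
    two_mu = 0.0  # total intra-layer strength plus total coupling strength
    for s, A in enumerate(layers):
        k = [sum(row) for row in A]
        two_m = sum(k)
        two_mu += two_m
        # Intra-layer term: adjacency minus gamma times the null model.
        for i in range(n):
            for j in range(n):
                if labels[s][i] == labels[s][j]:
                    total += A[i][j] - gamma * k[i] * k[j] / two_m
    # Inter-layer term: identity arcs between adjacent layers.
    for s in range(n_layers - 1):
        for i in range(n):
            two_mu += 2 * omega  # each arc is counted once from each end
            if labels[s][i] == labels[s + 1][i]:
                total += 2 * omega
    return total / two_mu


def adjacency(n, edges):
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A


# Hypothetical toy data: the same two-triangle graph in two layers.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
LAYERS = [adjacency(6, EDGES), adjacency(6, EDGES)]
q_consistent = multilayer_modularity(LAYERS, [[0, 0, 0, 1, 1, 1]] * 2)
q_relabeled = multilayer_modularity(LAYERS, [[0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0]])
print(round(q_consistent, 4), round(q_relabeled, 4))
```

Keeping the same community labels across layers earns the inter-layer term, while swapping the labels in the second layer forfeits it, even though the intra-layer structure is identical. The coupling ω therefore controls how strongly communities are encouraged to persist across layers.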
U.S. Senators across 2-year Congresses
Mucha et al., “Community
structure in time-dependent,
multiscale, and multiplex
networks” (2010)
Each point is a
Senator in a Congress
Colored bars indicate
temporal extent of each
community, labeled by
nominal party labels
Grey bars indicate Congresses
including more than two
communities
Bassett et al. “Dynamic reconfiguration of human
brain networks during learning” (2011)
Cranmer et al., “Kantian fractionalization predicts the
conflict propensity of the international system” (2015)
• Identified communities of
nation states in multiplex
international relations of trade,
IGOs, democracies
• Granger causal relationship to
total system-level conflict
• Negligible contribution from
joint democracy layer
See mapequation.org
Phys. Rev. X 6, 011036 (2016)
Stanley et al., “Clustering network layers with the
strata multilayer stochastic block model” (2016)
[Figure: strata multilayer stochastic block model fitting. Initialization: k-means clusters the L layers into S strata. Iterative process: k-means re-clusters the layers into strata, and the number of strata is updated to the number of unique clustering patterns according to (1) and (2).]
Taylor et al., “Enhanced detectability of community structure
in multilayer networks through layer aggregation” (2016)
Multilayer CHAMP (Weir et al., 2017)
U.S. Senate Roll Call Similarities (Congresses 1-110)
240,000 GenLouvain calls; 197,879 unique partitions; 1,447 admissible partitions

06 Community Detection

  • 1. Communities in Networks Peter J. Mucha, UNC–Chapel Hill AGRICULTURE APPROPRIATIONS INTERNATIONAL RELATIONS BUDGET HOUSE ADMINISTRATION ENERGY/COMMERCE FINANCIAL SERVICES VETERANS’ AFFAIRS EDUCATION ARMED SERVICES JUDICIARY RESOURCES RULES SCIENCE SMALL BUSINESS OFFICIAL CONDUCT TRANSPORTATION GOVERNMENT REFORM WAYS AND MEANS INTELLIGENCE HOMELAND SECURITY
  • 2. Outline & Acknowledgements 1. What is community detection and why is it useful? 2. How do you calculate communities? – Software links – Importance of resolution parameters 3. Multilayer networks – If time permits (I’ll leave you slides)  Skyler Cranmer, James Fowler, Jeff Henderson, Jim Moody, J.-P. Onnela, Mason Porter  Dani Bassett, Kaveri Chaturvedi, Saray Shai, Dane Taylor  Natalie Stanley, Mandi Traud, Andrew Waugh, William Weir, James Wilson  Scott Emmons, Ryan Gibson, Eric Kelsic, Kevin Macon, Thomas Richardson  JSMF, UCRF (UNC), ARO, CDC, NICHD, NIDDK, NIGMS, NSF Apologies that this presentation will seriously err on the self-absorbed side. It’s a big field, and I do not promise to cover even a small piece of it here.
  • 3.  Jim Moody (paraphrased): “I’ve been accused of turning everything into a network.”  PJM (in response): “I’m accused of turning everything into a network and a graph partitioning problem.”  “Structure  Function” Philosophical Disclaimer Images by Aaron Clauset
  • 4. Karate Club Example This partition optimizes modularity, which measures the number of intra-community ties (relative to a random model) “If your method doesn’t work on this network, then go home.”
  • 5. Karate Club Club “Cris Moore (left) is the inaugural recipient of the Zachary Karate Club Club prize, awarded on behalf of the community by Aric Hagberg (right). (9 May 2013)”
  • 6. Facebook Traud et al., “Comparing community structure to characteristics in online collegiate social networks” (2011) Traud et al., “Social structure of Facebook networks” (2012) Caltech 2005: Colors indicate residential “House” affiliations Purple = Not provided
  • 7. Facebook Traud et al., “Comparing community structure to characteristics in online collegiate social networks” (2011) Traud et al., “Social structure of Facebook networks” (2012) Caltech 2005: Colors indicate residential “House” affiliations
  • 8. Facebook Traud et al., “Comparing community structure to characteristics in online collegiate social networks” (2011) Traud et al., “Social structure of Facebook networks” (2012) Caltech 2005: Colors indicate residential “House” affiliations Purple = Not provided
  • 9. Community Detection Firehose Overview  “Hard/rigid” v. “soft/overlapping” clusters  cf. biclustering methods and mathematics of expander graphs  A community should describe a “cohesive group”: varying formulations/algorithms • Linkage clustering (average, single), local clustering coefficients, betweeness (geodesic, random walk), spectral, conductance,…  Classic approach in CS: Spectral Graph Partitioning • Need to specify number of communities sought  Conductance  MDL, Infomap, OSLOM, … (many other things I’ve missed) …  Stochastic Block Models: generative with in/out probabilities between labeled groups  Modularity: a good partition has more total intra-community edge weight than one would expect at random (but according to what model?) “Communities in Networks,” M. A. Porter, J.-P. Onnela & P. J. Mucha, Notices of the American Mathematical Society 56, 1082-97 & 1164-6 (2009). “Community Detection in Graphs,” S. Fortunato, Physics Reports 486, 75-174 (2010). “Community detection in networks: A user guide,” S. Fortunato & D. Hric, Physics Reports 659, 1-44 (2016). “Case studies in network community detection,” S. Shai, N. Stanley, C. Granell, D. Taylor & P. J. Mucha, arXiv:1705.02305.
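  • 10. Modularity (see Newman & Girvan and other Newman papers): $Q = \frac{1}{2m} \sum_{ij} B_{ij}\, \delta(c_i, c_j)$ with $B_{ij} = A_{ij} - P_{ij}$
      – $m$: total edge weight
      – $B_{ij}$: modularity matrix
      – $\delta(c_i, c_j)$: indicator on nodes i & j in same community
      – $A_{ij}$: your data (edge from i to j?)
      – $P_{ij}$: random “null model” for expected edge weight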
  • 11. Modularity (see Newman & Girvan and other Newman papers)
      – GOAL: Assign nodes to communities in order to maximize quality function Q
      – NP-complete in its decision version [Brandes et al. 2008], so exact optimization ~ enumerating possible partitions
      – Numerous packages developed/developing
          • e.g. igraph library (R, Python), NetworkX, Louvain
          • Need appropriate null model
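To make the "enumerate possible partitions" point concrete, here is a minimal brute-force sketch in plain Python. The toy graph (two triangles joined by one edge), the helper names, and the use of the Newman-Girvan null model with resolution 1 are all assumptions for illustration, not anything from the slides. Six nodes already give 203 partitions, and the count grows as the Bell numbers, which is why nobody does this on real data.

```python
def partitions(nodes):
    """Yield every partition of `nodes` as a list of blocks (lists of nodes)."""
    if not nodes:
        yield []
        return
    first, rest = nodes[0], nodes[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller


def modularity(edges, blocks):
    """Newman-Girvan modularity (unweighted, undirected, no self-loops),
    written as a sum over communities: L_c / m - (d_c / 2m)^2."""
    m = len(edges)
    label = {v: c for c, block in enumerate(blocks) for v in block}
    internal = [0] * len(blocks)    # edges inside each community
    degree_sum = [0] * len(blocks)  # total degree of each community
    for u, v in edges:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in range(len(blocks)))


# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
all_parts = list(partitions(list(range(6))))
best = max(all_parts, key=lambda p: modularity(edges, p))
best_blocks = sorted(sorted(block) for block in best)
best_q = modularity(edges, best)
```

Here `best_blocks` comes out as the two triangles, as one would hope; heuristics such as Louvain exist precisely because this exhaustive search does not scale.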
  • 12. Modularity (see Newman & Girvan and other Newman papers)
      – Erdős–Rényi (ER) degree distribution (binomial/Poisson) is not a good model for many real-world data sets
      – Independent edges, constrained to expected degree sequence same as observed
      – Requires $P_{ij} = f(k_i) f(k_j)$, quickly yielding $P_{ij} = \frac{k_i k_j}{2m}$ and $Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)$
      – Resolution parameter γ is ad hoc (default = 1) [Reichardt & Bornholdt, 2006; Lambiotte et al., 2008 & 2015]
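A small sketch of why the resolution parameter matters, assuming the Newman-Girvan null model $P_{ij} = k_i k_j / 2m$ scaled by γ. The function name, the dense list-of-lists adjacency format, and the toy graph are illustrative assumptions. The same two partitions swap places as γ changes.

```python
def modularity(adj, labels, gamma=1.0):
    """Q = (1/2m) * sum_ij (A_ij - gamma * k_i * k_j / 2m) * delta(c_i, c_j)
    for a symmetric adjacency matrix given as a list of lists."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                total += adj[i][j] - gamma * degrees[i] * degrees[j] / two_m
    return total / two_m


# Hypothetical toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

one_community = [0, 0, 0, 0, 0, 0]
two_triangles = [0, 0, 0, 1, 1, 1]

q_one_default = modularity(adj, one_community, gamma=1.0)
q_two_default = modularity(adj, two_triangles, gamma=1.0)
q_one_small = modularity(adj, one_community, gamma=0.2)
q_two_small = modularity(adj, two_triangles, gamma=0.2)
```

At the default γ = 1 the two-triangle split wins; at γ = 0.2 lumping everything together scores higher. On this toy graph the crossover sits at γ = 2/7, so the "right" answer depends on a knob you had to pick.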
  • 13. Null Models for Modularity Quality Functions
      – Erdős–Rényi (Bernoulli)
      – Newman-Girvan*
          • Leicht-Newman* (directed)
          • Barber* (bipartite)
  • 14. Louvain Method (Blondel et al., “Fast unfolding of communities in large networks”, 2008)
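For orientation, a rough sketch of the first ("local moving") phase of a Louvain-style heuristic. This is not the code of Blondel et al. or of any package: it skips the aggregation phase entirely, and the function name, the dict-of-dicts graph format, and the toy graph are assumptions. Each node greedily moves to the neighbouring community with the largest modularity gain until nothing moves.

```python
from collections import defaultdict


def local_moving(adj, gamma=1.0):
    """Greedy node moves that increase modularity (Newman-Girvan null model).
    adj[v][u] is the weight of edge v-u; returns {node: community id}."""
    degree = {v: sum(nbrs.values()) for v, nbrs in adj.items()}
    two_m = sum(degree.values())
    community = {v: v for v in adj}   # start from singleton communities
    total = dict(degree)              # total degree of each community
    moved = True
    while moved:
        moved = False
        for v in adj:
            old = community[v]
            total[old] -= degree[v]   # temporarily take v out of its community
            # Edge weight from v into each neighbouring community.
            links = defaultdict(float)
            for u, w in adj[v].items():
                if u != v:
                    links[community[u]] += w

            def gain(c):
                # Proportional to the modularity gain of placing v into c.
                return links[c] - gamma * total[c] * degree[v] / two_m

            best, best_gain = old, gain(old)
            for c in links:
                g = gain(c)
                if g > best_gain + 1e-12:  # move only on a strict improvement
                    best, best_gain = c, g
            community[v] = best
            total[best] += degree[v]
            if best != old:
                moved = True
    return community


# Hypothetical toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {v: {} for v in range(6)}
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0

community = local_moving(adj)
```

On this toy graph the phase settles on the two triangles. In the full method the communities found here are then collapsed into super-nodes and the process repeats on the smaller graph.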
  • 15. “Virality Prediction and Community Structure in Social Networks”, Weng, Menczer & Ahn (2013)
  • 16. Melnik et al., “Dynamics on modular networks with heterogeneous correlations” (2014). Figure labels: fraction of active nodes; Watts threshold model; multi-university Facebook network.
  • 17. Modularity from Laplacian Dynamics. Lambiotte, Delvenne & Barahona [arXiv:0812.1770] showed a way to derive modularity from normalized Laplacian dynamics, defining partition quality in terms of stability (autocovariance of a Markov process). Expanding the matrix exponential to first order in t recovers Newman-Girvan modularity with resolution γ = 1/t. (This is going to be important again for multilayer networks.)
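A compact version of that first-order expansion, as a sketch. The notation is an assumption chosen to match the modularity slides: $M_{ij} = A_{ij}/k_j$ is the random-walk transition matrix and $\pi_j = k_j/2m$ its stationary distribution.

```latex
\begin{align}
R(t) &= \sum_{ij} \left[ \left( e^{t(M - I)} \right)_{ij} \pi_j - \pi_i \pi_j \right] \delta(c_i, c_j),
  \qquad M_{ij} = \frac{A_{ij}}{k_j}, \quad \pi_j = \frac{k_j}{2m} \\
e^{t(M - I)} &\approx I + t\,(M - I) \\
R(t) &\approx (1 - t) + \frac{1}{2m} \sum_{ij} \left( t A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)
  = (1 - t) + t\, Q\!\left( \gamma = \tfrac{1}{t} \right)
\end{align}
```

Since $(1 - t)$ does not depend on the partition, maximizing the linearized stability at time $t$ is the same as maximizing Newman-Girvan modularity at resolution γ = 1/t.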
  • 18. U.S. Congressional Roll Call as a similarity network. Waugh et al., “Party polarization in Congress: a network science approach” (2009). Adjacency matrix of similarities is dense and weighted, cf. other typical networks (see committees: weighted but sparse). Figure labels: committee names (AGRICULTURE, APPROPRIATIONS, …, HOMELAND SECURITY); 85th Senate.
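One plausible way to build such a similarity network, sketched with made-up votes: take the weight between two legislators to be the fraction of bills on which they voted the same way, counting only bills where both voted. The function name, the vote encoding, the zero diagonal, and the data are assumptions for illustration; see Waugh et al. for the definition actually used.

```python
def agreement_matrix(votes):
    """votes[i][b] is +1 (yea), -1 (nay) or None (did not vote) for
    legislator i on bill b. Returns a dense, weighted, symmetric matrix."""
    n = len(votes)
    A = [[0.0] * n for _ in range(n)]  # no self-loops: diagonal stays 0
    for i in range(n):
        for j in range(i + 1, n):
            # Keep only the bills on which both legislators cast a vote.
            shared = [(a, b) for a, b in zip(votes[i], votes[j])
                      if a is not None and b is not None]
            if shared:
                agree = sum(a == b for a, b in shared)
                A[i][j] = A[j][i] = agree / len(shared)
    return A


# Hypothetical roll call: 4 legislators, 4 bills.
votes = [
    [+1, +1, -1, +1],
    [+1, +1, -1, -1],
    [-1, -1, +1, +1],
    [-1, None, +1, +1],
]
A = agreement_matrix(votes)
```

Nearly every pair shares at least some votes, so the matrix is dense, which is the contrast with sparse committee-style networks drawn on the slide.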
  • 19. U.S. Congressional Roll Call as a similarity network Waugh et al., “Party polarization in Congress: a network science approach” (2009) 85th Senate 108th Senate
  • 20. Moody & Mucha, “Portrait of political party polarization” (2013)
  • 21. Parker et al., “Network Analysis Reveals Sex- and Antibiotic Resistance-Associated Antivirulence Targets in Clinical Uropathogens” (2015)
  • 22. Parker et al., “Network Analysis Reveals Sex- and Antibiotic Resistance-Associated Antivirulence Targets in Clinical Uropathogens” (2015)
  • 23. Outline & Summary 1. What is community detection and why is it useful? 2. How do you calculate communities? – Software links – Importance of resolution parameters 3. Multilayer networks
  • 24. Software Other great codes to know: http://www.mapequation.org/ https://graph-tool.skewed.de/ https://github.com/vtraag/louvain-igraph http://netwiki.amath.unc.edu/GenLouvain
  • 25. Recall the (pesky) resolution parameter? Fenn et al., “Dynamic Communities in Multichannel Data: An Application to the Foreign Exchange Market During the 2007-2008 Credit Crisis” (2009)
  • 26. Picking resolution parameters is still active research: https://github.com/wweir827/CHAMP
  • 27. “Division I-A” College Football: 50,000 Louvain calls; 384 unique partitions
  • 28. But $Q_s(\gamma)$ isn’t a point; it’s a line for partition s. 50,000 Louvain calls; 384 unique partitions
  • 29. But $Q_s(\gamma)$ isn’t a point; it’s a line for partition s. 50,000 Louvain calls; 384 unique partitions; 19 admissible partitions
  • 30. Pairwise compare admissible partitions. 50,000 Louvain calls; 384 unique partitions; 19 admissible partitions
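A sketch of the "it's a line" idea, under my reading that a partition is admissible when its line is on top for some range of γ. For a fixed partition s with the Newman-Girvan null model, $Q_s(\gamma) = a_s - \gamma\, p_s$, where $a_s$ is the fraction of edge weight inside communities and $p_s$ the matching null-model sum. The code below is not the CHAMP package's implementation; the helper names, the γ range, and the toy graph are assumptions. It walks along the upper envelope of the lines from left to right.

```python
def line_coefficients(edges, labels):
    """Return (a, p) such that Q(gamma) = a - gamma * p for this partition."""
    m = len(edges)
    internal = sum(labels[u] == labels[v] for u, v in edges)
    degree_sum = {}
    for u, v in edges:
        for x in (u, v):
            degree_sum[labels[x]] = degree_sum.get(labels[x], 0) + 1
    a = internal / m
    p = sum(d * d for d in degree_sum.values()) / (2 * m) ** 2
    return a, p


def admissible(lines, gamma_min=0.0, gamma_max=3.0):
    """Walk the upper envelope of the lines a - gamma * p on [gamma_min, gamma_max].
    Returns a list of (line index, gamma_start, gamma_end)."""
    def q(k, g):
        a, p = lines[k]
        return a - g * p

    gamma = gamma_min
    # Best line at gamma_min; ties go to the flatter line (smaller p).
    cur = max(range(len(lines)), key=lambda k: (q(k, gamma), -lines[k][1]))
    segments = []
    while True:
        a_c, p_c = lines[cur]
        nxt, cross = None, gamma_max
        for k, (a, p) in enumerate(lines):
            if p < p_c:  # only a flatter line can overtake as gamma grows
                g = (a_c - a) / (p_c - p)
                if g < cross or (g == cross and nxt is not None and p < lines[nxt][1]):
                    nxt, cross = k, g
        segments.append((cur, gamma, cross))
        if nxt is None:
            return segments
        cur, gamma = nxt, cross


# Hypothetical toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
candidates = [
    [0, 0, 0, 0, 0, 0],  # one community
    [0, 0, 0, 1, 1, 1],  # two triangles
    [0, 0, 1, 1, 2, 2],  # three pairs (never on top)
    [0, 1, 2, 3, 4, 5],  # singletons
]
lines = [line_coefficients(edges, labels) for labels in candidates]
segments = admissible(lines)
admissible_ids = [k for k, _, _ in segments]
```

Of the four candidate partitions, three are admissible and the "three pairs" partition is never optimal for any γ. The breakpoints (2/7 and 2.625 here) delimit the γ ranges over which each surviving partition wins, which is the same pruning that takes 384 unique partitions down to 19 on the slide.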
  • 31. Human Protein Reactome: 20,000 calls; 19,980 unique; 39 admissible
  • 32. Human Protein Reactome: 20,000 calls; 19,980 unique; 39 admissible
  • 33. Self loops of weight r as a form of resolution parameter Arenas et al., “Analysis of the structure of complex networks at different resolution levels” (2008) (see also Shai et al., “Case studies in network community detection,” 2017)
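A sketch of how the self-loop trick enters the quality function, obtained simply by substituting $A \to A + rI$ into Newman-Girvan modularity. The notation is mine, and I am assuming the convention that a self-loop of weight $r$ adds $r$ to a node's strength; check Arenas et al. for their exact form. Here $N$ is the number of nodes.

```latex
\begin{equation}
Q_r = \frac{1}{2m + Nr} \sum_{ij}
  \left[ A_{ij} + r\,\delta_{ij} - \frac{(k_i + r)(k_j + r)}{2m + Nr} \right] \delta(c_i, c_j)
\end{equation}
```

Setting $r = 0$ recovers ordinary modularity. Larger $r$ makes each node more self-sufficient and so favours smaller communities, playing a role similar to increasing the resolution parameter.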
  • 34. Outline & Summary 1. What is community detection and why is it useful? 2. How do you calculate communities? – Software links – Importance of resolution parameters 3. Multilayer networks – We are surely out of time… If we had more time, we would talk a lot about the refs in the following slides
      – Networks appear in many disciplines
      – Network representations provide a flexible framework for studying general data types, leveraging methods of social network analysis and network science.
      – Community detection is a powerful tool for exploring and understanding network structures, including multilayer networks.
      – Network structures identify essential features for modeling and understanding data in applications.
  • 35. Multilayer Networks (figure panels: Ordered, Categorical). Mucha et al., “Community structure in time-dependent, multiscale, and multiplex networks” (2010); Kivelä et al., “Multilayer Networks” (2014)
  • 36. Multilayer Modularity. Mucha et al., “Community structure in time-dependent, multiscale, and multiplex networks” (2010). How to count the expected weights of interlayer arcs given that they are definitional to the data structure? Generalized the Lambiotte et al. (2008) connection between modularity and autocorrelation under Laplacian dynamics to re-derive null models for bipartite (Barber), directed (Leicht-Newman), and signed (Traag et al.) networks, specified in terms of one-step conditional probabilities. The multilayer quality function combines intra-layer adjacency data and null-model terms with inter-layer identity arcs. The same formalism works for more general multilayer networks, with a sum over inter-layer connections within the same community.
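For reference, the multilayer ("multislice") quality function as I recall it from Mucha et al. (2010); treat the exact notation as an assumption and check the paper before relying on it. Here $A_{ijs}$ is the adjacency between nodes $i$ and $j$ in layer $s$, $C_{jsr}$ couples node $j$ in layer $s$ to itself in layer $r$, and $g_{is}$ is the community of node $i$ in layer $s$.

```latex
\begin{equation}
Q_{\text{multislice}} = \frac{1}{2\mu} \sum_{ijsr}
  \left[ \left( A_{ijs} - \gamma_s \frac{k_{is} k_{js}}{2 m_s} \right) \delta_{sr}
         + \delta_{ij}\, C_{jsr} \right] \delta(g_{is}, g_{jr}),
\qquad 2\mu = \sum_{js} \left( k_{js} + \sum_{r} C_{jsr} \right)
\end{equation}
```

The first term is ordinary modularity within each layer, with its own resolution $\gamma_s$. The second term rewards keeping a node's copies in different layers in the same community, and it carries no null-model subtraction because the identity arcs are part of the data structure rather than something observed.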
  • 37. U.S. Senators across 2-year Congresses. Mucha et al., “Community structure in time-dependent, multiscale, and multiplex networks” (2010). Each point is a Senator in a Congress. Colored bars indicate temporal extent of each community, labeled by nominal party labels. Grey bars indicate Congresses including more than two communities.
  • 38. Bassett et al., “Dynamic reconfiguration of human brain networks during learning” (2011)
  • 39. Cranmer et al., “Kantian fractionalization predicts the conflict propensity of the international system” (2015)
      – Identified communities of nation states in multiplex international relations of trade, IGOs, democracies
      – Granger causal relationship to total system-level conflict
      – Negligible contribution from joint democracy layer
  • 40. See mapequation.org Phys. Rev. X 6, 011036 (2016)
  • 41. Stanley et al., “Clustering network layers with the strata multilayer stochastic block model” (2016)
  • 42. Stanley et al., “Clustering network layers with the strata multilayer stochastic block model” (2016)
  • 43. Stanley et al., “Clustering network layers with the strata multilayer stochastic block model” (2016). Figure: algorithm schematic. Initialization: k-means cluster L layers into S strata. Iterative process: k-means cluster 2L layers into S strata; update number of strata to the number of unique clustering patterns according to (1) and (2).
  • 44. Taylor et al., “Enhanced detectability of community structure in multilayer networks through layer aggregation” (2016)
  • 45. Taylor et al., “Enhanced detectability of community structure in multilayer networks through layer aggregation” (2016)
  • 46. Multilayer CHAMP (Weir et al., 2017)
  • 47. U.S. Senate Roll Call Similarities (Congresses 1-110) 240,000 GenLouvain calls; 197,879 unique partitions; 1,447 admissible partitions
  • 49. Outline & Summary 1. What is community detection and why is it useful? 2. How do you calculate communities? – Software links – Importance of resolution parameters 3. Multilayer networks
      – Networks appear in many disciplines
      – Network representations provide a flexible framework for studying general data types, leveraging methods of social network analysis and network science.
      – Community detection is a powerful tool for exploring and understanding network structures, including multilayer networks.
      – Network structures identify essential features for modeling and understanding data in applications.