SlideShare a Scribd company logo
1 of 44
Download to read offline
Network analysis for computational biology
Nathalie Villa-Vialaneix - nathalie.villa@univ-paris1.fr
http://www.nathalievilla.org
IUT STID (Carcassonne) & SAMM (Université Paris 1)
Séminaire INSERM - 11 octobre 2011
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 1 / 19
Outline
1 An introduction to networks/graphs
2 Analysis of the eect of low-calorie diet on obese women
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 2 / 19
An introduction to networks/graphs
Outline
1 An introduction to networks/graphs
2 Analysis of the eect of low-calorie diet on obese women
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 3 / 19
An introduction to networks/graphs
What is a network/graph? réseau/graphe
Mathematical object used to model relational data between entities.
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 4 / 19
An introduction to networks/graphs
What is a network/graph? réseau/graphe
Mathematical object used to model relational data between entities.
The entities are called the nodes or the vertices
n÷uds/sommets
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 4 / 19
An introduction to networks/graphs
What is a network/graph? réseau/graphe
Mathematical object used to model relational data between entities.
A relation between two entities is modeled by an edge
arête
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 4 / 19
An introduction to networks/graphs
Examples
Bibliographic network
nodes gene, TF, enzyme...
edges a relationship between two nodes that is already reported in
the litterature
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 5 / 19
An introduction to networks/graphs
Examples
Network inferred from expression data
nodes genes
edges a strong direct co-expression between two nodes observed
in the gene expression data
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 5 / 19
An introduction to networks/graphs
Standard issues associated with networks
Inference
Giving expression data, how to build a graph whose edges represent the
direct links between genes?
Example: co-expression networks built from microarray/RNAseq data (nodes = genes;
edges = signicant direct links between expressions of two genes)
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 6 / 19
An introduction to networks/graphs
Standard issues associated with networks
Inference
Giving expression data, how to build a graph whose edges represent the
direct links between genes?
Graph mining (examples)
1 Network visualization: nodes are not a priori associated to a given
position. How to represent the network in a meaningful way?
Random positions
Positions aiming at representing
connected nodes closer
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 6 / 19
An introduction to networks/graphs
Standard issues associated with networks
Inference
Giving expression data, how to build a graph whose edges represent the
direct links between genes?
Graph mining (examples)
1 Network visualization: nodes are not a priori associated to a given
position. How to represent the network in a meaningful way?
2 Network clustering: identify communities (groups of nodes that are
densely connected and share a few links (comparatively) with the other groups)
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 6 / 19
An introduction to networks/graphs
Network inference
Data: large scale gene expression data
individuals
n 30/50



X =


. . . . . .
. . X
j
i . . .
. . . . . .


variables (genes expression), p 103/4
What we want to obtain: a network with
• nodes: genes;
• edges: signicant and direct co-expression between two genes (track
transcription regulations)
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 7 / 19
An introduction to networks/graphs
Advantages of inferring a network from large scale
transcription data
1 over raw data: focuses on the strongest direct relationships:
irrelevant or indirect relations are removed (more robust) and the data
are easier to visualize and understand.
Expression data are analyzed all together and not by pairs.
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 8 / 19
An introduction to networks/graphs
Advantages of inferring a network from large scale
transcription data
1 over raw data: focuses on the strongest direct relationships:
irrelevant or indirect relations are removed (more robust) and the data
are easier to visualize and understand.
Expression data are analyzed all together and not by pairs.
2 over bibliographic network: can handle interactions with yet
unknown (not annotated) genes and deal with data collected in a
particular condition.
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 8 / 19
An introduction to networks/graphs
Using correlations: relevance network
[Butte and Kohane, 1999,
Butte and Kohane, 2000]
First (naive) approach: calculate correlations between expressions for all
pairs of genes, threshold the smallest ones and build the network.
Correlations Thresholding Graph
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 9 / 19
An introduction to networks/graphs
But correlation is not causality...
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 10 / 19
An introduction to networks/graphs
But correlation is not causality...
strong indirect correlation
y z
x
set.seed(2807); x - runif(100)
y - 2*x+1+rnorm(100,0,0.1); cor(x,y); [1] 0.9988261
z - 2*x+1+rnorm(100,0,0.1); cor(x,z); [1] 0.998751
cor(y,z); [1] 0.9971105
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 10 / 19
An introduction to networks/graphs
But correlation is not causality...
strong indirect correlation
y z
x
set.seed(2807); x - runif(100)
y - 2*x+1+rnorm(100,0,0.1); cor(x,y); [1] 0.9988261
z - 2*x+1+rnorm(100,0,0.1); cor(x,z); [1] 0.998751
cor(y,z); [1] 0.9971105
Partial correlation
cor(lm(y∼x)$residuals,lm(z∼x)$residuals) [1] -0.1933699
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 10 / 19
An introduction to networks/graphs
But correlation is not causality...
strong indirect correlation
y z
x
Networks are built using partial correlations, i.e., correlations between
gene expressions knowing the expression of all the other genes
(residual correlations).
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 10 / 19
An introduction to networks/graphs
Visualization tools help understand the graph
macro-structure
Purpose: How to display the nodes in a meaningful and aesthetic way?
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 11 / 19
An introduction to networks/graphs
Visualization tools help understand the graph
macro-structure
Purpose: How to display the nodes in a meaningful and aesthetic way?
Standard approach: force directed placement algorithms (FDP)
algorithmes de forces (e.g., [Fruchterman and Reingold, 1991])
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 11 / 19
An introduction to networks/graphs
Visualization tools help understand the graph
macro-structure
Purpose: How to display the nodes in a meaningful and aesthetic way?
Standard approach: force directed placement algorithms (FDP)
algorithmes de forces (e.g., [Fruchterman and Reingold, 1991])
• attractive forces: similar to springs along the edges
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 11 / 19
An introduction to networks/graphs
Visualization tools help understand the graph
macro-structure
Purpose: How to display the nodes in a meaningful and aesthetic way?
Standard approach: force directed placement algorithms (FDP)
algorithmes de forces (e.g., [Fruchterman and Reingold, 1991])
• attractive forces: similar to springs along the edges
• repulsive forces: similar to electric forces between all pairs of vertices
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 11 / 19
An introduction to networks/graphs
Visualization tools help understand the graph
macro-structure
Purpose: How to display the nodes in a meaningful and aesthetic way?
Standard approach: force directed placement algorithms (FDP)
algorithmes de forces (e.g., [Fruchterman and Reingold, 1991])
• attractive forces: similar to springs along the edges
• repulsive forces: similar to electric forces between all pairs of vertices
iterative algorithm until stabilization of the vertex positions.
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 11 / 19
An introduction to networks/graphs
Visualization software
• package igraph1 [Csardi and Nepusz, 2006] (static
representation with useful tools for graph mining)
1
http://igraph.sourceforge.net/
2
http://gephi.org
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 12 / 19
An introduction to networks/graphs
Visualization software
• package igraph1 [Csardi and Nepusz, 2006] (static
representation with useful tools for graph mining)
• free software Gephi2 (interactive software, supports zooming
and panning)
1
http://igraph.sourceforge.net/
2
http://gephi.org
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 12 / 19
An introduction to networks/graphs
Extracting important nodes
1 vertex degree degré: number of edges adjacent to a given vertex.
Vertices with a high degree are called hubs: measure of the vertex
popularity.
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
An introduction to networks/graphs
Extracting important nodes
1 vertex degree degré: number of edges adjacent to a given vertex.
Vertices with a high degree are called hubs: measure of the vertex
popularity.
2 vertex betweenness centralité: number of shortest paths between all
pairs of vertices that pass through the vertex. Betweenness is a
centrality measure (vertices with a large betweenness that are the most likely to
disconnect the network if removed).
The orange node's degree is equal to 2, its betweenness to 4.
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
An introduction to networks/graphs
Extracting important nodes
1 vertex degree degré: number of edges adjacent to a given vertex.
Vertices with a high degree are called hubs: measure of the vertex
popularity.
2 vertex betweenness centralité: number of shortest paths between all
pairs of vertices that pass through the vertex. Betweenness is a
centrality measure (vertices with a large betweenness that are the most likely to
disconnect the network if removed).
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
An introduction to networks/graphs
Extracting important nodes
1 vertex degree degré: number of edges adjacent to a given vertex.
Vertices with a high degree are called hubs: measure of the vertex
popularity.
2 vertex betweenness centralité: number of shortest paths between all
pairs of vertices that pass through the vertex. Betweenness is a
centrality measure (vertices with a large betweenness that are the most likely to
disconnect the network if removed).
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
An introduction to networks/graphs
Extracting important nodes
1 vertex degree degré: number of edges adjacent to a given vertex.
Vertices with a high degree are called hubs: measure of the vertex
popularity.
2 vertex betweenness centralité: number of shortest paths between all
pairs of vertices that pass through the vertex. Betweenness is a
centrality measure (vertices with a large betweenness that are the most likely to
disconnect the network if removed).
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
An introduction to networks/graphs
Extracting important nodes
1 vertex degree degré: number of edges adjacent to a given vertex.
Vertices with a high degree are called hubs: measure of the vertex
popularity.
2 vertex betweenness centralité: number of shortest paths between all
pairs of vertices that pass through the vertex. Betweenness is a
centrality measure (vertices with a large betweenness that are the most likely to
disconnect the network if removed).
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
An introduction to networks/graphs
Extracting important nodes
1 vertex degree degré: number of edges adjacent to a given vertex.
Vertices with a high degree are called hubs: measure of the vertex
popularity.
2 vertex betweenness centralité: number of shortest paths between all
pairs of vertices that pass through the vertex. Betweenness is a
centrality measure (vertices with a large betweenness that are the most likely to
disconnect the network if removed).
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
An introduction to networks/graphs
Vertex clustering classication
Cluster vertexes into groups that are densely connected and share a few
links (comparatively) with the other groups. Clusters are often called
communities communautés (social sciences) or modules modules
(biology).
Example 1: Natty's facebook network
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 14 / 19
An introduction to networks/graphs
Find clusters by modularity optimization modularité
The modularity [Newman and Girvan, 2004] of the partition
(C1, . . . , CK) is equal to:
Q(C1, . . . , CK) =
1
2m
K
k=1xi ,xj∈Ck
(Wij − Pij)
with Pij: weight of a null model (graph with the same degree distribution
but no preferential attachment):
Pij =
didj
2m
with di = 1
2 j=i Wij.
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 15 / 19
An introduction to networks/graphs
Find clusters by modularity optimization modularité
The modularity [Newman and Girvan, 2004] of the partition
(C1, . . . , CK) is equal to:
Q(C1, . . . , CK) =
1
2m
K
k=1xi ,xj∈Ck
(Wij − Pij)
with Pij: weight of a null model (graph with the same degree distribution
but no preferential attachment):
Pij =
didj
2m
with di = 1
2 j=i Wij.
A good clustering should maximize the modularity:
• Q when (xi, xj) are in the same cluster and Wij Pij
• Q when (xi, xj) are in two dierent clusters and Wij Pij
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 15 / 19
An introduction to networks/graphs
Find clusters by modularity optimization modularité
The modularity [Newman and Girvan, 2004] of the partition
(C1, . . . , CK) is equal to:
Q(C1, . . . , CK) =
1
2m
K
k=1xi ,xj∈Ck
(Wij − Pij)
with Pij: weight of a null model (graph with the same degree distribution
but no preferential attachment):
Pij =
didj
2m
with di = 1
2 j=i Wij.
A good clustering should maximize the modularity:
• Q when (xi, xj) are in the same cluster and Wij Pij
• Q when (xi, xj) are in two dierent clusters and Wij Pij
Modularity optimization helps separate hubs (an edge is all the more
important in the criterion that it is attached to a vertex with a low degree).
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 15 / 19
Analysis of the eect of low-calorie diet on obese women
Outline
1 An introduction to networks/graphs
2 Analysis of the eect of low-calorie diet on obese women
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 16 / 19
Analysis of the eect of low-calorie diet on obese women
Data
Experimental protocol
135 obese women and 3 times: before LCD, after a 2-month LCD and 6
months later (between the end of LCD and the last measurement, women
are randomized into one of 5 recommended diet groups).
At every time step, 221 gene expressions, 28 fatty acids and 15 clinical
variables (i.e., weight, HDL, ...)
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 17 / 19
Analysis of the eect of low-calorie diet on obese women
Data
Experimental protocol
135 obese women and 3 times: before LCD, after a 2-month LCD and 6
months later (between the end of LCD and the last measurement, women
are randomized into one of 5 recommended diet groups).
At every time step, 221 gene expressions, 28 fatty acids and 15 clinical
variables (i.e., weight, HDL, ...)
Correlations between gene expressions and between a gene expression and a
fatty acid levels are not of the same order: inference method must be
dierent inside the groups and between two groups.
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 17 / 19
Analysis of the eect of low-calorie diet on obese women
Data
Data pre-processing
At CID3, individuals are split into three groups: weight loss, weight
regain and stable weight (groups are not correlated to the diet group
according to χ2-test).
Partition similar to the one obtained with weight increase/decrease ratio.
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 17 / 19
Analysis of the eect of low-calorie diet on obese women
Method
Network inference Clustering Mining
3 inter-dataset networks
sparse CCA
merge for each
time step into
one network
3 intra-dataset networks
sparse partial
correlation
5 networks
CID1 CID2
3×CID3
Extract important nodes
Study/Compare clusters
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 18 / 19
Analysis of the eect of low-calorie diet on obese women
Brief overview on results
5 networks inferred with 264 nodes each:
CID1 CID2 CID3g1 CID3g2 CID3g3
size LCC 244 251 240 259 258
density 2.3% 2.3% 2.3% 2.3% 2.3%
transitivity 17.2% 11.9% 21.6% 10.6% 10.4%
nb clusters 14 (2-52) 10 (4-52) 11 (2-46) 12 (2-51) 12 (3-54)
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 19 / 19
Analysis of the eect of low-calorie diet on obese women
References
Butte, A. and Kohane, I. (1999).
Unsupervised knowledge discovery in medical databases using relevance networks.
In Proceedings of the AMIA Symposium, pages 711715.
Butte, A. and Kohane, I. (2000).
Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements.
In Proceedings of the Pacic Symposium on Biocomputing, pages 418429.
Csardi, G. and Nepusz, T. (2006).
The igraph software package for complex network research.
InterJournal, Complex Systems.
Fruchterman, T. and Reingold, B. (1991).
Graph drawing by force-directed placement.
Software, Practice and Experience, 21:11291164.
Newman, M. and Girvan, M. (2004).
Finding and evaluating community structure in networks.
Physical Review, E, 69:026113.
1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 19 / 19

More Related Content

What's hot

The complexity of social networks
The complexity of social networksThe complexity of social networks
The complexity of social networks
JULIO GONZALEZ SANZ
 

What's hot (12)

A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction models
 
Convolutional networks and graph networks through kernels
Convolutional networks and graph networks through kernelsConvolutional networks and graph networks through kernels
Convolutional networks and graph networks through kernels
 
Multiscale Granger Causality and Information Decomposition
Multiscale Granger Causality and Information Decomposition Multiscale Granger Causality and Information Decomposition
Multiscale Granger Causality and Information Decomposition
 
Kernel methods and variable selection for exploratory analysis and multi-omic...
Kernel methods and variable selection for exploratory analysis and multi-omic...Kernel methods and variable selection for exploratory analysis and multi-omic...
Kernel methods and variable selection for exploratory analysis and multi-omic...
 
My invited talk at the 23rd International Symposium of Mathematical Programmi...
My invited talk at the 23rd International Symposium of Mathematical Programmi...My invited talk at the 23rd International Symposium of Mathematical Programmi...
My invited talk at the 23rd International Symposium of Mathematical Programmi...
 
A short and naive introduction to epistasis in association studies
A short and naive introduction to epistasis in association studiesA short and naive introduction to epistasis in association studies
A short and naive introduction to epistasis in association studies
 
Model-based and model-free connectivity methods for electrical neuroimaging
Model-based and model-free connectivity methods for electrical neuroimagingModel-based and model-free connectivity methods for electrical neuroimaging
Model-based and model-free connectivity methods for electrical neuroimaging
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphs
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biologyKernel methods for data integration in systems biology
Kernel methods for data integration in systems biology
 
08 Exponential Random Graph Models (ERGM)
08 Exponential Random Graph Models (ERGM)08 Exponential Random Graph Models (ERGM)
08 Exponential Random Graph Models (ERGM)
 
The complexity of social networks
The complexity of social networksThe complexity of social networks
The complexity of social networks
 
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco ScutariBayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
 

Similar to Network analysis for computational biology

ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...
ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...
ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...
Daniel Katz
 
machinelearningengineeringslideshare-160909192132 (1).pdf
machinelearningengineeringslideshare-160909192132 (1).pdfmachinelearningengineeringslideshare-160909192132 (1).pdf
machinelearningengineeringslideshare-160909192132 (1).pdf
ShivareddyGangam
 

Similar to Network analysis for computational biology (20)

Large network analysis : visualization and clustering
Large network analysis : visualization and clusteringLarge network analysis : visualization and clustering
Large network analysis : visualization and clustering
 
07 Network Visualization
07 Network Visualization07 Network Visualization
07 Network Visualization
 
Fbk Seminar Michela Ferron
Fbk Seminar Michela FerronFbk Seminar Michela Ferron
Fbk Seminar Michela Ferron
 
http://old.nathalievilla.org/IMG/pdf/Presentation-27.pdf
http://old.nathalievilla.org/IMG/pdf/Presentation-27.pdfhttp://old.nathalievilla.org/IMG/pdf/Presentation-27.pdf
http://old.nathalievilla.org/IMG/pdf/Presentation-27.pdf
 
Master Thesis Presentation (Subselection of Topics)
Master Thesis Presentation (Subselection of Topics)Master Thesis Presentation (Subselection of Topics)
Master Thesis Presentation (Subselection of Topics)
 
Graph mining with kernel self-organizing map
Graph mining with kernel self-organizing mapGraph mining with kernel self-organizing map
Graph mining with kernel self-organizing map
 
AI Class Topic 5: Social Network Graph
AI Class Topic 5:  Social Network GraphAI Class Topic 5:  Social Network Graph
AI Class Topic 5: Social Network Graph
 
Link Analysis in Networks - or - Finding the Terrorists
Link Analysis in Networks - or - Finding the TerroristsLink Analysis in Networks - or - Finding the Terrorists
Link Analysis in Networks - or - Finding the Terrorists
 
Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...
Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...
Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...
 
ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...
ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...
ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...
 
NetLearn: Social Network Analysis and Visualizations for Learning
NetLearn: Social Network Analysis and Visualizations for LearningNetLearn: Social Network Analysis and Visualizations for Learning
NetLearn: Social Network Analysis and Visualizations for Learning
 
MODELING SOCIAL GAUSS-MARKOV MOBILITY FOR OPPORTUNISTIC NETWORK
MODELING SOCIAL GAUSS-MARKOV MOBILITY FOR OPPORTUNISTIC NETWORK MODELING SOCIAL GAUSS-MARKOV MOBILITY FOR OPPORTUNISTIC NETWORK
MODELING SOCIAL GAUSS-MARKOV MOBILITY FOR OPPORTUNISTIC NETWORK
 
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
 
Descobrindo o tesouro escondido nos seus dados usando grafos.
Descobrindo o tesouro escondido nos seus dados usando grafos.Descobrindo o tesouro escondido nos seus dados usando grafos.
Descobrindo o tesouro escondido nos seus dados usando grafos.
 
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
 
TOPOLOGY MAP ANALYSIS FOR EFFECTIVE CHOICE OF NETWORK ATTACK SCENARIO
TOPOLOGY MAP ANALYSIS FOR EFFECTIVE CHOICE OF NETWORK ATTACK SCENARIOTOPOLOGY MAP ANALYSIS FOR EFFECTIVE CHOICE OF NETWORK ATTACK SCENARIO
TOPOLOGY MAP ANALYSIS FOR EFFECTIVE CHOICE OF NETWORK ATTACK SCENARIO
 
Using spectral radius ratio for node degree
Using spectral radius ratio for node degreeUsing spectral radius ratio for node degree
Using spectral radius ratio for node degree
 
machinelearningengineeringslideshare-160909192132 (1).pdf
machinelearningengineeringslideshare-160909192132 (1).pdfmachinelearningengineeringslideshare-160909192132 (1).pdf
machinelearningengineeringslideshare-160909192132 (1).pdf
 
HalifaxNGGs
HalifaxNGGsHalifaxNGGs
HalifaxNGGs
 
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKSEVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
 

More from tuxette

More from tuxette (20)

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en maths
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènes
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiques
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-C
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiques
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWean
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation data
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysis
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatrices
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Prediction
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random forest
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 

Recently uploaded

Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
seri bangash
 
Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.
Silpa
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
Silpa
 
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptxTHE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
ANSARKHAN96
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
1301aanya
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
Silpa
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
NazaninKarimi6
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
Scintica Instrumentation
 

Recently uploaded (20)

Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
Genetics and epigenetics of ADHD and comorbid conditions
Genetics and epigenetics of ADHD and comorbid conditionsGenetics and epigenetics of ADHD and comorbid conditions
Genetics and epigenetics of ADHD and comorbid conditions
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
 
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptxTHE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
 
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRLGwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
 

Network analysis for computational biology

  • 1. Network analysis for computational biology Nathalie Villa-Vialaneix - nathalie.villa@univ-paris1.fr http://www.nathalievilla.org IUT STID (Carcassonne) & SAMM (Université Paris 1) Séminaire INSERM - 11 octobre 2011 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 1 / 19
  • 2. Outline 1 An introduction to networks/graphs 2 Analysis of the eect of low-calorie diet on obese women 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 2 / 19
  • 3. An introduction to networks/graphs Outline 1 An introduction to networks/graphs 2 Analysis of the eect of low-calorie diet on obese women 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 3 / 19
  • 4. An introduction to networks/graphs What is a network/graph? réseau/graphe Mathematical object used to model relational data between entities. 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 4 / 19
  • 5. An introduction to networks/graphs What is a network/graph? réseau/graphe Mathematical object used to model relational data between entities. The entities are called the nodes or the vertices n÷uds/sommets 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 4 / 19
  • 6. An introduction to networks/graphs What is a network/graph? réseau/graphe Mathematical object used to model relational data between entities. A relation between two entities is modeled by an edge arête 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 4 / 19
  • 7. An introduction to networks/graphs Examples Bibliographic network nodes gene, TF, enzyme... edges a relationship between two nodes that is already reported in the litterature 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 5 / 19
  • 8. An introduction to networks/graphs Examples Network inferred from expression data nodes genes edges a strong direct co-expression between two nodes observed in the gene expression data 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 5 / 19
  • 9. An introduction to networks/graphs Standard issues associated with networks Inference Giving expression data, how to build a graph whose edges represent the direct links between genes? Example: co-expression networks built from microarray/RNAseq data (nodes = genes; edges = signicant direct links between expressions of two genes) 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 6 / 19
  • 10. An introduction to networks/graphs Standard issues associated with networks Inference Giving expression data, how to build a graph whose edges represent the direct links between genes? Graph mining (examples) 1 Network visualization: nodes are not a priori associated to a given position. How to represent the network in a meaningful way? Random positions Positions aiming at representing connected nodes closer 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 6 / 19
  • 11. An introduction to networks/graphs Standard issues associated with networks Inference Giving expression data, how to build a graph whose edges represent the direct links between genes? Graph mining (examples) 1 Network visualization: nodes are not a priori associated to a given position. How to represent the network in a meaningful way? 2 Network clustering: identify communities (groups of nodes that are densely connected and share a few links (comparatively) with the other groups) 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 6 / 19
  • 12. An introduction to networks/graphs Network inference Data: large scale gene expression data individuals n 30/50    X =   . . . . . . . . X j i . . . . . . . . .   variables (genes expression), p 103/4 What we want to obtain: a network with • nodes: genes; • edges: signicant and direct co-expression between two genes (track transcription regulations) 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 7 / 19
  • 13. An introduction to networks/graphs Advantages of inferring a network from large scale transcription data 1 over raw data: focuses on the strongest direct relationships: irrelevant or indirect relations are removed (more robust) and the data are easier to visualize and understand. Expression data are analyzed all together and not by pairs. 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 8 / 19
  • 14. An introduction to networks/graphs Advantages of inferring a network from large scale transcription data 1 over raw data: focuses on the strongest direct relationships: irrelevant or indirect relations are removed (more robust) and the data are easier to visualize and understand. Expression data are analyzed all together and not by pairs. 2 over bibliographic network: can handle interactions with yet unknown (not annotated) genes and deal with data collected in a particular condition. 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 8 / 19
  • 15. An introduction to networks/graphs Using correlations: relevance network [Butte and Kohane, 1999, Butte and Kohane, 2000] First (naive) approach: calculate correlations between expressions for all pairs of genes, threshold the smallest ones and build the network. Correlations Thresholding Graph 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 9 / 19
  • 16. An introduction to networks/graphs But correlation is not causality... 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 10 / 19
  • 17. An introduction to networks/graphs But correlation is not causality... strong indirect correlation y z x set.seed(2807); x - runif(100) y - 2*x+1+rnorm(100,0,0.1); cor(x,y); [1] 0.9988261 z - 2*x+1+rnorm(100,0,0.1); cor(x,z); [1] 0.998751 cor(y,z); [1] 0.9971105 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 10 / 19
  • 18. An introduction to networks/graphs But correlation is not causality... strong indirect correlation y z x set.seed(2807); x - runif(100) y - 2*x+1+rnorm(100,0,0.1); cor(x,y); [1] 0.9988261 z - 2*x+1+rnorm(100,0,0.1); cor(x,z); [1] 0.998751 cor(y,z); [1] 0.9971105 Partial correlation cor(lm(y∼x)$residuals,lm(z∼x)$residuals) [1] -0.1933699 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 10 / 19
  • 19. An introduction to networks/graphs But correlation is not causality... strong indirect correlation y z x Networks are built using partial correlations, i.e., correlations between gene expressions knowing the expression of all the other genes (residual correlations). 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 10 / 19
  • 20. An introduction to networks/graphs Visualization tools help understand the graph macro-structure Purpose: How to display the nodes in a meaningful and aesthetic way? 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 11 / 19
  • 21. An introduction to networks/graphs Visualization tools help understand the graph macro-structure Purpose: How to display the nodes in a meaningful and aesthetic way? Standard approach: force directed placement algorithms (FDP) algorithmes de forces (e.g., [Fruchterman and Reingold, 1991]) 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 11 / 19
  • 22. An introduction to networks/graphs Visualization tools help understand the graph macro-structure Purpose: How to display the nodes in a meaningful and aesthetic way? Standard approach: force directed placement algorithms (FDP) algorithmes de forces (e.g., [Fruchterman and Reingold, 1991]) • attractive forces: similar to springs along the edges 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 11 / 19
  • 23. An introduction to networks/graphs Visualization tools help understand the graph macro-structure Purpose: How to display the nodes in a meaningful and aesthetic way? Standard approach: force directed placement algorithms (FDP) algorithmes de forces (e.g., [Fruchterman and Reingold, 1991]) • attractive forces: similar to springs along the edges • repulsive forces: similar to electric forces between all pairs of vertices 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 11 / 19
  • 24. An introduction to networks/graphs Visualization tools help understand the graph macro-structure Purpose: How to display the nodes in a meaningful and aesthetic way? Standard approach: force directed placement algorithms (FDP) algorithmes de forces (e.g., [Fruchterman and Reingold, 1991]) • attractive forces: similar to springs along the edges • repulsive forces: similar to electric forces between all pairs of vertices iterative algorithm until stabilization of the vertex positions. 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 11 / 19
  • 25. An introduction to networks/graphs Visualization software • package igraph1 [Csardi and Nepusz, 2006] (static representation with useful tools for graph mining) 1 http://igraph.sourceforge.net/ 2 http://gephi.org 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 12 / 19
  • 26. An introduction to networks/graphs Visualization software • package igraph1 [Csardi and Nepusz, 2006] (static representation with useful tools for graph mining) • free software Gephi2 (interactive software, supports zooming and panning) 1 http://igraph.sourceforge.net/ 2 http://gephi.org 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 12 / 19
  • 27. An introduction to networks/graphs Extracting important nodes 1 vertex degree degré: number of edges adjacent to a given vertex. Vertices with a high degree are called hubs: measure of the vertex popularity. 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
  • 28. An introduction to networks/graphs Extracting important nodes 1 vertex degree degré: number of edges adjacent to a given vertex. Vertices with a high degree are called hubs: measure of the vertex popularity. 2 vertex betweenness centralité: number of shortest paths between all pairs of vertices that pass through the vertex. Betweenness is a centrality measure (vertices with a large betweenness that are the most likely to disconnect the network if removed). The orange node's degree is equal to 2, its betweenness to 4. 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
  • 29. An introduction to networks/graphs Extracting important nodes 1 vertex degree degré: number of edges adjacent to a given vertex. Vertices with a high degree are called hubs: measure of the vertex popularity. 2 vertex betweenness centralité: number of shortest paths between all pairs of vertices that pass through the vertex. Betweenness is a centrality measure (vertices with a large betweenness that are the most likely to disconnect the network if removed). 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
  • 30. An introduction to networks/graphs Extracting important nodes 1 vertex degree degré: number of edges adjacent to a given vertex. Vertices with a high degree are called hubs: measure of the vertex popularity. 2 vertex betweenness centralité: number of shortest paths between all pairs of vertices that pass through the vertex. Betweenness is a centrality measure (vertices with a large betweenness that are the most likely to disconnect the network if removed). 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
  • 31. An introduction to networks/graphs Extracting important nodes 1 vertex degree degré: number of edges adjacent to a given vertex. Vertices with a high degree are called hubs: measure of the vertex popularity. 2 vertex betweenness centralité: number of shortest paths between all pairs of vertices that pass through the vertex. Betweenness is a centrality measure (vertices with a large betweenness that are the most likely to disconnect the network if removed). 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
  • 32. An introduction to networks/graphs Extracting important nodes 1 vertex degree degré: number of edges adjacent to a given vertex. Vertices with a high degree are called hubs: measure of the vertex popularity. 2 vertex betweenness centralité: number of shortest paths between all pairs of vertices that pass through the vertex. Betweenness is a centrality measure (vertices with a large betweenness that are the most likely to disconnect the network if removed). 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
  • 33. An introduction to networks/graphs Extracting important nodes 1 vertex degree degré: number of edges adjacent to a given vertex. Vertices with a high degree are called hubs: measure of the vertex popularity. 2 vertex betweenness centralité: number of shortest paths between all pairs of vertices that pass through the vertex. Betweenness is a centrality measure (vertices with a large betweenness that are the most likely to disconnect the network if removed). 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 13 / 19
  • 34. An introduction to networks/graphs Vertex clustering classication Cluster vertexes into groups that are densely connected and share a few links (comparatively) with the other groups. Clusters are often called communities communautés (social sciences) or modules modules (biology). Example 1: Natty's facebook network 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 14 / 19
  • 35. An introduction to networks/graphs Find clusters by modularity optimization modularité The modularity [Newman and Girvan, 2004] of the partition (C1, . . . , CK) is equal to: Q(C1, . . . , CK) = 1 2m K k=1xi ,xj∈Ck (Wij − Pij) with Pij: weight of a null model (graph with the same degree distribution but no preferential attachment): Pij = didj 2m with di = 1 2 j=i Wij. 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 15 / 19
  • 36. An introduction to networks/graphs Find clusters by modularity optimization modularité The modularity [Newman and Girvan, 2004] of the partition (C1, . . . , CK) is equal to: Q(C1, . . . , CK) = 1 2m K k=1xi ,xj∈Ck (Wij − Pij) with Pij: weight of a null model (graph with the same degree distribution but no preferential attachment): Pij = didj 2m with di = 1 2 j=i Wij. A good clustering should maximize the modularity: • Q when (xi, xj) are in the same cluster and Wij Pij • Q when (xi, xj) are in two dierent clusters and Wij Pij 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 15 / 19
  • 37. An introduction to networks/graphs Find clusters by modularity optimization modularité The modularity [Newman and Girvan, 2004] of the partition (C1, . . . , CK) is equal to: Q(C1, . . . , CK) = 1 2m K k=1xi ,xj∈Ck (Wij − Pij) with Pij: weight of a null model (graph with the same degree distribution but no preferential attachment): Pij = didj 2m with di = 1 2 j=i Wij. A good clustering should maximize the modularity: • Q when (xi, xj) are in the same cluster and Wij Pij • Q when (xi, xj) are in two dierent clusters and Wij Pij Modularity optimization helps separate hubs (an edge is all the more important in the criterion that it is attached to a vertex with a low degree). 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 15 / 19
  • 38. Analysis of the eect of low-calorie diet on obese women Outline 1 An introduction to networks/graphs 2 Analysis of the eect of low-calorie diet on obese women 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 16 / 19
  • 39. Analysis of the eect of low-calorie diet on obese women Data Experimental protocol 135 obese women and 3 times: before LCD, after a 2-month LCD and 6 months later (between the end of LCD and the last measurement, women are randomized into one of 5 recommended diet groups). At every time step, 221 gene expressions, 28 fatty acids and 15 clinical variables (i.e., weight, HDL, ...) 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 17 / 19
  • 40. Analysis of the eect of low-calorie diet on obese women Data Experimental protocol 135 obese women and 3 times: before LCD, after a 2-month LCD and 6 months later (between the end of LCD and the last measurement, women are randomized into one of 5 recommended diet groups). At every time step, 221 gene expressions, 28 fatty acids and 15 clinical variables (i.e., weight, HDL, ...) Correlations between gene expressions and between a gene expression and a fatty acid levels are not of the same order: inference method must be dierent inside the groups and between two groups. 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 17 / 19
  • 41. Analysis of the eect of low-calorie diet on obese women Data Data pre-processing At CID3, individuals are split into three groups: weight loss, weight regain and stable weight (groups are not correlated to the diet group according to χ2-test). Partition similar to the one obtained with weight increase/decrease ratio. 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 17 / 19
  • 42. Analysis of the eect of low-calorie diet on obese women Method Network inference Clustering Mining 3 inter-dataset networks sparse CCA merge for each time step into one network 3 intra-dataset networks sparse partial correlation 5 networks CID1 CID2 3×CID3 Extract important nodes Study/Compare clusters 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 18 / 19
  • 43. Analysis of the eect of low-calorie diet on obese women Brief overview on results 5 networks inferred with 264 nodes each: CID1 CID2 CID3g1 CID3g2 CID3g3 size LCC 244 251 240 259 258 density 2.3% 2.3% 2.3% 2.3% 2.3% transitivity 17.2% 11.9% 21.6% 10.6% 10.4% nb clusters 14 (2-52) 10 (4-52) 11 (2-46) 12 (2-51) 12 (3-54) 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 19 / 19
  • 44. Analysis of the eect of low-calorie diet on obese women References Butte, A. and Kohane, I. (1999). Unsupervised knowledge discovery in medical databases using relevance networks. In Proceedings of the AMIA Symposium, pages 711715. Butte, A. and Kohane, I. (2000). Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. In Proceedings of the Pacic Symposium on Biocomputing, pages 418429. Csardi, G. and Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems. Fruchterman, T. and Reingold, B. (1991). Graph drawing by force-directed placement. Software, Practice and Experience, 21:11291164. Newman, M. and Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review, E, 69:026113. 1 octobre 2013 (Séminaire INSERM) Network Nathalie Villa-Vialaneix 19 / 19