Consensual gene co-expression network
inference with multiple samples
Nathalie Villa-Vialaneix(1,2)
http://www.nathalievilla.org
nathalie.villa@univ-paris1.fr
Joint work with Magali SanCristobal and Laurence Liaubet
Groupe de travail biostatistique - 19 mars 2013
Overview on network inference
Outline
1 Overview on network inference
2 Graphical Gaussian Models
3 Inference with multiple samples
4 Illustration
Overview on network inference
Framework
Data: large-scale gene expression data, i.e., a matrix X = (X_i^j) with
• rows: individuals, n ≃ 30–50;
• columns: variables (gene expressions), p ≃ 10^3–10^4.
What we want to obtain: a graph/network with
• nodes: genes;
• edges: “significant” and direct co-expression between two genes (to track transcription regulation).
Overview on network inference
Modeling multiple interactions between genes with a network
Co-expression networks
• nodes: genes
• edges: “direct” co-expression between two genes
Method: “correlations” → thresholding → graph (sketch below)
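A minimal sketch of this correlate-then-threshold pipeline (not the sparse approach advocated later in the talk); the toy expression matrix and the 0.8 cutoff are illustrative assumptions:

# Hedged sketch: co-expression graph by thresholding absolute correlations;
# the toy data and the 0.8 cutoff are assumptions.
library(igraph)
set.seed(1)
expr <- matrix(rnorm(50 * 10), nrow = 50,
               dimnames = list(NULL, paste0("gene", 1:10)))
C <- cor(expr)                  # gene-gene correlations
A <- (abs(C) > 0.8) * 1         # adjacency by thresholding
diag(A) <- 0                    # no self-loops
g <- graph_from_adjacency_matrix(A, mode = "undirected")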
Overview on network inference
Correlations/Partial correlations
[figure: x drives both y and z, hence a strong indirect correlation between y and z]
set.seed(2807); x <- runif(100)
y <- 2*x+1 + rnorm(100,0,0.1); cor(x,y)   # [1] 0.9870407
z <- -x+2 + rnorm(100,0,0.1); cor(x,z)    # [1] -0.9443082
cor(y,z)                                  # [1] -0.9336924
Partial correlation Cor(z, y | x): correlation between the residuals
cor(lm(y ~ x)$residuals, lm(z ~ x)$residuals)   # [1] -0.03071178
Overview on network inference
Advantages of a network approach
1 over raw data and correlation network (relevance network,
[Butte and Kohane, 1999]): focuses on direct links;
2 over raw data (again): focuses on “significant” links (more robust);
3 over bibliographic networks: can handle interactions with yet unknown (not annotated) genes.
Graphical Gaussian Models
Outline
1 Overview on network inference
2 Graphical Gaussian Models
3 Inference with multiple samples
4 Illustration
Graphical Gaussian Models
Theoretical framework
Gaussian Graphical Models (GGM): X ∼ N(0, Σ) gene expressions.
Seminal work [Schäfer and Strimmer, 2005], R package GeneNet: estimation of the partial correlations
π_jj′ = Cor(X_j, X_j′ | X_k, k ≠ j, j′)
from the concentration matrix S = Σ⁻¹:
π_jj′ = −S_jj′ / √(S_jj S_j′j′).
Main issue: p ≫ n ⇒ the empirical estimate of Σ is badly conditioned ⇒ estimating S by inverting it is a bad idea...
Schäfer & Strimmer’s proposal:
1 use Σ + λI rather than Σ to estimate S;
2 select only the most significant S_jj′ (Bayesian test):
S ∼ (1 − η₀) f_A + η₀ f₀
with f₀ the distribution of the “null” edges and η₀ the proportion of null edges among the partial correlation values (close to 1).
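A minimal sketch of this GeneNet pipeline (shrinkage estimate of the partial correlations, then edge selection); the toy expression matrix and the 0.9 posterior-probability cutoff are illustrative assumptions, not values used in the talk:

# Hedged sketch of the Schäfer & Strimmer / GeneNet workflow;
# the toy data and the 0.9 cutoff are assumptions.
library(GeneNet)
set.seed(1)
expr  <- matrix(rnorm(40 * 20), nrow = 40)         # 40 samples x 20 genes
pc    <- ggm.estimate.pcor(expr)                   # shrinkage partial correlations
tests <- network.test.edges(pc, plot = FALSE)      # test each edge (mixture model)
net   <- extract.network(tests, cutoff.ggm = 0.9)  # keep edges with posterior prob > 0.9
head(net)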
Graphical Gaussian Models
Sparse regression approach
[Meinshausen and Bühlmann, 2006, Friedman et al., 2008] Partial correlations can also be estimated by using linear models: ∀ j,
X_j = β_j^T X_−j + ε.
In the Gaussian framework: β_jj′ = −S_jj′ / S_jj.
Independent regressions:
max_{(β_jj′)_{j′≠j}} { log ML_j − λ Σ_{j′≠j} |β_jj′| }
with log ML_j ∝ −Σ_i ( X_i^j − Σ_{j′≠j} β_jj′ X_i^j′ )².
Global approach: Graphical Lasso (R package glasso)
max_{(β_jj′)_{j,j′}} { Σ_j log ML_j − λ Σ_{j≠j′} |β_jj′| }.
Consequence: the sparse penalty forces β_jj′ = 0 for most coefficients (“all-in-one” approach: no thresholding step needed).
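A minimal sketch of the graphical-lasso route (penalized estimation of the full concentration matrix); the toy covariance matrix and the penalty value rho = 0.1 are illustrative assumptions, not tuned values:

# Hedged sketch using the glasso package; data and rho = 0.1 are illustrative.
library(glasso)
set.seed(1)
expr <- matrix(rnorm(40 * 20), nrow = 40)   # 40 samples x 20 genes
S    <- cov(expr)                           # empirical covariance
fit  <- glasso(S, rho = 0.1)                # L1-penalized inverse covariance
A    <- (abs(fit$wi) > 1e-8) * 1            # adjacency from nonzero entries
diag(A) <- 0
sum(A) / 2                                  # number of inferred edges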
Graphical Gaussian Models
Other methods/packages to infer networks
• relevance (correlation) networks: R package WGCNA
• Bayesian networks: R package bnlearn
[Pearl, 1998, Pearl and Russel, 2002, Scutari, 2010]
• networks based on mutual information: R package minet
[Meyer et al., 2008]
• networks based on random forest [Huynh-Thu et al., 2010]
See also:
• http://cran.r-project.org/web/views/gR.html (CRAN task
view on graphical methods)
• https://www.coursera.org/course/pgm (Daphne Koller’s on-line
course on “Probabilistic Graphical Models”, starts on April 8th)
• https://www.coursera.org/course/netsysbio (On-line course
on “Network Analysis in Systems Biology”)
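For instance, the mutual-information route with minet might look like the sketch below; the toy data, the Spearman estimator and the choice of ARACNE pruning are illustrative assumptions:

# Hedged sketch with the minet package; data and options are illustrative.
library(minet)
set.seed(1)
expr <- matrix(rnorm(40 * 20), nrow = 40,
               dimnames = list(NULL, paste0("gene", 1:20)))
mim <- build.mim(expr, estimator = "spearman")  # mutual information matrix
net <- aracne(mim)                              # prune indirect links (ARACNE)
sum(net > 0) / 2                                # number of edges kept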
Inference with multiple samples
Outline
1 Overview on network inference
2 Graphical Gaussian Models
3 Inference with multiple samples
4 Illustration
Inference with multiple samples
Multiple networks inference
Transcriptomic data coming from several different conditions. Examples:
• gene expression from pig muscle in Landrace and Large white breeds;
• gene expression from obese humans before and after a diet.
• Assumption: a common functioning exists regardless of the condition;
• Question: which genes are correlated independently from/depending on the condition?
Inference with multiple samples
Dataset description
“DeLiSus” dataset
• variables: expression of 81 genes (selected by Laurence)
• conditions: two breeds (33 “Landrace” and 51 “Large white”; 84 pigs)
Inference with multiple samples
“DeLiSus” dataset (restricted dataset with 84 genes (51 pigs))

                         Density  Transitivity  % shared
[1] GeneNet                0.00       0.71        0.46
[2] simone, MB-AND         0.05       0.08        0.17
[3] simone, Fried.         0.05       0.19        0.22
[4] simone, intertwined    0.05       0.09        0.52
[5] simone, CoopLasso      0.06       0.09        0.88
[6] simone, GroupLasso     0.04       0.07        0.99

Pairwise comparison between methods [1]–[6]:
      [1]   [2]   [3]   [4]   [5]   [6]
[1]  1.00  0.00  0.00  0.00  0.00  0.00
[2]        1.00  0.71  0.76  0.64  0.56
[3]              1.00  0.67  0.55  0.53
[4]                    1.00  0.80  0.67
[5]                          1.00  0.84
[6]                                1.00
Inference with multiple samples
Multiple networks
Independent estimations: if c = 1, . . . , C are different samples (or “conditions”, e.g., breeds or before/after diet...),
max_{(β^c_jk)_{k≠j}, c=1,...,C} Σ_c { log ML^c_j − λ Σ_{k≠j} |β^c_jk| }.
Joint estimations: implemented in the R package simone [Chiquet et al., 2011]
GroupLasso: consensual network between conditions (enforces identical edges by a group LASSO penalty)
CoopLasso: sign-coherent network between conditions (prevents edges that correspond to partial correlations having different signs; thus allows one to obtain a few differences between the conditions)
Intertwined: in the graphical Lasso, replace Σ^c by ½ Σ^c + ½ Σ̄, where Σ̄ = (1/C) Σ_c Σ^c
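A sketch of what a joint estimation with simone might look like; the toy data, the `tasks` factor, the `edges.coupling` value and the `getNetwork` call are assumptions about the package interface (check ?simone and ?setOptions), not code from the talk:

# Hedged sketch of joint inference with simone; options and calls are assumptions.
library(simone)
set.seed(1)
expr  <- matrix(rnorm(80 * 20), nrow = 80)                    # 80 samples x 20 genes
breed <- factor(rep(c("Landrace", "LargeWhite"), each = 40))  # condition labels
ctrl  <- setOptions(edges.coupling = "coopLasso")             # sign-coherent coupling
res   <- simone(expr, tasks = breed, options = ctrl)          # one network per condition
nets  <- getNetwork(res, 30)                                  # networks with about 30 edges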
Inference with multiple samples
Consensus LASSO
Proposal: infer multiple networks by forcing them toward a consensual network.
Original optimization (independent estimations):
max_{(β^c_jk)_{k≠j}, c=1,...,C} Σ_c { log ML^c_j − λ Σ_{k≠j} |β^c_jk| }.
Add a constraint to force inference toward a consensus β^cons:
max_{(β^c_jk)_{k≠j}, c=1,...,C} Σ_c { log ML^c_j − λ Σ_{k≠j} |β^c_jk| } − µ Σ_c w_c ‖β^c_j − β^cons_j‖².
Examples:
• β^cons_j = β^{c∗}_j with c∗ = arg min_c |β^c_j| (network intersection);
• β^cons_j = Σ_c (n_c / n) β^c_j (“average” network; see the sketch below).
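To make the new term concrete, a small sketch that evaluates the fusion penalty µ Σ_c w_c ‖β^c_j − β^cons_j‖² for the “average network” consensus; the list layout of the per-condition coefficients and the unit weights w_c are illustrative assumptions:

# Hedged sketch: fusion term of the consensus LASSO for the "average network"
# consensus; data layout and unit weights are assumptions.
consensus_penalty <- function(beta_list, n_c, mu, w = rep(1, length(beta_list))) {
  n <- sum(n_c)
  # "average network" consensus: weighted mean of the per-condition coefficients
  beta_cons <- Reduce(`+`, Map(function(b, nc) (nc / n) * b, beta_list, n_c))
  mu * sum(mapply(function(b, wc) wc * sum((b - beta_cons)^2), beta_list, w))
}
# toy usage: C = 2 conditions (33 and 51 pigs), coefficients of gene j vs 5 other genes
beta_list <- list(c(0.2, 0, -0.1, 0, 0.3), c(0.1, 0, -0.2, 0, 0.4))
consensus_penalty(beta_list, n_c = c(33, 51), mu = 0.5)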
Inference with multiple samples
In practice...
β^cons_j = Σ_c (n_c / n) β^c_j is a good choice because:
• ∂β^cons_j / ∂β^c_j exists;
• thus, solving the optimization problem is equivalent to minimizing
½ β_j^T S_j(µ) β_j + β_j^T Σ_{j,\j} + λ Σ_c (1/n_c) ‖β^c_j‖₁
with Σ_{j,\j} the jth row of the empirical covariance matrix deprived of its jth entry, and S_j(µ) = Σ_{\j,\j} + 2µ AᵀA, where Σ_{\j,\j} is the empirical covariance matrix deprived of its jth row and column and A is a matrix that does not depend on j.
This is a standard LASSO problem that can be solved using a sub-gradient method (as described in [Chiquet et al., 2011] and already implemented in the beta R package therese).
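As a concrete reading of “standard LASSO problem”, here is a generic soft-thresholding coordinate-descent solver for an objective of the form ½ βᵀQβ + βᵀq + λ‖β‖₁; this is only a sketch of that quadratic-plus-ℓ1 form, not the sub-gradient scheme of [Chiquet et al., 2011] nor the code of therese:

# Hedged sketch: coordinate descent with soft-thresholding for
# minimize 0.5 * t(beta) %*% Q %*% beta + sum(beta * q) + lambda * sum(abs(beta)).
# Q must be positive definite; this is a generic solver, not the therese package.
lasso_cd <- function(Q, q, lambda, n_iter = 200) {
  p    <- length(q)
  beta <- rep(0, p)
  soft <- function(z, t) sign(z) * pmax(abs(z) - t, 0)
  for (it in seq_len(n_iter)) {
    for (k in seq_len(p)) {
      # gradient of the smooth part w.r.t. beta_k, with beta_k itself excluded
      r <- q[k] + sum(Q[k, -k] * beta[-k])
      beta[k] <- soft(-r, lambda) / Q[k, k]
    }
  }
  beta
}
# toy usage with a small well-conditioned Q
set.seed(1)
M <- matrix(rnorm(30), nrow = 10)
Q <- crossprod(M) / 10 + diag(0.1, 3)
q <- rnorm(3)
lasso_cd(Q, q, lambda = 0.1)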
Illustration
Outline
1 Overview on network inference
2 Graphical Gaussian Models
3 Inference with multiple samples
4 Illustration
Illustration
Dataset description
“DeLiSus” dataset
• variables: expression of 26 genes (selected by Laurence)
• conditions: two breeds (33 “Landrace” and 51 “Large white”; 84 pigs)
Methodology
• package GeneNet: networks are estimated independently by a GGM approach (edges selected based on the p-value of a Bayesian test);
• consensus LASSO: µ fixed and λ varied on a regularization path. Selection of an instance of the path based on the number of edges (similar to GeneNet; see the sketch below).
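A sketch of that path-matching step: pick the point of the regularization path whose network has an edge count closest to a reference count (e.g., the GeneNet network); the list-of-adjacency-matrices layout is an illustrative assumption about how the path is stored:

# Hedged sketch: select the point of a regularization path whose edge count
# is closest to a target number of edges; the path layout is an assumption.
select_by_edges <- function(path, target_edges) {
  counts <- sapply(path, function(A) sum(A != 0) / 2)  # undirected edge counts
  path[[which.min(abs(counts - target_edges))]]
}
# toy usage: three 4x4 symmetric adjacency matrices along a fake path
A1 <- matrix(0, 4, 4); A1[1, 2] <- A1[2, 1] <- 1
A2 <- A1; A2[3, 4] <- A2[4, 3] <- 1
A3 <- A2; A3[1, 3] <- A3[3, 1] <- 1
best <- select_by_edges(list(A1, A2, A3), target_edges = 2)
sum(best != 0) / 2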
Illustration
Results (inferred networks; figures not reproduced here):
• package GeneNet;
• package simone (intertwined);
• consensus LASSO.
Illustration
Conclusion
... much left to do:
• biological validation,
• selecting λ (AIC and BIC are way too restrictive...),
• tuning µ,
• other comparisons...
Illustration
References
Butte, A. and Kohane, I. (1999).
Unsupervised knowledge discovery in medical databases using relevance networks.
In Proceedings of the AMIA Symposium, pages 711–715.
Chiquet, J., Grandvalet, Y., and Ambroise, C. (2011).
Inferring multiple graphical structures.
Statistics and Computing, 21(4):537–553.
Friedman, J., Hastie, T., and Tibshirani, R. (2008).
Sparse inverse covariance estimation with the graphical lasso.
Biostatistics, 9(3):432–441.
Huynh-Thu, V., Irrthum, A., Wehenkel, L., and Geurts, P. (2010).
Inferring regulatory networks from expression data using tree-based methods.
PLoS ONE, 5(9):e12776.
Meinshausen, N. and Bühlmann, P. (2006).
High dimensional graphs and variable selection with the lasso.
Annals of Statistics, 34(3):1436–1462.
Meyer, P., Lafitte, F., and Bontempi, G. (2008).
minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information.
BMC Bioinformatics, 9(461).
Pearl, J. (1998).
Probabilistic reasoning in intelligent systems: networks of plausible inference.
Morgan Kaufmann, San Francisco, California, USA.
Pearl, J. and Russel, S. (2002).
Bayesian Networks.
Bradford Books (MIT Press), Cambridge, Massachusetts, USA.
Schäfer, J. and Strimmer, K. (2005).
An empirical Bayes approach to inferring large-scale gene association networks.
Bioinformatics, 21(6):754–764.
Scutari, M. (2010).
Learning Bayesian networks with the bnlearn R package.
Journal of Statistical Software, 35(3):1–22.