Short overview on network inference with GGM Inference with multiple samples Simulations
Joint network inference with the consensual
LASSO
Nathalie Villa-Vialaneix
Joint work with Matthieu Vignes, Nathalie Viguerie
and Magali San Cristobal
Séminaire MIA-T, INRA
Toulouse, 23 April 2014
http://www.nathalievilla.org
nathalie.villa@toulouse.inra.fr
Nathalie Villa-Vialaneix | Consensus Lasso 1/30
Outline
1 Short overview on network inference with GGM
2 Inference with multiple samples
3 Simulations
Transcriptomic data
DNA is transcribed into mRNA to produce proteins.
Transcriptomic data: measurements of the quantity of mRNA corresponding to a given gene in given cells (blood, muscle...) of a living organism.
Systems biology
Some genes' expressions activate or repress other genes' expressions
⇒ understanding the whole cascade helps to comprehend the global functioning of living organisms¹

¹ Picture taken from: Abdollahi A et al., PNAS 2007, 104:12890-12895. © 2007 by the National Academy of Sciences
Model framework
Data: large scale gene expression data, collected in an n × p matrix X = (X_i^j) with
individuals in rows: n ≃ 30 to 50;
variables (gene expressions) in columns: p ≃ 10^3 to 10^4.
What we want to obtain: a graph/network with
nodes: genes;
edges: strong links between gene expressions.
Advantages of inferring a network from large scale transcription data
1. over raw data: focuses on the strongest direct relationships: irrelevant or indirect relations are removed (more robust) and the data are easier to visualize and understand (track transcription relations). Expression data are analyzed all together and not pairwise (systems model).
2. over a bibliographic network: can handle interactions with yet unknown (not annotated) genes and deal with data collected in a particular condition.
Using correlations: relevance network [Butte and Kohane, 1999, Butte and Kohane, 2000]
First (naive) approach: compute the correlations between expressions for all pairs of genes, threshold the smallest ones and build the network:
Correlations → Thresholding → Graph
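This thresholding pipeline is easy to sketch; here is a minimal NumPy version (my own illustration, not the talk's code; the 0.8 cutoff is arbitrary):

```python
import numpy as np

def relevance_network(X, threshold=0.8):
    """Adjacency matrix obtained by thresholding absolute pairwise correlations."""
    C = np.corrcoef(X, rowvar=False)          # p x p correlation matrix
    A = (np.abs(C) > threshold).astype(int)   # keep only the strongest links
    np.fill_diagonal(A, 0)                    # no self-loops
    return A

# toy data: gene 0 drives gene 1; gene 2 is unrelated
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x,
                     2 * x + rng.normal(scale=0.1, size=200),
                     rng.normal(size=200)])
A = relevance_network(X)   # edge 0--1 is kept; 0--2 and 1--2 are not
```

The next slides show why this naive approach keeps too many edges: strong indirect correlations also survive the threshold.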
Using partial correlations
Strong indirect correlation: when x drives both y and z, y and z are strongly correlated even though they are not directly linked.

set.seed(2807); x <- rnorm(100)
y <- 2*x + 1 + rnorm(100, 0, 0.1); cor(x, y)  # [1] 0.998826
z <- 2*x + 1 + rnorm(100, 0, 0.1); cor(x, z)  # [1] 0.998751
cor(y, z)  # [1] 0.9971105

Partial correlation
cor(lm(x~z)$residuals, lm(y~z)$residuals)  # [1] 0.7801174
cor(lm(x~y)$residuals, lm(z~y)$residuals)  # [1] 0.7639094
cor(lm(y~x)$residuals, lm(z~x)$residuals)  # [1] -0.1933699
Partial correlation in the Gaussian framework
(X_i)_{i=1,...,n} are i.i.d. Gaussian random vectors N(0, Σ) (gene expression); then

j ←→ j' (genes j and j' are linked) ⇔ Cor(X^j, X^{j'} | (X^k)_{k≠j,j'}) ≠ 0

If S = Σ^{-1} (concentration matrix), then

Cor(X^j, X^{j'} | (X^k)_{k≠j,j'}) = − S_{jj'} / √(S_{jj} S_{j'j'})

⇒ estimate Σ^{-1} to unravel the graph structure.

Problem: Σ is a p-dimensional matrix and n ≪ p ⇒ (Σ̂_n)^{-1} is a poor estimate of S!
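The identity above can be checked numerically; a small sketch (illustrative, assuming only NumPy) that recovers a conditional independence which plain correlation misses:

```python
import numpy as np

def partial_correlations(Sigma):
    """Partial correlations Cor(X^j, X^j' | rest) = -S_jj' / sqrt(S_jj * S_j'j'),
    with S = Sigma^{-1} the concentration matrix."""
    S = np.linalg.inv(Sigma)
    d = np.sqrt(np.diag(S))
    P = -S / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P

# x drives y and z: marginally Cor(y, z) = 0.64, yet y and z are
# conditionally independent given x, so their partial correlation is 0
Sigma = np.array([[1.0, 0.8, 0.8],
                  [0.8, 1.0, 0.64],
                  [0.8, 0.64, 1.0]])
P = partial_correlations(Sigma)   # P[1, 2] is (numerically) zero
```

The zero pattern of the concentration matrix is exactly the missing-edge pattern of the graph, which is why estimating Σ^{-1} is the core task.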
Various approaches for inferring networks with GGM
Graphical Gaussian Model

seminal work: [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance): estimate Σ^{-1} by (Σ̂_n + λI)^{-1} and use a Bayesian test to decide which coefficients are significantly non zero.

sparse approaches: [Meinshausen and Bühlmann, 2006, Friedman et al., 2008]: ∀ j, estimate the linear model

X^j = β_j^T X^{-j} + ε,   argmin_{(β_{jj'})_{j'}} Σ_{i=1}^n (X_{ij} − β_j^T X_i^{-j})² + λ ||β_j||_{L1}

with ||β_j||_{L1} = Σ_{j'} |β_{jj'}|, because β_{jj'} = − S_{jj'} / S_{jj}.
The L1 penalty yields β_{jj'} = 0 for most j' (variable selection).
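A hedged sketch of the neighborhood-selection idea: one Lasso regression per variable, then an "OR" rule to symmetrize the selected edges. The tiny coordinate-descent solver and the λ value are my own didactic stand-ins, not the cited implementations:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Tiny coordinate-descent Lasso: minimizes (1/2n)||y - Xb||^2 + lam*||b||_1.
    Didactic stand-in for the penalized estimators cited above."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]       # partial residual
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - lam * n, 0.0) / col_sq[j]
    return beta

def neighborhood_selection(X, lam=0.1):
    """Meinshausen-Buhlmann sketch: Lasso-regress each variable on all the
    others; keep edge j--j' when beta_jj' or beta_j'j is nonzero ('OR' rule)."""
    n, p = X.shape
    B = np.zeros((p, p))
    for j in range(p):
        others = [k for k in range(p) if k != j]
        B[j, others] = lasso_cd(X[:, others], X[:, j], lam)
    return ((B != 0) | (B.T != 0)).astype(int)

rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x,
                     2 * x + rng.normal(scale=0.1, size=200),
                     rng.normal(size=200)])
A = neighborhood_selection(X)   # the strong pair 0--1 is recovered
```

In practice one would rather use a dedicated implementation (e.g. the packages discussed later in the talk); the point here is only the per-node regression structure.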
Outline
1 Short overview on network inference with GGM
2 Inference with multiple samples
3 Simulations
Motivation for multiple networks inference
Pan-European project Diogenes² (with Nathalie Viguerie, INSERM): gene expressions (lipid tissues) from 204 obese women before and after a low-calorie diet (LCD).
Assumption: a common functioning exists regardless of the condition.
Question: which genes are linked independently from / depending on the condition?

² http://www.diogenes-eu.org/; see also [Viguerie et al., 2012]
Naive approach: independent estimations
Notations: p genes measured in k samples, each corresponding to a specific condition: (X^c_j)_{j=1,...,p} ∼ N(0, Σ^c), for c = 1, ..., k. For every c, n_c independent observations (X^c_{ij})_{i=1,...,n_c}, with Σ_c n_c = n.

Independent inference
∀ c = 1, ..., k and ∀ j = 1, ..., p, the models
X^c_j = X^c_{-j} β^c_j + ε^c_j
are estimated (independently) by maximizing the pseudo-likelihood:
L(S|X) = Σ_{c=1}^k Σ_{j=1}^p Σ_{i=1}^{n_c} log P(X^c_{ij} | X^c_{i,-j}, S^c_j)
Related papers
Problem: the previous estimation does not use the fact that the different networks should be somehow alike!

Previous proposals
[Chiquet et al., 2011] replace Σ^c by Σ̃^c = ½ Σ^c + ½ Σ and add a sparse penalty;
[Chiquet et al., 2011] LASSO and Group-LASSO type penalties to force identical or sign-coherent edges between conditions:
Σ_{jj'} √(Σ_c (S^c_{jj'})²)  or  Σ_{jj'} [ √(Σ_c ((S^c_{jj'})₊)²) + √(Σ_c ((S^c_{jj'})₋)²) ]
⇒ S^c_{jj'} = 0 ∀ c for most entries, OR S^c_{jj'} can only be of a given sign (positive or negative) whatever c;
[Danaher et al., 2013] add the penalty Σ_{c<c'} ||S^c − S^{c'}||_{L1} ⇒ very strong consistency between conditions (the sparse penalty over the differences between inferred networks forces identical values for concentration matrix entries);
[Mohan et al., 2012] add a group-LASSO like penalty Σ_{c<c'} Σ_j ||S^c_j − S^{c'}_j||_{L2} that focuses on differences due to a small number of nodes only.
Consensus LASSO
Proposal
Infer multiple networks by forcing them toward a consensual network, i.e., explicitly keeping the differences between conditions under control, but with an L2 penalty (allowing for more differences than Group-LASSO type penalties).

Original optimization:
max_{(β^c_{jk})_{k≠j}, c=1,...,k} Σ_c [ log ML^c_j − λ Σ_{k≠j} |β^c_{jk}| ].

[Ambroise et al., 2009, Chiquet et al., 2011]: this is equivalent to minimizing p problems of dimension k(p−1):
½ β_j^T Σ_{-j,-j} β_j + β_j^T Σ_{-j,j} + λ ||β_j||_{L1},   with β_j = (β^1_j, ..., β^k_j)
and Σ_{-j,-j} the block diagonal matrix Diag(Σ^1_{-j,-j}, ..., Σ^k_{-j,-j}), and similarly for Σ_{-j,j}.

Add a constraint to force inference toward a "consensus" β^cons:
½ β_j^T Σ_{-j,-j} β_j + β_j^T Σ_{-j,j} + λ ||β_j||_{L1} + μ Σ_c w_c ||β^c_j − β^cons_j||²_{L2}
with:
w_c: real number used to weight the conditions (w_c = 1 or w_c = 1/√n_c);
μ: regularization parameter;
β^cons_j: whatever you want...?
Choice of a consensus: set one...
Typical case:
a prior network is known (e.g., from bibliography);
with no prior information, use a fixed prior corresponding to (e.g.) global inference
⇒ a given (and fixed) β^cons

Proposition
Using a fixed β^cons_j, the optimization problem is equivalent to minimizing the p following standard quadratic problems in R^{k(p−1)} with L1 penalty:
½ β_j^T B^1(μ) β_j + β_j^T B^2(μ) + λ ||β_j||_{L1},
where
B^1(μ) = Σ_{-j,-j} + 2μ I_{k(p−1)}, with I_{k(p−1)} the k(p−1)-identity matrix;
B^2(μ) = Σ_{-j,j} − 2μ β̄^cons, with β̄^cons = ((β^cons_j)^T, ..., (β^cons_j)^T)^T.
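To make the criterion concrete, here is an illustrative evaluation of the objective for one gene j (my own sketch, not the package code; the block matrices are passed in as plain arrays, and when no fixed β^cons is given the weighted mean of the adaptive variant is used):

```python
import numpy as np

def consensus_objective(betas, Sigmas, sigmas, lam, mu,
                        beta_cons=None, weights=None):
    """Value of the consensus-Lasso criterion for one gene j (illustrative).
    `betas` stacks the per-condition vectors beta_j^c; `Sigmas`/`sigmas` are
    the per-condition quadratic/linear terms; if `beta_cons` is None the
    weighted mean of the per-condition vectors is used."""
    k = len(betas)
    w = np.ones(k) if weights is None else np.asarray(weights, float)
    if beta_cons is None:
        beta_cons = np.average(betas, axis=0, weights=w)
    val = 0.0
    for c in range(k):
        b = np.asarray(betas[c], float)
        val += 0.5 * b @ Sigmas[c] @ b + b @ sigmas[c]   # quadratic part
        val += lam * np.abs(b).sum()                     # L1 sparsity
        val += mu * w[c] * np.sum((b - beta_cons) ** 2)  # consensus pull
    return val

# two conditions that already agree: the consensus term vanishes
betas = [np.array([1.0, 0.0]), np.array([1.0, 0.0])]
I = np.eye(2)
val = consensus_objective(betas, [I, I], [np.zeros(2)] * 2, lam=1.0, mu=10.0)
```

Conditions that disagree pay an extra μ-weighted quadratic price, which is exactly the "keep differences under control" mechanism of the proposal.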
Choice of a consensus: adapt one during training...
Derive the consensus from the condition-specific estimates:
β^cons_j = Σ_c (n_c / n) β^c_j

Proposition
Using β^cons_j = Σ_{c=1}^k (n_c / n) β^c_j, the optimization problem is equivalent to minimizing the following standard quadratic problem with L1 penalty:
½ β_j^T S_j(μ) β_j + β_j^T Σ_{-j,j} + λ ||β_j||_{L1}
where S_j(μ) = Σ_{-j,-j} + 2μ A^T(μ) A(μ), with A(μ) a [k(p−1) × k(p−1)] matrix that does not depend on j.
Computational aspects: optimization
Common framework
The objective function can be decomposed into:
a convex part C(β_j) = ½ β_j^T Q^1_j(μ) β_j + β_j^T Q^2_j(μ);
an L1-norm penalty P(β_j) = ||β_j||_{L1}.

Optimization by "active set" [Osborne et al., 2000, Chiquet et al., 2011]
1: repeat (λ given)
2:   given A and β_{jj'} st β_{jj'} ≠ 0 ∀ j' ∈ A, solve (over h) the smooth minimization problem restricted to A:
       C(β_j + h) + λ P(β_j + h)  ⇒  β_j ← β_j + h
3:   update A by adding the most violating variables, i.e., variables st |∂C(β_j) + λ ∂P(β_j)| > 0, with [∂P(β_j)]_{j'} ∈ [−1, 1] if j' ∉ A
4: until all optimality conditions are satisfied, i.e., no variable st |∂C(β_j) + λ ∂P(β_j)| > 0 remains

The whole procedure is repeated from a large λ down to a small λ, using the previous β as a warm start.
Bootstrap estimation BOLASSO [Bach, 2008]
B times: subsample n observations with replacement;
cLasso estimation for varying λ → (β^{λ,b}_{jj'})_{jj'};
threshold: keep the T1 largest coefficients.
Frequency table over the B runs:
(1, 2)  (1, 3)  ...  (j, j')  ...
 130     25     ...   120    ...
→ keep the T2 most frequent pairs.
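The resampling loop can be sketched generically; everything below is illustrative, with a plain correlation estimator standing in for the cLasso fit:

```python
import numpy as np

def bootstrap_edge_frequencies(X, fit_edges, B=100, T1=250, seed=0):
    """Sketch of the bootstrap step: resample observations with replacement,
    score edges with any estimator `fit_edges` (returning a p x p weight
    matrix), keep the T1 largest scores per run, and count how often each
    pair survives across the B runs."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros((p, p), dtype=int)
    for _ in range(B):
        idx = rng.integers(0, n, size=n)        # bootstrap sample
        W = np.abs(fit_edges(X[idx]))
        np.fill_diagonal(W, 0)
        iu = np.triu_indices(p, 1)              # one entry per pair
        scores = W[iu] + W.T[iu]                # symmetrize
        keep = np.argsort(scores)[::-1][:T1]    # T1 largest coefficients
        counts[iu[0][keep], iu[1][keep]] += 1
    return counts

# toy run: correlation magnitudes stand in for the cLasso coefficients
freq = bootstrap_edge_frequencies(
    np.random.default_rng(1).normal(size=(50, 10)),
    lambda X: np.corrcoef(X, rowvar=False), B=20, T1=5)
```

The final network then keeps the T2 pairs with the highest counts, which is what the frequency table above records.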
Outline
1 Short overview on network inference with GGM
2 Inference with multiple samples
3 Simulations
Simulated data
Expression data with a known co-expression network:
original network (scale free) taken from http://www.comp-sys-bio.org/AGN/data.html (100 nodes, ∼ 200 edges, loops removed);
rewire a ratio r of the edges to generate k "children" networks (sharing approximately 100(1 − 2r)% of their edges);
generate "expression data" with a random Gaussian process from each child:
use the Laplacian of the graph to generate a putative concentration matrix;
use edge colors in the original network to set the edge signs;
correct the obtained matrix to make it positive definite;
invert it to obtain a covariance matrix...;
... which is used in a random Gaussian process to generate expression data (with noise).
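The recipe above can be sketched as follows (an illustrative NumPy version; the edge-sign step is omitted and the positivity correction is a simple eigenvalue shift):

```python
import numpy as np

def expression_from_graph(A, n=50, eps=0.1, seed=0):
    """Sketch of the simulation recipe: use the graph Laplacian as a putative
    concentration matrix, shift it to be positive definite, invert it to a
    covariance, and draw Gaussian 'expression' data plus observation noise."""
    rng = np.random.default_rng(seed)
    d = A.sum(axis=1)
    S = np.diag(d) - A                          # Laplacian as concentration matrix
    S = S + (eps - np.linalg.eigvalsh(S).min()) * np.eye(len(A))  # make PD
    Sigma = np.linalg.inv(S)                    # covariance matrix
    X = rng.multivariate_normal(np.zeros(len(A)), Sigma, size=n)
    return X + rng.normal(scale=0.05, size=X.shape)   # add noise

# a 4-node chain graph as a toy "child" network
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], float)
X = expression_from_graph(A, n=100)
```

By construction the nonzero off-diagonal entries of the shifted Laplacian sit exactly on the edges of the graph, so the simulated data have the intended conditional-dependence structure.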
An example with k = 2, r = 5%
mother network3
first child second child
3
actually the parent network. My co-author wisely noted that the mistake was
unforgivable for a feminist...
Choice for T2
Data: r = 0.05, k = 2 and n1 = n2 = 20; 100 bootstrap samples, μ = 1, T1 = 250 or 500.
[Precision/recall curves for both values of T1; the marked dots (30, 40 and 43 selections at least) correspond to the best F = 2 × precision × recall / (precision + recall).]
⇒ The best F corresponds to selecting a number of edges approximately equal to the number of edges in the original network.
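The selection criterion F is just the harmonic mean of precision and recall; a one-line helper for reference (my own, returning 0 when both are 0):

```python
def f_measure(precision, recall):
    """F = 2 * precision * recall / (precision + recall); used to pick T2."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

f_measure(0.8, 0.4)   # -> 0.5333...
```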
Choice for T1 and μ
μ ∈ {0.1, 1}, T1 ∈ {250, 300, 500}; % of improvement of bootstrapping over direct estimation:

rewired edges: 5%
network sizes    μ    T1   % improvement
20-20            1    500  30.69
20-30            0.1  500  11.87
30-30            1    300  20.15
50-50            1    300  14.36
20-20-20-20-20   1    500  86.04
30-30-30-30      0.1  500  42.67

rewired edges: 20%
network sizes    μ    T1   % improvement
20-20            0.1  300  -17.86
20-30            0.1  300  -18.35
30-30            1    500  -7.97
50-50            0.1  300  -7.83
20-20-20-20-20   0.1  500  10.27
30-30-30-30      1    500  13.48
Comparison with other approaches
Methods compared (direct and bootstrap approaches):
independent graphical LASSO estimation gLasso;
methods implemented in the R package simone and described in [Chiquet et al., 2011]: intertwined LASSO iLasso, cooperative LASSO coopLasso and group LASSO groupLasso;
fused graphical LASSO as described in [Danaher et al., 2013], fgLasso;
consensus Lasso with:
  a fixed prior (the mother network) cLasso(p);
  a fixed prior (average over the conditions of independent estimations) cLasso(2);
  adaptive estimation of the prior cLasso(m).
Parameters set to: T1 = 500, B = 100, μ = 1.
Selected results (best F)
rewired edges: 5% - conditions: 2 - sample size: 2 × 30

direct version
gLasso   iLasso     groupLasso  coopLasso
0.28     0.35       0.32        0.35
fgLasso  cLasso(m)  cLasso(p)   cLasso(2)
0.32     0.31       0.86        0.30

bootstrap version
gLasso   iLasso     groupLasso  coopLasso
0.31     0.34       0.36        0.34
fgLasso  cLasso(m)  cLasso(p)   cLasso(2)
0.36     0.37       0.86        0.35

Conclusions
bootstrapping improves results (except for iLasso and for large r);
joint inference improves results;
using a good prior is (as expected) very efficient;
the adaptive approach for cLasso is better than the naive approach.
Real data
204 obese women; expression of 221 genes before and after a LCD.
μ = 1; T1 = 1000 (target density: 4%).
[Histogram: distribution of the number of times an edge is selected over the 100 bootstrap samples; 70% of the pairs of nodes are never selected.]
⇒ T2 = 80
Networks
[Inferred networks before and after the diet, highlighting the genes AZGP1, CIDEA, FADS1, FADS2, GPD1L and PCK2.]
Densities about 1.3% - some interactions (both shared and specific) make sense to the biologist.
Thank you for your attention...
Programs available in the R package therese (on R-Forge)⁴. Joint work with:
Magali San Cristobal (GenPhySe, INRA Toulouse), Matthieu Vignes (MIAT, INRA Toulouse) and Nathalie Viguerie (I2MC, INSERM Toulouse).

⁴ https://r-forge.r-project.org/projects/therese-pkg
Questions?
References
Ambroise, C., Chiquet, J., and Matias, C. (2009).
Inferring sparse Gaussian graphical models with latent structure.
Electronic Journal of Statistics, 3:205–238.
Bach, F. (2008).
Bolasso: model consistent lasso estimation through the bootstrap.
In Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML).
Butte, A. and Kohane, I. (1999).
Unsupervised knowledge discovery in medical databases using relevance networks.
In Proceedings of the AMIA Symposium, pages 711–715.
Butte, A. and Kohane, I. (2000).
Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements.
In Proceedings of the Pacific Symposium on Biocomputing, pages 418–429.
Chiquet, J., Grandvalet, Y., and Ambroise, C. (2011).
Inferring multiple graphical structures.
Statistics and Computing, 21(4):537–553.
Danaher, P., Wang, P., and Witten, D. (2013).
The joint graphical lasso for inverse covariance estimation across multiple classes.
Journal of the Royal Statistical Society Series B.
Forthcoming.
Friedman, J., Hastie, T., and Tibshirani, R. (2008).
Sparse inverse covariance estimation with the graphical lasso.
Biostatistics, 9(3):432–441.
Meinshausen, N. and Bühlmann, P. (2006).
High dimensional graphs and variable selection with the lasso.
Annals of Statistics, 34(3):1436–1462.
Mohan, K., Chung, J., Han, S., Witten, D., Lee, S., and Fazel, M. (2012).
Structured learning of Gaussian graphical models.
In Proceedings of NIPS (Neural Information Processing Systems) 2012, Lake Tahoe, Nevada, USA.
Osborne, M., Presnell, B., and Turlach, B. (2000).
On the LASSO and its dual.
Journal of Computational and Graphical Statistics, 9(2):319–337.
Schäfer, J. and Strimmer, K. (2005a).
An empirical Bayes approach to inferring large-scale gene association networks.
Bioinformatics, 21(6):754–764.
Schäfer, J. and Strimmer, K. (2005b).
A shrinkage approach to large-scale covariance matrix estimation and implication for functional genomics.
Statistical Applications in Genetics and Molecular Biology, 4:1–32.
Viguerie, N., Montastier, E., Maoret, J., Roussel, B., Combes, M., Valle, C., Villa-Vialaneix, N., Iacovoni, J., Martinez, J., Holst, C., Astrup,
A., Vidal, H., Clément, K., Hager, J., Saris, W., and Langin, D. (2012).
Determinants of human adipose tissue gene expression: impact of diet, sex, metabolic status and cis genetic regulation.
PLoS Genetics, 8(9):e1002959.
Animal Communication- Auditory and Visual.pptx
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 

Inferring networks from multiple samples with consensus LASSO

  • 6. Model framework. Data: large scale gene expression data, an n × p matrix X with entries X_i^j (individuals in rows, n ≃ 30/50; variables, i.e. gene expressions, in columns, p ≃ 10³/10⁴). What we want to obtain: a graph/network with nodes: genes; edges: strong links between gene expressions.
  • 9. Advantages of inferring a network from large scale transcription data. 1. Over raw data: focuses on the strongest direct relationships (irrelevant or indirect relations are removed, hence more robust), the data are easier to visualize and understand (track transcription relations), and expression data are analyzed all together and not by pairs (systems model). 2. Over a bibliographic network: can handle interactions with yet unknown (not annotated) genes and deal with data collected in a particular condition.
  • 10. Short overview on network inference with GGM Inference with multiple samples Simulations Using correlations: relevance network [Butte and Kohane, 1999, Butte and Kohane, 2000] First (naive) approach: calculate correlations between expressions for all pairs of genes, threshold the smallest ones and build the network. Correlations Thresholding Graph Nathalie Villa-Vialaneix | Consensus Lasso 7/30
  • 13. Using partial correlations: a strong indirect correlation between y and z can be induced by a common driver x.
    set.seed(2807); x <- rnorm(100)
    y <- 2*x+1+rnorm(100,0,0.1); cor(x,y)   # [1] 0.998826
    z <- 2*x+1+rnorm(100,0,0.1); cor(x,z)   # [1] 0.998751
    cor(y,z)                                # [1] 0.9971105
    Partial correlations:
    cor(lm(x~z)$residuals, lm(y~z)$residuals)   # [1] 0.7801174
    cor(lm(x~y)$residuals, lm(z~y)$residuals)   # [1] 0.7639094
    cor(lm(y~x)$residuals, lm(z~x)$residuals)   # [1] -0.1933699
  • 16. Partial correlation in the Gaussian framework. (X_i)_{i=1,...,n} are i.i.d. Gaussian random vectors N(0, Σ) (gene expressions); then j ↔ j' (genes j and j' are linked) ⇔ Cor(X_j, X_{j'} | (X_k)_{k≠j,j'}) ≠ 0. If S = Σ^{-1} (the concentration matrix), then Cor(X_j, X_{j'} | (X_k)_{k≠j,j'}) = -S_{jj'} / √(S_{jj} S_{j'j'}) ⇒ estimate Σ^{-1} to unravel the graph structure. Problem: Σ is a p-dimensional matrix and n ≪ p, so (Σ̂_n)^{-1} is a poor estimate of S!
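The partial-correlation formula on this slide can be checked numerically. Below is a small Python sketch (the slides use R; this translation and the toy variables are ours): it estimates the concentration matrix S = Σ⁻¹ on simulated data and reads partial correlations off it as −S_{jj'}/√(S_{jj} S_{j'j'}).

```python
import numpy as np

# Toy data mimicking the slides' x, y, z example: y and z are both
# driven by x, so cor(y, z) is large although y and z are not
# directly linked.
rng = np.random.default_rng(2807)
x = rng.normal(size=1000)
y = 2 * x + 1 + rng.normal(scale=0.1, size=1000)
z = 2 * x + 1 + rng.normal(scale=0.1, size=1000)
X = np.column_stack([x, y, z])

# Concentration (precision) matrix S = Sigma^{-1}
S = np.linalg.inv(np.cov(X, rowvar=False))

# Partial correlation: pcor(j, j') = -S[j, j'] / sqrt(S[j, j] * S[j', j'])
d = np.sqrt(np.diag(S))
pcor = -S / np.outer(d, d)

marginal_yz = np.corrcoef(y, z)[0, 1]  # large (indirect correlation)
partial_yz = pcor[1, 2]                # near zero (no direct link)
```

This reproduces the residual-based computation of the previous slide in one matrix inversion, which is exactly why estimating Σ⁻¹ reveals the graph.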
  • 20. Various approaches for inferring networks with GGM (Graphical Gaussian Model). Seminal work [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance): estimate Σ^{-1} by (Σ̂_n + λI)^{-1} and use a Bayesian test to decide which coefficients are significantly non-zero. Regression formulation: ∀j, estimate the linear model X_j = β_j^T X_{-j} + ε via argmax over (β_{jj'})_{j'≠j} of log ML_j, or equivalently argmin of Σ_{i=1}^n (X_{ij} − β_j^T X_{i,-j})², because β_{jj'} = −S_{jj'}/S_{jj}. Sparse approaches [Meinshausen and Bühlmann, 2006, Friedman et al., 2008]: ∀j, argmin over (β_{jj'})_{j'≠j} of Σ_{i=1}^n (X_{ij} − β_j^T X_{i,-j})² + λ‖β_j‖_{L1} with ‖β_j‖_{L1} = Σ_{j'≠j} |β_{jj'}|; the L1 penalty yields β_{jj'} = 0 for most j' (variable selection).
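The per-gene sparse regression of [Meinshausen and Bühlmann, 2006] can be sketched with a hand-rolled coordinate descent (Python for illustration; the toy data, λ value and function names are ours, not from the slides): the non-zero β_{jj'} of the lasso regression of X_j on X_{−j} give gene j's neighbors.

```python
import numpy as np

def soft_threshold(a, t):
    """Soft-thresholding operator, the proximal map of the L1 norm."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for 0.5*||y - X b||^2 / n + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ b                          # residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]            # remove j's contribution
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]            # add it back
    return b

# Hypothetical toy data: "gene" y depends on columns 1 and 2 of Z only.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 5))
y = 1.5 * Z[:, 1] - 1.0 * Z[:, 2] + 0.1 * rng.normal(size=200)
b = lasso_cd(Z, y, lam=0.4)
neighbors = set(np.nonzero(np.abs(b) > 1e-8)[0])
```

The L1 penalty zeroes the three irrelevant coefficients, so `neighbors` recovers the two true partners (at the price of shrinking the surviving coefficients).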
  • 21. Short overview on network inference with GGM Inference with multiple samples Simulations Outline 1 Short overview on network inference with GGM 2 Inference with multiple samples 3 Simulations Nathalie Villa-Vialaneix | Consensus Lasso 11/30
  • 22. Motivation for multiple networks inference. Pan-European project Diogenes² (with Nathalie Viguerie, INSERM): gene expressions (lipid tissues) from 204 obese women before and after a low-calorie diet (LCD). Assumption: a common functioning exists regardless of the condition. Question: which genes are linked independently of the condition, and which depending on the condition? (² http://www.diogenes-eu.org/; see also [Viguerie et al., 2012])
  • 23. Naive approach: independent estimations. Notations: p genes measured in k samples, each corresponding to a specific condition: (X_j^c)_{j=1,...,p} ∼ N(0, Σ^c) for c = 1, ..., k; for each c, n_c independent observations (X_{ij}^c)_{i=1,...,n_c}, with Σ_c n_c = n. Independent inference: ∀c = 1, ..., k and ∀j = 1, ..., p, the models X_j^c = X_{-j}^c β_j^c + ε_j^c are estimated (independently) by maximizing the pseudo-likelihood L(S|X) = Σ_{c=1}^k Σ_{j=1}^p Σ_{i=1}^{n_c} log P(X_{ij}^c | X_{i,-j}^c, S_j^c).
  • 27. Related papers. Problem: the previous estimation does not use the fact that the different networks should be somehow alike! Previous proposals: [Chiquet et al., 2011] replace Σ^c by Σ̃^c = ½Σ^c + ½Σ̄ and add a sparse penalty; [Chiquet et al., 2011] LASSO and Group-LASSO type penalties to force identical or sign-coherent edges between conditions, Σ_{jj'} √(Σ_c (S_{jj'}^c)²) or Σ_{jj'} [√(Σ_c ((S_{jj'}^c)_+)²) + √(Σ_c ((S_{jj'}^c)_−)²)] ⇒ S_{jj'}^c = 0 ∀c for most entries, OR S_{jj'}^c can only be of a given sign (positive or negative) whatever c; [Danaher et al., 2013] add the penalty Σ_{c<c'} ‖S^c − S^{c'}‖_{L1} ⇒ very strong consistency between conditions (sparse penalty over the differences between inferred networks ⇒ identical values for concentration matrix entries); [Mohan et al., 2012] add a group-LASSO like penalty Σ_{c<c'} Σ_j ‖S_j^c − S_j^{c'}‖_{L2} that focuses on differences due to a few nodes only.
  • 30. Consensus LASSO. Proposal: infer multiple networks by forcing them toward a consensual network, i.e., explicitly keeping the differences between conditions under control, but with an L2 penalty (which allows more differences than Group-LASSO type penalties). Original optimization: maximize, over (β_{jk}^c)_{k≠j, c=1,...,C}, Σ_c [log ML_j^c − λ Σ_{k≠j} |β_{jk}^c|]. [Ambroise et al., 2009, Chiquet et al., 2011]: this is equivalent to minimizing p problems of dimension k(p−1): ½ β_j^T Σ_{-j,-j} β_j + β_j^T Σ_{-j,j} + λ‖β_j‖_{L1}, with β_j = (β_j^1, ..., β_j^k) and Σ_{-j,-j} the block-diagonal matrix Diag(Σ_{-j,-j}^1, ..., Σ_{-j,-j}^k) (and similarly for Σ_{-j,j}). Consensus constraint: add a term forcing inference toward a "consensus" β^cons: ½ β_j^T Σ_{-j,-j} β_j + β_j^T Σ_{-j,j} + λ‖β_j‖_{L1} + µ Σ_c w_c ‖β_j^c − β_j^cons‖²_{L2}, with w_c a real number weighting condition c (w_c = 1 or w_c = 1/√n_c), µ a regularization parameter, and β_j^cons... whatever you want?
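To make the criterion concrete, here is a Python sketch (the function name and toy inputs are ours) that evaluates the per-gene consensus-LASSO objective across the k conditions, with β^cons taken as the weighted mean of the condition-specific vectors:

```python
import numpy as np

def consensus_lasso_objective(betas, Sigmas, sigmas, lam, mu, weights):
    """Value of the per-gene consensus-LASSO criterion (sketch):
    sum over conditions c of the quadratic pseudo-likelihood part
    0.5*b'Sigma^c b + b'sigma^c, plus lam * L1 penalty, plus
    mu * sum_c w_c ||beta^c - beta^cons||^2, where beta^cons is the
    weighted mean of the condition estimates."""
    k = len(betas)
    beta_cons = np.average(betas, axis=0, weights=weights)
    value = 0.0
    for c in range(k):
        b = betas[c]
        value += 0.5 * b @ Sigmas[c] @ b + b @ sigmas[c]
        value += mu * weights[c] * np.sum((b - beta_cons) ** 2)
    value += lam * np.sum(np.abs(np.concatenate(betas)))
    return value

# Sanity check: with identical condition estimates, the consensus
# penalty vanishes and only the quadratic parts remain.
b = np.ones(3)
val = consensus_lasso_objective([b, b], [np.eye(3), np.eye(3)],
                                [np.zeros(3), np.zeros(3)],
                                lam=0.0, mu=5.0, weights=[1.0, 1.0])
```

The µ term only pays when the condition-specific vectors disagree, which is precisely the "keep differences under control" behaviour described above.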
  • 32. Choice of a consensus: set one. Typical case: a prior network is known (e.g., from the bibliography); with no prior information, use a fixed prior corresponding to (e.g.) global inference ⇒ a given (and fixed) β^cons. Proposition: using a fixed β_j^cons, the optimization problem is equivalent to minimizing the p following standard quadratic problems in R^{k(p−1)} with L1 penalty: ½ β_j^T B¹(µ) β_j + β_j^T B²(µ) + λ‖β_j‖_{L1}, where B¹(µ) = Σ_{-j,-j} + 2µ I_{k(p−1)} (I_{k(p−1)} being the k(p−1) identity matrix) and B²(µ) = Σ_{-j,j} − 2µ β̄^cons, with the stacked vector β̄^cons = ((β_j^cons)^T, ..., (β_j^cons)^T)^T.
  • 34. Choice of a consensus: adapt one during training. Derive the consensus from the condition-specific estimates: β_j^cons = Σ_c (n_c/n) β_j^c. Proposition: using β_j^cons = Σ_{c=1}^k (n_c/n) β_j^c, the optimization problem is equivalent to minimizing the following standard quadratic problem with L1 penalty: ½ β_j^T S_j(µ) β_j + β_j^T Σ_{-j,j} + λ‖β_j‖_{L1}, where S_j(µ) = Σ_{-j,-j} + 2µ A^T(µ) A(µ) and A(µ) is a [k(p−1) × k(p−1)] matrix that does not depend on j.
  • 39. Computational aspects: optimization. Common framework: the objective function decomposes into a convex part C(β_j) = ½ β_j^T Q_j¹(µ) β_j + β_j^T Q_j²(µ) and an L1-norm penalty P(β_j) = ‖β_j‖_{L1}. Optimization by "active set" [Osborne et al., 2000, Chiquet et al., 2011]:
    1: repeat (λ given)
    2:   given the active set A and β_{jj'} such that β_{jj'} ≠ 0 ∀j' ∈ A, solve (over h) the smooth minimization problem restricted to A, C(β_j + h) + λP(β_j + h), and set β_j ← β_j + h
    3:   update A by adding the most violating variables, i.e. variables such that |[∂C(β_j) + λ∂P(β_j)]_{j'}| > 0, with [∂P(β_j)]_{j'} ∈ [−1, 1] if j' ∉ A
    4: until no optimality condition is violated
    Repeat along a path of penalties, from large λ down to small λ, using the previous β as a warm start.
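The smooth-quadratic-plus-L1 structure above can also be minimized by plain proximal gradient; the sketch below (illustrative only, not the active-set solver of the slides) handles the generic ½βᵀQβ + βᵀq + λ‖β‖₁ form that the two Propositions produce.

```python
import numpy as np

def ista_quadratic_l1(Q, q, lam, n_iter=500):
    """Proximal gradient (ISTA) for min_b 0.5*b'Qb + b'q + lam*||b||_1,
    with Q symmetric positive semi-definite. Each step is a gradient
    step on the smooth part followed by soft-thresholding."""
    L = np.linalg.eigvalsh(Q).max()      # Lipschitz constant of the gradient
    b = np.zeros(len(q))
    for _ in range(n_iter):
        g = Q @ b + q                    # gradient of the smooth part
        z = b - g / L
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return b

# Sanity check on a separable problem where the solution is known:
# with Q = I and q = -a, the minimizer is soft_threshold(a, lam).
a = np.array([2.0, 0.05, -1.0])
b = ista_quadratic_l1(np.eye(3), -a, lam=0.5)
```

On this separable instance the solver reaches the closed-form soft-thresholded solution; the active-set strategy of the slides solves the same problem but only over the working set of non-zero coordinates, which is much faster when the solution is sparse.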
  • 44. Bootstrap estimation, BOLASSO [Bach, 2008]: (i) subsample n observations with replacement; (ii) run the cLasso estimation for varying λ, giving (β_{jj'}^{λ,b})_{jj'}; (iii) threshold, keeping the T1 largest coefficients; (iv) repeat B times and build a frequency table over the pairs of nodes (e.g., pair (1,2) selected 130 times, (1,3) 25 times, ..., (j,j') 120 times, ...); (v) keep the T2 most frequently selected pairs.
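The five steps above can be sketched generically (Python; `fit_edges` is a hypothetical stand-in for the cLasso fit, here replaced by plain correlations on toy data):

```python
import numpy as np
from collections import Counter

def bootstrap_edge_selection(X, fit_edges, B=100, T1=10, T2=5, seed=0):
    """BOLASSO-style selection sketch: on each bootstrap resample,
    call fit_edges(X_b) -> {(j, j'): coefficient}, keep the T1 largest
    coefficients in absolute value, and finally return the T2 edges
    selected most often across the B resamples."""
    rng = np.random.default_rng(seed)
    counts = Counter()
    n = X.shape[0]
    for _ in range(B):
        idx = rng.integers(0, n, size=n)       # resample with replacement
        coefs = fit_edges(X[idx])
        top = sorted(coefs, key=lambda e: abs(coefs[e]), reverse=True)[:T1]
        counts.update(top)
    return [edge for edge, _ in counts.most_common(T2)]

# Toy data: "genes" 0 and 1 strongly co-expressed, others independent.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=50)

def corr_edges(Xb):
    """Hypothetical stand-in for the cLasso estimation step."""
    C = np.corrcoef(Xb, rowvar=False)
    p = C.shape[0]
    return {(j, k): C[j, k] for j in range(p) for k in range(j + 1, p)}

edges = bootstrap_edge_selection(X, corr_edges, B=50, T1=1, T2=1)
```

Spurious pairs only make the top-T1 list on occasional resamples, so the frequency table concentrates on stable edges.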
  • 45. Short overview on network inference with GGM Inference with multiple samples Simulations Outline 1 Short overview on network inference with GGM 2 Inference with multiple samples 3 Simulations Nathalie Villa-Vialaneix | Consensus Lasso 20/30
  • 47. Simulated data: expression data with known co-expression network. The original network (scale free) is taken from http://www.comp-sys-bio.org/AGN/data.html (100 nodes, ∼200 edges, loops removed); rewire a ratio r of the edges to generate k "children" networks (sharing approximately 100(1 − 2r)% of their edges); generate "expression data" with a random Gaussian process from each child: use the Laplacian of the graph to generate a putative concentration matrix; use edge colors in the original network to set the edge signs; correct the obtained matrix to make it positive definite; invert it to obtain a covariance matrix, which is used in a random Gaussian process to generate expression data (with noise).
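The generation pipeline of the last bullet can be sketched as follows (Python; the ε diagonal boost stands in for the unspecified "correction" step, and the function name and toy network are ours):

```python
import numpy as np

def simulate_expression(adj, n, eps=0.1, seed=0):
    """Build a putative concentration matrix from the graph Laplacian
    (degree matrix minus adjacency), boost the diagonal so the matrix
    is positive definite, invert it to get a covariance matrix, and
    draw n Gaussian samples. The eps boost is our assumption, not the
    authors' exact correction."""
    rng = np.random.default_rng(seed)
    deg = adj.sum(axis=1)
    laplacian = np.diag(deg) - adj
    precision = laplacian + eps * np.eye(len(adj))  # make it invertible
    cov = np.linalg.inv(precision)
    return rng.multivariate_normal(np.zeros(len(adj)), cov, size=n)

# Hypothetical 3-gene chain network 0 - 1 - 2: the zero entry (0, 2)
# of the precision matrix encodes conditional independence of genes
# 0 and 2 given gene 1, matching the absent edge.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
X = simulate_expression(adj, n=500)
```

By construction, zeros of the concentration matrix coincide with absent edges, so the simulated data have exactly the conditional-independence graph of the chosen network.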
  • 48. An example with k = 2, r = 5%: [plots of the mother network³, its first child and its second child]. (³ Actually the parent network: my co-author wisely noted that the mistake was unforgivable for a feminist...)
  • 49. Choice for T2. Data: r = 0.05, k = 2 and n1 = n2 = 20; 100 bootstrap samples, µ = 1, T1 = 250 or 500. [Precision/recall curves for several selection thresholds, e.g., edges selected at least 30, 40 or 43 times.] Dots correspond to the best F = 2 × (precision × recall)/(precision + recall) ⇒ the best F corresponds to selecting a number of edges approximately equal to the number of edges in the original network.
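The F measure used on this slide, written out as a small helper (the edge sets below are hypothetical examples):

```python
def precision_recall_f(inferred, true):
    """Precision, recall and F = 2*precision*recall/(precision+recall)
    for two sets of edges (pairs of nodes)."""
    inferred, true = set(inferred), set(true)
    tp = len(inferred & true)                      # correctly inferred edges
    precision = tp / len(inferred) if inferred else 0.0
    recall = tp / len(true) if true else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall > 0 else 0.0)
    return precision, recall, f

true_edges = {(1, 2), (1, 3), (2, 4), (3, 4)}
inferred_edges = {(1, 2), (2, 4), (2, 5)}
p, r, f = precision_recall_f(inferred_edges, true_edges)
```

F is the harmonic mean of precision and recall, so maximizing it balances selecting too few edges (low recall) against selecting too many (low precision).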
  • 50. Choice for T1 and µ. Percentage of improvement brought by bootstrapping, for µ ∈ {0.1, 1} and T1 ∈ {250, 300, 500}.
    Rewired edges: 5%
      network sizes        µ     T1    % improvement
      20-20                1     500   30.69
      20-30                0.1   500   11.87
      30-30                1     300   20.15
      50-50                1     300   14.36
      20-20-20-20-20       1     500   86.04
      30-30-30-30          0.1   500   42.67
    Rewired edges: 20%
      20-20                0.1   300   -17.86
      20-30                0.1   300   -18.35
      30-30                1     500   -7.97
      50-50                0.1   300   -7.83
      20-20-20-20-20       0.1   500   10.27
      30-30-30-30          1     500   13.48
  • 51. Short overview on network inference with GGM Inference with multiple samples Simulations Comparison with other approaches Method compared (direct and bootstrap approaches) independant Graphical LASSO estimation gLasso methods implementated in the R package simone and described in [Chiquet et al., 2011]: intertwinned LASSO iLasso, cooperative LASSO coopLasso and group LASSO groupLasso fused graphical LASSO as described in [Danaher et al., 2013] as implemented in the R package fgLasso Nathalie Villa-Vialaneix | Consensus Lasso 25/30
  • 52. Short overview on network inference with GGM Inference with multiple samples Simulations Comparison with other approaches Method compared (direct and bootstrap approaches) independant Graphical LASSO estimation gLasso methods implementated in the R package simone and described in [Chiquet et al., 2011]: intertwinned LASSO iLasso, cooperative LASSO coopLasso and group LASSO groupLasso fused graphical LASSO as described in [Danaher et al., 2013] as implemented in the R package fgLasso consensus Lasso with fixed prior (the mother network) cLasso(p) fixed prior (average over the conditions of independant estimations) cLasso(2) adaptative estimation of the prior cLasso(m) Nathalie Villa-Vialaneix | Consensus Lasso 25/30
  • 53. Comparison with other approaches
Methods compared (direct and bootstrap approaches):
- independent Graphical LASSO estimation: gLasso
- methods implemented in the R package simone and described in [Chiquet et al., 2011]: intertwined LASSO (iLasso), cooperative LASSO (coopLasso) and group LASSO (groupLasso)
- fused graphical LASSO, as described in [Danaher et al., 2013] and implemented in the R package fgLasso
- consensus Lasso with: a fixed prior (the mother network), cLasso(p); a fixed prior (average over the conditions of independent estimations), cLasso(2); an adaptive estimation of the prior, cLasso(m)
Parameters set to: T1 = 500, B = 100, µ = 1.
  • 54. Selected results (best F)
rewired edges: 5% - conditions: 2 - sample size: 2 × 30

direct version
Method  gLasso  iLasso  groupLasso  coopLasso  fgLasso  cLasso(m)  cLasso(p)  cLasso(2)
F       0.28    0.35    0.32        0.35       0.32     0.31       0.86       0.30

bootstrap version
Method  gLasso  iLasso  groupLasso  coopLasso  fgLasso  cLasso(m)  cLasso(p)  cLasso(2)
F       0.31    0.34    0.36        0.34       0.36     0.37       0.86       0.35

Conclusions:
- bootstrapping improves results (except for iLasso and for large r)
- joint inference improves results
- using a good prior is (as expected) very efficient
- the adaptive approach for cLasso is better than the naive approach
  • 55. Real data
204 obese women; expression of 221 genes before and after a LCD. µ = 1; T1 = 1000 (target density: 4%).
[Figure: distribution of the number of times an edge is selected over 100 bootstrap samples; 70% of the pairs of nodes are never selected.]
⇒ T2 = 80
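The selection-count distribution above comes from re-running the inference on bootstrap resamples of the observations and counting how often each edge appears, then keeping the edges selected at least T2 times. A minimal Python sketch of that counting step, with an arbitrary `infer_edges` routine standing in for the actual consensus-Lasso inference (names and data representation are illustrative):

```python
import random
from collections import Counter

def bootstrap_edge_counts(data, infer_edges, n_boot=100, seed=0):
    """Count, for each edge, the number of bootstrap samples in which it
    is selected. `data` is a list of observations; `infer_edges` is any
    inference routine returning a set of edges for a given sample."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        # resample the observations with replacement
        sample = [rng.choice(data) for _ in data]
        counts.update(infer_edges(sample))
    return counts

def threshold_edges(counts, t2):
    """Keep only the edges selected in at least t2 bootstrap samples."""
    return {e for e, c in counts.items() if c >= t2}
```

With T2 = 80 and B = 100 bootstrap samples, an edge is kept only if it survives the resampling in at least 80% of the runs, which filters out edges that are unstable under perturbation of the sample.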
  • 56. Networks
[Figure: inferred networks before and after the diet, with the genes AZGP1, CIDEA, FADS1, FADS2, GPD1L and PCK2 highlighted.]
Densities are about 1.3%; some interactions (both shared and specific) make sense to the biologists.
  • 57. Thank you for your attention...
Programs available in the R package therese (on R-Forge)4.
Joint work with Magali San Cristobal (GenPhySe, INRA Toulouse), Matthieu Vignes (MIAT, INRA Toulouse) and Nathalie Viguerie (I2MC, INSERM Toulouse).
4 https://r-forge.r-project.org/projects/therese-pkg
  • 58. Questions?
  • 59. References
Ambroise, C., Chiquet, J., and Matias, C. (2009). Inferring sparse Gaussian graphical models with latent structure. Electronic Journal of Statistics, 3:205–238.
Bach, F. (2008). Bolasso: model consistent Lasso estimation through the bootstrap. In Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML).
Butte, A. and Kohane, I. (1999). Unsupervised knowledge discovery in medical databases using relevance networks. In Proceedings of the AMIA Symposium, pages 711–715.
Butte, A. and Kohane, I. (2000). Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. In Proceedings of the Pacific Symposium on Biocomputing, pages 418–429.
Chiquet, J., Grandvalet, Y., and Ambroise, C. (2011). Inferring multiple graphical structures. Statistics and Computing, 21(4):537–553.
Danaher, P., Wang, P., and Witten, D. (2013). The joint graphical lasso for inverse covariance estimation across multiple classes. Journal of the Royal Statistical Society, Series B. Forthcoming.
Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432–441.
  • 60. References (continued)
Meinshausen, N. and Bühlmann, P. (2006). High dimensional graphs and variable selection with the lasso. Annals of Statistics, 34(3):1436–1462.
Mohan, K., Chung, J., Han, S., Witten, D., Lee, S., and Fazel, M. (2012). Structured learning of Gaussian graphical models. In Proceedings of NIPS (Neural Information Processing Systems) 2012, Lake Tahoe, Nevada, USA.
Osborne, M., Presnell, B., and Turlach, B. (2000). On the LASSO and its dual. Journal of Computational and Graphical Statistics, 9(2):319–337.
Schäfer, J. and Strimmer, K. (2005a). An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics, 21(6):754–764.
Schäfer, J. and Strimmer, K. (2005b). A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4:1–32.
Viguerie, N., Montastier, E., Maoret, J., Roussel, B., Combes, M., Valle, C., Villa-Vialaneix, N., Iacovoni, J., Martinez, J., Holst, C., Astrup, A., Vidal, H., Clément, K., Hager, J., Saris, W., and Langin, D. (2012). Determinants of human adipose tissue gene expression: impact of diet, sex, metabolic status and cis genetic regulation. PLoS Genetics, 8(9):e1002959.