SlideShare une entreprise Scribd logo
1  sur  69
Télécharger pour lire hors ligne
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Joint network inference with the consensual 
LASSO 
Nathalie Villa-Vialaneix 
Joint work with Matthieu Vignes, Nathalie Viguerie 
and Magali San Cristobal 
Séminaire de Statistique de Montpellier 
6 octobre 2014 
http://www.nathalievilla.org 
nathalie.villa@toulouse.inra.fr 
Nathalie Villa-Vialaneix | Consensus Lasso 1/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Outline 
1 Short overview on biological background 
2 Network inference and GGM 
3 Inference with multiple samples 
4 Simulations 
Nathalie Villa-Vialaneix | Consensus Lasso 2/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
DNA 
DNA (DeoxyriboNucleic Acid (DNA) 
molecule that encodes the genetic instructions used in the development 
and functioning of all known living organisms and many viruses 
double helix made with only four nucleotides (Adenine, Cytosine, 
Thymine, Guanine): A binds with T and C with G. 
Nathalie Villa-Vialaneix | Consensus Lasso 3/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Transcription 
Transcription of DNA 
transcription is the process by which a particular segment of DNA (called 
gene) is copied into a single-strand RNA (which is called message RNA) 
A is transcripted into U, T into A, C 
into G and G into C 
Nathalie Villa-Vialaneix | Consensus Lasso 4/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
What is mRNA used for? 
mRNA then moves out of the cell nucleus and is translated into proteins 
which are made of 20 different amino-acids (using an alphabet: 3 letters 
of mRNA ! 1 amino-acid) 
Proteins are used by the cell to function. 
Nathalie Villa-Vialaneix | Consensus Lasso 5/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Gene expression 
In a given cell, at a given time, all the genes are not translated or not 
translated at the same level. 
Gene expression 
is a process by which a given gene is transcripted or/and traducted 
It depends on the type of cell, the environment, ... 
Nathalie Villa-Vialaneix | Consensus Lasso 6/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Gene expression 
In a given cell, at a given time, all the genes are not translated or not 
translated at the same level. 
Gene expression 
is a process by which a given gene is transcripted or/and traducted 
It depends on the type of cell, the environment, ... 
Gene expression can be measured either by: 
“counting” the number of copies (mRNA) of a given gene in the cell: 
transcriptomic data; 
“counting” the quantity of a given protein in the cell: proteomic data. 
Even though a given mRNA is translated into a unique protein, there is 
no simple relationship between transcriptomic and proteomic data. 
Nathalie Villa-Vialaneix | Consensus Lasso 6/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Transcriptomic data 
How are transcriptomic data obtained? 
DNA spots associated to target 
genes (probes) are attached to a 
solid surface (array) 
Nathalie Villa-Vialaneix | Consensus Lasso 7/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Transcriptomic data 
How are transcriptomic data obtained? 
RNA material extracting for cells of 
interest (blood, lipid tissue, muscle, 
urine...) are labeled fluorescently 
and applied on the array 
When the binding between mRNA 
and the probes is good, the spot 
becomes fluorescent. 
Expression of a given gene is quantified by the intensity of the 
fluorescent signal (read by a scanner). 
Nathalie Villa-Vialaneix | Consensus Lasso 7/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Typical microarray data 
Data: large scale gene expression data 
individuals 
n ' 30=50 
8>>><>>>: 
X = 
0BBBBBBBB@ 
: : : : : : 
: : Xj 
i : : : 
: : : : : : 
1CCCCCCCCA 
|                                {z                                } 
variables (genes expression); p'103=4 
Nathalie Villa-Vialaneix | Consensus Lasso 8/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Typical microarray data 
Data: large scale gene expression data 
individuals 
n ' 30=50 
8>>><>>>: 
X = 
0BBBBBBBB@ 
: : : : : : 
: : Xj 
i : : : 
: : : : : : 
1CCCCCCCCA 
|                                {z                                } 
variables (genes expression); p'103=4 
Typical design: two (or more) conditions (treated/control for instance) 
More and more complicated designs: crossed conditions (treated/control 
and different breeds), longitudinal data... 
Nathalie Villa-Vialaneix | Consensus Lasso 8/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Typical microarray data 
Data: large scale gene expression data 
individuals 
n ' 30=50 
8>>><>>>: 
X = 
0BBBBBBBB@ 
: : : : : : 
: : Xj 
i : : : 
: : : : : : 
1CCCCCCCCA 
|                                {z                                } 
variables (genes expression); p'103=4 
Typical design: two (or more) conditions (treated/control for instance) 
More and more complicated designs: crossed conditions (treated/control 
and different breeds), longitudinal data... 
Typical issues: find differentially expressed genes (i.e., genes whose 
expression is significantly different between the conditions) 
Nathalie Villa-Vialaneix | Consensus Lasso 8/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Outline 
1 Short overview on biological background 
2 Network inference and GGM 
3 Inference with multiple samples 
4 Simulations 
Nathalie Villa-Vialaneix | Consensus Lasso 9/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Systems biology 
Instead of being used to produce proteins, some genes’ expressions 
activate or repress other genes’ expressions ) understanding the whole 
cascade helps to comprehend the global functioning of living organisms1 
1Picture taken from: Abdollahi A et al., PNAS 2007, 104:12890-12895. c
 2007 by 
National Academy of Sciences 
Nathalie Villa-Vialaneix | Consensus Lasso 10/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Model framework 
Data: large scale gene expression data 
individuals 
n ' 30=50 
8>>><>>>:X = 
0BBBBBBBB@ 
: : : : : : 
: : Xj 
i : : : 
: : : : : : 
1CCCCCCCCA 
|                                {z                                } 
variables (genes expression); p'103=4 
What we want to obtain: a graph/network with 
nodes: (selected) genes; 
edges: strong links between gene expressions. 
Nathalie Villa-Vialaneix | Consensus Lasso 11/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Advantages of network inference 
1 over raw data: focuses on the strongest direct relationships: 
irrelevant or indirect relations are removed (more robust) and the 
data are easier to visualize and understand (track transcription 
relations). 
Nathalie Villa-Vialaneix | Consensus Lasso 12/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Advantages of network inference 
1 over raw data: focuses on the strongest direct relationships: 
irrelevant or indirect relations are removed (more robust) and the 
data are easier to visualize and understand (track transcription 
relations). 
Expression data are analyzed all together and not by pairs (systems 
model). 
Nathalie Villa-Vialaneix | Consensus Lasso 12/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Advantages of network inference 
1 over raw data: focuses on the strongest direct relationships: 
irrelevant or indirect relations are removed (more robust) and the 
data are easier to visualize and understand (track transcription 
relations). 
Expression data are analyzed all together and not by pairs (systems 
model). 
2 over bibliographic network: can handle interactions with yet 
unknown (not annotated) genes and deal with data collected in a 
particular condition. 
Nathalie Villa-Vialaneix | Consensus Lasso 12/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Using correlations 
Relevance network [Butte and Kohane, 1999] 
First (naive) approach: calculate correlations between expressions for all 
pairs of genes, threshold the smallest ones and build the network. 
Correlations Thresholding Graph 
Nathalie Villa-Vialaneix | Consensus Lasso 13/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Using partial correlations 
x 
y z 
strong indirect correlation 
Nathalie Villa-Vialaneix | Consensus Lasso 14/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Using partial correlations 
x 
y z 
strong indirect correlation 
set.seed(2807); x <- rnorm(100) 
y <- 2*x+1+rnorm(100,0,0.1); cor(x,y) [1] 0.998826 
z <- 2*x+1+rnorm(100,0,0.1); cor(x,z) [1] 0.998751 
cor(y,z) [1] 0.9971105 
Nathalie Villa-Vialaneix | Consensus Lasso 14/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Using partial correlations 
x 
y z 
strong indirect correlation 
set.seed(2807); x <- rnorm(100) 
y <- 2*x+1+rnorm(100,0,0.1); cor(x,y) [1] 0.998826 
z <- 2*x+1+rnorm(100,0,0.1); cor(x,z) [1] 0.998751 
cor(y,z) [1] 0.9971105 
] Partial correlation 
cor(lm(xz)$residuals,lm(yz)$residuals) [1] 0.7801174 
cor(lm(xy)$residuals,lm(zy)$residuals) [1] 0.7639094 
cor(lm(yx)$residuals,lm(zx)$residuals) [1] -0.1933699 
Nathalie Villa-Vialaneix | Consensus Lasso 14/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Partial correlation and GGM 
(Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene 
expression); then 
j  ! j0(genes j and j0 are linked) , Cor 
 
Xj ; Xj0 
j(Xk )k,j;j0 
 
, 0 
Nathalie Villa-Vialaneix | Consensus Lasso 15/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Partial correlation and GGM 
(Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene 
expression); then 
j  ! j0(genes j and j0 are linked) , Cor 
 
Xj ; Xj0 
j(Xk )k,j;j0 
 
, 0 
If (concentration matrix) S = 1, 
Cor 
 
Xj ; Xj0 
j(Xk )k,j;j0 
 
=  
Sjj0 p 
SjjSj0 j0 
) Estimate 1 to unravel the graph structure 
Nathalie Villa-Vialaneix | Consensus Lasso 15/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Partial correlation and GGM 
(Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene 
expression); then 
j  ! j0(genes j and j0 are linked) , Cor 
 
Xj ; Xj0 
j(Xk )k,j;j0 
 
, 0 
If (concentration matrix) S = 1, 
Cor 
 
Xj ; Xj0 
j(Xk )k,j;j0 
 
=  
Sjj0 p 
SjjSj0 j0 
) Estimate 1 to unravel the graph structure 
Problem: : p-dimensional matrix and n  p ) (b 
n)1 is a poor 
estimate of S)! 
Nathalie Villa-Vialaneix | Consensus Lasso 15/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Various approaches for inferring 
networks with GGM 
Graphical Gaussian Model 
seminal work: 
[Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] 
(with shrinkage and a proposal for a Bayesian test of significance) 
estimate 1 by (b 
n + I)1 
use a Bayesian test to test which coefficients are significantly non zero. 
Nathalie Villa-Vialaneix | Consensus Lasso 16/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Various approaches for inferring 
networks with GGM 
Graphical Gaussian Model 
seminal work: 
[Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] 
(with shrinkage and a proposal for a Bayesian test of significance) 
estimate 1 by (b 
n + I)1 
use a Bayesian test to test which coefficients are significantly non zero. 
relation with LM 
8 j, estimate the linear model: 
Xj =
Tj 
Xj +  ; arg max 
(
jj0 )j0 
(logMLj) 
because
jj0 =  
Sjj0 
Sjj 
. 
Nathalie Villa-Vialaneix | Consensus Lasso 16/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Various approaches for inferring 
networks with GGM 
Graphical Gaussian Model 
seminal work: 
[Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] 
(with shrinkage and a proposal for a Bayesian test of significance) 
estimate 1 by (b 
n + I)1 
use a Bayesian test to test which coefficients are significantly non zero. 
relation with LM 
8 j, estimate the linear model: 
Xj =
Tj 
Xj +  ; arg min 
(
jj0 )j0 
Xn 
i=1 
 
Xij
Tj 
Xj 
i 
2 
because
jj0 =  
Sjj0 
Sjj 
. 
Nathalie Villa-Vialaneix | Consensus Lasso 16/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Various approaches for inferring 
networks with GGM 
Graphical Gaussian Model 
seminal work: 
[Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] 
(with shrinkage and a proposal for a Bayesian test of significance) 
estimate 1 by (b 
n + I)1 
use a Bayesian test to test which coefficients are significantly non zero. 
sparse approaches: 
[Meinshausen and Bühlmann, 2006, Friedman et al., 2008]: 
8 j, estimate the linear model: 
Xj =
Tj 
Xj +  ; arg min 
(
jj0 )j0 
Xn 
i=1 
 
Xij
Tj 
Xj 
i 
2 
+k
jkL1 
with k
jkL1 = 
P 
j0 j
jj0 j 
L1 penalty yields to
jj0 = 0 for most j0 (variable selection) 
Nathalie Villa-Vialaneix | Consensus Lasso 16/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Outline 
1 Short overview on biological background 
2 Network inference and GGM 
3 Inference with multiple samples 
4 Simulations 
Nathalie Villa-Vialaneix | Consensus Lasso 17/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Motivation for multiple networks 
inference 
Pan-European project Diogenes2 (with Nathalie Viguerie, INSERM): 
gene expressions (lipid tissues) from 204 obese women before and after 
a low-calorie diet (LCD). 
Assumption: A common 
functioning exists 
regardless the 
condition; 
Which genes are linked 
independently 
from/depending on the 
condition? 
2http://www.diogenes-eu.org/; see also [Viguerie et al., 2012] 
Nathalie Villa-Vialaneix | Consensus Lasso 18/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Naive approach: independent 
estimations 
Notations: p genes measured in k samples, each corresponding to a 
specific condition: (Xc 
j )j=1;:::;p  N(0;c), for c = 1; : : : ; k. 
For c = 1; : : : ; k, nc independent observations (Xc 
P ij )i=1;:::;nc and 
c nc = n. 
Independent inference 
Estimation 8 c = 1; : : : ; k and 8 j = 1; : : : ; p, 
Xc 
j = Xc 
nj
cj 
+ c 
j 
are estimated (independently) by maximizing pseudo-likelihood: 
L(SjX) = 
Xk 
c=1 
Xp 
j=1 
Xnc 
i=1 
log P 
 
Xc 
ij jXci 
;nj ; Sc 
j 
 
Nathalie Villa-Vialaneix | Consensus Lasso 19/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Related papers 
Problem: previous estimation does not use the fact that the different 
networks should be somehow alike! 
Previous proposals 
[Chiquet et al., 2011] replace c bye 
c = 12 
c + 12 
 and add a 
sparse penalty; 
Nathalie Villa-Vialaneix | Consensus Lasso 20/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Related papers 
Problem: previous estimation does not use the fact that the different 
networks should be somehow alike! 
Previous proposals 
[Chiquet et al., 2011] replace c bye 
c = 12 
c + 12 
 and add a 
sparse penalty; 
[Chiquet et al., 2011] LASSO and Group-LASSO type penalties to 
force identical or sign-coherent edges between conditions: 
X 
jj0 
sX 
c 
(Sc 
jj0 )2 or 
X 
jj0 
266666664 
sX 
c 
(Sc 
jj0 )2 
+ + 
sX 
c 
(Sc 
jj0 )2 
 
377777775 
) Sc 
jj0 = 0 8 c for most entries OR Sc 
jj0 can only be of a given sign 
(positive or negative) whatever c 
Nathalie Villa-Vialaneix | Consensus Lasso 20/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Related papers 
Problem: previous estimation does not use the fact that the different 
networks should be somehow alike! 
Previous proposals 
[Chiquet et al., 2011] replace c bye 
c = 12 
c + 12 
 and add a 
sparse penalty; 
[Chiquet et al., 2011] LASSO and Group-LASSO type penalties to 
force identical or sign-coherent edges between conditions 
[Danaher et al., 2013] add the penalty 
P 
c,c0 kSc  Sc0 
kL1 ) (sparse 
penalty over the concentration matrix entries forces to strong 
consistency between conditions); 
Nathalie Villa-Vialaneix | Consensus Lasso 20/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Related papers 
Problem: previous estimation does not use the fact that the different 
networks should be somehow alike! 
Previous proposals 
[Chiquet et al., 2011] replace c bye 
c = 12 
c + 12 
 and add a 
sparse penalty; 
[Chiquet et al., 2011] LASSO and Group-LASSO type penalties to 
force identical or sign-coherent edges between conditions 
[Danaher et al., 2013] add the penalty 
P 
c,c0 kSc  Sc0 
kL1 ) (sparse 
penalty over the concentration matrix entries forces to strong 
consistency between conditions); 
[PMohan P 
et al., 2012] add a group-LASSO like penalty 
c,c0 
j kSc 
j  Sc0 
j kL2 that focuses on differences due to a few 
number of nodes only. 
Nathalie Villa-Vialaneix | Consensus Lasso 20/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Consensus LASSO 
Proposal 
Infer multiple networks by forcing them toward a consensual network: i.e., 
explicitly keeping the differences between conditions under control but 
with a L2 penalty (allow for more differences than Group-LASSO type 
penalties). 
Original optimization: 
max 
(
c 
jk )k,j;c=1;:::;C 
X 
c 
0BBBBBB@ 
logMLcj 
  
X 
k,j 
1CCCCCCA 
j
c 
jk j 
: 
Nathalie Villa-Vialaneix | Consensus Lasso 21/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Consensus LASSO 
Proposal 
Infer multiple networks by forcing them toward a consensual network: i.e., 
explicitly keeping the differences between conditions under control but 
with a L2 penalty (allow for more differences than Group-LASSO type 
penalties). 
Original optimization: 
max 
(
c 
jk )k,j;c=1;:::;C 
X 
c 
0BBBBBB@ 
logMLcj 
  
X 
k,j 
1CCCCCCA 
j
c 
jk j 
: 
[Ambroise et al., 2009, Chiquet et al., 2011]: is equivalent to minimize 
p problems having dimension k(p  1): 
1 
2
Tj 
b 
njnj
j +
Tj 
b 
jnj + k
jkL1 ;
j = (
1j 
; : : : ;
kj 
) 
withb 
njnj : block diagonal matrix Diag 
 
b 
1 
njnj ; : : : ;b 
k 
njnj 
 
and similarly forb 
jnj . 
Nathalie Villa-Vialaneix | Consensus Lasso 21/36
Short overview on biological background Network inference and GGM Inference with multiple samples Simulations 
Consensus LASSO 
Proposal 
Infer multiple networks by forcing them toward a consensual network: i.e., 
explicitly keeping the differences between conditions under control but 
with a L2 penalty (allow for more differences than Group-LASSO type 
penalties). 
Add a constraint to force inference toward a “consensus”
cons 
1 
2
Tj 
b 
njnj
j +
Tj b 
jnj + k

Contenu connexe

Tendances

A paradox of importance in network epidemiology
A paradox of importance in network epidemiologyA paradox of importance in network epidemiology
A paradox of importance in network epidemiologyPetter Holme
 
Inference in HMM and Bayesian Models
Inference in HMM and Bayesian ModelsInference in HMM and Bayesian Models
Inference in HMM and Bayesian ModelsMinakshi Atre
 
Spreading processes on temporal networks
Spreading processes on temporal networksSpreading processes on temporal networks
Spreading processes on temporal networksPetter Holme
 
Boosting probabilistic graphical model inference by incorporating prior knowl...
Boosting probabilistic graphical model inference by incorporating prior knowl...Boosting probabilistic graphical model inference by incorporating prior knowl...
Boosting probabilistic graphical model inference by incorporating prior knowl...Hakky St
 
An information-theoretic, all-scales approach to comparing networks
An information-theoretic, all-scales approach to comparing networksAn information-theoretic, all-scales approach to comparing networks
An information-theoretic, all-scales approach to comparing networksJim Bagrow
 
IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016 IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016 tsysglobalsolutions
 
CoCoWa A Collaborative Contact-Based
CoCoWa  A Collaborative Contact-BasedCoCoWa  A Collaborative Contact-Based
CoCoWa A Collaborative Contact-BasedBen Paul
 
Artificial neural networks and its application
Artificial neural networks and its applicationArtificial neural networks and its application
Artificial neural networks and its applicationHưng Đặng
 
CEHS 2016 Poster
CEHS 2016 PosterCEHS 2016 Poster
CEHS 2016 PosterEric Ma
 
Important spreaders in networks: Exact results for small graphs
Important spreaders in networks: Exact results for small graphsImportant spreaders in networks: Exact results for small graphs
Important spreaders in networks: Exact results for small graphsPetter Holme
 
Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...Martin Pelikan
 
RFNM-Aranda-Final.PDF
RFNM-Aranda-Final.PDFRFNM-Aranda-Final.PDF
RFNM-Aranda-Final.PDFThomas Aranda
 

Tendances (14)

Temporal networks - Alain Barrat
Temporal networks - Alain BarratTemporal networks - Alain Barrat
Temporal networks - Alain Barrat
 
A paradox of importance in network epidemiology
A paradox of importance in network epidemiologyA paradox of importance in network epidemiology
A paradox of importance in network epidemiology
 
Inference in HMM and Bayesian Models
Inference in HMM and Bayesian ModelsInference in HMM and Bayesian Models
Inference in HMM and Bayesian Models
 
Spreading processes on temporal networks
Spreading processes on temporal networksSpreading processes on temporal networks
Spreading processes on temporal networks
 
Boosting probabilistic graphical model inference by incorporating prior knowl...
Boosting probabilistic graphical model inference by incorporating prior knowl...Boosting probabilistic graphical model inference by incorporating prior knowl...
Boosting probabilistic graphical model inference by incorporating prior knowl...
 
An information-theoretic, all-scales approach to comparing networks
An information-theoretic, all-scales approach to comparing networksAn information-theoretic, all-scales approach to comparing networks
An information-theoretic, all-scales approach to comparing networks
 
IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016 IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016
 
CoCoWa A Collaborative Contact-Based
CoCoWa  A Collaborative Contact-BasedCoCoWa  A Collaborative Contact-Based
CoCoWa A Collaborative Contact-Based
 
Artificial neural networks and its application
Artificial neural networks and its applicationArtificial neural networks and its application
Artificial neural networks and its application
 
CEHS 2016 Poster
CEHS 2016 PosterCEHS 2016 Poster
CEHS 2016 Poster
 
Important spreaders in networks: Exact results for small graphs
Important spreaders in networks: Exact results for small graphsImportant spreaders in networks: Exact results for small graphs
Important spreaders in networks: Exact results for small graphs
 
Ijciet 10 01_153-2
Ijciet 10 01_153-2Ijciet 10 01_153-2
Ijciet 10 01_153-2
 
Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...
 
RFNM-Aranda-Final.PDF
RFNM-Aranda-Final.PDFRFNM-Aranda-Final.PDF
RFNM-Aranda-Final.PDF
 

En vedette

Interpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional dataInterpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional datatuxette
 
Slides Lycée Jules Fil 2014
Slides Lycée Jules Fil 2014Slides Lycée Jules Fil 2014
Slides Lycée Jules Fil 2014tuxette
 
Visualiser et fouiller des réseaux - Méthodes et exemples dans R
Visualiser et fouiller des réseaux - Méthodes et exemples dans RVisualiser et fouiller des réseaux - Méthodes et exemples dans R
Visualiser et fouiller des réseaux - Méthodes et exemples dans Rtuxette
 
Graph mining 2: Statistical approaches for graph mining
Graph mining 2: Statistical approaches for graph miningGraph mining 2: Statistical approaches for graph mining
Graph mining 2: Statistical approaches for graph miningtuxette
 
Random Forest for Big Data
Random Forest for Big DataRandom Forest for Big Data
Random Forest for Big Datatuxette
 
A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learningtuxette
 
Multiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetsMultiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetstuxette
 

En vedette (7)

Interpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional dataInterpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional data
 
Slides Lycée Jules Fil 2014
Slides Lycée Jules Fil 2014Slides Lycée Jules Fil 2014
Slides Lycée Jules Fil 2014
 
Visualiser et fouiller des réseaux - Méthodes et exemples dans R
Visualiser et fouiller des réseaux - Méthodes et exemples dans RVisualiser et fouiller des réseaux - Méthodes et exemples dans R
Visualiser et fouiller des réseaux - Méthodes et exemples dans R
 
Graph mining 2: Statistical approaches for graph mining
Graph mining 2: Statistical approaches for graph miningGraph mining 2: Statistical approaches for graph mining
Graph mining 2: Statistical approaches for graph mining
 
Random Forest for Big Data
Random Forest for Big DataRandom Forest for Big Data
Random Forest for Big Data
 
A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learning
 
Multiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetsMultiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasets
 

Similaire à Inferring networks from multiple samples with consensus LASSO

Consensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samplesConsensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samplestuxette
 
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]Community Finding with Applications on Phylogenetic Networks [Extended Abstract]
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]Luís Rita
 
American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1Double Check ĆŐNSULTING
 
Next generation sequencing by Muhammad Abbas
Next generation sequencing by Muhammad AbbasNext generation sequencing by Muhammad Abbas
Next generation sequencing by Muhammad AbbasMuhammadAbbaskhan9
 
NetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore LoguercioNetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore LoguercioAlexander Pico
 
Integrating phylogenetic inference and metadata visualization for NGS data
Integrating phylogenetic inference and metadata visualization for NGS dataIntegrating phylogenetic inference and metadata visualization for NGS data
Integrating phylogenetic inference and metadata visualization for NGS dataJoão André Carriço
 
Deciphering cancer stem cells regulatory circuits through an interactome–regu...
Deciphering cancer stem cells regulatory circuits through an interactome–regu...Deciphering cancer stem cells regulatory circuits through an interactome–regu...
Deciphering cancer stem cells regulatory circuits through an interactome–regu...Claire Rioualen
 
Neuroinformatics conference 2012
Neuroinformatics conference 2012Neuroinformatics conference 2012
Neuroinformatics conference 2012Nicolas Le Novère
 
20080110 Genome exploration in A-T G-C space: an introduction to DNA walking
20080110 Genome exploration in A-T G-C space: an introduction to DNA walking20080110 Genome exploration in A-T G-C space: an introduction to DNA walking
20080110 Genome exploration in A-T G-C space: an introduction to DNA walkingJonathan Blakes
 
Bioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxBioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxxRowlet
 
Complementing Computation with Visualization in Genomics
Complementing Computation with Visualization in GenomicsComplementing Computation with Visualization in Genomics
Complementing Computation with Visualization in GenomicsFrancis Rowland
 
STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...Lars Juhl Jensen
 
NetBioSIG2012 chrisevelo
NetBioSIG2012 chriseveloNetBioSIG2012 chrisevelo
NetBioSIG2012 chriseveloAlexander Pico
 
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015Torsten Seemann
 

Similaire à Inferring networks from multiple samples with consensus LASSO (20)

Consensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samplesConsensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samples
 
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]Community Finding with Applications on Phylogenetic Networks [Extended Abstract]
Community Finding with Applications on Phylogenetic Networks [Extended Abstract]
 
American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1
 
Next generation sequencing by Muhammad Abbas
Next generation sequencing by Muhammad AbbasNext generation sequencing by Muhammad Abbas
Next generation sequencing by Muhammad Abbas
 
NetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore LoguercioNetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore Loguercio
 
Integrating phylogenetic inference and metadata visualization for NGS data
Integrating phylogenetic inference and metadata visualization for NGS dataIntegrating phylogenetic inference and metadata visualization for NGS data
Integrating phylogenetic inference and metadata visualization for NGS data
 
presentation
presentationpresentation
presentation
 
Deciphering cancer stem cells regulatory circuits through an interactome–regu...
Deciphering cancer stem cells regulatory circuits through an interactome–regu...Deciphering cancer stem cells regulatory circuits through an interactome–regu...
Deciphering cancer stem cells regulatory circuits through an interactome–regu...
 
Neuroinformatics conference 2012
Neuroinformatics conference 2012Neuroinformatics conference 2012
Neuroinformatics conference 2012
 
Basen Network
Basen NetworkBasen Network
Basen Network
 
New generation Sequencing
New generation Sequencing New generation Sequencing
New generation Sequencing
 
20080110 Genome exploration in A-T G-C space: an introduction to DNA walking
20080110 Genome exploration in A-T G-C space: an introduction to DNA walking20080110 Genome exploration in A-T G-C space: an introduction to DNA walking
20080110 Genome exploration in A-T G-C space: an introduction to DNA walking
 
Biological Network Inference via Gaussian Graphical Models
Biological Network Inference via Gaussian Graphical ModelsBiological Network Inference via Gaussian Graphical Models
Biological Network Inference via Gaussian Graphical Models
 
RapportHicham
RapportHichamRapportHicham
RapportHicham
 
Bioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxBioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptx
 
Complementing Computation with Visualization in Genomics
Complementing Computation with Visualization in GenomicsComplementing Computation with Visualization in Genomics
Complementing Computation with Visualization in Genomics
 
STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...STRING - Modeling of pathways through cross-species integration of large-scal...
STRING - Modeling of pathways through cross-species integration of large-scal...
 
PhD midterm report
PhD midterm reportPhD midterm report
PhD midterm report
 
NetBioSIG2012 chrisevelo
NetBioSIG2012 chriseveloNetBioSIG2012 chrisevelo
NetBioSIG2012 chrisevelo
 
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
WGS in public health microbiology - MDU/VIDRL Seminar - wed 17 jun 2015
 

Plus de tuxette

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathstuxette
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènestuxette
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquestuxette
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-Ctuxette
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?tuxette
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...tuxette
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquestuxette
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeantuxette
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...tuxette
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquestuxette
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...tuxette
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...tuxette
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation datatuxette
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?tuxette
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysistuxette
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricestuxette
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Predictiontuxette
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelstuxette
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random foresttuxette
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICStuxette
 

Plus de tuxette (20)

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en maths
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènes
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiques
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-C
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiques
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWean
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation data
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysis
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatrices
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Prediction
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction models
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random forest
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 

Dernier

Substances in Common Use for Shahu College Screening Test
Substances in Common Use for Shahu College Screening TestSubstances in Common Use for Shahu College Screening Test
Substances in Common Use for Shahu College Screening TestAkashDTejwani
 
Pests of tenai_Identification,Binomics_Dr.UPR
Pests of tenai_Identification,Binomics_Dr.UPRPests of tenai_Identification,Binomics_Dr.UPR
Pests of tenai_Identification,Binomics_Dr.UPRPirithiRaju
 
Gene transfer in plants agrobacterium.pdf
Gene transfer in plants agrobacterium.pdfGene transfer in plants agrobacterium.pdf
Gene transfer in plants agrobacterium.pdfNetHelix
 
Pests of Redgram_Identification, Binomics_Dr.UPR
Pests of Redgram_Identification, Binomics_Dr.UPRPests of Redgram_Identification, Binomics_Dr.UPR
Pests of Redgram_Identification, Binomics_Dr.UPRPirithiRaju
 
Lehninger_Chapter 17_Fatty acid Oxid.ppt
Lehninger_Chapter 17_Fatty acid Oxid.pptLehninger_Chapter 17_Fatty acid Oxid.ppt
Lehninger_Chapter 17_Fatty acid Oxid.pptSachin Teotia
 
Pests of cumbu_Identification, Binomics, Integrated ManagementDr.UPR.pdf
Pests of cumbu_Identification, Binomics, Integrated ManagementDr.UPR.pdfPests of cumbu_Identification, Binomics, Integrated ManagementDr.UPR.pdf
Pests of cumbu_Identification, Binomics, Integrated ManagementDr.UPR.pdfPirithiRaju
 
Contracts with Interdependent Preferences (2)
Contracts with Interdependent Preferences (2)Contracts with Interdependent Preferences (2)
Contracts with Interdependent Preferences (2)GRAPE
 
Application of Foraminiferal Ecology- Rahul.pptx
Application of Foraminiferal Ecology- Rahul.pptxApplication of Foraminiferal Ecology- Rahul.pptx
Application of Foraminiferal Ecology- Rahul.pptxRahulVishwakarma71547
 
KeyBio pipeline for bioinformatics and data science
KeyBio pipeline for bioinformatics and data scienceKeyBio pipeline for bioinformatics and data science
KeyBio pipeline for bioinformatics and data scienceLayne Sadler
 
Controlling Parameters of Carbonate platform Environment
Controlling Parameters of Carbonate platform EnvironmentControlling Parameters of Carbonate platform Environment
Controlling Parameters of Carbonate platform EnvironmentRahulVishwakarma71547
 
Krishi Vigyan Kendras - कृषि विज्ञान केंद्र
Krishi Vigyan Kendras - कृषि विज्ञान केंद्रKrishi Vigyan Kendras - कृषि विज्ञान केंद्र
Krishi Vigyan Kendras - कृषि विज्ञान केंद्रKrashi Coaching
 
MARKER ASSISTED SELECTION IN CROP IMPROVEMENT
MARKER ASSISTED SELECTION IN CROP IMPROVEMENTMARKER ASSISTED SELECTION IN CROP IMPROVEMENT
MARKER ASSISTED SELECTION IN CROP IMPROVEMENTjipexe1248
 
soft skills question paper set for bba ca
soft skills question paper set for bba casoft skills question paper set for bba ca
soft skills question paper set for bba caohsadfeeling
 
Human brain.. It's parts and function.
Human brain.. It's parts and function. Human brain.. It's parts and function.
Human brain.. It's parts and function. MUKTA MANJARI SAHOO
 
Exploration Method’s in Archaeological Studies & Research
Exploration Method’s in Archaeological Studies & ResearchExploration Method’s in Archaeological Studies & Research
Exploration Method’s in Archaeological Studies & ResearchPrachya Adhyayan
 
Principles & Formulation of Hair Care Products
Principles & Formulation of Hair Care  ProductsPrinciples & Formulation of Hair Care  Products
Principles & Formulation of Hair Care Productspurwaborkar@gmail.com
 
Identification of Superclusters and Their Properties in the Sloan Digital Sky...
Identification of Superclusters and Their Properties in the Sloan Digital Sky...Identification of Superclusters and Their Properties in the Sloan Digital Sky...
Identification of Superclusters and Their Properties in the Sloan Digital Sky...Sérgio Sacani
 
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...PirithiRaju
 
Pests of wheat_Identification, Bionomics, Damage symptoms, IPM_Dr.UPR.pdf
Pests of wheat_Identification, Bionomics, Damage symptoms, IPM_Dr.UPR.pdfPests of wheat_Identification, Bionomics, Damage symptoms, IPM_Dr.UPR.pdf
Pests of wheat_Identification, Bionomics, Damage symptoms, IPM_Dr.UPR.pdfPirithiRaju
 
Physics Serway Jewett 6th edition for Scientists and Engineers
Physics Serway Jewett 6th edition for Scientists and EngineersPhysics Serway Jewett 6th edition for Scientists and Engineers
Physics Serway Jewett 6th edition for Scientists and EngineersAndreaLucarelli
 

Dernier (20)

Substances in Common Use for Shahu College Screening Test
Substances in Common Use for Shahu College Screening TestSubstances in Common Use for Shahu College Screening Test
Substances in Common Use for Shahu College Screening Test
 
Pests of tenai_Identification,Binomics_Dr.UPR
Pests of tenai_Identification,Binomics_Dr.UPRPests of tenai_Identification,Binomics_Dr.UPR
Pests of tenai_Identification,Binomics_Dr.UPR
 
Gene transfer in plants agrobacterium.pdf
Gene transfer in plants agrobacterium.pdfGene transfer in plants agrobacterium.pdf
Gene transfer in plants agrobacterium.pdf
 
Pests of Redgram_Identification, Binomics_Dr.UPR
Pests of Redgram_Identification, Binomics_Dr.UPRPests of Redgram_Identification, Binomics_Dr.UPR
Pests of Redgram_Identification, Binomics_Dr.UPR
 
Lehninger_Chapter 17_Fatty acid Oxid.ppt
Lehninger_Chapter 17_Fatty acid Oxid.pptLehninger_Chapter 17_Fatty acid Oxid.ppt
Lehninger_Chapter 17_Fatty acid Oxid.ppt
 
Pests of cumbu_Identification, Binomics, Integrated ManagementDr.UPR.pdf
Pests of cumbu_Identification, Binomics, Integrated ManagementDr.UPR.pdfPests of cumbu_Identification, Binomics, Integrated ManagementDr.UPR.pdf
Pests of cumbu_Identification, Binomics, Integrated ManagementDr.UPR.pdf
 
Contracts with Interdependent Preferences (2)
Contracts with Interdependent Preferences (2)Contracts with Interdependent Preferences (2)
Contracts with Interdependent Preferences (2)
 
Application of Foraminiferal Ecology- Rahul.pptx
Application of Foraminiferal Ecology- Rahul.pptxApplication of Foraminiferal Ecology- Rahul.pptx
Application of Foraminiferal Ecology- Rahul.pptx
 
KeyBio pipeline for bioinformatics and data science
KeyBio pipeline for bioinformatics and data scienceKeyBio pipeline for bioinformatics and data science
KeyBio pipeline for bioinformatics and data science
 
Controlling Parameters of Carbonate platform Environment
Controlling Parameters of Carbonate platform EnvironmentControlling Parameters of Carbonate platform Environment
Controlling Parameters of Carbonate platform Environment
 
Krishi Vigyan Kendras - कृषि विज्ञान केंद्र
Krishi Vigyan Kendras - कृषि विज्ञान केंद्रKrishi Vigyan Kendras - कृषि विज्ञान केंद्र
Krishi Vigyan Kendras - कृषि विज्ञान केंद्र
 
MARKER ASSISTED SELECTION IN CROP IMPROVEMENT
MARKER ASSISTED SELECTION IN CROP IMPROVEMENTMARKER ASSISTED SELECTION IN CROP IMPROVEMENT
MARKER ASSISTED SELECTION IN CROP IMPROVEMENT
 
soft skills question paper set for bba ca
soft skills question paper set for bba casoft skills question paper set for bba ca
soft skills question paper set for bba ca
 
Human brain.. It's parts and function.
Human brain.. It's parts and function. Human brain.. It's parts and function.
Human brain.. It's parts and function.
 
Exploration Method’s in Archaeological Studies & Research
Exploration Method’s in Archaeological Studies & ResearchExploration Method’s in Archaeological Studies & Research
Exploration Method’s in Archaeological Studies & Research
 
Principles & Formulation of Hair Care Products
Principles & Formulation of Hair Care  ProductsPrinciples & Formulation of Hair Care  Products
Principles & Formulation of Hair Care Products
 
Identification of Superclusters and Their Properties in the Sloan Digital Sky...
Identification of Superclusters and Their Properties in the Sloan Digital Sky...Identification of Superclusters and Their Properties in the Sloan Digital Sky...
Identification of Superclusters and Their Properties in the Sloan Digital Sky...
 
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...
 
Pests of wheat_Identification, Bionomics, Damage symptoms, IPM_Dr.UPR.pdf
Pests of wheat_Identification, Bionomics, Damage symptoms, IPM_Dr.UPR.pdfPests of wheat_Identification, Bionomics, Damage symptoms, IPM_Dr.UPR.pdf
Pests of wheat_Identification, Bionomics, Damage symptoms, IPM_Dr.UPR.pdf
 
Physics Serway Jewett 6th edition for Scientists and Engineers
Physics Serway Jewett 6th edition for Scientists and EngineersPhysics Serway Jewett 6th edition for Scientists and Engineers
Physics Serway Jewett 6th edition for Scientists and Engineers
 

Inferring networks from multiple samples with consensus LASSO

  • 1. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Joint network inference with the consensual LASSO Nathalie Villa-Vialaneix Joint work with Matthieu Vignes, Nathalie Viguerie and Magali San Cristobal Séminaire de Statistique de Montpellier 6 octobre 2014 http://www.nathalievilla.org nathalie.villa@toulouse.inra.fr Nathalie Villa-Vialaneix | Consensus Lasso 1/36
  • 2. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Outline 1 Short overview on biological background 2 Network inference and GGM 3 Inference with multiple samples 4 Simulations Nathalie Villa-Vialaneix | Consensus Lasso 2/36
  • 3. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations DNA DNA (DeoxyriboNucleic Acid (DNA) molecule that encodes the genetic instructions used in the development and functioning of all known living organisms and many viruses double helix made with only four nucleotides (Adenine, Cytosine, Thymine, Guanine): A binds with T and C with G. Nathalie Villa-Vialaneix | Consensus Lasso 3/36
  • 4. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Transcription Transcription of DNA transcription is the process by which a particular segment of DNA (called gene) is copied into a single-strand RNA (which is called message RNA) A is transcripted into U, T into A, C into G and G into C Nathalie Villa-Vialaneix | Consensus Lasso 4/36
  • 5. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations What is mRNA used for? mRNA then moves out of the cell nucleus and is translated into proteins which are made of 20 different amino-acids (using an alphabet: 3 letters of mRNA ! 1 amino-acid) Proteins are used by the cell to function. Nathalie Villa-Vialaneix | Consensus Lasso 5/36
  • 6. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Gene expression In a given cell, at a given time, all the genes are not translated or not translated at the same level. Gene expression is a process by which a given gene is transcripted or/and traducted It depends on the type of cell, the environment, ... Nathalie Villa-Vialaneix | Consensus Lasso 6/36
  • 7. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Gene expression In a given cell, at a given time, all the genes are not translated or not translated at the same level. Gene expression is a process by which a given gene is transcripted or/and traducted It depends on the type of cell, the environment, ... Gene expression can be measured either by: “counting” the number of copies (mRNA) of a given gene in the cell: transcriptomic data; “counting” the quantity of a given protein in the cell: proteomic data. Even though a given mRNA is translated into a unique protein, there is no simple relationship between transcriptomic and proteomic data. Nathalie Villa-Vialaneix | Consensus Lasso 6/36
  • 8. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Transcriptomic data How are transcriptomic data obtained? DNA spots associated to target genes (probes) are attached to a solid surface (array) Nathalie Villa-Vialaneix | Consensus Lasso 7/36
  • 9. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Transcriptomic data How are transcriptomic data obtained? RNA material extracting for cells of interest (blood, lipid tissue, muscle, urine...) are labeled fluorescently and applied on the array When the binding between mRNA and the probes is good, the spot becomes fluorescent. Expression of a given gene is quantified by the intensity of the fluorescent signal (read by a scanner). Nathalie Villa-Vialaneix | Consensus Lasso 7/36
  • 10. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Typical microarray data Data: large scale gene expression data individuals n ' 30=50 8>>><>>>: X = 0BBBBBBBB@ : : : : : : : : Xj i : : : : : : : : : 1CCCCCCCCA | {z } variables (genes expression); p'103=4 Nathalie Villa-Vialaneix | Consensus Lasso 8/36
  • 11. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Typical microarray data Data: large scale gene expression data individuals n ' 30=50 8>>><>>>: X = 0BBBBBBBB@ : : : : : : : : Xj i : : : : : : : : : 1CCCCCCCCA | {z } variables (genes expression); p'103=4 Typical design: two (or more) conditions (treated/control for instance) More and more complicated designs: crossed conditions (treated/control and different breeds), longitudinal data... Nathalie Villa-Vialaneix | Consensus Lasso 8/36
  • 12. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Typical microarray data Data: large scale gene expression data individuals n ' 30=50 8>>><>>>: X = 0BBBBBBBB@ : : : : : : : : Xj i : : : : : : : : : 1CCCCCCCCA | {z } variables (genes expression); p'103=4 Typical design: two (or more) conditions (treated/control for instance) More and more complicated designs: crossed conditions (treated/control and different breeds), longitudinal data... Typical issues: find differentially expressed genes (i.e., genes whose expression is significantly different between the conditions) Nathalie Villa-Vialaneix | Consensus Lasso 8/36
  • 13. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Outline 1 Short overview on biological background 2 Network inference and GGM 3 Inference with multiple samples 4 Simulations Nathalie Villa-Vialaneix | Consensus Lasso 9/36
  • 14. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Systems biology Instead of being used to produce proteins, some genes’ expressions activate or repress other genes’ expressions ) understanding the whole cascade helps to comprehend the global functioning of living organisms1 1Picture taken from: Abdollahi A et al., PNAS 2007, 104:12890-12895. c 2007 by National Academy of Sciences Nathalie Villa-Vialaneix | Consensus Lasso 10/36
  • 15. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Model framework Data: large scale gene expression data individuals n ' 30=50 8>>><>>>:X = 0BBBBBBBB@ : : : : : : : : Xj i : : : : : : : : : 1CCCCCCCCA | {z } variables (genes expression); p'103=4 What we want to obtain: a graph/network with nodes: (selected) genes; edges: strong links between gene expressions. Nathalie Villa-Vialaneix | Consensus Lasso 11/36
  • 16. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Advantages of network inference 1 over raw data: focuses on the strongest direct relationships: irrelevant or indirect relations are removed (more robust) and the data are easier to visualize and understand (track transcription relations). Nathalie Villa-Vialaneix | Consensus Lasso 12/36
  • 17. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Advantages of network inference 1 over raw data: focuses on the strongest direct relationships: irrelevant or indirect relations are removed (more robust) and the data are easier to visualize and understand (track transcription relations). Expression data are analyzed all together and not by pairs (systems model). Nathalie Villa-Vialaneix | Consensus Lasso 12/36
  • 18. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Advantages of network inference 1 over raw data: focuses on the strongest direct relationships: irrelevant or indirect relations are removed (more robust) and the data are easier to visualize and understand (track transcription relations). Expression data are analyzed all together and not by pairs (systems model). 2 over bibliographic network: can handle interactions with yet unknown (not annotated) genes and deal with data collected in a particular condition. Nathalie Villa-Vialaneix | Consensus Lasso 12/36
  • 19. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Using correlations Relevance network [Butte and Kohane, 1999] First (naive) approach: calculate correlations between expressions for all pairs of genes, threshold the smallest ones and build the network. Correlations Thresholding Graph Nathalie Villa-Vialaneix | Consensus Lasso 13/36
  • 20. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Using partial correlations x y z strong indirect correlation Nathalie Villa-Vialaneix | Consensus Lasso 14/36
  • 21. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Using partial correlations x y z strong indirect correlation set.seed(2807); x <- rnorm(100) y <- 2*x+1+rnorm(100,0,0.1); cor(x,y) [1] 0.998826 z <- 2*x+1+rnorm(100,0,0.1); cor(x,z) [1] 0.998751 cor(y,z) [1] 0.9971105 Nathalie Villa-Vialaneix | Consensus Lasso 14/36
  • 22. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Using partial correlations x y z strong indirect correlation set.seed(2807); x <- rnorm(100) y <- 2*x+1+rnorm(100,0,0.1); cor(x,y) [1] 0.998826 z <- 2*x+1+rnorm(100,0,0.1); cor(x,z) [1] 0.998751 cor(y,z) [1] 0.9971105 ] Partial correlation cor(lm(xz)$residuals,lm(yz)$residuals) [1] 0.7801174 cor(lm(xy)$residuals,lm(zy)$residuals) [1] 0.7639094 cor(lm(yx)$residuals,lm(zx)$residuals) [1] -0.1933699 Nathalie Villa-Vialaneix | Consensus Lasso 14/36
  • 23. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Partial correlation and GGM (Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene expression); then j ! j0(genes j and j0 are linked) , Cor Xj ; Xj0 j(Xk )k,j;j0 , 0 Nathalie Villa-Vialaneix | Consensus Lasso 15/36
  • 24. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Partial correlation and GGM (Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene expression); then j ! j0(genes j and j0 are linked) , Cor Xj ; Xj0 j(Xk )k,j;j0 , 0 If (concentration matrix) S = 1, Cor Xj ; Xj0 j(Xk )k,j;j0 = Sjj0 p SjjSj0 j0 ) Estimate 1 to unravel the graph structure Nathalie Villa-Vialaneix | Consensus Lasso 15/36
  • 25. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Partial correlation and GGM (Xi)i=1;:::;n are i.i.d. Gaussian random variables N(0; ) (gene expression); then j ! j0(genes j and j0 are linked) , Cor Xj ; Xj0 j(Xk )k,j;j0 , 0 If (concentration matrix) S = 1, Cor Xj ; Xj0 j(Xk )k,j;j0 = Sjj0 p SjjSj0 j0 ) Estimate 1 to unravel the graph structure Problem: : p-dimensional matrix and n p ) (b n)1 is a poor estimate of S)! Nathalie Villa-Vialaneix | Consensus Lasso 15/36
  • 26. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Various approaches for inferring networks with GGM Graphical Gaussian Model seminal work: [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance) estimate 1 by (b n + I)1 use a Bayesian test to test which coefficients are significantly non zero. Nathalie Villa-Vialaneix | Consensus Lasso 16/36
  • 27. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Various approaches for inferring networks with GGM Graphical Gaussian Model seminal work: [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance) estimate 1 by (b n + I)1 use a Bayesian test to test which coefficients are significantly non zero. relation with LM 8 j, estimate the linear model: Xj =
  • 28. Tj Xj + ; arg max (
  • 29. jj0 )j0 (logMLj) because
  • 30. jj0 = Sjj0 Sjj . Nathalie Villa-Vialaneix | Consensus Lasso 16/36
  • 31. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Various approaches for inferring networks with GGM Graphical Gaussian Model seminal work: [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance) estimate 1 by (b n + I)1 use a Bayesian test to test which coefficients are significantly non zero. relation with LM 8 j, estimate the linear model: Xj =
  • 32. Tj Xj + ; arg min (
  • 33. jj0 )j0 Xn i=1 Xij
  • 34. Tj Xj i 2 because
  • 35. jj0 = Sjj0 Sjj . Nathalie Villa-Vialaneix | Consensus Lasso 16/36
  • 36. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Various approaches for inferring networks with GGM Graphical Gaussian Model seminal work: [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance) estimate 1 by (b n + I)1 use a Bayesian test to test which coefficients are significantly non zero. sparse approaches: [Meinshausen and Bühlmann, 2006, Friedman et al., 2008]: 8 j, estimate the linear model: Xj =
  • 37. Tj Xj + ; arg min (
  • 38. jj0 )j0 Xn i=1 Xij
  • 39. Tj Xj i 2 +k
  • 41. jkL1 = P j0 j
  • 42. jj0 j L1 penalty yields to
  • 43. jj0 = 0 for most j0 (variable selection) Nathalie Villa-Vialaneix | Consensus Lasso 16/36
  • 44. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Outline 1 Short overview on biological background 2 Network inference and GGM 3 Inference with multiple samples 4 Simulations Nathalie Villa-Vialaneix | Consensus Lasso 17/36
  • 45. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Motivation for multiple networks inference Pan-European project Diogenes2 (with Nathalie Viguerie, INSERM): gene expressions (lipid tissues) from 204 obese women before and after a low-calorie diet (LCD). Assumption: A common functioning exists regardless the condition; Which genes are linked independently from/depending on the condition? 2http://www.diogenes-eu.org/; see also [Viguerie et al., 2012] Nathalie Villa-Vialaneix | Consensus Lasso 18/36
  • 46. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Naive approach: independent estimations Notations: p genes measured in k samples, each corresponding to a specific condition: (Xc j )j=1;:::;p N(0;c), for c = 1; : : : ; k. For c = 1; : : : ; k, nc independent observations (Xc P ij )i=1;:::;nc and c nc = n. Independent inference Estimation 8 c = 1; : : : ; k and 8 j = 1; : : : ; p, Xc j = Xc nj
  • 47. cj + c j are estimated (independently) by maximizing pseudo-likelihood: L(SjX) = Xk c=1 Xp j=1 Xnc i=1 log P Xc ij jXci ;nj ; Sc j Nathalie Villa-Vialaneix | Consensus Lasso 19/36
  • 48. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Related papers Problem: previous estimation does not use the fact that the different networks should be somehow alike! Previous proposals [Chiquet et al., 2011] replace c bye c = 12 c + 12 and add a sparse penalty; Nathalie Villa-Vialaneix | Consensus Lasso 20/36
  • 49. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Related papers Problem: previous estimation does not use the fact that the different networks should be somehow alike! Previous proposals [Chiquet et al., 2011] replace c bye c = 12 c + 12 and add a sparse penalty; [Chiquet et al., 2011] LASSO and Group-LASSO type penalties to force identical or sign-coherent edges between conditions: X jj0 sX c (Sc jj0 )2 or X jj0 266666664 sX c (Sc jj0 )2 + + sX c (Sc jj0 )2 377777775 ) Sc jj0 = 0 8 c for most entries OR Sc jj0 can only be of a given sign (positive or negative) whatever c Nathalie Villa-Vialaneix | Consensus Lasso 20/36
  • 50. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Related papers Problem: previous estimation does not use the fact that the different networks should be somehow alike! Previous proposals [Chiquet et al., 2011] replace c bye c = 12 c + 12 and add a sparse penalty; [Chiquet et al., 2011] LASSO and Group-LASSO type penalties to force identical or sign-coherent edges between conditions [Danaher et al., 2013] add the penalty P c,c0 kSc Sc0 kL1 ) (sparse penalty over the concentration matrix entries forces to strong consistency between conditions); Nathalie Villa-Vialaneix | Consensus Lasso 20/36
  • 51. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Related papers Problem: previous estimation does not use the fact that the different networks should be somehow alike! Previous proposals [Chiquet et al., 2011] replace c bye c = 12 c + 12 and add a sparse penalty; [Chiquet et al., 2011] LASSO and Group-LASSO type penalties to force identical or sign-coherent edges between conditions [Danaher et al., 2013] add the penalty P c,c0 kSc Sc0 kL1 ) (sparse penalty over the concentration matrix entries forces to strong consistency between conditions); [PMohan P et al., 2012] add a group-LASSO like penalty c,c0 j kSc j Sc0 j kL2 that focuses on differences due to a few number of nodes only. Nathalie Villa-Vialaneix | Consensus Lasso 20/36
  • 52. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Consensus LASSO Proposal Infer multiple networks by forcing them toward a consensual network: i.e., explicitly keeping the differences between conditions under control but with a L2 penalty (allow for more differences than Group-LASSO type penalties). Original optimization: max (
  • 53. c jk )k,j;c=1;:::;C X c 0BBBBBB@ logMLcj X k,j 1CCCCCCA j
  • 54. c jk j : Nathalie Villa-Vialaneix | Consensus Lasso 21/36
  • 55. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Consensus LASSO Proposal Infer multiple networks by forcing them toward a consensual network: i.e., explicitly keeping the differences between conditions under control but with a L2 penalty (allow for more differences than Group-LASSO type penalties). Original optimization: max (
  • 56. c jk )k,j;c=1;:::;C X c 0BBBBBB@ logMLcj X k,j 1CCCCCCA j
  • 57. c jk j : [Ambroise et al., 2009, Chiquet et al., 2011]: is equivalent to minimize p problems having dimension k(p 1): 1 2
  • 59. j +
  • 60. Tj b jnj + k
  • 62. j = (
  • 63. 1j ; : : : ;
  • 64. kj ) withb njnj : block diagonal matrix Diag b 1 njnj ; : : : ;b k njnj and similarly forb jnj . Nathalie Villa-Vialaneix | Consensus Lasso 21/36
  • 65. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Consensus LASSO Proposal Infer multiple networks by forcing them toward a consensual network: i.e., explicitly keeping the differences between conditions under control but with a L2 penalty (allow for more differences than Group-LASSO type penalties). Add a constraint to force inference toward a “consensus”
  • 68. j +
  • 69. Tj b jnj + k
  • 70. jkL1 + X c wck
  • 71. cj
  • 72. cons j k2 L2 with: wc : real number used to weight the conditions; regularization parameter;
  • 73. cons j whatever you want...? Nathalie Villa-Vialaneix | Consensus Lasso 21/36
  • 74. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice of a consensus: set one... Typical case: a prior network is known (e.g., from bibliography); with no prior information, use a fixed prior corresponding to (e.g.) global inference ) given (and fixed)
  • 75. cons Nathalie Villa-Vialaneix | Consensus Lasso 22/36
  • 76. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice of a consensus: set one... Typical case: a prior network is known (e.g., from bibliography); with no prior information, use a fixed prior corresponding to (e.g.) global inference ) given (and fixed)
  • 78. cons j , the optimization problem is equivalent to minimizing the p following standard quadratic problem in Rk(p1) with L1-penalty: 1 2
  • 80. j +
  • 82. jkL1 ; where B1() =b njnj + 2Ik(p1), with Ik(p1) the k(p 1)-identity matrix B2() =b jnj 2Ik(p1)
  • 84. cons = (
  • 85. cons j )T ; : : : ; (
  • 86. cons j )T T Nathalie Villa-Vialaneix | Consensus Lasso 22/36
  • 87. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice of a consensus: adapt one during training... Derive the consensus from the condition-specific estimates:
  • 88. cons j = X c nc n
  • 89. cj Nathalie Villa-Vialaneix | Consensus Lasso 23/36
  • 90. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice of a consensus: adapt one during training... Derive the consensus from the condition-specific estimates:
  • 91. cons j = X c nc n
  • 93. cons j = Pkc =1 nc n
  • 94. cj , the optimization problem is equivalent to minimizing the following standard quadratic problem with L1-penalty: 1 2
  • 96. j +
  • 97. Tj b jnj + k
  • 98. jkL1 where Sj() =b njnj + 2AT ()A() where A() is a [k(p 1) k(p 1)]-matrix that does not depend on j. Nathalie Villa-Vialaneix | Consensus Lasso 23/36
  • 99. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Computational aspects: optimization 2j Tj 1j Tj 12 Common framework Objective function can be decomposed into: convex part C(
  • 100. j) =
  • 101. Q() +
  • 103. j) = k
  • 104. jkL1 Nathalie Villa-Vialaneix | Consensus Lasso 24/36
  • 105. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Computational aspects: optimization 2j Tj 1j Tj 12 Common framework Objective function can be decomposed into: convex part C(
  • 106. j) =
  • 107. Q() +
  • 109. j) = k
  • 110. jkL1 optimization by “active set” [Osborne et al., 2000, Chiquet et al., 2011] 1: repeat( given) 2: Given A and
  • 112. jj0 , 0, 8 j0 2 A, solve (over h) the smooth minimization problem restricted to A C(
  • 113. j + h) + P(
  • 114. j + h) )
  • 115. j
  • 116. j + h 4: until Nathalie Villa-Vialaneix | Consensus Lasso 24/36
  • 117. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Computational aspects: optimization 2j Tj 1j Tj 12 Common framework Objective function can be decomposed into: convex part C(
  • 118. j) =
  • 119. Q() +
  • 121. j) = k
  • 122. jkL1 optimization by “active set” [Osborne et al., 2000, Chiquet et al., 2011] 1: repeat( given) 2: Given A and
  • 124. jj0 , 0, 8 j0 2 A, solve (over h) the smooth minimization problem restricted to A C(
  • 125. j + h) + P(
  • 126. j + h) )
  • 127. j
  • 128. j + h 3: Update A by adding most violating variables, i.e., variables st: abs
  • 134. @C(
  • 136. j) 0 with [@P(
  • 137. j )]j0 2 [1; 1] if j0 A 4: until Nathalie Villa-Vialaneix | Consensus Lasso 24/36
  • 138. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Computational aspects: optimization 2j Tj 1j Tj 12 Common framework Objective function can be decomposed into: convex part C(
  • 139. j) =
  • 140. Q() +
  • 142. j) = k
  • 143. jkL1 optimization by “active set” [Osborne et al., 2000, Chiquet et al., 2011] 1: repeat( given) 2: Given A and
  • 145. jj0 , 0, 8 j0 2 A, solve (over h) the smooth minimization problem restricted to A C(
  • 146. j + h) + P(
  • 147. j + h) )
  • 148. j
  • 149. j + h 3: Update A by adding most violating variables, i.e., variables st: abs
  • 155. @C(
  • 157. j) 0 with [@P(
  • 158. j )]j0 2 [1; 1] if j0 A 4: until all first order conditions satisfied Nathalie Villa-Vialaneix | Consensus Lasso 24/36
  • 159. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Computational aspects: optimization 2j Tj 1j Tj 12 Common framework Objective function can be decomposed into: convex part C(
  • 160. j) =
  • 161. Q() +
  • 163. j) = k
  • 164. jkL1 optimization by “active set” [Osborne et al., 2000, Chiquet et al., 2011] 1: repeat( given) 2: Given A and
  • 166. jj0 , 0, 8 j0 2 A, solve (over h) the smooth minimization problem restricted to A C(
  • 167. j + h) + P(
  • 168. j + h) )
  • 169. j
  • 170. j + h 3: Update A by adding most violating variables, i.e., variables st: abs
  • 176. @C(
  • 178. j) 0 with [@P(
  • 179. j )]j0 2 [1; 1] if j0 A 4: until all first order conditions satisfied Repeat: large # small using previous
  • 180. as prior Nathalie Villa-Vialaneix | Consensus Lasso 24/36
  • 181. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Bootstrap estimation ' BOLASSO subsample n observations with replacement x x x x x x x x Nathalie Villa-Vialaneix | Consensus Lasso 25/36
  • 182. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Bootstrap estimation ' BOLASSO subsample n observations with replacement x x x x x x x x cLasso estimation ! for varying , (
  • 183. ;b jj0 )jj0 Nathalie Villa-Vialaneix | Consensus Lasso 25/36
  • 184. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Bootstrap estimation ' BOLASSO subsample n observations with replacement x x x x x x x x cLasso estimation ! for varying , (
  • 185. ;b jj0 )jj0 # threshold keep the first T1 largest coefficients Nathalie Villa-Vialaneix | Consensus Lasso 25/36
  • 186. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Bootstrap estimation ' BOLASSO subsample n observations with replacement x x x x x x x x cLasso estimation ! for varying , (
  • 187. ;b jj0 )jj0 # threshold keep the first T1 largest coefficients B Frequency table (1; 2) (1; 3) ... (j; j0) ... 130 25 ... 120 ... Nathalie Villa-Vialaneix | Consensus Lasso 25/36
  • 188. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Bootstrap estimation ' BOLASSO subsample n observations with replacement x x x x x x x x cLasso estimation ! for varying , (
  • 189. ;b jj0 )jj0 # threshold keep the first T1 largest coefficients B Frequency table (1; 2) (1; 3) ... (j; j0) ... 130 25 ... 120 ... ! Keep the T2 most frequent pairs Nathalie Villa-Vialaneix | Consensus Lasso 25/36
  • 190. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Outline 1 Short overview on biological background 2 Network inference and GGM 3 Inference with multiple samples 4 Simulations Nathalie Villa-Vialaneix | Consensus Lasso 26/36
  • 191. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Simulated data Expression data with known co-expression network original network (scale free) taken from http://www.comp-sys-bio.org/AGN/data.html (100 nodes, 200 edges, loops removed); rewire a ratio r of the edges to generate k “children” networks (sharing approximately 100(1 2r)% of their edges); Nathalie Villa-Vialaneix | Consensus Lasso 27/36
  • 192. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Simulated data Expression data with known co-expression network original network (scale free) taken from http://www.comp-sys-bio.org/AGN/data.html (100 nodes, 200 edges, loops removed); rewire a ratio r of the edges to generate k “children” networks (sharing approximately 100(1 2r)% of their edges); generate “expression data” with a random Gaussian process from each chid: use the Laplacian of the graph to generate a putative concentration matrix; use edge colors in the original network to set the edge sign; correct the obtained matrix to make it positive; invert to obtain a covariance matrix...; ... which is used in a random Gaussian process to generate expression data (with noise). Nathalie Villa-Vialaneix | Consensus Lasso 27/36
  • 193. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations An example with k = 2, r = 5% mother network3 first child second child 3actually the parent network. My co-author wisely noted that the mistake was unforgivable for a feminist... Nathalie Villa-Vialaneix | Consensus Lasso 28/36
  • 194. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice for T2 Data: r = 0:05, k = 2 and n1 = n2 = 20 100 bootstrap samples, = 1, T1 = 250 or 500 l l 30 selections at least 30 selections at least 1.00 0.75 0.50 0.25 0.00 0.00 0.25 0.50 0.75 precision recall 43 selections at least l l 40 selections at least 1.00 0.75 0.50 0.25 0.00 0.00 0.25 0.50 0.75 1.00 precision recall Dots correspond to best F = 2 precisionrecall precision+recall ) Best F corresponds to selecting a number of edges approximately equal to the number of edges in the original network. Nathalie Villa-Vialaneix | Consensus Lasso 29/36
  • 195. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Choice for T1 and T1 % of improvement 0.1/1 f250; 300; 500g of bootstrapping network sizes rewired edges: 5% 20-20 1 500 30.69 20-30 0.1 500 11.87 30-30 1 300 20.15 50-50 1 300 14.36 20-20-20-20-20 1 500 86.04 30-30-30-30 0.1 500 42.67 network sizes rewired edges: 20% 20-20 0.1 300 -17.86 20-30 0.1 300 -18.35 30-30 1 500 -7.97 50-50 0.1 300 -7.83 20-20-20-20-20 0.1 500 10.27 30-30-30-30 1 500 13.48 Nathalie Villa-Vialaneix | Consensus Lasso 30/36
  • 196. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Comparison with other approaches Method compared (direct and bootstrap approaches) independant Graphical LASSO estimation gLasso methods implementated in the R package simone and described in [Chiquet et al., 2011]: intertwinned LASSO iLasso, cooperative LASSO coopLasso and group LASSO groupLasso fused graphical LASSO as described in [Danaher et al., 2013] as implemented in the R package fgLasso Nathalie Villa-Vialaneix | Consensus Lasso 31/36
  • 197. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Comparison with other approaches Method compared (direct and bootstrap approaches) independant Graphical LASSO estimation gLasso methods implementated in the R package simone and described in [Chiquet et al., 2011]: intertwinned LASSO iLasso, cooperative LASSO coopLasso and group LASSO groupLasso fused graphical LASSO as described in [Danaher et al., 2013] as implemented in the R package fgLasso consensus Lasso with fixed prior (the mother network) cLasso(p) fixed prior (average over the conditions of independant estimations) cLasso(2) adaptative estimation of the prior cLasso(m) Nathalie Villa-Vialaneix | Consensus Lasso 31/36
  • 198. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Comparison with other approaches Method compared (direct and bootstrap approaches) independant Graphical LASSO estimation gLasso methods implementated in the R package simone and described in [Chiquet et al., 2011]: intertwinned LASSO iLasso, cooperative LASSO coopLasso and group LASSO groupLasso fused graphical LASSO as described in [Danaher et al., 2013] as implemented in the R package fgLasso consensus Lasso with fixed prior (the mother network) cLasso(p) fixed prior (average over the conditions of independant estimations) cLasso(2) adaptative estimation of the prior cLasso(m) Parameters set to: T1 = 500, B = 100, = 1 Nathalie Villa-Vialaneix | Consensus Lasso 31/36
  • 199. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Selected results (best F) rewired edges: 5% - conditions: 2 - sample size: 2 30 direct version Method gLasso iLasso groupLasso coopLasso 0.28 0.35 0.32 0.35 Method fgLasso cLasso(m) cLasso(p) cLasso(2) 0.32 0.31 0.86 0.30 bootstrap version Method gLasso iLasso groupLasso coopLasso 0.31 0.34 0.36 0.34 Method fgLasso cLasso(m) cLasso(p) cLasso(2) 0.36 0.37 0.86 0.35 Conclusions bootstraping improves performance (except for iLasso and for large r) joint inference improves performance using a good prior is (as expected) very efficient adaptive approach for cLasso is better than naive approach Nathalie Villa-Vialaneix | Consensus Lasso 32/36
  • 200. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Real data 204 obese women ; expression of 221 genes before and after a LCD = 1 ; T1 = 1000 (target density: 4%) Distribution of the number of times an edge is selected over 100 bootstrap samples 1500 1000 500 0 0 25 50 75 100 Counts Frequency (70% of the pairs of nodes are never selected) ) T2 = 80 Nathalie Villa-Vialaneix | Consensus Lasso 33/36
  • 201. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Networks Before diet l l l l l l l FADS2 l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l AZGP1 CIDEA FADS1 GPD1L PCK2 After diet l l l l l l l FADS2 l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l AZGP1 CIDEA FADS1 GPD1L PCK2 densities about 1:3% - some interactions (both shared and specific) make sense to the biologist Nathalie Villa-Vialaneix | Consensus Lasso 34/36
  • 202. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Thank you for your attention... Programs available in the R package therese (on R-Forge)4. Joint work with Magali SanCristobal Matthieu Vignes (GenPhySe, INRA Toulouse) (Massey University, NZ) Nathalie Viguerie (I2MC, INSERM Toulouse) 4https://r-forge.r-project.org/projects/therese-pkg Nathalie Villa-Vialaneix | Consensus Lasso 35/36
  • 203. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations Questions? Nathalie Villa-Vialaneix | Consensus Lasso 36/36
  • 204. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations References Ambroise, C., Chiquet, J., and Matias, C. (2009). Inferring sparse Gaussian graphical models with latent structure. Electronic Journal of Statistics, 3:205–238. Butte, A. and Kohane, I. (1999). Unsupervised knowledge discovery in medical databases using relevance networks. In Proceedings of the AMIA Symposium, pages 711–715. Chiquet, J., Grandvalet, Y., and Ambroise, C. (2011). Inferring multiple graphical structures. Statistics and Computing, 21(4):537–553. Danaher, P., Wang, P., and Witten, D. (2013). The joint graphical lasso for inverse covariance estimation accross multiple classes. Journal of the Royal Statistical Society Series B. Forthcoming. Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432–441. Meinshausen, N. and Bühlmann, P. (2006). High dimensional graphs and variable selection with the lasso. Annals of Statistic, 34(3):1436–1462. Mohan, K., Chung, J., Han, S., Witten, D., Lee, S., and Fazel, M. (2012). Structured learning of Gaussian graphical models. In Proceedings of NIPS (Neural Information Processing Systems) 2012, Lake Tahoe, Nevada, USA. Osborne, M., Presnell, B., and Turlach, B. (2000). Nathalie Villa-Vialaneix | Consensus Lasso 36/36
  • 205. Short overview on biological background Network inference and GGM Inference with multiple samples Simulations On the LASSO and its dual. Journal of Computational and Graphical Statistics, 9(2):319–337. Schäfer, J. and Strimmer, K. (2005a). An empirical bayes approach to inferring large-scale gene association networks. Bioinformatics, 21(6):754–764. Schäfer, J. and Strimmer, K. (2005b). A shrinkage approach to large-scale covariance matrix estimation and implication for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4:1–32. Viguerie, N., Montastier, E., Maoret, J., Roussel, B., Combes, M., Valle, C., Villa-Vialaneix, N., Iacovoni, J., Martinez, J., Holst, C., Astrup, A., Vidal, H., Clément, K., Hager, J., Saris, W., and Langin, D. (2012). Determinants of human adipose tissue gene expression: impact of diet, sex, metabolic status and cis genetic regulation. PLoS Genetics, 8(9):e1002959. Nathalie Villa-Vialaneix | Consensus Lasso 36/36