Statistical integration of
methylation, transcriptome and
proteome in cell lines
Said el Bouhaddani (1), Hae-Won Uh (1), Jeanine Houwing-Duistermaat (1,2)
(1) Department of Biostatistics and Research Support, Julius Center, University Medical Center Utrecht, Netherlands
(2) Department of Statistics, University of Leeds, UK
Background
Multiple System Atrophy (MSA) is a rare neurodegenerative disorder. Almost 80% of patients
are disabled within 5 years of disease onset. The key pathogenic event when developing MSA
is an abnormal accumulation of harmful proteins. Molecular causes and consequences of this
aggregation need to be elucidated, e.g. using multiple omics datasets.
We have access to DNA-methylome, transcriptome, and proteome data, measured in cell
lines that show harmful protein aggregation and in negative controls. Standard sequential
analysis of these data shows no overlap of the significant genes.
Our aim is to develop a data integration method to identify consistent molecular biomarkers
that can classify cells with protein aggregation across all datasets. Besides the high
dimensionality (p>N), platform-specific heterogeneity between the omics datasets also needs to
be considered.
Motivating data & challenges
Methylome
- 850k sites on 4 cases, 4 controls
Transcriptome
- 25k probes on 3 cases, 3 controls
Proteome
- 2k proteins on 9 cases, 9 controls
Preprocessing: normalize data and map all IDs
to gene IDs
Final dataset: 1732 overlapping genes on 16
cases and 16 controls
Challenges
- High dimensional (p>N)
- Highly correlated
- Different platforms
Methods
Several estimation methods have been proposed. The
general model is written as
$$x_k = t_k W^\top + t_{s,k} W_{s,k}^\top + e_k$$
$$y_k = t_k B + h_k$$
Underlying general model
For each omics dataset $k = 1, 2, 3$, we introduce
- Joint latent variables $t$ underlying omics data $x$ and MSA outcome $y$
- Omic-specific latent variables $t_s$ for each omics dataset
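To make the general model concrete, the following minimal Python sketch draws one omics dataset from it. The dimensions, the noise scale, the random seed, and the median split that turns the continuous outcome into a case/control label are all assumptions made for this illustration; none of them are taken from the slides.

```python
import numpy as np

def simulate_dataset(n, p, r_joint=2, r_specific=2, noise_sd=0.5, seed=0):
    """Draw (x, y) from x = t W^T + t_s W_s^T + e and y = t B + h."""
    rng = np.random.default_rng(seed)
    t = rng.normal(size=(n, r_joint))          # joint latent variables
    t_s = rng.normal(size=(n, r_specific))     # omic-specific latent variables
    W = rng.normal(size=(p, r_joint))          # joint loadings
    W_s = rng.normal(size=(p, r_specific))     # omic-specific loadings
    B = rng.normal(size=(r_joint, 1))          # regression of y on t
    e = rng.normal(scale=noise_sd, size=(n, p))
    h = rng.normal(scale=noise_sd, size=(n, 1))
    x = t @ W.T + t_s @ W_s.T + e
    y_cont = t @ B + h
    # Assumption: dichotomize at the median to obtain a case/control label.
    y = (y_cont[:, 0] > np.median(y_cont)).astype(int)
    return x, y

x, y = simulate_dataset(n=8, p=1000)
```

Here `x` has one row per sample and one column per feature, and `y` is balanced by construction because of the median split.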
Methods
Three methods considered
- Sparse PLS-DA: $t_s = 0$; algorithmic, sequential estimation
- Sparse OPLS-DA: algorithmic, sequential estimation
- Probabilistic OPLS-DA: likelihood-based, simultaneous estimation
Estimation methods
Sparse PLS-DA (sPLS-DA) [1]
1. Convert binary $y$ to numerical 'dummy' $y$
2. Maximize $w^\top X^\top y$ with an L1 penalty on $w$
3. Calculate $\hat{y} = xWB$ and obtain class predictions
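The three sPLS-DA steps can be sketched for a single component as follows. This is an illustrative simplification, not the implementation of [1]: the +1/-1 dummy coding, the rule that picks the penalty so that exactly `n_keep` weights survive, and the toy data are assumptions. It relies on the fact that, under a unit-norm constraint on $w$, the L1-penalized maximizer of $w^\top X^\top y$ is a soft-thresholded and rescaled version of $X^\top y$.

```python
import numpy as np

def soft_threshold(v, lam):
    """Shrink each entry of v towards zero by lam (the L1 proximal step)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_plsda_one_component(X, labels, n_keep):
    """One-component sparse PLS-DA sketch that keeps n_keep features."""
    y = np.where(labels == 1, 1.0, -1.0)     # step 1: dummy coding (assumed +1/-1)
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    yc = y - y.mean()
    c = Xc.T @ yc                            # direction X^T y
    # Pick the penalty so that exactly n_keep weights survive (assumes no ties).
    lam = np.sort(np.abs(c))[-(n_keep + 1)]
    w = soft_threshold(c, lam)               # step 2: L1-penalized weights
    w /= np.linalg.norm(w)
    t = Xc @ w                               # scores
    b = (t @ yc) / (t @ t)                   # regression of y on the scores

    def predict(X_new):
        y_hat = y.mean() + b * ((X_new - x_mean) @ w)   # step 3: y_hat = x W B
        return (y_hat > 0).astype(int)

    return w, predict

# Toy data (hypothetical): 8 samples, 30 features, first 3 features carry signal.
rng = np.random.default_rng(1)
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X = rng.normal(size=(8, 30))
X[:, :3] += 3.0 * labels[:, None]
w, predict = sparse_plsda_one_component(X, labels, n_keep=5)
```

Fixing the number of retained features rather than the penalty itself mirrors how the sparsity level is reported later in the slides (a number of genes retained).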
Sparse OPLS-DA (sOPLS-DA) [2]
1. Obtain estimates for $t_s W_s^\top$ using OPLS
2. Subtract these parts from the original data matrix $X$
3. Follow the steps in sparse PLS-DA using the corrected $X$
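Steps 1 and 2 can be illustrated with the classical single-response OPLS filter, which removes one y-orthogonal component from $X$. This sketch is an assumption about how the correction might look for one component and one outcome; the function name and toy data are hypothetical, and the slides do not specify the exact variant used.

```python
import numpy as np

def remove_one_orthogonal_component(X, y):
    """Remove one y-orthogonal component from X (single-response OPLS step)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    w /= np.linalg.norm(w)                   # predictive weight
    t = Xc @ w                               # predictive scores
    p = Xc.T @ t / (t @ t)                   # predictive loading
    w_o = p - (w @ p) * w                    # part of p orthogonal to w
    w_o /= np.linalg.norm(w_o)
    t_o = Xc @ w_o                           # orthogonal (specific) scores
    p_o = Xc.T @ t_o / (t_o @ t_o)           # orthogonal (specific) loadings
    X_corrected = Xc - np.outer(t_o, p_o)    # step 2: subtract the specific part
    return X_corrected, t_o, p_o

# Toy data (hypothetical): 8 samples, 20 features.
rng = np.random.default_rng(2)
X = rng.normal(size=(8, 20))
y = np.array([-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0])
X_corrected, t_o, p_o = remove_one_orthogonal_component(X, y)
```

By construction the removed scores `t_o` are uncorrelated with `y`, and the corrected matrix no longer contains any variation along `t_o`. The corrected matrix would then be passed to the sparse PLS-DA steps.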
Probabilistic OPLS-DA (POPLS-DA)
1. Formulate the observed likelihood $f(x, y)$
2. Formulate the complete likelihood $f(x, y, t) = f(x \mid t)\, f(y \mid t)\, f(t)$
   - Each term can be optimized in a computationally efficient way
3. Apply the EM algorithm to $f(x, y, t)$ to obtain maximizers of $f(x, y)$
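The EM idea in step 3 can be illustrated on a deliberately simplified model. The sketch below is not POPLS-DA: it assumes joint latent variables only (no omic-specific part), treats $y$ as continuous, stacks $x$ and $y$ into one vector $z$, and assumes a single isotropic noise variance, which reduces the model to probabilistic PCA. All function and variable names are hypothetical. It shows the E-step computing posterior moments of $t$ and the M-step maximizing the expected complete log-likelihood in closed form, with the observed log-likelihood tracked at each iteration.

```python
import numpy as np

def log_likelihood(Z, W, sigma2):
    """Observed-data log-likelihood of z ~ N(0, W W^T + sigma2 I)."""
    n, d = Z.shape
    C = W @ W.T + sigma2 * np.eye(d)
    _, logdet = np.linalg.slogdet(C)
    S = Z.T @ Z / n
    return -0.5 * n * (d * np.log(2 * np.pi) + logdet
                       + np.trace(np.linalg.solve(C, S)))

def em_joint_latent(Z, q, n_iter=50, seed=0):
    """EM for z = t W^T + noise, with t ~ N(0, I) and isotropic noise."""
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    W = rng.normal(size=(d, q))
    sigma2 = 1.0
    history = [log_likelihood(Z, W, sigma2)]
    for _ in range(n_iter):
        # E-step: posterior moments of t given z under the current parameters.
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(q))
        Et = Z @ W @ M_inv                          # rows are E[t_n | z_n]
        sum_Ett = n * sigma2 * M_inv + Et.T @ Et    # sum_n E[t_n t_n^T | z_n]
        # M-step: closed-form maximizers of the expected complete log-likelihood.
        W_new = (Z.T @ Et) @ np.linalg.inv(sum_Ett)
        sigma2 = (np.sum(Z ** 2)
                  - 2.0 * np.sum(Et * (Z @ W_new))
                  + np.trace(sum_Ett @ W_new.T @ W_new)) / (n * d)
        W = W_new
        history.append(log_likelihood(Z, W, sigma2))
    return W, sigma2, np.array(history)

# Toy data (hypothetical): 32 samples, 20 features plus one outcome column.
rng = np.random.default_rng(0)
t = rng.normal(size=(32, 2))
x = t @ rng.normal(size=(2, 20)) + 0.3 * rng.normal(size=(32, 20))
y = t @ rng.normal(size=(2, 1)) + 0.3 * rng.normal(size=(32, 1))
Z = np.hstack([x, y])
Z -= Z.mean(axis=0)
W, sigma2, history = em_joint_latent(Z, q=2)
```

The values in `history` should never decrease from one iteration to the next, which is the defining property of EM and a useful check when implementing richer models with separate noise terms and specific components.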
Simulation study
Conclusions
- POPLS-DA scores highest on accuracy, even with a small sample size
- Sparse OPLS-DA is likely to overfit: it estimates omics-specific parts in each dataset, while the sample
size is low
Setup
- Simulate 𝑋 and 𝑦 from “underlying model”
- Setup close to real data:
  - 1000 features
  - 3 data types with 8, 6 and 18 samples, respectively
  - Two joint, two specific components
- Calculate prediction accuracy on large simulated test data:
  - 500*{8,6,18} samples
- Compare sPLS-DA, sOPLS-DA, POPLS-DA
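The evaluation loop in this setup can be sketched as below: a small training set and a test set 500 times larger are drawn from the same loadings, a classifier is fitted on the training set, and accuracy is computed on the test set. The classifier here is a plain one-component PLS-DA used only as a stand-in; the noise scale, the seed, the sign-based labelling, and the function names are assumptions, and the numbers it produces are not the results reported in the slides.

```python
import numpy as np

def draw(n, W, W_s, B, noise_sd, rng):
    """Draw n samples with x = t W^T + t_s W_s^T + e and label sign(t B + h)."""
    p = W.shape[0]
    t = rng.normal(size=(n, W.shape[1]))
    t_s = rng.normal(size=(n, W_s.shape[1]))
    x = t @ W.T + t_s @ W_s.T + rng.normal(scale=noise_sd, size=(n, p))
    y = np.where(t @ B + rng.normal(scale=noise_sd, size=n) > 0, 1.0, -1.0)
    return x, y

def heldout_accuracy(n_train, p=1000, test_factor=500, noise_sd=0.5, seed=0):
    """Fit a one-component PLS-DA on n_train samples; score on a large test set."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(p, 2))       # two joint components
    W_s = rng.normal(size=(p, 2))     # two specific components
    B = rng.normal(size=2)
    x_tr, y_tr = draw(n_train, W, W_s, B, noise_sd, rng)
    x_te, y_te = draw(test_factor * n_train, W, W_s, B, noise_sd, rng)
    mu, y_bar = x_tr.mean(axis=0), y_tr.mean()
    w = (x_tr - mu).T @ (y_tr - y_bar)
    norm = np.linalg.norm(w)
    if norm == 0.0:
        # Only one class in the training set: predict that class everywhere.
        return float(np.mean(y_te == y_tr[0]))
    w /= norm
    t = (x_tr - mu) @ w
    b = (t @ (y_tr - y_bar)) / (t @ t)
    y_hat = y_bar + b * ((x_te - mu) @ w)
    return float(np.mean(np.where(y_hat > 0, 1.0, -1.0) == y_te))

# p is reduced from 1000 to 100 here only to keep the example light.
accuracies = {n: heldout_accuracy(n, p=100) for n in (8, 6, 18)}
```

Swapping the stand-in classifier for each of the three methods, and repeating over many seeds, would give the kind of comparison described in the setup.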
Data analysis
Results
- Two joint, two specific components
- Sparsity level: 50 genes retained (not for POPLS-DA)
- All methods separate MSA cases from controls
Conclusions
- sOPLS-DA clusters are more homogeneous
- POPLS-DA shows more spread and is less certain about its predictions
- Top ten genes are directly involved in harmful protein aggregation
Conclusions
- POPLS-DA discriminates MSA based on multiple omics data, performs best
for small sample size
- Simulation: algorithmic methods sPLS-DA and sOPLS-DA likely to overfit,
need larger sample size
- MSA cases separated from controls based on 3 omics datasets, top genes
biologically important
- POPLS-DA will be added to the OmicsPLS package (cran.r-project.org/package=OmicsPLS)
Contact: s.elbouhaddani@umcutrecht.nl
Acknowledgments
Günter Höglinger
Jörg Tost
Matthias Höllerhage
E-Rare EU project: MSAomics
H2020 project: IMFORFUTURE
References
[1] Lê Cao, K., Boitard, S. & Besse, P. Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics 12, 253 (2011). https://doi.org/10.1186/1471-2105-12-253
[2] Bylesjö, M., Rantalainen, M., Cloarec, O., Nicholson, J.K., Holmes, E. & Trygg, J. OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification. J. Chemometrics 20, 341-351 (2006). doi:10.1002/cem.1006

Omics data integration for MSA | International Society for Clinical Biostatistics 2020
