A Model for Interpretable High Dimensional Interactions
Sahir Rai Bhatnagar
Joint work with Yi Yang, Mathieu Blanchette and Celia Greenwood
Poster Number 67
Motivation
one predictor variable at a time
[Diagram labels: Predictor Variable, Phenotype; Test 1 to Test 5]
a network based view
[Diagram labels: Predictor Variable, Phenotype; Test 1]
system level changes due to environment
[Diagram labels: Predictor Variable, Phenotype, Environment (A, B); Test 1]
Motivating Dataset: Newborn epigenetic adaptations to gestational diabetes exposure (Luigi Bouchard, Sherbrooke)
• Environment: Gestational Diabetes
• Large Data: Child’s epigenome (p ≈ 450k)
• Phenotype: Obesity measures
Differential Correlation between environments
[Figure: (a) Gestational diabetes affected pregnancy; (b) Controls]
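One way to quantify the contrast between the two panels is to compute a correlation matrix separately in each environment and take the absolute difference. The following is a minimal sketch in plain Python on made-up toy values; the function names `pearson`, `corr_matrix`, and `diff_corr` are hypothetical and are not taken from the eclust package, and the absolute difference is only one possible choice of differential-correlation measure.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def corr_matrix(rows):
    """Correlation matrix of the columns of `rows` (subjects x predictors)."""
    cols = list(zip(*rows))
    return [[pearson(ci, cj) for cj in cols] for ci in cols]

def diff_corr(X, E):
    """Absolute difference between the correlations in E = 1 and E = 0."""
    exposed = [row for row, e in zip(X, E) if e == 1]
    controls = [row for row, e in zip(X, E) if e == 0]
    c1, c0 = corr_matrix(exposed), corr_matrix(controls)
    p = len(c1)
    return [[abs(c1[i][j] - c0[i][j]) for j in range(p)] for i in range(p)]

# Toy data (assumed): predictors 0 and 1 move together only when E = 1.
X = [
    [1.0, 1.1, 0.3],
    [2.0, 2.1, 0.9],
    [3.0, 2.9, 0.1],
    [4.0, 4.2, 0.7],
    [1.0, 3.0, 0.2],
    [2.0, 1.0, 0.8],
    [3.0, 4.0, 0.5],
    [4.0, 2.0, 0.4],
]
E = [1, 1, 1, 1, 0, 0, 0, 0]
D = diff_corr(X, E)  # D[0][1] is large: the pair is correlated only under exposure
```

Large entries of `D` flag pairs of predictors whose relationship changes with the environment, which is the kind of signal the clustering step is meant to pick up.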
formal statement of initial problem
• $n$: number of subjects
• $p$: number of predictor variables
• $X_{n \times p}$: high dimensional data set ($p \gg n$)
• $Y_{n \times 1}$: phenotype
• $E_{n \times 1}$: environmental factor that has widespread effect on $X$ and can modify the relation between $X$ and $Y$
Objective
• Which elements of $X$ that are associated with $Y$ depend on $E$?
Methods
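ECLUST - our proposed method: 3 phases
Original Data
1) Gene Similarity, computed for E = 0 and for E = 1
2) Cluster Representation: one $n \times 1$ summary per cluster
3) Penalized Regression: $Y_{n \times 1} \sim$ cluster representations + cluster representations $\times E$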
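Phases 2 and 3 can be sketched as follows, assuming the clusters from phase 1 are already available as one label per predictor. The slides do not say which summary is used for a cluster, so averaging the member columns is an assumption made here for illustration; `cluster_representatives` and `design_with_interactions` are hypothetical names, not functions of the eclust package.

```python
def cluster_representatives(X, labels):
    """Average the columns of X that share a cluster label.

    X is a list of rows (subjects x predictors); the result has one
    column per cluster, i.e. one n x 1 summary per cluster.
    """
    clusters = sorted(set(labels))
    reps = []
    for row in X:
        summary = []
        for c in clusters:
            members = [v for v, lab in zip(row, labels) if lab == c]
            summary.append(sum(members) / len(members))
        reps.append(summary)
    return reps

def design_with_interactions(reps, E):
    """Columns: cluster summaries, then E, then each summary times E."""
    return [row + [e] + [v * e for v in row] for row, e in zip(reps, E)]

# Toy data (assumed): three predictors, the first two in the same cluster.
X = [[1.0, 3.0, 10.0], [2.0, 4.0, 20.0]]
labels = [0, 0, 1]
E = [0, 1]

reps = cluster_representatives(X, labels)    # 2 subjects x 2 clusters
design = design_with_interactions(reps, E)   # 2 subjects x 5 columns
```

The resulting design matrix is what a penalized regression of the phenotype would be fitted on: the number of columns depends on the number of clusters rather than on p.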
"... the objective of statistical methods is the reduction of data. A quantity of data . . . is to be replaced by relatively few quantities which shall adequately represent . . . the relevant information contained in the original data."
- Sir R. A. Fisher, 1922
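Model
$$g(\mu) = \underbrace{\beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \beta_E E}_{\text{main effects}} + \underbrace{\alpha_{1E}(X_1 E) + \cdots + \alpha_{pE}(X_p E)}_{\text{interactions}}$$
Reparametrization¹: $\alpha_{jE} = \gamma_{jE}\,\beta_j\,\beta_E$.
Strong heredity principle²: $\hat{\alpha}_{jE} \neq 0 \Rightarrow \hat{\beta}_j \neq 0$ and $\hat{\beta}_E \neq 0$
¹ Choi et al. 2010, JASA
² Chipman 1996, Canadian Journal of Statistics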
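The reparametrization enforces strong heredity by construction: because each interaction coefficient is a product that includes both main effects, it can only be nonzero when both of them are nonzero. A small sketch of this, using an identity link and made-up coefficient values; the function names are hypothetical.

```python
def interaction_coefficients(beta, beta_E, gamma):
    """alpha_jE = gamma_jE * beta_j * beta_E for each predictor j."""
    return [g * b * beta_E for g, b in zip(gamma, beta)]

def linear_predictor(x, e, beta0, beta, beta_E, gamma):
    """g(mu) for one subject: main effects plus X-by-E interactions."""
    alpha = interaction_coefficients(beta, beta_E, gamma)
    main = sum(b * xj for b, xj in zip(beta, x)) + beta_E * e
    inter = sum(a * xj * e for a, xj in zip(alpha, x))
    return beta0 + main + inter

# Made-up values: predictor 1 has no main effect, predictor 2 has gamma = 0.
beta = [0.5, 0.0, -1.0]
gamma = [2.0, 3.0, 0.0]

alpha = interaction_coefficients(beta, beta_E=1.5, gamma=gamma)
# Only predictor 0 keeps an interaction; beta_1 = 0 forces alpha_1E = 0.

alpha_no_env = interaction_coefficients(beta, beta_E=0.0, gamma=gamma)
# With beta_E = 0, every interaction coefficient is zero.

eta = linear_predictor([1.0, 1.0, 1.0], 1, 0.0, beta, 1.5, gamma)
```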
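Strong Heredity Model with Penalization
$$\underset{\beta_0,\,\beta,\,\gamma}{\arg\min}\;\; \frac{1}{2}\,\lVert Y - g(\mu) \rVert^2 + \lambda_\beta \left( w_1 |\beta_1| + \cdots + w_q |\beta_q| + w_E |\beta_E| \right) + \lambda_\gamma \left( w_{1E} |\gamma_{1E}| + \cdots + w_{qE} |\gamma_{qE}| \right)$$
$$w_j = \left| \frac{1}{\hat{\beta}_j} \right|, \qquad w_{jE} = \left| \frac{\hat{\beta}_j \hat{\beta}_E}{\hat{\alpha}_{jE}} \right|$$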
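To make the two penalty terms concrete, the sketch below evaluates the objective for given parameter values and builds the adaptive weights from initial estimates. It assumes the penalties are weighted absolute values and that the initial estimates are nonzero; the slide gives no formula for the weight on the environment main effect, so `w_E` is simply passed in by the caller. All names are hypothetical, and this only evaluates the objective, it does not minimize it.

```python
def adaptive_weights(beta_hat, beta_E_hat, alpha_hat):
    """w_j = |1 / beta_hat_j| and w_jE = |beta_hat_j * beta_E_hat / alpha_hat_jE|.

    Assumes every initial estimate is nonzero.
    """
    w = [abs(1.0 / b) for b in beta_hat]
    w_gamma = [abs(b * beta_E_hat / a) for b, a in zip(beta_hat, alpha_hat)]
    return w, w_gamma

def penalized_objective(y, fitted, beta, beta_E, gamma,
                        w, w_E, w_gamma, lam_beta, lam_gamma):
    """Squared-error loss plus the two weighted L1 penalties."""
    loss = 0.5 * sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    main_pen = sum(wj * abs(bj) for wj, bj in zip(w, beta)) + w_E * abs(beta_E)
    inter_pen = sum(wj * abs(gj) for wj, gj in zip(w_gamma, gamma))
    return loss + lam_beta * main_pen + lam_gamma * inter_pen

# Made-up initial estimates and parameter values.
w, w_gamma = adaptive_weights(beta_hat=[2.0, 0.5], beta_E_hat=1.0,
                              alpha_hat=[4.0, 0.25])
value = penalized_objective(
    y=[1.0, 2.0], fitted=[0.0, 2.0],
    beta=[1.0, -1.0], beta_E=2.0, gamma=[0.0, 1.0],
    w=w, w_E=1.0, w_gamma=w_gamma,
    lam_beta=0.1, lam_gamma=0.2,
)
```

The two tuning parameters act separately: `lam_beta` controls how many main effects survive, and `lam_gamma` controls how many interactions survive among the retained main effects.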
Results
Simulation Study: Jaccard Index and test set MSE
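For reference, the two metrics named in this slide can be computed as below. This is a generic sketch rather than the code used in the simulation study: the Jaccard index compares the set of selected terms with the set of truly active terms, and the test set MSE averages squared prediction errors on held-out data. The variable names and toy values are made up.

```python
def jaccard_index(selected, truth):
    """Size of the intersection divided by size of the union."""
    selected, truth = set(selected), set(truth)
    union = selected | truth
    if not union:
        return 1.0  # convention assumed here: two empty sets agree perfectly
    return len(selected & truth) / len(union)

def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

# Toy example: two of the three selected terms are truly active.
selected_terms = {"X1", "X2", "X1:E"}
true_terms = {"X1", "X1:E", "X3"}
j = jaccard_index(selected_terms, true_terms)   # 2 shared / 4 in the union

test_error = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```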
Open source software
• Software implementation in R: http://sahirbhatnagar.com/eclust/
• Allows user-specified interaction terms
• Automatically determines the optimal tuning parameters through cross-validation
• Can also be applied to genetic data
Conclusions
Conclusions and Contributions
• Large system-wide changes are observed in many environments
• This assumption can possibly be exploited to aid analysis of large data
• We develop and implement a multivariate penalization procedure for predicting a continuous or binary disease outcome while detecting interactions between high dimensional data ($p \gg n$) and an environmental factor
• Dimension reduction is achieved through leveraging the environmental-class-conditional correlations
• Also, we develop and implement a strong heredity framework within the penalized model
• R software: http://sahirbhatnagar.com/eclust/
Limitations
• There must be a high-dimensional signature of the exposure
• Clustering is unsupervised
• Two tuning parameters
• Need more samples . . . Got data? (Poster 67)
acknowledgements
• Dr. Celia Greenwood
• Dr. Blanchette and Dr. Yang
• Dr. Luigi Bouchard, André Anne Houde
• Dr. Steele, Dr. Kramer, Dr. Abrahamowicz
• Maxime Turgeon, Kevin McGregor, Lauren Mokry, Dr. Forest
• Greg Voisin, Dr. Forgetta, Dr. Klein
• Mothers and children from the study