Data analysis workshop for massive
           sequencing data

Towards an understanding of diversity in
  biological and biomedical systems

                         Igor Zwir

     Department of Computer Science and Artificial Intelligence,
            University of Granada, Granada, Spain
                Howard Hughes Medical Institute
          Yale School of Medicine, New Haven, CT, US
    Department of Psychiatry Washington University School of
                  Medicine, St. Louis, MO, US
              e-mail: zwiri@psychiatry.wustl.edu
“Some people enjoy reading papers, juggling possibilities and formulating ideas,
even if they can’t work a pipette”
(“Reasoning for results”, Nature, Bray, D., 2001)

“Some people enjoy reading papers, juggling possibilities and formulating ideas,
even if they can’t write a line of a computer program”
(“Reasoning for results”, Groisman Lab, 2007)
“…organisms of the most different sorts are
constructed from the very same battery of
genes. The diversity of life forms results from
small changes in the regulatory systems that
govern expression of these genes.”

                                     François Jacob
                          In Of flies, mice and men
Salmonella: a Gram-negative
pathogen with a varied lifestyle
Signal transduction cascade by
two-component regulatory systems

    Signal             low Mg2+


    Sensor      PhoQ


   Regulator
                        PhoP -PO3

   Effectors       mgtA                mgtB



  Response      Mg2+ transport    Mg2+ transport
Two-component systems regulate physiological
         and virulence functions


    System           Signal                   Function
    ArcA/ArcB        Quinones             Anaerobic respiration
    OmpR/EnvZ    Osmolarity changes          Osmoadaptation
     NtrB/NtrC   Low nitrogen levels       Nitrogen metabolism
    PhoP/PhoQ        Low Mg2+          Virulence, growth in low Mg2+
    PmrA/PmrB       Fe3+ and Al3+       Resistance to polymyxin B
    SsrA/SpiR        Unknown                    Virulence
     TtrR/TtrS     Tetrathionate          Anaerobic respiration
The Salmonella PmrA/PmrB system
   responds to Fe3+ and low Mg2+

       low Mg2+                  high Fe3+



PhoQ                      PmrB



       PhoP -PO3                 PmrA -PO3
            pmrD   PmrD
                                      pbgP



                             LPS modification
The E. coli PmrA/PmrB system
   responds to Fe3+ but not to low Mg2+

       low Mg2+                  high Fe3+



PhoQ                      PmrB



       PhoP -PO3                 PmrA -PO3

           pmrD    PmrD               pbgP



85.4% 93.3%                  LPS modification
The Salmonella but not the E. coli ugd gene is
       regulated by the PhoP protein



           PhoQ                           PhoQ




                     PhoP -PO3                      PhoP -PO3
                             ugd                            ugd

           85.4% 93.3%                                       85.5%

(the median amino acid identity between Salmonella and E. coli proteins is 90%)
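The identity percentages on these slides can be illustrated with a minimal sketch. It assumes the two protein sequences are already aligned (gaps written as "-") and counts identity only over columns where neither sequence has a gap; other denominators are in common use, and the slides do not say which one was applied. The function names and the toy sequences are hypothetical.

```python
from statistics import median


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def median_identity(aligned_pairs) -> float:
    """Median percent identity over a collection of aligned ortholog pairs."""
    return median(percent_identity(a, b) for a, b in aligned_pairs)


# Toy, made-up alignments (not real Salmonella or E. coli sequences).
pairs = [("AAAA", "AAAA"), ("AAAA", "AAAT"), ("AAAA", "AATT")]
print(median_identity(pairs))
```

A regulator whose identity sits above the genome-wide median while one of its targets sits below it is the kind of contrast the slide draws between PhoP and ugd.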
The PhoP/PhoQ two-component system
  regulates 5% of Salmonella genes




     Consensus Motif
Salmonella LT2 & E. coli K12
Single motif vs. a family of PhoP
                          submotifs
                                     +Sensitivity
+Specificity                                                        +Specificity




                  Harari et al., PLoS Computational Biology, 2010
PhoP submotifs improve BS detection




 26 BS
Genome wide analysis: custom tiling
    arrays and ChIP assays
Evolution of submotifs throughout the
            Gamma/Enterobacteria
                    S01                                       S05




                                  Information content
                                   PhoP (Halpern-Bruno model)
                                   Background (HKY85 model)




Perez et al., PLoS Genetics, 2009; Harari et al., PLoS Computational Biology, 2010
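The "information content" axis on this slide can be illustrated with a small sketch. It computes per-position information content (relative entropy against a background, in bits) from a set of aligned binding sites. The uniform background, the function name, and the example sites are assumptions made for illustration; the Halpern-Bruno and HKY85 evolutionary models named on the slide are not implemented here.

```python
import math

ALPHABET = "ACGT"


def column_information(sites, background=None):
    """Per-position information content (bits) of equal-length aligned sites.

    IC_i = sum_b p_ib * log2(p_ib / q_b), where p_ib is the frequency of base b
    at position i and q_b is the background frequency (uniform by default).
    """
    if not sites:
        raise ValueError("at least one site is required")
    if background is None:
        background = {base: 0.25 for base in ALPHABET}
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    info = []
    for i in range(length):
        column = [site[i] for site in sites]
        bits = 0.0
        for base in ALPHABET:
            p = column.count(base) / len(column)
            if p > 0:  # terms with p = 0 contribute nothing
                bits += p * math.log2(p / background[base])
        info.append(bits)
    return info


# Made-up hexamers for illustration only (not curated PhoP binding sites).
example_sites = ["TGTTTA", "TGTTTA", "TATTTA", "TGTTGA"]
print([round(x, 3) for x in column_information(example_sites)])
```

A fully conserved position scores 2 bits under a uniform background, and a position that matches the background scores 0, so comparing such profiles between submotifs (for example S01 and S05) shows which positions are constrained.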
The submotifs and the PhoP protein evolve at
              correlated rates
In vitro affinities correlate well with the top three
                 families of submotifs
+
                               -
Zwir et al., PNAS, 2005; Zwir et al, Bioinformatics, 2005,
         Harari et al., BMC Bioinformatics, 2009
Submotif & distances from the
    RNAP binding site
    Close          Medium          Remote



                                              45%




                                                     21%




   Harari et al., PLoS Computational Biology, 2010
Two closely related species show
     distinct promoter preferences
        Close   Medium   Remote




Submotifs & distances can distinguish
        Salmonella & E. coli
Two distantly related species show distinct
       promoter architectures
PhoP-activated genes are bound and
transcribed at different times and levels
Predicting gene binding and transcription of
          PhoP regulated targets
 ancestral
  horizontally-acquired
Summary
TF affinity for its binding sites determines promoter
timing and levels in naked DNA

Binding and transcription in vivo depend on where
the binding sites sit (promoter architectures)

Cis-acting features in the PhoP-activated promoters
determine non-arbitrary, organized architectures

The differences in the regulon across distinct
species depend on the evolution of the binding sites
and promoter architectures
Two paradigms: multiple genes with small
  effect, or few genes with large effect




      London Metro             Boston Metro
                         de Vries, Nature Medicine, 2009
Phenotypic-genotypic relations describe a risk
         surface of Schizophrenia
   R19: 6 affected, 1 relative
   R10: 11 affected, 6 relatives




                        Gottesman II, Gould TD. Am J Psychiatry, 2003

         0.1% of the population affected
         Multigenic disease
         Non-genetic contributions
         Risk: Monozygotic twins 50% - Dizygotic twins 15%.
Uncovering genotype-phenotype relations by
  independently clustering both domains
                                  Phenotype clusters

 Trios (affected, relatives and
 controls)




                                                        Subjects
   70 clinical attributes

      Cognitive

      Motor                         Genotype clusters



      Behavioral

      Structural




                                                        Subjects
   SNPs chips
Identifying significant genotype-phenotype
    relations among inter-domain clusters




[Figure: significance scale from 0.01 to 1E-10]




Romero-Zaliz et al., Nucleic Acids Research, 2008; Romero-Zaliz et al., IEEE Trans. on
        Evol. Computation, 2008; de Erausquin et al., Mol. Psych., in press
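One common way to score whether a phenotype cluster and a genotype cluster share more subjects than chance would predict is a hypergeometric tail probability. The slide shows a significance scale (0.01 to 1E-10) but does not name the test, so the choice of test, the function name, and the example numbers below are assumptions, offered only as a sketch of the idea.

```python
from math import comb


def overlap_pvalue(n_subjects: int, n_pheno: int, n_geno: int, n_shared: int) -> float:
    """P(overlap >= n_shared) under random assignment (hypergeometric upper tail).

    n_subjects: total subjects in the study
    n_pheno:    subjects in the phenotype cluster
    n_geno:     subjects in the genotype cluster
    n_shared:   subjects found in both clusters
    """
    if not (0 <= n_shared <= min(n_pheno, n_geno) <= n_subjects):
        raise ValueError("inconsistent cluster sizes")
    total = comb(n_subjects, n_geno)
    tail = sum(
        comb(n_pheno, k) * comb(n_subjects - n_pheno, n_geno - k)
        for k in range(n_shared, min(n_pheno, n_geno) + 1)
    )
    return tail / total


# Hypothetical sizes: 100 subjects, clusters of 12 and 15, 9 subjects in common.
print(overlap_pvalue(100, 12, 15, 9))
```

When many cluster pairs are tested, the resulting p-values would normally need a multiple-testing correction before being compared against a threshold such as 0.01.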
Phenotype relations ≅ Genotype relations
Optimal (multiobjective/multimodal) relations
        are hierarchically organized
Relations reflect the risk of Schizophrenia




                    First degree relatives have
                     a genetic predisposition
Validation using an independent set of
               subjects


         Relation   Risk (%)   Affected        Relative        Control
         R22        91         10164, 10170
         R19        88         10155, 10192
         R05        61                         10184
         R06        57                         10156
         R11        32                         10181
         R30        28                         20148, 10127
         R29        17                         10198, 10165    10158
         R24         9                         10193           10151, 10166
         R25         1                                         10157
Qualitative significance of learned SNPs


                 Pathway analysis
             Process for Neurological Disease




Neuronal cell adhesion pathway derived from
   the genotype domain of the relations
Novel pathways: oxidative stress and
epigenetic control of gene expression
Summary

We proposed the first data-driven definition of the Schizophrenia risk
function

Concurrent CGWAS provides a panoramic view of phenotype-genotype
associations, each of which can be used by traditional GWAS analysis

Four signaling pathways associated with risk of schizophrenia were
identified

Phenotype-genotype relations were sufficient to reliably predict
subject status

This finding opens the door for early detection and preventative
intervention prior to the onset of psychotic symptoms in
high/intermediate risk populations
Acknowledgements

Eduardo Groisman Lab
Howard Hughes Medical Institute

Dongwoo Shin
Christian Perez

Henry Huang Lab

Dept. of Molecular Microbiology
Washington U.
School of Medicine, USA

Gabriel de Erausquin Lab
Departments of Psychiatry and Neurology
Harvard Med. School

Dept. of Computer Science and Artificial Intelligence
University of Granada, Spain

Coral del Val

Pat Anders
Javier Arnedo
Luis Miguel Merino

Rocio Romero-Zaliz (U. de Granada)
Cristina Rubio-Escudero (U. Seville)
Christopher Previti (U. Bergen)
Oscar Harari (Washington U.)
Acknowledgments

Francisco Herrera
    DECSAI, University of Granada

Mining for Modeling Lab
    DECSAI, University of Granada

Coral del Val
    DECSAI, University of Granada

Gabriel de Eraúsquin
    Department of Psychiatry, Washington University in St. Louis

Igor Zwir
    DECSAI, University of Granada

Eduardo Groisman
    HHMI, Department of Molecular Biology, Washington University in St. Louis

Kathleen Marchal
    Department of Microbial and Molecular Systems, Katholieke Universiteit Leuven

Henry Huang
    Department of Molecular Biology, Washington University in St. Louis
