Deciphering the regulatory code in the genome
PhD completion seminar
Denis C. Bauer

Institute for Molecular Bioscience
The University of Queensland, Australia


By yankodesign                                             by linh.ngân 
Research Aim
Develop a method (a thermodynamic model) that translates the regulatory
message in the DNA into when and how strongly a gene is expressed.


[Figure: the DNA sequence
   AAGAAGGTTTTAGTTTAGCC
   CACCGTAGGTACCTGAAGAA
   GAAGGTTTTAGTTTAGCCCA
   CCGTAGGTACCTGAAG
is translated into the message "Express gene with 70% capacity when it is hot, Thanks!"]
Why is understanding transcriptional regulation important?
•  Insight into the biology of gene pathways.
•  Search for regulatory regions with a specific function.
•  “Re-programming” of genes has therapeutic potential.


[Figure: DNA with a promoter and a transcribed gene; a broken regulatory element is replaced by designing and inserting a new regulatory element.]
What do we need to know to build a model able to translate the regulatory message?
Background: Enhancer
•  Genes can have independent “switches” (enhancers) beyond the core
   promoter, which can start the transcription of the target gene under
   different conditions.

[Figure: enhancer regions around the promoter and the transcribed gene.]
Background: Enhancer
   •  Transcription is regulated by the binding of activator
      and repressor TFs to an enhancer region.

[Figure: binding site map of an enhancer; at the given TF concentrations, 8 activators and 2 repressors are bound and transcription is active.]
Background: Repression
   •  Transcriptional regulation is also dependent on the
      interplay between activators and repressors, i.e.
      where they bind relative to each other.
[Figure: binding site map of an enhancer, with the repressor range marked.]
On which system would we test the model’s abilities?
Background: Even-skipped gene (eve)
[Figure: Drosophila melanogaster (1); embryo stained for eve (2); function representation (3).]

1 http://insects.eugenes.org/
2 Small et al.
3 http://bioinform.genetika.ru
Background: Regulation of eve
[Figure: the eve locus with its regulatory elements in order: Late1, MSE 3+7, MSE 2, promoter (P), eve, late2, MSE 4+6, MSE 1, MSE 5; a lacZ reporter is shown below.]

Janssens, H. et al. Quantitative and predictive model of transcriptional control of the Drosophila melanogaster even skipped gene. Nat Genet, 2006, 38, 1159-1165
Hypothesis
[Figure: TF concentrations, the binding site map, and the genome (architecture, RNA, methylation, …) feed into a model that predicts gene activation.]
Research Goals
•  Optimize Thermodynamic models
   efficiently.
•  Analyze robustness of these
   models.
•  Explore the regulation of a
     particular gene.
•  Examine how the regulatory program evolves.
•  Extend current thermodynamic model.


                                                 Cooperphoto/CORBIS 
Model definition
Site occupancy (Hill function)
$$p(s, t) = \frac{K_t \cdot K(s, t) \cdot [t]}{1 + K_t \cdot K(s, t) \cdot [t]}$$

Total activation
$$W(S, T) = \sum_{s \in S_A} E_{t_s} \, p(s, t_s) \prod_{s' \in S_R} \left( 1 - E_{t_{s'}} \cdot p(s', t_{s'}) \cdot d(s, s') \right)$$
(each term of the sum is an activator contribution; the product is the quenching of that activator)

Transcription rate (Arrhenius function)
$$R(S, T) = \begin{cases} R_0 \exp\left( W(S, T) - G_0 \right) & \text{if } W < G_0 \\ R_0 & \text{otherwise} \end{cases}$$

Free parameters
TF params
   K    Binding affinity
   E    Effectiveness
General params
   R0   Max. transcription rate
   G0   Energy barrier

Buena Vista Pictures

Janssens, H. et al. Quantitative and predictive model of transcriptional control of the Drosophila melanogaster even skipped gene. Nat Genet, 2006, 38, 1159-1165
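The three equations of the model definition can be sketched in a few lines of Python. This is a minimal illustration, not the STREAM implementation: the slide does not give the form of the distance function d(s, s'), so `quench_distance` below (linear decay over a hypothetical range of 100 bp) is an assumption, as are all function names and the dictionary layout of a site.

```python
import math

def occupancy(K_t, K_st, conc):
    """Hill-type site occupancy p(s, t) = x / (1 + x) with x = K_t * K(s, t) * [t]."""
    x = K_t * K_st * conc
    return x / (1.0 + x)

def quench_distance(pos_a, pos_r, rng=100.0):
    """Assumed d(s, s'): 1 at distance 0, falling linearly to 0 at `rng` bp."""
    return max(0.0, 1.0 - abs(pos_a - pos_r) / rng)

def total_activation(activators, repressors):
    """W(S, T): sum of activator contributions, each quenched by every repressor.

    Each site is a dict with keys 'E' (effectiveness), 'p' (occupancy), 'pos'.
    Repressor E is assumed to lie in [0, 1] so each quenching factor stays >= 0.
    """
    W = 0.0
    for a in activators:
        quench = 1.0
        for r in repressors:
            quench *= 1.0 - r["E"] * r["p"] * quench_distance(a["pos"], r["pos"])
        W += a["E"] * a["p"] * quench
    return W

def transcription_rate(W, R0, G0):
    """Arrhenius-type rate, capped at R0 once W reaches the energy barrier G0."""
    return R0 * math.exp(W - G0) if W < G0 else R0

# Toy example: one activator site, one fully occupied repressor 50 bp away.
p_act = occupancy(K_t=2.0, K_st=0.5, conc=1.0)
toy_W = total_activation(
    activators=[{"E": 1.0, "p": p_act, "pos": 0}],
    repressors=[{"E": 1.0, "p": 1.0, "pos": 50}],
)
toy_rate = transcription_rate(toy_W, R0=100.0, G0=1.0)
```

In the toy example the repressor halves the activator's contribution, because the assumed distance function returns 0.5 at 50 bp.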
Training the model
[Figure: the TF binding site map and the TF concentrations <[TF1], [TF2], [TF3], [TF4]> are fed into the thermodynamic model. The predicted expression is compared to the target, and the model parameters are adjusted to improve the fit.]
Optimization methods
•  Two optimization paradigms
   –  Simulated Annealing
      •  LAM schedule (Reinitz et al. 2003)
      •  Geometric cooling
   –  Gradient descent
      •  Three GD variants approximating the objective function, which
         was not continuously differentiable.
•  Judged on accuracy achieved in the given time
   –  Drosophila MSE2 data with 400 data points and 7 TFs
      (16 free parameters).
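As an illustration of the first paradigm, here is a minimal sketch of simulated annealing with geometric cooling. It is not the code used in the study: the toy objective `bumpy` merely stands in for the model's RMS error, and the proposal step, cooling factor `alpha` and all other settings are assumptions chosen for the example.

```python
import math
import random

def simulated_annealing(objective, start, t0=1.0, alpha=0.95,
                        steps_per_temp=50, t_min=1e-3, step=0.5, seed=0):
    """Minimize `objective` with geometric cooling: temp <- alpha * temp."""
    rng = random.Random(seed)
    current = list(start)
    current_err = objective(current)
    best, best_err = list(current), current_err
    temp = t0
    while temp > t_min:
        for _ in range(steps_per_temp):
            # Propose a Gaussian perturbation of every parameter.
            cand = [x + rng.gauss(0.0, step) for x in current]
            cand_err = objective(cand)
            delta = cand_err - current_err
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / temp) (Metropolis criterion).
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, current_err = cand, cand_err
                if current_err < best_err:
                    best, best_err = list(current), current_err
        temp *= alpha
    return best, best_err

def bumpy(params):
    """Toy objective with many local minima; global minimum 0 at the origin."""
    return sum(x * x + 2.0 * (1.0 - math.cos(3.0 * x)) for x in params)

sa_start = [4.0, -4.0]
sa_best, sa_best_err = simulated_annealing(bumpy, sa_start)
```

Accepting some uphill moves at high temperature is what lets the search leave a local minimum, which plain gradient descent cannot do.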
Optimization
[Figure: two panels, "Simulated Annealing" (SA LAM, SA geom) and "Gradient Descent" (SA_geom, GD_softmax, GD_nomax, GD_max), showing CC and RMS error against time [minutes].]

Suggests: many local minima.

Bauer, D. C. & Bailey, T. L. Optimizing static thermodynamic models of transcriptional regulation. Bioinformatics, 2009, 25, 1640-1646
If gradient descent gets stuck in local minima all the time, what does the optimization landscape look like?
Landscape analysis
•  Synthetic data based on real MSE2 data
  –  global minimum and solution (parameter values) are
     known.
  –  Measuring distance of the optimization solution to the
     starting position and the known solution.
  –  Measuring error reduction at the
     solution compared to the
     starting position.
Landscape analysis
Experiment      Initial distance to    Final distance to     Error red.
                solution (mean)        solution (mean)       (mean)
1% perturbed    3.4·10−4               2.8·10−4              88%
random          0.1                    0.11                  97%

Conclusion: many local minima.

Bauer, D. C. & Bailey, T. L. Optimizing static thermodynamic models of transcriptional regulation. Bioinformatics, 2009, 25, 1640-1646
Does the model over-fit?
•  Cross-validation (5-fold)
            Experiment   Mean RMS error (SE)   Mean CC (SE)
            training     13.39 (0.004)         0.92 (4.8 · 10−5)
            testing      14.04 (0.005)         0.91 (5.7 · 10−5)

•  Redundancy reduction
   –  Not enough data to begin with
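The two quality measures in the table, RMS error and correlation coefficient (CC), together with a 5-fold split, can be sketched as follows. The slide does not say how the folds were drawn, so the interleaved split in `k_fold_indices` is an assumption; the function names are likewise illustrative.

```python
import math

def rms_error(pred, target):
    """Root-mean-square error between predicted and target expression."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))

def pearson_cc(pred, target):
    """Pearson correlation coefficient between prediction and target."""
    n = len(target)
    mean_p, mean_t = sum(pred) / n, sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / math.sqrt(var_p * var_t)

def k_fold_indices(n, k=5):
    """Yield (train, test) index lists; fold i holds out every k-th point from i."""
    for i in range(k):
        test = list(range(i, n, k))
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        yield train, test

folds = list(k_fold_indices(10, k=5))
```

A training error well below the testing error would indicate over-fitting; in the table above the two are close.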
Summary: Optimization & Analysis
•  The objective function is ill-posed.
   –  It has a plethora of local minima.
   –  It might have many global minima.
•  Hence SA is the method of choice.
•  There might be a tendency to over-fit the data.

http://www2.cmp.uea.ac.uk/~aih/code/SVM/KernelTrickDemo.html
http://images.nciku.com/
Research Goals
•  Optimize Thermodynamic models
   efficiently
•  Analyze robustness of these
   models
•  Explore the regulation of a
     particular gene
•  Examine how the regulatory program evolves
•  Extend current thermodynamic model


                                                Cooperphoto/CORBIS 
Regulation and Evolution of eve
•  Mechanism for regulating eve is conserved:
   –  Stripe 2 elements from other Drosophila species activate eve in
      D. mel. correctly.
   –  Despite the substantial difference in the regulatory DNA sequence.

http://www.bio.ilstu.edu/Edwards/

Hare, E. E. et al. Sepsid even-skipped enhancers are functionally conserved in Drosophila despite lack of sequence conservation. PLoS Genet, 2008, 4, e1000106
Evaluate Evolution of MSE2
•  Test if the model can identify the MSE2 in these
   other species.

•  Test if the model correctly predicts the
   transcriptional output of the homologous MSE2s.
Searching for MSE2
•  Apply a model trained on D. mel. MSE2 to the TFBS-map
   from sequential windows to find the MSE2 in other
   species
[Figure: a window slides along the sequence upstream of the eve promoter in another species. For each window the predicted expression is compared with the target, and the RMS error is recorded, giving one error value per window: < 23 27 43 … 13 … >.]

Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220
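The window search can be sketched as below. This is an illustration under stated assumptions, not the published procedure: `toy_predict` is a hypothetical stand-in for the trained thermodynamic model, and the window size, step and site map are made up for the example.

```python
import math

def rms_error(pred, target):
    """Root-mean-square error between two equally long profiles."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))

def scan_windows(site_map, seq_len, window, step, predict, target):
    """Score every window; return the (start, rms) list and the best start."""
    scores = []
    for start in range(0, seq_len - window + 1, step):
        # Keep only the sites inside this window, with window-relative positions.
        sites = [(pos - start, tf) for pos, tf in site_map
                 if start <= pos < start + window]
        scores.append((start, rms_error(predict(sites), target)))
    best_start = min(scores, key=lambda item: item[1])[0]
    return scores, best_start

def toy_predict(sites):
    """Hypothetical model: 10 expression units per bound site, flat profile."""
    return [10.0 * len(sites)] * 3

# Toy TFBS map: (position, TF name); three sites cluster between 200 and 260.
toy_sites = [(200, "bicoid"), (230, "hunchback"), (260, "bicoid"), (400, "bicoid")]
scores, best_start = scan_windows(toy_sites, seq_len=500, window=100, step=50,
                                  predict=toy_predict, target=[30.0, 30.0, 30.0])
```

The window with the lowest RMS error is taken as the candidate MSE2; in the toy data that is the window starting at position 200, which contains the whole site cluster.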
Searching for MSE2: Result
•  Correctly identified the MSE2 in 6/8 species

[Figure: RMS error against genomic location, with panels for D. melanogaster, D. pseudoobscura and D. grimshawi.]

Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220
Predicting the output in other species
•  Apply a model trained on D. mel. MSE2 to the MSE2s in other species

[Figure, left: log odds score (bits) against rel. genomic position for D. melanogaster and D. mojavensis, with binding sites for bicoid, kruppel, giant, hunchback, knirps, caudal and tailless. Right: relative RNA concentration against A-P position (%) for the target, D. melanogaster, D. pseudoobscura, D. ananassae and D. mojavensis.]

Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220
Summary Application
•  Model fits the data
   qualitatively.
•  Predictions are biologically
   meaningful.

•  However, there is room for
   improvement.
Research Goals
•  Optimize Thermodynamic models
   efficiently
•  Analyze robustness of these
   models
•  Explore the regulation of a
     particular gene
•  Examine how the regulatory program evolves
•  Extend current thermodynamic model


                                                Cooperphoto/CORBIS 
One role fits them all?
•  Dual function is proposed for some of the regulatory TFs.
   –  E.g. TF Hunchback (Hb) might be an activator when regulating
      stripe 2 and a repressor for stripe 3.

[Figure: the eve regulatory elements Late1, 3+7, 2, P, late2, 4+6, 1, 5.]

Papatsenko, D. & Levine, M. S. Dual regulation by the Hunchback gradient in the Drosophila embryo. Proc Natl Acad Sci U S A, 2008, 105, 2901-2906
Schroeder, M. D. et al. Transcriptional control in the segmentation gene network of Drosophila. PLoS Biol, 2004, 2, E271
Determine the regulatory role of TFs
•  Different data set: 44 CRMs important for D. mel.
   development but same set of TFs.
•  Determine the best role for each TF in each of the
   CRMs
   –  Brute Force: train a model for all TF role-combinations on
      each of the 44 CRMs.
   –  Record the correlation achieved.
   –  Identify TFs that have dual-function.


Segal, E. et al. Predicting expression patterns from regulatory sequence in Drosophila segmentation. Nature, 2008, 451, 535-540
Bauer, D. C.; Buske, F. A. & Bailey, T. L. Dual functioning transcription factors regulated by SUMOylation in the developmental gene network of Drosophila melanogaster. Submitted for publication, 2009
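The brute-force role search can be sketched as follows. This is an illustration only: `toy_score` is a hypothetical stand-in for "train a model with these roles and return the correlation achieved", the two toy CRMs are invented, and the TF list assumes the eight TFs named in the deck. With two possible roles per TF, eight TFs give 2^8 = 256 combinations per CRM.

```python
from itertools import product

TFS = ["Bcd", "Cad", "Hb", "Tll", "Gt", "Kr", "Kni", "TorRE"]

def best_roles(crm, score_fn, tfs=TFS):
    """Try every activator ('+') / repressor ('-') assignment; keep the best."""
    best_score, best_assignment = float("-inf"), None
    for roles in product("+-", repeat=len(tfs)):
        assignment = dict(zip(tfs, roles))
        score = score_fn(crm, assignment)  # stand-in for training + CC
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_assignment, best_score

def dual_function_tfs(best_by_crm, tfs=TFS):
    """TFs whose best role is '+' for some CRMs and '-' for others."""
    return [tf for tf in tfs
            if len({roles[tf] for roles in best_by_crm.values()}) > 1]

# Toy data (hypothetical): the "true" roles differ between the CRMs only for Hb.
toy_truth = {
    "crmA": dict(zip(TFS, "+++-++-+")),
    "crmB": dict(zip(TFS, "++--++-+")),
}

def toy_score(crm, assignment):
    """Hypothetical score: number of TFs whose role matches the toy truth."""
    return sum(assignment[tf] == toy_truth[crm][tf] for tf in TFS)

best_by_crm = {crm: best_roles(crm, toy_score)[0] for crm in toy_truth}
dual = dual_function_tfs(best_by_crm)
```

The exhaustive loop is only feasible because the number of TFs is small; each additional TF doubles the number of models to train per CRM.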
TFs with dual role
                 Bcd   Cad   Hb   Tll   Gt    Kr   Kni   TorRE
 Det. roles      s     +     s    -     s     s    -     s
 Literature      +     +     s    -     (s)   s    -     NA
 (consensus)

 “s”: dual-functioning, “+”: activator, “-”: repressor.

•  E.g. Hb
     –  Activator for 17 CRMs
     –  Repressor for 27 CRMs

Perkins, T. J. et al. Reverse engineering the gap gene network of Drosophila melanogaster. PLoS Comput Biol, 2006, 2, e51
Schroeder, M. D. et al. Transcriptional control in the segmentation gene network of Drosophila. PLoS Biol, 2004, 2, E271
Improvement with dual function
[Figure: mRNA against AP position (0-100) for the CRMs kr_CD1_ru, hb_anterior_actv, run_stripe5 and eve_37ext_ru; curves: target, previous roles, HbDual, KrDual, HbKrDual, best.]

Experiment        Number of free parameters   Mean CC (SE)
Previous roles    18                          0.27 (0.008)
HbDual            19                          0.35 (0.009)
KrDual            19                          0.37 (0.007)
HbKrDual          20                          0.38 (0.007)

Bauer, D. C.; Buske, F. A. & Bailey, T. L. Dual functioning transcription factors regulated by SUMOylation in the developmental gene network of Drosophila melanogaster. Submitted for publication, 2009
Marker motifs for dual function
•  Running MEME on the protein sequence of dual-
   functioning TFs to find short motifs (<6aa) present
   in all of them.




[Figure: two MEME sequence logos (bits against motif positions 1-4); a SUMOylation motif is marked.]
SUMOylation
•  Small Ubiquitin-related Modifier: a small protein covalently attached
   to target proteins.
•  Involved in many pathways/mechanisms
    –  Compartmentalization
    –  Transcriptional regulation
        •  Can reverse the function of a TF, e.g. Ikaros (the human
           homologue of Kr)
•  SUMO (Smt3) is present in D. mel. during development

[Figure: the SUMO pathway, with SUMO (SU), ATP, the E1 activating enzyme, the E2 conjugating enzyme, the target protein (+ E3 ligase) and the SUMO protease.]

Bauer, D. C.; Buske, F. A.; Bailey, T. L. & Bodén, M. Predicting SUMOylation sites in developmental transcription factors of Drosophila melanogaster. Neurocomputing, 2009, in submission
del Arco, P. G. et al. Ikaros SUMOylation: switching out of repression. Mol Cell Biol, 2005, 25, 2688-2697
Conclusion
•  Thermodynamic models can be best optimized using SA but
   over-fitting is an issue to keep in mind.
       Bauer, D. C. & Bailey, T. L. OpJmizing staJc thermodynamic models of transcripJonal regulaJon. BioinformaJcs, 2009, 25, 1640‐1646  



•  Non-the-less, they are applicable for
   –  examining the mechanisms of transcriptional regulation,
   –  explore the evolution of a particular regulatory mechanism
       Bauer, D. C. & Bailey, T. L. Studying the funcJonal conservaJon of cis‐regulatory modules and their transcripJonal output. BMC BioinformaJcs, 2008, 9, 220   



•  Model prediction improves when dual-function is allowed.
       Bauer, D. C.; Buske, F. A. & Bailey, T. L. Dual functioning transcription factors regulated by SUMOylation in the developmental gene network of Drosophila
       melanogaster. Submitted for publication, 2009


   –  SUMOylation seems to be a good candidate for the biological
      mechanism of role-change.
       Bauer, D. C.; Buske, F. A.; Bailey, T. L. & Bodén, M. Predicting SUMOylation sites in developmental transcription factors of Drosophila melanogaster.
       Neurocomputing, 2009, in submission
Acknowledgments
•  IMB
    –  Timothy Bailey (supervisor)
    –  Mikael Bodén (supervisor)
    –  Sean Grimmond (thesis committee)
    –  Nick Hamilton (thesis committee)
    –  Fabian Buske
    –  Stefan Maetschke
•  Stony Brook University
    –  John Reinitz
•  Funding
    –  Institute for Molecular Bioscience, The University of Queensland
    –  Australian Research Council Centre of Excellence in Bioinformatics
    –  National Institutes of Health
    –  UQ International Research Tuition Award




                            Framework for modeling, visualizing, and predicting the
                            regulation of the transcription rate of a target gene
                              www.bioinformatics.org.au/stream
www.bioinformatics.org.au/stream


•  Framework for modeling, visualizing,
   and predicting the regulation of the
   transcription rate of a target gene.
•  Publicly available
•  Modular: New functions can be
   plugged in




[Screenshot labels: Command line; Many functions]




                             Bauer, D. C. and Bailey, T. L. STREAM - Static Thermodynamic REgulAtory Model for
                             transcriptional. Bioinformatics, 2008, 24, 2544-2545.

Deciphering the regulatory code in the genome

  • 1. Deciphering the regulatory code in the genome. PhD completion seminar. Denis C. Bauer, Institute for Molecular Bioscience, The University of Queensland, Australia. Image credits: yankodesign; linh.ngân
  • 2. Research Aim: Develop a method (thermodynamic model) that translates the regulatory message in the DNA specifying when and how strongly a gene is expressed. [Illustration: the DNA sequence AAGAAGGTTTTAGTTTAGCC CACCGTAGGTACCTGAAGAA GAAGGTTTTAGTTTAGCCCA CCGTAGGTACCTGAAG is translated into "Express gene with 70% capacity when it is hot, Thanks!"]
  • 3. Why is understanding transcriptional regulation important? •  Insight into the biology of gene pathways. •  Search for regulatory regions with specific function. •  “Re-programming” of genes has therapeutic potential. [Diagram: DNA with promoter, gene and transcription; a broken regulatory element; design and insert a new regulatory element]
  • 4. What do we need to know to build a model able to translate the regulatory message?
  • 5. Background: Enhancer •  Genes can have independent “switches” (enhancers) beyond the core promoter, which can start the transcription of the target gene under different conditions. [Diagram: enhancer regions, promoter, gene, transcription]
  • 6. Background: Enhancer •  Transcription is regulated by the binding of activator and repressor TFs to an enhancer region. [Diagram: enhancer with binding site map; active TF concentration; 8 activators, 2 repressors; transcription]
  • 7. Background: Repression •  Transcriptional regulation also depends on the interplay between activators and repressors, i.e. where they bind relative to each other. [Diagram: repressor range on the enhancer binding site map]
  • 8. On which system would we test the model's abilities?
  • 9. Background: Even-skipped gene (eve) •  Drosophila melanogaster [1] •  Embryo stained for eve [2] •  Function representation [3]. Image sources: [1] http://insects.eugenes.org/ [2] Small et al. [3] http://bioinform.genetika.ru
  • 10. Background: Regulation of eve [Diagram: eve locus with MSE regions labeled Late1, 3+7, 2, P, late2, 4+6, 1, 5; lacZ] Janssens, H. et al. Quantitative and predictive model of transcriptional control of the Drosophila melanogaster even skipped gene. Nat Genet, 2006, 38, 1159-1165
  • 11. Hypothesis: [TF concentrations] + [binding site map] + [genome architecture, RNA, methylation, …] predicts gene activation
  • 12. Research Goals •  Optimize Thermodynamic models efficiently. •  Analyze robustness of these models. •  Explore the regulation of a particular gene. •  Examine how the regulatory program evolves. •  Extend current thermodynamic model. Cooperphoto/CORBIS 
  • 13. Model definition
    Site occupancy (Hill function):
    \[ p(s, t) = \frac{K_t \cdot K(s, t) \cdot [t]}{1 + K_t \cdot K(s, t) \cdot [t]} \]
    Total activation (activator contribution times the quenching of the activator):
    \[ W(S, T) = \sum_{s \in S^A} E_{t_s} \, p(s, t_s) \prod_{s' \in S^R} \bigl( 1 - E_{t_{s'}} \cdot p(s', t_{s'}) \cdot d(s, s') \bigr) \]
    Transcription rate (Arrhenius function):
    \[ R(S, T) = \begin{cases} R_0 \exp\bigl( W(S, T) - G_0 \bigr) & \text{iff } W < G_0 \\ R_0 & \text{otherwise} \end{cases} \]
    Free parameters. TF params: K (binding affinity), E (effectiveness). General params: R_0 (max. transcription rate), G_0 (energy barrier).
    Image credit: Buena Vista Pictures
    Janssens, H. et al. Quantitative and predictive model of transcriptional control of the Drosophila melanogaster even skipped gene. Nat Genet, 2006, 38, 1159-1165
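The three formulas on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the STREAM implementation: the function names, the `(E, p, position)` tuple layout for a site, and the step-shaped distance function `step_distance` (1 inside a fixed reach, 0 outside) are hypothetical choices made here; the slide does not specify the form of d(s, s').

```python
import math

def occupancy(k_t, k_site, conc):
    """Hill-type site occupancy p(s, t) = K_t*K(s,t)*[t] / (1 + K_t*K(s,t)*[t])."""
    x = k_t * k_site * conc
    return x / (1.0 + x)

def step_distance(pos_a, pos_r, reach=50):
    """Hypothetical d(s, s'): full quenching within `reach` bases, none beyond."""
    return 1.0 if abs(pos_a - pos_r) <= reach else 0.0

def total_activation(activator_sites, repressor_sites, distance):
    """W(S, T): sum over activator sites of E*p, each scaled by repressor quenching.

    Each site is a tuple (E, p, position); this layout is an assumption.
    """
    w = 0.0
    for e_a, p_a, pos_a in activator_sites:
        quench = 1.0
        for e_r, p_r, pos_r in repressor_sites:
            quench *= 1.0 - e_r * p_r * distance(pos_a, pos_r)
        w += e_a * p_a * quench
    return w

def transcription_rate(w, r0, g0):
    """Arrhenius-type rate: R0*exp(W - G0) below the barrier G0, capped at R0."""
    return r0 * math.exp(w - g0) if w < g0 else r0
```

For example, one activator site with E = 1 and p = 0.5 next to one repressor site with E = 1 and p = 0.5 inside the reach gives W = 0.5 · (1 − 0.5) = 0.25.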
  • 14. Training the model [Diagram: TF binding and TF concentrations < [TF1], [TF2], [TF3], [TF4] > feed into the thermodynamic model, which outputs the predicted expression; compare it to the target; adjust model parameters to improve fit]
  • 15. Optimization methods •  Two optimization paradigms –  Simulated Annealing •  LAM schedule (Reinitz et al. 2003) •  Geometric cooling –  Gradient descent •  Three GD variants approximating the objective function, which was not continuously differentiable. •  Judged on accuracy achieved in the given time –  Drosophila MSE2 data with 400 data points and 7 TF (16 free parameters).
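As a rough illustration of the simulated annealing paradigm with geometric cooling, the sketch below minimizes a generic error function over a parameter vector. It is an assumption-laden toy, not the procedure used in the cited paper: the name `anneal_geometric`, the uniform single-parameter proposal, and all default values are hypothetical, and the LAM schedule is not implemented.

```python
import math
import random

def anneal_geometric(objective, start, step=0.1, t0=1.0, alpha=0.95,
                     n_iter=2000, seed=0):
    """Minimize `objective` by simulated annealing with geometric cooling."""
    rng = random.Random(seed)
    current = list(start)
    current_err = objective(current)
    best, best_err = list(current), current_err
    temp = t0
    for _ in range(n_iter):
        # Propose a small change to one randomly chosen parameter.
        cand = list(current)
        i = rng.randrange(len(cand))
        cand[i] += rng.uniform(-step, step)
        cand_err = objective(cand)
        delta = cand_err - current_err
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_err = cand, cand_err
            if current_err < best_err:
                best, best_err = list(current), current_err
        # Geometric cooling: T <- alpha * T, floored to avoid division by zero.
        temp = max(temp * alpha, 1e-12)
    return best, best_err
```

Accepting occasional uphill moves early on is what lets the search leave a local minimum, which a plain gradient descent cannot do.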
  • 16. Optimization [Figure: RMS error and CC versus time [minutes]; panel "Simulated Annealing" (SA LAM, SA geom), panel "Gradient Descent" (GD_softmax, GD_nomax, GD_max, with SA_geom for comparison)] Suggests: many local minima. Bauer, D. C. & Bailey, T. L. Optimizing static thermodynamic models of transcriptional regulation. Bioinformatics, 2009, 25, 1640-1646
  • 17. If gradient descent gets stuck in local minima all the time, what does the optimization landscape look like?
  • 18. Landscape analysis •  Synthetic data based on real MSE2 data –  global minimum and solution (parameter values) are known. –  Measuring distance of the optimization solution to the starting position and the known solution. –  Measuring error reduction at the solution compared to the starting position.
  • 19. Landscape analysis
    Experiment      Initial distance to solution (mean)   Final distance to solution (mean)   Error reduction (mean)
    1% perturbed    3.4·10^-4                             2.8·10^-4                           88%
    random          0.1                                   0.11                                97%
    Conclusion: many local minima.
    Bauer, D. C. & Bailey, T. L. Optimizing static thermodynamic models of transcriptional regulation. Bioinformatics, 2009, 25, 1640-1646
  • 20. Does the model over-fit?
    •  Cross-validation (5-fold)
       Experiment   Mean RMS error (SE)   Mean CC (SE)
       training     13.39 (0.004)         0.92 (4.8·10^-5)
       testing      14.04 (0.005)         0.91 (5.7·10^-5)
    •  Redundancy reduction
       –  Not enough data to begin with
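A 5-fold cross-validation like the one on slide 20 splits the data points into five parts, trains on four and tests on the fifth, and rotates. The sketch below only produces the index splits; how the talk assigned points to folds is not stated, so the random shuffling and the function name are assumptions.

```python
import random

def k_fold_splits(n_points, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test
```

With 400 data points this gives five splits of 320 training and 80 test points. A test error clearly above the training error across folds would point to over-fitting.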
  • 21. Summary: Optimization & Analysis •  The objective function is ill-posed. –  It has a plethora of local minima. –  It might have many global minima. •  Hence SA is the method of choice. •  There might be a tendency to over-fit the data. Images: http://www2.cmp.uea.ac.uk/~aih/code/SVM/KernelTrickDemo.html http://images.nciku.com/
  • 22. Research Goals •  Optimize Thermodynamic models efficiently •  Analyze robustness of these models •  Explore the regulation of a particular gene •  Examine how the regulatory program evolves •  Extend current thermodynamic model Cooperphoto/CORBIS 
  • 23. Regulation and Evolution of eve •  Mechanism for regulating eve is conserved: –  Stripe 2 elements from other Drosophila species activate eve in D. mel. correctly. –  Despite the substantial difference in the regulatory DNA sequence. Image: http://www.bio.ilstu.edu/Edwards/ Hare, E. E. et al. Sepsid even-skipped enhancers are functionally conserved in Drosophila despite lack of sequence conservation. PLoS Genet, 2008, 4, e1000106
  • 24. Evaluate Evolution of MSE2 •  Test if the model can identify the MSE2 in these other species. •  Test if the model correctly predicts the transcriptional output of the homologous MSE2s.
  • 25. Searching for MSE2 •  Apply a model trained on D. mel. MSE2 to the TFBS-map from sequential windows to find the MSE2 in other species. [Figure: a window slides along the locus (MSE2, promoter, eve) of the other species; the expression predicted for each window is compared with the target profile, giving one RMS error per window, e.g. < 23 27 43 … 13 … >.] Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220
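The window search on slide 25 can be sketched as a scan that scores every window and keeps the one with the lowest RMS error. The callback `predict_window` is a hypothetical stand-in for "build the TFBS map of this window and run the trained model"; window size and step are likewise assumptions.

```python
import math

def rms_error(predicted, target):
    """Root-mean-square difference between two expression profiles."""
    n = len(target)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / n)

def scan_for_module(seq_len, window, step, predict_window, target):
    """Return (lowest RMS error, window start) over all sequential windows."""
    scores = []
    for start in range(0, seq_len - window + 1, step):
        predicted = predict_window(start, start + window)
        scores.append((rms_error(predicted, target), start))
    # Tuples sort by error first, so min() picks the best-fitting window.
    return min(scores)
```

The list of per-window errors corresponds to the vector of RMS values shown on the slide; its minimum marks the candidate module.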
  • 26. Searching for MSE2: Result •  Correctly identified the MSE2 in 6/8 species. [Figure: RMS error plotted against genomic location, one panel per species (D. melanogaster, D. pseudoobscura, …rimshawi).] Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220
  • 27. Predicting the output in other species •  Apply a model trained on D. mel. MSE2 to the MSE2s in other species. [Figure, left: relative RNA concentration against A-P position (%) for the target and for the predictions for D. melanogaster, D. pseudoobscura, D. ananassae and D. mojavensis. Figure, right: log odds score (bits) against rel. genomic position for D. melanogaster and D. mojavensis, with binding sites for bicoid, kruppel, giant, hunchback, knirps, caudal and tailless.] Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220
  • 28. Summary Application •  Model fits the data qualitatively. •  Predictions are biologically meaningful. •  However, there is room for improvement.
  • 29. Research Goals •  Optimize Thermodynamic models efficiently •  Analyze robustness of these models •  Explore the regulation of a particular gene •  Examine how the regulatory program evolves •  Extend current thermodynamic model Cooperphoto/CORBIS 
  • 30. One role fits them all? •  Dual function is proposed for some of the regulatory TFs. –  E.g. TF Hunchback (Hb) might be an activator when regulating stripe 2 and a repressor for stripe 3. [Figure: map of the locus with elements labelled Late1, 3+7, 2, P, late2, 4+6, 1 and 5.] Papatsenko, D. & Levine, M. S. Dual regulation by the Hunchback gradient in the Drosophila embryo. Proc Natl Acad Sci U S A, 2008, 105, 2901-2906 Schroeder, M. D. et al. Transcriptional control in the segmentation gene network of Drosophila. PLoS Biol, 2004, 2, E271
  • 31. Determine the regulatory role of TFs •  Different data set: 44 CRMs important for D. mel. development but same set of TFs. •  Determine the best role for each TF in each of the CRMs –  Brute force: train a model for all TF role-combinations on each of the 44 CRMs. –  Record the correlation achieved. –  Identify TFs that have dual function. Segal, E. et al. Predicting expression patterns from regulatory sequence in Drosophila segmentation. Nature, 2008, 451, 535-540 Bauer, D. C.; Buske, F. A. & Bailey, T. L. Dual functioning transcription factors regulated by SUMOylation in the developmental gene network of Drosophila melanogaster. Submitted for publication, 2009
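The brute-force procedure on slide 31 enumerates every assignment of activator or repressor to the TFs (2^n combinations for n TFs), trains a model for each on every CRM, and keeps the best-correlating assignment. A minimal sketch follows; `train_and_score` is a hypothetical callback standing in for model training, and calling a TF dual-functioning when its best role differs between CRMs is an assumed reading of the slide.

```python
from itertools import product

ROLES = ("activator", "repressor")

def best_roles_per_crm(tfs, crms, train_and_score):
    """For each CRM, try all role combinations and keep the one with the highest CC."""
    best = {}
    for crm in crms:
        scored = []
        for combo in product(ROLES, repeat=len(tfs)):
            roles = dict(zip(tfs, combo))
            scored.append((train_and_score(crm, roles), combo))
        cc, combo = max(scored)
        best[crm] = (dict(zip(tfs, combo)), cc)
    return best

def dual_function_tfs(best):
    """TFs whose best role is not the same in every CRM."""
    seen = {}
    for roles, _cc in best.values():
        for tf, role in roles.items():
            seen.setdefault(tf, set()).add(role)
    return sorted(tf for tf, found in seen.items() if len(found) > 1)
```

Because the number of combinations doubles with every TF, this exhaustive search is only practical for a small TF set such as the one used here.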
  • 32. TFs with dual role
                              Bcd  Cad  Hb   Tll  Gt   Kr   Kni  TorRE
    Det. roles                s    +    s    -    s    s    -    s
    Literature (consensus)    +    +    s    -    (s)  s    -    NA
    “s”: dual-functioning, “+”: activator, “-”: repressor.
    •  E.g. Hb
       –  Activator for 17 CRMs
       –  Repressor for 27 CRMs
    Perkins, T. J. et al. Reverse engineering the gap gene network of Drosophila melanogaster. PLoS Comput Biol, 2006, 2, e51
    Schroeder, M. D. et al. Transcriptional control in the segmentation gene network of Drosophila. PLoS Biol, 2004, 2, E271
  • 33. Improvement with dual function
    Experiment       Number of free parameters   Mean CC (SE)
    Previous roles   18                          0.27 (0.008)
    HbDual           19                          0.35 (0.009)
    KrDual           19                          0.37 (0.007)
    HbKrDual         20                          0.38 (0.007)
    [Figure: mRNA level against AP position for the CRMs kr_CD1_ru, hb_anterior_actv, run_stripe5 and eve_37ext_ru; curves for target, previous roles, HbDual, KrDual and HbKrDual best.]
    Bauer, D. C.; Buske, F. A. & Bailey, T. L. Dual functioning transcription factors regulated by SUMOylation in the developmental gene network of Drosophila melanogaster. Submitted for publication, 2009
  • 34. Marker motifs for dual function •  Running MEME on the protein sequences of dual-functioning TFs to find short motifs (<6 aa) present in all of them. [Figure: two MEME sequence logos (bits over positions 1 to 4), with the label “SUMOylation motif”.]
  • 35. SUMOylation •  Small Ubiquitin-related Modifier: a small protein covalently attached to target proteins. •  Involved in many pathways/mechanisms –  Compartmentalisation –  Transcriptional regulation •  Can reverse the function of a TF, e.g. Ikaros (the human homologue of Kr) •  SUMO (Smt3) is present in D. mel during development [Figure: SUMO pathway with SUMO protease, ATP, E1 activating enzyme, E2 conjugating enzyme, E3 ligase and target protein.] Bauer, D. C.; Buske, F. A.; Bailey, T. L. & Bodén, M. Predicting SUMOylation sites in developmental transcription factors of Drosophila melanogaster. Neurocomputing, 2009, in submission del Arco, P. G. et al. Ikaros SUMOylation: switching out of repression. Mol Cell Biol, 2005, 25, 2688-2697
  • 36. Conclusion •  Thermodynamic models are best optimized using SA, but over-fitting is an issue to keep in mind. Bauer, D. C. & Bailey, T. L. Optimizing static thermodynamic models of transcriptional regulation. Bioinformatics, 2009, 25, 1640-1646 •  Nonetheless, they are applicable for –  examining the mechanisms of transcriptional regulation, –  exploring the evolution of a particular regulatory mechanism. Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220 •  Model prediction improves when dual function is allowed. Bauer, D. C.; Buske, F. A. & Bailey, T. L. Dual functioning transcription factors regulated by SUMOylation in the developmental gene network of Drosophila melanogaster. Submitted for publication, 2009 –  SUMOylation seems to be a good candidate for the biological mechanism of role change. Bauer, D. C.; Buske, F. A.; Bailey, T. L. & Bodén, M. Predicting SUMOylation sites in developmental transcription factors of Drosophila melanogaster. Neurocomputing, 2009, in submission
  • 37. Acknowledgments •  IMB –  Timothy Bailey (supervisor) –  Mikael Bodén (supervisor) –  Sean Grimmond (thesis committee) –  Nick Hamilton (thesis committee) –  Fabian Buske –  Stefan Maetschke •  Stony Brook University –  John Reinitz •  Funding –  Institute for Molecular Bioscience, The University of Queensland –  Australian Research Council Centre of Excellence in Bioinformatics –  National Institutes of Health –  UQ International Research Tuition Award Framework for modeling, visualizing, and predicting the regulation of the transcription rate of a target gene: www.bioinformatics.org.au/stream
  • 38. www.bioinformatics.org.au/stream •  Framework for modeling, visualizing, and predicting the regulation of the transcription rate of a target gene. •  Publicly available •  Modular: new functions can be plugged in •  Many functions •  Command line Bauer, D. C. & Bailey, T. L. STREAM - Static Thermodynamic REgulAtory Model for transcriptional. Bioinformatics, 2008, 24, 2544-2545.