Bioinformatics in the Bourne Lab


                                     Philip E. Bourne
                                    pbourne@ucsd.edu

                                        BILD 94
                                       May 3, 2012

August 14, 2009



   5/3/12            UCSD BILD 94                       1
Some Personal Background ….




The Life of One Scientist – The Early Years
                   So That You Might Not Make the Same Mistakes




• My high school teacher Mr. Wilson said I would be a failure at chemistry
• My PhD is in chemistry
• The opportunity to live in different places shaped my life
• Good friends are forever
40+ Years Later




                       Ten Simple Rules for Starting a Company
                       PLoS Comp Biol 2012 8(3) 1002439

PhD in Physical Chemistry




Always Loved Computing




                                 Circa 1974
Postdoctoral Work – The Molecular
         Basis of How the Body Works




• Regrets: never
  learnt another
  language
Post Doc




Some Things Stay with You Your Whole
                Life




Senior Scientist HHMI Columbia
               University New York




• Driven not by career but
  wanting to live in New York
  City
~1990 Got Involved with the Human Genome

                                 • Was only possible by
                                   applying computers to
                                   problems in biology

                                 • Developed algorithms
                                   to support physical and
                                   genetic mapping of Chr
                                   13



Came to UCSD to Apply Computers to
         Big Biological Problems




• Possibly the best place in the
  world to do computational
  biology
The Protein Kinase Family
 •A large family
 important to signal
 transduction in
 eukaryotes and many
 bacteria.
 •Phosphotransferases:
 transfer phosphate
 group from ATP to
 Ser/Thr or Tyr residue on
 target protein,
 producing a range of
 downstream signaling
 effects.
 •PKA: an example of a
 typical protein kinase
 (TPK) fold, shown in
 “open book” format


Sometimes Ya Got to Just Do It Yourself




The Growth of Data Is a Major Driver in Biology
[Figure: number of released entries by year]
Demo




Big Research Questions in the Lab
                                  1.     Can we improve how science is
                                         disseminated and
                                         comprehended?
                                  2.     What is the ancestry of the
                                         protein structure universe and
                                         what can we learn from it?
                                  3.     Are there alternative ways to
                                         represent proteins from which
                                         we can learn something new?
                                  4.     What really happens when we
                                         take a drug?
                                  5.     Can we contribute to the
                                         treatment of neglected
                                         {tropical} diseases?

Studying Evolution
         Through Structure

Nature’s Reductionism
             There are ~20^300 possible proteins
             >>>> all the atoms in the Universe



             11.2M protein sequences from
             10,854 species (source RefSeq)



             38,221 protein structures
             yield 1195 domain folds (SCOP 1.75)
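
A quick order-of-magnitude check of the first comparison on this slide, as a sketch only. It assumes the count reads 20^300, that is, 20 amino acids at each of 300 positions. The figure of roughly 10^80 atoms in the observable universe is a commonly quoted estimate and does not come from the slide.

```python
import math

AMINO_ACIDS = 20
CHAIN_LENGTH = 300             # assumed chain length behind "20^300"
LOG10_ATOMS_IN_UNIVERSE = 80   # assumption: commonly quoted rough estimate

# log10(20^300) = 300 * log10(20); working in logs avoids huge integers
log10_sequences = CHAIN_LENGTH * math.log10(AMINO_ACIDS)
excess_orders = log10_sequences - LOG10_ATOMS_IN_UNIVERSE

print(f"20^300 is about 10^{log10_sequences:.0f}")
print(f"about {excess_orders:.0f} orders of magnitude above 10^{LOG10_ATOMS_IN_UNIVERSE}")
```

Against that space, the 11.2M known sequences and 1195 folds quoted on the slide are a vanishingly small sample, which is the reductionism the title refers to.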
Initial Question:
    With the current coverage of proteomes
     by structure and assuming we know a
    high percentage of all folds, is structure
       a useful discriminator of species?




Chapter 2 Initial Findings




          Russ Doolittle, Professor, Center for Molecular Genetics, UCSD
          Song Yang, Post Doc UC Berkeley, Department of Chemistry and Biochemistry, UCSD


              Yang, Doolittle & Bourne (2005) PNAS 102(2) 373-8


To Answer this Question We Only Need to
       Make Use of Existing Resources


• SCOP – Further catalogs Nature’s
  reductionism into structural domains, folds,
  families and superfamilies

• SUPERFAMILY assigns the above to fully
  sequenced proteomes

The SCOP Hierarchy v1.75
         Based on 38221 Structures

   Classes          7
   Folds            1195
   Superfamilies    1962
   Families         3902
   Domains          110800

Is Structure a Useful Discriminator of Species? - Maybe…

• Superfamily distributions would seem to be related to the complexity of life
• Update of the work of Caetano-Anollés and Caetano-Anollés (2003) Genome Research 13:1563

[Figure: Venn diagram of the distribution among the three kingdoms as taken from SUPERFAMILY: Eukaryota (650), Archaea (416), Bacteria (564). Legend: SCOP fold (765 total); any genome / all genomes.]

  Region                     Any genome / all genomes    SCOP folds
  Eukaryota only                     153/14                 135
  Eukaryota + Archaea                 21/2                   10
  Eukaryota + Bacteria               310/0                  118
  All three                          645/49                 387
  Archaea only                         9/1                   12
  Archaea + Bacteria                  29/0                   17
  Bacteria only                       68/0                   42
Method – Distance Determination

  Presence/Absence Data Matrix (rows: SCOP superfamilies (FSFs) from SUPERFAMILY; columns: organisms)

    FSF        C. intestinalis   C. briggsae   F. rubripes
    a.1.1             1               1             1
    a.1.2             1               1             1
    a.10.1            0               0             1
    a.100.1           1               1             1
    a.101.1           0               0             0
    a.102.1           0               1             1
    a.102.2           1               1             1

  Distance Matrix

                       C. intestinalis   C. briggsae   F. rubripes
    C. intestinalis           0              101           109
    C. briggsae                               0            144
    F. rubripes                                             0
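
The step from the presence/absence matrix to the distance matrix can be sketched as follows. This is a minimal illustration, assuming the distance is simply the number of superfamilies present in one proteome but absent from the other (a Hamming distance); the slide does not state the exact measure used. The seven rows are the ones shown on the slide, so the toy distances are far smaller than the full-proteome values of 101, 109 and 144.

```python
from itertools import combinations

# Presence (1) / absence (0) of the seven FSFs shown on the slide, in row order:
# a.1.1, a.1.2, a.10.1, a.100.1, a.101.1, a.102.1, a.102.2
presence = {
    "C. intestinalis": [1, 1, 0, 1, 0, 0, 1],
    "C. briggsae":     [1, 1, 0, 1, 0, 1, 1],
    "F. rubripes":     [1, 1, 1, 1, 0, 1, 1],
}


def hamming(a, b):
    """Count positions where one profile has the FSF and the other does not."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same superfamilies")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(profiles):
    """Upper-triangle distances keyed by (organism, organism)."""
    return {
        (p, q): hamming(profiles[p], profiles[q])
        for p, q in combinations(profiles, 2)
    }


distances = distance_matrix(presence)
for pair, d in distances.items():
    print(pair, d)
```

A matrix of this kind is what a standard tree-building method would take as input to produce the tree of life discussed on the following slides.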

Is Structure a Useful Discriminator of
                     Species? - Yes




    Archaea                    Bacteria                Eukaryota


          The method cleanly placed all species in their
                   correct superkingdoms
The Answer Would Appear to be Yes

                          • It is possible to
                            generate a reasonable
                            tree of life from merely
                            the presence or
                            absence of
                            superfamilies (FSFs)
                            within a given
                            proteome

Environmental Influence




                                      Chris Dupont
                           Scripps Institution of Oceanography
                                          UCSD
         Dupont, Yang, Palenik, Bourne. 2006 PNAS 103(47) 17822-17827

Consider the Distribution of Disulfide Bonds among Folds
• Disulfides are only stable under oxidizing conditions
• Oxygen content gradually accumulated during the earth's evolution
• The divergence of the three kingdoms occurred 1.8-2.2 billion years ago
• Oxygen began to accumulate ~2.0 billion years ago
• Logical deduction – disulfides more prevalent in folds (organisms) that evolved later
• This would seem to hold true
• Can we take this further?

[Figure: Venn diagram of SCOP folds (708 total) across Eukaryota, Archaea and Bacteria, showing the share of folds in each region that contain disulfide bonds]

  Region                     Folds with disulfides
  Eukaryota only                31.9% (43/135)
  Eukaryota + Archaea            0%   (0/10)
  Eukaryota + Bacteria          14.4% (17/118)
  All three                      4.7% (18/387)
  Archaea only                   0%   (0/2)
  Archaea + Bacteria             5.9% (1/17)
  Bacteria only                 16.7% (7/42)
Evolution of the Earth
• 4.5 billion years of change
• 300 ± 50 K
• 1-5 atmospheres
• Constant photoenergy
• Chemical and geological
  changes
• Life has evolved in this time

• The ocean was the “cradle”
  for 90% of evolution


Theoretical Levels of Trace Metals and Oxygen in the Deep Ocean Through Earth's History
• Whether the deep ocean became oxic or euxinic following the rise in atmospheric oxygen (~2.3 Gya) is debated, therefore both are shown (oxic ocean: solid lines; euxinic ocean: dashed lines).
• The phylogenetic tree symbols at the top of the figure show one idea as to the theoretical periods of diversification for each superkingdom.

[Figure: stacked panels for Bacteria, Archaea and Eukarya diversification, oxygen, zinc, iron, cobalt and manganese. y-axis: concentration (O2 in arbitrary units, Zn and Fe in moles L-1); x-axis: billions of years before present, 4.5 to 0.]

Replotted from Saito et al, 2003
Inorganica Chimica Acta 356: 308-318
Superfamily Distribution As Well As Overall Content Has Changed

[Figure: two charts, "Bacteria Fe superfamilies" and "Eukaryotic Fe superfamilies", sharing the same legend of superfamilies:]
   a.1.1     a.1.2     a.104.1   a.110.1   a.119.1   a.138.1
   a.2.11    a.24.3    a.24.4    a.25.1    a.3.1     a.39.3
   a.56.1    a.93.1    b.1.13    b.2.6     b.3.6     b.33.1
   b.70.2    b.82.2    c.56.6    c.83.1    c.96.1    d.134.1
   d.15.4    d.174.1   d.178.1   d.35.1    d.44.1    d.58.1
   e.18.1    e.19.1    e.26.1    e.5.1     f.21.1    f.21.2
   f.24.1    f.26.1    g.35.1    g.36.1    g.41.5
Hypothesis
    • Emergence of cyanobacteria changed oxygen
      concentrations
    • Impacted metal concentrations in the ocean
    • Organisms used new metals in new ways to
      evolve new biological processes, e.g., complex
      signaling
    • This in turn further impacted the environment



Big Research Questions in the Lab
                                  1.     Can we improve how science is
                                         disseminated and
                                         comprehended?
                                  2.     What is the ancestry of the
                                         protein structure universe and
                                         what can we learn from it?
                                  3.     Are there alternative ways to
                                         represent proteins from which
                                         we can learn something new?
                                  4.     What really happens when we
                                         take a drug?
                                  5.     Can we contribute to the
                                         treatment of neglected
                                         {tropical} diseases?

Our Motivation
                                         • Tykerb – Breast cancer
                                         • Gleevec – Leukemia, GI cancers
                                         • Nexavar – Kidney and liver cancer
                                         • Staurosporine – natural product (alkaloid) with many uses, e.g., antifungal, antihypertensive

                        Collins and Workman 2006 Nature Chemical Biology 2 689-700

Our Broad Approach
   • Involves the fields of:
          –    Structural bioinformatics
          –    Cheminformatics
          –    Biophysics
          –    Systems biology
          –    Pharmaceutical chemistry

   •     L. Xie, L. Xie, S.L. Kinnings and P.E. Bourne 2012 Novel Computational Approaches to Polypharmacology as a
         Means to Define Responses to Individual Drugs, Annual Review of Pharmacology and Toxicology 52: 361-379
   •     L. Xie, S.L. Kinnings, L. Xie and P.E. Bourne 2012 Predicting the Polypharmacology of Drugs: Identifying New Uses
         Through Bioinformatics and Cheminformatics Approaches in Drug Repurposing M. Barrett and D. Frail (Eds.) Wiley
         and Sons. (available upon request)




Approach - Need to Start with a 3D Drug-Receptor Complex – Either Experimental or Modeled

 Name            Other Name            Treatment                              PDB ID
 Lipitor         Atorvastatin          High cholesterol                       1HWK, 1HW8…
 Testosterone    Testosterone          Osteoporosis                           1AFS, 1I9J ..
 Taxol           Paclitaxel            Cancer                                 1JFF, 2HXF, 2HXH
 Viagra          Sildenafil citrate    ED, pulmonary arterial hypertension    1TBF, 1UDT, 1XOS..
 Digoxin         Lanoxin               Congestive heart failure               1IGJ

A Reverse Engineering Approach to Drug Discovery Across Gene Families
   1. Characterize the ligand binding site of the primary target (Geometric Potential)
   2. Identify off-targets by ligand binding site similarity (sequence order independent profile-profile alignment)
   3. Extract known drugs or inhibitors of the primary and/or off-targets
   4. Search for similar small molecules
   5. Dock molecules to both primary and off-targets
   6. Statistical analysis of docking score correlations

                                                   Xie and Bourne 2009
                                                   Bioinformatics 25(12) 305-312
Characterization of the Ligand Binding
             Site - The Geometric Potential

   Conceptually similar to hydrophobicity or electrostatic potential, in that it is
   dependent on both global and local environments
   • Initially assign each Cα atom a value that is the distance to the
     environmental boundary
   • Update the value with those of surrounding Cα atoms, dependent on
     distances and orientation – atoms within a 10 Å radius define the neighbors i

\[ GP = P + \sum_{\text{neighbors}} \frac{P_i}{D_i + 1.0} \times \frac{\cos(\alpha_i) + 1.0}{2.0} \]

                                           Xie and Bourne 2007 BMC Bioinformatics, 8(Suppl 4):S9
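
The update rule on this slide can be sketched in a few lines. This is an illustration only: it assumes the formula reads GP = P + Σ over neighbors of P_i / (D_i + 1.0) × (cos(α_i) + 1.0) / 2.0, with P the initial value of the atom, P_i the initial value of neighbor i, D_i the distance to it and α_i an orientation angle in radians. The function name and the input layout are hypothetical, and how α_i is measured is left to the cited paper.

```python
import math


def geometric_potential(p, neighbors, cutoff=10.0):
    """Sketch of the geometric potential for one atom.

    p         -- initial value of the atom (distance to the environmental boundary)
    neighbors -- iterable of (p_i, d_i, alpha_i) tuples, alpha_i in radians
    cutoff    -- neighbors farther than this distance (in angstroms) are ignored
    """
    total = p
    for p_i, d_i, alpha_i in neighbors:
        if d_i > cutoff:
            continue
        distance_term = p_i / (d_i + 1.0)               # closer neighbors count more
        orientation_term = (math.cos(alpha_i) + 1.0) / 2.0  # ranges from 0 to 1
        total += distance_term * orientation_term
    return total


# One aligned neighbor (alpha = 0) at 3 angstroms with initial value 4
print(geometric_potential(2.0, [(4.0, 3.0, 0.0)]))
```

The orientation term scales each neighbor's contribution between 0 (pointing directly away) and 1 (fully aligned), so the result depends on both the global value P and the local environment.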

Discrimination Power of the Geometric Potential

• Geometric potential can distinguish binding and non-binding sites

[Figure: distribution of geometric potential for residue clusters, binding site vs. non-binding site (x-axis: geometric potential); second panel: geometric potential scale, 0 to 100]
Local Sequence-order Independent Alignment with
          Maximum-Weight Sub-Graph Algorithm
                    Xie and Bourne 2008 PNAS, 105(14) 5441

[Figure: Structure A and Structure B, with matched residues LER and VKDL]

  • Build an associated graph from the graph representations of the two
    structures being compared. Each node is assigned a weight
    from the similarity matrix
  • The maximum-weight clique corresponds to the optimum alignment of
    the two structures
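
The two bullets above can be illustrated with a toy version of the idea. This is a sketch under stated assumptions, not the published algorithm: nodes are residue pairs (i from A, j from B), two nodes are joined when the inter-residue distances agree within a tolerance, and the clique search is exhaustive. All function names, the tolerance and the toy data are hypothetical.

```python
from itertools import product


def build_association_graph(dist_a, dist_b, similarity, tol=0.5):
    """Return (adjacency, node_weight) over residue pairs (i, j).

    Pairs (i, j) and (k, l) are connected when they use different residues
    and the distance i-k in A matches the distance j-l in B within tol.
    """
    nodes = [(i, j) for i, j in product(range(len(dist_a)), range(len(dist_b)))
             if similarity[i][j] > 0]
    node_weight = {(i, j): similarity[i][j] for i, j in nodes}
    adjacency = {n: set() for n in nodes}
    for (i, j), (k, l) in product(nodes, repeat=2):
        if i != k and j != l and abs(dist_a[i][k] - dist_b[j][l]) <= tol:
            adjacency[(i, j)].add((k, l))
    return adjacency, node_weight


def max_weight_clique(adjacency, node_weight):
    """Exhaustive search; fine for toy inputs, exponential in general."""
    best_weight, best_clique = 0.0, []

    def extend(clique, weight, candidates):
        nonlocal best_weight, best_clique
        if weight > best_weight:
            best_weight, best_clique = weight, list(clique)
        for node in list(candidates):
            candidates.discard(node)  # do not revisit this node in later branches
            extend(clique + [node], weight + node_weight[node],
                   candidates & adjacency[node])

    extend([], 0.0, set(adjacency))
    return best_weight, sorted(best_clique)


# Toy data: B holds the same three residues as A, listed in a different order
dist_a = [[0, 4, 7], [4, 0, 5], [7, 5, 0]]
dist_b = [[0, 7, 5], [7, 0, 4], [5, 4, 0]]
similarity = [[0.2, 1.0, 0.1], [0.3, 0.1, 0.9], [0.8, 0.2, 0.1]]

adjacency, node_weight = build_association_graph(dist_a, dist_b, similarity)
weight, alignment = max_weight_clique(adjacency, node_weight)
print(weight, alignment)
```

Because compatibility is decided by geometry and not by residue numbering, the matched pairs need not follow sequence order, which is the sense in which the alignment is sequence-order independent.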
Similarity Matrix of Alignment

Chemical Similarity
• Amino acid grouping: (LVIMC), (AGSTP), (FYW), and (EDNQKRH)
• Amino acid chemical similarity matrix

Evolutionary Correlation
• Amino acid substitution matrix such as BLOSUM45
• Similarity score between two sequence profiles



\[ d = \sum_i f_a^i S_b^i + \sum_i f_b^i S_a^i \]

 f_a, f_b are the 20 amino acid target frequencies of profiles a and b, respectively
 S_a, S_b are the PSSMs of profiles a and b, respectively
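
The profile-profile score can be sketched directly from those definitions. This is a minimal illustration that assumes the score reads d = Σ_i f_a^i S_b^i + Σ_i f_b^i S_a^i, summed over the amino acid types at one aligned position. The function name is hypothetical, and the example uses a made-up three-letter alphabet in place of the 20 amino acids to keep the numbers easy to follow.

```python
def profile_similarity(f_a, s_a, f_b, s_b):
    """Symmetric score between two profile columns.

    f_a, f_b -- target frequencies per residue type for profiles a and b
    s_a, s_b -- PSSM scores per residue type for profiles a and b
    """
    if not (len(f_a) == len(s_a) == len(f_b) == len(s_b)):
        raise ValueError("all four vectors must cover the same residue types")
    a_scored_by_b = sum(fa * sb for fa, sb in zip(f_a, s_b))
    b_scored_by_a = sum(fb * sa for fb, sa in zip(f_b, s_a))
    return a_scored_by_b + b_scored_by_a


# Toy three-letter alphabet (an assumption for illustration only)
f_a, s_a = [0.5, 0.5, 0.0], [2.0, 1.0, -1.0]
f_b, s_b = [0.0, 1.0, 0.0], [-1.0, 3.0, 0.0]
score = profile_similarity(f_a, s_a, f_b, s_b)
print(score)
```

Scoring each profile's frequencies against the other's PSSM and adding the two terms makes the measure symmetric in a and b.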

The Problem with Tuberculosis
 • One third of global population infected
 • 1.7 million deaths per year
 • 95% of deaths in developing countries
 • Anti-TB drugs hardly changed in 40 years
 • MDR-TB and XDR-TB pose a threat to
   human health worldwide
 • Development of novel, effective and
   inexpensive drugs is an urgent priority



The TB-Drugome
   1. Determine the TB structural proteome

   2. Determine all known drug binding sites
      from the PDB

   3. Determine which of the sites found in 2
      exist in 1

   4. Call the result the TB-drugome
                              Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976



1. Determine the TB Structural
                      Proteome




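 [Figure: of the 3,996 proteins in the TB proteome, 284 have solved structures, 1,446 have homology models and 2,266 have neither]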



 • High quality homology models from ModBase
   (http://modbase.compbio.ucsf.edu) increase structural
   coverage from 7.1% to 43.3%
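
The coverage figures can be checked against the counts in the diagram. This is a sketch that assumes 3,996 is the size of the TB proteome, 284 the number of proteins with solved structures and 1,446 the number added by homology models; that reading is inferred from the arithmetic and is not labeled on the slide.

```python
PROTEOME_SIZE = 3996   # assumed: total proteins in the TB proteome
SOLVED = 284           # assumed: proteins with experimental structures
MODELED = 1446         # assumed: proteins covered only by homology models

coverage_before = SOLVED / PROTEOME_SIZE
coverage_after = (SOLVED + MODELED) / PROTEOME_SIZE
uncovered = PROTEOME_SIZE - SOLVED - MODELED

print(f"structures only: {coverage_before:.1%}")
print(f"with models:     {coverage_after:.1%}")
print(f"still uncovered: {uncovered}")
```

Under those assumptions the three counts account for the whole proteome, and the two ratios reproduce the 7.1% and 43.3% quoted above.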
2. Determine all Known Drug
                       Binding Sites in the PDB
         • Searched the PDB for protein crystal structures
           bound with FDA-approved drugs
         • 268 drugs bound in a total of 931 binding sites

         [Figure: histogram of no. of drugs (y-axis, 0 to 140) against no. of drug binding sites (x-axis, 1 to 37); labeled drugs: Acarbose, Darunavir, Alitretinoin, Conjugated estrogens, Chenodiol, Methotrexate]
Map 2 onto 1 – The TB-Drugome
            http://funsite.sdsc.edu/drugome/TB/




Similarities between the binding sites of M.tb proteins (blue)
     and binding sites containing approved drugs (red).
Research is a Good Life

Contenu connexe

Plus de Philip Bourne

Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
AI in Medical Education A Meta View to Start a Conversation
Bioinformatics in the Bourne Lab.

  • 1. Bioinformatics in the Bourne Lab Philip E. Bourne pbourne@ucsd.edu BILD 94 May 3, 2012 August 14, 2009 5/3/12 UCSD BILD 94 1
  • 2. Some Personal Background …
  • 3. (no text)
  • 4. The Life of One Scientist – The Early Years: So That You Might Not Make the Same Mistakes • My high school teacher Mr. Wilson said I would be a failure at chemistry • My PhD is in chemistry • The opportunity to live in different places shaped my life • Good friends are forever
  • 5. 40+ Years Later: Ten Simple Rules for Starting a Company, PLoS Comp Biol 2012 8(3) 1002439
  • 6. (no text)
  • 7. PhD in Physical Chemistry
  • 8. Always Loved Computing (circa 1974)
  • 9. Postdoctoral Work – The Molecular Basis of How the Body Works • Regrets: never learnt another language
  • 10. Post Doc
  • 11. Some Things Stay with You Your Whole Life
  • 12. Senior Scientist HHMI Columbia University New York • Driven not by career but wanting to live in New York City
  • 13. ~1990 Got Involved with the Human Genome • Was only possible by applying computers to problems in biology • Developed algorithms to support physical and genetic mapping of Chr 13
  • 14. Came to UCSD to Apply Computers to Big Biological Problems • Possibly the best place in the world to do computational biology
  • 15. (no text)
  • 16. The Protein Kinase Family • A large family important to signal transduction in eukaryotes and many bacteria. • Phosphotransferases: transfer phosphate group from ATP to Ser/Thr or Tyr residue on target protein, producing a range of downstream signaling effects. • PKA: an example of a typical protein kinase (TPK) fold, shown in “open book” format
  • 17. Sometimes Ya Got to Just Do It Yourself
  • 18. The Growth of Data is A Major Driver in Biology [Figure: number of released entries by year]
  • 19. Demo
  • 20. Big Research Questions in the Lab 1. Can we improve how science is disseminated and comprehended? 2. What is the ancestry of the protein structure universe and what can we learn from it? 3. Are there alternative ways to represent proteins from which we can learn something new? 4. What really happens when we take a drug? 5. Can we contribute to the treatment of neglected {tropical} diseases?
  • 21. Studying Evolution Through Structure
  • 22. Nature’s Reductionism • There are ~20^300 possible proteins, far more than all the atoms in the Universe • 11.2M protein sequences from 10,854 species (source RefSeq) • 38,221 protein structures yield 1195 domain folds (SCOP 1.75)
  • 23. Initial Question: With the current coverage of proteomes by structure and assuming we know a high percentage of all folds, is structure a useful discriminator of species?
  • 24. Chapter 2 Initial Findings • Song Yang, Post Doc UC Berkeley; Department of Chemistry and Biochemistry, UCSD • Russ Doolittle, Professor, Center for Molecular Genetics, UCSD • Yang, Doolittle & Bourne (2005) PNAS 102(2) 373-8
  • 25. To Answer this Question We Only Need to Make Use of Existing Resources • SCOP – Further catalogs Nature’s reductionism into structural domains, folds, families and superfamilies • SUPERFAMILY assigns the above to fully sequenced proteomes
  • 26. The SCOP Hierarchy v1.75, Based on 38221 Structures: 7 classes, 1195 folds, 1962 superfamilies, 3902 families, 110800 domains
  • 27. Is Structure a Useful Discriminator of Species? - Maybe… • Superfamily distributions would seem to be related to the complexity of life • Update of the work of Caetano-Anolles and Caetano-Anolles (2003) Genome Biology 13:1563 [Venn diagram: distribution of SCOP folds (765 total) among the three kingdoms as taken from SUPERFAMILY: Eukaryota (650), Archaea (416), Bacteria (564); counts in each region are given as any genome / all genomes]
  • 28. Method – Distance Determination (Chapter 2 Initial Findings)
      Presence/Absence Data Matrix (SCOP superfamilies (FSF) from SUPERFAMILY, by organism)
      FSF      C. intestinalis  C. briggsae  F. rubripes
      a.1.1    1                1            1
      a.1.2    1                1            1
      a.10.1   0                0            1
      a.100.1  1                1            1
      a.101.1  0                0            0
      a.102.1  0                1            1
      a.102.2  1                1            1
      Distance Matrix
                       C. intestinalis  C. briggsae  F. rubripes
      C. intestinalis  0                101          109
      C. briggsae                       0            144
      F. rubripes                                    0
  • 29. Is Structure a Useful Discriminator of Species? - Yes [Tree with branches labeled Archaea, Bacteria, Eukaryota] The method cleanly placed all species in their correct superkingdoms
  • 30. The Answer Would Appear to be Yes • It is possible to generate a reasonable tree of life from merely the presence or absence of superfamilies (FSFs) within a given proteome
  • 31. Environmental Influence • Chris Dupont, Scripps Institution of Oceanography, UCSD • Dupont, Yang, Palenik, Bourne. 2006 PNAS 103(47) 17822-17827
  • 32. Consider the Distribution of Disulfide Bonds among Folds • Disulfides are only stable under oxidizing conditions • Oxygen content gradually accumulated during the earth’s evolution • The divergence of the three kingdoms occurred 1.8-2.2 billion years ago • Oxygen began to accumulate ~2.0 billion years ago • Logical deduction – disulfides more prevalent in folds (organisms) that evolved later • This would seem to hold true • Can we take this further? [Venn diagram of SCOP folds (708 total) across Eukaryota, Archaea and Bacteria, giving the fraction of folds containing disulfide bonds in each region: 31.9% (43/135), 0% (0/10), 14.4% (17/118), 4.7% (18/387), 0% (0/2), 5.9% (1/17), 16.7% (7/42)]
  • 33. Evolution of the Earth • 4.5 billion years of change • 300 ± 50 K • 1-5 atmospheres • Constant photoenergy • Chemical and geological changes • Life has evolved in this time • The ocean was the “cradle” for 90% of evolution
  • 34. Theoretical Levels of Trace Metals and Oxygen in the Deep Ocean Through Earth’s History • Whether the deep ocean became oxic or euxinic following the rise in atmospheric oxygen (~2.3 Gya) is debated, therefore both are shown (oxic ocean: solid lines, euxinic ocean: dashed lines). • The phylogenetic tree symbols at the top of the figure show one idea as to the theoretical periods of diversification for each Superkingdom (Bacteria, Archaea, Eukarya). [Figure: concentration panels for Oxygen (arbitrary units) and Zinc, Iron, Cobalt and Manganese (moles L-1), plotted against billions of years before present (4.5 to 0). Replotted from Saito et al, 2003 Inorganica Chimica Acta 356: 308-318]
  • 35. Superfamily Distribution As Well As Overall Content Has Changed [Two charts, Bacteria Fe superfamilies and Eukaryotic Fe superfamilies, sharing the legend: a.1.1, a.1.2, a.104.1, a.110.1, a.119.1, a.2.11, a.138.1, a.24.3, a.24.4, a.3.1, a.25.1, a.39.3, a.56.1, a.93.1, b.1.13, b.2.6, b.3.6, b.33.1, b.70.2, b.82.2, c.56.6, c.83.1, c.96.1, d.134.1, d.15.4, d.174.1, d.178.1, d.35.1, d.44.1, d.58.1, e.18.1, e.19.1, e.26.1, e.5.1, f.21.1, f.21.2, f.24.1, f.26.1, g.35.1, g.36.1, g.41.5]
  • 36. Hypothesis • Emergence of cyanobacteria changed oxygen concentrations • Impacted metal concentrations in the ocean • Organisms used new metals in new ways to evolve new biological processes, e.g., complex signaling • This in turn further impacted the environment
  • 37. Big Research Questions in the Lab 1. Can we improve how science is disseminated and comprehended? 2. What is the ancestry of the protein structure universe and what can we learn from it? 3. Are there alternative ways to represent proteins from which we can learn something new? 4. What really happens when we take a drug? 5. Can we contribute to the treatment of neglected {tropical} diseases?
  • 38. Our Motivation • Tykerb – Breast cancer • Gleevec – Leukemia, GI cancers • Nexavar – Kidney and liver cancer • Staurosporine – natural product – alkaloid – many uses, e.g., antifungal, antihypertensive • Collins and Workman 2006 Nature Chemical Biology 2 689-700
  • 39. Our Broad Approach • Involves the fields of: – Structural bioinformatics – Cheminformatics – Biophysics – Systems biology – Pharmaceutical chemistry • L. Xie, L. Xie, S.L. Kinnings and P.E. Bourne 2012 Novel Computational Approaches to Polypharmacology as a Means to Define Responses to Individual Drugs, Annual Review of Pharmacology and Toxicology 52: 361-379 • L. Xie, S.L. Kinnings, L. Xie and P.E. Bourne 2012 Predicting the Polypharmacology of Drugs: Identifying New Uses Through Bioinformatics and Cheminformatics Approaches in Drug Repurposing M. Barrett and D. Frail (Eds.) Wiley and Sons. (available upon request)
  • 40. Approach - Need to Start with a 3D Drug-Receptor Complex – Either Experimental or Modeled
      Generic Name  Other Name          Treatment                                PDBid
      Lipitor       Atorvastatin        High cholesterol                         1HWK, 1HW8…
      Testosterone  Testosterone        Osteoporosis                             1AFS, 1I9J ..
      Taxol         Paclitaxel          Cancer                                   1JFF, 2HXF, 2HXH
      Viagra        Sildenafil citrate  ED, pulmonary arterial hypertension      1TBF, 1UDT, 1XOS..
      Digoxin       Lanoxin             Congestive heart failure                 1IGJ
  • 41. A Reverse Engineering Approach to Drug Discovery Across Gene Families 1. Characterize ligand binding site of primary target (Geometric Potential) 2. Identify off-targets by ligand binding site similarity (Sequence order independent profile-profile alignment) 3. Extract known drugs or inhibitors of the primary and/or off-targets 4. Search for similar small molecules 5. Dock molecules to both primary and off-targets 6. Statistics analysis of docking score correlations. Xie and Bourne 2009 Bioinformatics 25(12) 305-312
  • 42. Characterization of the Ligand Binding Site - The Geometric Potential • Conceptually similar to hydrophobicity or electrostatic potential that is dependent on both global and local environments • Initially assign each Cα atom a value that is the distance to the environmental boundary • Update the value with those of surrounding Cα atoms dependent on distances and orientation – atoms within a 10 Å radius define the neighbors: $GP = P + \sum_{i \in \mathrm{neighbors}} \frac{P_i}{D_i + 1.0} \times \frac{\cos(\alpha_i) + 1.0}{2.0}$ • Xie and Bourne 2007 BMC Bioinformatics, 8(Suppl 4):S9
  • 43. Discrimination Power of the Geometric Potential • Geometric potential can distinguish binding and non-binding sites [Histogram: binding site vs non-binding site distributions along the Geometric Potential Scale For Residue Clusters (0 to 100)]
  • 44. Local Sequence-order Independent Alignment with Maximum-Weight Sub-Graph Algorithm (Xie and Bourne 2008 PNAS, 105(14) 5441) [Figure: Structure A and Structure B, with residues LER and VKDL matched independently of sequence order] • Build an associated graph from the graph representations of two structures being compared. Each of the nodes is assigned a weight from the similarity matrix • The maximum-weight clique corresponds to the optimum alignment of the two structures
  • 45. Similarity Matrix of Alignment • Chemical Similarity – Amino acid grouping: (LVIMC), (AGSTP), (FYW), and (EDNQKRH) – Amino acid chemical similarity matrix • Evolutionary Correlation – Amino acid substitution matrix such as BLOSUM45 – Similarity score between two sequence profiles: $d = \sum_{i} \left( f_a^i S_b^i + f_b^i S_a^i \right)$, where $f_a^i$, $f_b^i$ are the 20 amino acid target frequencies of profile a and b, respectively, and $S_a^i$, $S_b^i$ are the PSSM of profile a and b, respectively
  • 46. The Problem with Tuberculosis • One third of global population infected • 1.7 million deaths per year • 95% of deaths in developing countries • Anti-TB drugs hardly changed in 40 years • MDR-TB and XDR-TB pose a threat to human health worldwide • Development of novel, effective and inexpensive drugs is an urgent priority
  • 47. The TB-Drugome 1. Determine the TB structural proteome 2. Determine all known drug binding sites from the PDB 3. Determine which of the sites found in 2 exist in 1 4. Call the result the TB-drugome. Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976
  • 48. 1. Determine the TB Structural Proteome [Diagram: 3,996 proteins in the TB proteome, of which 284 have solved structures, 1,446 have high quality homology models and 2,266 have neither] • High quality homology models from ModBase (http://modbase.compbio.ucsf.edu) increase structural coverage from 7.1% to 43.3%
  • 49. 2. Determine all Known Drug Binding Sites in the PDB • Searched the PDB for protein crystal structures bound with FDA-approved drugs • 268 drugs bound in a total of 931 binding sites [Histogram: No. of drugs (0 to 140) against No. of drug binding sites (1 to 37); labeled drugs: Acarbose, Darunavir, Alitretinoin, Conjugated estrogens, Chenodiol, Methotrexate]
  • 50. Map 2 onto 1 – The TB-Drugome http://funsite.sdsc.edu/drugome/TB/ Similarities between the binding sites of M.tb proteins (blue), and binding sites containing approved drugs (red).
  • 51. Research is a Good Life
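
The distance determination on slide 28 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the slide does not state the distance metric, so the sketch assumes a simple count of superfamilies present in one organism but absent in the other (a Hamming distance). The function name `distance_matrix` is hypothetical, and only the seven rows shown on the slide are used, so the results do not reproduce the slide's 101/109/144, which come from the full matrix.

```python
from itertools import combinations

# Presence/absence rows copied from the slide 28 excerpt (seven superfamilies only).
ORGANISMS = ["C. intestinalis", "C. briggsae", "F. rubripes"]
PRESENCE = {
    "a.1.1":   (1, 1, 1),
    "a.1.2":   (1, 1, 1),
    "a.10.1":  (0, 0, 1),
    "a.100.1": (1, 1, 1),
    "a.101.1": (0, 0, 0),
    "a.102.1": (0, 1, 1),
    "a.102.2": (1, 1, 1),
}


def distance_matrix(presence, organisms):
    """Return a symmetric matrix of pairwise distances between organisms.

    Assumed metric: the number of superfamilies whose presence/absence
    value differs between the two organisms.
    """
    n = len(organisms)
    dist = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = sum(1 for row in presence.values() if row[i] != row[j])
        dist[i][j] = dist[j][i] = d
    return dist


DIST = distance_matrix(PRESENCE, ORGANISMS)

if __name__ == "__main__":
    for name, row in zip(ORGANISMS, DIST):
        print(f"{name:16s}", row)
```

A matrix of this kind can then be passed to any standard tree-building method to produce the tree of life described on slides 29 and 30.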

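The profile-profile similarity score on slide 45 can be illustrated with a short sketch. It assumes the score for one pair of aligned positions is d = sum over amino acids i of (f_a[i] * S_b[i] + f_b[i] * S_a[i]), where f are target frequencies and S are PSSM scores; this reading is reconstructed from the slide's flattened formula. The function name `profile_score` and the three-letter toy alphabet are hypothetical, used only to keep the arithmetic easy to follow (real profiles have 20 entries).

```python
def profile_score(f_a, s_a, f_b, s_b):
    """Score one pair of profile columns a and b.

    f_a, f_b: amino acid target frequencies for each profile column.
    s_a, s_b: PSSM scores for each profile column.
    Assumed form: d = sum_i (f_a[i] * s_b[i] + f_b[i] * s_a[i]).
    """
    if not (len(f_a) == len(s_a) == len(f_b) == len(s_b)):
        raise ValueError("all four vectors must have the same length")
    return sum(fa * sb + fb * sa for fa, sa, fb, sb in zip(f_a, s_a, f_b, s_b))


# Toy columns over a three-letter alphabet (illustrative values only).
F_A, S_A = [0.5, 0.5, 0.0], [2, 1, -1]
F_B, S_B = [1.0, 0.0, 0.0], [3, -2, -1]
SCORE = profile_score(F_A, S_A, F_B, S_B)

if __name__ == "__main__":
    print(SCORE)
```

The score is symmetric in a and b, so swapping the two profiles leaves it unchanged.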
Editor's notes

  1. Tuberculosis, which is caused by the bacterial pathogen Mycobacterium tuberculosis, is a leading cause of mortality among the infectious diseases. It has been estimated by the World Health Organization (WHO) that almost one-third of the world's population, around 2 billion people, is infected with the disease. Every year, more than 8 million people develop an active form of the disease, which claims the lives of nearly 2 million. This translates to over 4,900 deaths per day, and more than 95% of these are in developing countries. Despite the current global situation, antitubercular drugs have remained largely unchanged over the last four decades. The widespread use of these agents has provided a strong selective pressure for M. tuberculosis, thus encouraging the emergence of resistant strains. Multidrug resistant (MDR) tuberculosis is defined as resistance to the first-line drugs isoniazid and rifampin. The effective treatment of MDR tuberculosis necessitates long-term use of second-line drug combinations, an unfortunate consequence of which is the emergence of further drug resistance. Enter extensively drug resistant (XDR) tuberculosis: M. tuberculosis strains that are resistant to both isoniazid and rifampin, as well as key second-line drugs. Since the only remaining drug classes exhibit such low potency and high toxicity, XDR tuberculosis is extremely difficult to treat. The rise of XDR tuberculosis around the world poses a great threat to human health, reinforcing the development of new antitubercular agents as an urgent priority. Very few Mtb proteins have been explored as drug targets.
  2. 3,996 proteins in the TB proteome. 749 solved structures in the PDB, representing a total of 284 proteins (7.2% coverage). ModBase contains homology models for the entire TB proteome. 1,446 ‘high quality’ homology models were added to the data set. Structural coverage increased to 43.8%. Retained only those models with a model score of > 0.7 and a ModPipe quality score of > 1.1 (2818 models). There were multiple models per protein. For each TB protein, chose the model with the best model score, and if they were equal, chose the model with the best ModPipe quality score (1703 models). However, 251 (+6) models were removed since they correspond to TB proteins that already have solved structures (1446 models remained). The model score is a score for the reliability of a model, derived from statistical potentials (F. Melo, R. Sanchez, A. Sali, 2001). A model is predicted to be good when the model score is higher than a pre-specified cutoff (0.7). A reliable model has a probability of the correct fold that is larger than 95%. A fold is correct when at least 30% of its Calpha atoms superpose within 3.5 Å of their correct positions. The ModPipe Protein Quality Score (MPQS) is a composite score comprising sequence identity to the template, coverage, and the three individual scores E-value, z-DOPE and GA341. We consider an MPQS of > 1.1 as reliable.
  3. (nutraceuticals excluded)
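
The model selection described in note 2 can be sketched as a small filter. This is an illustrative sketch under stated assumptions, not the pipeline that was actually run: the `Model` record, the function `select_models` and the protein identifiers are hypothetical, while the thresholds (model score > 0.7, ModPipe quality score > 1.1) and the tie-breaking rule are taken from the note. Proteins with solved structures are skipped during the scan rather than removed at the end, which yields the same final set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    protein: str        # hypothetical protein identifier
    model_score: float  # ModBase model score
    mpqs: float         # ModPipe Protein Quality Score


def select_models(models, solved_proteins, min_score=0.7, min_mpqs=1.1):
    """Keep one reliable homology model per protein without a solved structure.

    A model is kept only if both scores exceed their cutoffs. Among the
    remaining models for a protein, the best model score wins, with the
    MPQS used to break ties.
    """
    best = {}
    for m in models:
        if m.model_score <= min_score or m.mpqs <= min_mpqs:
            continue  # fails a reliability cutoff
        if m.protein in solved_proteins:
            continue  # a solved structure already covers this protein
        current = best.get(m.protein)
        if current is None or (m.model_score, m.mpqs) > (current.model_score, current.mpqs):
            best[m.protein] = m
    return best


# Illustrative data only.
MODELS = [
    Model("P1", 0.90, 1.2),
    Model("P1", 0.90, 1.5),   # same model score, better MPQS: preferred
    Model("P2", 0.60, 2.0),   # model score too low
    Model("P3", 0.95, 1.3),   # protein already has a solved structure
    Model("P4", 0.80, 1.0),   # MPQS too low
]
SELECTED = select_models(MODELS, solved_proteins={"P3"})

if __name__ == "__main__":
    for protein, model in SELECTED.items():
        print(protein, model.model_score, model.mpqs)
```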