Phylogenomics

                         Jonathan A. Eisen
                            UC Davis

              Bodega Applied Phylogenetics Workshop
                          March 7, 2011
Tuesday, March 8, 2011
Fleischmann et al. 1995. Science 269:496-512

Whole Genome Shotgun Sequencing

    shotgun
    sequence

Warner Brothers, Inc.




Assemble Fragments

    sequencer output
    assemble fragments
    Closure & Annotation
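
The "assemble fragments" step can be illustrated with a toy greedy overlap merger. This is only a minimal sketch under simplifying assumptions (error-free reads, a single strand, no repeats); the function names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical and are not taken from the talk or from any particular assembler.

```python
from itertools import permutations


def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two sequences with the largest overlap."""
    unique = sorted(set(reads))  # sorted for deterministic tie-breaking
    # Drop reads that are wholly contained in another read.
    contigs = [r for r in unique
               if not any(r != other and r in other for other in unique)]
    while len(contigs) > 1:
        best_len, best_a, best_b = 0, None, None
        for a, b in permutations(contigs, 2):
            k = overlap(a, b, min_len)
            if k > best_len:
                best_len, best_a, best_b = k, a, b
        if best_len == 0:
            break  # no remaining overlaps: leave separate contigs
        contigs.remove(best_a)
        contigs.remove(best_b)
        contigs.append(best_a + best_b[best_len:])
    return sorted(contigs)


# Toy example: three overlapping reads from a made-up 15-base sequence.
reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
print(greedy_assemble(reads))
```

Reads that share no sufficiently long overlap are left as separate contigs, which is roughly where the closure step on the slide would come in.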




From http://genomesonline.org

Genome Sequences Have Revolutionized Microbiology
         • Predictions of metabolic processes
         • Better vaccine and drug design
         • New insights into mechanisms of evolution
         • Genomes serve as templates for functional studies
         • New enzymes and materials for engineering and synthetic biology

General Steps in Analysis of Complete Genomes
       • Identification/prediction of genes
       • Characterization of gene features
       • Characterization of genome features
       • Prediction of gene function
       • Prediction of pathways
       • Integration with known biological data
       • Comparative genomics
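
As a rough illustration of the first step (identification/prediction of genes), the sketch below scans the three forward reading frames for open reading frames that run from ATG to a stop codon. It is a deliberately naive example: the name `find_orfs` and the `min_codons` threshold are hypothetical, it assumes the standard stop codons, and it ignores the reverse strand. Real gene finders also rely on statistical models.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ATG...stop ORFs, forward strand.

    Coordinates are 0-based and end-exclusive; `end` includes the stop codon.
    `min_codons` counts codons from the ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs


# Made-up sequence with one short ORF in frame 2.
toy = "CCATGAAATTTGGGTAACC"
for start, end, frame in find_orfs(toy):
    print(frame, start, end, toy[start:end])
```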

Genome Size




Genome Structure: More Variable than Once Thought




Why Completeness is Important
     • Improves characterization of genome features
           – Gene order, replication origins
     • Better comparative genomics
           – Genome duplications, inversions
     • Presence and absence of particular genes can be very important
     • Missing sequence might be important (e.g., centromere)
     • Allows researchers to focus on biology, not sequencing


Vibrio cholerae Metabolism




From http://genomesonline.org

Phylogenomic Analysis

         • Evolutionary reconstructions greatly improve genome analyses
         • Genome analysis greatly improves evolutionary reconstructions
         • There is a feedback loop such that these should be integrated



Outline


         • Phylogenomic Tales
               –   Selecting genomes for sequencing
               –   Species evolution
               –   Predicting functions of genes
               –   Uncultured microbes
               –   Searching for novel organisms and genes
         • All of these will be told in the context of a
           recent project, “A Genomic Encyclopedia of
           Bacteria and Archaea” (aka GEBA)

GEBA Introduction

                         Knowing What We Don’t Know




Major Microbial Sequencing Efforts
      •   Coordinated, top-down efforts
            – Fungal Genome Initiative (Broad/Whitehead)
            – Gordon and Betty Moore Foundation Marine Microbial Genome Sequencing Project
            – Sanger Center Pathogen Sequencing Unit
            – NHGRI Human Gut Microbiome Project
            – NIH Human Microbiome Program
      •   White paper or grant systems
            –   NIAID Microbial Sequencing Centers
            –   DOE/JGI Community Sequencing Program
            –   DOE/JGI BER Sequencing Program
            –   NSF/USDA Microbial Genome Sequencing
      •   Covers lots of ground and biological diversity



As of 2002
  • At least 40 phyla of bacteria
  • Genome sequences are mostly from three phyla
  • Some other phyla are only sparsely sampled

Phyla shown (based on Hugenholtz, 2002):
  Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8,
  Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3,
  Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria,
  Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia,
  Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10,
  Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus,
  Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11

Need for Tree Guidance Well Established

     • Common approach within some eukaryotic groups
     • Many small projects funded to fill in some bacterial or archaeal gaps
     • Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature


• NSF-funded Tree of Life Project (Eisen, Ward, Robb, Nelson, et al.)
• A genome from each of eight phyla

• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution I: sequence more phyla

(Phyla shown: same list as on the “As of 2002” slide.)

Organisms Selected
        Phylum                  Species selected
        Chrysiogenes            Chrysiogenes arsenatis (GCA)
        Coprothermobacter       Coprothermobacter proteolyticus (GCBP)
        Dictyoglomi             Dictyoglomus thermophilum (GDT)
        Thermodesulfobacteria   Thermodesulfobacterium commune (GTC)
        Nitrospirae             Thermodesulfovibrio yellowstonii (GTY)
        Thermomicrobia          Thermomicrobium roseum (GTR)
        Deferribacteres         Geovibrio thiophilus (GGT)
        Synergistes             Synergistes jonesii (GSJ)

• NSF-funded Tree of Life Project (Eisen & Ward, PIs)
• A genome from each of eight phyla

• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Still highly biased in terms of the tree

(Phyla shown: same list as on the “As of 2002” slide.)

Major Lineages of Actinobacteria

2.5              Actinobacteria
2.5.1            Acidimicrobidae
2.5.1.1          Unclassified
2.5.1.2          "Microthrixineae"
2.5.1.3          Acidimicrobineae
2.5.1.3.1        Unclassified
2.5.1.3.2        Acidimicrobiaceae
2.5.1.4          BD2-10
2.5.1.5          EB1017
2.5.2            Actinobacteridae
2.5.2.1          Unclassified
2.5.2.10         Ellin306/WR160
2.5.2.11         Ellin5012
2.5.2.12         Ellin5034
2.5.2.13         Frankineae
2.5.2.13.1       Unclassified
2.5.2.13.2       Acidothermaceae
2.5.2.13.3       Ellin6090
2.5.2.13.4       Frankiaceae
2.5.2.13.5       Geodermatophilaceae
2.5.2.13.6       Microsphaeraceae
2.5.2.13.7       Sporichthyaceae
2.5.2.14         Glycomyces
2.5.2.15         Intrasporangiaceae
2.5.2.15.1       Unclassified
2.5.2.15.2       Dermacoccus
2.5.2.15.3       Intrasporangiaceae
2.5.2.16         Kineosporiaceae
2.5.2.17         Microbacteriaceae
2.5.2.17.1       Unclassified
2.5.2.17.2       Agrococcus
2.5.2.17.3       Agromyces
2.5.2.18         Micrococcaceae
2.5.2.19         Micromonosporaceae
2.5.2.2          Actinomyces
2.5.2.20         Propionibacterineae
2.5.2.20.1       Unclassified
2.5.2.20.2       Kribbella
2.5.2.20.3       Nocardioidaceae
2.5.2.20.4       Propionibacteriaceae
2.5.2.21         Pseudonocardiaceae
2.5.2.22         Streptomycineae
2.5.2.22.1       Unclassified
2.5.2.22.2       Kitasatospora
2.5.2.22.3       Streptacidiphilus
2.5.2.23         Streptosporangineae
2.5.2.23.1       Unclassified
2.5.2.23.2       Ellin5129
2.5.2.23.3       Nocardiopsaceae
2.5.2.23.4       Streptosporangiaceae
2.5.2.23.5       Thermomonosporaceae
2.5.2.3          Actinomycineae
2.5.2.4          Actinosynnemataceae
2.5.2.5          Bifidobacteriaceae
2.5.2.6          Brevibacteriaceae
2.5.2.7          Cellulomonadaceae
2.5.2.8          Corynebacterineae
2.5.2.8.1        Unclassified
2.5.2.8.2        Corynebacteriaceae
2.5.2.8.3        Dietziaceae
2.5.2.8.4        Gordoniaceae
2.5.2.8.5        Mycobacteriaceae
2.5.2.8.6        Rhodococcus
2.5.2.8.7        Rhodococcus
2.5.2.8.8        Rhodococcus
2.5.2.9          Dermabacteraceae
2.5.2.9.1        Unclassified
2.5.2.9.2        Brachybacterium
2.5.2.9.3        Dermabacter
2.5.3            Coriobacteridae
2.5.3.1          Unclassified
2.5.3.2          Atopobiales
2.5.3.3          Coriobacteriales
2.5.3.4          Eggerthellales
2.5.4            OPB41
2.5.5            PK1
2.5.6            Rubrobacteridae
2.5.6.1          Unclassified
2.5.6.2          "Thermoleiphilaceae"
2.5.6.2.1        Unclassified
2.5.6.2.2        Conexibacter
2.5.6.2.3        XGE514
2.5.6.3          MC47
2.5.6.4          Rubrobacteraceae



• NSF-funded Tree of Life Project (Eisen & Ward, PIs)
• A genome from each of eight phyla

• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Archaea
• Same trend in Eukaryotes
• Same trend in Viruses

(Phyla shown: same list as on the “As of 2002” slide.)

• GEBA
• A genomic encyclopedia of bacteria and archaea
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution: Really Fill in the Tree

Eisen & Ward, PIs

[Figure: bacterial phyla: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11]

http://www.jgi.doe.gov/programs/GEBA/pilot.html
GEBA Pilot Project: Components
      • Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan
        Eisen, Eddy Rubin, Jim Bristow)
      • Project management (David Bruce, Eileen Dalin, Lynne Goodwin)
      • Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)
      • Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus,
        Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)
      • Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al.)
      • Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor
        Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik
        D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N.
        Ivanova, Athanasios Lykidis, Adam Zemla)
      • Adopt a microbe education project (Cheryl Kerfeld)
      • Outreach (David Gilbert)
      • $$$ (DOE, Eddy Rubin, Jim Bristow)
rRNA Tree of Life




                          Figure from Barton, Eisen et al.
                             “Evolution”, CSHL Press.
                         Based on tree from Pace NR, 2003.

GEBA Pilot Target List
[Figure: chart of # of Genomes (axis from 0 to 35) per phylum. Axis label: Phyla; bacterial phyla are prefixed "B:" and archaeal phyla "A:".]

GEBA Pilot Project Overview

        • Identify major branches in rRNA tree for
          which no genomes are available
        • Identify those with a cultured representative
          in DSMZ
        • DSMZ grew > 200 of these and prepped
          DNA
        • Sequence and finish 200+
        • Annotate, analyze, release data
        • Assess benefits of tree-guided sequencing
        • First paper: Wu et al., Nature, Dec 2009
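
The selection steps listed above (find branches of the rRNA tree with no genome, restrict to strains available in culture, then sequence) can be pictured as a greedy ranking by how much new branch length each candidate would add. The sketch below is only an illustration under that assumption, not the GEBA pipeline: the toy tree, its branch lengths, and the helper names `path_edges`, `added_pd`, and `pick_targets` are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Toy rooted tree (hypothetical data): node -> (parent, branch length to parent).
TREE: Dict[str, Tuple[Optional[str], float]] = {
    "root": (None, 0.0),
    "X": ("root", 0.1),
    "Y": ("root", 0.3),
    "a": ("X", 0.2),
    "b": ("X", 0.1),
    "c": ("Y", 0.4),
    "d": ("Y", 0.5),
}

def path_edges(leaf: str) -> List[str]:
    """Edges from a leaf up to the root; each edge is named by its child node."""
    edges = []
    node = leaf
    while TREE[node][0] is not None:
        edges.append(node)
        node = TREE[node][0]
    return edges

def added_pd(leaf: str, covered: Set[str]) -> float:
    """Branch length a leaf would add beyond the edges already covered."""
    return sum(TREE[edge][1] for edge in path_edges(leaf) if edge not in covered)

def pick_targets(sequenced: Set[str], cultured: Set[str], n: int) -> List[Tuple[str, float]]:
    """Greedily pick up to n cultured, unsequenced leaves that add the most branch length."""
    covered: Set[str] = set()
    for leaf in sequenced:
        covered.update(path_edges(leaf))
    candidates = set(cultured) - set(sequenced)
    picks = []
    while candidates and len(picks) < n:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=lambda leaf: added_pd(leaf, covered))
        picks.append((best, added_pd(best, covered)))
        covered.update(path_edges(best))
        candidates.remove(best)
    return picks

picks = pick_targets(sequenced={"a"}, cultured={"b", "c", "d"}, n=2)
print([name for name, _ in picks])  # ['d', 'c']
```

Leaf `b` sits next to the already sequenced `a`, so it ranks last; `d` and `c` open up a branch with no genome at all.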
GEBA Phylogenomic Lesson 1

                 The rRNA Tree of Life is a Useful Tool
                 for Identifying Phylogenetically Novel
                                Genomes



rRNA Tree of Life
[Figure: rRNA tree of life with three labeled domains: Bacteria, Archaea, Eukaryotes]
Figure from Barton, Eisen et al., “Evolution”, CSHL Press, 2007.
Based on tree from Pace 1997 Science 276:734-740.

The Core Gets Small ...




The Pangenome




Islands Among Synteny




The Pangenome




Network of Life
[Figure: three labeled domains: Bacteria, Archaea, Eukaryotes]
Figure from Barton, Eisen et al., “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.

Using the Core




Whole genome tree built using AMPHORA by Martin Wu and Dongying Wu


Four Models for Rooting TOL
                         from Lake et al. doi: 10.1098/rstb.2009.0035




GEBA Phylogenomic Lesson 2

                      rRNA Tree is good but not perfect
                    and better genomic sampling improves
                            phylogenetic inference



16S Says Hyphomonas is in Rhodobacterales




Badger et al.
2005


WGT and individual gene trees:
                         It's Related to Caulobacterales




Badger et al.
2005


16S                                          WGT, 23S




  Badger et al. 2005 Int J Syst Evol Microbiol 55: 1021-1026.

Caveats: ignoring LGT and using
               concatenated alignments
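
As a concrete picture of what a concatenated alignment is, the sketch below joins per-gene alignments into one supermatrix and pads taxa that lack a gene with gaps. The sequences are made up and `concatenate` is a hypothetical helper, not part of AMPHORA. Joining genes this way assumes they all share one history, which is the assumption that LGT can violate.

```python
from typing import Dict, List

def concatenate(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene alignments (taxon -> aligned sequence) into one supermatrix."""
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon missing this gene contributes a block of gap characters.
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(blocks) for taxon, blocks in parts.items()}

# Made-up alignments for two marker genes; Helpy lacks the second gene.
gene1 = {"Ecoli": "MKV-L", "Bacsu": "MRVAL", "Helpy": "MKIAL"}
gene2 = {"Ecoli": "GGS", "Bacsu": "GAS"}

supermatrix = concatenate([gene1, gene2])
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Every row of the result has the same length, so it can be handed to a tree-building program as a single alignment.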




Concatenated Alignment ML Tree




Green Non Sulfur Bacteria




Chlamydia-Verrucomicrobia




Proteobacteria




Zimmer. New York Times. 2009

GEBA Phylogenomic Lesson 3

                      Phylogenetics-guided genome
                     selection (and phylogenetics in
                  general) improves genome annotation



Predicting Function

         • Key step in genome projects
         • More accurate predictions help guide
           experimental and computational analyses
         • Many diverse approaches
         • All are improved by “phylogenomic”-type
           analyses that integrate evolutionary
           reconstructions and an understanding of how
           new functions evolve


From Eisen et al. 1997 Nature Medicine 3: 1076-1078.

Blast Search of H. pylori “MutS”




         • BLAST search pulls up Syn. sp MutS#2 with a much more significant p
           value than other MutS homologs
         • Based on this, TIGR predicted this species had mismatch
           repair
         • Assumes functional constancy

         Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.

Predicting Function
         • Identification of motifs
               – Short regions of sequence similarity that are indicative of
                 general activity
               – e.g., ATP binding
         • Homology/similarity based methods
               – Gene sequence is searched against a database of other
                 sequences
               – If significantly similar genes are found, their functional
                 information is used
         • Problem
               – Genes frequently have similarity to hundreds of motifs
                 and multiple genes, not all with the same function
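
The problem named in the last bullet can be shown with a small sketch of top-hit annotation transfer. Everything here is illustrative: the hit names, e-values, and function labels are invented, and `top_hit_function` and `functions_among_significant` are hypothetical helpers rather than any real BLAST API.

```python
from typing import List, NamedTuple, Set

class Hit(NamedTuple):
    subject: str    # name of the database sequence that was hit
    evalue: float   # smaller means more significant
    function: str   # annotation attached to that sequence

def top_hit_function(hits: List[Hit]) -> str:
    """Similarity-based transfer: copy the annotation of the single best hit."""
    return min(hits, key=lambda h: h.evalue).function

def functions_among_significant(hits: List[Hit], cutoff: float = 1e-5) -> Set[str]:
    """All distinct annotations found among hits at or below the e-value cutoff."""
    return {h.function for h in hits if h.evalue <= cutoff}

# Invented search result for one query gene.
hits = [
    Hit("homolog_A", 1e-80, "function unknown"),
    Hit("homolog_B", 1e-40, "mismatch repair"),
    Hit("homolog_C", 1e-35, "meiotic crossing over"),
    Hit("weak_match", 0.5, "ATP binding"),
]

print(top_hit_function(hits))
print(sorted(functions_among_significant(hits)))
```

The best hit gives one answer, but the significant hits carry three different annotations, so the hit list alone does not say which one applies to the query.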


MutL??




     From http://asajj.roswellpark.org/huberman/dna_repair/mmr.html
Phylogenetic Tree of MutS Family
[Figure: phylogenetic tree of MutS family proteins. Tip labels are species abbreviations: Aquae, Strpy, Bacsu, Synsp, Deira, Helpy, Borbu, Metth, mSaco, Yeast, Human, Mouse, Rat, Celeg, Arath, Spombe, Fly, Xenla, Neucr, Trepa, Chltr, Theaq, Thema, Ecoli, Neigo. Several species appear on more than one branch.]
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.

MutS Subfamilies
[Figure: MutS family tree with subfamilies labeled: MutS1, MutS2, MSH1, MSH2, MSH3, MSH4, MSH5, MSH6]
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.

Overlaying Functions onto Tree
[Figure: MutS family tree with subfamily labels: MutS1, MutS2, MSH1, MSH2, MSH3, MSH4, MSH5, MSH6]
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.

Functional Prediction Using Tree
[Figure: MutS family tree with each subfamily labeled by function]
• MutS1 - Bacterial Mismatch and Loop Repair
• MutS2 - Unknown Functions
• MSH1 - Mitochondrial Repair
• MSH2 - Eukaryotic Nuclear Mismatch and Loop Repair
• MSH3 - Nuclear Repair Of Loops
• MSH4 - Meiotic Crossing Over
• MSH5 - Meiotic Crossing Over
• MSH6 - Nuclear Repair Of Mismatches
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.

PHYLOGENETIC PREDICTION OF GENE FUNCTION

METHOD
• CHOOSE GENE(S) OF INTEREST
• IDENTIFY HOMOLOGS
• ALIGN SEQUENCES
• CALCULATE GENE TREE
• OVERLAY KNOWN FUNCTIONS ONTO TREE
• INFER LIKELY FUNCTION OF GENE(S) OF INTEREST
Final panel: ACTUAL EVOLUTION (ASSUMED TO BE UNKNOWN)

EXAMPLE A: gene of interest 2A; homologs 1A, 2A, 3A, 1B, 2B, 3B (Species 1: 1A, 1B; Species 2: 2A, 2B; Species 3: 3A, 3B); the trees are marked "Duplication?" and the actual-evolution panel "Duplication"
EXAMPLE B: gene of interest 5; homologs 1 to 6; the inference panel is marked "Ambiguous"

Based on Eisen, 1998 Genome Res 8: 163-167.
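
The last two steps of the method (overlay known functions, then infer the function of the gene of interest) can be sketched with a deliberately simple rule: walk outward from the query gene to the smallest clade that contains any characterized gene, and predict a function only if those genes agree. This rule is a simplification chosen for illustration, not the reconstruction method of the cited paper; the tree topologies, function labels, and the `leaves` and `predict` helpers are all hypothetical.

```python
from typing import Dict, List, Optional, Union

Tree = Union[str, tuple]  # a leaf name, or a tuple of subtrees

def leaves(tree: Tree) -> List[str]:
    """All leaf names under a node."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]

def clades_containing(tree: Tree, query: str) -> List[tuple]:
    """Internal nodes that contain the query, innermost first."""
    if isinstance(tree, str):
        return []
    for child in tree:
        if query in leaves(child):
            return clades_containing(child, query) + [tree]
    return []

def predict(tree: Tree, query: str, known: Dict[str, str]) -> Optional[str]:
    """Function of the query's closest characterized relatives, or 'ambiguous'."""
    for clade in clades_containing(tree, query):
        functions = {known[leaf] for leaf in leaves(clade)
                     if leaf != query and leaf in known}
        if functions:
            return functions.pop() if len(functions) == 1 else "ambiguous"
    return None  # no characterized relatives anywhere in the tree

# In the spirit of Example A: a duplication separates the A and B paralogs.
tree_a = (("1A", ("2A", "3A")), ("1B", ("2B", "3B")))
known_a = {"1A": "function alpha", "3A": "function alpha",
           "1B": "function beta", "3B": "function beta"}
print(predict(tree_a, "2A", known_a))  # function alpha

# In the spirit of Example B: the closest characterized genes disagree.
tree_b = (("1", "2"), ("5", ("3", "4")))
known_b = {"3": "function X", "4": "function Y"}
print(predict(tree_b, "5", known_b))  # ambiguous
```

In the first case the answer comes from the query's position within the A group of paralogs, not from whichever sequence happens to be most similar.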
Phylogenetic Prediction of Function


         • Termed phylogenomics (Eisen et al. 1997)
         • Greatly improves accuracy of functional
           predictions compared to similarity-based
           methods (e.g., BLAST)
         • Automated methods now available
               – Sean Eddy, Steven Brenner, Kimmen Sjölander,
                 etc.
         • But …

Example 2: Recent Changes
        • Phylogenomic functional prediction may not work well for very newly
          evolved functions

[Figure: NJ tree of V. cholerae genes; some branches are marked with asterisks. Tip labels include VC0512, VCA1034, VCA0974, VCA0068, VC0825, VC0282, VCA0906, VCA0979, VCA1056, VC1643, VC2161, VCA0923, VC0514, VC1868, VCA0773, VC1313.]
                                                                                                   V.cholerae
                                                                                                           VC
                                                                                                            1859
                                                                                                V.cholerae
                                                                                                        VC1413
                                                                                              V.cholerae
                                                                                                       VCA0268
                                                                      **                                V.cholerae
                                                                                                                VC
                                                                                                                 A0658
                                                                                                   V.cholerae
                                                                                                           VC
                                                                                                            1405
                                                                    *                             V.cholerae
                                                                                                          VC1298
                                                                                                    V.cholerae
                                                                                                            VC1248
                                                                                             V.cholerae
                                                                                                      VCA0864
                                                                                             V.cholerae
                                                                                                      VCA0176
                                                                           **                   V.cholerae
                                                                                                        VCA0220
                                                                                               V.cholerae
                                                                                                        VC
                                                                                                         1289
                                                                              **                   V.cholerae
                                                                                                           VC1069
                                                                                                             A
                                                                                                 V.cholerae
                                                                                                         VC2439


        • Can use understanding of origin of
                                                                                                    V.cholerae
                                                                                                            VC967
                                                                                                             1
                                                                                                    V.cholerae
                                                                                                            VC
                                                                                                             A0031
                                                                                                V.cholerae
                                                                                                        VC1898
                                                                                                    V.cholerae
                                                                                                            VC
                                                                                                             A0663
                                                                                             V.cholerae
                                                                                                     VC0988
                                                                                                       A
                                                                                             V.cholerae
                                                                                                      VC0216
                                                                      *                      V.cholerae
                                                                                                      VC0449
                                                                                            V.cholerae
                                                                                                     VCA0008
                                                                                             V.cholerae
                                                                                                      VC1406
                                                                                                      V.cholerae
                                                                                                              VC
                                                                                                               1535


          novelty to better interpret these cases?
                                                                                               V.cholerae
                                                                                                       VC0840
                                                                                                          B.subtilis
                                                                                                                gi2633766
                                                                                                      Synechocystis
                                                                                                                sp.
                                                                                                                  gi1001299
                                                                         *                   Synechocystis
                                                                                                        sp.gi1001300
                                                                    *                                 Synechocystis
                                                                                                                sp.
                                                                                                                  gi1652276
                                                                          *                     Synechocystis
                                                                                                           sp.
                                                                                                             gi1652103
                                                                                               H.pylori
                                                                                                     gi2313716
                                                                     **                     **H.pylori
                                                                                                    99 gi4155097
                                                                                               C.jejuni
                                                                                                     Cj1190c
                                                                                           C.jejuni
                                                                                                 Cj1110c
                                                                                             A.fulgidus
                                                                                                     gi2649560
                                                                                             A.fulgidus
                                                                                                     gi2649548
                                                                                           ** B.subtilis
                                                                                                       gi2634254


        • Screen genomes for genes that have
                                                                                             B.subtilis
                                                                                                    gi2632630
                                                                                             B.subtilis
                                                                                                     gi2635607
                                                                                             B.subtilis
                                                                                                    gi2635608
                                                                                   **         B.subtilis
                                                                                                     gi2635609
                                                                                 ** ** B.subtilisgi2635882
                                                                                                    gi2635610
                                                                                                  B.subtilis
                                                                                           E.coligi1788195
                                                                                           E.coli
                                                                                                gi2367378
                                                                                * **       E.coligi1788194
                                                                                               E.coli A1092
                                                                                                    gi1787690
                                                                                             V.cholerae
                                                                                                      VC


          changed recently
                                                                                              V.cholerae
                                                                                                       VC
                                                                                                        0098
                                                                                              E.coli
                                                                                                   gi1789453
                                                                                                 H.pylori
                                                                                                       gi2313186
                                                                                                 H.pylori
                                                                                                      99 gi4154603
                                                                                             ** C.jejuni   Cj0144
                                                                                                     C.jejuni
                                                                                                           Cj1564
                                                                                                   **C.jejuni
                                                                                                 C.jejuni
                                                                                                           Cj0262c
                                                                                                      Cj1506c
                                                                                      **          H.pylori
                                                                                                        gi2313163
                                                                                *              ** H.pylori
                                                                                                       99 gi4154575
                                                                                   **            H.pylori
                                                                                                      gi2313179
                                                                                                 H.pylori
                                                                                                      99 gi4154599

         –   Pseudogenes and gene loss
                                                                                              ** C.jejuni Cj0019c
                                                                                                         C.jejuni
                                                                                                               Cj0951c
                                                                                                      C.jejuni
                                                                                                            Cj0246c
                                                                                                     B.subtilis
                                                                                                            gi2633374
                                                                                                      T.maritima
                                                                                                              TM0014
                                                                                                           V.cholerae
                                                                                                                  VC1403
                                                                                                         V.cholerae
                                                                                                                VCA1088
                                                                                                          T.pallidum
                                                                                                                 gi3322777
                                                                                **                               T.pallidum
                                                                                                                        gi3322939
                                                                              **                          T.pallidum
                                                                                                                 gi3322938
                                                                                                           B.burgdorferi
                                                                                                                    gi2688522

         –   Contingency Loci
                                                                                                             T.pallidum
                                                                                                                    gi3322296
                                                                                                         B.burgdorferi
                                                                                                                  gi2688521
                                                                     *                          T.maritima
                                                                                                        TM0429
                                                                                              **T.maritima
                                                                                                        TM0918
                                                                       *                     **T.maritima
                                                                                            T.maritima
                                                                                                        TM0023
                                                                                                     TM1428
                                                                                               T.maritima
                                                                                                       TM1143
                                                                                            T.maritima
                                                                                                     TM1146
                                                                                               P.abyssi
                                                                                                      PAB1308
                                                                                               P.horikoshii
                                                                                                       gi3256846
                                                                                          ** P.abyssiPAB1336


         –   Acquisition (e.g., LGT)
                                                                               **             P.horikoshii
                                                                                                       gi3256896
                                                                      **                   **P.abyssi
                                                                                                    PAB2066
                                                               **                            P.horikoshii
                                                                                        ** P.abyssi   gi3258290
                                                                    *                                PAB1026
                                                                                        ** P.horikoshii DRA00354
                                                                                                        gi3256884
                                                                                                         D.radiodurans
                                                                                                        D.radiodurans
                                                                                                  ** D.radioduransDRA0353
                                                                            **                                   DRA0352
                                                          **                                        V.cholerae
                                                                                                             VC
                                                                                                              1394
                                                                                                   P.abyssi
                                                                                                         PAB1189
                                                                                                   P.horikoshii
                                                                                                           gi3258414


         –   Unusual dS/dN ratios
                                                                                            ** B.burgdorferi
                                                                                                         gi2688621
                                                                                                       M.tuberculosis
                                                                                                                 gi1666149
                                                                                                         V.cholerae
                                                                                                                 VC
                                                                                                                  0622




         –   Rapid evolutionary rates
         –   Recent duplications
Tuesday, March 8, 2011
Example 3: Non-homology methods

         • Many genes have homologs in other species
           but no homologs have ever been studied
           experimentally
         • Non-homology methods can make functional
           predictions for these
         • Example: phylogenetic profiling




Phylogenetic profiling basis

         • Microbial genes are lost rapidly when not
           maintained by selection
         • Genes can be acquired by lateral transfer
         • Gain and loss frequently occur for entire
           pathways/processes
         • Thus we might be able to use correlated
           presence/absence information to identify genes
           with similar functions

Non-Homology Predictions:
               Phylogenetic Profiling

          • Step 1: Search all genes in
            organisms of interest against all
            other genomes

          • Ask: Yes or No, is each gene
            found in each other species

          • Cluster genes by distribution
            patterns (profiles)
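
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the talk: the gene names, genome names, and the `hits` table are hypothetical, and a real analysis would derive presence/absence from similarity searches with a chosen cutoff. Genes are grouped here by identical profile; real tools usually cluster by profile similarity, for which the `hamming` helper gives a simple distance.

```python
from collections import defaultdict

# Hypothetical search results: for each query gene, the set of other
# genomes in which a homolog was found (the Yes/No answer per genome).
genomes = ["genomeA", "genomeB", "genomeC", "genomeD"]
hits = {
    "gene1": {"genomeA", "genomeC"},
    "gene2": {"genomeA", "genomeC"},
    "gene3": {"genomeB"},
    "gene4": {"genomeA", "genomeB", "genomeC", "genomeD"},
}

def profile(gene):
    """Return the presence/absence profile of a gene as a tuple of 0/1."""
    return tuple(1 if g in hits[gene] else 0 for g in genomes)

def cluster_by_profile(genes):
    """Group genes that share an identical presence/absence profile."""
    clusters = defaultdict(list)
    for gene in genes:
        clusters[profile(gene)].append(gene)
    return dict(clusters)

def hamming(p, q):
    """Number of genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))

clusters = cluster_by_profile(sorted(hits))
for prof, members in clusters.items():
    print(prof, members)
```

In this toy table, gene1 and gene2 share a profile and would be candidates for a shared function; a gene present in every genome (gene4) carries little information for this method.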

Carboxydothermus hydrogenoformans


   • Isolated from a Russian hot spring
   • Thermophile (grows at 80°C)
   • Anaerobic
   • Grows very efficiently on CO
     (Carbon Monoxide)
   • Produces hydrogen gas
   • Low GC Gram positive
     (Firmicute)
   • Genome Determined (Wu et al.
     2005 PLoS Genetics 1: e65. )

Homologs of Sporulation Genes




                                    Wu et al. 2005
                                    PLoS Genetics 1:
                                    e65.
Tuesday, March 8, 2011
Carboxydothermus sporulates




                         Wu et al. 2005 PLoS Genetics 1: e65.
Tuesday, March 8, 2011
Wu et al. 2005 PLoS Genetics 1: e65.
Tuesday, March 8, 2011
PG Profiling Works Better Using Orthology




GEBA Lesson 3:
              Phylogeny driven genome selection (and
             phylogenetics) improves genome annotation
          • Took 56 GEBA genomes and compared results vs. 56
            randomly sampled new genomes
          • Better definition of protein family sequence “patterns”
          • Greatly improves “comparative” and “evolutionary”
            based predictions
          • Conversion of hypotheticals into conserved hypotheticals
          • Linking distantly related members of protein families
          • Improved non-homology prediction




GEBA Lesson 4:
                          Metadata Important




GEBA Phylogenomic Lesson 5

                    Phylogeny-driven genome selection
                    helps discover new genetic diversity




Network of Life
                         Bacteria




                                                                Archaea




                          Eukaryotes

                             FIgure from Barton, Eisen et al.
                                “Evolution”, CSHL Press.
                           Based on tree from Pace NR, 2003.

Protein Family Rarefaction


         • Take data set of multiple complete genomes
         • Identify all protein families using MCL
         • Plot # of genomes vs. # of protein families
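
The procedure above can be sketched as follows. This is a rough illustration under stated assumptions: family assignments are taken as already computed (for example by MCL), the genome names and family IDs are hypothetical, and averaging over random genome orderings is one common way to smooth the curve rather than necessarily the method used for the figures in the talk.

```python
import random

# Hypothetical input: each genome mapped to the set of protein family IDs
# found in it (for example, the output of a prior clustering step).
genome_families = {
    "g1": {"fam1", "fam2", "fam3"},
    "g2": {"fam2", "fam3", "fam4"},
    "g3": {"fam1", "fam5"},
    "g4": {"fam6"},
}

def accumulation(order, families):
    """Cumulative number of distinct families as genomes are added in order."""
    seen = set()
    counts = []
    for genome in order:
        seen |= families[genome]
        counts.append(len(seen))
    return counts

def rarefaction(families, n_perm=100, seed=0):
    """Mean accumulation curve over n_perm random genome orderings."""
    rng = random.Random(seed)
    names = sorted(families)
    totals = [0] * len(names)
    for _ in range(n_perm):
        order = names[:]
        rng.shuffle(order)
        for i, count in enumerate(accumulation(order, families)):
            totals[i] += count
    return [t / n_perm for t in totals]

curve = rarefaction(genome_families)
for n_genomes, mean_families in enumerate(curve, start=1):
    print(n_genomes, round(mean_families, 2))
```

Plotting `curve` against the number of genomes gives the rarefaction curve; a curve that keeps climbing suggests that additional genomes are still contributing new protein families.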




Wu et al. 2009 Nature 462, 1056-1060

Synapomorphies exist




Wu et al. 2009 Nature 462, 1056-1060
Families/PD not uniform


Structural Novelty

         • Of the 17000 protein families in the GEBA56, 1800
           are novel in sequence (Wu)


         • Structural modeling suggests many are structurally
           novel too (D'haeseleer)


         • 372 being crystallized by the PSI (Kerfeld)




GEBA Phylogenomic Lesson 6

                         Improves analysis of genome data
                            from uncultured organisms




Great Plate Count Anomaly




                         Culturing   Microscope

                          Count       Count


Great Plate Count Anomaly




                         Culturing       Microscope

                          Count      <<<< Count

Environmental DNA Analysis

                                                      DNA




                         Culturing       Microscope

                          Count      <<<< Count

rRNA Phylotyping

                                   • Collect DNA from
                                     environment
                                   • PCR amplify rRNA
                                     genes using broad
                                     (so-called universal) primers
                                   • Sequence
                                   • Align to others
                                   • Infer evolutionary tree
                                   • Unknowns “identified”
                                     by placement on tree
                                   • Some use BLAST, but it is
                                     not as good as phylogeny
rRNA PCR

     The Hidden Majority                   Richness estimates




                         Hugenholtz 2002         Bohannan and Hughes 2003


rRNA data increasing exponentially too
rRNA phylotyping issues

         • Massive amounts of data
               – 1 x 10^6 new partial sequences with new 454
               – 2 x 10^6 full length sequences in DB
         • Alignments of new sequences not always
           straightforward
         • Solutions:
               – Reliance on similarity scores (bad)
               – High throughput automated phylogenetic tools
                  • STAP
                  • WATERs
Perna et al. 2003
Diversity of Proteorhodopsins by PCR




                                de la Torre
                                et al 2003


Metagenomics



                                 shotgun
                                      sequence




Massive Diversity of Proteorhodopsins




                                           Venter et al., 2004
Talk for UC Davis Applied Phylogenetics Course at Bodega Bay

  • 1. Phylogenomics Jonathan A. Eisen UC Davis Bodega Applied Phylogenetics Workshop March 7, 2011
  • 2. Fleischmann et al. 1995 Science 269:496-512
  • 3. Whole Genome Shotgun Sequencing
  • 4. Whole Genome Shotgun Sequencing
  • 5. Whole Genome Shotgun Sequencing Warner Brothers, Inc.
  • 6. Whole Genome Shotgun Sequencing shotgun Warner Brothers, Inc.
  • 7. Whole Genome Shotgun Sequencing shotgun Warner Brothers, Inc.
  • 8. Whole Genome Shotgun Sequencing shotgun Warner Brothers, Inc. sequence
  • 9. Whole Genome Shotgun Sequencing shotgun Warner Brothers, Inc. sequence
  • 11. Assemble Fragments sequencer output
  • 12. Assemble Fragments sequencer output
  • 13. Assemble Fragments sequencer output assemble fragments
  • 14. Assemble Fragments sequencer output assemble fragments Closure & Annotation
  • 20. Genome Sequences Have Revolutionized Microbiology • Predictions of metabolic processes • Better vaccine and drug design • New insights into mechanisms of evolution • Genomes serve as template for functional studies • New enzymes and materials for engineering and synthetic biology
  • 21. General Steps in Analysis of Complete Genomes • Identification/prediction of genes • Characterization of gene features • Characterization of genome features • Prediction of gene function • Prediction of pathways • Integration with known biological data • Comparative genomics
  • 23. Genome Structure: More Variable than Once Thought
  • 25. Why Completeness is • Improves characterization of genome features – Gene order, replication origins • Better comparative genomics – Genome duplications, inversions • Presence and absence of particular genes can be very important • Missing sequence might be important (e.g., centromere) • Allows researchers to focus on biology not sequencing
  • 29. Phylogenomic Analysis • Evolutionary reconstructions greatly improve genome analyses • Genome analysis greatly improves evolutionary reconstructions • There is a feedback loop such that these should be integrated
  • 30. Outline • Phylogenomic Tales – Selecting genomes for sequencing – Species evolution – Predicting functions of genes – Uncultured microbes – Searching for novel organisms and genes
  • 31. Outline • Phylogenomic Tales – Selecting genomes for sequencing – Species evolution – Predicting functions of genes – Uncultured microbes – Searching for novel organisms and genes • All of these will be told in the context of a recent project, “A Genomic Encyclopedia of Bacteria and Archaea” (aka GEBA)
  • 32. GEBA Introduction Knowing What We Don’t Know
  • 33. Major Microbial Sequencing Efforts • Coordinated, top-down efforts – Fungal Genome Initiative (Broad/Whitehead) – Gordon and Betty Moore Foundation Marine Microbial Genome Sequencing Project – Sanger Center Pathogen Sequencing Unit – NHGRI Human Gut Microbiome Project – NIH Human Microbiome Program • White paper or grant systems – NIAID Microbial Sequencing Centers – DOE/JGI Community Sequencing Program – DOE/JGI BER Sequencing Program – NSF/USDA Microbial Genome Sequencing • Covers lots of ground and biological diversity
  • 34. As of 2002
  • 35. As of 2002 • At least 40 phyla of bacteria [Figure: tree of bacterial phyla, based on Hugenholtz, 2002. Labels: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11]
  • 36. As of 2002 • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla [Figure: same tree of bacterial phyla, based on Hugenholtz, 2002]
  • 37. As of 2002 • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled [Figure: same tree of bacterial phyla, based on Hugenholtz, 2002]
  • 38. As of 2002 • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled [Figure: same tree of bacterial phyla, based on Hugenholtz, 2002]
  • 39. Need for Tree Guidance Well Established • Common approach within some eukaryotic groups • Many small projects funded to fill in some bacterial or archaeal gaps • Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature
  • 40. At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Solution I: sequence more phyla • NSF-funded Tree of Life Project • A genome from each of eight phyla (Eisen, Ward, Robb, Nelson, et al.) [Figure: tree of bacterial phyla, labeled as on slide 35]
  • 41. Organisms Selected (phylum: species selected) • Chrysiogenes: Chrysiogenes arsenatis (GCA) • Coprothermobacter: Coprothermobacter proteolyticus (GCBP) • Dictyoglomi: Dictyoglomus thermophilum (GDT) • Thermodesulfobacteria: Thermodesulfobacterium commune (GTC) • Nitrospirae: Thermodesulfovibrio yellowstonii (GTY) • Thermomicrobia: Thermomicrobium roseum (GTR) • Deferribacteres: Geovibrio thiophilus (GGT) • Synergistes: Synergistes jonesii (GSJ)
  • 42. NSF-funded Tree of Life Project • A genome from each of eight phyla • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Still highly biased in terms of the tree • Eisen & Ward, PIs [Figure: tree of bacterial phyla, as on slide 37]
  • 43. Major Lineages of Actinobacteria (2.5) • 2.5.1 Acidimicrobidae: 2.5.1.1 Unclassified; 2.5.1.2 "Microthrixineae"; 2.5.1.3 Acidimicrobineae (2.5.1.3.1 Unclassified, 2.5.1.3.2 Acidimicrobiaceae); 2.5.1.4 BD2-10; 2.5.1.5 EB1017 • 2.5.2 Actinobacteridae: 2.5.2.1 Unclassified; 2.5.2.2 Actinomyces; 2.5.2.3 Actinomycineae; 2.5.2.4 Actinosynnemataceae; 2.5.2.5 Bifidobacteriaceae; 2.5.2.6 Brevibacteriaceae; 2.5.2.7 Cellulomonadaceae; 2.5.2.8 Corynebacterineae (2.5.2.8.1 Unclassified, 2.5.2.8.2 Corynebacteriaceae, 2.5.2.8.3 Dietziaceae, 2.5.2.8.4 Gordoniaceae, 2.5.2.8.5 Mycobacteriaceae, 2.5.2.8.6 to 2.5.2.8.8 Rhodococcus); 2.5.2.9 Dermabacteraceae (2.5.2.9.1 Unclassified, 2.5.2.9.2 Brachybacterium, 2.5.2.9.3 Dermabacter); 2.5.2.10 Ellin306/WR160; 2.5.2.11 Ellin5012; 2.5.2.12 Ellin5034; 2.5.2.13 Frankineae (2.5.2.13.1 Unclassified, 2.5.2.13.2 Acidothermaceae, 2.5.2.13.3 Ellin6090, 2.5.2.13.4 Frankiaceae, 2.5.2.13.5 Geodermatophilaceae, 2.5.2.13.6 Microsphaeraceae, 2.5.2.13.7 Sporichthyaceae); 2.5.2.14 Glycomyces; 2.5.2.15 Intrasporangiaceae (2.5.2.15.1 Unclassified, 2.5.2.15.2 Dermacoccus, 2.5.2.15.3 Intrasporangiaceae); 2.5.2.16 Kineosporiaceae; 2.5.2.17 Microbacteriaceae (2.5.2.17.1 Unclassified, 2.5.2.17.2 Agrococcus, 2.5.2.17.3 Agromyces); 2.5.2.18 Micrococcaceae; 2.5.2.19 Micromonosporaceae; 2.5.2.20 Propionibacterineae (2.5.2.20.1 Unclassified, 2.5.2.20.2 Kribbella, 2.5.2.20.3 Nocardioidaceae, 2.5.2.20.4 Propionibacteriaceae); 2.5.2.21 Pseudonocardiaceae; 2.5.2.22 Streptomycineae (2.5.2.22.1 Unclassified, 2.5.2.22.2 Kitasatospora, 2.5.2.22.3 Streptacidiphilus); 2.5.2.23 Streptosporangineae (2.5.2.23.1 Unclassified, 2.5.2.23.2 Ellin5129, 2.5.2.23.3 Nocardiopsaceae, 2.5.2.23.4 Streptosporangiaceae, 2.5.2.23.5 Thermomonosporaceae) • 2.5.3 Coriobacteridae: 2.5.3.1 Unclassified; 2.5.3.2 Atopobiales; 2.5.3.3 Coriobacteriales; 2.5.3.4 Eggerthellales • 2.5.4 OPB41 • 2.5.5 PK1 • 2.5.6 Rubrobacteridae: 2.5.6.1 Unclassified; 2.5.6.2 "Thermoleiphilaceae" (2.5.6.2.1 Unclassified, 2.5.6.2.2 Conexibacter, 2.5.6.2.3 XGE514); 2.5.6.3 MC47; 2.5.6.4 Rubrobacteraceae
  • 44. NSF-funded Tree of Life Project • A genome from each of eight phyla • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Archaea • Eisen & Ward, PIs [Figure: tree of bacterial phyla, as on slide 37]
  • 45. NSF-funded Tree of Life Project • A genome from each of eight phyla • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Eukaryotes • Eisen & Ward, PIs [Figure: tree of bacterial phyla, as on slide 37]
  • 46. NSF-funded Tree of Life Project • A genome from each of eight phyla • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Viruses • Eisen & Ward, PIs [Figure: tree of bacterial phyla, as on slide 37]
  • 47. GEBA • A genomic encyclopedia of bacteria and archaea • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Solution: Really Fill in the Tree • Eisen & Ward, PIs [Figure: tree of bacterial phyla, as on slide 37]
  • 49. GEBA Pilot Project: Components • Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow) • Project management (David Bruce, Eileen Dalin, Lynne Goodwin) • Culture collection and DNA prep (DSMZ, Hans-Peter Klenk) • Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng) • Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al) • Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla) • Adopt a microbe education project (Cheryl Kerfeld) • Outreach (David Gilbert) • $$$ (DOE, Eddy Rubin, Jim Bristow)
  • 50. rRNA Tree of Life Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
  • 54. GEBA Pilot Target List [Figure: bar chart; y-axis: # of genomes (0 to 35); x-axis: phyla, prefixed B: for Bacteria and A: for Archaea]
  • 55. GEBA Pilot Project Overview • Identify major branches in rRNA tree for which no genomes are available • Identify those with a cultured representative in DSMZ • DSMZ grew > 200 of these and prepped DNA • Sequence and finish 200+ • Annotate, analyze, release data • Assess benefits of tree-guided sequencing • First paper: Wu et al., Nature, Dec 2009
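The first step on the slide above (finding branches of the rRNA tree that have no genome) can be sketched roughly as follows. This is only an illustration under stated assumptions: the tree is written as nested tuples, the topology and the list of sequenced lineages are hypothetical toy data, and the function names `leaf_names` and `unsampled_clades` are made up for this example rather than taken from the GEBA pipeline.

```python
def leaf_names(tree):
    """Return the leaf labels of a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaf_names(child)]


def unsampled_clades(tree, sequenced):
    """Return the largest clades (as lists of leaf labels) with no sequenced member."""
    names = leaf_names(tree)
    if not any(name in sequenced for name in names):
        return [names]          # the whole clade lacks a genome: report it once
    if isinstance(tree, str):
        return []               # a single lineage that already has a genome
    clades = []
    for child in tree:
        clades.extend(unsampled_clades(child, sequenced))
    return clades


# Hypothetical toy topology and genome list, for illustration only.
toy_tree = (("Proteobacteria", "Firmicutes"),
            (("Dictyoglomi", "Thermomicrobia"), "Actinobacteria"))
has_genome = {"Proteobacteria", "Firmicutes", "Actinobacteria"}
gaps = unsampled_clades(toy_tree, has_genome)
print(gaps)
```

In a real project the tree would come from an rRNA phylogeny and the candidate list would then be filtered by what is available in culture, as the later bullets describe.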
  • 56. GEBA Phylogenomic Lesson 1 The rRNA Tree of Life is a Useful Tool for Identifying Phylogenetically Novel Genomes
  • 57. rRNA Tree of Life Bacteria Archaea Eukaryotes Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007. Based on tree from Pace 1997 Science 276:734-740
  • 58. The Core Gets Small ...
  • 62. Network of Life Bacteria Archaea Eukaryotes Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
  • 63. Using the Core
  • 64. Whole genome tree built using AMPHORA by Martin Wu and Dongying Wu
  • 66. Four Models for Rooting TOL from Lake et al. doi: 10.1098/rstb.2009.0035
  • 67. GEBA Phylogenomic Lesson 2 The rRNA tree is good but not perfect, and better genomic sampling improves phylogenetic inference
  • 68. 16S Says Hyphomonas Is in Rhodobacterales Badger et al. 2005
  • 69. WGT and individual gene trees: It's Related to Caulobacterales Badger et al. 2005
  • 70. 16S WGT, 23S Badger et al. 2005 Int J Syst Evol Microbiol 55: 1021-1026.
  • 71. Caveats: ignoring LGT and using concatenated alignments
  • 72. Concatenated Alignment ML Tree
  • 73. Green Non Sulfur Bacteria
  • 76. Zimmer. New York Times. 2009
  • 77. GEBA Phylogenomic Lesson 3 Phylogenetics-guided genome selection (and phylogenetics in general) improves genome annotation
  • 78. Predicting Function • Key step in genome projects • More accurate predictions help guide experimental and computational analyses • Many diverse approaches • All are improved by “phylogenomic”-type analyses that integrate evolutionary reconstructions and an understanding of how new functions evolve
  • 79. From Eisen et al. 1997 Nature Medicine 3: 1076-1078.
  • 80. BLAST Search of H. pylori “MutS” • BLAST search pulls up Syn. sp MutS#2 with a much more significant p value than other MutS homologs • Based on this, TIGR predicted this species had mismatch repair • Assumes functional constancy • Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.
  • 81. Predicting Function • Identification of motifs – Short regions of sequence similarity that are indicative of general activity – e.g., ATP binding • Homology/similarity-based methods – Gene sequence is searched against a database of other sequences – If significantly similar genes are found, their functional information is used • Problem – Genes frequently have similarity to hundreds of motifs and multiple genes, not all with the same function
  • 82. MutL?? From http://asajj.roswellpark.org/huberman/dna_repair/mmr.html
  • 83. Phylogenetic Tree of MutS Family [Figure: tree of MutS homologs from bacteria, archaea, and eukaryotes] Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
  • 84. MutS Subfamilies [Figure: MutS family tree with subfamilies labeled: MutS1, MutS2, MSH1, MSH2, MSH3, MSH4, MSH5, MSH6] Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
  • 85. Overlaying Functions onto Tree [Figure: MutS family tree with subfamilies labeled: MutS1, MutS2, MSH1, MSH2, MSH3, MSH4, MSH5, MSH6] Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
  • 86. Functional Prediction Using Tree • MutS1 - Bacterial Mismatch and Loop Repair • MutS2 - Unknown Functions • MSH1 - Mitochondrial Repair • MSH2 - Eukaryotic Nuclear Mismatch and Loop Repair • MSH3 - Nuclear Repair of Loops • MSH4 - Meiotic Crossing Over • MSH5 - Meiotic Crossing Over • MSH6 - Nuclear Repair of Mismatches [Figure: MutS family tree with these functions overlaid] Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
  • 88. PHYLOGENETIC PREDICTION OF GENE FUNCTION • Method: Choose gene(s) of interest • Identify homologs • Align sequences • Calculate gene tree • Overlay known functions onto tree • Infer likely function of gene(s) of interest [Figure: Examples A and B worked through each step (genes 1A to 3A and 1B to 3B, and genes 1 to 6, from Species 1 to 3), with a possible duplication marked and the inference for one gene marked ambiguous, compared against the actual evolution (assumed to be unknown)] Based on Eisen, 1998 Genome Res 8: 163-167.
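The last two steps of the method above (overlay known functions onto the tree, then infer the function of the gene of interest) can be illustrated with a small sketch. It assumes a deliberately simple rule that is not claimed to be the one used in the cited paper: report the functions found in the smallest clade that contains the query gene and at least one annotated gene. The gene tree, the gene labels, the function labels, and the name `predict_function` are all hypothetical.

```python
def leaves(tree):
    """Return the leaf labels of a gene tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def predict_function(tree, known, query):
    """Return the set of known functions in the smallest clade around `query`.

    One function in the set is a clear prediction; several means the
    prediction is ambiguous; an empty set means no annotated relatives.
    """
    if query not in leaves(tree):
        raise ValueError(f"{query} is not in the tree")
    # Collect the nested clades on the path from the root down to the query.
    path = []
    node = tree
    while not isinstance(node, str):
        path.append(node)
        node = next(child for child in node if query in leaves(child))
    # Check clades from the innermost outward until one has annotated members.
    for clade in reversed(path):
        functions = {known[name] for name in leaves(clade)
                     if name in known and name != query}
        if functions:
            return functions
    return set()


# Hypothetical gene tree with two subfamilies (A and B) and toy annotations.
gene_tree = ((("1A", "2A"), "3A"), (("1B", "2B"), "3B"))
known_functions = {"1A": "function X", "3A": "function X",
                   "1B": "function Y", "2B": "function Y"}
print(predict_function(gene_tree, known_functions, "2A"))
print(predict_function(gene_tree, known_functions, "3B"))
```

Unlike a top BLAST hit, this rule uses the position of the query in the tree, so a gene that falls in a subfamily of unknown or different function is not assigned the function of a more distant but better-studied relative.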
  • 89. Phylogenetic Prediction of • Termed phylogenomics (Eisen et al. 1997) • Greatly improves accuracy of functional predictions compared to similarity-based methods (e.g., BLAST) • Automated methods now available – Sean Eddy, Steven Brenner, Kimmen Sjölander, etc. • But …
  • 90. Example 2: Recent Changes • Phylogenomic functional prediction may not work well for very newly evolved functions • Can use understanding of origin of novelty to better interpret these cases? • Screen genomes for genes that have changed recently – Pseudogenes and gene loss – Contingency Loci – Acquisition (e.g., LGT) – Unusual dS/dN ratios – Rapid evolutionary rates – Recent duplications [Figure: neighbor-joining (NJ) tree of a gene family with many V. cholerae members alongside homologs from other bacteria and archaea]
  • 91. Example 3: Non-homology methods • Many genes have homologs in other species but no homologs have ever been studied experimentally • Non-homology methods can make functional predictions for these • Example: phylogenetic profiling
  • 92. Phylogenetic profiling basis • Microbial genes are lost rapidly when not maintained by selection • Genes can be acquired by lateral transfer • Gain and loss frequently occur for entire pathways/processes • Thus might be able to use correlated presence/absence information to identify genes with similar functions
  • 93. Non-Homology Predictions: Phylogenetic Profiling • Step 1: Search all genes in organisms of interest against all other genomes • Ask: Yes or No, is each gene found in each other species? • Cluster genes by distribution patterns (profiles)
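The profiling steps above can be sketched in a few lines. This is a minimal illustration, assuming the homology searches have already been run and summarized as, for each gene, the set of genomes in which it was found. The gene names, genome names, and function names are hypothetical, and grouping only genes with identical profiles is a simplification; real analyses usually allow near matches.

```python
from collections import defaultdict


def build_profiles(hits, genomes):
    """Turn {gene: set of genomes with a homolog} into 0/1 presence profiles."""
    return {gene: tuple(int(genome in present) for genome in genomes)
            for gene, present in hits.items()}


def cluster_by_profile(profiles):
    """Group genes that share exactly the same presence/absence profile."""
    clusters = defaultdict(list)
    for gene, profile in profiles.items():
        clusters[profile].append(gene)
    # Keep only profiles shared by two or more genes.
    return {profile: sorted(genes)
            for profile, genes in clusters.items() if len(genes) > 1}


# Hypothetical search results: which genomes contain a homolog of each gene.
genomes = ["genome1", "genome2", "genome3", "genome4"]
hits = {
    "geneA": {"genome1", "genome3"},
    "geneB": {"genome1", "genome3"},
    "geneC": {"genome1", "genome2", "genome3", "genome4"},
    "geneD": {"genome2"},
}
profiles = build_profiles(hits, genomes)
clusters = cluster_by_profile(profiles)
print(clusters)
```

Here geneA and geneB are present and absent in the same genomes, which is the kind of correlated pattern that suggests they may act in the same pathway or process.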
  • 94. Carboxydothermus hydrogenoformans • Isolated from a Russian hot spring • Thermophile (grows at 80°C) • Anaerobic • Grows very efficiently on CO (carbon monoxide) • Produces hydrogen gas • Low GC Gram positive (Firmicute) • Genome determined (Wu et al. 2005 PLoS Genetics 1: e65.)
  • 95. Homologs of Sporulation Genes Wu et al. 2005 PLoS Genetics 1: e65.
  • 96. Carboxydothermus sporulates Wu et al. 2005 PLoS Genetics 1: e65.
  • 97. Wu et al. 2005 PLoS Genetics 1: e65.
  • 98. PG Profiling Works Better Using Orthology
  • 99. GEBA Lesson 3: Phylogeny-driven genome selection (and phylogenetics) improves genome annotation • Took 56 GEBA genomes and compared results vs. 56 randomly sampled new genomes • Better definition of protein family sequence “patterns” • Greatly improves “comparative” and “evolutionary” based predictions • Conversion of hypotheticals into conserved hypotheticals • Linking distantly related members of protein families • Improved non-homology prediction
  • 100. GEBA Lesson 4: Metadata Important
  • 101. GEBA Phylogenomic Lesson 5 Phylogeny-driven genome selection helps discover new genetic diversity
  • 102. Network of Life Bacteria Archaea Eukaryotes Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
  • 103. Protein Family Rarefaction • Take data set of multiple complete genomes • Identify all protein families using MCL • Plot # of genomes vs. # of protein families
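The numbers behind such a rarefaction plot can be computed with a short sketch. It assumes the family assignment step (MCL on the slide) has already been done, so each genome is simply a set of family identifiers; the toy genomes, the family IDs, and the function name `rarefaction_curve` are hypothetical. The curve is averaged over random genome orderings so it does not depend on the order in which genomes happen to be listed.

```python
import random


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Mean number of distinct protein families after adding 1, 2, ... genomes."""
    rng = random.Random(seed)           # fixed seed keeps the sketch reproducible
    genomes = list(genome_families)
    totals = [0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)            # random order in which genomes are added
        seen = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]
            totals[i] += len(seen)      # families seen after i + 1 genomes
    return [total / n_permutations for total in totals]


# Hypothetical toy data: family IDs found in each genome.
toy_genomes = {
    "genomeA": {1, 2, 3},
    "genomeB": {2, 3, 4},
    "genomeC": {5},
}
curve = rarefaction_curve(toy_genomes, n_permutations=50, seed=1)
print(curve)
```

Plotting the returned values against 1, 2, ..., N genomes gives the curve; running it separately on two genome sets of equal size is one way to compare how quickly each set uncovers new families.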
  • 104. Wu et al. 2009 Nature 462, 1056-1060
  • 105. Wu et al. 2009 Nature 462, 1056-1060
  • 106. Wu et al. 2009 Nature 462, 1056-1060
  • 107. Wu et al. 2009 Nature 462, 1056-1060
  • 108. Wu et al. 2009 Nature 462, 1056-1060
  • 109. Synapomorphies exist Wu et al. 2009 Nature 462, 1056-1060
  • 110. Families/PD not uniform
  • 111. Structural Novelty • Of the 17,000 protein families in the GEBA56, 1,800 are novel in sequence (Wu) • Structural modeling suggests many are structurally novel too (D'Haeseleer) • 372 being crystallized by the PSI (Kerfeld)
  • 112. GEBA Phylogenomic Lesson 6 Improves analysis of genome data from uncultured organisms
  • 113. Great Plate Count Anomaly • Culturing count vs. microscope count
  • 114. Great Plate Count Anomaly • Culturing count <<<< microscope count
  • 115. Environmental DNA Analysis • DNA • Culturing count <<<< microscope count
  • 116. rRNA Phylotyping • Collect DNA from environment • PCR amplify rRNA genes using broad (so-called universal) primers • Sequence • Align to others • Infer evolutionary tree • Unknowns “identified” by placement on tree • Some use BLAST, but not as good as phylogeny
  • 117. rRNA PCR The Hidden Majority Richness estimates Hugenholtz 2002 Bohannan and Hughes 2003
  • 119. rRNA data increasing exponentially too
  • 120. rRNA phylotyping issues • Massive amounts of data – 1 x 10^6 new partial sequences with new 454 – 2 x 10^6 full-length sequences in DB • Alignments of new sequences not always straightforward • Solutions: – Reliance on similarity scores (bad) – High-throughput automated phylogenetic tools • STAP • WATERS
  • 121. Perna et al. 2003
  • 125. Diversity of Proteorhodopsins by PCR de la Torre et al 2003
  • 126. Metagenomics shotgun sequence
  • 127. Massive Diversity of Proteorhodopsins Venter et al., 2004