1. Lecture 13:
EVE 161:
Microbial Phylogenomics
Lecture #13:
Era III: Genome Sequencing and
Phylogenomic Analysis
UC Davis, Winter 2014
Instructor: Jonathan Eisen
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
2. Where we are going and where we have been
• Previous lecture:
– 12: Guest Lecture
• Current lecture:
– 13: Genome Sequencing III
• Next lecture:
– 14: Metagenomics
4. Phylogenomics I: Major Evolutionary Transitions
• Analysis of the S. pombe genome by Wood et al. 2002
• Compared the genomes of eukaryotes to
those of prokaryotes
• “Are there genes found in all eukaryotes
with no obvious homologs in any
prokaryote?”
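A minimal sketch of how this kind of question can be posed computationally, assuming each genome has already been reduced to a set of gene-family identifiers. The genome names, family labels, and the function `eukaryote_specific` are hypothetical toy values for illustration, not data or methods from Wood et al. 2002.

```python
from typing import Dict, Set


def eukaryote_specific(
    presence: Dict[str, Set[str]],
    eukaryotes: Set[str],
    prokaryotes: Set[str],
) -> Set[str]:
    """Return families present in every eukaryote and absent from every prokaryote."""
    # Families shared by all eukaryotic genomes.
    in_all_eukaryotes = set.intersection(*(presence[g] for g in eukaryotes))
    # Families seen in at least one prokaryotic genome.
    in_any_prokaryote = set().union(*(presence[g] for g in prokaryotes))
    return in_all_eukaryotes - in_any_prokaryote


# Hypothetical presence/absence data: genome -> gene families detected in it.
toy_presence = {
    "S_pombe": {"famA", "famB", "famC", "famD"},
    "S_cerevisiae": {"famA", "famB", "famC"},
    "H_sapiens": {"famA", "famB", "famC", "famE"},
    "E_coli": {"famC", "famF"},
    "M_jannaschii": {"famC", "famD"},
}
toy_eukaryotes = {"S_pombe", "S_cerevisiae", "H_sapiens"}
toy_prokaryotes = {"E_coli", "M_jannaschii"}

specific = eukaryote_specific(toy_presence, toy_eukaryotes, toy_prokaryotes)
print(sorted(specific))
```

In practice the hard part is the homology detection that produces the presence/absence sets; the set arithmetic itself is simple.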
5. Evolutionary Model
[Figure: evolutionary tree. Eukaryotes: S. pombe, S. cerevisiae, Encephalitozoon, Worm, Fly, Humans, Dictyostelium, Arabidopsis, Chlamydomonas, Phytophthora, Tetrahymena, Plasmodium, Trypanosoma, Euglena, Naegleria, Trichomonas, Giardia. Prokaryotes: Archaea, Bacteria.]
6. Eukaryotic Specific Genes
• >200 genes found including:
– Cytoskeleton components: tubulin,
ankyrin, myosin
– Protein degradation: ubiquitin, proteases
– Chromatin and DNA packaging
• Of the >200, many had no known function and could encode novel eukaryote-wide processes
7. Multi- vs. Single-Cellular Eukaryotes
• Further analysis of S. pombe genome
• Compared multi-cellular vs. single-cellular eukaryotes
(animals and plants vs. yeast)
• “Are there genes in all multi-cellular and not in any single-cellular?”
• Found only 3
• Concluded that the genetic basis of multi-cellularity was
likely to be gene regulation and not invention of new genes
8. Multiple Origins of Multicellularity
10. Endosymbiont Evolution
• Compared to free-living relatives
– Smaller genomes
– Lower GC content
– Higher pIs
– Higher rates of sequence evolution
• Baumannia shows ALL of these
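Of the traits listed above, GC content is the simplest to compute directly from sequence. The sketch below assumes plain DNA strings; the two sequences are made-up toy strings, not real Baumannia or free-living genome data, and `gc_content` is a hypothetical helper name.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (seq.count("G") + seq.count("C")) / counted


# Toy sequences (assumptions for illustration only).
toy_free_living = "ATGCGCGGCCATGCGC"
toy_endosymbiont = "ATGAATAAATTTATAA"

for name, seq in [("free-living", toy_free_living), ("endosymbiont", toy_endosymbiont)]:
    print(f"{name}: GC = {gc_content(seq):.2f}")
```

Ambiguous bases such as N are ignored in the denominator, which is one common convention; other conventions exist.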
11. Uses of Whole Genome Trees
13. Variation Between Endosymbionts and Free Living
• Repair hypothesis
• Population genetics hypothesis
14. Variation Between Endosymbionts and Free Living
• Repair hypothesis
• Population genetics hypothesis
• Population genetics explanations favored
17. Variation Among Endosymbionts
• Repair hypothesis
• Population genetics hypothesis
18. Variation Among Endosymbionts
• Repair hypothesis
• Population genetics hypothesis
• Repair explanations favored
28. Steps in Lateral Gene Transfer (LGT)
[Figure: lineages A, B, C, D]
29. Steps in Lateral Gene Transfer (LGT)
[Figure: lineages A, B, C, D]
Step 1: Gene acquires host features
30. Steps in Lateral Gene Transfer (LGT)
[Figure: lineages A, B, C, D]
Step 1: Gene acquires host features
Step 2: Transfer
31. Steps in Lateral Gene Transfer (LGT)
[Figure: lineages A, B, C, D]
Step 1: Gene acquires host features
Step 2: Transfer
Steps 3-5: Integration, selection, spread
32. Steps in Lateral Gene Transfer (LGT)
[Figure: lineages A, B, C, D]
Step 1: Gene acquires host features
Step 2: Transfer
Steps 3-5: Integration, selection, spread
Step 6: Amelioration
37. How to Infer Gene Transfers
• Unusual distribution patterns
• Unusual nucleotide composition
• High sequence similarity to supposedly distantly related species
• Unusual gene trees
• Observe transfer events
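As an illustration of the "unusual nucleotide composition" criterion only, the sketch below flags genes whose GC fraction lies far from the mean across all genes in a genome. The gene names, sequences, the z-score cutoff of 2.0, and the function names are all assumptions made for this example; an atypical composition is a hint rather than proof of transfer, and the other lines of evidence in the list still need to be checked.

```python
from statistics import mean, pstdev
from typing import Dict, List


def gc_fraction(seq: str) -> float:
    """Fraction of G and C in a DNA string (assumes only A, C, G, T)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_genes(genes: Dict[str, str], z_cutoff: float = 2.0) -> List[str]:
    """Return names of genes whose GC fraction is more than z_cutoff
    standard deviations from the mean over all genes."""
    gc = {name: gc_fraction(seq) for name, seq in genes.items()}
    values = list(gc.values())
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return []  # every gene has the same composition
    return sorted(name for name, v in gc.items() if abs(v - mu) / sd > z_cutoff)


# Toy genome: five genes at 50% GC and one at 100% GC (hypothetical data).
toy_genes = {
    "geneA": "ATGC" * 3,
    "geneB": "ATGC" * 3,
    "geneC": "ATGC" * 3,
    "geneD": "ATGC" * 3,
    "geneE": "ATGC" * 3,
    "geneF": "GGCC" * 3,
}

print(flag_atypical_genes(toy_genes))
```

Real screens usually use richer signals than GC alone, such as codon usage or oligonucleotide frequencies, and recent transfers are easier to see than old ones because of amelioration.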
38. Case Study I: Aphids
Fig. 1 Coloration and carotenoids in the pea aphid. Typical green (A) and red (B) aphid clones, (C) 5AY, a green mutant clone arising from the red clone 5A. (D) Profiles of carotenoids in red (5A, LSR1), mutant redgreen (5AY, two samples), and green (8-10-1, 7-2-1) pea aphid clones. Torulene and a related red compound are restricted to red clones; the mutant 5AY clone lacks these and displays an elevation in their predicted precursor, -carotene.
39. Case Study I: Aphids
Table 1 Genes in the A. pisum genome with closest homology to carotenoid biosynthetic enzymes, including scaffold of origin and matching EST sequences.
Similar color indicates that the gene is on the same scaffold. The 3' end of scaffold NW_001925130 overlaps with the 5' end of NW_001923501 for 5400 base
pairs, and PCR demonstrated continuity of these scaffolds. Pink row is the gene corresponding to torR and conferring red color (see text). Protein length, amino
acids; ESTs are those present in GenBank, mostly from clone LSR1.
40. Case Study I: Aphids
Fig. 2 Phylogenetic relations of inferred carotenoid biosynthetic enzymes from the pea aphid genome. (A) Carotenoid desaturases and (B) carotenoid cyclase–
carotenoid synthases. Sequences are from aphids, bacteria, plants, and fungi; no homologs were detectable in other sequenced animal genomes. Bootstrap
support greater than 50% is indicated on branches.
41. Case Study II: GEBA
42. Tree of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
43. Genomes Poorly Sampled
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
44. TIGR Tree of Life Project
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
46. Genomes Still Poorly Sampled
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
47. Genomic Encyclopedia of Bacteria & Archaea
Wu et al. 2009 Nature 462, 1056-1060
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
49. GEBA Lesson 1: rRNA utility in IDing novel genomes
From Wu et al. 2009 Nature 462, 1056-1060
50. GEBA Lesson 2: rRNA Tree is not perfect
16S
WGT, 23S
Badger et al. 2005 Int J System Evol Microbiol 55: 1021-1026.
51. GEBA Lesson 3: Phylogenetic sampling improves annotation
• Took 56 GEBA genomes and compared results vs. 56 randomly sampled new genomes
• Better definition of protein family sequence “patterns”
• Greatly improves “comparative” and “evolutionary” based predictions
• Conversion of hypotheticals into conserved hypotheticals
• Linking distantly related members of protein families
• Improved non-homology prediction
52. GEBA Lesson 4: Metadata Important
53. GEBA Lesson 5: Improves discovery of new genetic diversity
54. Protein Family Rarefaction Curves
• Take data set of multiple complete genomes
• Identify all protein families using MCL
• Plot # of genomes vs. # of protein families
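The three steps above can be sketched as follows, assuming the clustering step (for example with MCL) has already been run and each genome is represented by the set of protein-family identifiers found in it. The toy genomes, the family labels, the number of permutations, and the function name `rarefaction_curve` are assumptions for illustration; this is not the code used in Wu et al. 2009.

```python
import random
from typing import Dict, List, Set


def rarefaction_curve(
    families_by_genome: Dict[str, Set[str]],
    n_permutations: int = 100,
    seed: int = 0,
) -> List[float]:
    """Mean number of distinct protein families seen after adding
    1, 2, ..., N genomes, averaged over random genome orderings."""
    rng = random.Random(seed)
    genomes = list(families_by_genome)
    totals = [0.0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)  # random order in which genomes are "sequenced"
        seen: Set[str] = set()
        for i, genome in enumerate(genomes):
            seen |= families_by_genome[genome]
            totals[i] += len(seen)
    return [t / n_permutations for t in totals]


# Hypothetical input: genome -> protein families assigned by a prior clustering step.
toy_families = {
    "genome1": {"a", "b", "c"},
    "genome2": {"b", "c", "d"},
    "genome3": {"c", "d", "e"},
}

curve = rarefaction_curve(toy_families)
for n_genomes, n_families in enumerate(curve, start=1):
    print(n_genomes, round(n_families, 2))
```

A curve that keeps climbing as genomes are added suggests that much protein family diversity remains unsampled; a curve that flattens suggests diminishing returns for that set of genomes.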
55. Wu et al. 2009 Nature 462, 1056-1060
60. Synapomorphies exist
Wu et al. 2009 Nature 462, 1056-1060
61. Phylogenetic Distribution Novelty:
Bacterial Actin Related Protein
[Figure: phylogenetic tree of actin and actin-related proteins. Labeled groups: ACTIN, ARP1, ARP2, ARP3, BARP, ARP4, ARP5, ARP6, ARP7, ARP10. BARP is H. ochraceum gi227395998 (Haliangium ochraceum DSM 14365); the other sequences are eukaryotic. Bootstrap values are shown on branches; scale bar 0.5.]
Patrik D’haeseleer, Adam Zemla, Victor Kunin
See also Guljamow et al. 2007 Current Biology.
63. Haloarchaeal GEBA-like
Lynch et al. (2012) PLoS ONE 7(7): e41389. doi:10.1371/journal.pone.0041389
64. The Dark Matter of Biology
From Wu et al. 2009 Nature 462, 1056-1060
65. GEBA Uncultured
[Table: Number of SAGs from Candidate Phyla (SAR406, OP3, OP1, OD1) by site]
Site A: Hydrothermal vent
Site B: Gold Mine
Site C: Tropical gyres (Mesopelagic)
Site D: Tropical gyres (Photic zone)
Sample collections at 4 additional sites are underway.
Phil Hugenholtz
66. JGI Dark Matter Project
[Figure: single-cell genomics workflow, highlighted novel features, and phylogenetic tree of sampled lineages. Woyke et al. Nature 2013.]
Sample types: brackish/freshwater, seawater, hydrothermal, sediment, bioreactor
Sample sites: TG, HSM, GBS, HOT, SAK, ETL, EPR, TA, GOM
Workflow counts:
• Environmental samples (n=9)
• Isolation of single cells (n=9,600)
• Whole genome amplification (n=3,300)
• SSU rRNA gene based identification (n=2,000)
• Genome sequencing, assembly and QC (n=201)
• Draft genomes (n=201)
Highlighted features:
• UGA recoded for Gly (Gracilibacteria)
• HGT from Eukaryotes (Nanoarchaea)
• Stringent response (Diapherotrites, Nanoarchaea): RelA, SpoT, ppGpp, DksA
• Sigma factor (Diapherotrites, Nanoarchaea)
• Archaeal type purine synthesis (Microgenomates): PurF, PurD, PurL/Q, PurM, PurK, PurE, PurB, PurP
Archaea in tree: Korarchaeota, Cren Thermoprotei, Thaumarchaeota, Cren MCG, Cren pISA7, Cren C2, Aigarchaeota, Nanoarchaea, Micrarchaea, pMC2A384 (Diapherotrites), DSEG (Aenigmarchaea), Nanohaloarchaea, Euryarchaeota
Bacteria in tree: OP11 (Microgenomates), OD1 (Parcubacteria), SR1, BH1, TM7, GN02 (Gracilibacteria), Bacteroidetes, OP1 (Acetothermia), Deinococcus-Thermus, ZB3, Fibrobacteres, TG3, Spirochaetes, WWE1 (Cloacamonetes), Proteobacteria, Firmicutes, Tenericutes, Fusobacteria, Chrysiogenetes, Chlorobi, SAR406 (Marinimicrobia)
70. A Genomic Encyclopedia of Microbes (GEM)
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
72. GEBA Lesson 6: Improves analysis of metagenomic data
73. Other Markers
[Figure: bar chart of Sargasso Phylotypes. Y axis: Weighted % of Clones (0.000 to 0.500). X axis: Major Phylogenetic Group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota). Markers: EFG, EFTu, HSP70, RecA, RpoB, rRNA.]
GEBA Project improves metagenomic analysis
Venter et al., Science 304: 66-74. 2004
74. Venter et al., Science 304: 66-74. 2004
[Figure: bar chart of Sargasso Phylotypes. Y axis: Weighted % of Clones (0.000 to 0.500). X axis: Major Phylogenetic Group. Markers labeled: EFG, EFTu, rRNA, Other Markers.]
But not a lot
75. rRNA Tree of Life
Bacteria
Archaea
Eukaryotes
Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007.
Based on tree from Pace 1997 Science 276:734-740
76. PD: Genomes
From Wu et al. 2009 Nature 462, 1056-1060
77. PD: Genomes + GEBA
From Wu et al. 2009 Nature 462, 1056-1060
78. PD: Isolates
From Wu et al. 2009 Nature 462, 1056-1060
79. PD: All
From Wu et al. 2009 Nature 462, 1056-1060
80. Uncultured Lineages:
Technical Approaches
• Get into culture
• Enrichment cultures
• If abundant in low diversity ecosystems
• Flow sorting
• Microbeads
• Microfluidic sorting
• Single cell amplification
81. GEBA uncultured
[Table: Number of SAGs from Candidate Phyla (SAR406, OP3, OP1, OD1) by site]
Site A: Hydrothermal vent
Site B: Gold Mine
Site C: Tropical gyres (Mesopelagic)
Site D: Tropical gyres (Photic zone)
Sample collections at 4 additional sites are underway.
Phil Hugenholtz