This document contains slides from a lecture on metagenomics and microbial phylogenomics. The lecture discusses the history and development of metagenomics, which involves studying the collective genomes of microbes in an environment. It reviews key papers on metagenomics and the discovery of proteorhodopsin and the SAR11 lineage of bacteria from environmental samples. The slides also discuss previous findings on marine microbes from rRNA studies and introduce two new lineages of alpha- and gamma-proteobacteria identified from an analysis of 16S rRNA genes cloned from Sargasso Sea bacterioplankton DNA.
1. Lecture 14:
EVE 161:
Microbial Phylogenomics
!
Lecture #14:
Era IV: Metagenomics
!
UC Davis, Winter 2014
Instructor: Jonathan Eisen
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!1
2. Where we are going and where we have been
• Previous lecture:
! 13: Era III: Sequencing
• Current Lecture:
! 14: Era IV: Metagenomics
! Next Lecture:
! 15: Era IV: Metagenomics
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!2
3. Era IV: Genomes in the environment
Era IV:
Genomes in the Environment
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
4. Metagenomics History
•
Term first use in 1998 paper by Handelsman et al.
Chem Biol. 1998 Oct;5(10):R245-9. Molecular
biological access to the chemistry of unknown soil
microbes: a new frontier for natural products
•
Collective genomes of microbes in the soil termed
the “soil metagenome”
•
Good review is Metagenomics: Application of
Genomics to Uncultured Microorganisms by
Handelsman Microbiol Mol Biol Rev. 2004
December; 68(4): 669–685.
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
5. 17 GPa. J. Geophys. Res. 102, 12252±12263 (1997).
28. Duffy, T. S. & Vaughan, M. T. Elasticity of enstatite and its relationships to crystal structure. J. Geophys.
Res. 93, 383±391 (1988).
characteristic of
cycle seen with E
proteorhodopsin
Furthermore,
Acknowledgements
Bacterial Rhodopsin: M. Ferrero and Boudier spectrum of
We thank C. Karger for participation in the measurements, and EvidenceF.for a New Type of the
and same isosbes
for providing the samplesPhototrophy Papua NewSea respectively. The
from Baldissero and in the Guinea,
Laboratoire de Tectonophysique's EBSD system was funded by the CNRS/INSU,
O ded Béjà, et al.
dopsin expressed
Â
Universite of Montpellier Scienceproject , 1902 of an archean craton''. This work
II, and NSF 289 ``Anatomy (2000);
chemical reactio
Â
was supported by the CNRS/INSU programme ``Action Thematique Innovante''.
D O I: 10.1126/science.289.5486.1902
generate a red-s
Correspondence should be addressed to A.T.
to that of recomb
(e-mail: deia@dstu.univ-montp2.fr). is for your personal, non-commercial use only.
This copy
¯ash-photolysis
http://www.sciencemag.org/content/289/5486/1902.full existence of pro
retinal molecules
waters.
The photocyc
If you wish to distribute this article to others, you can order high-quality copies for
and, after hydro
colleagues, clients, or custom ers by clicking here .
adding all-trans
indeed derive fro
Permission to republish or repurpose articles or portions of articles can be obta
Papers for Today
.................................................................
Proteorhodopsin phototrophy
following ocean
in the the guidelines here .
The following resources related to this article are available
 Á
Oded Beja*², Elena N. Spudich²³, John L. Spudich³, Marion Leclerc* online at www.scien
(this information is current as of May 18, 2010 ):
& Edward F. DeLong*
2
orbance (×10–3)
Updated information and services, including high-resolution figures, can be found in
1
* version of this article at:
Monterey Bay Aquarium Research Institute, Moss Landing, California 95039,
4
USA
http://www.sciencem ag.org/cgi/content/full/289/5486/1902
http://www.nature.com/nature/journal/v411/n6839/abs/411786a0.html
0
³ Department of Microbiology and Molecular Genetics, The University of
A list of selected Houston, Texas 77030, the
–1
Texas Medical School,additional articles onUSA S cience W eb sites related to this article c
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
²found authors contributed equally to this work
These at:
6. Papers for Today
Bacterial Rhodopsin: Evidence for a New Type of
Phototrophy in the Sea
O ded Béjà, et al.
Science 289, 1902 (2000);
D O I: 10.1126/science.289.5486.1902
This copy is for your personal, non-commercial use only.
http://www.sciencemag.org/content/289/5486/1902.full
If you wish to distribute this article to others, you can order high-quality copies for
colleagues, clients, or custom ers by clicking here .
Permission to republish or repurpose articles or portions of articles can be obta
following the guidelines here .
The following resources related to this article are available online at www.scien
(this information is current as of May 18, 2010 ):
Updated information and services, including high-resolution figures, can be found in
version of this article at:
http://www.sciencem ag.org/cgi/content/full/289/5486/1902
A list of selected additional articles on the S cience W eb sites related to this article c
Slides
found at: for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
7. gene
own transducer of light stimuli [for example,
leSAR86 (22, 23)]. Although sequence analysis of
geHtr
idenproteorhodopsin shows moderate statistical
roteosupport for a specific relationship with senfrom
opsins
erent.
hereas
philes
r than
rmine
l, we
a coli
presrotein
3A).
nes of
poprom was
(Fig.
at 520
banderated
odopnce of
dth is
the kinetics of its photochemical reaction cycle. The transport rhodopsins (bacteriorhodopsins and halorhodopsins) are characterized by cyclic photochemical reaction se-
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!7
8. Marine Microbe Background
• rRNA PCR studies of marine microbes have been extensive
• Comparative analysis had revealed many lineages, some very
novel, some less so, that were dominant in many, if not all,
open ocean samples
• Lineages given names based on specific clones: e.g., SAR11,
SAR86, etc
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!8
11. Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
12. APPLIED
AND
ENVIRONMENTAL MICROBIOLOGY, June 1991,
0099-2240/91/061707-07$02.00/0
Copyright C) 1991, American Society for Microbiology
Vol. 57, No. 6
p. 1707-1713
Phylogenetic Analysis of a Natural Marine Bacterioplankton
Population by rRNA Gene Cloning and Sequencing
THERESA B. BRITSCHGI AND STEPHEN J. GIOVANNONI*
Department of Microbiology, Oregon State University, Corvallis, Oregon 97331
VOL. 57, 1991
Received
5
February 1991/Accepted 25 March 1991
MARINE BACTERIOPLANKTON
The identification of the prokaryotic species which constitute marine bacterioplankton communities has been
a long-standing problem in marine microbiology. To address this question, we used the polymerase chain
CZ
reaction to construct and analyze a library of 51 small-subunit (16S) rRNA genes cloned from Sargasso Sea
bacterioplankton genomic DNA. Oligonucleotides complementary to conserved regions in the 16S rDNAs of
0._
* C)
eubacteria were used to direct the synthesis of polymerase chain reaction products, which were then cloned by
u
blunt-end ligation into the phagemid vector pBluescript. Restriction fragment length polymorphisms and
inmm Vibrio Synechococcus phylogeneticC) groups
hybridizations to oligonucleotide probes for the SARll and marine anguillarum
10
0
indicated the presence of at least seven classes of genes. The sequences of five unique rDNAs were determined
0
from
completely. In addition to 16S rRNA genes - the marine Synechococcus cluster and the previously identified
but uncultivated microbial group, the SARll cluster [S. J. Giovannoni, T. B. Britschgi, C. L. Moyer, and
K. G. Field, Nature (London) 345:60-63], two new gene classes were observed. Phylogenetic comparisons
indicated that these belonged to unknown species of a- and -y-proteobacteria. The data confirm the earlier
crescentus
conclusion that a majority of planktonic bacteria are new species previously unrecognized by bacteriologists.
1711
Escherichia coli
Vibrio harveyi
P.
4..
titative hybridization data which demonstrated that a novel
lineage belonging to the oa-proteobacteria was abundant in
and amplification of rDNA. A surface (2-m) 0
Collection SAR139
the Sargasso Sea; we also identified novel lineages which
bacterioplankton population was collected from hydrostaI
I
related to cultivated marine Synechococwere very closely
S (32° 4'N, 64° 23'W) in the Sargasso Sea as
cus spp. (6). In a 0.05 of a
study
0coworkers examined clonedhot-spring ecosystem, Ward and tion 0.15 (8). The polymerase chain reaction (24)previously
0.10
was used
described
fragments of 16S rRNA cDNAs
to produce double-stranded rDNA from the mixed-microbiand found numerous novel lineages but no matches to genes
al-population Sargasso Sea to representative, cultivated species (2, 6, 18,
from the genomic DNA prepared from the plankton
FIG. 4. Phylogenetic tree cultivated hot-spring bacteria (33). the rDNA clones
from showing relationships of These results sugsample. The amplification and were
30, 31, 32, 35, 40). Positionsthat natural ecosystems in general may include spe- containingregions in the 5' primers regionscomplementary to
gested of uncertain homology in regions
of the 16S rRNA
conserved insertions and 3' deletions were omitted from the analysis.
cies which are unknown to microbiologists. Jukes and Cantor The amplification corrects for the effects of superimposed mutations.
of
Evolutionary distances were calculated by the methoda genetic marker, genes. (15), which primers were the universal 1406R
Although any gene may be used as
and eubacterial 68F sequence
(ACGGGCGGTGTGTRC)rooted with the (TNANAC of Bacillus subtilis (38).
distance matrix method ATGCAAGTCGAKCG)
The phylogenetic tree was determineddistinctaadvantages. The extensive use of (20). The tree was (17). The reaction conditions were
rRNA genes offer by
16S rRNAs for studies of microbial systematics and evoluR. Woese as the as follows:
Sequence data not referenced were provided by C.data bases, such and R. Rossen. 1 ,ug of template DNA, 10 ,ul of 10x reaction
tion has resulted in large computer
buffer (500 mM KCI, 100 mM Tris HCI [pH 9.0 at 250C], 15
RNA Data Base Project, which encompass the phylogenetic
mM MgCl2, 0.1% gelatin, 1% Triton X-100), 1 U of Taq DNA
diversity found within culture collections. rRNA genes are
polymerase (Promega Biological Research Products, Madison, Wis.), 1 ,uM (each) primer, 200 ,uM (each) dATP, dCTP,
C, 1
Slides for UC Davis EVE161 CoursedGTP, and dTTP in aJonathan EisenatWinter 2014
Taught by 100-,ul total volume; 1 min 94°chain
*
min at 72°
We Polymerase
conserved
phyla were Corresponding author. in the corresponding min at 600C, 4 C to C C; 30 cycles. note with caution that some classes
880
phototroph
(G
-
G).
D ownloa de d from a e m .a sm .org a t U N IV
r SARIOO
D ownloa de d from a e m .a sm .org a t U N IV O F C ALIF D AV IS on M a y 1 8 , 2 0 1 0
Oceanospirillum linum
SAR 92
Caulobacter
SAR83
.0
-bacte- Erythrobacter and therefore can be 0 to examine
OCh 114
used
Q
highly conserved
Within the past decade, it has become evident that
- Hyphomicrobium vulgare
distant phylogenetic relationships with accuracy. The presrioplankton contribute significantly to biomass and bioC)
0
(4, 36). Until
ence of large numbers
geochemical activity in planktonic systems Rhodopseudomonas marina of ribosomes within cells and the
0
regulation of their biosyntheses in proportion to cellular
recently, progress in identifying the microbial species which
Rochalimaearates make these molecules ideally O) for ecologsuited
growth quintana
constitute these communities was slow, because the majore
counts)
ical studies with nucleic acid
ity of the organisms present (as measured by direct Agrobacterium tumefaciens probes.
Here we describe the partial analysis of a 16S rDNA
cannot be recovered in cultures (5, 14, 16). The discovery a
library prepared from photic zone Sargasso Sea bacteriodecade ago that unicellular cyanobacteria are widely distribSARI
more recent observation
plankton using a
uted in the open ocean (34) and the
-Es The general approacha for cloning eubacterial
Sargasso
16S rDNAs. SAR95 Sea, central oceanic gyre,
of abundant marine prochlorophytes (3) have underscored
typifies the oligotrophic conditions of the open ocean, prothe uncertainty of our knowledge about bacterioplankton
SAR1I microbial community adapted to
viding an example of a
community structure.
low-nutrient conditions.
extremely - Liverwort CP The results extend our
Molecular approaches are now providing genetic markers
previous observations of high genetic diversity among
for the dominant bacterial species in natural microbial pop*Anacystis nidulans
closely related lineages within this microbial community and
ulations (6, 21, 33). The immediate objectives of these
rDNA
provide - genetic markers for two novel eubacterial
studies are twofold: (i) to identify species, both known and
Prochlorothrix
lineages.
novel, by reference to sequence data bases; and (ii) to
- SAR6
construct species-specific probes which can be used in
0
quantitative ecological studies (7, 29).
* SAR7
p.0
Previously, we reported rRNA genetic markers and quanMATERIALS AND METHODS
-
13. JUlY 1991, p. 4371-4378
0021-9193/91/144371-08$02.00/0
Copyright C) 1991, American Society for Microbiology
JOURNAL OF BACTERIOLOGY,
Vol. 173, No. 14
Analysis of a Marine Picoplankton Community by 16S rRNA
Gene Cloning and Sequencing
THOMAS M. SCHMIDT,t EDWARD F. DELONG,t
NORMAN R. PACE*
Department of Biology and Institute for Molecular and Cellular Biology, Indiana University,
Bloomington, Indiana 47405
AND
Received 7 January 1991/Accepted 13 May 1991
The phylogenetic diversity of an oligotrophic marine picoplankton community was examined by analyzing
the sequences of cloned ribosomal genes. This strategy does not rely on cultivation of the resident
microorganisms. Bulk genomic DNA was isolated from picoplankton collected in the north central Pacific
Ocean by tangential flow filtration. The mixed-population DNA was fragmented, size fractionated, and cloned
into bacteriophage lambda. Thirty-eight clones containing 16S rRNA genes were identified in a screen of 3.2
x
104 recombinant phage, and portions of the rRNA gene were amplified by polymerase chain reaction and
D ownloa de d from jb.a sm
sequenced. The resulting sequences were used to establish the identities of the picoplankton by comparison with
an established data base of rRNA sequences. Fifteen unique eubacterial sequences were obtained, including
four from cyanobacteria and eleven from proteobacteria. A single eucaryote related to dinoflagellates was
identified; no archaebacterial sequences were detected. The cyanobacterial sequences are all closely related to
sequences from cultivated marine Synechococcus strains and with cyanobacterial sequences obtained from the
Atlantic Ocean (Sargasso Sea). Several sequences were related to common marine isolates of the y subdivision
of proteobacteria. In addition to sequences closely related to those of described bacteria, sequences were
obtained from two phylogenetic groups of organisms that are not closely related to any known rRNA sequences
from cultivated Slides for UC Davis EVE161 Course Taught by Jonathan Eisen one group within the
organisms. Both of these novel phylogenetic clusters are proteobacteria, Winter 2014
15. libraries prepared using PCR methods that reduced sequence artefacts17. They concluded that most sequence variation was clustered
35
% of 16S rRNA sequences
30
25
20
15
10
5
0
)
e
n
II
r) tes ia) ria nas ia) era de xi)
ria up kto lad cte
r
r
e
e
la
e
id cte act mo cte eim r c rofl
Ia acte gro lan a C ba
o
s
a
a
e
o
i
b
r
h
o
o
up eob sub ytop cter ibr cte eob tino lter eob ein act Chl
o
b
(F Ba rot Ac doa rot Rh eo 2 (
11 ph ba
gr rot
s
0
ub γ-P SAR ico teo 06
u
e
-P
-P
4
Ro R2
1s (
P ro
(δ arin Pse (α
1
R
e
6
6
4 M
/
SA
in α-P SA
AR R 8
r
as R11
32
S A
a
R
on SA
S
M red
SA
m
tu
ro
ul
c
lte
A
Un
+
Ib
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
Phylogenetic clade
The ecoty
many micro
tions in the w
best exampl
entiated by
adapted (hig
Phylogeneti
gests that th
cally distinct
Cluster A S
of which ca
characterist
urobilin)37,3
ample suppo
teristics that
SS120 has a
ammonium
extreme, Syn
nitrate, cyan
interesting t
seem to pro
whereby nu
conditions.
seasonal spe
lular cyanob
!15
The obse
16. DeLong Lab
•
•
Studying Sar86 and other marine plankton
Note - published one of first genomic studies of
uncultured microbes - in 1996
JOURNAL OF BACTERIOLOGY, Feb. 1996, p. 591–599
0021-9193/96/$04.00 0
Copyright 1996, American Society for Microbiology
Vol. 178, No. 3
Characterization of Uncultivated Prokaryotes: Isolation and
Analysis of a 40-Kilobase-Pair Genome Fragment
from a Planktonic Marine Archaeon
JEFFEREY L. STEIN,1* TERENCE L. MARSH,2 KE YING WU,3 HIROAKI SHIZUYA,4
3
AND EDWARD F. DELONG *
Recombinant BioCatalysis, Inc., La Jolla, California 920371; Microbiology Department, University
of Illinois, Urbana, Illinois 618012; Department of Ecology, Evolution and Marine Biology,
University of California, Santa Barbara, California 931063; and Division of Biology,
California Institute of Technology, Pasadena, California 911254
Received 14 July 1995/Accepted 14 November 1995
D ownloa de d from
One potential approach for characterizing uncultivated prokaryotes from natural assemblages involves
genomic analysis of DNA fragments retrieved directly from naturally occurring microbial biomass. In this
study, we sought to isolate large genomic fragments from a widely distributed and relatively abundant but as
yet uncultivated group of prokaryotes, the planktonic marine Archaea. A fosmid DNA library was prepared
from a marine picoplankton assemblage collected at a depth of 200 m in the eastern North Pacific. We
identified a 38.5-kbp recombinant fosmid clone which contained an archaeal small subunit ribosomal DNA
gene. Phylogenetic analyses of the small subunit rRNA sequence demonstrated its close relationship to that of
previously Slides for UC Davis EVE161 Course coherent group rooted deeply within the Crenarchaeota
described planktonic archaea, which form a Taught by Jonathan Eisen Winter 2014
17. Delong Lab
GENOMIC FRAGMENTS FROM PLANKTONIC MARINE ARCHAEA
593
ments isolated from fosmid clones
with various restriction endonucle10 kb, the F-factor-based vector
the fosmid subfragments. Partial
of restriction enzyme to 1 ⇥g of
mixture. The reaction mixture was
removed at 10, 40, and 60 min.
dding 1 ⇥l of 0.5 M EDTA to the
e. The partially digested DNA was
s described above except using a 1he sizes of the separated fragments
n standards. The distances of the
d SP6 promoter sites on the excised
pmol of T7- or SP6-specific oligol) and hybridizing with Southern
artial sequences reported in Table
the following accession numbers:
U40243, U40244, and U40245. The
and EF2 have been submitted to
and U41261.
FIG. 1. Flowchart depicting the construction and screening of an environmental library from a mixed picoplankton sample. MW, molecular weight;
PFGE, pulsed-field gel electrophoresis.
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
D ownloa de d from jb.a sm .org a t U N IV O
fosmid and pBAC clones digested
probed with labeled T7 and SP6
eled subclones and PCR fragments
otgun sequencing described above.
e estimates from the partial digesof the fosmids and their subclones.
and DeSoete distance (9) analyses
n using GDE 2.2 and Treetool 1.0,
(RDP) (23). DeSoete least squares
ng pairwise evolutionary distances,
to account for empirical base fretained from the RDP, version 4.0
rRNA sequences were performed
the RDP. For distance analyses of
lutionary distances were estimated
d tree topology was inferred by the
n addition and global branch swapprotein sequences, the Phylip proaddition and ordinary parsimony
!17
18. Delong Lab
J. BACTERIOL.
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
D ownloa de
FIG. 4. High-density filter replica of 2,304 fosmid clones containing approximately 92 million bp of DNA cloned from the mixed picoplankton community.
The filter was probed with the labeled insert from clone 4B7 (dark spot). The
lack of other hybridizing clones suggests that contigs of 4B7 are absent from this
portion of the library. Similar experiments with the remainder of the library
yielded similar results.
!18
19. on was des in a cell
ward transin proteornd only in
(Fig. 4A).
edium was
ce of a 10
re carbonyl
19). Illumiical potenright-sidence of retilight onset
hat proteocapable of
physiologe activities
containing
proteorhomain to be
has Proteorhodopsin in Its Genome
D ownloa de d from w
generated
eorhodopSAR86
resence of
ndwidth is
absorption
. The rednm in the
ated Schiff
ably to the
Fig. 1. (A) Phylogenetic tree of bacterial 16S rRNA gene sequences, including that encoded on the
130-kb bacterioplankton BAC clone (EBAC31A08) (16). (B) Phylogenetic analysis of proteorhodopsin with archaeal (BR, HR, and SR prefixes) and Neurospora crassa (NOP1 prefix) rhodopsins (16).
Nomenclature: Name_Species.abbreviation_Genbank.gi (HR, halorhodopsin; SR, sensory rhodopsin;
BR, bacteriorhodopsin). Halsod, Halorubrum sodomense; Halhal, Halobacterium salinarum (halobium); Halval, Haloarcula vallismortis; Natpha, Natronomonas pharaonis; Halsp, Halobacterium sp;
Neucra, Neurospora crassa.
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!19
20. Bacteriorhodopsin and its relatives
Figure 3. Phylogenetic tree based on the amino acid sequences of 25 archaeal rhodopsins. (a) NJ-tree. The numbers at each node are clustering probabilities generated by
bootstrap resampling 1000 times. D1 and D2 represent gene duplication points. The four shaded rectangles indicate the speciation dates when halobacteria speciation occurred at
the genus level. (b) ML-tree. Log likelihood value for ML-tree was −6579.02 (best score) and that for topology of the NJ-tree was −6583.43. The stippled bars indicate the 95%
confidence limits. Both trees were tentatively rooted at the mid-point of the longest distance, although true root positions are unknown.
!
From Ihara et al. 1999
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!20
22. Proteorhodopsin Predicted Secondary Structure
RESEARCH ARTICLES
Fig. 2. Secondary
structure of proteorhodopsin. Singleletter amino acid
codes are used (33),
and the numbering
is as in bacteriorhodopsin.
Predicted
retinal binding pocket
residues
are
marked in red.
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!22
23. Proteorhodopsin in E. coli
Fig. 3. (A) Proteorhodopsin-expressing E. coli cell suspension (ϩ) compared to control cells (Ϫ),
both with all-trans retinal. (B) Absorption spectra of retinal-reconstituted proteorhodopsin in E. coli
membranes (17). A time series of spectra is shown for reconstituted proteorhodopsin membranes
(red) and a negative control (black). Time points for spectra after retinal addition, progressing from
low to high absorbance values, are 10, 20, 30, and 40 min.
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
duce
in th
occu
pigm
and
ed a
tiona
sorp
in 0.
sorp
botto
nated
retin
ms d
deca
shift
appe
term
cay
step
singl
upw
amp
gene
with
ms p
reco
phot
!23
prod
24. both with all-trans retinal. (B) Absorption spectra of retinal-reconstituted proteorhodopsin in E. coli
membranes (17). A time series of spectra is shown for reconstituted proteorhodopsin membranes
(red) and a negative control (black). Time points for spectra after retinal addition, progressing from
low to high absorbance values, are 10, 20, 30, and 40 min.
Proteorhodopsin function
Fig. 4. (A) Light-driven transport of protons by a
proteorhodopsin-expressing E. coli cell suspension.
The beginning and cessation of illumination (with
yellow light Ͼ485 nm) is indicated by arrows labeled ON and OFF, respectively. The cells were
suspended in 10 mM NaCl, 10 mM MgSO4⅐7H2O, and 100 M CaCl2. (B) Transport of 3Hϩ-labeled
tetraphenylphosphonium ([3Hϩ]TPP) in E. coli right-side-out vesicles containing expressed proteorhodopsin, reconstituted with (squares) or without (circles) 10 M retinal in the presence of light (open
symbols) or in the dark (solid symbols) (20).
1904
geneity in
with 87% o
ms photocy
recovery. A
photocycle
produces a b
this alternat
well as a tw
component,
the total am
with a 45amplitude).
cycle rate, w
teristic of
strong evid
tions as a tr
rhodopsin.
Implicat
harbor the
tributed in
bacteria hav
ture-indepen
oceanic reg
Oceans, as
(8, 25–29).
tribution, pr
this ␥-prote
15 SEPTEMBER 2000 VOL 289 SCIENCE www.sciencemag.org
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
25. Figure 5
Laser flash-induced absorbance
changes in suspensions of E. coli
membranes containing
proteorhodopsin. A 532-nm pulse
(6 ns duration, 40 mJ) was delivered
at time 0, and absorption changes
were monitored at various
wavelengths in the visible range in
a lab-constructed flash photolysis
system as described (34). Sixtyfour transients were collected for
each wavelength. (A) Transients at
the three wavelengths exhibiting
maximal amplitudes. (B) Absorption
difference spectra calculated from
amplitudes at 0.5 ms (blue) and
between 0.5 ms and 5.0 ms (red).
http://www.sciencemag.org/content/289/5486/1902/F5.expansion.html
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
26. Correspondence should be addressed to A.T.
Papers for(e-mail: deia@dstu.univ-montp2.fr).
Today
.................................................................
Proteorhodopsin phototrophy
in the ocean
to that of recomb
¯ash-photolysis
existence of pro
retinal molecules
waters.
The photocyc
and, after hydro
adding all-trans
indeed derive fro
 Á
Oded Beja*², Elena N. Spudich²³, John L. Spudich³, Marion Leclerc*
& Edward F. DeLong*
Δ Absorbance (×10–3)
2
1
* Monterey Bay Aquarium Research Institute, Moss Landing, California 95039,
USA
http://www.nature.com/nature/journal/v411/n6839/abs/411786a0.html
0
³ Department of Microbiology and Molecular Genetics, The University of
–1
Texas Medical School, Houston, Texas 77030, USA
² These authors contributed equally to this work
–2
Proteorhodopsin1, a retinal-containing integral membrane protein that functions as a light-driven proton pump, was discovered
in the genome of an uncultivated marine bacterium; however, the
prevalence, expression and genetic variability of this protein in
native marine microbial populations remain unknown. Here we
report that photoactive proteorhodopsin is present in oceanic
surface waters. We also provide evidence of an extensive family of
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
–3
–4
3
2
0–3)
..............................................................................................................................................
4
0
30. nes from the HOT station were
m samples. Most proteorhodopsurface waters belonged to the
(on the amino-acid level) to the
40E8. In contrast, most of the
) fell within the Antarctic clade
es from the HOT station, HOT
d in E. coli and their absorption
d from their primary sequences,
ng with the Antarctic clade gave
um, whereas the proteorhodopclade gave a green (527-nm)
central North Paci®c gyre, most
ge, with maximal intensity near
eak is maintained over depth,
At the surface, the energy peak
th between 400 and 650 nm. In
narrows and the half bandwidth
00 nm (Fig. 5). Considering the
members of the two different
0.06
0.05
Absorbance
rom Antarctica (E. F. DeLong,
dopsin primers, and sequenced
minary sequence analyses based
indicate that, despite differences
orption spectra, the Antarctic
a bacterium highly related to
AR86 bacteria of Monterey Bay10
dopsin genetic variants in mixed populations with a selective
advantage at different points along the depth-dependent light
0.04
HOT 75 m
Antarctica
0.03
HOT 0 m
0.02
Monterey
Bay
0.01
1.0
Relative irradiance
mechanisms of wavelength regurhodopsins. Furthermore, palE6
larly to its Monterey Bay
diated transport of protons in
. Spudich et al., manuscript in
5m
0.8
0.6
75 m
0.4
0.2
0
350
400
450
500
550
600
650
Wavelength (nm)
Figure 5 Absorption spectra of retinal-reconstituted proteorhodopsins in E. coli
membranes. All-trans retinal (2.5 mM) was added to membrane suspensions in 100 mM
phosphate buffer, pH 7.0, and absorption spectra were recorded. Top, four spectra for
palE6 (Antarctica), HOT 75m4, HOT 0m1, and BAC 31A8 (Monterey Bay) at 1 h after
retinal addition. Bottom, downwelling irradiance from HOT station measured at six
wavelengths (412, 443, 490, 510, 555 and 665 nm) and at two depths, for the same
depths and date that the HOT samples were collected (0 and 75 m). Irradiance is plotted
relative to irradiance at 490 nm.
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!30
32. (e-mail: denning@atmos.colostate.edu).
More on Large Insert Metagenomics
6
.................................................................
98
Unsuspected diversity among marine
aerobic anoxygenic phototrophs
 Á
Oded Beja*², Marcelino T. Suzuki*, John F. Heidelberg³,
William C. Nelson³, Christina M. Preston*, Tohru Hamada§²,
Jonathan A. Eisen³, Claire M. Fraser³ & Edward F. DeLong*
* Monterey Bay Aquarium Research Institute, Moss Landing,
http://www.nature.com/nature/journal/v415/n6872/full/415630a.html
California 95039-0628, USA
³ The Institute for Genomic Research, Rockville, Maryland 20850, USA
§ Marine Biotechnology Institute, Kamaishi Laboratories, Kamaishi City,
Iwate 026-0001, Japan
..............................................................................................................................................
Aerobic, anoxygenic, phototrophic bacteria containing bacteriochlorophyll a (Bchla) require oxygen for both growth and Bchla
synthesis1±6. Recent reports suggest that these bacteria are widely
distributed in marine plankton, and that they may account for up
to 5% of Slides for UCocean photosyntheticby Jonathan Eisen Winter 2014and
surface Davis EVE161 Course Taught electron transport7
8
b
33. esearch, Rockville, Maryland 20850, USA
ute, Kamaishi Laboratories, Kamaishi City,
........................................................................................
100
100
67
99
envHOT2
envHOT3
cDNA 20m10
BAC 29C02
BAC 39B11
BAC 52B02
100
54 BAC 24D02
BAC 65D09
rRNA
PUF
Chloroflexus aurantiacus
ototrophic bacteria containing bacterioa
BAC 56B12
quire oxygen for both growth and Bchla
BAC 60D04
b
Roseobacter denitrificans*
100
0.1
BAC 30G07
rts suggest that these bacteria are widely
0.1
100
cDNA 0m13
Roseobacter litoralis*
52
ankton, and that they may account cDNAup
for 20m11
91
a, b, Evolutionary distances for the
R2A84*
and
photosynthetic electron transport7env20m1
pufM genes (a) were determined from
100
76
R2A62*
env20m5
α-3
obial community8. Known planktonic20m22
an alignment of 600 nucleotide
cDNA
MBIC3951*
100
env0m2
99
positions, and for rRNA genes (b) from
belong to only a few restricted groups
Rhodobacter capsulatus
cDNA 20m8
95
R2A84*
an alignment of 860 nucleotide
Rhodobacter sphaeroides
72
ia a-subclass. Here we report genomic
R2A62*
R2A163*
Rhodovulum sulfidophilum*
sequence positions. Evolutionary
100
thetic gene content and operon organizaRhodobacter capsulatus
95
relationships were determined by
Erythromicrobium ramosum
α-3
ng marine bacteria. TheseRhodobacter sphaeroides
photosynthetic
Rhodovulum sulfidophilum*
Erythrobacter longus*
82
neighbour-joining analysis (see
69
ome that most closely resembled those of
Roseobacter denitrificans*
100
α-4
Erythrobacter litoralis*
Methods). The green non-sulphur
Roseobacter litoralis*
73
b-subclass, which have never before been
MBIC3019*
MBIC3951*
100
bacterium Chloroflexus aurantiacus
61
82
cDNA
ronments. Furthermore, thesecDNA0m1
photosynSphingomonas natatoria
61
was used as an outgroup. pufM genes
20m21
65
100
env0m1
Rhodomicrobium vannielii α-2
ly distributed in marine plankton, and
99
that were amplified by PCR in this
envHOT1
Rhodopseudomonas palustris
98
itic bacterioplankton assemblages, indistudy are indicated by the env prefix,
Erythrobacter longus*
100
Rhodospirillum rubrum α-1
69
Erythromicrobium ramosum
nti®ed phototrophs were photosynthetiwith 'm' indicating Monterey, and HOT
100 96
Erythrobacter litoralis*
Rubrivivax gelatinosus
100
α-4
ta demonstrate that planktonic bacterial
indicating Hawaii ocean time series.
98
MBIC3019*
Roseateles depolymerans
100
Sphingomonas natatoria
Cultivated aerobes are marked in light
β
ply composed of one uniform, widespread
Rhodoferax fermentans
100
cDNA 0m20
blue, bacteria cultured from sea water
otrophs, as previously proposed8; rather,
Rhodocyclus tenuis
Thiocystisge latinosa
96
100
γ
are marked with an asterisk, and
Allochromatium vinosum
ain multiple, distantly related, photoThiocystis gelatinosa
100
γ
Rhodopseudomonas palustris
environmental cDNAs are marked in
Allochromatium vinosum
erial groups, including some unrelated
α–1
Rhodospirillum rubrum
α-2
Chloroflexus aurantiacus red. Photosynthetic -, - and Rhodomicrobium vannielii
56
types.
Rhodocyclus tenuis
proteobacterial groups are indicated
52
Rubrivivax gelatinosus
uired for the formation of bacteriochloro- β
by the vertical bars to the right of the
Roseateles depolymerans
AAP
ystems in aerobic, anoxygenic, photo- Figure 1 Phylogenetic relationships of pufM gene (a) and rRNA (b) sequences of Bootstrap values greater than
Rhodoferax fermentans
tree.
re clustered in a contiguous, 45-kilobase bacteria. a, b, Evolutionary distances for the pufM genes (a) were determined from an indicated above the
envHOT2
100
50% are
envHOT3
100
(superoperon)6. These include bch and crt alignment of 600 nucleotide positions, and for rRNA genes (b) from an alignment of 860 The scale bar represents
cDNA 20m10
branches.
67
BAC 29C02
by neighbournzymes of the bacteriochlorophyll and nucleotide sequence positions. Evolutionary relationships were determined number of substitutions per site.
BAC 39B11
99
BAC 52B02
pathways, and the puf genes coding for joining analysis (see Methods). The green non-sulphur bacterium Chloro¯exus
100
54 BAC 24D02
harvesting complex (pufB and pufA) and aurantiacus was used as an outgroup. pufM genes that were ampli®ed by PCR in this
BAC 65D09
Chloroflexus aurantiacus
ex (pufL and pufM). To better describe the study are indicated by the env pre®x, with `m' indicating Monterey, and HOT indicating
planktonic, anoxygenic, photosynthetic Hawaii ocean time series. Cultivated aerobes are marked in light blue, bacteria cultured
b
100
surface-water bacterial Roseobacter denitrificans*
arti®cial chromo- from sea water are marked with an asterisk, and environmental cDNAs are marked in red.
0.1
Roseobacter litoralis*
52
Slides for UC Davis EVE161and g-proteobacterial by Jonathan Eisen Winter bars to
Course Taught groups are indicated by the vertical 2014
91
Photosynthetic a-, b-
35. letters to nature
a
b
BchB
Rhodobacter capsulatus
BAC 60D04
Acidiphilium rubrum
Rhodopseudomonas palustris
Rubrivivax gelatinosus
Rhodobacter sphaeroides
100/94
Acidiphilium rubrum
BchH
Rubrivivax gelatinosus
100/66
100/100
Rhodopseudomonas palustris
BAC 60D04
BAC 29C02
100/100
BAC 65D09
Rhodobacter capsulatus
100/96
Rhodobacter sphaeroides
100/100
100/100
100/100
0.1
100/100
BAC 29C02
100/100 100/99
BAC 65D09
100/100
0.1
100/100
Heliobacillus mobilis
Heliobacillus mobilis
Chloroflexus aurantiacus
Chlorobium tepidum
Figure 3 Phylogenetic analyses of BchB and BchH proteins. a, Phylogenetic tree for the
BchB protein. b, Phylogenetic tree for the BchH protein. The BchH sequences from
Chlorobium vibrioforme25and BchH2 and BchH3 from C. tepidum18 were omitted from the
tree because these genes potentially encode an enzyme for bacteriochlorophyll c
Chloroflexus aurantiacus
Chlorobium tepidum
biosynthesis and are probably of distinct origin (J. Xiong, personal communication).
Bootstrap values (neighbour-joining/parsimony method) greater than 50% are indicated
next to the branches. The scale bar represents number of substitutions per site. The
position of Acidiphilium rubrum (bold branch) was not well resolved by both methods.
We compared several photosynthetic genes found on the oceanic operon) to cultivated Erythrobacter species.
bacteriochlorophyll superoperons with characterized homologues
Recently, it has been suggested that cultivated Erythrobacter
from cultured photosynthetic bacteria. The relationships among species (a-4 subclass of proteobacteria) may represent the preBchla biosynthetic proteins BchB (a subunit of light-independent dominant AAPs in the upper ocean8,20. Surprisingly, we were not
protochlorophyllide reductase) and BchH (magnesium chelatase) able to retrieve photosynthetic operon genes belonging to this group
Slides for UC relationships, BchB and in any Jonathan Eisen Winter 2014
were determined18 (Fig. 3). Similar to pufMDavis EVE161 Course Taught byof the samples analysed. Furthermore, very few sequences
36. om the HOT station were
mples. Most proteorhodopce waters belonged to the
he amino-acid level) to the
. In contrast, most of the
within the Antarctic clade
om the HOT station, HOT
E. coli and their absorption
m their primary sequences,
th the Antarctic clade gave
whereas the proteorhodope gave a green (527-nm)
al North Paci®c gyre, most
with maximal intensity near
s maintained over depth,
he surface, the energy peak
tween 400 and 650 nm. In
ows and the half bandwidth
m (Fig. 5). Considering the
mbers of the two different
0.06
0.05
Absorbance
Antarctica (E. F. DeLong,
in primers, and sequenced
ary sequence analyses based
ate that, despite differences
on spectra, the Antarctic
cterium highly related to
bacteria of Monterey Bay10
advantage at different points along the depth-dependent light
0.04
HOT 75 m
Antarctica
0.03
HOT 0 m
0.02
Monterey
Bay
0.01
1.0
Relative irradiance
opsins. Furthermore, palE6
to its Monterey Bay
d transport of protons in
udich et al., manuscript in
5m
0.8
0.6
75 m
0.4
0.2
0
350
400
450
500
550
600
650
Wavelength (nm)
Figure 5 Absorption spectra of retinal-reconstituted proteorhodopsins in E. coli
membranes. All-trans retinal (2.5 mM) was added to membrane suspensions in 100 mM
phosphate buffer, pH 7.0, and absorption spectra were recorded. Top, four spectra for
palE6 (Antarctica), HOT 75m4, HOT 0m1, and BAC 31A8 (Monterey Bay) at 1 h after
retinal addition. Bottom, downwelling irradiance from HOT station measured at six
wavelengths (412, 443, 490, 510, 555 and 665 nm) and at two depths, for the same
depths and date that the HOT samples were collected (0 and 75 m). Irradiance is plotted
relative to irradiance at 490 nm.
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
37. articles
Community structure and metabolism
through reconstruction of microbial
genomes from the environment
Gene W. Tyson1, Jarrod Chapman3,4, Philip Hugenholtz1, Eric E. Allen1, Rachna J. Ram1, Paul M. Richardson4, Victor V. Solovyev4,
Edward M. Rubin4, Daniel S. Rokhsar3,4 & Jillian F. Banfield1,2
1
Department of Environmental Science, Policy and Management, 2Department of Earth and Planetary Sciences, and 3Department of Physics, University of California,
Berkeley, California 94720, USA
4
Joint Genome Institute, Walnut Creek, California 94598, USA
...........................................................................................................................................................................................................................
RESEARCH ARTICLE
Microbial communities are vital in the functioning of all ecosystems; however, most microorganisms are uncultivated, and their
roles in natural systems are unclear. Here, using random shotgun sequencing of DNA from a natural acidophilic biofilm, we report
reconstruction of near-complete genomes of Leptospirillum group II and Ferroplasma type II, and partial recovery of three other
genomes. This was possible because the biofilm was dominated by a small number of species populations and the frequency of
genomic rearrangements and gene insertions or deletions was relatively low. Because each sequence read came from a different
individual, we could determine that single-nucleotide polymorphisms are the predominant form of heterogeneity at the strain level.
The Leptospirillum group II genome had remarkably few nucleotide polymorphisms, despite the existence of low-abundance
variants. The Ferroplasma type II genome seems to be a composite from three ancestral strains that have undergone homologous
recombination to form a large population of mosaic genomes. Analysis of the gene complement for each organism revealed the
pathways for carbon and nitrogen fixation and energy generation, and provided insights into survival strategies in an extreme
environment.
1
1
3
Environmental Genome Shotgun
Sequencing of the Sargasso Sea
J. Craig Venter, * Karin Remington, John F. Heidelberg,
Aaron L. Halpern,2 Doug Rusch,2 Jonathan A. Eisen,3
The study of microbial evolution and ecology has been revolutio- fluorescence in situ hybridization (FISH) revealed that all biofilms
Dongying Wu,3 Ian Paulsen,3 Karen E. Nelson,3 William Nelson,3
nized by DNA sequencing and analysis1–3. However, isolates have contained mixtures of bacteria (Leptospirillum, Sulfobacillus and, in
Derrick few cases, 3 Samuel Levy,2 archaea (Ferroplasma 6
been the main source of sequence data, and only a small fraction of aE. Fouts, Acidimicrobium) andAnthony H. Knap,and other
4–6
has members of the Thermoplasmatales). The genome of one
microorganisms have been cultivated . Consequently, focusMichael W. Lomas,6 Ken Nealson,5 Owen White,3 of these
shifted towards the analysis of uncultivated microorganisms via archaea, Ferroplasma acidarmanus 1 Rachel Parsons,6
Jeremy Peterson,3 Jeff Hoffman, fer1, isolated from the Richmond
cloning of conserved genes5 and genome fragments directly from mine, has been sequenced previously (http://www.jgi.doe.gov/JGI_
4
Holly Baden-Tillson,1 Cynthia Pfannkoch,1 Yu-Hui
the environment7–9. To date, only a small fraction of genes have been microbial/html/ferroplasma/ferro_homepage.html).Rogers,
Slides for UC Davis EVE161 Course biofilm (Fig.Jonathan 1 of AMD communities was
Hamilton 1a) typical
recovered from individual environments, limiting the analysis of
A pink Taught by O. Smith Eisen Winter 2014
chlorococcus, tha
photosynthetic bio
Surface water
were collected ab
from three sites o
February 2003. A
lected aboard the S
station S” in May
are indicated on F
S1; sampling prot
one expedition to
was extracted from
genomic libraries w
2 to 6 kb were m
!37
prepared plasmid
40. genes needed to fix carbon by means of the Calvin–Benson–
Bassham cycle (using type II ribulose 1,5-bisphosphate carboxylase–oxygenase). All genomes recovered from the AMD system
fixation via the reductive acetyl coenzyme A (acetyl-CoA) pathway
by some, or all, organisms. Given the large number of ABC-type
sugar and amino acid transporters encoded in the Ferroplasma type
Figure 4 Cell metabolic cartoons constructed from the annotation of 2,180 ORFs
identified in the Leptospirillum group II genome (63% with putative assigned function) and
1,931 ORFs in the Ferroplasma type II genome (58% with assigned function). The cell
drainage stream (viewed in cross-section). Tight coupling between ferrous iron oxidation,
Slides for UC Davis EVE161 Course pyrite dissolution and acid generation is indicated. Rubisco, ribulose 1,5-bisphosphate
Taught by Jonathan Eisen Winter 2014
carboxylase–oxygenase. THF, tetrahydrofolate.
!40
41. RESEARCH ARTICLE
Environmental Genome Shotgun
Sequencing of the Sargasso Sea
J. Craig Venter,1* Karin Remington,1 John F. Heidelberg,3
Aaron L. Halpern,2 Doug Rusch,2 Jonathan A. Eisen,3
Dongying Wu,3 Ian Paulsen,3 Karen E. Nelson,3 William Nelson,3
Derrick E. Fouts,3 Samuel Levy,2 Anthony H. Knap,6
Michael W. Lomas,6 Ken Nealson,5 Owen White,3
Jeremy Peterson,3 Jeff Hoffman,1 Rachel Parsons,6
Holly Baden-Tillson,1 Cynthia Pfannkoch,1 Yu-Hui Rogers,4
Hamilton O. Smith1
We have applied “whole-genome shotgun sequencing” to microbial populations
collected en masse on tangential flow and impact filters from seawater samples
collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs
http://www.sciencemag.org/content/304/5667/66
of nonredundant sequence was generated, annotated, and analyzed to elucidate
the gene content, diversity, and relative abundance of the organisms within
these environmental samples. These data are estimated to derive from at least
1800 genomic species based on sequence relatedness, including 148 previously
unknown bacterial phylotypes. We have identified over 1.2 million previously
unknown genes represented in these samples,by Jonathanmore than 782 new
Slides for UC Davis EVE161 Course Taught including Eisen Winter 2014
chlorococcus, th
photosynthetic bi
Surface wate
were collected a
from three sites
February 2003. A
lected aboard the
station S” in Ma
are indicated on
S1; sampling pro
one expedition to
was extracted fro
genomic libraries
2 to 6 kb were
prepared plasmid
both ends to prov
Craig Venter Sc
nology Center on
ers (Applied Bi
Whole-genome ra
the Weatherbird II
4) produced 1.66
in length, for a tota
microbial DNA se
sequences were g
!41
42. two groups of scaffolds representing two disSargasso Sea related to the published
tinct strains closely
at depths ranging from 4ϫ to 36ϫ (indicated
with shading in table S3 with nine depicted in
Fig. 1. MODIS-Aqua satellite image of
ocean chlorophyll in the Sargasso Sea grid
about the BATS site from 22 February
2003. The station locations are overlain
with their respective identifications. Note
the elevated levels of chlorophyll (green
color shades) around station 3, which are
not present around stations 11 and 13.
http://www.sciencemag.org/content/304/5667/66
Fig. 2. Gene conserSlides
vation among closely for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!42
43. sequence when they would otherwise be labeled
as repetitive. We evaluated our final assembly
Table 1. Gene count results in a tiered fashion, looking at well-sampled
breakdown by TIGR role
genomic those found on ascategory. Gene set includes regions separately from those barely
sampled at our current level of sequencing.
semblies from samples 1 to 41.66 million sequencesreads the
The and fragment
from
from samples 5 to 7. AWeatherbird II samples (table S1; samples 1 to
more detailed table, sep4; stations 3, 11, and 13), were pooled and
arating Weatherbird II samplesto providethe Sorcerer II
assembled from a single master assembly
samples is presented in for comparative purposes. The assembly generthe SOM (table S4). Note
ated 64,398 scaffolds ranging size from 826
that there are 28,023 genes which were 256inMbp of unique
bp to 2.1 Mbp, containing classified
sequence and
in more than one role category.spanning 400 Mbp. After assembly, there remained 217,015 paired-end reads,
or “mini-scaffolds,” spanning 820.7 Mbp as
well as an additional 215,038 unassembled sinTotal
gleton
TIGR role category reads covering 169.9 Mbp (table S2,
genes
column 1). The Sorcerer II samples provided
almost no assembly, so we consider for these
samples only the 153,458 mini-scaffolds, spanAmino acid biosynthesis 518.4 Mbp, and the remaining 18,692
37,118
ning
singleton reads (table S2, column 2). In total,
Biosynthesis of cofactors,
25,905
1.045 Gbp of nonredundant sequence was genprosthetic groups, and carriers of overlapping reads within the
erated. The lack
unassembled set indicates that lack of additionCell envelope
27,883
al assembly was not due to algorithmic limitaCellular processes
17,260
tions but to the relatively limited depth of sequencing coverage
Central intermediary metabolism given the level of diversity
13,639
within the sample.
DNA metabolism
25,346
The whole-genome shotgun (WGS) assembly
has been deposited at DDBJ/EMBL/GenBank
Energy metabolism
69,718
under
Fatty acid and phospholipidthe project accession AACY00000000,
18,558
and all traces have been deposited in a corresponding TraceDB trace archive. The version
metabolism
described
paper is the first version,
Mobile and extrachromosomalin this Unlike a conventional WGS
1,061
AACY01000000.
element functions entry, we have deposited not just contigs and
scaffolds but the unassembled paired singletons
Protein fate
28,768
and individual singletons in order to accurateProtein synthesis
48,012
ly reflect the diversity in the sample and
allow searches
Purines, pyrimidines, nucleosides,across the entire sample with19,912
in a single database.
and nucleotides
Genomes and large assemblies. Our
analysis first focused on the well-sampled geRegulatory functions nomes by characterizing scaffolds8,392least
with at
Signal transduction
4,817
3ϫ coverage depth. There were 333 scaffolds
comprising 2226 contigs and 12,75630.9
spanning
Transcription
Mbp that met this criterion (table S3), accountTransport and binding proteins 410,000 reads, 49,185 the
ing for roughly
or 25% of
pooled assembly data set. From 38,067
this set of wellUnknown function
sampled material, we were able to cluster and
Miscellaneous
1,864
classify assemblies by organism; from the rare
sequence similarConserved hypotheticalspecies in our sample, we used794,061
ity based methods together with computational
gene finding to obtain both qualitative and quantitative estimates
and functional diverTotal number of roles assigned of genomic1,242,230
sity within this particular marine environment.
We employed several criteria to sort the
Total number of genes major assembly pieces into tentative organism
1,214,207
“bins”; these include depth of coverage, oligo-
We closely examined the multiple sequence
alignments of the contigs with high SNP rates
eate scaffold borders.
frames (5). A total ofassembly to identify a setgenes deeply as69,901 novel of large, besembling nonrepetitive contigs. This was used to
longing to 15,601 single the expected coveragewere iden- (to
set link clusters in unique regions
tified. The predicted 23ϫ) for adeep contigs categorized algenes final run of to be treated asThis
were the assembler. unique
lowed the
nucleotide frequencies (7), and similarity to
previously sequenced genomes (5). With these
techniques, the majority of sequence assigned
to the most abundant species (16.5 Mbp of the
30.9 Mb in the main scaffolds) could be separated based on several corroborating indicators.
In particular, we identified a distinct group of
scaffolds representing an abundant population
clearly related to Burkholderia (fig. S2) and
two groups of scaffolds representing two distinct strains closely related to the published
Shewanella oneidensis genometo (8) (fig. S3). two fairly
and were able classify these into
distinct classes: regions where several
There is a group of scaffolds assembling at over closely
related
been collapsed, in6ϫ coverage that appears haplotypes havecoverage accordingly
to represent the gecreasing the depth of
nome of a SAR86 (10), and regions Scaffold tosets relatively
(table S3). that appear be a
homogenous blend of discrepancies
representing a conglomerate of Prochlorococ- from the
consensus without any apparent separation into
cus strains (Fig. 2), haplotypes, suchan the Prochlorococcus scafas well as as uncultured
fold region (Fig. 5). Indeed, the Prochlorococmarine archaeon, were also identified (table S3;
cus scaffolds display considerable heterogeneFig. 3). Additionally, 10 not only at mega plasmids
putative the nucleotide sequence level
ity
were found in the main 5) but also atset, genomic level, where
(Fig. scaffold the covered
multiple scaffolds align with the same region of
at depths ranging from 4ϫ to 36ϫ (indicated
the MED4 (11) genome but differ due to gene
with shading in tableor genomic island insertion, deletion, rearrangeS3 with nine depicted in
ment events. This observation is consistent with
previous findings (12). For instance, scaffolds
2221918 and 2223700 share gene synteny with
Fig. 1. MODIS-Aqua satellite image of
each
MED4 but
the
ocean chlorophyll in other andof probableSea grid insertion
the Sargasso differ byorigin, likely
of 15 genes
phage
about the BATSrepresenting an integrated bacteriophage. These
site from 22 February
2003. The station locations are overlaingraphically
genomic differences are displayed
in Fig. 2, where it is evident that
with their respective identifications. Noteup to four
conflicting scaffolds can (green
the elevated levels of chlorophyll align with the same
region of the MED4 genome.
color shades) around Prochlorococcus MED4More than 85%
station 3, which genome can be
are
of the
not present around stations 11 and 13.
aligned with Sargasso Sea scaffolds greater
Fig. 4. Circular diagrams of nine complete megaplasmids. Genes encoded in the forward direction
are shown in the outer concentric circle; reverse coding genes are shown in the inner concentric
circle. The genes have been given role category assignment and colored accordingly: amino acid
biosynthesis, violet; biosynthesis of cofactors, prosthetic groups, and carriers, light blue; cell
envelope, light green; cellular processes, red; central intermediary metabolism, brown; DNA
metabolism, gold; energy metabolism, light gray; fatty acid and phospholipid metabolism, magenta;
protein fate and protein synthesis, pink; purines, pyrimidines, nucleosides, and nucleotides, orange;
regulatory functions and signal transduction, olive; transcription, dark green; transport and binding
proteins, blue-green; genes with no known homology to other proteins and genes with homology
to genes with no known function, white; genes of unknown function, gray; Tick marks are placed
on 10-kb intervals.
68
than 10 kb; however, there appear to be a
couple of regions of MED4 that are not represented in the 10-kb scaffolds (Fig. 2). The
larger of these two regions (PMM1187 to
PMM1277) consists primarily of a gene cluster
coding for surface polysaccharide biosynthesis,
which may represent a MED4-specific polysaccharide absent or highly diverged in our Sargasso Sea Prochlorococcus bacteria. The heterogeneity of the Prochlorococcus scaffolds suggest
that the scaffolds are not derived from a single
discrete strain, but instead probably represent a
conglomerate assembled from a population of
closely related Prochlorococcus biotypes.
The gene complement of the Sargasso.
The heterogeneity of the Sargasso sequences
complicates the identification of microbial
genes. The typical approach for microbial annotation, model-based gene finding, relies entirely on training with a subset of manually
Fig. 2. Gene conser2 APRIL 2004 VOL 304 SCIENCE www.sciencemag.org
vation among closely
related Prochlorococcus. The outermost
concentric circle of
the diagram depicts
the competed genomic sequence of Prochlorococcus marinus
MED4 (11). Fragments
from environmental
sequencing were compared to this completed Prochlorococcus genome and are shown in
the inner concentric
circles and were given
boxed outlines. Genes
for the outermost circle have been assigned psuedospectrum colors based on
the position of those
genes along the chromosome, where genes
nearer to the start of
the genome are colored in red, and genes
nearer to the end of the genome are colored in blue. Fragments from environmental sequencing
were subjected to an analysis that identifies conserved gene order between those fragments and
the completed Prochlorococcus MED4 genome. Genes on the environmental genome segments
that exhibited conserved gene order are colored with the same color assignments as the
Prochlorococcus MED4 chromosome. Colored regions on the environmental segments exhibiting
color differences from the adjacent outermost concentric circle are the result of conserved gene
order with other MED4 regions and probably represent chromosomal rearrangements. Genes that
did not exhibit conserved gene order are colored in black.
Fig. 5. Prochlorococcus-related scaffold 2223290 illustrates the assembly of a broad community of closely related organisms, distinctly nonpunctate in nature. The image represents (A)
global structure of Scaffold 2223290 with respect to assembly and (B) a sample of the multiple
sequence alignment. Blue segments, contigs; green segments, fragments; and yellow segments,
stages of the assembly of fragments into the resulting contigs. The yellow bars indicate that
fragments were initially assembled in several different pieces, which in places collapsed to
form the final contig structure. The multiple sequence alignment for this region shows a
homogenous blend of haplotypes, none with sufficient depth of coverage to provide a
separate assembly.
http://www.sciencemag.org/content/304/5667/66
www.sciencemag.org SCIENCE VOL 304 2 APRIL 2004
www.sciencemag.org SCIENCE VOL 304 2 APRIL 2004
67
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
69
!43
44. rRNA phylotyping from metagenomics
http://www.sciencemag.org/
content/304/5667/66
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!44
45. Shotgun Sequencing Allows Alternative Anchors (e.g., RecA)
http://www.sciencemag.org/
content/304/5667/66
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!45
46. nomic group using the phylogenetic analysis
described for rRNA. For example, our data set
marker genes, is roughly comparable to the
97% cutoff traditionally used for rRNA. Thus
http://www.sciencemag.org/
content/304/5667/66
Fig. 6. Phylogenetic diversity of Sargasso Sea sequences using multiple phylogenetic markers. The
relative contribution of organisms from different major phylogenetic groups (phylotypes) was
measured using multiple phylogenetic markers that have been used previously in phylogenetic
studies of prokaryotes: 16S rRNA, RecA, EF-Tu, EF-G, HSP70, and RNA polymerase B (RpoB). The
relative proportion of different phylotypes for each sequence (weighted by the depth of coverage
of the contigs from which those sequences came) is shown. The phylotype distribution was
determined as follows: (i) Sequences in the Sargasso data set corresponding to each of these genes
were identified using HMM and BLAST searches. (ii) Phylogenetic analysis was performed for each
phylogenetic marker identified in the Sargasso data separately compared with all members of that
gene family in all complete genome sequences (only complete genomes were used to control for
the differential sampling of these markers in GenBank). (iii) The phylogenetic affinity of each
sequence was assigned based on the classification of the nearest neighbor in the phylogenetic tree.
Slides for UC Davis
RIL 2004 VOL 304 SCIENCE www.sciencemag.org EVE161 Course Taught by Jonathan Eisen Winter 2014
!46
47. Functional Inference from Metagenomics
• Can work well for individual genes
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!47
48. Functional Diversity of Proteorhodopsins?
http://www.sciencemag.org/
content/304/5667/66
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
!48
49. ARTICLE
Received 3 Nov 2010 | Accepted 11 Jan 2011 | Published 8 Feb 2011
DOI: 10.1038/ncomms1188
A bacterial proteorhodopsin proton pump
in marine eukaryotes
Claudio H. Slamovits1,†, Noriko Okamoto1, Lena Burri1,†, Erick R. James1 & Patrick J. Keeling1
Proteorhodopsins are light-driven proton pumps involved in widespread phototrophy.
Discovered in marine proteobacteria just 10 years ago, proteorhodopsins are now known
http://www.nature.com/ncomms/journal/v2/n2/abs/ncomms1188.html
to have been spread by lateral gene transfer across diverse prokaryotes, but are curiously
absent from eukaryotes. In this study, we show that proteorhodopsins have been acquired
by horizontal gene transfer from bacteria at least twice independently in dinoflagellate
protists. We find that in the marine predator Oxyrrhis marina, proteorhodopsin is indeed the
most abundantly expressed nuclear gene and its product localizes to discrete cytoplasmic
structures suggestive of the endomembrane system. To date, photosystems I and II have
been the only known mechanism for transducing solar energy in eukaryotes; however, it now
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
appears that some abundant zooplankton use this alternative pathway to harness light to
52. ARTICLE
NATURE COMMUNICATIONS | DOI: 10.1038/ncomms1188
DIC
Rhodopsin
MitoTracker
DAPI
Merge
260
60
40
30
20
15
Figure 3 | Cellular localization of proteorhodopsin in O. marina cells. (a) Western blot of total O. marina protein probed with an antibody raised against
the C-terminal peptide of the proteorhodopsin OM27 from O. marina. Expected protein size is 28 kDa. (b) Localization of proteorhodopsin in O. marina
cells using immunofluorescence assay with the same antibody. Antibody signal forms small irregular and ring-like structures independent of mitochondria
in O. marina. Three independent cells are shown, each showing (left to right) differential interference contrast (DIC) light micrograph, anti-OM27
proteorhodopsin, MitoTracker staining, Hoechst 33258 staining for DNA and a merge of all four. White bar = 10 m. See Supplementary Movies 1–4 for a
360° video rendering. DAPI, 4,6-diamidino-2-phenylindole.
Rhodopsin function requires the prosthetic group retinal, sug- a distinctive transit peptide19. There is no evidence of such a pepSlides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
gesting that retinal biosynthesis should also be present in O. marina. tide on any O. marina rhodopsin gene, arguing against a location
53. Primer
1
The Light-Driven Proton Pump Proteorhodopsin
Enhances Bacterial Survival during Tough Times
´ `
Edward F. DeLong1,2*, Oded Beja3*
1 Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America, 2 Department of
Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America, 3 Faculty of Biology, Technion - Israel Institute of
Technology, Technion City, Haifa, Israel
http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1000359#pbio-1000359-g002
Some microorganisms contain proteins that can interact with
Cultivation-independent genomic surveys (e.g., ‘‘metagelight and convert it into energy for growth and survival, or into
nomics’’) revealed proteorhodopsin presence and diversity, and
sensory information that guides cells towards or away from light.
heterologous expression in E. coli demonstrated many of its
The simplest energy-harvesting photoproteins are the rhodopsins,
functional properties. Access to cultivable marine bacteria that
which consist of a single, membrane-embedded protein covalently
contain proteorhodopsin, however, would be very useful to further
bound to the chromophore retinal (a light-sensitive pigment) [1].
characterize its native function in diverse physiological contexts.
One class of archaeal photoproteins (called bacteriorhodopsin) was
Whole-genome sequencing then came to the rescue. The
shown to function as a light-driven proton pump, generating
Gordon and Betty Moore Foundation (GBMF) Microbial Genome
biochemical energy from light [2,3]. For many years, these lightSequencing Project (http://www.moore.org/microgenome) unexdriven proton pumps were thought to be found only in relatively
pectedly revealed that many culturable marine bacteria submitted
obscure Archaea living in high salinity.
for sequencing (including Pelagibacter spp., Vibrio spp., and
Ten years ago, a new type of microbial rhodopsin (proteorhoFlavobacteria isolates) in fact possessed proteorhodopsin genes
dopsin) was discovered in marine planktonic bacterial assemblages
(Table 1). What can these proteorhodopsin-containing isolates tell
[4]. This proteorhodopsin was co-localized on a large genome
us? Experiments with the proteorhodopsin-containing isolate
fragment containing the small subunit ribosomal RNA gene,
‘Cand. P. ubique’ (a member of the SAR11 bacterial lineage, the
which identified its genomic source—an uncultured gammapromost abundant bacterial group in the ocean [19,20]) showed no
teobacterium (of the SAR86 bacterial Davis EVE161 Course Taught by Jonathan Eisen Winter 2014 rate or yield [11]. Later,
significant light enhancement of growth
Slides for UC lineage). Further work
54. Table 1. Marine bacterial isolates and genome fragments containing proteorhodopsins.
Table 1. Cont.
Organism
Strain
General Group
Reference
General Group
Reference
HTCC2181
Betaproteobacteria
GBMF
HOT2C01
unknown
[8]
Rhodobacterales sp.
HTCC2255
Alphaproteobacteria
GBMF
EBAC31A08
Gammaproteobacteria
[4]
Vibrio angustum
S14
Gammaproteobacteria
GBMF
ANT32C12
unknown
[8]
Photobacterium
SKA34
Gammaproteobacteria
GBMF
HF70_39H11_ArchHighGC
unknown
[12]
Vibrio harveyi
ATCC BAA-1116
Gammaproteobacteria
GenBank # CP000789
HF10_3D09_mediumGC
unknown
[12]
Marine gamma
HTCC2143
Gammaproteobacteria
GBMF
HF70_19B12_highGC
unknown
[12]
Marine gamma
HTCC2207
Gammaproteobacteria
GBMF
HF70_59C08
unknown
[12]
Cand. P. ubique
HTCC1002
Alphaproteobacteria
GBMF
Cand. P. ubique
HTCC1062
Alphaproteobacteria
[26]
Rhodospirillales
BAL199
Alphaproteobacteria
GBMF
Marinobacter
ELB17
Gammaproteobacteria
GBMF
Vibrio campbelli
AND4
Gammaproteobacteria
GBMF
Vibrio angustum
S14
Gammaproteobacteria
GBMF
Dokdonia donghaensis
MED134
Flavobacteria
GBMF
Polaribacter dokdonensis
MED152
Flavobacteria
GBMF
Psychroflexus
ATCC700755
Flavobacteria
GBMF
Polaribacter irgensii
23-P
Flavobacteria
GBMF
Flavobacteria bacterium
BAL38
Flavobacteria
GBMF
HF10_05C07
Proteobacteria
[24]
HF10_45G01
Proteobacteria
[24]
HF130_81H07
Gammaproteobacteria
[24]
EB0_39F01
Alphaproteobacteria
[24]
EB0_39H12
Proteobacteria
[24]
EB80_69G07
Alphaproteobacteria
[24]
EB80_02D08
Gammaproteobacteria
[24]
EB0_35D03
Proteobacteria
[24]
EB0_49D07
Proteobacteria
[24]
Organism
Genomes
Methylophilales
BACs and fosmids
EBO_50A10
Gammaproteobacteria
[24]
EB0_55B11f
Alphaproteobacteria
[24]
EBO_41B09
Betaproteobacteria
[24]
HF10_19P19
Proteobacteria
[17]
HF10_25F10
Proteobacteria
[17]
HF10_49E08
Planctomycetes
[24]
HF10_12C08
Alphaproteobacteria
Euryarchaea
unknown
unknown
unknown
[10]
MED42A11
unknown
[10]
MED46A06
unknown
[10]
MED49C08
unknown
[10]
MED66A03
unknown
[10]
MED82F10
unknown
[10]
MED86H08
unknown
[10]
RED17H08
unknown
[10]
RED22E04
unknown
Gammaproteobacteria
[25]
EBAC20E09
Gammaproteobacteria
[25]
NFirst proteorhodopsin gene found in uncultured
SAR86 using metagenomics; proteorhodopsin
light-driven proton pump activity confirmed in
heterologous E. coli cells [4].
NProteorhodopsin presence confirmed directly in
the ocean using laser flash photolysis [5].
NProteorhodopsin genes also found in other
bacterial groups [8].
NEnormous diversity of proteorhodopsin genes
found in the Sargasso Sea using metagenomics [9].
NRetinal biosynthesis pathways found in metagenomic data and confirmed using E. coli cells [10].
NProteorhodopsin genes are found in ‘Canditatus
Pelagibacter ubique’ (SAR11), the most abundant
bacterium on earth; environmental SAR11 proteorhodopsin presence confirmed using metaproteomics [11].
NProteorhodopsin genes found in uncultured
marine Archaea [12].
NFirst indication of proteorhodopsin light-dependent growth in cultured Flavobacteria [13] (see
Figure 1 for colony morphologies and pigmentation).
NProteorhodopsin genes found in non-marine
environments [14,15].
NProteorhodopsin phototrophy directly confirmed
using a genetic system in marine Vibrio sp. [16]
[10]
eBACHOT4E07
2000
[10]
MED35C06
Box 1. A Decade of Proteorhodopsin
Milestones
[10]
MED18B02
(xanthorhodopsin) was discovered in the salt-loving bacterium
Salinibacter ruber [28]. Xanthorhodopsin is a proton-pumping
retinal protein/carotenoid complex in which the carotenoid
strategy on marine bacterioplankton could be substantial given the
‘‘feast or famine’’ existence experienced by many of these
microbes. The Vibrio/proteorhodopsin model system is likely to
reveal further secrets on the nature and function of proteorhodopsin photosytems in bacteria that are usually (but erroneously)
considered as strict heterotrophs not capable of utilizing light at all.
While this study [16] adds an important new result, it certainly
does not solve the whole puzzle of proteorhodopsin photophysiology. Considering the staggering variety of genetic, physiological
and environmental contexts in which proteorhodopsin and related
photoproteins are found, a great variety of light-dependent
adaptive strategies are likely to occur in the natural microbial
world. For example, in 2005, a new type of bacterial rhodopsin
[24]
MED13K09
Marine microbial isolates and large genome fragments from the environment GBMF, microbial genomes sequenced as part of the Gordon and Betty Moore Foundation
microbial genome sequencing project (http://www.moore.org/microgenome), found to encode proteorhodopsin genes. The list includes whole genome sequences
from a wide array of cultivated marine microorganisms (Genomes), as well as cloned large DNA fragments (BACs and fosmids) recovered directly from the environment.
doi:10.1371/journal.pbio.1000359.t001
[24]
HF10_29C11
Strain
PLoS Biology | www.plosbiology.org
2001
2003
2004
2005
2006
2007
2008
2010
PLoS Biology | www.plosbiology.org
Figure 1. Various colony morphologies and coloration of
different proteorhodopsin-containing bacteria used to study
proteorhodopsin phototrophy. From top to bottom, the flavobacterium Polaribacter dokdonensis strain MED152 used to show proteorhodopsin light stimulated growth [13]; the flavobacterium Dokdonia
donghaensis strain MED134 used to show proteorhodopsin light
stimulated CO2-fixation [23]; and Vibrio strain AND4 used to show
proteorhodopsin phototrophy [16]; note the lack of detectable
pigments in Vibrio strain AND4. However, when these vibrio cells are
pelleted, they do show a pale reddish color, which is the result of
proteorhodopsin pigments presence in their membranes. Photos are
courtesy of Jarone Pinhassi.
doi:10.1371/journal.pbio.1000359.g001
3
2
April 2010 | Volume 8 Issue 4 | e1000359
Slides for UC Davis EVE161| Course Taught by Jonathan Eisen Winter 2014
April 2010 | Volume 8 | Issue 4 | e1000359
55. Box 1. A Decade of Proteorhodopsin
Milestones
2000
2001
2003
2004
2005
2006
2007
2008
2010
NFirst proteorhodopsin gene found in uncultured
SAR86 using metagenomics; proteorhodopsin
light-driven proton pump activity confirmed in
heterologous E. coli cells [4].
NProteorhodopsin presence confirmed directly in
the ocean using laser flash photolysis [5].
NProteorhodopsin genes also found in other
bacterial groups [8].
NEnormous diversity of proteorhodopsin genes
found in the Sargasso Sea using metagenomics [9].
NRetinal biosynthesis pathways found in metagenomic data and confirmed using E. coli cells [10].
NProteorhodopsin genes are found in ‘Canditatus
Pelagibacter ubique’ (SAR11), the most abundant
bacterium on earth; environmental SAR11 proteorhodopsin presence confirmed using metaproteomics [11].
NProteorhodopsin genes found in uncultured
marine Archaea [12].
NFirst indication of proteorhodopsin light-dependent growth in cultured Flavobacteria [13] (see
Figure 1 for colony morphologies and pigmentation).
NProteorhodopsin genes found in non-marine
environments [14,15].
NProteorhodopsin phototrophy directly confirmed
using a genetic system in marine Vibrio sp. [16]
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
Figure 1. Various col
different proteorhodo
proteorhodopsin phot
terium Polaribacter dokdo
hodopsin light stimulated
donghaensis strain MED
stimulated CO2-fixation [
proteorhodopsin photot
pigments in Vibrio strain
pelleted, they do show a
proteorhodopsin pigmen
courtesy of Jarone Pinhas
doi:10.1371/journal.pbio.1