Diversity Diversity Diversity
SSE 2015
Jonathan A. Eisen
@phylogenomics
University of California, Davis
Overwhelming Diversity of Microbes
Functional Diversity
Diversity of Form
Phylogenetic Diversity
MICROBES
RUN THE
PLANET
Great Plate Count Anomaly
Culturing count <<<< Observation count
DNA (image source: http://www.biol.unt.edu/~jajohnson/DNA_sequencing_process)
Embracing Diversity 1: rRNA
Phylotyping via rRNA PCR: One Taxon
DNA reads from the sample:
ACTGCACCTATCGTTCG
ACTGCACCTATCGTTCG
ACTGCACCTATCGTTCG
Taxa  Characters
B1    ACTGCACCTATCGTTCG
B2    ACTCCACCTATCGTTCG
E1    ACTCCAGCTATCGATCG
E2    ACTCCAGGTATCGATCG
A1    ACCCCAGCTCTCGCTCG
A2    ACCCCAGCTCTGGCTCG
New1  ACTGCACCTATCGTTCG
Tree branches: Eukaryotes, Bacteria, Archaea
Many sequences from one sample all point to the same branch on the tree.
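The placement idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the nearest reference by mismatch count as a stand-in for true tree placement, and it assumes the first letter of each reference label (B, E, A) names the domain. The function names are hypothetical.

```python
# Reference sequences copied from the slide's "Taxa / Characters" table.
REFERENCES = {
    "B1": "ACTGCACCTATCGTTCG",
    "B2": "ACTCCACCTATCGTTCG",
    "E1": "ACTCCAGCTATCGATCG",
    "E2": "ACTCCAGGTATCGATCG",
    "A1": "ACCCCAGCTCTCGCTCG",
    "A2": "ACCCCAGCTCTGGCTCG",
}
# Assumption: the first letter of each label names the domain.
DOMAINS = {"B": "Bacteria", "E": "Eukaryotes", "A": "Archaea"}


def hamming(a, b):
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_reference(read):
    """Return (label, domain, mismatches) for the nearest reference."""
    label = min(REFERENCES, key=lambda name: hamming(read, REFERENCES[name]))
    return label, DOMAINS[label[0]], hamming(read, REFERENCES[label])


# The three identical reads shown on the slide.
reads = ["ACTGCACCTATCGTTCG"] * 3
placements = [closest_reference(read) for read in reads]
print(placements)
```

All three reads land on the same bacterial reference, which is the point of the slide. A real analysis would align the reads and place them on a phylogenetic tree rather than count mismatches.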
Phylotyping via rRNA PCR: Two Taxa
DNA reads from the sample:
ACTGCACCTATCGTTCG
ACTGCACCTATCGTTCG
ACCCCAGCTCTCGCTCG
Taxa  Characters
B1    ACTGCACCTATCGTTCG
B2    ACTCCACCTATCGTTCG
E1    ACTCCAGCTATCGATCG
E2    ACTCCAGGTATCGATCG
A1    ACCCCAGCTCTCGCTCG
A2    ACCCCAGCTCTGGCTCG
New1  ACCCCAGCTCTGCCTCG
New2  ACTGCACCTATCGTTCG
Tree branches: Eukaryotes, Bacteria, Archaea
One can estimate cell counts from the number of times each sequence is seen.
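Counting how often each sequence is seen, as the Two Taxa slide suggests, can be sketched as follows. The reads are the three shown on the slide; the function name is hypothetical, and the result is a relative abundance of reads, which is only a rough proxy for cell counts.

```python
from collections import Counter

# The three reads shown on the Two Taxa slide.
reads = [
    "ACTGCACCTATCGTTCG",
    "ACTGCACCTATCGTTCG",
    "ACCCCAGCTCTCGCTCG",
]


def relative_abundance(reads):
    """Return the fraction of reads accounted for by each distinct sequence."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


abundance = relative_abundance(reads)
for seq, fraction in abundance.items():
    print(seq, round(fraction, 2))
```

Here the first sequence accounts for two thirds of the reads and the second for one third.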
Phylotyping via rRNA PCR: Four Taxa
DNA reads from the sample:
ACTGCACCTATCGTTCG
ACTCCAGCTATCGATCG
ACCCCAGCTCTCGCTCG
AGGGGAGCTCTCGCTCG
AGGGGAGCTCTCGCTCG
ACTGCACCTATCGTTCG
Taxa  Characters
B1    ACTGCACCTATCGTTCG
B2    ACTCCACCTATCGTTCG
E1    ACTCCAGCTATCGATCG
E2    ACTCCAGGTATCGATCG
A1    ACCCCAGCTCTCGCTCG
A2    ACCCCAGCTCTGGCTCG
New1  ACCCCAGCTCTGCCTCG
New2  AGGGGAGCTCTGCCTCG
New3  ACTCCAGCTATCGATCG
New4  ACTGCACCTATCGTTCG
Tree branches: Eukaryotes, Bacteria, Archaea
Even with more taxa it still works.
rRNA PCR: Community Comparisons
DNA DNA DNA
DNA reads:
ACTGCACCTATCGTTCG
ACTCCAGCTATCGATCG
ACCCCAGCTCTCGCTCG
Taxa  Characters
B1    ACTGCACCTATCGTTCG
B2    ACTCCACCTATCGTTCG
E1    ACTCCAGCTATCGATCG
E2    ACTCCAGGTATCGATCG
A1    ACCCCAGCTCTCGCTCG
A2    ACCCCAGCTCTGGCTCG
New1  ACCCCAGCTCTGCCTCG
New2  ACGGCAGCTCTGCCTCG
Tree branches: Eukaryotes, Bacteria, Archaea
Chemosymbiont rRNA Phylotyping
Eisen et al. 1992. J. Bact. 174: 3416
Colleen Cavanaugh
Approaching to NGS
1953: Discovery of DNA structure (Cold Spring Harb. Symp. Quant. Biol. 1953;18:123-31)
1977: Sanger sequencing method by F. Sanger (PNAS, 1977, 74: 560-564)
1983: PCR by K. Mullis (Cold Spring Harb. Symp. Quant. Biol. 1986;51 Pt 1:263-73)
1993: Development of pyrosequencing (Anal. Biochem., 1993, 208: 171-175; Science, 1998, 281: 363-365)
1998: Single molecule emulsion PCR
1998: Founded Solexa
2000: Founded 454 Life Science
2001: Human Genome Project (Nature, 2001, 409: 860–92; Science, 2001, 291: 1304–1351)
2005: 454 GS20 sequencer (first NGS sequencer)
2006: Solexa Genome Analyzer (first short-read NGS sequencer)
2006: Illumina acquires Solexa (Illumina enters the NGS business)
2007: ABI SOLiD (short-read sequencer based upon ligation)
2007: Roche acquires 454 Life Sciences (Roche enters the NGS business)
2008: GS FLX sequencer (NGS with 400-500 bp read length)
2008: NGS Human Genome sequencing (first Human Genome sequencing based upon NGS technology)
2010: Hi-Seq2000 (200 Gbp per flow cell)
From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing
Later platforms: Miseq, Roche Jr, Ion Torrent, PacBio, Oxford
Drowning in Data
AAATCGCTAGCGC
CGGCGAGCTAGC
CGAGCGATCGAGC
CGAGCATCGAGTA
Hartman et al. BMC Bioinformatics 2010, 11:317
http://www.biomedcentral.com/1471-2105/11/317
Software (Open Access)
Introducing W.A.T.E.R.S.: a Workflow for the
Alignment, Taxonomy, and Ecology of Ribosomal
Sequences
Amber L Hartman†1,3, Sean Riddle†2, Timothy McPhillips2, Bertram Ludäscher2 and Jonathan A Eisen*1
Abstract
Background: For more than two decades microbiologists have used a highly conserved microbial gene as a
phylogenetic marker for bacteria and archaea. The small-subunit ribosomal RNA gene, also known as 16 S rRNA, is
encoded by ribosomal DNA, 16 S rDNA, and has provided a powerful comparative tool to microbial ecologists. Over
time, the microbial ecology field has matured from small-scale studies in a select number of environments to massive
collections of sequence data that are paired with dozens of corresponding collection variables. As the complexity of
data and tool sets have grown, the need for flexible automation and maintenance of the core processes of 16 S rDNA
sequence analysis has increased correspondingly.
Results: We present WATERS, an integrated approach for 16 S rDNA analysis that bundles a suite of publicly available 16
S rDNA analysis software tools into a single software package. The "toolkit" includes sequence alignment, chimera
removal, OTU determination, taxonomy assignment, phylogenetic tree construction as well as a host of ecological
analysis and visualization tools. WATERS employs a flexible, collection-oriented 'workflow' approach using the open-
source Kepler system as a platform.
Conclusions: By packaging available software tools into a single automated workflow, WATERS simplifies 16 S rDNA
analyses, especially for those without specialized bioinformatics, programming expertise. In addition, WATERS, like
some of the newer comprehensive rRNA analysis tools, allows researchers to minimize the time dedicated to carrying
out tedious informatics steps and to focus their attention instead on the biological interpretation of the results. One
advantage of WATERS over other comprehensive tools is that the use of the Kepler workflow system facilitates result
interpretation and reproducibility via a data provenance sub-system. Furthermore, new "actors" can be added to the
workflow as desired and we see WATERS as an initial seed for a sizeable and growing repository of interoperable, easy-
to-combine tools for asking increasingly complex microbial ecology questions.
Background
Microbial communities and how they are surveyed
Microbial communities abound in nature and are crucial
for the success and diversity of ecosystems. There is no
end in sight to the number of biological questions that
can be asked about microbial diversity on earth. From
animal and human guts to open ocean surfaces and deep
sea hydrothermal vents, to anaerobic mud swamps or
boiling thermal pools, to the tops of the rainforest canopy
and the frozen Antarctic tundra, the composition of
microbial communities is a source of natural history,
intellectual curiosity, and reservoir of environmental
health [1]. Microbial communities are also mediators of
insight into global warming processes [2,3], agricultural
success [4], pathogenicity [5,6], and even human obesity
[7,8].
In the mid-1980s, researchers began to sequence ribosomal RNAs from environmental samples in order to characterize the types of microbes present in those samples (e.g., [9,10]). This general approach was revolutionized by the invention of the polymerase chain reaction (PCR), which made it relatively easy to clone and then
* Correspondence: jaeisen@ucdavis.edu
1 Department of Medical Microbiology and Immunology and the Department
of Evolution and Ecology, Genome Center, University of California Davis, One
Shields Avenue, Davis, CA, 95616, USA
† Contributed equally
Full list of author information is available at the end of the article
WATERS - Kepler Workflow for rRNA
Figure 1. Overview of WATERS. Schema of WATERS where white boxes indicate "behind the scenes" analyses that are performed in WA[...]
Workflow steps: Align, Check chimeras, Cluster, Build Tree, Assign Taxonomy
Outputs: Tree w/ Taxonomy, Diversity statistics & graphs, Unifrac files, Cytoscape network, OTU table
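To make the "Cluster" step in the figure concrete, here is a minimal sketch of greedy OTU clustering by sequence identity. It is not the WATERS implementation, which according to the abstract bundles existing publicly available tools. The function names, the greedy seed-based strategy, and the reuse of the deck's toy 17-base sequences are all assumptions made for illustration.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(sequences, threshold=0.97):
    """Put each sequence in the first OTU whose seed it matches at or above
    the threshold; otherwise the sequence becomes the seed of a new OTU."""
    otus = []  # list of (seed, members) pairs
    for seq in sequences:
        for seed, members in otus:
            if identity(seq, seed) >= threshold:
                members.append(seq)
                break
        else:
            otus.append((seq, [seq]))
    return otus


# Toy sequences from the deck's "Taxa / Characters" table (B1, B2, E1, E2, A1, A2).
toy = [
    "ACTGCACCTATCGTTCG", "ACTCCACCTATCGTTCG",
    "ACTCCAGCTATCGATCG", "ACTCCAGGTATCGATCG",
    "ACCCCAGCTCTCGCTCG", "ACCCCAGCTCTGGCTCG",
]
print(len(greedy_otus(toy, threshold=0.94)))  # one mismatch in 17 bases is tolerated
print(len(greedy_otus(toy, threshold=0.97)))  # only identical sequences cluster
```

With sequences this short, a single mismatch is already below 97% identity, so the looser 0.94 threshold is used to show pairs merging. Greedy clustering also depends on input order, which is one reason real tools offer several clustering methods.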
Motivations
As outlined above, successfully processing microbial sequence collections is far from trivial. Each step is complex and usually requires significant bioinformatics expertise and time investment prior to the biological interpretation. In order to both increase efficiency and ensure that all best-practice tools are easily usable, we sought to create an "all-inclusive" method for performing all of these bioinformatics steps together in one package. To this end, we have built an automated, user-friendly, workflow-based system called WATERS: a Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences (Fig. 1). In addition to being automated and simple to use, because WATERS is executed in the Kepler scientific workflow system (Fig. 2) it also has the advantage that it keeps track of the data lineage and provenance of data products [23,24].
Automation
The primary motivation in building WATERS was to minimize the technical, bioinformatics challenges that arise when performing DNA sequence clustering, phylogenetic tree, and statistical analyses by automating the 16 S rDNA analysis workflow. We also hoped to exploit additional features that workflow-based approaches entail, such as optimized execution and data lineage tracking and browsing [23,25-27]. In the earlier days of 16 S rDNA analysis, simply knowing which microbes were present and whether they were biologically novel was a noteworthy achievement. It was reasonable and expected, therefore, to invest a large amount of time and effort to get to that list of microbes. But now that current efforts are significantly more advanced and often require comparison of dozens of factors and variables with datasets of thousands of sequences, it is not practically feasible to process these large collections "by hand", and hugely inefficient if instead automated methods can be successfully employed.
Broadening the user base
A second motivation and perspective is that by minimizing the technical difficulty of 16 S rDNA analysis through the use of WATERS, we aim to make the analysis of these datasets more widely available and allow individuals with
Figure 2 Screenshot of WATERS in Kepler software. Key features: the library of actors un-collapsed and displayed on the left-hand side, the input
and output paths where the user declares the location of their input files and desired location for the results files. Each green box is an individual Kepler
actor that performs a single action on the data stream. The connectors (black arrows) direct and hook up the actors in a defined sequence. Double-
clicking on any actor or connector allows it to be manipulated and re-arranged.
default is 97% and 99%), and they are also generated for
every metadata variable comparison that the user
includes.
Data pruning
To assist in troubleshooting and quality con
WATERS returns to the user three fasta files of seque
Figure 3. Biologically similar results automatically produced by WATERS on published colonic microbiota samples. (A) Rarefaction curves similar to curves shown in Eckburg et al. Fig. 2; 70-72 indicate patient numbers, i.e., 3 different individuals. (B) Weighted Unifrac analysis based on phylogenetic tree and OTU data produced by WATERS, very similar to Eckburg et al. Fig. 3B. (C) Neighbor-joining phylogenetic tree (Quicktree) representing the sequences analyzed by WATERS, which is clearly similar to Fig. S1 in Eckburg et al.
[Figure 3 panel residue. (B) PCA plot with axes labeled "percent variation explained"; points labeled A to F and S. (C) Tree clade labels: Bacteroidetes (Bacteroidales), Deltaproteobacteria, Actinobacteria, Verrucomicrobia, Epsilonproteobacteria, Firmicutes (Clostridia, Clostridiales), Gammaproteobacteria, Cyanobacteria, Alphaproteobacteria, Fusobacteria, Firmicutes (Bacilli), Firmicutes (Mollicutes).]
Amber Hartman
Phylogenetic Copy # Correction
Kembel SW, Wu M, Eisen JA, Green JL (2012) Incorporating 16S Gene Copy
Number Information Improves Estimates of Microbial Diversity and Abundance. PLoS
Comput Biol 8(10): e1002743. doi:10.1371/journal.pcbi.1002743
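The basic arithmetic behind copy number correction can be sketched as below. This is only the core idea, assuming the 16S copy number of each taxon is already known; the cited paper is about incorporating copy number information via phylogeny, which this sketch does not attempt. The taxon names and numbers are hypothetical.

```python
def correct_for_copy_number(read_counts, copies_per_genome):
    """Divide each taxon's 16S read count by its 16S copies per genome,
    then renormalise so the corrected abundances sum to 1."""
    corrected = {
        taxon: count / copies_per_genome[taxon]
        for taxon, count in read_counts.items()
    }
    total = sum(corrected.values())
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical numbers for illustration only.
read_counts = {"taxon_A": 700, "taxon_B": 100}
copies_per_genome = {"taxon_A": 7, "taxon_B": 1}
corrected = correct_for_copy_number(read_counts, copies_per_genome)
print(corrected)
```

In this toy case taxon_A yields seven times as many reads, yet after correction both taxa have the same estimated abundance, because taxon_A carries seven 16S copies per genome.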
Steven Kembel
Jessica Green
Figure 1. PhylOTU Workflow. Computational processes are represented as squares and databases are represented as cylinders in this generalized workflow of PhylOTU. See Results section for details.
doi:10.1371/journal.pcbi.1001061.g001
Finding Metagenomic OTU
Sharpton TJ, Riesenfeld SJ, Kembel SW, Ladau J, O'Dwyer JP, Green JL, Eisen JA, Pollard
KS. (2011) PhylOTU: A High-Throughput Procedure Quantifies Microbial Community Diversity
and Resolves Novel Taxa from Metagenomic Data. PLoS Comput Biol 7(1): e1001061. doi:
10.1371/journal.pcbi.1001061
PhylOTU
Tom Sharpton
Katie Pollard
Jessica Green
Beta-Diversity
[...] a broader range of Proteobacteria, but yielded similar results (Fig. S1 and Tables S2 and S3).
Across all samples, we identified 4,931 quality Nitrosomonadales sequences, which grouped into 176 OTUs (operational taxonomic units) using an arbitrary 99% sequence similarity cutoff. This cutoff retained a high amount of sequence diversity, but minimized the chance of including diversity because of sequencing or PCR errors. Most (95%) of the sequences appear closely related either to the marine Nitrosospira-like clade, known to be abundant in estuarine sediments (e.g., ref. 19) or to marine bacterium C-17, classified as Nitrosomonas (20) (Fig. S2).
Pairwise community similarity between the samples was calculated based on the presence or absence of each OTU using a rarefied Sørensen's index (4). Community similarity using this [...]
[...] Nitrosomonadales community similarity. Geographic distance contributed the largest partial regression coefficient (b = 0.40, P < 0.0001), with sediment moisture, nitrate concentration, plant cover, salinity, and air and water temperature contributing to smaller, but significant, partial regression coefficients (b = 0.09–0.17, P < 0.05) (Table 1). Because salt marsh bacteria may be [...]
Fig. 1. The 13 marshes sampled (see Table S1 for details). Marshes compared with one another within regions are circled. (Inset) The arrangement of sampling points within marshes. Six points were sampled along a 100-m transect, and a seventh point was sampled ∼1 km away. Two marshes in the Northeast United States (outlined stars) were sampled more intensively, along four 100-m transects in a grid pattern.
Fig. 2. Distance-decay curves for the Nitrosomonadales communities. The dashed, blue line denotes the least-squares linear regression across all spatial scales. The solid lines denote separate regressions within each of the three spatial scales: within marshes, regional (across marshes within regions circled in Fig. 1), and continental (across regions). The slopes of all lines (except the solid light blue line) are significantly less than zero. The slopes of the solid red lines are significantly different from the slope of the all scale (blue dashed) line.
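The presence/absence similarity measure mentioned in the excerpt above can be sketched as follows. This is the plain Sørensen index, 2 × shared OTUs divided by the sum of the two samples' OTU counts. The rarefaction step used in the paper is not included, and the OTU names are hypothetical.

```python
def sorensen_similarity(otus_a, otus_b):
    """Sørensen index on presence/absence data: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(otus_a), set(otus_b)
    if not a and not b:
        raise ValueError("both samples are empty")
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical OTU lists for two sediment samples.
sample_1 = {"OTU1", "OTU2", "OTU3", "OTU4"}
sample_2 = {"OTU3", "OTU4", "OTU5"}
similarity = sorensen_similarity(sample_1, sample_2)
print(round(similarity, 3))
```

The two samples share two OTUs out of four and three, giving 4/7. A distance-decay curve like the one in Fig. 2 plots such pairwise similarities against the geographic distance between the samples.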
Drivers of bacterial β-diversity depend on spatial scale
Jennifer B. H. Martiny (a,1), Jonathan A. Eisen (b), Kevin Penn (c), Steven D. Allison (a,d), and M. Claire Horner-Devine (e)
(a) Department of Ecology and Evolutionary Biology, and (d) Department of Earth System Science, University of California, Irvine, CA 92697; (b) Department of Evolution and Ecology, University of California Davis Genome Center, Davis, CA 95616; (c) Center for Marine Biotechnology and Biomedicine, The Scripps Institution of Oceanography, University of California at San Diego, La Jolla, CA 92093; and (e) School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195
Edited by Edward F. DeLong, Massachusetts Institute of Technology, Cambridge, MA, and approved March 31, 2011 (received for review November 1, 2010)
The factors driving β-diversity (variation in community composition) yield insights into the maintenance of biodiversity on the planet. Here we tested whether the mechanisms that underlie bacterial β-diversity vary over centimeters to continental spatial scales by comparing the composition of ammonia-oxidizing bacteria communities in salt marsh sediments. As observed in studies of macroorganisms, the drivers of salt marsh bacterial β-diversity depend on spatial scale. In contrast to macroorganism studies, however, we found no evidence of evolutionary diversification of ammonia-oxidizing bacteria taxa at the continental scale, despite an overall relationship between geographic distance and community similarity. Our data are consistent with the idea that dispersal limitation at local scales can contribute to β-diversity, even though the 16S rRNA genes of the relatively common taxa are globally distributed. These results highlight the importance of considering multiple spatial scales for understanding microbial biogeography.
microbial composition | distance-decay | Nitrosomonadales | ecological drift
Biodiversity supports the ecosystem processes upon which society depends (1). Understanding the mechanisms that generate and maintain biodiversity is thus key to predicting ecosystem responses to future environmental changes. The decrease in community similarity with geographic distance is a universal biogeographic pattern observed in communities from all [...] community composition) yield insights into the maintenance of biodiversity. These studies are still relatively rare for microorganisms, however, and thus our understanding of the mechanisms underlying microbial diversity (most of the tree of life) remains limited.
β-Diversity, and therefore distance-decay patterns, could be driven solely by differences in environmental conditions across space, a hypothesis summed up by microbiologists as, "everything is everywhere, the environment selects" (10). Under this model, a distance-decay curve is observed because environmental variables tend to be spatially autocorrelated, and organisms with differing niche preferences are selected from the available pool of taxa as the environment changes with distance.
Dispersal limitation can also give rise to β-diversity, as it permits historical contingencies to influence present-day biogeographic patterns. For example, neutral niche models, in which an organism's abundance is not influenced by its environmental preferences, predict a distance-decay curve (8, 11). On relatively short time scales, stochastic births and deaths contribute to a heterogeneous distribution of taxa (ecological drift). On longer time scales, stochastic genetic processes allow for taxon diversification across the landscape (evolutionary drift). If dispersal is limiting, then current environmental or biotic conditions will not fully explain the distance-decay curve, and thus geographic distance will be correlated with community similarity even after controlling for other factors (2).
For macroorganisms, the relative contribution of environmental factors or dispersal limitation to β-diversity depends on spatial scale (12). Fifty years ago, Preston (13) noted that the turnover rate (rate of change) of bird species composition across space within a continent is lower than that across continents. He attributed the high turnover rate across continents to evolutionary diversification (i.e., speciation) between faunas as a result of dispersal limitation and the lower turnover rates of bird species within continents as a result of environmental variation.
Here we investigate whether the mechanisms underlying β-diversity in bacteria also vary by spatial scale. We chose to focus on the ammonia-oxidizing bacteria (AOB), which along with the ammonia-oxidizing archaea (14), perform the rate-limiting step of nitrification and thus play a key role in nitrogen dynamics. We compared AOB community composition in 106 sediment samples from 12 salt marshes on three continents. A partially nested sampling design achieved a relatively balanced distribution of pairwise distance classes over nine orders of magnitude, from 3 cm to 12,500 km (Fig. 1 and Table S1). We limited our sampling to a monophyletic group of bacteria, the AOB within the β-Proteobacteria, and one habitat, salt marshes primarily dominated by cordgrass (Spartina spp.). This approach constrained the pool of total diversity (richness) and kept the environmental and plant variation relatively constant, increasing our ability to identify if dispersal limitation influences AOB composition.
We then asked two questions: (i) Does bacterial β-diversity (specifically, the slope of the distance-decay curve) vary over
vary by spatial scale? Because most bacteria
and hardy, we predicted that dispersal lim
primarily across continents, resulting in
microbial “provinces” (15). At the same tim
environmental factors would contribute
decay at all scales, resulting in the steepest sl
scale as reported in plant and animal comm
Results and Discussion
We characterized AOB community compo
Sanger sequencing of 16S rRNA gene reg
primer sets. Here we focus on the results f
sequences from the order Nitrosomonada
primers specific for AOB within the β-Prot
The second primer set (18) generated lo
Author contributions: J.B.H.M. and M.C.H.-D. designed rese
M.C.H.-D. performed research; J.B.H.M., S.D.A., and M.C.H.-D
and M.C.H.-D. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Freely available online through the PNAS open access opti
Data deposition: The sequences reported in this paper hav
Bank database (accession nos. HQ271472–HQ276885 and H
1
To whom correspondence should be addressed. E-mail: jm
This article contains supporting information online at www.
1073/pnas.1016308108/-/DCSupplemental.
7850–7854 | PNAS | May 10, 2011 | vol. 108 | no. 19 www.pnas.org/cgi/do
Our data are consistent with the idea that dispersal limitation at local scales can contribute to β-diversity, even though the 16S rRNA genes of the relatively common taxa are globally distributed.
Jen Hughes Martiny
M. Claire Horner-Devine
Drosophila microbiome
Both natural surveys and laboratory
experiments indicate that host diet
plays a major role in shaping the
Drosophila bacterial microbiome.
Laboratory strains provide only a
limited model of natural host–microbe
interactions
Jenna Lang, Angus Chandler
Embracing Diversity 2: Other Genes
Culture Independent “Metagenomics”
DNA DNA DNA
Metagenomics
http://dx.doi.org/10.1016/S1074-5521(98)90108-9
Jo Handelsman
Culture Independent “Metagenomics”
DNA DNA DNA
Taxa Characters
B1 ACTGCACCTATCGTTCG
B2 ACTCCACCTATCGTTCG
E1 ACTCCAGCTATCGATCG
E2 ACTCCAGGTATCGATCG
A1 ACCCCAGCTCTCGCTCG
A2 ACCCCAGCTCTGGCTCG
New1 ACCCCAGCTCTGCCTCG
New2 AGGGGAGCTCTGCCTCG
New3 ACTCCAGCTATCGATCG
New4 ACTGCACCTATCGTTCG
RecA RecA RecA
http://genomebiology.com/2008/9/10/R151 Genome Biology 2008, Volume 9, Issue 10, Article R151 Wu and Eisen R151.7
Genome Biology 2008, 9:R151
[...] sequences are not conserved at the nucleotide level [29]. As a result, the nr database does not actually contain many more protein marker sequences that can be used as references than those available from complete genome sequences.
Comparison of phylogeny-based and similarity-based phylotyping
Although our phylogeny-based phylotyping is fully automated, it still requires many more steps than, and is slower than, similarity based phylotyping methods such as MEGAN [30]. Is it worth the trouble? Similarity based phylotyping works by searching a query sequence against a reference database such as NCBI nr and deriving taxonomic information from the best matches or 'hits'. When species that are closely related to the query sequence exist in the reference database, similarity-based phylotyping can work well. However, if the reference database is a biased sample or if it contains no closely related species to the query, then the top hits returned could be misleading [31]. Furthermore, similarity-based methods require an arbitrary similarity cut-off value to define the top hits. Because individual bacterial genomes and proteins can evolve at very different rates, a universal cut-off that works under all conditions does not exist. As a result, the final results can be very subjective.
In contrast, our tree-based bracketing algorithm places the query sequence within the context of a phylogenetic tree and only assigns it to a taxonomic level if that level has adequate sampling (see Materials and methods [below] for details of the algorithm). With the well sampled species Prochlorococcus marinus, for example, our method can distinguish closely related organisms and make taxonomic identifications at the species level. Our reanalysis of the Sargasso Sea data placed 672 sequences (3.6% of the total) within a P. marinus clade. On the other hand, for sparsely sampled clades such as Aquifex, assignments will be made only at the phylum level. Thus, our phylogeny-based analysis is less susceptible to data sampling bias than a similarity based approach, and it makes
Figure 3. Major phylotypes identified in Sargasso Sea metagenomic data. The metagenomic data previously obtained from the Sargasso Sea was reanalyzed using AMPHORA and the 31 protein phylogenetic markers. The microbial diversity profiles obtained from individual markers are remarkably consistent. The breakdown of the phylotyping assignments by markers and major taxonomic groups is listed in Additional data file 5.
[Bar chart. Y axis: relative abundance (0 to 0.8). X axis: Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Unclassified proteobacteria, Bacteroidetes, Chlamydiae, Cyanobacteria, Acidobacteria, Thermotogae, Fusobacteria, Actinobacteria, Aquificae, Planctomycetes, Spirochaetes, Firmicutes, Chloroflexi, Chlorobi, Unclassified bacteria. Series (marker genes): dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf.]
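The similarity-based approach that the Wu and Eisen excerpt above contrasts with tree-based phylotyping can be sketched as a best-hit search with a cut-off. This is an illustration of the general idea only, not the behaviour of MEGAN or AMPHORA; the function names, the ungapped percent identity, and the reuse of the deck's toy reference table are assumptions.

```python
def percent_identity(a, b):
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def best_hit(query, references, cutoff):
    """Return (label, identity) for the most similar reference, or
    (None, identity) when even the best hit falls below the cutoff."""
    label, identity = max(
        ((name, percent_identity(query, seq)) for name, seq in references.items()),
        key=lambda hit: hit[1],
    )
    if identity < cutoff:
        return None, identity
    return label, identity


# Toy references from the deck's "Taxa / Characters" table.
references = {
    "B1": "ACTGCACCTATCGTTCG", "B2": "ACTCCACCTATCGTTCG",
    "E1": "ACTCCAGCTATCGATCG", "E2": "ACTCCAGGTATCGATCG",
    "A1": "ACCCCAGCTCTCGCTCG", "A2": "ACCCCAGCTCTGGCTCG",
}
query = "AGGGGAGCTCTGCCTCG"  # "New2" from the Four Taxa slide

print(best_hit(query, references, cutoff=90.0))
print(best_hit(query, references, cutoff=70.0))
```

The same query is left unassigned at a 90% cut-off and assigned to A2 at a 70% cut-off, which illustrates the excerpt's point that the outcome depends on an arbitrary threshold when no close relative is in the reference set.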
RpoB RpoB RpoB
Rpl4 Rpl4 Rpl4
rRNA rRNA rRNA
Hsp70 Hsp70 Hsp70
EFTu EFTu EFTu
Many other genes better than rRNA
Phylosift for Other Marker Genes
DNA DNA DNA
Taxa Characters
B1 ACTGCACCTATCGTTCG
B2 ACTCCACCTATCGTTCG
E1 ACTCCAGCTATCGATCG
E2 ACTCCAGGTATCGATCG
A1 ACCCCAGCTCTCGCTCG
A2 ACCCCAGCTCTGGCTCG
New1 ACCCCAGCTCTGCCTCG
New2 AGGGGAGCTCTGCCTCG
[Figure: PhyloSift workflow. Input sequences are searched against references, and each input sequence is scanned against both workflows (rRNA workflow and protein workflow). Box labels: LAST (fast candidate search); hmmalign (multiple alignment; profile HMMs used to align candidates to reference alignment); Infernal (multiple alignment); rRNA reads split at <600 bp and >600 bp; parallel option; pplacer (phylogenetic placement); Taxonomic Summaries (Krona plots, number of reads placed for each marker gene); Sample Analysis & Comparison (Edge PCA, tree visualization, Bayes factor tests).]
https://phylosift.wordpress.com
PeerJ 2:e243 https://dx.doi.org/10.7717/peerj.243
Aaron Darling
Holly Bik
Wu et al. 2006 PLoS Biology 4: e188.
Baumannia makes vitamins and cofactors
Sulcia makes amino acids
Phylogenetic Binning
Nancy Moran
Dongying Wu
Embracing Diversity 3: Function
Functional Prediction from Metagenomes
DNA
Taxa Characters
B1 ACTGCACCTATCGTTCG
B2 ACTCCACCTATCGTTCG
E1 ACTCCAGCTATCGATCG
E2 ACTCCAGGTATCGATCG
A1 ACCCCAGCTCTCGCTCG
A2 ACCCCAGCTCTGGCTCG
New1 ACCCCAGCTCTGCCTCG
New2 AGGGGAGCTCTGCCTCG
New3 ACTCCAGCTATCGATCG
New4 ACTGCACCTATCGTTCG
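The table above can be read as a matching problem: each new sequence is compared with the reference taxa, and the closest reference suggests where it belongs. The sketch below is a deliberately simplified stand-in that counts mismatches (Hamming distance) instead of building a tree or doing phylogenetic placement as the slides describe. The sequences are copied from the table; the function names are hypothetical.

```python
# Reference taxa and new sequences copied from the "Taxa / Characters" table.
REFERENCES = {
    "B1": "ACTGCACCTATCGTTCG",
    "B2": "ACTCCACCTATCGTTCG",
    "E1": "ACTCCAGCTATCGATCG",
    "E2": "ACTCCAGGTATCGATCG",
    "A1": "ACCCCAGCTCTCGCTCG",
    "A2": "ACCCCAGCTCTGGCTCG",
}
QUERIES = {
    "New1": "ACCCCAGCTCTGCCTCG",
    "New2": "AGGGGAGCTCTGCCTCG",
    "New3": "ACTCCAGCTATCGATCG",
    "New4": "ACTGCACCTATCGTTCG",
}


def hamming(a, b):
    """Count mismatching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_reference(query, references):
    """Return (name, distance) of the reference with the fewest mismatches.

    Ties are broken by dictionary order. This is NOT phylogenetic placement,
    only a nearest-neighbor approximation used for illustration.
    """
    scored = ((name, hamming(query, seq)) for name, seq in references.items())
    return min(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, seq in QUERIES.items():
        ref, dist = closest_reference(seq, REFERENCES)
        print(f"{name}: closest reference {ref} ({dist} mismatches)")
```

A sequence with many mismatches to every reference is the kind of case where a nearest-neighbor label is least trustworthy and a tree-based placement is more informative.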
inputs of fixed carbon or nitrogen from external sources. As with Leptospirillum group I, both Leptospirillum group II and III have the genes needed to fix carbon by means of the Calvin–Benson–Bassham cycle (using type II ribulose 1,5-bisphosphate carboxylase–oxygenase). All genomes recovered from the AMD system contain formate hydrogenlyase complexes. These, in combination with carbon monoxide dehydrogenase, may be used for carbon fixation via the reductive acetyl coenzyme A (acetyl-CoA) pathway by some, or all, organisms. Given the large number of ABC-type sugar and amino acid transporters encoded in the Ferroplasma type
Figure 4 Cell metabolic cartoons constructed from the annotation of 2,180 ORFs identified in the Leptospirillum group II genome (63% with putative assigned function) and 1,931 ORFs in the Ferroplasma type II genome (58% with assigned function). The cell cartoons are shown within a biofilm that is attached to the surface of an acid mine drainage stream (viewed in cross-section). Tight coupling between ferrous iron oxidation, pyrite dissolution and acid generation is indicated. Rubisco, ribulose 1,5-bisphosphate carboxylase–oxygenase. THF, tetrahydrofolate.
Nature | doi:10.1038/nature02340 | www.nature.com/nature © 2004 Nature Publishing Group
PHYLOGENETIC PREDICTION OF GENE FUNCTION
METHOD: CHOOSE GENE(S) OF INTEREST; IDENTIFY HOMOLOGS; ALIGN SEQUENCES; CALCULATE GENE TREE; OVERLAY KNOWN FUNCTIONS ONTO TREE; INFER LIKELY FUNCTION OF GENE(S) OF INTEREST
[Figure: EXAMPLE A and EXAMPLE B, gene trees for genes 1 to 6 and for paralogs 1A to 3A and 1B to 3B from Species 1, Species 2 and Species 3; ACTUAL EVOLUTION (ASSUMED TO BE UNKNOWN); candidate duplication events marked "Duplication?"; some predictions marked "Ambiguous"]
Based on Eisen, 1998 Genome Res 8: 163-167.
Phylogenomics
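The "overlay known functions onto tree, then infer" steps can be sketched in a few lines. The example below is a minimal illustration, not the published method: it assumes the gene tree is already computed and given as nested tuples, and it uses a simple rule (take the smallest clade around the query that contains annotated genes; predict only if they all agree). The tree topology, the function labels and all function names are hypothetical.

```python
def leaves(tree):
    """Return the leaf names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaves(child))
    return names


def clades_containing(tree, query):
    """Return the clades that contain `query`, smallest clade first."""
    if isinstance(tree, str):
        return []
    for child in tree:
        if child == query:
            return [tree]
        if not isinstance(child, str) and query in leaves(child):
            return clades_containing(child, query) + [tree]
    return []


def predict_function(tree, known, query):
    """Predict a function for `query` from annotated relatives in the tree.

    Returns the shared function of the closest annotated clade, the string
    "ambiguous" if those relatives disagree, or None if nothing is annotated.
    """
    for clade in clades_containing(tree, query):
        functions = {known[name] for name in leaves(clade)
                     if name != query and name in known}
        if len(functions) == 1:
            return functions.pop()
        if len(functions) > 1:
            return "ambiguous"
    return None


# Hypothetical gene tree: paralog groups A and B, each present in species 1-3.
GENE_TREE = ((("1A", "2A"), "3A"), (("1B", "2B"), "3B"))
KNOWN = {"1A": "function A", "3A": "function A",
         "1B": "function B", "3B": "function B"}

if __name__ == "__main__":
    print(predict_function(GENE_TREE, KNOWN, "2A"))
    print(predict_function(GENE_TREE, KNOWN, "2B"))
```

The point of working on the tree rather than on similarity scores is that a gene is annotated from its own paralog group even when a member of the other group happens to be the top database hit.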
Shotmap
Simulate metagenomic library
Translate metagenomic reads
Search metagenomic peptides
Classify metagenomic peptides
Estimate protein family abundance
Taxonomic profiles from real metagenomes
Protein family database
IMG/ER reference genomes
Construct mock community
Annotate genes in genomes
Expected abundance of gene families
Evaluate estimation accuracy
Tom Sharpton, Katie Pollard
https://github.com/sharpton/shotmap
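The last step of the workflow, turning classified peptides into protein family abundances, can be illustrated with a simple count. This is a sketch under stated assumptions, not Shotmap's implementation: it assumes each read has already been assigned to at most one family, and it reports plain relative abundance with no correction for family length or coverage. The read and family identifiers are made up.

```python
from collections import Counter


def family_relative_abundance(classified_reads):
    """Return {family: fraction of classified reads} from {read_id: family or None}.

    Reads mapped to None (unclassified) are ignored. No length or coverage
    normalization is applied; that is a simplifying assumption of this sketch.
    """
    counts = Counter(fam for fam in classified_reads.values() if fam is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {fam: n / total for fam, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical classification results for four reads.
    reads = {"read1": "famA", "read2": "famA", "read3": "famB", "read4": None}
    for family, fraction in sorted(family_relative_abundance(reads).items()):
        print(f"{family}: {fraction:.2f}")
```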
Embracing Diversity 4: Organized Reference Data
Automated Accurate Genome Tree
Lang JM, Darling AE, Eisen JA (2013) Phylogeny of
Bacterial and Archaeal Genomes Using Conserved
Genes: Supertrees and Supermatrices. PLoS ONE
8(4): e62510. doi:10.1371/journal.pone.0062510
Jenna Lang
Automated Protein Family Surveys
Representative Genomes → Extract Protein Annotation → All v. All BLAST → Homology Clustering (MCL) → SFams → Align & Build HMMs → HMMs → Screen for Homologs
New Genomes → Extract Protein Annotation → Screen for Homologs
Figure 1
Tom Sharpton, Katie Pollard
http://www.biomedcentral.com/1471-2105/13/264
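To make the clustering step concrete, the sketch below groups proteins into families from a list of all-versus-all hits. It swaps in single-linkage clustering (connected components via union-find) for MCL, which is a much cruder method and tends to merge families that MCL would keep apart; it is shown only because it fits in a few lines. The hit list and protein identifiers are hypothetical.

```python
def cluster_hits(hits):
    """Group proteins into families by single-linkage over pairwise hits.

    `hits` is an iterable of (protein_a, protein_b) pairs that passed some
    similarity threshold. Returns a sorted list of sorted member lists.
    This is a stand-in for MCL, not an implementation of it.
    """
    parent = {}

    def find(x):
        # Find the representative of x, compressing the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = {}
    for protein in list(parent):
        families.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(members) for members in families.values())


if __name__ == "__main__":
    # Hypothetical significant hits from an all-versus-all search.
    hits = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]
    for family in cluster_hits(hits):
        print(family)
```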
Genomes Poorly Sampled
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
2002: TIGR Tree of Life Project
Naomi Ward, Karen Nelson
Genomic Encyclopedia of Bacteria & Archaea
Synapomorphies exist
GEBA Cyanobacteria
Shih et al. 2013. PNAS 10.1073/pnas.1217107110
[Figure: tree with subclades A, B1, B2, B3, C1, C2, C3, D, E, F, G and plastid lineages Paulinella, Glaucophyte, Green, Red, Chromalveolates; scale bar 0.3; panels A and B]
Fig. 2. Implications on plastid evolution. (A) Maximum-likelihood phylogenetic tree of plastids and cyanobacteria, grouped by subclades (Fig. 1). The red dot
Cheryl Kerfeld
Haloarchaeal GEBA-like
Lynch et al. (2012) PLoS ONE 7(7): e41389. doi:10.1371/journal.pone.0041389
Erin Lynch
The Dark Matter of Biology
From Wu et al. 2009 Nature 462, 1056-1060
JGI Dark Matter Project
environmental samples (n=9) → isolation of single cells (n=9,600) → whole genome amplification (n=3,300) → SSU rRNA gene based identification (n=2,000) → genome sequencing, assembly and QC (n=201) → draft genomes (n=201)
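The counts on this slide describe a funnel, and the attrition at each step is easy to work out. The short calculation below uses only the numbers shown on the slide; the variable and function names are illustrative.

```python
# Step names and counts as given on the slide.
STEPS = [
    ("isolation of single cells", 9600),
    ("whole genome amplification", 3300),
    ("SSU rRNA gene based identification", 2000),
    ("genome sequencing, assembly and QC", 201),
]


def retention(steps):
    """Return [(step_name, fraction kept relative to the previous step)]."""
    kept = []
    for (_, previous), (name, count) in zip(steps, steps[1:]):
        kept.append((name, count / previous))
    return kept


if __name__ == "__main__":
    for name, fraction in retention(STEPS):
        print(f"{name}: {fraction:.1%} of previous step")
    overall = STEPS[-1][1] / STEPS[0][1]
    print(f"overall: {overall:.1%} of isolated cells yield a draft genome")
```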
[Figure: sampling sites grouped by habitat type: seawater, brackish/freshwater, hydrothermal, sediment, bioreactor]
[Figure: phylogenetic tree of BACTERIA and ARCHAEA. Candidate phyla labeled with proposed names include WS3 (Latescibacteria), OP3 (Omnitrophica), NKB19 (Hydrogenedentes), EM19 (Calescamantes), CD12 (Aerophobetes), OP8 (Aminicenantes), OP9 (Atribacteria), WWE1 (Cloacamonetes), OP1 (Acetothermia), GN02 (Gracilibacteria), OD1 (Parcubacteria), OP11 (Microgenomates), DSEG (Aenigmarchaea), pMC2A384 (Diapherotrites)]
[Figure panels: archaeal toxins (Nanoarchaea); lytic murein transglycosylase; stringent response (Diapherotrites, Nanoarchaea); sigma factor (Diapherotrites, Nanoarchaea); oxidoreductase; HGT from Eukaryotes (Nanoarchaea); murein (peptidoglycan); archaeal type purine synthesis (Microgenomates); UGA recoded for Gly (Gracilibacteria)]
Woyke et al. Nature 2013.
Tanja Woyke
Embracing Diversity 5: Public Participation
The Rise of Citizen Microbiology
Darlene
Cavalier
Kitty Microbiome Project
tinyurl/kittybiome
Holly Ganz
Embracing Diversity 6: Diversity in STEM
Diversity in STEM
Jo Handelsman
Diversity in STEM #DoSomething
Don’t Just Sit There #DoSomething
Acknowledgements
DOE JGI, Sloan, GBMF, NSF, DHS, DARPA
Aaron Darling, Lizzy Wilbanks, Jenna Lang, Russell Neches, Rob Knight, Jack Gilbert, Tanja Woyke, Rob Dunn, Katie Pollard, Jessica Green, Darlene Cavalier, Eddy Rubin, Wendy Brown, Dongying Wu, Phil Hugenholtz, DSMZ, Sundar, Srijak Bhatnagar, David Coil, Alex Alexiev, Hannah Holland-Moritz, Holly Bik, John Zhang, Holly Menninger, Guillaume Jospin, David Lang, Cassie Ettinger, Tim Harkins, Jennifer Gardy, Holly Ganz
Diversity Diversity Diversity Diversity ....

  • 1. Diversity Diversity Diversity Diversity Diversity Diversity Diversity Diversity Diversity Diversity Diversity Diversity SSE 2015 Jonathan A. Eisen @phylogenomics University of California, Davis
  • 4. Overwhelming Diversity of Microbes: Diversity of Form
  • 5. Overwhelming Diversity of Microbes: Diversity of Form, Phylogenetic Diversity
  • 6. Overwhelming Diversity of Microbes: Functional Diversity, Diversity of Form, Phylogenetic Diversity
  • 7. Overwhelming Diversity of Microbes: Functional Diversity, Diversity of Form, Phylogenetic Diversity. MICROBES RUN THE PLANET
  • 8. Great Plate Count Anomaly
  • 9. Great Plate Count Anomaly
  • 10. Great Plate Count Anomaly: Observation
  • 11. Great Plate Count Anomaly: Culturing, Observation
  • 12. Great Plate Count Anomaly: Culturing Count, Observation Count
  • 13. Great Plate Count Anomaly: Culturing Count <<<< Observation Count
  • 14. Great Plate Count Anomaly: Culturing Count <<<< Observation Count. Image source: http://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&docid=rLu5sL207WlE1M&tbnid=CRLQYP7d9d_TcM:&ved=0CAUQjRw&url=http%3A%2F%2Fwww.biol.unt.edu%2F~jajohnson%2FDNA_sequencing_process&ei=hFu7U_TyCtOqsQSu9YGwBg&psig=AFQjCNG-8EBdEljE7-yHFG2KPuBZt8kIPw&ust=1404873951211424
  • 15. Great Plate Count Anomaly: Culturing Count <<<< Observation Count. DNA. Image source: http://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&docid=rLu5sL207WlE1M&tbnid=CRLQYP7d9d_TcM:&ved=0CAUQjRw&url=http%3A%2F%2Fwww.biol.unt.edu%2F~jajohnson%2FDNA_sequencing_process&ei=hFu7U_TyCtOqsQSu9YGwBg&psig=AFQjCNG-8EBdEljE7-yHFG2KPuBZt8kIPw&ust=1404873951211424
  • 17. Phylotyping via rRNA PCR: One Taxon
      PCR products from sample DNA: ACTGC ACCTAT CGTTCG; ACTGC ACCTAT CGTTCG; ACTGC ACCTAT CGTTCG
      Taxa  Characters
      B1    ACTGCACCTATCGTTCG
      B2    ACTCCACCTATCGTTCG
      E1    ACTCCAGCTATCGATCG
      E2    ACTCCAGGTATCGATCG
      A1    ACCCCAGCTCTCGCTCG
      A2    ACCCCAGCTCTGGCTCG
      New1  ACTGCACCTATCGTTCG
      Tree branches: Bacteria, Eukaryotes, Archaea
      Many sequences from one sample all point to the same branch on the tree.
  • 18. Phylotyping via rRNA PCR: Two Taxa
      PCR products from sample DNA: ACTGC ACCTAT CGTTCG; ACTGC ACCTAT CGTTCG; ACCCC AGCTCT CGCTCG
      Taxa  Characters
      B1    ACTGCACCTATCGTTCG
      B2    ACTCCACCTATCGTTCG
      E1    ACTCCAGCTATCGATCG
      E2    ACTCCAGGTATCGATCG
      A1    ACCCCAGCTCTCGCTCG
      A2    ACCCCAGCTCTGGCTCG
      New1  ACCCCAGCTCTGCCTCG
      New2  ACTGCACCTATCGTTCG
      Tree branches: Bacteria, Eukaryotes, Archaea
      One can estimate cell counts from the number of times each sequence is seen.
  • 21. Phylotyping via rRNA PCR: Four Taxa
      PCR products from sample DNA: ACTGC ACCTAT CGTTCG; ACTCC AGCTAT CGATCG; ACCCC AGCTCT CGCTCG; AGGGG AGCTCT CGCTCG; AGGGG AGCTCT CGCTCG; ACTGC ACCTAT CGTTCG
      Taxa  Characters
      B1    ACTGCACCTATCGTTCG
      B2    ACTCCACCTATCGTTCG
      E1    ACTCCAGCTATCGATCG
      E2    ACTCCAGGTATCGATCG
      A1    ACCCCAGCTCTCGCTCG
      A2    ACCCCAGCTCTGGCTCG
      New1  ACCCCAGCTCTGCCTCG
      New2  AGGGGAGCTCTGCCTCG
      New3  ACTCCAGCTATCGATCG
      New4  ACTGCACCTATCGTTCG
      Tree branches: Bacteria, Eukaryotes, Archaea
      Even with more taxa it still works.
  • 22. rRNA PCR: Community Comparisons
      DNA (three samples). PCR products: ACTGC ACCTAT CGTTCG; ACTCC AGCTAT CGATCG; ACCCC AGCTCT CGCTCG
      Taxa  Characters
      B1    ACTGCACCTATCGTTCG
      B2    ACTCCACCTATCGTTCG
      E1    ACTCCAGCTATCGATCG
      E2    ACTCCAGGTATCGATCG
      A1    ACCCCAGCTCTCGCTCG
      A2    ACCCCAGCTCTGGCTCG
      New1  ACCCCAGCTCTGCCTCG
      New2  ACGGCAGCTCTGCCTCG
      Tree branches: Bacteria, Eukaryotes, Archaea
  • 23. rRNA PCR: Community Comparisons
      DNA (three samples). PCR products: ACTGC ACCTAT CGTTCG; ACTCC AGCTAT CGATCG; ACCCC AGCTCT CGCTCG
      Taxa  Characters
      B1    ACTGCACCTATCGTTCG
      B2    ACTCCACCTATCGTTCG
      E1    ACTCCAGCTATCGATCG
      E2    ACTCCAGGTATCGATCG
      A1    ACCCCAGCTCTCGCTCG
      A2    ACCCCAGCTCTGGCTCG
      New1  ACCCCAGCTCTGCCTCG
      New2  ACGGCAGCTCTGCCTCG
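The phylotyping slides above place each new sequence on a reference tree. As a rough illustration only, the sketch below assigns each read to the reference taxon with the fewest mismatches and tallies reads per domain. It is a simplification and not the tree-based method the slides depict: nearest-match by Hamming distance stands in for phylogenetic placement, and the mapping of the B/E/A prefixes to Bacteria, Eukaryotes, and Archaea is an assumption read off the slide labels. The function names are hypothetical.

```python
from collections import Counter

# Reference characters copied from the slides. The domain for each taxon is
# an assumption based on the B/E/A prefixes and the tree labels.
REFERENCES = {
    "B1": ("Bacteria", "ACTGCACCTATCGTTCG"),
    "B2": ("Bacteria", "ACTCCACCTATCGTTCG"),
    "E1": ("Eukaryotes", "ACTCCAGCTATCGATCG"),
    "E2": ("Eukaryotes", "ACTCCAGGTATCGATCG"),
    "A1": ("Archaea", "ACCCCAGCTCTCGCTCG"),
    "A2": ("Archaea", "ACCCCAGCTCTGGCTCG"),
}

# New1..New4 from the "Four Taxa" slide.
FOUR_TAXA_READS = [
    "ACCCCAGCTCTGCCTCG",
    "AGGGGAGCTCTGCCTCG",
    "ACTCCAGCTATCGATCG",
    "ACTGCACCTATCGTTCG",
]


def hamming(a, b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def phylotype(read):
    """Return (nearest reference taxon, its domain, number of mismatches)."""
    name, (domain, ref) = min(
        REFERENCES.items(), key=lambda item: hamming(read, item[1][1])
    )
    return name, domain, hamming(read, ref)


def domain_counts(reads):
    """Tally how many reads fall nearest to each domain."""
    return Counter(phylotype(read)[1] for read in reads)


if __name__ == "__main__":
    for read in FOUR_TAXA_READS:
        print(read, phylotype(read))
    print(domain_counts(FOUR_TAXA_READS))
```

A read that is far from every reference (such as New2 above) still gets a nearest match here, which is exactly the weakness that placing sequences on a tree is meant to address.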
  • 24. Chemosymbiont rRNA Phylotyping. Eisen et al. 1992. J. Bact. 174: 3416. Colleen Cavanaugh
  • 25. Approaching to NGS
      1953: Discovery of DNA structure (Cold Spring Harb. Symp. Quant. Biol. 1953;18:123-31)
      1977: Sanger sequencing method by F. Sanger (PNAS, 1977, 74: 560-564)
      1983: PCR by K. Mullis (Cold Spring Harb Symp Quant Biol. 1986;51 Pt 1:263-73)
      1993: Development of pyrosequencing (Anal. Biochem., 1993, 208: 171-175; Science, 1998, 281: 363-365)
      1998: Single molecule emulsion PCR
      Human Genome Project (Nature, 2001, 409: 860–92; Science, 2001, 291: 1304–1351)
      2000: Founded 454 Life Science
      2005: 454 GS20 sequencer (first NGS sequencer)
      1998: Founded Solexa
      2006: Solexa Genome Analyzer (first short-read NGS sequencer)
      2008: GS FLX sequencer (NGS with 400-500 bp read length)
      2010: Hi-Seq2000 (200 Gbp per flow cell)
      2006: Illumina acquires Solexa (Illumina enters the NGS business)
      2007: ABI SOLiD (short-read sequencer based upon ligation)
      2007: Roche acquires 454 Life Sciences (Roche enters the NGS business)
      2008: NGS Human Genome sequencing (first human genome sequencing based upon NGS technology)
      From Slideshare presentation of Cosentino Cristian: http://www.slideshare.net/cosentia/high-throughput-equencing
      Miseq, Roche Jr, Ion Torrent, PacBio, Oxford
      Drowning in Data: AAATCGCTAGCGC CGGCGAGCTAGC CGAGCGATCGAGC CGAGCATCGAGTA
  • 26. WATERS - Kepler Workflow for rRNA. Amber Hartman
      Hartman et al. BMC Bioinformatics 2010, 11:317. http://www.biomedcentral.com/1471-2105/11/317. Software, Open Access.
      Introducing W.A.T.E.R.S.: a Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences
      Amber L Hartman (†, 1, 3), Sean Riddle (†, 2), Timothy McPhillips (2), Bertram Ludäscher (2) and Jonathan A Eisen (*, 1)
      * Correspondence: jaeisen@ucdavis.edu. (1) Department of Medical Microbiology and Immunology and the Department of Evolution and Ecology, Genome Center, University of California Davis, One Shields Avenue, Davis, CA, 95616, USA. † Contributed equally. Full list of author information is available at the end of the article.

      Abstract
      Background: For more than two decades microbiologists have used a highly conserved microbial gene as a phylogenetic marker for bacteria and archaea. The small-subunit ribosomal RNA gene, also known as 16S rRNA, is encoded by ribosomal DNA, 16S rDNA, and has provided a powerful comparative tool to microbial ecologists. Over time, the microbial ecology field has matured from small-scale studies in a select number of environments to massive collections of sequence data that are paired with dozens of corresponding collection variables. As the complexity of data and tool sets has grown, the need for flexible automation and maintenance of the core processes of 16S rDNA sequence analysis has increased correspondingly.
      Results: We present WATERS, an integrated approach for 16S rDNA analysis that bundles a suite of publicly available 16S rDNA analysis software tools into a single software package. The "toolkit" includes sequence alignment, chimera removal, OTU determination, taxonomy assignment, and phylogenetic tree construction, as well as a host of ecological analysis and visualization tools. WATERS employs a flexible, collection-oriented 'workflow' approach using the open-source Kepler system as a platform.
      Conclusions: By packaging available software tools into a single automated workflow, WATERS simplifies 16S rDNA analyses, especially for those without specialized bioinformatics or programming expertise. In addition, WATERS, like some of the newer comprehensive rRNA analysis tools, allows researchers to minimize the time dedicated to carrying out tedious informatics steps and to focus their attention instead on the biological interpretation of the results. One advantage of WATERS over other comprehensive tools is that the use of the Kepler workflow system facilitates result interpretation and reproducibility via a data provenance sub-system. Furthermore, new "actors" can be added to the workflow as desired and we see WATERS as an initial seed for a sizeable and growing repository of interoperable, easy-to-combine tools for asking increasingly complex microbial ecology questions.

      Background
      Microbial communities and how they are surveyed. Microbial communities abound in nature and are crucial for the success and diversity of ecosystems. There is no end in sight to the number of biological questions that can be asked about microbial diversity on earth. From animal and human guts to open ocean surfaces and deep sea hydrothermal vents, to anaerobic mud swamps or boiling thermal pools, to the tops of the rainforest canopy and the frozen Antarctic tundra, the composition of microbial communities is a source of natural history, intellectual curiosity, and reservoir of environmental health [1]. Microbial communities are also mediators of insight into global warming processes [2,3], agricultural success [4], pathogenicity [5,6], and even human obesity [7,8]. In the mid-1980s, researchers began to sequence ribosomal RNAs from environmental samples in order to characterize the types of microbes present in those samples (e.g., [9,10]). This general approach was revolutionized by the invention of the polymerase chain reaction (PCR), which made it relatively easy to clone and then [...]
      (Page 2 of 14, cropped on the slide) [...] genes for ribosomal RNA) in partic[...] [...]ubunit ribosomal RNA (ss-rRNA). [...]ed a large amount of previously [...]l diversity [1,11-13]. Researchers [...]all subunit rRNA gene not only [...]ith which it can be PCR amplified, [...] has variable and highly conserved [...] to be universally distributed among [...]nd it is useful for inferring phyloge[...]4,15]. Since then, "cultivation-inde[...]" have brought a revolution to the [...] by allowing scientists to study a [...]mount of diversity in many different [...]ments [16-18]. The general premise [...]

      Figure 1. Overview of WATERS. Schema of WATERS where white boxes indicate "behind the scenes" analyses that are performed in WA[...] Workflow steps: Align; Check chimeras; Cluster; Build Tree; Assign Taxonomy. Outputs: Tree w/ Taxonomy; Diversity statistics & graphs; Unifrac files; Cytoscape network; OTU table.

      Motivations (Page 3 of 14)
      As outlined above, successfully processing microbial sequence collections is far from trivial. Each step is complex and usually requires significant bioinformatics expertise and time investment prior to the biological interpretation. In order to both increase efficiency and ensure that all best-practice tools are easily usable, we sought to create an "all-inclusive" method for performing all of these bioinformatics steps together in one package. To this end, we have built an automated, user-friendly, workflow-based system called WATERS: a Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences (Fig. 1). In addition to being automated and simple to use, because WATERS is executed in the Kepler scientific workflow system (Fig. 2) it also has the advantage that it keeps track of the data lineage and provenance of data products [23,24].
      Automation. The primary motivation in building WATERS was to minimize the technical, bioinformatics challenges that arise when performing DNA sequence clustering, phylogenetic tree, and statistical analyses by automating the 16S rDNA analysis workflow. We also hoped to exploit additional features that workflow-based approaches entail, such as optimized execution and data lineage tracking and browsing [23,25-27]. In the earlier days of 16S rDNA analysis, simply knowing which microbes were present and whether they were biologically novel was a noteworthy achievement. It was reasonable and expected, therefore, to invest a large amount of time and effort to get to that list of microbes. But now that current efforts are significantly more advanced and often require comparison of dozens of factors and variables with datasets of thousands of sequences, it is not practically feasible to process these large collections "by hand", and hugely inefficient if instead automated methods can be successfully employed.
      Broadening the user base. A second motivation and perspective is that by minimizing the technical difficulty of 16S rDNA analysis through the use of WATERS, we aim to make the analysis of these datasets more widely available and allow individuals with [...]

      Figure 2. Screenshot of WATERS in Kepler software. Key features: the library of actors un-collapsed and displayed on the left-hand side, the input and output paths where the user declares the location of their input files and desired location for the results files. Each green box is an individual Kepler actor that performs a single action on the data stream. The connectors (black arrows) direct and hook up the actors in a defined sequence. Double-clicking on any actor or connector allows it to be manipulated and re-arranged.

      (Page 9, cropped on the slide) [...] default is 97% and 99%), and they are also generated for every metadata variable comparison that the user includes. Data pruning. To assist in troubleshooting and quality con[...] WATERS returns to the user three fasta files of seque[...]

      Figure 3. Biologically similar results automatically produced by WATERS on published colonic microbiota samples. (A) Rarefaction curves similar to curves shown in Eckburg et al. Fig. 2; 70-72 indicate patient numbers, i.e., 3 different individuals. (B) Weighted Unifrac analysis based on phylogenetic tree and OTU data produced by WATERS, very similar to Eckburg et al. Fig. 3B. (C) Neighbor-joining phylogenetic tree (Quicktree) representing the sequences analyzed by WATERS, which is clearly similar to Fig. S1 in Eckburg et al.
      [Figure 3 panel labels: PCA axes "Percent variation explained"; tree labels: Bacteroidetes (Bacteroidales), Deltaproteobacteria, Actinobacteria, Verrucomicrobia, Epsilonproteobacteria, Firmicutes (Clostridia, Clostridiales), Gammaproteobacteria, Cyanobacteria, Alphaproteobacteria, Fusobacteria, Firmicutes (Bacilli), Firmicutes (Mollicutes)]
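The WATERS excerpt lists OTU determination among its steps and mentions 97% and 99% defaults. As a hedged illustration of what clustering at a similarity cutoff means, here is a minimal greedy sketch. It is not the WATERS implementation (WATERS bundles existing external tools); it assumes sequences are already aligned to equal length, and the function names and the toy 0.9 threshold are hypothetical choices made so the 17-character slide sequences give a non-trivial result.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(sequences, threshold=0.97):
    """Greedy clustering: each sequence joins the first OTU whose seed it
    matches at or above the threshold; otherwise it seeds a new OTU."""
    otus = []  # list of (seed sequence, member list)
    for seq in sequences:
        for seed, members in otus:
            if identity(seq, seed) >= threshold:
                members.append(seq)
                break
        else:
            otus.append((seq, [seq]))
    return otus


if __name__ == "__main__":
    # Toy reference sequences from the phylotyping slides (B1, B2, E1, E2, A1, A2).
    toy = [
        "ACTGCACCTATCGTTCG",
        "ACTCCACCTATCGTTCG",
        "ACTCCAGCTATCGATCG",
        "ACTCCAGGTATCGATCG",
        "ACCCCAGCTCTCGCTCG",
        "ACCCCAGCTCTGGCTCG",
    ]
    for seed, members in greedy_otus(toy, threshold=0.9):
        print(seed, len(members))
```

Greedy clustering depends on input order, which is one reason real pipelines sort sequences (for example by abundance or length) before clustering.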
  • 27. Phylogenetic Copy # Correction. Kembel SW, Wu M, Eisen JA, Green JL (2012) Incorporating 16S Gene Copy Number Information Improves Estimates of Microbial Diversity and Abundance. PLoS Comput Biol 8(10): e1002743. doi:10.1371/journal.pcbi.1002743. Steven Kembel, Jessica Green
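Read counts over-represent organisms that carry many copies of the 16S gene, which is the motivation for copy number correction. The sketch below shows only the arithmetic of the correction, assuming a per-taxon copy number is already known. Kembel et al. estimate copy number from phylogeny, which is not reproduced here; the taxon names, counts, and copy numbers are hypothetical.

```python
def copy_corrected_abundance(read_counts, copy_numbers):
    """Divide each taxon's read count by its 16S copy number, then
    renormalize so the corrected values sum to 1."""
    corrected = {}
    for taxon, reads in read_counts.items():
        copies = copy_numbers[taxon]
        if copies <= 0:
            raise ValueError(f"copy number for {taxon} must be positive")
        corrected[taxon] = reads / copies
    total = sum(corrected.values())
    return {taxon: value / total for taxon, value in corrected.items()}


if __name__ == "__main__":
    # Hypothetical example: taxon_A yields more reads only because it
    # carries seven 16S copies per genome.
    reads = {"taxon_A": 700, "taxon_B": 300}
    copies = {"taxon_A": 7, "taxon_B": 1}
    print(copy_corrected_abundance(reads, copies))
```

In this toy case the uncorrected read fractions (70% and 30%) reverse after correction, because 700 reads at 7 copies per cell imply fewer cells than 300 reads at 1 copy per cell.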
  • 28. PhylOTU: Finding Metagenomic OTU
      Sharpton TJ, Riesenfeld SJ, Kembel SW, Ladau J, O'Dwyer JP, Green JL, Eisen JA, Pollard KS. (2011) PhylOTU: A High-Throughput Procedure Quantifies Microbial Community Diversity and Resolves Novel Taxa from Metagenomic Data. PLoS Comput Biol 7(1): e1001061. doi:10.1371/journal.pcbi.1001061
      Figure 1. PhylOTU Workflow. Computational processes are represented as squares and databases are represented as cylinders in this generalized workflow of PhylOTU. See Results section for details. doi:10.1371/journal.pcbi.1001061.g001
      (Page text cropped on the slide) [...] alignment used to build the profile, resulting in a multiple [...] PD versus PID clustering, 2) to explore overlap between PhylOT[...]
      Tom Sharpton, Katie Pollard, Jessica Green
  • 29. Beta-Diversity. Jen Hughes Martiny, M. Claire Horner-Devine
      Highlighted on the slide: Our data are consistent with the idea that dispersal limitation at local scales can contribute to β-diversity, even though the 16S rRNA genes of the relatively common taxa are globally distributed.

      Results excerpt (cropped on the slide): [...] a broader range of Proteobacteria, but yielded similar results (Fig. S1 and Tables S2 and S3). Across all samples, we identified 4,931 quality Nitrosomonadales sequences, which grouped into 176 OTUs (operational taxonomic units) using an arbitrary 99% sequence similarity cutoff. This cutoff retained a high amount of sequence diversity, but minimized the chance of including diversity because of sequencing or PCR errors. Most (95%) of the sequences appear closely related either to the marine Nitrosospira-like clade, known to be abundant in estuarine sediments (e.g., ref. 19) or to marine bacterium C-17, classified as Nitrosomonas (20) (Fig. S2). Pairwise community similarity between the samples was calculated based on the presence or absence of each OTU using a rarefied Sørensen’s index (4). Community similarity using this [...] [Nitro]somonadales community similarity. Geographic distance contributed the largest partial regression coefficient (b = 0.40, P < 0.0001), with sediment moisture, nitrate concentration, plant cover, salinity, and air and water temperature contributing to smaller, but significant, partial regression coefficients (b = 0.09–0.17, P < 0.05) (Table 1). Because salt marsh bacteria may be [...]
      Fig. 1. The 13 marshes sampled (see Table S1 for details). Marshes compared with one another within regions are circled. (Inset) The arrangement of sampling points within marshes. Six points were sampled along a 100-m transect, and a seventh point was sampled ∼1 km away. Two marshes in the Northeast United States (outlined stars) were sampled more intensively, along four 100-m transects in a grid pattern.
      Fig. 2. Distance-decay curves for the Nitrosomonadales communities. The dashed, blue line denotes the least-squares linear regression across all spatial scales. The solid lines denote separate regressions within each of the three spatial scales: within marshes, regional (across marshes within regions circled in Fig. 1), and continental (across regions). The slopes of all lines (except the solid light blue line) are significantly less than zero. The slopes of the solid red lines are significantly different from the slope of the all scale (blue dashed) line.

      Drivers of bacterial β-diversity depend on spatial scale
      Jennifer B. H. Martiny (a, 1), Jonathan A. Eisen (b), Kevin Penn (c), Steven D. Allison (a, d), and M. Claire Horner-Devine (e)
      (a) Department of Ecology and Evolutionary Biology, and (d) Department of Earth System Science, University of California, Irvine, CA 92697; (b) Department of Evolution and Ecology, University of California Davis Genome Center, Davis, CA 95616; (c) Center for Marine Biotechnology and Biomedicine, The Scripps Institution of Oceanography, University of California at San Diego, La Jolla, CA 92093; and (e) School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195
      Edited by Edward F. DeLong, Massachusetts Institute of Technology, Cambridge, MA, and approved March 31, 2011 (received for review November 1, 2010). PNAS, May 10, 2011, vol. 108, no. 19, pp. 7850–7854. Section: ECOLOGY.

      Abstract: The factors driving β-diversity (variation in community composition) yield insights into the maintenance of biodiversity on the planet. Here we tested whether the mechanisms that underlie bacterial β-diversity vary over centimeters to continental spatial scales by comparing the composition of ammonia-oxidizing bacteria communities in salt marsh sediments. As observed in studies of macroorganisms, the drivers of salt marsh bacterial β-diversity depend on spatial scale. In contrast to macroorganism studies, however, we found no evidence of evolutionary diversification of ammonia-oxidizing bacteria taxa at the continental scale, despite an overall relationship between geographic distance and community similarity. Our data are consistent with the idea that dispersal limitation at local scales can contribute to β-diversity, even though the 16S rRNA genes of the relatively common taxa are globally distributed. These results highlight the importance of considering multiple spatial scales for understanding microbial biogeography.
      Keywords: microbial composition | distance-decay | Nitrosomonadales | ecological drift

      Biodiversity supports the ecosystem processes upon which society depends (1). Understanding the mechanisms that generate and maintain biodiversity is thus key to predicting ecosystem responses to future environmental changes. The decrease in community similarity with geographic distance is a universal biogeographic pattern observed in communities from all [...] community composition) yield insights into the maintenance of biodiversity. These studies are still relatively rare for microorganisms, however, and thus our understanding of the mechanisms underlying microbial diversity (most of the tree of life) remains limited.
      β-Diversity, and therefore distance-decay patterns, could be driven solely by differences in environmental conditions across space, a hypothesis summed up by microbiologists as, “everything is everywhere, the environment selects” (10). Under this model, a distance-decay curve is observed because environmental variables tend to be spatially autocorrelated, and organisms with differing niche preferences are selected from the available pool of taxa as the environment changes with distance.
      Dispersal limitation can also give rise to β-diversity, as it permits historical contingencies to influence present-day biogeographic patterns. For example, neutral niche models, in which an organism’s abundance is not influenced by its environmental preferences, predict a distance-decay curve (8, 11). On relatively short time scales, stochastic births and deaths contribute to a heterogeneous distribution of taxa (ecological drift). On longer time scales, stochastic genetic processes allow for taxon diversification across the landscape (evolutionary drift). If dispersal is limiting, then current environmental or biotic conditions will not fully explain the distance-decay curve, and thus geographic distance will be correlated with community similarity even after controlling for other factors (2).
      For macroorganisms, the relative contribution of environmental factors or dispersal limitation to β-diversity depends on spatial scale (12). Fifty years ago, Preston (13) noted that the turnover rate (rate of change) of bird species composition across space within a continent is lower than that across continents. He attributed the high turnover rate across continents to evolutionary diversification (i.e., speciation) between faunas as a result of dispersal limitation and the lower turnover rates of bird species within continents as a result of environmental variation.
      Here we investigate whether the mechanisms underlying β-diversity in bacteria also vary by spatial scale. We chose to focus on the ammonia-oxidizing bacteria (AOB), which along with the ammonia-oxidizing archaea (14), perform the rate-limiting step of nitrification and thus play a key role in nitrogen dynamics. We compared AOB community composition in 106 sediment samples from 12 salt marshes on three continents. A partially nested sampling design achieved a relatively balanced distribution of pairwise distance classes over nine orders of magnitude, from 3 cm to 12,500 km (Fig. 1 and Table S1). We limited our sampling to a monophyletic group of bacteria, the AOB within the β-Proteobacteria, and one habitat, salt marshes primarily dominated by cordgrass (Spartina spp.). This approach constrained the pool of total diversity (richness) and kept the environmental and plant variation relatively constant, increasing our ability to identify if dispersal limitation influences AOB composition.
      We then asked two questions: (i) Does bacterial β-diversity (specifically, the slope of the distance-decay curve) vary over [...] vary by spatial scale? Because most bacteria [...] and hardy, we predicted that dispersal lim[...] primarily across continents, resulting in microbial “provinces” (15). At the same tim[...] environmental factors would contribute [...] decay at all scales, resulting in the steepest sl[...] scale as reported in plant and animal comm[...]

      Results and Discussion (cropped on the slide): We characterized AOB community compo[...] Sanger sequencing of 16S rRNA gene reg[...] primer sets. Here we focus on the results f[...] sequences from the order Nitrosomonada[...] primers specific for AOB within the β-Prot[...] The second primer set (18) generated lo[...]

      Article footer (cropped on the slide): Author contributions: J.B.H.M. and M.C.H.-D. designed rese[...] M.C.H.-D. performed research; J.B.H.M., S.D.A., and M.C.H.-D[...] and M.C.H.-D. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. Freely available online through the PNAS open access opti[...] Data deposition: The sequences reported in this paper hav[...] Bank database (accession nos. HQ271472–HQ276885 and H[...] (1) To whom correspondence should be addressed. E-mail: jm[...] This article contains supporting information online at www.[...]1073/pnas.1016308108/-/DCSupplemental. www.pnas.org/cgi/do[...]
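The excerpt describes pairwise community similarity computed from OTU presence or absence with Sørensen's index, and distance-decay curves fitted by least-squares regression. The sketch below illustrates both calculations on made-up data. It omits the rarefaction step and makes no claim about the transformation the paper applied to distance or similarity; the sample data and function names are hypothetical.

```python
def sorensen(otus_a, otus_b):
    """Sørensen similarity from presence/absence: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(otus_a), set(otus_b)
    if not a and not b:
        raise ValueError("at least one sample must contain an OTU")
    return 2 * len(a & b) / (len(a) + len(b))


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical OTU sets for three samples and hypothetical pairwise
    # distances (already on whatever scale the analyst chooses).
    samples = {
        "s1": {"otu1", "otu2", "otu3"},
        "s2": {"otu2", "otu3", "otu4"},
        "s3": {"otu3", "otu4", "otu5"},
    }
    distances = {("s1", "s2"): 1.0, ("s2", "s3"): 1.0, ("s1", "s3"): 2.0}
    xs = list(distances.values())
    ys = [sorensen(samples[a], samples[b]) for a, b in distances]
    print(ys, ols_slope(xs, ys))
```

A negative slope indicates that similarity declines with distance, which is the distance-decay pattern discussed in the excerpt.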
  • 30. Drosophila microbiome. Both natural surveys and laboratory experiments indicate that host diet plays a major role in shaping the Drosophila bacterial microbiome. Laboratory strains provide only a limited model of natural host–microbe interactions. Jenna Lang, Angus Chandler
  • 31. Embracing Diversity 2: Other Genes
  • 32. Culture Independent “Metagenomics”. DNA (three samples). Metagenomics: http://dx.doi.org/10.1016/S1074-5521(98)90108-9. Jo Handelsman
  • 33. Culture Independent “Metagenomics”: Many other genes better than rRNA
      DNA (three samples). Marker genes shown: RecA, RpoB, Rpl4, rRNA, Hsp70, EFTu
      Taxa  Characters
      B1    ACTGCACCTATCGTTCG
      B2    ACTCCACCTATCGTTCG
      E1    ACTCCAGCTATCGATCG
      E2    ACTCCAGGTATCGATCG
      A1    ACCCCAGCTCTCGCTCG
      A2    ACCCCAGCTCTGGCTCG
      New1  ACCCCAGCTCTGCCTCG
      New2  AGGGGAGCTCTGCCTCG
      New3  ACTCCAGCTATCGATCG
      New4  ACTGCACCTATCGTTCG

      Excerpt: Wu and Eisen, Genome Biology 2008, Volume 9, Issue 10, Article R151 (page R151.7). http://genomebiology.com/2008/9/10/R151
      [...] sequences are not conserved at the nucleotide level [29]. As a result, the nr database does not actually contain many more protein marker sequences that can be used as references than those available from complete genome sequences.
      Comparison of phylogeny-based and similarity-based phylotyping. Although our phylogeny-based phylotyping is fully automated, it still requires many more steps than, and is slower than, similarity-based phylotyping methods such as MEGAN [30]. Is it worth the trouble? Similarity-based phylotyping works by searching a query sequence against a reference database such as NCBI nr and deriving taxonomic information from the best matches or 'hits'. When species that are closely related to the query sequence exist in the reference database, similarity-based phylotyping can work well. However, if the reference database is a biased sample or if it contains no closely related species to the query, then the top hits returned could be misleading [31]. Furthermore, similarity-based methods require an arbitrary similarity cut-off value to define the top hits. Because individual bacterial genomes and proteins can evolve at very different rates, a universal cut-off that works under all conditions does not exist. As a result, the final results can be very subjective.
      In contrast, our tree-based bracketing algorithm places the query sequence within the context of a phylogenetic tree and only assigns it to a taxonomic level if that level has adequate sampling (see Materials and methods [below] for details of the algorithm). With the well sampled species Prochlorococcus marinus, for example, our method can distinguish closely related organisms and make taxonomic identifications at the species level. Our reanalysis of the Sargasso Sea data placed 672 sequences (3.6% of the total) within a P. marinus clade. On the other hand, for sparsely sampled clades such as Aquifex, assignments will be made only at the phylum level. Thus, our phylogeny-based analysis is less susceptible to data sampling bias than a similarity-based approach, and it makes [...]
      Figure 3. Major phylotypes identified in Sargasso Sea metagenomic data. The metagenomic data previously obtained from the Sargasso Sea was reanalyzed using AMPHORA and the 31 protein phylogenetic markers. The microbial diversity profiles obtained from individual markers are remarkably consistent. The breakdown of the phylotyping assignments by markers and major taxonomic groups is listed in Additional data file 5.
      [Figure 3 plot: relative abundance (0 to 0.8) by taxonomic group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Unclassified proteobacteria, Bacteroidetes, Chlamydiae, Cyanobacteria, Acidobacteria, Thermotogae, Fusobacteria, Actinobacteria, Aquificae, Planctomycetes, Spirochaetes, Firmicutes, Chloroflexi, Chlorobi, Unclassified bacteria), one series per marker: dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf]
  • 34. Phylosift for Other Marker Genes
      DNA (three samples)
      Taxa  Characters
      B1    ACTGCACCTATCGTTCG
      B2    ACTCCACCTATCGTTCG
      E1    ACTCCAGCTATCGATCG
      E2    ACTCCAGGTATCGATCG
      A1    ACCCCAGCTCTCGCTCG
      A2    ACCCCAGCTCTGGCTCG
      New1  ACCCCAGCTCTGCCTCG
      New2  AGGGGAGCTCTGCCTCG
      Workflow diagram labels:
      Input Sequences: each input sequence scanned against both workflows (rRNA workflow, protein workflow); parallel option
      Search input against references: LAST fast candidate search
      Profile HMMs used to align candidates to reference alignment: hmmalign multiple alignment; Infernal multiple alignment; branch labels "<600 bp" and ">600 bp"
      pplacer phylogenetic placement
      Taxonomic Summaries: Krona plots, number of reads placed for each marker gene
      Sample Analysis & Comparison: Edge PCA, tree visualization, Bayes factor tests
      https://phylosift.wordpress.com
      PeerJ 2:e243 https://dx.doi.org/10.7717/peerj.243
      Aaron Darling, Holly Bik
  • 35. Phylogenetic Binning. Wu et al. 2006 PLoS Biology 4: e188. Baumannia makes vitamins and cofactors; Sulcia makes amino acids. Nancy Moran, Dongying Wu
  • 37. Functional Prediction from Metagenomes
    Taxa Characters
    B1 ACTGCACCTATCGTTCG
    B2 ACTCCACCTATCGTTCG
    E1 ACTCCAGCTATCGATCG
    E2 ACTCCAGGTATCGATCG
    A1 ACCCCAGCTCTCGCTCG
    A2 ACCCCAGCTCTGGCTCG
    New1 ACCCCAGCTCTGCCTCG
    New2 AGGGGAGCTCTGCCTCG
    New3 ACTCCAGCTATCGATCG
    New4 ACTGCACCTATCGTTCG
    [...] inputs of fixed carbon or nitrogen from external sources. As with Leptospirillum group I, both Leptospirillum group II and III have the genes needed to fix carbon by means of the Calvin–Benson–Bassham cycle (using type II ribulose 1,5-bisphosphate carboxylase–oxygenase). All genomes recovered from the AMD system contain formate hydrogenlyase complexes. These, in combination with carbon monoxide dehydrogenase, may be used for carbon fixation via the reductive acetyl coenzyme A (acetyl-CoA) pathway by some, or all, organisms. Given the large number of ABC-type sugar and amino acid transporters encoded in the Ferroplasma type [...]
    Figure 4. Cell metabolic cartoons constructed from the annotation of 2,180 ORFs identified in the Leptospirillum group II genome (63% with putative assigned function) and 1,931 ORFs in the Ferroplasma type II genome (58% with assigned function). The cell cartoons are shown within a biofilm that is attached to the surface of an acid mine drainage stream (viewed in cross-section). Tight coupling between ferrous iron oxidation, pyrite dissolution and acid generation is indicated. Rubisco, ribulose 1,5-bisphosphate carboxylase–oxygenase. THF, tetrahydrofolate.
    Nature, doi:10.1038/nature02340, ©2004 Nature Publishing Group
  • 38. Functional Prediction from Metagenomes
    Taxa Characters
    B1 ACTGCACCTATCGTTCG
    B2 ACTCCACCTATCGTTCG
    E1 ACTCCAGCTATCGATCG
    E2 ACTCCAGGTATCGATCG
    A1 ACCCCAGCTCTCGCTCG
    A2 ACCCCAGCTCTGGCTCG
    New1 ACCCCAGCTCTGCCTCG
    New2 AGGGGAGCTCTGCCTCG
    New3 ACTCCAGCTATCGATCG
    New4 ACTGCACCTATCGTTCG
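To make the Taxa/Characters table concrete, here is a small Python sketch that matches each New sequence to its closest reference by counting mismatched positions (Hamming distance). This is the simplest similarity-based stand-in and not the phylogenetic placement the slides describe. The sequences are copied from the slide; the function names are illustrative.

```python
from typing import Dict, Tuple

# Reference and query sequences as they appear on the slide.
REFERENCES: Dict[str, str] = {
    "B1": "ACTGCACCTATCGTTCG",
    "B2": "ACTCCACCTATCGTTCG",
    "E1": "ACTCCAGCTATCGATCG",
    "E2": "ACTCCAGGTATCGATCG",
    "A1": "ACCCCAGCTCTCGCTCG",
    "A2": "ACCCCAGCTCTGGCTCG",
}
QUERIES: Dict[str, str] = {
    "New1": "ACCCCAGCTCTGCCTCG",
    "New2": "AGGGGAGCTCTGCCTCG",
    "New3": "ACTCCAGCTATCGATCG",
    "New4": "ACTGCACCTATCGTTCG",
}


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_reference(query: str) -> Tuple[str, int]:
    """Return (reference name, distance) for the closest reference sequence."""
    return min(
        ((name, hamming(query, seq)) for name, seq in REFERENCES.items()),
        key=lambda pair: pair[1],
    )


for name, seq in QUERIES.items():
    print(name, nearest_reference(seq))
```

New3 and New4 are identical to E1 and B1, and New1 differs from A2 at one position. New2 is five mismatches from its closest reference, the kind of case where a raw nearest match says little and a tree-based placement is more informative.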
  • 39. PHYLOGENETIC PREDICTION OF GENE FUNCTION
    METHOD: CHOOSE GENE(S) OF INTEREST; IDENTIFY HOMOLOGS; ALIGN SEQUENCES; CALCULATE GENE TREE; OVERLAY KNOWN FUNCTIONS ONTO TREE; INFER LIKELY FUNCTION OF GENE(S) OF INTEREST
    [Diagram: genes 1 to 6 from Species 1, Species 2 and Species 3, grouped into families 1A, 2A, 3A and 1B, 2B, 3B. Panels: EXAMPLE A, EXAMPLE B, ACTUAL EVOLUTION (ASSUMED TO BE UNKNOWN). Branch labels: Duplication, Duplication?, Ambiguous.]
    Based on Eisen, 1998 Genome Res 8: 163-167. Phylogenomics
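The last two steps of the method (overlay known functions onto the tree, infer the likely function) can be sketched in Python as below. The sketch assumes the gene tree is already computed and given as nested tuples. The tree shape, the annotations "function A" and "function B", and the function names are hypothetical illustrations and are not taken from Eisen 1998. The rule used is to take the smallest clade containing the query that has any annotated gene, and report "ambiguous" if its annotations disagree.

```python
from typing import Dict, Optional, Set, Tuple, Union

# A gene tree is either a leaf name or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> Set[str]:
    """Collect all leaf names under a (sub)tree."""
    if isinstance(tree, str):
        return {tree}
    names: Set[str] = set()
    for child in tree:
        names |= leaves(child)
    return names


def predict_function(tree: Tree, known: Dict[str, str], query: str) -> Optional[str]:
    """Infer the query's function from the smallest annotated clade containing it."""
    if isinstance(tree, str):
        return None  # a single leaf carries no neighbours to learn from
    for child in tree:
        if query in leaves(child):
            deeper = predict_function(child, known, query)
            if deeper is not None:
                return deeper  # a smaller clade already gave an answer
            break
    else:
        return None  # query is not in this tree
    functions = {known[n] for n in leaves(tree) if n in known and n != query}
    if not functions:
        return None
    return functions.pop() if len(functions) == 1 else "ambiguous"


# Hypothetical gene tree with two paralogous families and partial annotation.
gene_tree: Tree = (("1A", ("2A", "3A")), ("1B", ("2B", "3B")))
known_functions = {"1A": "function A", "3A": "function A",
                   "1B": "function B", "2B": "function B"}

print(predict_function(gene_tree, known_functions, "2A"))
print(predict_function(gene_tree, known_functions, "3B"))
```

Here 2A inherits its prediction from its sister 3A, and 3B from its sister 2B, which shows how tree position and not overall similarity drives the call. A query that sits outside clades with conflicting annotations comes back as "ambiguous".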
  • 41. Embracing Diversity 4: Organized Reference Data
  • 42. Automated Accurate Genome Tree Lang JM, Darling AE, Eisen JA (2013) Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees and Supermatrices. PLoS ONE 8(4): e62510. doi:10.1371/journal.pone.0062510 Jenna Lang
  • 43. Automated Protein Family Surveys. Figure 1 (panels A, B, C) workflow labels: Representative Genomes; Extract Protein Annotation; All v. All BLAST; Homology Clustering (MCL); SFams; Align & Build HMMs; HMMs; Screen for Homologs; New Genomes; Extract Protein Annotation. Tom Sharpton, Katie Pollard. http://www.biomedcentral.com/1471-2105/13/264
  • 44. Genomes Poorly Sampled Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
  • 45. 2002: TIGR Tree of Life Project Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree Naomi Ward Karen Nelson
  • 46. Genomic Encyclopedia of Bacteria & Archaea Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
  • 48. GEBA Cyanobacteria. Shih et al. 2013. PNAS 10.1073/pnas.1217107110. Fig. 2. Implications on plastid evolution. (A) Maximum-likelihood phylogenetic tree of plastids and cyanobacteria, grouped by subclades (Fig. 1). The red dot [...] [Tree labels: subclades A, B1, B2, B3, C1, C2, C3, D, E, F, G; Paulinella, Glaucophyte, Green, Red, Chromalveolates; scale bar 0.3.] Cheryl Kerfeld
  • 49. Haloarchaeal GEBA-like Lynch et al. (2012) PLoS ONE 7(7): e41389. doi:10.1371/journal.pone.0041389 Erin Lynch
  • 50. The Dark Matter of Biology From Wu et al. 2009 Nature 462, 1056-1060
  • 51. JGI Dark Matter Project. Workflow: environmental samples (n=9) → isolation of single cells (n=9,600) → whole genome amplification (n=3,300) → SSU rRNA gene based identification (n=2,000) → genome sequencing, assembly and QC (n=201) → draft genomes (n=201). Sampling site codes: SAK, HSM, ETL, TG, HOT, GOM, GBS, EPR, TA. Sample types: seawater, brackish/freshwater, hydrothermal, sediment, bioreactor. [Phylogenetic tree labels:] GN04, WS3 (Latescibacteria), GN01, LD1, WS1, Poribacteria, BRC1, Lentisphaerae, Verrucomicrobia, OP3 (Omnitrophica), Chlamydiae, Planctomycetes, NKB19 (Hydrogenedentes), WYO, Armatimonadetes, WS4, Actinobacteria, Gemmatimonadetes, NC10, SC4, WS2, Cyanobacteria, WPS-2, Deltaproteobacteria, EM19 (Calescamantes), Fervidibacteria
  • 52. GAL35, Aquificae, EM3, Thermotogae, Dictyoglomi, SPAM, GAL15, CD12 (Aerophobetes), OP8 (Aminicenantes), AC1, SBR1093, Thermodesulfobacteria, Deferribacteres, Synergistetes, OP9 (Atribacteria), WPS-2, Caldiserica, AD3, Chloroflexi, Acidobacteria, Elusimicrobia, Nitrospirae, 49S1 2B, Caldithrix, GOUTA4, Marinimicrobia
  • 53. Chlorobi, Firmicutes, Tenericutes, Fusobacteria, Chrysiogenetes, Proteobacteria, Fibrobacteres, TG3, Spirochaetes, WWE1 (Cloacamonetes), ZB3, Deinococcus-Thermus, OP1 (Acetothermia), Bacteriodetes, TM7, GN02 (Gracilibacteria), SR1, BH1, OD1 (Parcubacteria), OP11 (Microgenomates), Euryarchaeota, Micrarchaea, DSEG (Aenigmarchaea), Nanohaloarchaea, Nanoarchaea, Cren MCG, Thaumarchaeota, Cren C2, Aigarchaeota, Cren pISA7, Cren Thermoprotei, Korarchaeota, pMC2A384 (Diapherotrites). BACTERIA, ARCHAEA. [Panel labels:] archaeal toxins (Nanoarchaea); lytic murein transglycosylase; murein (peptidoglycan); stringent response (Diapherotrites, Nanoarchaea): ppGpp, SpoT, RelA, DksA, limiting amino acids, limiting phosphate, fatty acids, carbon, iron, expression of components for stress response; sigma factor (Diapherotrites, Nanoarchaea), RNA polymerase; oxidoreductase (e- donor, e- acceptor, NAD); HGT from Eukaryotes (Nanoarchaea); archaeal type purine synthesis (Microgenomates): PurF, PurD, PurN, PurL/Q, PurM, PurK, PurE, PurB, PurP, PRPP, IMP, adenine, guanine; growing AA chain, tRNA(Gly)
  • 54. recognizes UGA; mRNA; UGA recoded for Gly (Gracilibacteria); ribosome. Woyke et al. Nature 2013. Tanja Woyke
  • 55. Embracing Diversity 5: Public Participation
  • 56. The Rise of Citizen Microbiology Darlene Cavalier
  • 58. Embracing Diversity 6: Diversity in STEM
  • 60. Diversity in STEM Jo Handelsman
  • 61. Diversity in STEM #DoSomething
  • 62. Don’t Just Sit There #DoSomething
  • 63. Acknowledgements. DOE JGI Sloan GBMF NSF DHS DARPA. Aaron Darling, Lizzy Wilbanks, Jenna Lang, Russell Neches, Rob Knight, Jack Gilbert, Tanja Woyke, Rob Dunn, Katie Pollard, Jessica Green, Darlene Cavalier, Eddy Rubin, Wendy Brown, Dongying Wu, Phil Hugenholtz, DSMZ, Sundar, Srijak Bhatnagar, David Coil, Alex Alexiev, Hannah Holland-Moritz, Holly Bik, John Zhang, Holly Menninger, Guillaume Jospin, David Lang, Cassie Ettinger, Tim Harkins, Jennifer Gardy, Holly Ganz