SlideShare une entreprise Scribd logo
1  sur  18
Télécharger pour lire hors ligne
Genome analysis of multiple pathogenic isolates
of Streptococcus agalactiae: Implications
for the microbial ‘‘pan-genome’’
Herve´ Tettelina,b, Vega Masignanib,c, Michael J. Cieslewiczb,d,e, Claudio Donatic, Duccio Medinic, Naomi L. Warda,f,
Samuel V. Angiuolia, Jonathan Crabtreea, Amanda L. Jonesg, A. Scott Durkina, Robert T. DeBoya, Tanja M. Davidsena,
Marirosa Morac
, Maria Scarsellic
, Immaculada Margarit y Rosc
, Jeremy D. Petersona
, Christopher R. Hausera
,
Jaideep P. Sundarama, William C. Nelsona, Ramana Madupua, Lauren M. Brinkaca, Robert J. Dodsona, Mary J. Rosovitza,
Steven A. Sullivana, Sean C. Daughertya, Daniel H. Hafta, Jeremy Selenguta, Michelle L. Gwinna, Liwei Zhoua,
Nikhat Zafara, Hoda Khouria, Diana Radunea, George Dimitrova, Kisha Watkinsa, Kevin J. B. O’Connorh, Shannon Smithi,
Teresa R. Utterbacki
, Owen Whitea
, Craig E. Rubensg
, Guido Grandic
, Lawrence C. Madoffe,j
, Dennis L. Kaspere,j
,
John L. Telfordc, Michael R. Wesselsd,e, Rino Rappuolic,k,l, and Claire M. Frasera,b,k,m
aInstitute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850; cChiron Vaccines, Via Fiorentina 1, 53100 Siena, Italy; dDivision
of Infectious Diseases, Children’s Hospital, 300 Longwood Avenue, Boston, MA 02115; eHarvard Medical School, Boston, MA 02115; fCenter of Marine
Biotechnology, University of Maryland Biotechnology Institute, 701 East Pratt Street, Baltimore, MD 21202; gChildren’s Hospital and Regional Medical
Center, 307 Westlake Avenue N, Seattle, WA 98109; hThe Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218; iJ. Craig Venter
Institute, 5 Research Place, Rockville, MD 20850; jChanning Laboratory, Brigham and Women’s Hospital, 181 Longwood Avenue, Boston, MA 02115;
and mGeorge Washington University Medical Center, 2300 Eye Street NW, Washington, DC 20037
Contributed by Rino Rappuoli, August 5, 2005
The development of efficient and inexpensive genome sequencing
methods has revolutionized the study of human bacterial patho-
gens and improved vaccine design. Unfortunately, the sequence of
a single genome does not reflect how genetic variability drives
pathogenesis within a bacterial species and also limits genome-
wide screens for vaccine candidates or for antimicrobial targets.
We have generated the genomic sequence of six strains represent-
ing the five major disease-causing serotypes of Streptococcus
agalactiae, the main cause of neonatal infection in humans. Anal-
ysis of these genomes and those available in databases showed
that the S. agalactiae species can be described by a pan-genome
consisting of a core genome shared by all isolates, accounting for
Ϸ80% of any single genome, plus a dispensable genome consisting
of partially shared and strain-specific genes. Mathematical extrap-
genome sequence of the type Ia strain A909 and draft genome
sequences (8ϫ sequence coverage) of five additional strains, rep-
resenting the five major serotypes. Comparative analysis of the six
newly sequenced genomes and the two genomes already available
in the databases suggests that a bacterial species can be described
by its ‘‘pan-genome’’ (pan, from the Greek word ␲␣␯, meaning
whole), which includes a core genome containing genes present in
all strains and a dispensable genome composed of genes absent
from one or more strains and genes that are unique to each strain.
Surprisingly, unique genes were still detected after eight genomes
were sequenced, and mathematical extrapolation predicts that new
genes will still be found after sequencing many more strains. Thus,
the genomes of multiple, independent isolates are required to
understand the global complexity of bacterial species. Analysis of
In 1987, it was proposed (1) that bacterial strains showing 70% DNA DNA
reassociation and sharing characteristic phenotypic traits should be considered to be
strains of the same species.
Thus far, the genome sequence of one or two strains for each species has provided
unprecedented information; however, the question of how many genomes are
necessary to fully describe a bacterial species has yet to be asked.
Comparative analysis of the six newly sequenced genomes and the two genomes
already available in the databases suggests that a bacterial species can be
described by its ‘‘pan-genome’’ (pan, from the Greek word , meaning whole),
which includes a core genome containing genes present in all strains and a
dispensable genome composed of genes absent from one or more strains and
genes that are unique to each strain. Surprisingly, unique genes were still
detected after eight genomes were sequenced, and mathematical extrapolation
predicts that new genes will still be found after sequencing many more strains.
Methods
• Sequence

• Assemble

• Annotate

• Compare

• Modeling
rma-
erial
s for
ever,
fully
have
ns of
alac-
born
derly
ibed;
tates
V (5).
I and
gene
plete
Sequencing, Annotation, and Unfinished Genomes. Genome se-
quences were generated by the whole-genome shotgun sequenc-
ing approach (13, 14). Draft genomes were sequenced to 8ϫ-
Freely available online through the PNAS open access option.
Abbreviations: GBS, group B Streptococcus; MLST, multilocus sequence typing; GAS, group
A Streptococcus.
Data deposition: The sequences reported in this paper have been deposited in the DDBJ͞
EMBL͞GenBank database [accession nos. AAJO01000000 (18RS21), AAJP01000000 (515),
AAJQ01000000 (CJB111), AAJR01000000 (COH1), AAJS01000000 (H36B), and CP000114
(A909)].
bH.T., V.M., and M.J.C. contributed equally to this work.
kR.R. and C.M.F. contributed equally to this work.
lTo whom correspondence should be addressed. E-mail: rino࿝rappuoli@chiron.com.
© 2005 by The National Academy of Sciences of the USA
www.pnas.org͞cgi͞doi͞10.1073͞pnas.0506758102
sequence coverage, and the sequences were assembled by using
the Celera Assembler (Celera Genomics, Rockville, MD) (15).
Contigs were ordered and oriented according to their alignment
to strain 2603V͞R by using PROMER (16). Ordered matching
contigs were pasted together into a pseudochromosome, and
nonmatching contigs were tacked on the end in random order.
In the pseudochromosome, contigs were separated by the se-
quence NNNNNCATTCCATTCATTAATTAATTAATG-
AATGAATGNNNNN, which (i) generates a stop codon in all
six reading frames so that no gene is predicted across junctions
and (ii) provides a start site in all frames, pointing toward contigs
to predict incomplete genes at their extremities. ORFs were
predicted and annotated by using an automated pipeline that
combines GLIMMER gene prediction (17, 18), ORF and non-ORF
feature identification, and assignment of functional role catego-
ries to genes (14). Assembly of strain 18RS21 resulted in a higher
number of contigs than for the other unfinished genomes,
leading to the prediction of Ͼ3,500 genes. Many small contigs did
not harbor protein-coding genes, and several were fragments of
rRNAs or coded for tRNAs or structural RNAs.
ortholog
contigs
draft ge
NUCMER
drawn (
In Fig
follows:
and from
not shar
merged
and (iv)
noted th
Genomi
␣-galact
region
modifica
regions
and the
transfer
in strain
leading to the prediction of Ͼ3,500 genes. Many small contigs did
not harbor protein-coding genes, and several were fragments of
rRNAs or coded for tRNAs or structural RNAs.
Shared and Strain-Specific Genes. Each strain pair was compared by
means of the following: (i) a Smith and Waterman protein search
on all of the predicted proteins by using the SSEARCH program
(version 3.4) (19, 20); (ii) a DNA search of all of the predicted
ORFs of a strain against the complete DNA sequence of the
other strain, by using the FASTA program (version 3.4) (20); and
(iii) a translated protein search of all of the predicted proteins
of a strain against the complete DNA sequence of the other
strain, by using the TFASTY program (version 3.4) (20). A gene
was considered conserved if at least one of these three methods
produced an alignment with a minimum of 50% sequence
conservation over 50% of the protein͞gene length.
Core-Genome and Pan-Genome Extrapolation. The number of genes
shared by all GBS isolates and the number of strain-specific
Core-Genome and Pan-Genome Extrapolation. The number of genes
shared by all GBS isolates and the number of strain-specific
genes depend on how many strains are taken into account. The
sequential inclusion of up to eight strains was simulated in all
possible combinations. The number (N) of independent mea-
surements of the shared (see Fig. 2) and strain-specific genes (see
Fig. 3) present in the nth genome is N ϭ 8!͞[(n Ϫ 1)!⅐(8 Ϫ n)!].
The size of the species core genome and the number of strain-
specific genes for a large number of sequenced strains were
extrapolated by fitting the exponential decaying functions Fc ϭ
␬c exp[Ϫn͞␶c] ϩ ⍀ and Fs ϭ ␬s exp[Ϫn͞␶s] ϩ tg(␪), respectively,
to the amount of conserved genes (see Fig. 2) and of strain-
specific genes (see Fig. 3), where n is the number of sequenced
strains and ␬c, ␬s, ␶c, ␶s, ⍀, and tg(␪) are free parameters. tg(␪)
represents the extrapolated rate of growth of the pan-genome
size, P(n), as a greater number of independent GBS strain
sequences become available, i.e.,
lim͓P͑n͔͒
n3 ϱ
Ϸ tg͑␪͒⅐n.
The Inset of Fig. 3 displays the measured size of the pan-genome
as a function of n [in this case, N ϭ 8!͞(8 Ϫ n)!; points are
obtained for each value of n] together with a plot of the
calculated P(n) (see Supporting Text, which is published as
supporting information on the PNAS web site).
of the whole
regions of aty
Results
General Featur
coverage) wer
and CJB111, b
and V, which
the United St
strain A909 of
515, both belo
different sequ
sequence typi
genetic divers
published as su
six newly seq
2603V͞R and
used for subse
genome sizes
entire nucleot
the GBS strai
nations with N
ranged from 8
and noncodin
Fig. 1 summ
ison of the eig
gene synteny i
calculated P(n) (see Supporting Text, which is published as
supporting information on the PNAS web site).
Synteny. Paralog clusters in each genome were generated by
using the Jaccard algorithm (21), with Ն 80% identity, and a
Jaccard coefficient Ն 0.6. Members of paralog clusters were then
organized into ortholog clusters by allowing any member of a
paralog cluster to contribute to the reciprocal best matches used
to construct the ortholog clusters. Syntenic blocks are defined as
a set of five or more consecutive pairs of genes from the same
Tettelin et al.
ortholog cluster. Because they do not participate in clusters, all
contigs that do not contain protein-coding genes from the five
draft genomes were searched against all genomes by using the
NUCMER program (16). Syntenic blocks and NUCMER results were
drawn (Fig. 1) by using SYBIL (http:͞͞sybil.sourceforge.net͞).
In Fig. 1, genomic islands of diversity Ͼ5 kb are predicted as
follows: (i) strains are inspected from the top panel and down
and from left to right on each panel; (ii) regions of at least 1 kb
not shared with another strain are identified; (iii) regions are
and
er.
se-
G-
all
ons
igs
ere
hat
RF
go-
her
mes,
did
of
by
rch
am
ted
the
drawn (Fig. 1) by using SYBIL (http:͞͞sybil.sourceforge.net͞).
In Fig. 1, genomic islands of diversity Ͼ5 kb are predicted as
follows: (i) strains are inspected from the top panel and down
and from left to right on each panel; (ii) regions of at least 1 kb
not shared with another strain are identified; (iii) regions are
merged into single islands if they are within 5 kb of each other;
and (iv) resulting islands Ͼ5 kb are considered. It should be
noted that some islands are composed of more than one contig.
Genomic islands discussed in the text are the following: the
␣-galactosidase region in strain H36B, island 7.4; the prophage
region in strain H36B, island 7.5; the DNA restriction͞
modification system in strain 515, part of island 4.5; the Tn916
regions in strains 2603V͞R, 515, CJB111, and COH1, islands 1.8
and the left side of 5.3; and serine-rich protein and glycosyl-
transferases flanked by cell-wall-anchored proteins and sortases
in strain COH1, unnumbered region between islands 6.5 and
1.15. Fig. 1 reveals many non-protein-coding regions in strain
18RS21 that display NUCMER matches elsewhere in the 18RS21
genome. Most of these regions correspond to fragments of
rRNAs, tRNAs, or structural RNAs, all of which exhibit an
expected atypical nucleotide composition.
1.15. Fig. 1 reveals many non-protein-coding regions in strain
18RS21 that display NUCMER matches elsewhere in the 18RS21
genome. Most of these regions correspond to fragments of
rRNAs, tRNAs, or structural RNAs, all of which exhibit an
expected atypical nucleotide composition.
␹2 Analysis. Regions of atypical nucleotide composition were
identified by the ␹2 analysis; the distribution of all 64 trinucle-
otides (3mers) was computed for the complete genome in all six
reading frames, followed by the 3mer distribution in 5,000-bp
windows. Windows overlapped by 500 bp. For each window, the
␹2 statistic on the difference between its 3mer content and that
of the whole genome was computed. Peaks in Fig. 1 indicate
regions of atypical nucleotide composition.
Results
General Features of GBS Genomes. Draft genome sequences (8ϫ
coverage) were obtained for strains 515, H36B, 18RS21, COH1,
and CJB111, belonging, respectively, to serotypes Ia, Ib, II, III,
Fig. 1. Whole genome alignment of GBS strains. The eight genomes are compared to each
other by using COG (41) and NUCMER analyses (see Materials and Methods). Each
genome (shaded strain name) is colored with a gradient that ranges from yellow (nucleotide
1) to blue (end). Differences in color between a reference sequence (the last colored line in
each genome) and the other genomes indicate conserved protein-coding regions that have
been rearranged. Uncolored segments denote coding regions in which no conserved genes
were detected. NUCMER matches for contigs that do not contain protein-coding genes are
displayed by red blocks (matches within the reference strain are displayed on the line directly
above it). Genomic islands of diversity are boxed and numbered ‘‘x.y,’’ where x is the panel
or strain number where the island first appeared and y is the island location in that genome
from left to right. A indicates an island that was not identified in a previous genome. Islands
that overlap by at least 50% (based on the number of shared genes) with previously identified
islands receive the same number as the initial island. The gene content of the 69 islands
identified is listed in Table 2, which is published as supporting information on the PNAS web
site. Strain-specific regions, free of COG or NUCMER matches, are displayed in black at the
bottom of each panel. Portions of these regions that harbor protein-coding genes are indicated
in gray below the black blocks. The curves on top of each panel represent the nucleotide
composition ( 2 analysis) (see Materials and Methods) of the reference strain of the panel,
and peaks indicate regions of atypical composition.
with the numbe
verify whether
repeated the an
of Streptococcu
strains of Bac
levels of genom
Fig. 2. GBS core genome. The number of shared genes is plotted as a
function of the number n of strains sequentially added (see Materials and
Methods). For each n, circles are the 8!͞[(n Ϫ 1)!⅐(8 Ϫ n)!] values obtained for
the different strain combinations. Squares are the averages of such values. The
continuous curve represents the least-squares fit of the function Fc ϭ
␬c exp[Ϫn͞␶c] ϩ ⍀ (see Eq. 1 in Supporting Text) to data. The best fit was
obtained with correlation r2 ϭ 0.990 for ␬c ϭ 610 Ϯ 38, ␶c ϭ 2.16 Ϯ 0.28, and
⍀ ϭ 1,806 Ϯ 16. The extrapolated GBS core genome size ⍀ is shown as a dashed
line.
Fig. 4. Dendro
was used to cluste
the program CLUS
profiles of presen
and used as inpu
drawing and boo
Serotypes and M
the B. anthrac
Genome Diver
phenotypic tra
the capsular p
widely used to
information h
therapy. Rece
conserved cor
correlate with
terize the gen
isolates, a den
genes across t
two belong to
(NEM316 and
Furthermore,
(ST23). Comp
7, which is pub
site) showed th
type often sha
serotype, resu
strains. In sup
at the nucleot
related seroty
the least cons
whereas the tw
identity over 9
(type V and I
Fig. 3. GBS pan-genome. The number of specific genes is plotted as a
function of the number n of strains sequentially added (see Materials and
Methods). For each n, circles are the 8!͞[(n Ϫ 1)!⅐(8 Ϫ n)!] values obtained for
the different strain combinations; squares are the averages of such values. The
blue curve is the least-squares fit of the function Fs(n) ϭ ␬s exp[Ϫn͞␶s] ϩ tg(␪)
(see Eq. 2 in Supporting Text) to the data. The best fit was obtained with
correlation r2 ϭ 0.995 for ␬s ϭ 476 Ϯ 62, ␶s ϭ 1.51 Ϯ 0.15, and tg(␪) ϭ 33 Ϯ 3.5.
The extrapolated average number tg(␪) of strain-specific genes is shown as a
dashed line. (Inset) Size of the GBS pan-genome as a function of n. The red
curve is the calculated pan-genome size
Fig. 4. Dendrogram of the eight GBS genomes. Shared gene information
was used to cluster proteins into groups by using the single-linkage method of
the program CLUSTER (http:͞͞rana.lbl.gov). Groups were then converted into
profiles of presence or absence of each gene (0 or 1) in the eight GBS strains
and used as input to PAUP* 4.0b10 (Sinauer, Sunderland, MA) for dendrogram
drawing and bootstrapping. Numbers at the nodes indicate bootstrap values.
Serotypes and MLST types of each strain are within parentheses.
Tettelin et al. Figure 5
Distribution of shared and unique genes sorted by functional categories. Groups of shared genes and strain-specific genes were
sorted by broad functional categories as follows: housekeeping functions, cell envelope proteins, regulatory functions, transport and
binding proteins, mobile and extrachromosomal elements, proteins of unknown function, and hypothetical proteins. The total
number of genes in each of these categories was counted and displayed for groups shared by all eight strains, groups shared by any
combination of seven to two strains, and strain-specific genes.
Implications for Bacterial Taxonomy. Methods commonly used to
define bacterial species (DNA⅐DNA reassociation, 16S rRNA
typing, MLST, etc.) rely mostly on features associated with the core
genome (40). Our work confirms that the essence of the species is
linked to the core genome. However, the majority of the genetic
traits linked to virulence, capsular serotype, adaptation, and anti-
biotic resistance pertain to the dispensable genome. Therefore,
sequencing of multiple strains is necessary to understand the
virulence of pathogenic bacteria and to provide a more consistent
definition of the species itself. We identified species with an open
pan-genome, such as GBS and GAS, and species with a closed
pan-genome, such as B. anthracis. Nevertheless, a different inter-
pretation of the same data may lead to the conclusion that the
present definition of bacterial species is inconsistent because, in
reality, only species with an open pan-genome are species, whereas
B. anthracis is not a true genetic species on its own, but only a clone
of Bacillus cereus, with very distinctive phenotypic traits provided by
the acquisition of the virulence plasmid coding for the anthrax
toxin.
Concluding Comment. Our data clearly show that the strategy to
sequence one or two genomes per species, which has been used
during the first decade of the genomic era, is not sufficient and that
multiple strains need to be sequenced to understand the basics of
bacterial species. The methods presently used to evaluate the
species diversity, such as complete genome hybridization and
MLST, can explain only the presence, absence, and variability of the
genetic loci that are already known and do not provide information
on the genes that are not present in the reference genome. Our work
provides a clear demonstration that, by these approaches, we fail to
include in the analysis the entire dispensable genome, the size of
which can be vastly larger than the core genome. Our work on the
protein-based vaccine against GBS has shown that this is not just a
theoretical disadvantage but has very important practical conse-
quences because a universal vaccine is possible only by including
dispensable genes (8).

Contenu connexe

Tendances

Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...Jonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 5
Microbial Phylogenomics (EVE161) Class 5Microbial Phylogenomics (EVE161) Class 5
Microbial Phylogenomics (EVE161) Class 5Jonathan Eisen
 
American Gut Project presentation at Masaryk University
American Gut Project presentation at Masaryk UniversityAmerican Gut Project presentation at Masaryk University
American Gut Project presentation at Masaryk Universitymcdonadt
 
UC Davis EVE161 Lecture 14 by @phylogenomics
UC Davis EVE161 Lecture 14 by @phylogenomicsUC Davis EVE161 Lecture 14 by @phylogenomics
UC Davis EVE161 Lecture 14 by @phylogenomicsJonathan Eisen
 
Phylogenomic Case Studies: The Benefits (and Occasional Drawbacks) of Integra...
Phylogenomic Case Studies: The Benefits (and Occasional Drawbacks) of Integra...Phylogenomic Case Studies: The Benefits (and Occasional Drawbacks) of Integra...
Phylogenomic Case Studies: The Benefits (and Occasional Drawbacks) of Integra...Jonathan Eisen
 
Evolution of microbiomes and the evolution of the study and politics of micro...
Evolution of microbiomes and the evolution of the study and politics of micro...Evolution of microbiomes and the evolution of the study and politics of micro...
Evolution of microbiomes and the evolution of the study and politics of micro...Jonathan Eisen
 
EVE 161 Winter 2018 Class 15
EVE 161 Winter 2018 Class 15EVE 161 Winter 2018 Class 15
EVE 161 Winter 2018 Class 15Jonathan Eisen
 
Jonathan Eisen talk for 2019 ADVANCE Scholar Award Symposium
Jonathan Eisen talk for 2019 ADVANCE Scholar Award SymposiumJonathan Eisen talk for 2019 ADVANCE Scholar Award Symposium
Jonathan Eisen talk for 2019 ADVANCE Scholar Award SymposiumJonathan Eisen
 
A Field Guide to Sars-CoV-2
A Field Guide to Sars-CoV-2A Field Guide to Sars-CoV-2
A Field Guide to Sars-CoV-2Jonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: MetagenomicsMicrobial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: MetagenomicsJonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics
Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics
Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics Jonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups Jonathan Eisen
 
EveMicrobial Phylogenomics (EVE161) Class 9
EveMicrobial Phylogenomics (EVE161) Class 9EveMicrobial Phylogenomics (EVE161) Class 9
EveMicrobial Phylogenomics (EVE161) Class 9Jonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from UnculturedMicrobial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from UnculturedJonathan Eisen
 
Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...
Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...
Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...QIAGEN
 
The Seagrass Microbiome Project
The Seagrass Microbiome Project The Seagrass Microbiome Project
The Seagrass Microbiome Project Jonathan Eisen
 
UC Davis EVE161 Lecture 9 by @phylogenomics
UC Davis EVE161 Lecture 9 by @phylogenomicsUC Davis EVE161 Lecture 9 by @phylogenomics
UC Davis EVE161 Lecture 9 by @phylogenomicsJonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of LifeMicrobial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of LifeJonathan Eisen
 
UC Davis EVE161 Lecture 15 by @phylogenomics
UC Davis EVE161 Lecture 15 by @phylogenomicsUC Davis EVE161 Lecture 15 by @phylogenomics
UC Davis EVE161 Lecture 15 by @phylogenomicsJonathan Eisen
 

Tendances (20)

Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
 
Microbial Phylogenomics (EVE161) Class 5
Microbial Phylogenomics (EVE161) Class 5Microbial Phylogenomics (EVE161) Class 5
Microbial Phylogenomics (EVE161) Class 5
 
American Gut Project presentation at Masaryk University
American Gut Project presentation at Masaryk UniversityAmerican Gut Project presentation at Masaryk University
American Gut Project presentation at Masaryk University
 
UC Davis EVE161 Lecture 14 by @phylogenomics
UC Davis EVE161 Lecture 14 by @phylogenomicsUC Davis EVE161 Lecture 14 by @phylogenomics
UC Davis EVE161 Lecture 14 by @phylogenomics
 
Phylogenomic Case Studies: The Benefits (and Occasional Drawbacks) of Integra...
Phylogenomic Case Studies: The Benefits (and Occasional Drawbacks) of Integra...Phylogenomic Case Studies: The Benefits (and Occasional Drawbacks) of Integra...
Phylogenomic Case Studies: The Benefits (and Occasional Drawbacks) of Integra...
 
Evolution of microbiomes and the evolution of the study and politics of micro...
Evolution of microbiomes and the evolution of the study and politics of micro...Evolution of microbiomes and the evolution of the study and politics of micro...
Evolution of microbiomes and the evolution of the study and politics of micro...
 
EVE 161 Winter 2018 Class 15
EVE 161 Winter 2018 Class 15EVE 161 Winter 2018 Class 15
EVE 161 Winter 2018 Class 15
 
Jonathan Eisen talk for 2019 ADVANCE Scholar Award Symposium
Jonathan Eisen talk for 2019 ADVANCE Scholar Award SymposiumJonathan Eisen talk for 2019 ADVANCE Scholar Award Symposium
Jonathan Eisen talk for 2019 ADVANCE Scholar Award Symposium
 
A Field Guide to Sars-CoV-2
A Field Guide to Sars-CoV-2A Field Guide to Sars-CoV-2
A Field Guide to Sars-CoV-2
 
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: MetagenomicsMicrobial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
 
Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics
Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics
Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics
 
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
 
EveMicrobial Phylogenomics (EVE161) Class 9
EveMicrobial Phylogenomics (EVE161) Class 9EveMicrobial Phylogenomics (EVE161) Class 9
EveMicrobial Phylogenomics (EVE161) Class 9
 
EVE 161 Lecture 6
EVE 161 Lecture 6EVE 161 Lecture 6
EVE 161 Lecture 6
 
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from UnculturedMicrobial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
 
Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...
Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...
Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...
 
The Seagrass Microbiome Project
The Seagrass Microbiome Project The Seagrass Microbiome Project
The Seagrass Microbiome Project
 
UC Davis EVE161 Lecture 9 by @phylogenomics
UC Davis EVE161 Lecture 9 by @phylogenomicsUC Davis EVE161 Lecture 9 by @phylogenomics
UC Davis EVE161 Lecture 9 by @phylogenomics
 
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of LifeMicrobial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
 
UC Davis EVE161 Lecture 15 by @phylogenomics
UC Davis EVE161 Lecture 15 by @phylogenomicsUC Davis EVE161 Lecture 15 by @phylogenomics
UC Davis EVE161 Lecture 15 by @phylogenomics
 

Similaire à EVE 161 Winter 2018 Class 13

ASHG 2015 - Redundant Annotations in Tertiary Analysis
ASHG 2015 - Redundant Annotations in Tertiary AnalysisASHG 2015 - Redundant Annotations in Tertiary Analysis
ASHG 2015 - Redundant Annotations in Tertiary AnalysisJames Warren
 
A Comparative Encyclopedia Of DNA Elements In The Mouse Genome
A Comparative Encyclopedia Of DNA Elements In The Mouse GenomeA Comparative Encyclopedia Of DNA Elements In The Mouse Genome
A Comparative Encyclopedia Of DNA Elements In The Mouse GenomeDaniel Wachtel
 
2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM
2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM
2013_CarterEtal_MultiplexPCR-Cronobacter_ AEMMonica Pava-Ripoll
 
J. Clin. Microbiol.-2014-Davidson-JCM.01144-14
J. Clin. Microbiol.-2014-Davidson-JCM.01144-14J. Clin. Microbiol.-2014-Davidson-JCM.01144-14
J. Clin. Microbiol.-2014-Davidson-JCM.01144-14PreveenRamamoorthy
 
An Evolutionary and Structural Analysis of the Connective Tissue Growth Facto...
An Evolutionary and Structural Analysis of the Connective Tissue Growth Facto...An Evolutionary and Structural Analysis of the Connective Tissue Growth Facto...
An Evolutionary and Structural Analysis of the Connective Tissue Growth Facto...Ashley Kennedy
 
Next generation seqencing tecnologies and application vegetable crops
Next generation seqencing tecnologies and application vegetable cropsNext generation seqencing tecnologies and application vegetable crops
Next generation seqencing tecnologies and application vegetable cropsPulipati Gangadhara Rao
 
Gardner and Song_2015_Genetics in Medicine
Gardner and Song_2015_Genetics in MedicineGardner and Song_2015_Genetics in Medicine
Gardner and Song_2015_Genetics in MedicineAmanda Natalizio
 
Utility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogeneticUtility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogeneticEdizonJambormias2
 
How to transform genomic big data into valuable clinical information
How to transform genomic big data into valuable clinical informationHow to transform genomic big data into valuable clinical information
How to transform genomic big data into valuable clinical informationJoaquin Dopazo
 
A common rejection module (CRM) for acute rejection across multiple organs
A common rejection module (CRM) for acute rejection across multiple organsA common rejection module (CRM) for acute rejection across multiple organs
A common rejection module (CRM) for acute rejection across multiple organsKevin Jaglinski
 
EVE 161 Winter 2018 Class 16
EVE 161 Winter 2018 Class 16EVE 161 Winter 2018 Class 16
EVE 161 Winter 2018 Class 16Jonathan Eisen
 
Genome responses of trypanosome infected cattle
Genome responses of trypanosome infected cattleGenome responses of trypanosome infected cattle
Genome responses of trypanosome infected cattleLaurence Dawkins-Hall
 
Global Gene Expression Profiles from Breast Tumor Samples using the Ion Ampli...
Global Gene Expression Profiles from Breast Tumor Samples using the Ion Ampli...Global Gene Expression Profiles from Breast Tumor Samples using the Ion Ampli...
Global Gene Expression Profiles from Breast Tumor Samples using the Ion Ampli...Thermo Fisher Scientific
 
Transcriptomics and metabolomics
Transcriptomics and metabolomicsTranscriptomics and metabolomics
Transcriptomics and metabolomicsSukhjinder Singh
 
importance of pathogenomics in plant pathology
importance of pathogenomics in plant pathologyimportance of pathogenomics in plant pathology
importance of pathogenomics in plant pathologyvinay ju
 
Assessment of immunomolecular_expression_and_prognostic_role_of_tlr7_among_pa...
Assessment of immunomolecular_expression_and_prognostic_role_of_tlr7_among_pa...Assessment of immunomolecular_expression_and_prognostic_role_of_tlr7_among_pa...
Assessment of immunomolecular_expression_and_prognostic_role_of_tlr7_among_pa...dr.Ihsan alsaimary
 

Similaire à EVE 161 Winter 2018 Class 13 (20)

ASHG 2015 - Redundant Annotations in Tertiary Analysis
ASHG 2015 - Redundant Annotations in Tertiary AnalysisASHG 2015 - Redundant Annotations in Tertiary Analysis
ASHG 2015 - Redundant Annotations in Tertiary Analysis
 
A Comparative Encyclopedia Of DNA Elements In The Mouse Genome
A Comparative Encyclopedia Of DNA Elements In The Mouse GenomeA Comparative Encyclopedia Of DNA Elements In The Mouse Genome
A Comparative Encyclopedia Of DNA Elements In The Mouse Genome
 
2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM
2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM
2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM
 
J. Clin. Microbiol.-2014-Davidson-JCM.01144-14
J. Clin. Microbiol.-2014-Davidson-JCM.01144-14J. Clin. Microbiol.-2014-Davidson-JCM.01144-14
J. Clin. Microbiol.-2014-Davidson-JCM.01144-14
 
An Evolutionary and Structural Analysis of the Connective Tissue Growth Facto...
An Evolutionary and Structural Analysis of the Connective Tissue Growth Facto...An Evolutionary and Structural Analysis of the Connective Tissue Growth Facto...
An Evolutionary and Structural Analysis of the Connective Tissue Growth Facto...
 
Grindberg - PNAS
Grindberg - PNASGrindberg - PNAS
Grindberg - PNAS
 
Next generation seqencing tecnologies and application vegetable crops
Next generation seqencing tecnologies and application vegetable cropsNext generation seqencing tecnologies and application vegetable crops
Next generation seqencing tecnologies and application vegetable crops
 
Gardner and Song_2015_Genetics in Medicine
Gardner and Song_2015_Genetics in MedicineGardner and Song_2015_Genetics in Medicine
Gardner and Song_2015_Genetics in Medicine
 
Utility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogeneticUtility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogenetic
 
How to transform genomic big data into valuable clinical information
How to transform genomic big data into valuable clinical informationHow to transform genomic big data into valuable clinical information
How to transform genomic big data into valuable clinical information
 
A common rejection module (CRM) for acute rejection across multiple organs
A common rejection module (CRM) for acute rejection across multiple organsA common rejection module (CRM) for acute rejection across multiple organs
A common rejection module (CRM) for acute rejection across multiple organs
 
EVE 161 Winter 2018 Class 16
EVE 161 Winter 2018 Class 16EVE 161 Winter 2018 Class 16
EVE 161 Winter 2018 Class 16
 
Genome responses of trypanosome infected cattle
Genome responses of trypanosome infected cattleGenome responses of trypanosome infected cattle
Genome responses of trypanosome infected cattle
 
Genomics
GenomicsGenomics
Genomics
 
Global Gene Expression Profiles from Breast Tumor Samples using the Ion Ampli...
Global Gene Expression Profiles from Breast Tumor Samples using the Ion Ampli...Global Gene Expression Profiles from Breast Tumor Samples using the Ion Ampli...
Global Gene Expression Profiles from Breast Tumor Samples using the Ion Ampli...
 
Transcriptomics and metabolomics
Transcriptomics and metabolomicsTranscriptomics and metabolomics
Transcriptomics and metabolomics
 
importance of pathogenomics in plant pathology
importance of pathogenomics in plant pathologyimportance of pathogenomics in plant pathology
importance of pathogenomics in plant pathology
 
Kishor Presentation
Kishor PresentationKishor Presentation
Kishor Presentation
 
Assessment of immunomolecular_expression_and_prognostic_role_of_tlr7_among_pa...
Assessment of immunomolecular_expression_and_prognostic_role_of_tlr7_among_pa...Assessment of immunomolecular_expression_and_prognostic_role_of_tlr7_among_pa...
Assessment of immunomolecular_expression_and_prognostic_role_of_tlr7_among_pa...
 
1-s2.0-S0923250811001872-main
1-s2.0-S0923250811001872-main1-s2.0-S0923250811001872-main
1-s2.0-S0923250811001872-main
 

Plus de Jonathan Eisen

Eisen.CentralValley2024.pdf
Eisen.CentralValley2024.pdfEisen.CentralValley2024.pdf
Eisen.CentralValley2024.pdfJonathan Eisen
 
Phylogenomics and the Diversity and Diversification of Microbes
Phylogenomics and the Diversity and Diversification of MicrobesPhylogenomics and the Diversity and Diversification of Microbes
Phylogenomics and the Diversity and Diversification of MicrobesJonathan Eisen
 
Talk by Jonathan Eisen for LAMG2022 meeting
Talk by Jonathan Eisen for LAMG2022 meetingTalk by Jonathan Eisen for LAMG2022 meeting
Talk by Jonathan Eisen for LAMG2022 meetingJonathan Eisen
 
Thoughts on UC Davis' COVID Current Actions
Thoughts on UC Davis' COVID Current ActionsThoughts on UC Davis' COVID Current Actions
Thoughts on UC Davis' COVID Current ActionsJonathan Eisen
 
EVE198 Summer Session Class 4
EVE198 Summer Session Class 4EVE198 Summer Session Class 4
EVE198 Summer Session Class 4Jonathan Eisen
 
EVE198 Summer Session 2 Class 1
EVE198 Summer Session 2 Class 1 EVE198 Summer Session 2 Class 1
EVE198 Summer Session 2 Class 1 Jonathan Eisen
 
EVE198 Summer Session 2 Class 2 Vaccines
EVE198 Summer Session 2 Class 2 Vaccines EVE198 Summer Session 2 Class 2 Vaccines
EVE198 Summer Session 2 Class 2 Vaccines Jonathan Eisen
 
EVE198 Spring2021 Class1 Introduction
EVE198 Spring2021 Class1 IntroductionEVE198 Spring2021 Class1 Introduction
EVE198 Spring2021 Class1 IntroductionJonathan Eisen
 
EVE198 Spring2021 Class2
EVE198 Spring2021 Class2EVE198 Spring2021 Class2
EVE198 Spring2021 Class2Jonathan Eisen
 
EVE198 Spring2021 Class5 Vaccines
EVE198 Spring2021 Class5 VaccinesEVE198 Spring2021 Class5 Vaccines
EVE198 Spring2021 Class5 VaccinesJonathan Eisen
 
EVE198 Winter2020 Class 8 - COVID RNA Detection
EVE198 Winter2020 Class 8 - COVID RNA DetectionEVE198 Winter2020 Class 8 - COVID RNA Detection
EVE198 Winter2020 Class 8 - COVID RNA DetectionJonathan Eisen
 
EVE198 Winter2020 Class 1 Introduction
EVE198 Winter2020 Class 1 IntroductionEVE198 Winter2020 Class 1 Introduction
EVE198 Winter2020 Class 1 IntroductionJonathan Eisen
 
EVE198 Winter2020 Class 3 - COVID Testing
EVE198 Winter2020 Class 3 - COVID TestingEVE198 Winter2020 Class 3 - COVID Testing
EVE198 Winter2020 Class 3 - COVID TestingJonathan Eisen
 
EVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID VaccinesEVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID VaccinesJonathan Eisen
 
EVE198 Winter2020 Class 9 - COVID Transmission
EVE198 Winter2020 Class 9 - COVID TransmissionEVE198 Winter2020 Class 9 - COVID Transmission
EVE198 Winter2020 Class 9 - COVID TransmissionJonathan Eisen
 
EVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
EVE198 Fall2020 "Covid Mass Testing" Class 8 VaccinesEVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
EVE198 Fall2020 "Covid Mass Testing" Class 8 VaccinesJonathan Eisen
 
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and TestingEVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and TestingJonathan Eisen
 
EVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
EVE198 Fall2020 "Covid Mass Testing" Class 1 IntroductionEVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
EVE198 Fall2020 "Covid Mass Testing" Class 1 IntroductionJonathan Eisen
 
A renewed need for a genomic field guide to microbes
A renewed need for a genomic field guide to microbesA renewed need for a genomic field guide to microbes
A renewed need for a genomic field guide to microbesJonathan Eisen
 
BIS2C_2020. Lecture 22 Fungi Part 1
BIS2C_2020. Lecture 22 Fungi Part 1BIS2C_2020. Lecture 22 Fungi Part 1
BIS2C_2020. Lecture 22 Fungi Part 1Jonathan Eisen
 

Plus de Jonathan Eisen (20)

Eisen.CentralValley2024.pdf
Eisen.CentralValley2024.pdfEisen.CentralValley2024.pdf
Eisen.CentralValley2024.pdf
 
Phylogenomics and the Diversity and Diversification of Microbes
Phylogenomics and the Diversity and Diversification of MicrobesPhylogenomics and the Diversity and Diversification of Microbes
Phylogenomics and the Diversity and Diversification of Microbes
 
Talk by Jonathan Eisen for LAMG2022 meeting
Talk by Jonathan Eisen for LAMG2022 meetingTalk by Jonathan Eisen for LAMG2022 meeting
Talk by Jonathan Eisen for LAMG2022 meeting
 
Thoughts on UC Davis' COVID Current Actions
Thoughts on UC Davis' COVID Current ActionsThoughts on UC Davis' COVID Current Actions
Thoughts on UC Davis' COVID Current Actions
 
EVE198 Summer Session Class 4
EVE198 Summer Session Class 4EVE198 Summer Session Class 4
EVE198 Summer Session Class 4
 
EVE198 Summer Session 2 Class 1
EVE198 Summer Session 2 Class 1 EVE198 Summer Session 2 Class 1
EVE198 Summer Session 2 Class 1
 
EVE198 Summer Session 2 Class 2 Vaccines
EVE198 Summer Session 2 Class 2 Vaccines EVE198 Summer Session 2 Class 2 Vaccines
EVE198 Summer Session 2 Class 2 Vaccines
 
EVE198 Spring2021 Class1 Introduction
EVE198 Spring2021 Class1 IntroductionEVE198 Spring2021 Class1 Introduction
EVE198 Spring2021 Class1 Introduction
 
EVE198 Spring2021 Class2
EVE198 Spring2021 Class2EVE198 Spring2021 Class2
EVE198 Spring2021 Class2
 
EVE198 Spring2021 Class5 Vaccines
EVE198 Spring2021 Class5 VaccinesEVE198 Spring2021 Class5 Vaccines
EVE198 Spring2021 Class5 Vaccines
 
EVE198 Winter2020 Class 8 - COVID RNA Detection
EVE198 Winter2020 Class 8 - COVID RNA DetectionEVE198 Winter2020 Class 8 - COVID RNA Detection
EVE198 Winter2020 Class 8 - COVID RNA Detection
 
EVE198 Winter2020 Class 1 Introduction
EVE198 Winter2020 Class 1 IntroductionEVE198 Winter2020 Class 1 Introduction
EVE198 Winter2020 Class 1 Introduction
 
EVE198 Winter2020 Class 3 - COVID Testing
EVE198 Winter2020 Class 3 - COVID TestingEVE198 Winter2020 Class 3 - COVID Testing
EVE198 Winter2020 Class 3 - COVID Testing
 
EVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID VaccinesEVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID Vaccines
 
EVE198 Winter2020 Class 9 - COVID Transmission
EVE198 Winter2020 Class 9 - COVID TransmissionEVE198 Winter2020 Class 9 - COVID Transmission
EVE198 Winter2020 Class 9 - COVID Transmission
 
EVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
EVE198 Fall2020 "Covid Mass Testing" Class 8 VaccinesEVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
EVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
 
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and TestingEVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
 
EVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
EVE198 Fall2020 "Covid Mass Testing" Class 1 IntroductionEVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
EVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
 
A renewed need for a genomic field guide to microbes
A renewed need for a genomic field guide to microbesA renewed need for a genomic field guide to microbes
A renewed need for a genomic field guide to microbes
 
BIS2C_2020. Lecture 22 Fungi Part 1
BIS2C_2020. Lecture 22 Fungi Part 1BIS2C_2020. Lecture 22 Fungi Part 1
BIS2C_2020. Lecture 22 Fungi Part 1
 

Dernier

LAMP PCR.pptx by Dr. Chayanika Das, Ph.D, Veterinary Microbiology
LAMP PCR.pptx by Dr. Chayanika Das, Ph.D, Veterinary MicrobiologyLAMP PCR.pptx by Dr. Chayanika Das, Ph.D, Veterinary Microbiology
LAMP PCR.pptx by Dr. Chayanika Das, Ph.D, Veterinary MicrobiologyChayanika Das
 
Introduction of Organ-On-A-Chip - Creative Biolabs
Introduction of Organ-On-A-Chip - Creative BiolabsIntroduction of Organ-On-A-Chip - Creative Biolabs
Introduction of Organ-On-A-Chip - Creative BiolabsCreative-Biolabs
 
Introduction of Human Body & Structure of cell.pptx
Introduction of Human Body & Structure of cell.pptxIntroduction of Human Body & Structure of cell.pptx
Introduction of Human Body & Structure of cell.pptxMedical College
 
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survival
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's SurvivalHarry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survival
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survivalkevin8smith
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Sérgio Sacani
 
bonjourmadame.tumblr.com bhaskar's girls
bonjourmadame.tumblr.com bhaskar's girlsbonjourmadame.tumblr.com bhaskar's girls
bonjourmadame.tumblr.com bhaskar's girlshansessene
 
Total Legal: A “Joint” Journey into the Chemistry of Cannabinoids
Total Legal: A “Joint” Journey into the Chemistry of CannabinoidsTotal Legal: A “Joint” Journey into the Chemistry of Cannabinoids
Total Legal: A “Joint” Journey into the Chemistry of CannabinoidsMarkus Roggen
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptxpallavirawat456
 
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...Chayanika Das
 
Abnormal LFTs rate of deco and NAFLD.pptx
Abnormal LFTs rate of deco and NAFLD.pptxAbnormal LFTs rate of deco and NAFLD.pptx
Abnormal LFTs rate of deco and NAFLD.pptxzeus70441
 
The Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionThe Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionJadeNovelo1
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfGABYFIORELAMALPARTID1
 
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPRPirithiRaju
 
Role of Gibberellins, mode of action and external applications.pptx
Role of Gibberellins, mode of action and external applications.pptxRole of Gibberellins, mode of action and external applications.pptx
Role of Gibberellins, mode of action and external applications.pptxjana861314
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxpriyankatabhane
 
Food_safety_Management_pptx.pptx in microbiology
Food_safety_Management_pptx.pptx in microbiologyFood_safety_Management_pptx.pptx in microbiology
Food_safety_Management_pptx.pptx in microbiologyHemantThakare8
 
Think Science: What Are Eclipses (101), by Craig Bobchin
Think Science: What Are Eclipses (101), by Craig BobchinThink Science: What Are Eclipses (101), by Craig Bobchin
Think Science: What Are Eclipses (101), by Craig BobchinNathan Cone
 
3.-Acknowledgment-Dedication-Abstract.docx
3.-Acknowledgment-Dedication-Abstract.docx3.-Acknowledgment-Dedication-Abstract.docx
3.-Acknowledgment-Dedication-Abstract.docxUlahVanessaBasa
 
Understanding Nutrition, 16th Edition pdf
Understanding Nutrition, 16th Edition pdfUnderstanding Nutrition, 16th Edition pdf
Understanding Nutrition, 16th Edition pdfHabibouKarbo
 
Production technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongenaProduction technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongenajana861314
 

Dernier (20)

LAMP PCR.pptx by Dr. Chayanika Das, Ph.D, Veterinary Microbiology
LAMP PCR.pptx by Dr. Chayanika Das, Ph.D, Veterinary MicrobiologyLAMP PCR.pptx by Dr. Chayanika Das, Ph.D, Veterinary Microbiology
LAMP PCR.pptx by Dr. Chayanika Das, Ph.D, Veterinary Microbiology
 
Introduction of Organ-On-A-Chip - Creative Biolabs
Introduction of Organ-On-A-Chip - Creative BiolabsIntroduction of Organ-On-A-Chip - Creative Biolabs
Introduction of Organ-On-A-Chip - Creative Biolabs
 
Introduction of Human Body & Structure of cell.pptx
Introduction of Human Body & Structure of cell.pptxIntroduction of Human Body & Structure of cell.pptx
Introduction of Human Body & Structure of cell.pptx
 
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survival
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's SurvivalHarry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survival
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survival
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
 
bonjourmadame.tumblr.com bhaskar's girls
bonjourmadame.tumblr.com bhaskar's girlsbonjourmadame.tumblr.com bhaskar's girls
bonjourmadame.tumblr.com bhaskar's girls
 
Total Legal: A “Joint” Journey into the Chemistry of Cannabinoids
Total Legal: A “Joint” Journey into the Chemistry of CannabinoidsTotal Legal: A “Joint” Journey into the Chemistry of Cannabinoids
Total Legal: A “Joint” Journey into the Chemistry of Cannabinoids
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptx
 
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
 
Abnormal LFTs rate of deco and NAFLD.pptx
Abnormal LFTs rate of deco and NAFLD.pptxAbnormal LFTs rate of deco and NAFLD.pptx
Abnormal LFTs rate of deco and NAFLD.pptx
 
The Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionThe Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and Function
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
 
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
 
Role of Gibberellins, mode of action and external applications.pptx
Role of Gibberellins, mode of action and external applications.pptxRole of Gibberellins, mode of action and external applications.pptx
Role of Gibberellins, mode of action and external applications.pptx
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptx
 
Food_safety_Management_pptx.pptx in microbiology
Food_safety_Management_pptx.pptx in microbiologyFood_safety_Management_pptx.pptx in microbiology
Food_safety_Management_pptx.pptx in microbiology
 
Think Science: What Are Eclipses (101), by Craig Bobchin
Think Science: What Are Eclipses (101), by Craig BobchinThink Science: What Are Eclipses (101), by Craig Bobchin
Think Science: What Are Eclipses (101), by Craig Bobchin
 
3.-Acknowledgment-Dedication-Abstract.docx
3.-Acknowledgment-Dedication-Abstract.docx3.-Acknowledgment-Dedication-Abstract.docx
3.-Acknowledgment-Dedication-Abstract.docx
 
Understanding Nutrition, 16th Edition pdf
Understanding Nutrition, 16th Edition pdfUnderstanding Nutrition, 16th Edition pdf
Understanding Nutrition, 16th Edition pdf
 
Production technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongenaProduction technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongena
 

EVE 161 Winter 2018 Class 13

  • 1. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial ‘‘pan-genome’’ Herve´ Tettelina,b, Vega Masignanib,c, Michael J. Cieslewiczb,d,e, Claudio Donatic, Duccio Medinic, Naomi L. Warda,f, Samuel V. Angiuolia, Jonathan Crabtreea, Amanda L. Jonesg, A. Scott Durkina, Robert T. DeBoya, Tanja M. Davidsena, Marirosa Morac , Maria Scarsellic , Immaculada Margarit y Rosc , Jeremy D. Petersona , Christopher R. Hausera , Jaideep P. Sundarama, William C. Nelsona, Ramana Madupua, Lauren M. Brinkaca, Robert J. Dodsona, Mary J. Rosovitza, Steven A. Sullivana, Sean C. Daughertya, Daniel H. Hafta, Jeremy Selenguta, Michelle L. Gwinna, Liwei Zhoua, Nikhat Zafara, Hoda Khouria, Diana Radunea, George Dimitrova, Kisha Watkinsa, Kevin J. B. O’Connorh, Shannon Smithi, Teresa R. Utterbacki , Owen Whitea , Craig E. Rubensg , Guido Grandic , Lawrence C. Madoffe,j , Dennis L. Kaspere,j , John L. Telfordc, Michael R. Wesselsd,e, Rino Rappuolic,k,l, and Claire M. Frasera,b,k,m aInstitute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850; cChiron Vaccines, Via Fiorentina 1, 53100 Siena, Italy; dDivision of Infectious Diseases, Children’s Hospital, 300 Longwood Avenue, Boston, MA 02115; eHarvard Medical School, Boston, MA 02115; fCenter of Marine Biotechnology, University of Maryland Biotechnology Institute, 701 East Pratt Street, Baltimore, MD 21202; gChildren’s Hospital and Regional Medical Center, 307 Westlake Avenue N, Seattle, WA 98109; hThe Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218; iJ. Craig Venter Institute, 5 Research Place, Rockville, MD 20850; jChanning Laboratory, Brigham and Women’s Hospital, 181 Longwood Avenue, Boston, MA 02115; and mGeorge Washington University Medical Center, 2300 Eye Street NW, Washington, DC 20037 Contributed by Rino Rappuoli, August 5, 2005 The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial patho- gens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome- wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains represent- ing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Anal- ysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for Ϸ80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrap- genome sequence of the type Ia strain A909 and draft genome sequences (8ϫ sequence coverage) of five additional strains, rep- resenting the five major serotypes. Comparative analysis of the six newly sequenced genomes and the two genomes already available in the databases suggests that a bacterial species can be described by its ‘‘pan-genome’’ (pan, from the Greek word ␲␣␯, meaning whole), which includes a core genome containing genes present in all strains and a dispensable genome composed of genes absent from one or more strains and genes that are unique to each strain. Surprisingly, unique genes were still detected after eight genomes were sequenced, and mathematical extrapolation predicts that new genes will still be found after sequencing many more strains. Thus, the genomes of multiple, independent isolates are required to understand the global complexity of bacterial species. Analysis of
  • 2. In 1987, it was proposed (1) that bacterial strains showing 70% DNA DNA reassociation and sharing characteristic phenotypic traits should be considered to be strains of the same species. Thus far, the genome sequence of one or two strains for each species has provided unprecedented information; however, the question of how many genomes are necessary to fully describe a bacterial species has yet to be asked.
  • 3. Comparative analysis of the six newly sequenced genomes and the two genomes already available in the databases suggests that a bacterial species can be described by its ‘‘pan-genome’’ (pan, from the Greek word , meaning whole), which includes a core genome containing genes present in all strains and a dispensable genome composed of genes absent from one or more strains and genes that are unique to each strain. Surprisingly, unique genes were still detected after eight genomes were sequenced, and mathematical extrapolation predicts that new genes will still be found after sequencing many more strains.
  • 4. Methods • Sequence • Assemble • Annotate • Compare • Modeling
  • 5. rma- erial s for ever, fully have ns of alac- born derly ibed; tates V (5). I and gene plete Sequencing, Annotation, and Unfinished Genomes. Genome se- quences were generated by the whole-genome shotgun sequenc- ing approach (13, 14). Draft genomes were sequenced to 8ϫ- Freely available online through the PNAS open access option. Abbreviations: GBS, group B Streptococcus; MLST, multilocus sequence typing; GAS, group A Streptococcus. Data deposition: The sequences reported in this paper have been deposited in the DDBJ͞ EMBL͞GenBank database [accession nos. AAJO01000000 (18RS21), AAJP01000000 (515), AAJQ01000000 (CJB111), AAJR01000000 (COH1), AAJS01000000 (H36B), and CP000114 (A909)]. bH.T., V.M., and M.J.C. contributed equally to this work. kR.R. and C.M.F. contributed equally to this work. lTo whom correspondence should be addressed. E-mail: rino࿝rappuoli@chiron.com. © 2005 by The National Academy of Sciences of the USA www.pnas.org͞cgi͞doi͞10.1073͞pnas.0506758102 sequence coverage, and the sequences were assembled by using the Celera Assembler (Celera Genomics, Rockville, MD) (15). Contigs were ordered and oriented according to their alignment to strain 2603V͞R by using PROMER (16). Ordered matching contigs were pasted together into a pseudochromosome, and nonmatching contigs were tacked on the end in random order. In the pseudochromosome, contigs were separated by the se- quence NNNNNCATTCCATTCATTAATTAATTAATG- AATGAATGNNNNN, which (i) generates a stop codon in all six reading frames so that no gene is predicted across junctions and (ii) provides a start site in all frames, pointing toward contigs to predict incomplete genes at their extremities. ORFs were predicted and annotated by using an automated pipeline that combines GLIMMER gene prediction (17, 18), ORF and non-ORF feature identification, and assignment of functional role catego- ries to genes (14). Assembly of strain 18RS21 resulted in a higher number of contigs than for the other unfinished genomes, leading to the prediction of Ͼ3,500 genes. Many small contigs did not harbor protein-coding genes, and several were fragments of rRNAs or coded for tRNAs or structural RNAs. ortholog contigs draft ge NUCMER drawn ( In Fig follows: and from not shar merged and (iv) noted th Genomi ␣-galact region modifica regions and the transfer in strain
  • 6. leading to the prediction of Ͼ3,500 genes. Many small contigs did not harbor protein-coding genes, and several were fragments of rRNAs or coded for tRNAs or structural RNAs. Shared and Strain-Specific Genes. Each strain pair was compared by means of the following: (i) a Smith and Waterman protein search on all of the predicted proteins by using the SSEARCH program (version 3.4) (19, 20); (ii) a DNA search of all of the predicted ORFs of a strain against the complete DNA sequence of the other strain, by using the FASTA program (version 3.4) (20); and (iii) a translated protein search of all of the predicted proteins of a strain against the complete DNA sequence of the other strain, by using the TFASTY program (version 3.4) (20). A gene was considered conserved if at least one of these three methods produced an alignment with a minimum of 50% sequence conservation over 50% of the protein͞gene length. Core-Genome and Pan-Genome Extrapolation. The number of genes shared by all GBS isolates and the number of strain-specific
  • 7. Core-Genome and Pan-Genome Extrapolation. The number of genes shared by all GBS isolates and the number of strain-specific genes depend on how many strains are taken into account. The sequential inclusion of up to eight strains was simulated in all possible combinations. The number (N) of independent mea- surements of the shared (see Fig. 2) and strain-specific genes (see Fig. 3) present in the nth genome is N ϭ 8!͞[(n Ϫ 1)!⅐(8 Ϫ n)!]. The size of the species core genome and the number of strain- specific genes for a large number of sequenced strains were extrapolated by fitting the exponential decaying functions Fc ϭ ␬c exp[Ϫn͞␶c] ϩ ⍀ and Fs ϭ ␬s exp[Ϫn͞␶s] ϩ tg(␪), respectively, to the amount of conserved genes (see Fig. 2) and of strain- specific genes (see Fig. 3), where n is the number of sequenced strains and ␬c, ␬s, ␶c, ␶s, ⍀, and tg(␪) are free parameters. tg(␪) represents the extrapolated rate of growth of the pan-genome size, P(n), as a greater number of independent GBS strain sequences become available, i.e., lim͓P͑n͔͒ n3 ϱ Ϸ tg͑␪͒⅐n. The Inset of Fig. 3 displays the measured size of the pan-genome as a function of n [in this case, N ϭ 8!͞(8 Ϫ n)!; points are obtained for each value of n] together with a plot of the calculated P(n) (see Supporting Text, which is published as supporting information on the PNAS web site). of the whole regions of aty Results General Featur coverage) wer and CJB111, b and V, which the United St strain A909 of 515, both belo different sequ sequence typi genetic divers published as su six newly seq 2603V͞R and used for subse genome sizes entire nucleot the GBS strai nations with N ranged from 8 and noncodin Fig. 1 summ ison of the eig gene synteny i
  • 8. calculated P(n) (see Supporting Text, which is published as supporting information on the PNAS web site). Synteny. Paralog clusters in each genome were generated by using the Jaccard algorithm (21), with Ն 80% identity, and a Jaccard coefficient Ն 0.6. Members of paralog clusters were then organized into ortholog clusters by allowing any member of a paralog cluster to contribute to the reciprocal best matches used to construct the ortholog clusters. Syntenic blocks are defined as a set of five or more consecutive pairs of genes from the same Tettelin et al. ortholog cluster. Because they do not participate in clusters, all contigs that do not contain protein-coding genes from the five draft genomes were searched against all genomes by using the NUCMER program (16). Syntenic blocks and NUCMER results were drawn (Fig. 1) by using SYBIL (http:͞͞sybil.sourceforge.net͞). In Fig. 1, genomic islands of diversity Ͼ5 kb are predicted as follows: (i) strains are inspected from the top panel and down and from left to right on each panel; (ii) regions of at least 1 kb not shared with another strain are identified; (iii) regions are
  • 9. and er. se- G- all ons igs ere hat RF go- her mes, did of by rch am ted the drawn (Fig. 1) by using SYBIL (http:͞͞sybil.sourceforge.net͞). In Fig. 1, genomic islands of diversity Ͼ5 kb are predicted as follows: (i) strains are inspected from the top panel and down and from left to right on each panel; (ii) regions of at least 1 kb not shared with another strain are identified; (iii) regions are merged into single islands if they are within 5 kb of each other; and (iv) resulting islands Ͼ5 kb are considered. It should be noted that some islands are composed of more than one contig. Genomic islands discussed in the text are the following: the ␣-galactosidase region in strain H36B, island 7.4; the prophage region in strain H36B, island 7.5; the DNA restriction͞ modification system in strain 515, part of island 4.5; the Tn916 regions in strains 2603V͞R, 515, CJB111, and COH1, islands 1.8 and the left side of 5.3; and serine-rich protein and glycosyl- transferases flanked by cell-wall-anchored proteins and sortases in strain COH1, unnumbered region between islands 6.5 and 1.15. Fig. 1 reveals many non-protein-coding regions in strain 18RS21 that display NUCMER matches elsewhere in the 18RS21 genome. Most of these regions correspond to fragments of rRNAs, tRNAs, or structural RNAs, all of which exhibit an expected atypical nucleotide composition.
  • 10. 1.15. Fig. 1 reveals many non-protein-coding regions in strain 18RS21 that display NUCMER matches elsewhere in the 18RS21 genome. Most of these regions correspond to fragments of rRNAs, tRNAs, or structural RNAs, all of which exhibit an expected atypical nucleotide composition. ␹2 Analysis. Regions of atypical nucleotide composition were identified by the ␹2 analysis; the distribution of all 64 trinucle- otides (3mers) was computed for the complete genome in all six reading frames, followed by the 3mer distribution in 5,000-bp windows. Windows overlapped by 500 bp. For each window, the ␹2 statistic on the difference between its 3mer content and that of the whole genome was computed. Peaks in Fig. 1 indicate regions of atypical nucleotide composition. Results General Features of GBS Genomes. Draft genome sequences (8ϫ coverage) were obtained for strains 515, H36B, 18RS21, COH1, and CJB111, belonging, respectively, to serotypes Ia, Ib, II, III,
  • 11.
  • 12. Fig. 1. Whole genome alignment of GBS strains. The eight genomes are compared to each other by using COG (41) and NUCMER analyses (see Materials and Methods). Each genome (shaded strain name) is colored with a gradient that ranges from yellow (nucleotide 1) to blue (end). Differences in color between a reference sequence (the last colored line in each genome) and the other genomes indicate conserved protein-coding regions that have been rearranged. Uncolored segments denote coding regions in which no conserved genes were detected. NUCMER matches for contigs that do not contain protein-coding genes are displayed by red blocks (matches within the reference strain are displayed on the line directly above it). Genomic islands of diversity are boxed and numbered ‘‘x.y,’’ where x is the panel or strain number where the island first appeared and y is the island location in that genome from left to right. A indicates an island that was not identified in a previous genome. Islands that overlap by at least 50% (based on the number of shared genes) with previously identified islands receive the same number as the initial island. The gene content of the 69 islands identified is listed in Table 2, which is published as supporting information on the PNAS web site. Strain-specific regions, free of COG or NUCMER matches, are displayed in black at the bottom of each panel. Portions of these regions that harbor protein-coding genes are indicated in gray below the black blocks. The curves on top of each panel represent the nucleotide composition ( 2 analysis) (see Materials and Methods) of the reference strain of the panel, and peaks indicate regions of atypical composition.
  • 13. with the numbe verify whether repeated the an of Streptococcu strains of Bac levels of genom Fig. 2. GBS core genome. The number of shared genes is plotted as a function of the number n of strains sequentially added (see Materials and Methods). For each n, circles are the 8!͞[(n Ϫ 1)!⅐(8 Ϫ n)!] values obtained for the different strain combinations. Squares are the averages of such values. The continuous curve represents the least-squares fit of the function Fc ϭ ␬c exp[Ϫn͞␶c] ϩ ⍀ (see Eq. 1 in Supporting Text) to data. The best fit was obtained with correlation r2 ϭ 0.990 for ␬c ϭ 610 Ϯ 38, ␶c ϭ 2.16 Ϯ 0.28, and ⍀ ϭ 1,806 Ϯ 16. The extrapolated GBS core genome size ⍀ is shown as a dashed line. Fig. 4. Dendro was used to cluste the program CLUS profiles of presen and used as inpu drawing and boo Serotypes and M
  • 14. the B. anthrac Genome Diver phenotypic tra the capsular p widely used to information h therapy. Rece conserved cor correlate with terize the gen isolates, a den genes across t two belong to (NEM316 and Furthermore, (ST23). Comp 7, which is pub site) showed th type often sha serotype, resu strains. In sup at the nucleot related seroty the least cons whereas the tw identity over 9 (type V and I Fig. 3. GBS pan-genome. The number of specific genes is plotted as a function of the number n of strains sequentially added (see Materials and Methods). For each n, circles are the 8!͞[(n Ϫ 1)!⅐(8 Ϫ n)!] values obtained for the different strain combinations; squares are the averages of such values. The blue curve is the least-squares fit of the function Fs(n) ϭ ␬s exp[Ϫn͞␶s] ϩ tg(␪) (see Eq. 2 in Supporting Text) to the data. The best fit was obtained with correlation r2 ϭ 0.995 for ␬s ϭ 476 Ϯ 62, ␶s ϭ 1.51 Ϯ 0.15, and tg(␪) ϭ 33 Ϯ 3.5. The extrapolated average number tg(␪) of strain-specific genes is shown as a dashed line. (Inset) Size of the GBS pan-genome as a function of n. The red curve is the calculated pan-genome size
  • 15. Fig. 4. Dendrogram of the eight GBS genomes. Shared gene information was used to cluster proteins into groups by using the single-linkage method of the program CLUSTER (http:͞͞rana.lbl.gov). Groups were then converted into profiles of presence or absence of each gene (0 or 1) in the eight GBS strains and used as input to PAUP* 4.0b10 (Sinauer, Sunderland, MA) for dendrogram drawing and bootstrapping. Numbers at the nodes indicate bootstrap values. Serotypes and MLST types of each strain are within parentheses.
  • 16. Tettelin et al. Figure 5 Distribution of shared and unique genes sorted by functional categories. Groups of shared genes and strain-specific genes were sorted by broad functional categories as follows: housekeeping functions, cell envelope proteins, regulatory functions, transport and binding proteins, mobile and extrachromosomal elements, proteins of unknown function, and hypothetical proteins. The total number of genes in each of these categories was counted and displayed for groups shared by all eight strains, groups shared by any combination of seven to two strains, and strain-specific genes.
  • 17. Implications for Bacterial Taxonomy. Methods commonly used to define bacterial species (DNA⅐DNA reassociation, 16S rRNA typing, MLST, etc.) rely mostly on features associated with the core genome (40). Our work confirms that the essence of the species is linked to the core genome. However, the majority of the genetic traits linked to virulence, capsular serotype, adaptation, and anti- biotic resistance pertain to the dispensable genome. Therefore, sequencing of multiple strains is necessary to understand the virulence of pathogenic bacteria and to provide a more consistent definition of the species itself. We identified species with an open pan-genome, such as GBS and GAS, and species with a closed pan-genome, such as B. anthracis. Nevertheless, a different inter- pretation of the same data may lead to the conclusion that the present definition of bacterial species is inconsistent because, in reality, only species with an open pan-genome are species, whereas B. anthracis is not a true genetic species on its own, but only a clone of Bacillus cereus, with very distinctive phenotypic traits provided by the acquisition of the virulence plasmid coding for the anthrax toxin.
  • 18. Concluding Comment. Our data clearly show that the strategy to sequence one or two genomes per species, which has been used during the first decade of the genomic era, is not sufficient and that multiple strains need to be sequenced to understand the basics of bacterial species. The methods presently used to evaluate the species diversity, such as complete genome hybridization and MLST, can explain only the presence, absence, and variability of the genetic loci that are already known and do not provide information on the genes that are not present in the reference genome. Our work provides a clear demonstration that, by these approaches, we fail to include in the analysis the entire dispensable genome, the size of which can be vastly larger than the core genome. Our work on the protein-based vaccine against GBS has shown that this is not just a theoretical disadvantage but has very important practical conse- quences because a universal vaccine is possible only by including dispensable genes (8).