SlideShare une entreprise Scribd logo
1  sur  15
Télécharger pour lire hors ligne
PLEASE SCROLL DOWN FOR ARTICLE
This article was downloaded by: [University of Texas at Austin]
On: 13 August 2009
Access details: Access Details: [subscription number 907743372]
Publisher Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House,
37-41 Mortimer Street, London W1T 3JH, UK
European Journal of Phycology
Publication details, including instructions for authors and subscription information:
http://www.informaworld.com/smpp/title~content=t713725516
The limits of nuclear-encoded SSU rDNA for resolving the diatom phylogeny
Edward C. Theriot a
; Jamie J. Cannone b
; Robin R. Gutell b
; Andrew J. Alverson c
a
Texas Natural Science Center, The University of Texas at Austin, Austin, TX 78705, USA b
Section of
Integrative Biology and Center for Computational Biology and Bioinformatics, The University of Texas at
Austin, 1 University Station, Austin, Texas 78712, USA c
Department of Biology, Indiana University
Bloomington, IN 47405, USA
First Published:August2009
To cite this Article Theriot, Edward C., Cannone, Jamie J., Gutell, Robin R. and Alverson, Andrew J.(2009)'The limits of nuclear-
encoded SSU rDNA for resolving the diatom phylogeny',European Journal of Phycology,44:3,277 — 290
To link to this Article: DOI: 10.1080/09670260902749159
URL: http://dx.doi.org/10.1080/09670260902749159
Full terms and conditions of use: http://www.informaworld.com/terms-and-conditions-of-access.pdf
This article may be used for research, teaching and private study purposes. Any substantial or
systematic reproduction, re-distribution, re-selling, loan or sub-licensing, systematic supply or
distribution in any form to anyone is expressly forbidden.
The publisher does not give any warranty express or implied or make any representation that the contents
will be complete or accurate or up to date. The accuracy of any instructions, formulae and drug doses
should be independently verified with primary sources. The publisher shall not be liable for any loss,
actions, claims, proceedings, demand or costs or damages whatsoever or howsoever caused arising directly
or indirectly in connection with or arising out of the use of this material.
Eur. J. Phycol., (2009), 44(3): 277–290
The limits of nuclear-encoded SSU rDNA
for resolving the diatom phylogeny
EDWARD C. THERIOT1
, JAMIE J. CANNONE2
,
ROBIN R. GUTELL2
AND ANDREW J. ALVERSON3
1
Texas Natural Science Center, The University of Texas at Austin, 2400 Trinity Street, Austin, TX 78705, USA
2
Section of Integrative Biology and Center for Computational Biology and Bioinformatics, The University
of Texas at Austin, 1 University Station, Austin, Texas 78712, USA
3
Department of Biology, Indiana University Bloomington, 1001 E Third Street, 142 Jordan Hall, IN 47405, USA
(Received 26 September 2008; revised 13 November 2008; accepted 25 November 2008)
A recent reclassification of diatoms based on phylogenies recovered using the nuclear-encoded small subunit ribosomal
RNA (SSU rRNA) gene contains three major classes, Coscinodiscophyceae, Mediophyceae and the Bacillariophyceae
(the CMB hypothesis). We evaluated this with a sequence alignment of 1336 protist and heterokont algae SSU rRNAs,
which includes 673 diatoms. Sequences were aligned to maintain structural elements conserved within this dataset.
Parsimony analysis rejected the CMB hypothesis, albeit weakly. Morphological data are also incongruent with this
recent CMB hypothesis of three diatom clades. We also re-analysed a recently published dataset that purports to support
the CMB hypothesis. Our re-analysis found that the original analysis had not converged on the true bipartition posterior
probability distribution, and rejected the CMB hypothesis. Thus we conclude that a reclassification of the evolutionary
relationships of the diatoms according to the CMB hypothesis is premature.
Key words: SSU, diatom phylogeny, diatom classification, Coscinodiscophyceae, Mediophyceae, Bacillariophyceae
Introduction
Analyses of molecular data (mainly nuclear-
encoded small subunit ribosomal DNA;
henceforth SSU) have generally reinforced the tra-
ditional view (Simonsen, 1979; Round et al., 1990)
that centric diatoms broadly grade into pennates
through many nodes (Medlin et al., 1993, 1996a,
b, 2000; Ehara et al., 2000; Medlin & Kaczmarska,
2004; Sorhannus, 2004, 2007; Alverson et al., 2006;
Choi et al., 2008; see Alverson & Theriot, 2005
for review). However, Medlin & Kaczmarska
(2004) recently proposed that centric diatoms were
composed of only two clades rather than many.
They retained the name Coscinodiscophyceae
for the so-called ‘radial centrics’ and applied
the name Mediophyceae for the so-called ‘bipolar’
or ‘multipolar centrics’. They also suggested
a number of morphological characters as diagnostic
for these groups. We refer to this as the CMB
hypothesis (for the three major clades discovered –
Coscinodiscophyceae, Mediophyceae and
Bacillariophyceae).
The CMB phylogenetic hypothesis has not been
universally embraced. For example, Adl et al.
(2005) treated both Coscinodiscophyceae
and Mediophyceae as paraphyletic taxa without
discussion. Williams & Kociolek (2007) challenged
the robustness of the CMB phylogeny based
on the fact that many different SSU analyses
return different trees. In contrast, Sims et al.
(2006) recovered the CMB hypothesis with high
bipartition posterior probability (BPP) support.
Medlin et al. (2008) recovered the CMB hypothesis
with high BPP support using a secondary structure
alignment but noted that several aspects of the tree
were unusual (e.g. the placement of Attheya).
In fact, topology of the diatom SSU tree
(and support values for incongruent groups) has
changed from study to study. For example,
the elongate Toxarium has been placed well
within the centric grade amidst multipolar diatoms
(very distant from the pennate diatoms) using max-
imum likelihood (ML) analysis on 38 diatoms
(Kooistra et al., 2003), as sister to all pennates
in a Bayesian analysis of 51 diatom sequences
(Chepurnov et al., 2008), poorly resolved
in a maximum parsimony (MP) analysis of 181
diatom sequences (Alverson et al., 2006), and
Correspondence to: Edward C. Theriot. E-mail: etheriot@mail.
utexas.edu
ISSN 0967-0262 print/ISSN 1469-4433 online/09/030277–290 ß 2009 British Phycological Society
DOI: 10.1080/09670260902749159
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
once again well within the multipolar diatoms
in a Bayesian analysis of 54 diatom SSU sequences
(Medlin et al., 2008). As underscored by this brief
comparison, the many different inferences of
diatom phylogeny have utilized different alignment
strategies, different optimality criteria, have
employed those criteria in different ways and
have used different taxa. Any or all of these factors
may have lead to the novel results of Medlin &
Kaczmarska (2004) and Sims et al. (2006), but
this cannot be directly studied because the
Medlin & Kaczmarska (2004) and Sims et al.
(2006) datasets which produced the CMB hypoth-
esis were not publicly available. However, the
Medlin et al. (2008) dataset is available and we
re-analyse it below. To test the effects of ingroup
and outgroup sampling, we created our own large
alignment of stramenopile SSU sequences, aligned
according to secondary structure (Gutell et al.,
1985, 1992, 2002) and used it to test the CMB
hypothesis and its robustness. Specifically we
address the effect (or lack thereof ) of adding dis-
tantly related outgroups on inferences of the
diatom SSU tree.
Materials and methods
Multiple sequence alignment
We included all 1549 SSU stramenopile sequences avail-
able in Genbank as of September 1, 2007. The
SSU rDNA sequences were aligned manually with the
alignment editor ‘AE2’ (developed by T. Macke, Scripps
Research Institute, San Diego, USA; Larsen et al. 1993),
which was developed for Sun Microsystems’ (Santa
Clara, USA) workstations running the Solaris operating
system. The manual alignment process involves first
positionally aligning homologous nucleotides (i.e.
those that map to the same locations and tertiary struc-
ture models) into columns in the alignment, maximizing
their sequence and structure similarity. For regions with
high similarity between sequences, the nucleotide
sequence is sufficient to align sequences with confidence.
For more variable regions in closely related sequences
or when aligning more distantly related sequences, how-
ever, a high-quality alignment only can be produced
when additional information (here, secondary and/or
tertiary structure data) is included.
The underlying SSU rRNA secondary structure
model was initially predicted with covariation analysis
(Gutell et al., 1985, 1992). Approximately 98% of the
predicted model base pairs were present in the high-
resolution crystal structure from the 30S ribosomal
subunit (Gutell et al., 2002). This model (based on the
bacterium Escherichia coli) has been extended to
the eukaryotic SSU rRNA (Cannone et al., 2002),
using covariation analysis to assess eukaryote-specific
features. The additional constraints of the eukaryotic
model were used to refine the alignment of the strame-
nopile sequences iteratively until positional homology
was established for the entire data matrix.
The initial SSU alignment contained 1549 sequences,
with a final length of 3786 columns. Medlin &
Kaczmarska (2004) filtered out sequences less than
50% complete and we followed this convention, result-
ing in a final dataset of 1336 stramenopile sequences of
which 673 are diatoms and seven are bolidophytes,
which are considered the immediate sister group to dia-
toms, according to both SSU and chloroplast-encoded
rbcL (Daugbjerg & Andersen, 1997; Goertzen &
Theriot, 2003; Andersen, 2004). The remaining taxa
are more distantly related stramenopiles. The final align-
ment is available at TreeBASE (http://www.treeba-
se.org/treebase/intro.html) or from the authors. Forty
secondary structure model diagrams representing the
major diatom lineages are available at http://
www.rna.ccbb.utexas.edu/SIM/4D/Diatom_nSSU/. We
analysed the data in two datasets: diatoms plus bolido-
phytes only (DiatBo) and diatoms plus all stramenopiles
(DiatStram).
Other datasets
We obtained the Nexus files used for Figs 2 and 3 in
Medlin & Kaczmarska (2004) from Dr Medlin. One
dataset had 126 sequences and the other had 281
sequences, and we refer to them as the MK126 and
MK281 datasets. Both had the same 123 diatom
sequences and differed only in that the former used boli-
dophytes only as the outgroup and the latter sampled
broadly across eukaryotes for the outgroups. We also
used the Nexus file used to produce Fig. 1a of Medlin
et al. (2008) from http://www3.interscience.wiley.
com.ezproxy.lib.utexas.edu/journal/121395867/suppinfo.
That file had 54 sequences, all diatoms but no outgroup,
and we refer to that as the M54 dataset.
Phylogenetic analysis
All datasets were subjected to parsimony analysis
using the TNT program (Goloboff et al., 2003). The
full suite of TNT options (sectorial search, ratchet,
drift and tree fusion) was used. There is no standard
recommendation for use of these algorithms and there
are few comparative studies of these algorithms. Within
the context of the ratchet, Nixon (1999) argued that for
large datasets, it may be better to limit length of searches
on individual islands of trees and search more islands.
The notion is that exploring a greater range of islands
containing optimal trees is more likely to cover the entire
diversity of optimal trees in a shorter period of time than
exhaustively searching one island. Thus, we took the
approach used by Goertzen & Theriot (2003) and
Alverson et al. (2006) when employing these newer algo-
rithms. We increased the number of all cycles, rounds
and repetitions for sectorial, drift, fusion and ratchet
searches 10-fold beyond default values, and used
between 100 and 1000 random taxon additions for
each run. We saved the resultant trees from each run
separately, and then repeated the procedure with
a new randomly selected seed number. After each run,
we checked that no shorter tree was found, combined
trees from all previous runs and then calculated the
E. C. Theriot et al. 278
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
number of nodes collapsed in their strict consensus.
If no shorter trees were found and if no additional
nodes were collapsed, we concluded that we had
sampled the complete representative set of MP trees,
as additional trees would be redundant and unlikely to
further erode the resolution of the strict consensus
(Nixon, 1999; Goertzen & Theriot, 2003).
We assessed the parsimony penalty required by
constraining each of the Coscinodiscophyceae,
Mediophyceae and Bacillariophyceae to monophyly
under searches as above. We assessed support for the
unconstrained MP trees using nonparametric bootstrap
(BS) analysis in TNT with the standard sampling with
replacement strategy. We used the new technology
search with 10 taxon additions and sectorial, ratchet,
drift and tree fusings for each of the 1000 pseudorepli-
cates of the BS analysis.
The DiatBo, MK 281, MK126 and M54 datasets were
subjected to Bayesian analyses. All Bayesian analyses
were run with the GTR þ G þ I model (nucmo-
del ¼ 4by4, nst ¼ 6, rates ¼ invgamma). These were the
settings used by Sims et al. (2006) and also corresponded
to the best model for each dataset as selected by
MrModelTest (Nylander, 2004). All initial runs for all
datasets were done at 1 000 000 Markov chain Monte
Carlo (MCMC) generations, equal to or greater than
the number of generations run by Medlin &
Kaczmarska (2004), Sims et al. (2006) and Medlin
et al. (2008). Where these papers did not specify other
settings for the Bayesian analysis, default settings were
used. To test reproducibility of the results, we ran three
separate analyses, each with two runs for a total of six
independent runs of 1 000 000 MCMC generations each.
We also ran one analysis of the DiatBo dataset with two
runs (four chains, three heated, one cold) for 10 million
generations, saving every 10 000th tree. Finally, we ran
the M54 dataset for 50 million generations, saving every
10 000th tree. We assessed whether independent runs in
all analyses had sampled the same posterior distribution
by comparing independent run (split) posterior
probabilities with the compare command in the AWTY
program (Wilgenbusch et al., 2004). We followed the
burn-in periods of Medlin et al. (2004) and (2008) for
their datasets when we ran 1 000 000 generations on
M54, MK126 and MK281. We used a burn-in of 90%
for our DiatBo dataset 1 000 000 and 10 000 000 genera-
tion analyses to approximate Sims et al. (2006).
Morphology
We coded the characters of symmetry, presence
or absence of mucilaginous matrix, auxospore shape/
growth, properizonium and perizonium presence/
absence for 34 taxa using Medlin & Kaczmarska’s
criteria (2004: 258, table 2), and treated all multistate
characters as unordered. Since no outgroup or
ontogenetic information was provided, the only rooting
option was to consider the Coscinodiscophyceae
as the outgroup to the remaining diatoms and
test for monophyly of the Mediophyceae and
Bacillariophyceae. However it is possible to determine
if the Coscinodiscophyceae formed a convex group
(possibly monophyletic depending on the placement of
the root within the unrooted network). Winclada run-
ning NONA was used for parsimony analysis, with
10 000 replications, holding 100 starting trees per repeti-
tion, and all other parameters set to defaults.
Results
Parsimony analysis
For the DiatStram dataset, 11 runs totalling 2898
random taxon addition repetitions were required
to converge on the representative set of MP trees
(length (L): 39822; consistency index (c.i.): 0.12;
retention index (r.i.): 0.84). We found 22 unique
MP trees on the first run. Their strict consensus
collapsed 441 nodes. Eight more runs produced
54 more MP trees for a total of 76 trees.
However, 11 of these were duplicates and there
were only 65 unique MP trees. Their strict consen-
sus collapsed 450 nodes or only nine more than
collapsed in the first single run. That we found
redundant trees and the reduced yield in topologi-
cal diversity suggest that we have found the true
diversity of all MP trees that could be obtained
from the DiatStram dataset (Fig. 1).
For the DiatBo dataset, three runs of 500 random
addition sequences seemed to converge on the
representative set of MP trees. The strict consensus
of the first 139 trees of L: 14094 (c.i.: 0.19, r.i.: 0.84)
collapsed 287 nodes, that of the 338 trees of the first
and second runs collapsed 288 nodes, and that of
the 554 trees of the all three runs combined col-
lapsed 288 nodes. In each of the runs, an MP tree
was found within the first 18 random additions
indicating that TNT was finding at least one tree
of optimal topology very early in the analysis.
In addition, 51 of the 554 total trees were identical
to trees previously found, indicating that there was
some redundancy in the coverage of tree space.
Thus, we believe Fig. 2 well represents the strict
consensus of all equally MP cladograms that
might be found in the DiatBo dataset.
Unconstrained searches in both analyses
resulted in non-monophyly for the classes
Coscinodiscophyceae and Mediophyceae, and
monophyly for the class Bacillariophyceae.
In both, the Coscinodiscophyceae was positively
paraphyletic (i.e. fully resolved as a ladder-like
grade with no polytomies) with the Melosirales
sister to a non-monophyletic Mediophyceae
plus a monophyletic Bacillariophyceae. The
Mediophyceae was positively paraphyletic in the
DiatStram analysis, with Chaetoceros and a few
other taxa forming a clade sister to the pennates.
In the DiatBo analyses the Mediophyceae formed
an unresolved polytomy.
Monophyly of the Coscinodiscophyceae and
Mediophyceae (i.e. the CMB hypothesis) required
SSU and the diatom phylogeny 279
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
Fig. 1. Strict consensus of 65 unique equally most parsimonious trees calculated from the stramenopile-outgroup and diatom-
ingroup analysis (DiatStram dataset). Only relationships among diatoms are shown.
E. C. Theriot et al. 280
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
Fig. 2. Strict consensus of 503 unique equally most parsimonious trees calculated from the diatom plus bolidophyte (DiatBo)
dataset. Only relationships among diatoms are shown.
SSU and the diatom phylogeny 281
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
little penalty for either large dataset: the CMB
hypothesis was only seven steps longer than the
unconstrained MP trees for the DiatStram dataset
and 10 steps longer for the DiatBo dataset.
Arrangements of terminal taxa were similar
for results for both datasets, and only the tree for
the DiatBo dataset is shown (Fig. 3).
Given the relatively low penalty incurred for
transforming any optimal tree into the CMB
hypothesis, it is not surprising that the BS values
along the backbone of the tree were generally quite
low. The Bacillariophyceae clade and the
Mediophyceae plus Bacillariophyceae clade were
the only two backbone nodes to receive BS support
values of 90% for either dataset.
Parsimony analysis of M54, MK126 and MK281
datasets yielded similar results (trees not shown).
The MP tree or trees rejected the CMB hypothesis,
and bootstrap values along the backbone were
typically less than 50%. The CMB constraint
trees were not much longer than the MP trees:
four steps longer for the M54 dataset (4530 vs.
4526), seven steps longer for the MK126 dataset
(5633 vs. 5626), and 23 steps longer for the
MK281 dataset (19 302 vs. 19 325).
Bayesian analysis
Analyses of 1 000 000 generations had clearly not
converged on the same posterior distributions
among independent runs in analyses of either
the DiatBo or M54 datasets. Plots of bipartition
posterior probability values between the first pair
of runs for each of the two datasets showed many
points (i.e. bipartitions) falling directly along the
abscissa and ordinate, indicating that some clades
found in one analysis (even at BPP values 0.8)
were not found in all others (Fig. 4). Furthermore,
convergence was not reached with 10 000 000
MCMC generations for the DiatBo dataset or
within 20 000 000 MCMC generations for the
M54 dataset (Fig. 5). For the DiatBo dataset,
topological differences between runs were not
minor. In the 10 000 000 generation analysis of
DiatBo, Toxarium, Lampriscus, Biddulphiopsis
and the Cymatosirales (Toxarium and allies)
grouped with Lithodesmiales plus Thalassiosirales
(BPP ¼ 0.90) in one run, whereas they (Toxarium
and allies) were sister to pennates (BPP ¼ 0.5) in
another run. Several species of Pinnularia,
a raphid pennate genus in traditional classifica-
tions and in our MP analyses, were placed at the
base of the diatom tree as sister to Leptocylindrus
with BPP values of 0.88 and 0.98 in each of the two
runs. While several of the 1 000 000 generation runs
recovered a monophyletic Bacillariophyceae, the
fact that we did not recover the pennates in any
of the 10 000 000 generation analyses clearly indi-
cates that even our longest Bayesian runs were far
short of convergence.
We analysed aspects of performance of the M54
data set to obtain a gross estimate of how difficult
it might be to reach convergence in a Bayesian
analysis of several hundred diatom sequences.
The standard deviation of bipartitions between
independent runs for the M54 dataset dropped to
near zero at about 22 million generations, and
thereafter oscillated at 0.1 until the analysis was
terminated at 50 000 000 generations (Fig. 6).
While this might suggest that convergence had
been reached by 22 million generations, plotting
the sampled trees for the last 28 million generations
shows clusters of points off a straight line (Fig. 7).
Discarding trees from the first 45 million genera-
tions resulted in a BPP plot approximating
a straight line. The majority rule consensus tree
returned a convex Coscinodiscophyceae and
monophyletic Bacillariophyceae, but the
Mediophyceae were positively paraphyletic
(Fig. 8). Attheya septentrionalis was grouped with
the pennates at a BPP of 0.95. This is the place-
ment of Attheya obtained from MP analysis.
In fact, incongruence between the Bayesian and
MP trees for dataset M54 is restricted to areas
where BPP values are below 0.70 (not shown).
As judged by the still rapidly dropping split
standard deviations (not shown), Bayesian ana-
lyses of the intermediate-sized MK126 and
MK281 datasets had not converged on the
same posterior probability distribution at
1 000 000 MCMC generations, again underscoring
the difficulty of completing a meaningful
Bayesian analysis on even 100 diatom SSU
sequences in so few MCMC generations. Our
analysis of the MK54 dataset indicates that it
might take as many as 50–100 million genera-
tions or more to reach convergence on datasets
with 600 or more diatom SSU sequences.
Morphological tree
Seven trees of length nine were found. Only
the pennates formed a convex group (neither
the Coscinodiscophyceae or Mediophyceae
were monophyletic, regardless of how the tree
was rooted). In the strict consensus, the
Thalassiosirales were excluded from the remaining
Mediophyceae (Fig. 9) because they share all the
characteristics of the Coscinodiscophyceae (radial
symmetry, globular/isometric auxospore shape/
growth, no perizonium or properizonium), and
have none of the features peculiar to other
Mediophyceae or the Bacillariophyceae.
E. C. Theriot et al. 282
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
Fig. 3. Strict consensus of 147 unique equally most parsimonious trees calculated from the Bolidomonas plus diatom (DiatBo)
dataset with Coscinodiscophyceae and Mediophyceae constrained to monophyly. Only relationships among diatoms are shown.
SSU and the diatom phylogeny 283
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
Discussion
Our results weakly reject the hypothesis that the
Coscinodiscophyceae, Mediophyceae and
Bacillariophyceae are each monophyletic (the
CMB hypothesis). Only the Bacillariophyceae
(pennate diatoms) were monophyletic, whether
we included only closely related outgroups (bolido-
phytes only) or distantly related outgroups (boli-
dophytes and all other stramenopiles). However,
there is little parsimony penalty to constrain trees
to the CMB hypothesis for all datasets.
Given that greatly different topologies can be
obtained from SSU datasets with little penalty,
it is not surprising that estimates of the diatom
phylogeny based on SSU sequences vary widely
between studies using different taxa, alignments,
and optimality criteria. For example, the few
studies hinting at the possibility of a monophyletic
Coscinodiscophyceae and paraphyletic Medio-
phyceae or vice versa used relatively few diatom
SSU sequences. Very early in the use of SSU
data in diatom systematics using 11 diatoms
including three Coscinodiscophyceae and one
member of the Mediophyceae, Medlin et al.
(1993) returned a monophyletic Coscinodisco-
phyceae. Using 29 diatom SSU sequences they
later (Medlin et al. 1996a, b) returned
a monophyletic Coscinodiscophyceae and parap-
hyletic Mediophyceae. Kooistra  Medlin (1996)
analysed that same dataset, experimenting with
various approaches to deal with the potential
long-branch problem introduced by ‘aberrantly
evolving’ diatoms; each approach returned
a monophyletic Coscinodiscophyceae and para-
phyletic Mediophyceae, although relationships
within the mediophytes were dependent upon the
method used. Kooistra et al. (2003) used 38 diatom
SSU sequences, only two of which were Coscino-
discophyceae, both on long branches, returning
a monophyletic Coscinodiscophyceae and para-
phyletic Mediophyceae. Using 51 diatom SSU
sequences, Chepurnov et al. (2008) also returned
a monophyletic Coscinodiscophyceae and para-
phyletic Mediophyceae. However, they only ran
4 000 000 MCMC generations, so it is unclear if
they had reached convergence of topology and
posterior probabilities.
In contrast, Cavalier-Smith  Chao (2006),
focusing not on diatoms but on a wide range of
Fig. 4. Bipartition partition probability plots of two runs
(split runs) from the 1 000 000 Markov Chain Monte Carlo
(MCMC) generation Bayesian analysis of our diatom plus
bolidophyte dataset (DiatBo: upper plot) and of the Medlin
et al. (2008) dataset (M54: lower plot). 90% burn-in used
for each.
Fig. 5. Bipartition partition probability plots of two runs
(split runs) from the 10 000 000 Markov Chain Monte
Carlo (MCMC) generation Bayesian analysis of our
diatom plus bolidophyte dataset (DiatBo: upper plot) and
the 20 000 000 generation Medlin et al. (2008) dataset (M54:
lower plot). 90% burn-in used for each.
E. C. Theriot et al. 284
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
protists including diatoms, used a wide range
of outgroups but only 32 diatom SSU sequences
in a distance (neighbor-joining) analysis. While
they found moderate (70%) BS support for mono-
phyly of the Mediophyceae, they also found
a paraphyletic Coscinodiscophyceae, with the
internode excluding Melosirales from other
Coscinodiscophyceae receiving slightly higher sup-
port than that found for a monophyletic
Mediophyceae (BS ¼ 72%). In perhaps the most
extreme case of taxon sampling effects using
eleven diatom exemplars when studying relation-
ships among alveolates and stramenopiles, Van
de Peer et al. (1996) returned monophyly for the
centric diatoms as a whole.
It is not simply monophyly (or not) of the cen-
trics, the Coscinodiscophyceae or Mediophyceae
that has proved unstable in different SSU analyses.
Three studies, which each included more than
100 diatom sequences, offer the opportunity to
compare trees calculated under a single optimality
criterion (Bayesian inference). These revealed that
taxon sampling differences alone may account for
very different tree topologies. Based on 123
diatoms (Medlin  Kaczmarska, 2004) the
Lithodesmiales grouped with the Thalassiosirales
(BPP ¼ 1.0), with 181 diatom SSU sequences
(Alverson et al., 2006) they grouped with the
Hemiaulales to the exclusion of the
Thalassiosirales (BPP ¼ 1.0) and with an unknown
number of diatom sequences (Sims et al., 2006) the
Lithodesmiales grouped with the Biddulphiales,
Triceratiales and Toxarium to the exclusion of
the Thalassiosirales (BPP ¼ 1.0). The unpublished
dataset for Fig. 2 of Sims et al. (2006) has been
characterized as including more than 800 ingroup
sequences (Medlin et al., 2008). It should be noted
that alignment methods have varied greatly among
the many studies using SSU sequences and these
could be a possible source of variation that has yet
to be fully explored. This variation has the poten-
tial to change diatom SSU tree topology radically
(Medlin et al., 2008).
Among the many trees generated using SSU,
the most radical and controversial trees
Fig. 7. Bipartition probability plot of two runs from the
M54 dataset of Medlin et al. (2008). The upper plot dis-
carded the first 22 million Markov Chain Monte Carlo
(MCMC) generations (or 44% burn-in, based on initial
minimum at ca. 22 million MCMC generations from
Fig. 6). The lower plot discarded the first 45 million
MCMC generations (or 90% burn-in).
Fig. 6. Standard deviation of likelihood scores among inde-
pendent runs (split runs) vs number of generations for
Bayesian analysis of our diatom plus bolidophyte
(DiatBo) dataset (upper plot) and of the Medlin et al.
(2008) dataset (M54: lower plot). The line in the upper
plot represents a power function estimate of split standard
deviations out to 50 000 000 Markov Chain Monte Carlo
(MCMC) generations.
SSU and the diatom phylogeny 285
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
Fig. 8. Majority rule consensus tree (calculated without an outgroup and arbitrarily rooted in the middle of the
Coscinodiscophyceae) derived from 50 000 000 Markov Chain Monte Carlo generation Bayesian analysis of the M54 dataset
from Medlin et al. (2008) with 90% burn-in. Numbers or symbols below nodes are bipartition posterior probability (BPP)
values. Asterisks indicate BPP values 95%.
E. C. Theriot et al. 286
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
Fig. 9. Unrooted tree of diatom genera as determined by a parsimony analysis of the morphology matrix of Table 2 in Medlin
(2004). Strict consensus of eight trees.
SSU and the diatom phylogeny 287
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
(Williams  Kociolek, 2007) are those that support
the CMB hypothesis: the MP tree of 8600þ SSU
sequences (123 diatoms) by Medlin  Kaczmarska
(2004), the Bayesian tree (800þ diatom sequences
with bolidophyte outgroups) by Sims et al. (2006),
and the Bayesian tree (54 diatom SSU sequences
with no outgroup) by Medlin et al. (2008).
Medlin  Kaczmarska (2004) claimed that their
MP tree (8600þ sequences, but only 123 diatoms)
was more accurate than their Bayesian tree (same
123 diatoms but only three bolidophyte SSU
outgroup sequences) because including distantly
related outgroups increased the number of parsi-
mony informative characters. However, while
increased taxon sampling within the scope of
the problem (i.e. within diatoms) may increase
accuracy, increased taxon sampling outside the
scope of the problem (adding distant outgroups)
will likely decrease accuracy of phylogenetic infer-
ence (Hillis, 1998; Pollock et al., 2002; Hillis et al.,
2003; Hedtke et al., 2006; Verbruggen  Theriot,
2008). Medlin  Kaczmarska (2004) cited Bollback
(2002) as support for their position, but that paper
studied the effects of adding characters, not taxa
(ingroup or outgroup), and only in the context
of model-based methods, specifically accuracy
of model selection for phylogenetic analysis, and
is therefore not pertinent. Thus, contrary to the
claims of Medlin  Kaczmarska (2004),
one could hypothesize that recovery of the CMB
tree under parsimony is an artefact of increased
error caused by addition of distantly related
outgroup taxa. In the light of the literature on
taxon sampling, a more substantive claim was
made by Sims et al. (2006), who suggested that
increased ingroup sampling led to recovery of
the CMB hypothesis, this time with high BPP
support values.
However, we suggest that recovery of the CMB
hypothesis in Medlin  Kaczmarska (2004), Sims
et al. (2006) and Medlin et al. (2008) is probably
a result of insufficient tree search effort. Medlin 
Kaczmarska (2004) used the MP search in ARB,
whose most effective heuristic search algorithm
employs a combination of Nearest Neighbor
Interchange and Kernighan-Lin optimization,
which together are less effective than the
commonly used Tree–Bisection–Reconnection
algorithm, and certainly not as effective as other
methods, such as the parsimony ratchet (Nixon,
1999). Given the large number of near-optimal
trees that support the CMB hypothesis in
our dataset, it is likely that a suboptimal
search might find any one of these suboptimal
trees. Similarly, the Bayesian inference (Sims
et al. (2006)) was probably also confounded by
insufficient search of tree space. They only ran
1 000 000 MCMC generations. Our analysis of
our DiatBo dataset (673 diatoms plus seven boli-
dophytes) had not reached convergence at
10 000 000 generations. Our analysis of the M54
dataset, presumably the same alignment but with
far fewer taxa than used by Sims et al. (2006),
seems to have required at least 45 million genera-
tions for the burn-in alone. The tree in Fig. 1a
(Medlin et al. 2008) supporting the CMB hypoth-
esis, is clearly an artefact of running far too few
MCMC generations, and even if the tree topology
is correct, monophyly of the Coscinodiscophyceae
is an artefact of arbitrary rooting in the absence of
an outgroup.
Thus, our results strongly suggest that the choice
of optimality criterion has less influence on trees
derived from SSU data than does the proper appli-
cation of that choice. All methods, alignments and
taxon sampling schemes we reviewed or re-analysed
returned weak rejection of the CMB hypothesis.
Both Medlin  Kaczmarska (2004) and Sims
et al. (2006) argued that morphological data were
congruent with their SSU trees. However, the char-
acters discussed are either irrelevant to testing the
CMB hypothesis, or ambiguous about it (e.g. sper-
matozoid structure [both the Coscinodiscophyceae
and Mediophyceae have merogenous and hologen-
ous spermatozoids]; pyrenoid structure [one type
is apparently symplesiomorphically shared by the
Coscinodiscophyceae and Mediophyceae, while
pyrenoid structure in the Thalassiosirales is auta-
pomorphic for the order]). Using the Medlin 
Kaczmarska (2004) morphological character
matrix, our tree excluded the Thalassiosirales
from the Mediophyceae on the basis of auxospore
characteristics. Nevertheless it was claimed that the
particular pattern of auxospore formation under
discussion was retained in the Thalassiosirales
(Medlin  Kaczmarska, 2004: 267). To make this
argument under parsimony, the Thalassiosirales
would have to be the sister group to all remaining
Mediophyceae, a relationship not recovered in
either Medlin  Kaczmarska (2004) or Sims
et al. (2006).
Complicated scenarios are invoked ad hoc to
explain the distribution of the four different
Golgi body arrangements. Of the two widely dis-
tributed arrangements, the so-called Type 1 (sensu
Medlin  Kaczmarska, 2004) arrangement was
attributed to most of the Coscinodiscophyceae,
and the Type 2 arrangement was attributed
to the Aulacoseirales (Coscinodiscophyceae),
Mediophyceae, and Bacillariophyceae. If Type 1
is apomorphic and Type 2 is not, then there
is no evidence from the Golgi arrangement
that the Aulacoseirales belong to the
Coscinodiscophyceae. If Type 2 is apomorphic,
regardless of the interpretation of Type 1, the
Golgi character is congruent with our SSU trees
E. C. Theriot et al. 288
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
and rejects the CMB hypothesis by placing the
Aulacoseirales with the Mediophyceae and
Bacillariophyceae. Nevertheless, Medlin 
Kaczmarksa (2004) argued away this incongru-
ence, explaining the distribution of Golgi body
types in terms of ancestral polymorphisms, impli-
citly invoking unobserved character conditions in
unobserved ancestral species for as far back as the
common ancestor to red algae and diatoms
(Medlin  Kaczmarska, 2004: 265): ‘However,
G-ER-M units are known from the oomycetes
and the red algae, whereas an association of the
Golgi around the nucleus is also known in the
Labyrinthuloides. Thus, it would appear that
both features are present in ancestors of the dia-
toms and the potential host cells of their plastids.
It can be argued that the two traits then segregated
themselves in the two separate lineages as they
evolved.’
Conclusion
Medlin  Kaczmarska (2004) and Sims et al.
(2006) proposed monophyly of each of the
Coscinodiscophyceae, Mediophyceae, and
Bacillariophyceae. Since the unavailability of the
datasets (the only ones to support the CMB
hypothesis apart from that of Medlin et al.
(2008)) precluded direct reproduction of their
results, we assembled datasets of similar size and
characteristics. Our results suggest that the CMB
hypothesis is rejected by SSU data, albeit very
weakly. Similarly, our re-analysis of morphological
evidence proposed by Medlin  Kaczmarska
(2004) also weakly rejects the CMB hypothesis.
Medlin  Kaczmarska (2004) very likely recovered
a suboptimal MP tree for their 8600þ sequence
dataset and Sims et al. (2006) very likely failed to
converge on the true posterior distribution of trees
in their Bayesian analysis. Conversely, if Medlin 
Kaczmarska (2004) did recover the MP tree or
trees and if the Sims et al. (2006) analysis did
reach convergence for their dataset, then our
results demonstrate that the likelihood of their
having done so is highly dependent on taxon
sampling and/or sequence alignment. We have
demonstrated that the Medlin et al. (2008) tree
supporting the CMB hypothesis is an artefact
and that it must be concluded that the CMB
hypothesis is far from robust, regardless of how
one interprets the variation between studies.
In summary, pursuit of a well-supported phylo-
geny of diatoms seems to be limited as much by the
number of characters per taxon as by the number
of taxa for which data exist. There is a small but
growing rbcL dataset which rejects the CMB
hypothesis (Choi et al., 2008). Very limited coxI
data supports the CMB hypothesis, but analyses
so far only include four species (Ehara et al.,
2000). While nSSU data are a useful addition to
the difficult problem of inferring diatom phylo-
geny, continued use of SSU alone, as Patterson
(1994: 185) wrote in a similar context, might
simply be an ineffective attempt to ‘. . . wring
truth from recalcitrant data.’
Acknowledgements
ECT was supported by NSF EF 0629410 and the
Jane and Roland Blumberg Centennial
Professorship in Molecular Evolution. AJA was sup-
ported by an NIH Ruth L. Kirschstein NRSA
Postdoctoral Fellowship (1F32GM080079-01A1).
Both also acknowledge the Tony Institute. RRG
and JJC were supported by NIH GM067317.
References
ADL, S.M., SIMPSON, A.G.B., FARMER, M.A., ANDERSEN, R.A.,
ANDERSON, O.R., BARTA, J.R., BOWSER, S.S.,
BRUGEROLLE, G.U.Y., FENSOME, R.A., FREDERICQ, S., et al.
(2005). A new higher level classification of eukaryotes with
emphasis on the taxonomy of protists. J. Eukaryotic
Microbiol., 52: 399–451.
ALVERSON, A.J., CANNONE, J.J., GUTELL, R.R.  THERIOT, E.C.
(2006). The evolution of elongate shape in diatoms. J. Phycol.,
42: 655–668.
ALVERSON, A.J.  THERIOT, E.C. (2005). Comments on recent pro-
gress toward reconstructing the diatom phylogeny. J. Nanosci.
Nanotechnol., 5: 57–62.
ANDERSEN, R.A. (2004). Biology and systematics of heterokont and
haptophyte algae. Am. J. Bot., 91: 1508–1522.
BOLLBACK, J.P. (2002). Bayesian model adequacy and choice
in phylogenetics. Mol. Biol. Evol., 19: 1171–1180.
CANNONE, J.J., SUBRAMANIAN, S., SCHNARE, M.N., COLLETT, J.R.,
D’SOUZA, L.M., DU, Y., FENG, B., LIN, N., MADABUSI, L.V.,
MULLER, K.M., et al. (2002). The Comparative RNA Web
(CRW) Site: an online database of comparative sequence and
structure information for ribosomal, intron, and other RNAs.
BMC Bioinformatics, 3: 2.
CAVALIER-SMITH, T.  CHAO, E. (2006). Phylogeny and megasyste-
matics of phagotrophic heterokonts (Kingdom Chromista).
J. Mol. Evol., 62: 388–420.
CHEPURNOV, V.A., MANN, D.G., VON DASSOW, P.,
VANORMELINGEN, P., GILLARD, J., INZE´ , D., SABBE, K. 
VYVERMAN, W. (2008). In search of new tractable diatoms for
experimental biology. BioEssays, 30: 692–702.
CHOI, H.-G., JOO, H.M., JUNG, W., HONG, S.S., KANG, J.-S. 
KANG, S.-H. (2008). Morphology and phylogenetic relationships
of some psychrophilic polar diatoms (Bacillariophyta). Nov.
Hedwig. Beih., 133: 7–30.
DAUGBJERG, N.  ANDERSEN, R.A. (1997). A molecular phylogeny
of the heterokont algae based on analyses of chloroplast-encoded
rbcL sequence data. J. Phycol., 33: 1031–1041.
EHARA, M., INAGAKI, Y., WATANABE, K.I.  OHAMA, T. (2000).
Phylogenetic analysis of diatom coxI genes and implications
of a fluctuating GC content on mitochondrial genetic code evo-
lution. Curr. Genet., 37: 29–33.
GOERTZEN, L.R.  THERIOT, E.C. (2003). Effect of taxon sampling,
character weighting, and combined data on the interpretation of
relationships among the heterokont algae. J. Phycol., 39:
423–439.
GOLOBOFF, P., FARRIS, J., NIXON, K. (2003). T.N.T.: Tree analysis
using new technology. Available at: www.cladistics.com.
SSU and the diatom phylogeny 289
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
GUTELL, R.R., LEE, J.C.  CANNONE, J.J. (2002). The accuracy of
ribosomal RNA comparative structure models. Curr. Opin.
Structural Biol., 12: 301–310.
GUTELL, R.R., POWER, A., HERTZ, G.Z., PUTZ, E.J. 
STORMO, G.D. (1992). Identifying constraints on the higher-
order structure of RNA: continued development and application
of comparative sequence analysis methods. Nucl. Acids Res., 20:
5785–5795.
GUTELL, R.R., WEISER, B., WOESE, C.R.  NOLLER, H.F. (1985).
Comparative anatomy of 16S-like ribosomal RNA. Prog. Nucl.
Acid Res. Mol. Biol., 32: 155–216.
HEDTKE, S., TOWNSEND, T.  HILLIS, D. (2006). Resolution of phy-
logenetic conflict in large data sets by increased taxon sampling.
System. Biol., 55: 522–529.
HILLIS, D.M. (1998). Taxonomic sampling, phylogenetic accuracy
and investigator bias. System. Biol., 47: 3–8.
HILLIS, D.M., POLLOCK, D.D., MCGUIRE, J.A.  ZWICKL, D.J.
(2003). Is sparse taxon sampling a problem for phylogenetic
inference? System. Biol., 52: 124–126.
KOOISTRA, W., DE STEFANO, M., MANN, D.G., SALMA, N. 
MEDLIN, L.K. (2003). Phylogenetic position of Toxarium,
a pennate-like lineage within centric diatoms
(Bacillariophyceae). J. Phycol., 39: 185–197.
KOOISTRA, W.H.C.F.  MEDLIN, L.K. (1996). Evolution of the
diatoms (Bacillariophyta): IV. A reconstruction of their age
from small subunit rRNA coding regions and the fossil record.
Mol. Phylogenet. Evol., 6: 391–407.
LARSEN, N., OLSEN, G.J., MAIDAK, B.L., MCCAUGHEY, M.J.,
OVERBEEK, R., MACKE, T.J., MARSH, T.L.  WOESE, C.R.
(1993). The Ribosomal Database Project. Nucl. Acids Res., 21:
3021–3023.
MEDLIN, L.K.  KACZMARSKA, I. (2004). Evolution of the diatoms
V: Morphological and cytological support for the major clades
and a taxonomic revision. Phycologia, 43: 245–270.
MEDLIN, L.K., KOOISTRA, W.H.C.F., GERSONDE, R. 
WELLBROCK, U. (1996a). Evolution of the diatoms
(Bacillariophyta): II. Nuclear-encoded small-subunit rRNA
sequence comparisons confirm a paraphyletic origin for the cen-
tric diatoms. Mol. Biol. Evol., 13: 67–75.
MEDLIN, L.K., KOOISTRA, W.H.C.F., GERSONDE, R. 
WELLBROCK, U. (1996b). Evolution of the diatoms
(Bacillariophyta): III. Molecular evidence for the origin of the
Thalassiosirales. Nov. Hedwig. Beih., 112: 221–234.
MEDLIN, L.K., KOOISTRA, W.H.C.F.  SCHMID, A.-M.M. (2000).
A review of the evolution of the diatoms–a total approach using
molecules, morphology and geology. In The Origin and Early
Evolution of the Diatoms: Fossil, Molecular and Biogeographical
Approaches (Witkowski, A. and Sieminska, J., editors), 13–36.
Szafer Institute of Botany, Polish Academy of Sciences,
Krako´ w, Poland.
MEDLIN, L.K., SATO, S., MANN, D.G.  KOOISTRA, W.H.C.F.
(2008). Molecular evidence confirms sister relationship of
Ardissonea, Climacosphenia and Toxarium within the bipolar cen-
tric diatoms (Bacillariophyta, Mediophyceae), and cladistic ana-
lyses confirm that extremely elongate shape has arisen twice in
the diatoms. J. Phycol., 45: 1340–1348.
MEDLIN, L.K., WILLIAMS, D.M.  SIMS, P.A. (1993). The evolution
of the diatoms (Bacillariophyta). I. Origin of the group and
assessment of the monophyly of its major divisions. Eur.
J. Phycol., 28: 261–275.
NIXON, K. (1999). The parsimony ratchet, a new method for rapid
parsimony analysis. Cladistics, 15: 407–414.
NYLANDER, J. A. A. (2004). MrModeltest v2. Program distributed
by the author. Evolutionary Biology Centre, Uppsala University.
PATTERSON, C. (1994). Null or minimal models. In Models in
Phylogeny Reconstruction (Scotland, R., Siebert, D.J., and
Williams, D.M, editors), 173–192. Oxford University Press,
Oxford, UK.
POLLOCK, D.D., ZWICKL, D.J., MCGUIRE, J.A.  HILLIS, D.M.
(2002). Increased taxon sampling is advantageous for phyloge-
netic inference. System. Biol., 51: 664–671.
ROUND, F.E., CRAWFORD, R.M.  MANN, D.G. (1990).The
Diatoms: Biology  Morphology of the Genera. Cambridge,
UK: Cambridge University Press.
SIMONSEN, R. (1979). The diatom system: Ideas on phylogeny.
Bacillaria, 2: 9–71.
SIMS, P.A., MANN, D.G.  MEDLIN, L.K. (2006). Evolution of the
diatoms: Insights from fossil, biological and molecular data.
Phycologia, 45: 361–402.
SORHANNUS, U. (2004). Diatom phylogenetics inferred based on
direct optimization of nuclear-encoded SSU rRNA sequences.
Cladistics, 20: 487–497.
SORHANNUS, U. (2007). A nuclear-encoded small-subunit
ribosomal RNA timescale for diatom evolution. Mar.
Micropaleontol., 65: 1–12.
VAN DE PEER, Y., VAN DER AUWERA, G.  DE WACHTER, R.
(1996). The evolution of stramenopiles and alveolates as derived
by ‘‘substitution rate calibration’’ of small ribosomal subunit
RNA. J. Mol. Evol., 42: 201–210.
VERBRUGGEN, H.  THERIOT, E.C. (2008). Building trees of algae:
Some advances in phylogenetic and evolutionary analysis. Eur. J.
Phycol., 43: 229–252.
WILGENBUSCH, J.C., WARREN, D.L., SWOFFORD, D.L. (2004).
AWTY: A system for graphical exploration of MCMC conver-
gence in Bayesian phylogenetic inference. Available at: http://
ceb.csit.fsu.edu/awty.
WILLIAMS, D.M.  KOCIOLEK, J.P. (2007). Pursuit of a natural
classification of diatoms: History, monophyly and the rejection
of paraphyletic taxa. Eur. J. Phycol., 42: 313–319.
E. C. Theriot et al. 290
DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009

Contenu connexe

Tendances

Gutell 087.mpe.2003.29.0216
Gutell 087.mpe.2003.29.0216Gutell 087.mpe.2003.29.0216
Gutell 087.mpe.2003.29.0216Robin Gutell
 
Phylum Ciliophora: Morphology, Phylogeny, and Systematics
Phylum Ciliophora: Morphology, Phylogeny, and SystematicsPhylum Ciliophora: Morphology, Phylogeny, and Systematics
Phylum Ciliophora: Morphology, Phylogeny, and SystematicsEukRef
 
GRM2013: Rice root phenotyping of the OryzaSNP panel: associated genomic regi...
GRM2013: Rice root phenotyping of the OryzaSNP panel: associated genomic regi...GRM2013: Rice root phenotyping of the OryzaSNP panel: associated genomic regi...
GRM2013: Rice root phenotyping of the OryzaSNP panel: associated genomic regi...CGIAR Generation Challenge Programme
 
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...Joe Suzuki
 
Evaluation of viable selection criteria at the seedling stage in corn genotyp...
Evaluation of viable selection criteria at the seedling stage in corn genotyp...Evaluation of viable selection criteria at the seedling stage in corn genotyp...
Evaluation of viable selection criteria at the seedling stage in corn genotyp...Innspub Net
 
Flow Cytometric Analysis for Ploidy and DNA Content of Banana Variants Induce...
Flow Cytometric Analysis for Ploidy and DNA Content of Banana Variants Induce...Flow Cytometric Analysis for Ploidy and DNA Content of Banana Variants Induce...
Flow Cytometric Analysis for Ploidy and DNA Content of Banana Variants Induce...paperpublications3
 
PAG XXV POSTER KHANAL
PAG XXV POSTER KHANALPAG XXV POSTER KHANAL
PAG XXV POSTER KHANALSameer Khanal
 
Summer SURP Poster v6-3
Summer SURP Poster v6-3Summer SURP Poster v6-3
Summer SURP Poster v6-3Austin Clark
 
Knowledge Discovery in Classification and Distribution of Butterfly Species F...
Knowledge Discovery in Classification and Distribution of Butterfly Species F...Knowledge Discovery in Classification and Distribution of Butterfly Species F...
Knowledge Discovery in Classification and Distribution of Butterfly Species F...ijtsrd
 
Epigenetic variation in seagrass clones
Epigenetic variation in seagrass clonesEpigenetic variation in seagrass clones
Epigenetic variation in seagrass clonesAlexander Jueterbock
 
Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...
Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...
Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...Jonathan Eisen
 
Characterization of CesA6 in Oat copy
Characterization of CesA6 in Oat copyCharacterization of CesA6 in Oat copy
Characterization of CesA6 in Oat copyCassie Dohse
 
Nrf2: A Guardian of Healthspan and Gatekeeper of Species Longevity
Nrf2:  A Guardian of Healthspan and Gatekeeper of Species LongevityNrf2:  A Guardian of Healthspan and Gatekeeper of Species Longevity
Nrf2: A Guardian of Healthspan and Gatekeeper of Species LongevityLifeVantage
 
Effects of exotic alleles and genetic backgrounds on fiber quality traits in ...
Effects of exotic alleles and genetic backgrounds on fiber quality traits in ...Effects of exotic alleles and genetic backgrounds on fiber quality traits in ...
Effects of exotic alleles and genetic backgrounds on fiber quality traits in ...Sameer Khanal
 

Tendances (20)

Gutell 087.mpe.2003.29.0216
Gutell 087.mpe.2003.29.0216Gutell 087.mpe.2003.29.0216
Gutell 087.mpe.2003.29.0216
 
Phylum Ciliophora: Morphology, Phylogeny, and Systematics
Phylum Ciliophora: Morphology, Phylogeny, and SystematicsPhylum Ciliophora: Morphology, Phylogeny, and Systematics
Phylum Ciliophora: Morphology, Phylogeny, and Systematics
 
GRM2013: Rice root phenotyping of the OryzaSNP panel: associated genomic regi...
GRM2013: Rice root phenotyping of the OryzaSNP panel: associated genomic regi...GRM2013: Rice root phenotyping of the OryzaSNP panel: associated genomic regi...
GRM2013: Rice root phenotyping of the OryzaSNP panel: associated genomic regi...
 
Current Projects Summary
Current Projects SummaryCurrent Projects Summary
Current Projects Summary
 
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...
Forest Learning based on the Chow-Liu Algorithm and its Application to Genom...
 
Evaluation of viable selection criteria at the seedling stage in corn genotyp...
Evaluation of viable selection criteria at the seedling stage in corn genotyp...Evaluation of viable selection criteria at the seedling stage in corn genotyp...
Evaluation of viable selection criteria at the seedling stage in corn genotyp...
 
Flow Cytometric Analysis for Ploidy and DNA Content of Banana Variants Induce...
Flow Cytometric Analysis for Ploidy and DNA Content of Banana Variants Induce...Flow Cytometric Analysis for Ploidy and DNA Content of Banana Variants Induce...
Flow Cytometric Analysis for Ploidy and DNA Content of Banana Variants Induce...
 
PAG XXV POSTER KHANAL
PAG XXV POSTER KHANALPAG XXV POSTER KHANAL
PAG XXV POSTER KHANAL
 
Isaac Kohane, "A Data Perspective on Autonomy, Human Rights, and the End of N...
Isaac Kohane, "A Data Perspective on Autonomy, Human Rights, and the End of N...Isaac Kohane, "A Data Perspective on Autonomy, Human Rights, and the End of N...
Isaac Kohane, "A Data Perspective on Autonomy, Human Rights, and the End of N...
 
Summer SURP Poster v6-3
Summer SURP Poster v6-3Summer SURP Poster v6-3
Summer SURP Poster v6-3
 
CV_QAJ_2014
CV_QAJ_2014CV_QAJ_2014
CV_QAJ_2014
 
Knowledge Discovery in Classification and Distribution of Butterfly Species F...
Knowledge Discovery in Classification and Distribution of Butterfly Species F...Knowledge Discovery in Classification and Distribution of Butterfly Species F...
Knowledge Discovery in Classification and Distribution of Butterfly Species F...
 
Epigenetic variation in seagrass clones
Epigenetic variation in seagrass clonesEpigenetic variation in seagrass clones
Epigenetic variation in seagrass clones
 
Resume
ResumeResume
Resume
 
Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...
Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...
Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...
 
Characterization of CesA6 in Oat copy
Characterization of CesA6 in Oat copyCharacterization of CesA6 in Oat copy
Characterization of CesA6 in Oat copy
 
Nrf2: A Guardian of Healthspan and Gatekeeper of Species Longevity
Nrf2:  A Guardian of Healthspan and Gatekeeper of Species LongevityNrf2:  A Guardian of Healthspan and Gatekeeper of Species Longevity
Nrf2: A Guardian of Healthspan and Gatekeeper of Species Longevity
 
Isme 2018
Isme 2018Isme 2018
Isme 2018
 
CurriculumVitae
CurriculumVitaeCurriculumVitae
CurriculumVitae
 
Effects of exotic alleles and genetic backgrounds on fiber quality traits in ...
Effects of exotic alleles and genetic backgrounds on fiber quality traits in ...Effects of exotic alleles and genetic backgrounds on fiber quality traits in ...
Effects of exotic alleles and genetic backgrounds on fiber quality traits in ...
 

Similaire à SSU-DIATOM

Gutell 097.jphy.2006.42.0655
Gutell 097.jphy.2006.42.0655Gutell 097.jphy.2006.42.0655
Gutell 097.jphy.2006.42.0655Robin Gutell
 
EVE 161 Winter 2018 Class 18
EVE 161 Winter 2018 Class 18EVE 161 Winter 2018 Class 18
EVE 161 Winter 2018 Class 18Jonathan Eisen
 
MIB200A at UCDavis Module: Microbial Phylogeny; Class 3
MIB200A at UCDavis Module: Microbial Phylogeny; Class 3MIB200A at UCDavis Module: Microbial Phylogeny; Class 3
MIB200A at UCDavis Module: Microbial Phylogeny; Class 3Jonathan Eisen
 
Bioinformatics A Biased Overview
Bioinformatics A Biased OverviewBioinformatics A Biased Overview
Bioinformatics A Biased OverviewPhilip Bourne
 
Utility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogeneticUtility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogeneticEdizonJambormias2
 
Gutell 056.mpe.1996.05.0391
Gutell 056.mpe.1996.05.0391Gutell 056.mpe.1996.05.0391
Gutell 056.mpe.1996.05.0391Robin Gutell
 
Gutell 094.int.j.plant.sci.2005.166.815
Gutell 094.int.j.plant.sci.2005.166.815Gutell 094.int.j.plant.sci.2005.166.815
Gutell 094.int.j.plant.sci.2005.166.815Robin Gutell
 
Microbial Phylogenomics (EVE161) Class 5
Microbial Phylogenomics (EVE161) Class 5Microbial Phylogenomics (EVE161) Class 5
Microbial Phylogenomics (EVE161) Class 5Jonathan Eisen
 
parasite poster for linkin
parasite poster for linkinparasite poster for linkin
parasite poster for linkinRachael Swedberg
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
Gutell 069.mpe.2000.15.0083
Gutell 069.mpe.2000.15.0083Gutell 069.mpe.2000.15.0083
Gutell 069.mpe.2000.15.0083Robin Gutell
 
Data as research output, data as part of the scholarly record
Data as research output, data as part of the scholarly recordData as research output, data as part of the scholarly record
Data as research output, data as part of the scholarly recordTodd Vision
 
Plant pathology in the post-genomics era
Plant pathology in the post-genomics eraPlant pathology in the post-genomics era
Plant pathology in the post-genomics eraSophien Kamoun
 
A general model for the origin of allometric scaling laws in biology
A general model for the origin of allometric scaling laws in biologyA general model for the origin of allometric scaling laws in biology
A general model for the origin of allometric scaling laws in biologyJosé Luis Moreno Garvayo
 
Classification_and_Ordination_Methods_as_a_Tool.pdf
Classification_and_Ordination_Methods_as_a_Tool.pdfClassification_and_Ordination_Methods_as_a_Tool.pdf
Classification_and_Ordination_Methods_as_a_Tool.pdfAgathaHaselvin
 
Bioinformatica 24-11-2011-t6-phylogenetics
Bioinformatica 24-11-2011-t6-phylogeneticsBioinformatica 24-11-2011-t6-phylogenetics
Bioinformatica 24-11-2011-t6-phylogeneticsProf. Wim Van Criekinge
 

Similaire à SSU-DIATOM (20)

Gutell 097.jphy.2006.42.0655
Gutell 097.jphy.2006.42.0655Gutell 097.jphy.2006.42.0655
Gutell 097.jphy.2006.42.0655
 
bai2
bai2bai2
bai2
 
EVE 161 Winter 2018 Class 18
EVE 161 Winter 2018 Class 18EVE 161 Winter 2018 Class 18
EVE 161 Winter 2018 Class 18
 
MIB200A at UCDavis Module: Microbial Phylogeny; Class 3
MIB200A at UCDavis Module: Microbial Phylogeny; Class 3MIB200A at UCDavis Module: Microbial Phylogeny; Class 3
MIB200A at UCDavis Module: Microbial Phylogeny; Class 3
 
Bioinformatics A Biased Overview
Bioinformatics A Biased OverviewBioinformatics A Biased Overview
Bioinformatics A Biased Overview
 
Utility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogeneticUtility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogenetic
 
Gutell 056.mpe.1996.05.0391
Gutell 056.mpe.1996.05.0391Gutell 056.mpe.1996.05.0391
Gutell 056.mpe.1996.05.0391
 
Gutell 094.int.j.plant.sci.2005.166.815
Gutell 094.int.j.plant.sci.2005.166.815Gutell 094.int.j.plant.sci.2005.166.815
Gutell 094.int.j.plant.sci.2005.166.815
 
Topology
TopologyTopology
Topology
 
Microbial Phylogenomics (EVE161) Class 5
Microbial Phylogenomics (EVE161) Class 5Microbial Phylogenomics (EVE161) Class 5
Microbial Phylogenomics (EVE161) Class 5
 
parasite poster for linkin
parasite poster for linkinparasite poster for linkin
parasite poster for linkin
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Gutell 069.mpe.2000.15.0083
Gutell 069.mpe.2000.15.0083Gutell 069.mpe.2000.15.0083
Gutell 069.mpe.2000.15.0083
 
Data as research output, data as part of the scholarly record
Data as research output, data as part of the scholarly recordData as research output, data as part of the scholarly record
Data as research output, data as part of the scholarly record
 
Plant pathology in the post-genomics era
Plant pathology in the post-genomics eraPlant pathology in the post-genomics era
Plant pathology in the post-genomics era
 
A general model for the origin of allometric scaling laws in biology
A general model for the origin of allometric scaling laws in biologyA general model for the origin of allometric scaling laws in biology
A general model for the origin of allometric scaling laws in biology
 
Classification_and_Ordination_Methods_as_a_Tool.pdf
Classification_and_Ordination_Methods_as_a_Tool.pdfClassification_and_Ordination_Methods_as_a_Tool.pdf
Classification_and_Ordination_Methods_as_a_Tool.pdf
 
E1062632
E1062632E1062632
E1062632
 
Molecular phylogenetics
Molecular phylogeneticsMolecular phylogenetics
Molecular phylogenetics
 
Bioinformatica 24-11-2011-t6-phylogenetics
Bioinformatica 24-11-2011-t6-phylogeneticsBioinformatica 24-11-2011-t6-phylogenetics
Bioinformatica 24-11-2011-t6-phylogenetics
 

Plus de Robin Gutell

Gutell 124.rna 2013-woese-19-vii-xi
Gutell 124.rna 2013-woese-19-vii-xiGutell 124.rna 2013-woese-19-vii-xi
Gutell 124.rna 2013-woese-19-vii-xiRobin Gutell
 
Gutell 123.app environ micro_2013_79_1803
Gutell 123.app environ micro_2013_79_1803Gutell 123.app environ micro_2013_79_1803
Gutell 123.app environ micro_2013_79_1803Robin Gutell
 
Gutell 122.chapter comparative analy_russell_2013
Gutell 122.chapter comparative analy_russell_2013Gutell 122.chapter comparative analy_russell_2013
Gutell 122.chapter comparative analy_russell_2013Robin Gutell
 
Gutell 121.bibm12 alignment 06392676
Gutell 121.bibm12 alignment 06392676Gutell 121.bibm12 alignment 06392676
Gutell 121.bibm12 alignment 06392676Robin Gutell
 
Gutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_dataGutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_dataRobin Gutell
 
Gutell 119.plos_one_2017_7_e39383
Gutell 119.plos_one_2017_7_e39383Gutell 119.plos_one_2017_7_e39383
Gutell 119.plos_one_2017_7_e39383Robin Gutell
 
Gutell 118.plos_one_2012.7_e38203.supplementalfig
Gutell 118.plos_one_2012.7_e38203.supplementalfigGutell 118.plos_one_2012.7_e38203.supplementalfig
Gutell 118.plos_one_2012.7_e38203.supplementalfigRobin Gutell
 
Gutell 114.jmb.2011.413.0473
Gutell 114.jmb.2011.413.0473Gutell 114.jmb.2011.413.0473
Gutell 114.jmb.2011.413.0473Robin Gutell
 
Gutell 117.rcad_e_science_stockholm_pp15-22
Gutell 117.rcad_e_science_stockholm_pp15-22Gutell 117.rcad_e_science_stockholm_pp15-22
Gutell 117.rcad_e_science_stockholm_pp15-22Robin Gutell
 
Gutell 116.rpass.bibm11.pp618-622.2011
Gutell 116.rpass.bibm11.pp618-622.2011Gutell 116.rpass.bibm11.pp618-622.2011
Gutell 116.rpass.bibm11.pp618-622.2011Robin Gutell
 
Gutell 115.rna2dmap.bibm11.pp613-617.2011
Gutell 115.rna2dmap.bibm11.pp613-617.2011Gutell 115.rna2dmap.bibm11.pp613-617.2011
Gutell 115.rna2dmap.bibm11.pp613-617.2011Robin Gutell
 
Gutell 113.ploso.2011.06.e18768
Gutell 113.ploso.2011.06.e18768Gutell 113.ploso.2011.06.e18768
Gutell 113.ploso.2011.06.e18768Robin Gutell
 
Gutell 112.j.phys.chem.b.2010.114.13497
Gutell 112.j.phys.chem.b.2010.114.13497Gutell 112.j.phys.chem.b.2010.114.13497
Gutell 112.j.phys.chem.b.2010.114.13497Robin Gutell
 
Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485Robin Gutell
 
Gutell 110.ant.v.leeuwenhoek.2010.98.195
Gutell 110.ant.v.leeuwenhoek.2010.98.195Gutell 110.ant.v.leeuwenhoek.2010.98.195
Gutell 110.ant.v.leeuwenhoek.2010.98.195Robin Gutell
 
Gutell 108.jmb.2009.391.769
Gutell 108.jmb.2009.391.769Gutell 108.jmb.2009.391.769
Gutell 108.jmb.2009.391.769Robin Gutell
 
Gutell 107.ssdbm.2009.200
Gutell 107.ssdbm.2009.200Gutell 107.ssdbm.2009.200
Gutell 107.ssdbm.2009.200Robin Gutell
 
Gutell 106.j.euk.microbio.2009.56.0142.2
Gutell 106.j.euk.microbio.2009.56.0142.2Gutell 106.j.euk.microbio.2009.56.0142.2
Gutell 106.j.euk.microbio.2009.56.0142.2Robin Gutell
 
Gutell 105.zoologica.scripta.2009.38.0043
Gutell 105.zoologica.scripta.2009.38.0043Gutell 105.zoologica.scripta.2009.38.0043
Gutell 105.zoologica.scripta.2009.38.0043Robin Gutell
 
Gutell 104.biology.direct.2008.03.016
Gutell 104.biology.direct.2008.03.016Gutell 104.biology.direct.2008.03.016
Gutell 104.biology.direct.2008.03.016Robin Gutell
 

Plus de Robin Gutell (20)

Gutell 124.rna 2013-woese-19-vii-xi
Gutell 124.rna 2013-woese-19-vii-xiGutell 124.rna 2013-woese-19-vii-xi
Gutell 124.rna 2013-woese-19-vii-xi
 
Gutell 123.app environ micro_2013_79_1803
Gutell 123.app environ micro_2013_79_1803Gutell 123.app environ micro_2013_79_1803
Gutell 123.app environ micro_2013_79_1803
 
Gutell 122.chapter comparative analy_russell_2013
Gutell 122.chapter comparative analy_russell_2013Gutell 122.chapter comparative analy_russell_2013
Gutell 122.chapter comparative analy_russell_2013
 
Gutell 121.bibm12 alignment 06392676
Gutell 121.bibm12 alignment 06392676Gutell 121.bibm12 alignment 06392676
Gutell 121.bibm12 alignment 06392676
 
Gutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_dataGutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_data
 
Gutell 119.plos_one_2017_7_e39383
Gutell 119.plos_one_2017_7_e39383Gutell 119.plos_one_2017_7_e39383
Gutell 119.plos_one_2017_7_e39383
 
Gutell 118.plos_one_2012.7_e38203.supplementalfig
Gutell 118.plos_one_2012.7_e38203.supplementalfigGutell 118.plos_one_2012.7_e38203.supplementalfig
Gutell 118.plos_one_2012.7_e38203.supplementalfig
 
Gutell 114.jmb.2011.413.0473
Gutell 114.jmb.2011.413.0473Gutell 114.jmb.2011.413.0473
Gutell 114.jmb.2011.413.0473
 
Gutell 117.rcad_e_science_stockholm_pp15-22
Gutell 117.rcad_e_science_stockholm_pp15-22Gutell 117.rcad_e_science_stockholm_pp15-22
Gutell 117.rcad_e_science_stockholm_pp15-22
 
Gutell 116.rpass.bibm11.pp618-622.2011
Gutell 116.rpass.bibm11.pp618-622.2011Gutell 116.rpass.bibm11.pp618-622.2011
Gutell 116.rpass.bibm11.pp618-622.2011
 
Gutell 115.rna2dmap.bibm11.pp613-617.2011
Gutell 115.rna2dmap.bibm11.pp613-617.2011Gutell 115.rna2dmap.bibm11.pp613-617.2011
Gutell 115.rna2dmap.bibm11.pp613-617.2011
 
Gutell 113.ploso.2011.06.e18768
Gutell 113.ploso.2011.06.e18768Gutell 113.ploso.2011.06.e18768
Gutell 113.ploso.2011.06.e18768
 
Gutell 112.j.phys.chem.b.2010.114.13497
Gutell 112.j.phys.chem.b.2010.114.13497Gutell 112.j.phys.chem.b.2010.114.13497
Gutell 112.j.phys.chem.b.2010.114.13497
 
Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485
 
Gutell 110.ant.v.leeuwenhoek.2010.98.195
Gutell 110.ant.v.leeuwenhoek.2010.98.195Gutell 110.ant.v.leeuwenhoek.2010.98.195
Gutell 110.ant.v.leeuwenhoek.2010.98.195
 
Gutell 108.jmb.2009.391.769
Gutell 108.jmb.2009.391.769Gutell 108.jmb.2009.391.769
Gutell 108.jmb.2009.391.769
 
Gutell 107.ssdbm.2009.200
Gutell 107.ssdbm.2009.200Gutell 107.ssdbm.2009.200
Gutell 107.ssdbm.2009.200
 
Gutell 106.j.euk.microbio.2009.56.0142.2
Gutell 106.j.euk.microbio.2009.56.0142.2Gutell 106.j.euk.microbio.2009.56.0142.2
Gutell 106.j.euk.microbio.2009.56.0142.2
 
Gutell 105.zoologica.scripta.2009.38.0043
Gutell 105.zoologica.scripta.2009.38.0043Gutell 105.zoologica.scripta.2009.38.0043
Gutell 105.zoologica.scripta.2009.38.0043
 
Gutell 104.biology.direct.2008.03.016
Gutell 104.biology.direct.2008.03.016Gutell 104.biology.direct.2008.03.016
Gutell 104.biology.direct.2008.03.016
 

Dernier

A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 

Dernier (20)

A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 

SSU-DIATOM

  • 1. PLEASE SCROLL DOWN FOR ARTICLE This article was downloaded by: [University of Texas at Austin] On: 13 August 2009 Access details: Access Details: [subscription number 907743372] Publisher Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK European Journal of Phycology Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t713725516 The limits of nuclear-encoded SSU rDNA for resolving the diatom phylogeny Edward C. Theriot a ; Jamie J. Cannone b ; Robin R. Gutell b ; Andrew J. Alverson c a Texas Natural Science Center, The University of Texas at Austin, Austin, TX 78705, USA b Section of Integrative Biology and Center for Computational Biology and Bioinformatics, The University of Texas at Austin, 1 University Station, Austin, Texas 78712, USA c Department of Biology, Indiana University Bloomington, IN 47405, USA First Published:August2009 To cite this Article Theriot, Edward C., Cannone, Jamie J., Gutell, Robin R. and Alverson, Andrew J.(2009)'The limits of nuclear- encoded SSU rDNA for resolving the diatom phylogeny',European Journal of Phycology,44:3,277 — 290 To link to this Article: DOI: 10.1080/09670260902749159 URL: http://dx.doi.org/10.1080/09670260902749159 Full terms and conditions of use: http://www.informaworld.com/terms-and-conditions-of-access.pdf This article may be used for research, teaching and private study purposes. Any substantial or systematic reproduction, re-distribution, re-selling, loan or sub-licensing, systematic supply or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.
  • 2. Eur. J. Phycol., (2009), 44(3): 277–290 The limits of nuclear-encoded SSU rDNA for resolving the diatom phylogeny EDWARD C. THERIOT1 , JAMIE J. CANNONE2 , ROBIN R. GUTELL2 AND ANDREW J. ALVERSON3 1 Texas Natural Science Center, The University of Texas at Austin, 2400 Trinity Street, Austin, TX 78705, USA 2 Section of Integrative Biology and Center for Computational Biology and Bioinformatics, The University of Texas at Austin, 1 University Station, Austin, Texas 78712, USA 3 Department of Biology, Indiana University Bloomington, 1001 E Third Street, 142 Jordan Hall, IN 47405, USA (Received 26 September 2008; revised 13 November 2008; accepted 25 November 2008) A recent reclassification of diatoms based on phylogenies recovered using the nuclear-encoded small subunit ribosomal RNA (SSU rRNA) gene contains three major classes, Coscinodiscophyceae, Mediophyceae and the Bacillariophyceae (the CMB hypothesis). We evaluated this with a sequence alignment of 1336 protist and heterokont algae SSU rRNAs, which includes 673 diatoms. Sequences were aligned to maintain structural elements conserved within this dataset. Parsimony analysis rejected the CMB hypothesis, albeit weakly. Morphological data are also incongruent with this recent CMB hypothesis of three diatom clades. We also re-analysed a recently published dataset that purports to support the CMB hypothesis. Our re-analysis found that the original analysis had not converged on the true bipartition posterior probability distribution, and rejected the CMB hypothesis. Thus we conclude that a reclassification of the evolutionary relationships of the diatoms according to the CMB hypothesis is premature. Key words: SSU, diatom phylogeny, diatom classification, Coscinodiscophyceae, Mediophyceae, Bacillariophyceae Introduction Analyses of molecular data (mainly nuclear- encoded small subunit ribosomal DNA; henceforth SSU) have generally reinforced the tra- ditional view (Simonsen, 1979; Round et al., 1990) that centric diatoms broadly grade into pennates through many nodes (Medlin et al., 1993, 1996a, b, 2000; Ehara et al., 2000; Medlin & Kaczmarska, 2004; Sorhannus, 2004, 2007; Alverson et al., 2006; Choi et al., 2008; see Alverson & Theriot, 2005 for review). However, Medlin & Kaczmarska (2004) recently proposed that centric diatoms were composed of only two clades rather than many. They retained the name Coscinodiscophyceae for the so-called ‘radial centrics’ and applied the name Mediophyceae for the so-called ‘bipolar’ or ‘multipolar centrics’. They also suggested a number of morphological characters as diagnostic for these groups. We refer to this as the CMB hypothesis (for the three major clades discovered – Coscinodiscophyceae, Mediophyceae and Bacillariophyceae). The CMB phylogenetic hypothesis has not been universally embraced. For example, Adl et al. (2005) treated both Coscinodiscophyceae and Mediophyceae as paraphyletic taxa without discussion. Williams & Kociolek (2007) challenged the robustness of the CMB phylogeny based on the fact that many different SSU analyses return different trees. In contrast, Sims et al. (2006) recovered the CMB hypothesis with high bipartition posterior probability (BPP) support. Medlin et al. (2008) recovered the CMB hypothesis with high BPP support using a secondary structure alignment but noted that several aspects of the tree were unusual (e.g. the placement of Attheya). In fact, topology of the diatom SSU tree (and support values for incongruent groups) has changed from study to study. For example, the elongate Toxarium has been placed well within the centric grade amidst multipolar diatoms (very distant from the pennate diatoms) using max- imum likelihood (ML) analysis on 38 diatoms (Kooistra et al., 2003), as sister to all pennates in a Bayesian analysis of 51 diatom sequences (Chepurnov et al., 2008), poorly resolved in a maximum parsimony (MP) analysis of 181 diatom sequences (Alverson et al., 2006), and Correspondence to: Edward C. Theriot. E-mail: etheriot@mail. utexas.edu ISSN 0967-0262 print/ISSN 1469-4433 online/09/030277–290 ß 2009 British Phycological Society DOI: 10.1080/09670260902749159 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 3. once again well within the multipolar diatoms in a Bayesian analysis of 54 diatom SSU sequences (Medlin et al., 2008). As underscored by this brief comparison, the many different inferences of diatom phylogeny have utilized different alignment strategies, different optimality criteria, have employed those criteria in different ways and have used different taxa. Any or all of these factors may have lead to the novel results of Medlin & Kaczmarska (2004) and Sims et al. (2006), but this cannot be directly studied because the Medlin & Kaczmarska (2004) and Sims et al. (2006) datasets which produced the CMB hypoth- esis were not publicly available. However, the Medlin et al. (2008) dataset is available and we re-analyse it below. To test the effects of ingroup and outgroup sampling, we created our own large alignment of stramenopile SSU sequences, aligned according to secondary structure (Gutell et al., 1985, 1992, 2002) and used it to test the CMB hypothesis and its robustness. Specifically we address the effect (or lack thereof ) of adding dis- tantly related outgroups on inferences of the diatom SSU tree. Materials and methods Multiple sequence alignment We included all 1549 SSU stramenopile sequences avail- able in Genbank as of September 1, 2007. The SSU rDNA sequences were aligned manually with the alignment editor ‘AE2’ (developed by T. Macke, Scripps Research Institute, San Diego, USA; Larsen et al. 1993), which was developed for Sun Microsystems’ (Santa Clara, USA) workstations running the Solaris operating system. The manual alignment process involves first positionally aligning homologous nucleotides (i.e. those that map to the same locations and tertiary struc- ture models) into columns in the alignment, maximizing their sequence and structure similarity. For regions with high similarity between sequences, the nucleotide sequence is sufficient to align sequences with confidence. For more variable regions in closely related sequences or when aligning more distantly related sequences, how- ever, a high-quality alignment only can be produced when additional information (here, secondary and/or tertiary structure data) is included. The underlying SSU rRNA secondary structure model was initially predicted with covariation analysis (Gutell et al., 1985, 1992). Approximately 98% of the predicted model base pairs were present in the high- resolution crystal structure from the 30S ribosomal subunit (Gutell et al., 2002). This model (based on the bacterium Escherichia coli) has been extended to the eukaryotic SSU rRNA (Cannone et al., 2002), using covariation analysis to assess eukaryote-specific features. The additional constraints of the eukaryotic model were used to refine the alignment of the strame- nopile sequences iteratively until positional homology was established for the entire data matrix. The initial SSU alignment contained 1549 sequences, with a final length of 3786 columns. Medlin & Kaczmarska (2004) filtered out sequences less than 50% complete and we followed this convention, result- ing in a final dataset of 1336 stramenopile sequences of which 673 are diatoms and seven are bolidophytes, which are considered the immediate sister group to dia- toms, according to both SSU and chloroplast-encoded rbcL (Daugbjerg & Andersen, 1997; Goertzen & Theriot, 2003; Andersen, 2004). The remaining taxa are more distantly related stramenopiles. The final align- ment is available at TreeBASE (http://www.treeba- se.org/treebase/intro.html) or from the authors. Forty secondary structure model diagrams representing the major diatom lineages are available at http:// www.rna.ccbb.utexas.edu/SIM/4D/Diatom_nSSU/. We analysed the data in two datasets: diatoms plus bolido- phytes only (DiatBo) and diatoms plus all stramenopiles (DiatStram). Other datasets We obtained the Nexus files used for Figs 2 and 3 in Medlin & Kaczmarska (2004) from Dr Medlin. One dataset had 126 sequences and the other had 281 sequences, and we refer to them as the MK126 and MK281 datasets. Both had the same 123 diatom sequences and differed only in that the former used boli- dophytes only as the outgroup and the latter sampled broadly across eukaryotes for the outgroups. We also used the Nexus file used to produce Fig. 1a of Medlin et al. (2008) from http://www3.interscience.wiley. com.ezproxy.lib.utexas.edu/journal/121395867/suppinfo. That file had 54 sequences, all diatoms but no outgroup, and we refer to that as the M54 dataset. Phylogenetic analysis All datasets were subjected to parsimony analysis using the TNT program (Goloboff et al., 2003). The full suite of TNT options (sectorial search, ratchet, drift and tree fusion) was used. There is no standard recommendation for use of these algorithms and there are few comparative studies of these algorithms. Within the context of the ratchet, Nixon (1999) argued that for large datasets, it may be better to limit length of searches on individual islands of trees and search more islands. The notion is that exploring a greater range of islands containing optimal trees is more likely to cover the entire diversity of optimal trees in a shorter period of time than exhaustively searching one island. Thus, we took the approach used by Goertzen & Theriot (2003) and Alverson et al. (2006) when employing these newer algo- rithms. We increased the number of all cycles, rounds and repetitions for sectorial, drift, fusion and ratchet searches 10-fold beyond default values, and used between 100 and 1000 random taxon additions for each run. We saved the resultant trees from each run separately, and then repeated the procedure with a new randomly selected seed number. After each run, we checked that no shorter tree was found, combined trees from all previous runs and then calculated the E. C. Theriot et al. 278 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 4. number of nodes collapsed in their strict consensus. If no shorter trees were found and if no additional nodes were collapsed, we concluded that we had sampled the complete representative set of MP trees, as additional trees would be redundant and unlikely to further erode the resolution of the strict consensus (Nixon, 1999; Goertzen & Theriot, 2003). We assessed the parsimony penalty required by constraining each of the Coscinodiscophyceae, Mediophyceae and Bacillariophyceae to monophyly under searches as above. We assessed support for the unconstrained MP trees using nonparametric bootstrap (BS) analysis in TNT with the standard sampling with replacement strategy. We used the new technology search with 10 taxon additions and sectorial, ratchet, drift and tree fusings for each of the 1000 pseudorepli- cates of the BS analysis. The DiatBo, MK 281, MK126 and M54 datasets were subjected to Bayesian analyses. All Bayesian analyses were run with the GTR þ G þ I model (nucmo- del ¼ 4by4, nst ¼ 6, rates ¼ invgamma). These were the settings used by Sims et al. (2006) and also corresponded to the best model for each dataset as selected by MrModelTest (Nylander, 2004). All initial runs for all datasets were done at 1 000 000 Markov chain Monte Carlo (MCMC) generations, equal to or greater than the number of generations run by Medlin & Kaczmarska (2004), Sims et al. (2006) and Medlin et al. (2008). Where these papers did not specify other settings for the Bayesian analysis, default settings were used. To test reproducibility of the results, we ran three separate analyses, each with two runs for a total of six independent runs of 1 000 000 MCMC generations each. We also ran one analysis of the DiatBo dataset with two runs (four chains, three heated, one cold) for 10 million generations, saving every 10 000th tree. Finally, we ran the M54 dataset for 50 million generations, saving every 10 000th tree. We assessed whether independent runs in all analyses had sampled the same posterior distribution by comparing independent run (split) posterior probabilities with the compare command in the AWTY program (Wilgenbusch et al., 2004). We followed the burn-in periods of Medlin et al. (2004) and (2008) for their datasets when we ran 1 000 000 generations on M54, MK126 and MK281. We used a burn-in of 90% for our DiatBo dataset 1 000 000 and 10 000 000 genera- tion analyses to approximate Sims et al. (2006). Morphology We coded the characters of symmetry, presence or absence of mucilaginous matrix, auxospore shape/ growth, properizonium and perizonium presence/ absence for 34 taxa using Medlin & Kaczmarska’s criteria (2004: 258, table 2), and treated all multistate characters as unordered. Since no outgroup or ontogenetic information was provided, the only rooting option was to consider the Coscinodiscophyceae as the outgroup to the remaining diatoms and test for monophyly of the Mediophyceae and Bacillariophyceae. However it is possible to determine if the Coscinodiscophyceae formed a convex group (possibly monophyletic depending on the placement of the root within the unrooted network). Winclada run- ning NONA was used for parsimony analysis, with 10 000 replications, holding 100 starting trees per repeti- tion, and all other parameters set to defaults. Results Parsimony analysis For the DiatStram dataset, 11 runs totalling 2898 random taxon addition repetitions were required to converge on the representative set of MP trees (length (L): 39822; consistency index (c.i.): 0.12; retention index (r.i.): 0.84). We found 22 unique MP trees on the first run. Their strict consensus collapsed 441 nodes. Eight more runs produced 54 more MP trees for a total of 76 trees. However, 11 of these were duplicates and there were only 65 unique MP trees. Their strict consen- sus collapsed 450 nodes or only nine more than collapsed in the first single run. That we found redundant trees and the reduced yield in topologi- cal diversity suggest that we have found the true diversity of all MP trees that could be obtained from the DiatStram dataset (Fig. 1). For the DiatBo dataset, three runs of 500 random addition sequences seemed to converge on the representative set of MP trees. The strict consensus of the first 139 trees of L: 14094 (c.i.: 0.19, r.i.: 0.84) collapsed 287 nodes, that of the 338 trees of the first and second runs collapsed 288 nodes, and that of the 554 trees of the all three runs combined col- lapsed 288 nodes. In each of the runs, an MP tree was found within the first 18 random additions indicating that TNT was finding at least one tree of optimal topology very early in the analysis. In addition, 51 of the 554 total trees were identical to trees previously found, indicating that there was some redundancy in the coverage of tree space. Thus, we believe Fig. 2 well represents the strict consensus of all equally MP cladograms that might be found in the DiatBo dataset. Unconstrained searches in both analyses resulted in non-monophyly for the classes Coscinodiscophyceae and Mediophyceae, and monophyly for the class Bacillariophyceae. In both, the Coscinodiscophyceae was positively paraphyletic (i.e. fully resolved as a ladder-like grade with no polytomies) with the Melosirales sister to a non-monophyletic Mediophyceae plus a monophyletic Bacillariophyceae. The Mediophyceae was positively paraphyletic in the DiatStram analysis, with Chaetoceros and a few other taxa forming a clade sister to the pennates. In the DiatBo analyses the Mediophyceae formed an unresolved polytomy. Monophyly of the Coscinodiscophyceae and Mediophyceae (i.e. the CMB hypothesis) required SSU and the diatom phylogeny 279 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 5. Fig. 1. Strict consensus of 65 unique equally most parsimonious trees calculated from the stramenopile-outgroup and diatom- ingroup analysis (DiatStram dataset). Only relationships among diatoms are shown. E. C. Theriot et al. 280 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 6. Fig. 2. Strict consensus of 503 unique equally most parsimonious trees calculated from the diatom plus bolidophyte (DiatBo) dataset. Only relationships among diatoms are shown. SSU and the diatom phylogeny 281 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 7. little penalty for either large dataset: the CMB hypothesis was only seven steps longer than the unconstrained MP trees for the DiatStram dataset and 10 steps longer for the DiatBo dataset. Arrangements of terminal taxa were similar for results for both datasets, and only the tree for the DiatBo dataset is shown (Fig. 3). Given the relatively low penalty incurred for transforming any optimal tree into the CMB hypothesis, it is not surprising that the BS values along the backbone of the tree were generally quite low. The Bacillariophyceae clade and the Mediophyceae plus Bacillariophyceae clade were the only two backbone nodes to receive BS support values of 90% for either dataset. Parsimony analysis of M54, MK126 and MK281 datasets yielded similar results (trees not shown). The MP tree or trees rejected the CMB hypothesis, and bootstrap values along the backbone were typically less than 50%. The CMB constraint trees were not much longer than the MP trees: four steps longer for the M54 dataset (4530 vs. 4526), seven steps longer for the MK126 dataset (5633 vs. 5626), and 23 steps longer for the MK281 dataset (19 302 vs. 19 325). Bayesian analysis Analyses of 1 000 000 generations had clearly not converged on the same posterior distributions among independent runs in analyses of either the DiatBo or M54 datasets. Plots of bipartition posterior probability values between the first pair of runs for each of the two datasets showed many points (i.e. bipartitions) falling directly along the abscissa and ordinate, indicating that some clades found in one analysis (even at BPP values 0.8) were not found in all others (Fig. 4). Furthermore, convergence was not reached with 10 000 000 MCMC generations for the DiatBo dataset or within 20 000 000 MCMC generations for the M54 dataset (Fig. 5). For the DiatBo dataset, topological differences between runs were not minor. In the 10 000 000 generation analysis of DiatBo, Toxarium, Lampriscus, Biddulphiopsis and the Cymatosirales (Toxarium and allies) grouped with Lithodesmiales plus Thalassiosirales (BPP ¼ 0.90) in one run, whereas they (Toxarium and allies) were sister to pennates (BPP ¼ 0.5) in another run. Several species of Pinnularia, a raphid pennate genus in traditional classifica- tions and in our MP analyses, were placed at the base of the diatom tree as sister to Leptocylindrus with BPP values of 0.88 and 0.98 in each of the two runs. While several of the 1 000 000 generation runs recovered a monophyletic Bacillariophyceae, the fact that we did not recover the pennates in any of the 10 000 000 generation analyses clearly indi- cates that even our longest Bayesian runs were far short of convergence. We analysed aspects of performance of the M54 data set to obtain a gross estimate of how difficult it might be to reach convergence in a Bayesian analysis of several hundred diatom sequences. The standard deviation of bipartitions between independent runs for the M54 dataset dropped to near zero at about 22 million generations, and thereafter oscillated at 0.1 until the analysis was terminated at 50 000 000 generations (Fig. 6). While this might suggest that convergence had been reached by 22 million generations, plotting the sampled trees for the last 28 million generations shows clusters of points off a straight line (Fig. 7). Discarding trees from the first 45 million genera- tions resulted in a BPP plot approximating a straight line. The majority rule consensus tree returned a convex Coscinodiscophyceae and monophyletic Bacillariophyceae, but the Mediophyceae were positively paraphyletic (Fig. 8). Attheya septentrionalis was grouped with the pennates at a BPP of 0.95. This is the place- ment of Attheya obtained from MP analysis. In fact, incongruence between the Bayesian and MP trees for dataset M54 is restricted to areas where BPP values are below 0.70 (not shown). As judged by the still rapidly dropping split standard deviations (not shown), Bayesian ana- lyses of the intermediate-sized MK126 and MK281 datasets had not converged on the same posterior probability distribution at 1 000 000 MCMC generations, again underscoring the difficulty of completing a meaningful Bayesian analysis on even 100 diatom SSU sequences in so few MCMC generations. Our analysis of the MK54 dataset indicates that it might take as many as 50–100 million genera- tions or more to reach convergence on datasets with 600 or more diatom SSU sequences. Morphological tree Seven trees of length nine were found. Only the pennates formed a convex group (neither the Coscinodiscophyceae or Mediophyceae were monophyletic, regardless of how the tree was rooted). In the strict consensus, the Thalassiosirales were excluded from the remaining Mediophyceae (Fig. 9) because they share all the characteristics of the Coscinodiscophyceae (radial symmetry, globular/isometric auxospore shape/ growth, no perizonium or properizonium), and have none of the features peculiar to other Mediophyceae or the Bacillariophyceae. E. C. Theriot et al. 282 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 8. Fig. 3. Strict consensus of 147 unique equally most parsimonious trees calculated from the Bolidomonas plus diatom (DiatBo) dataset with Coscinodiscophyceae and Mediophyceae constrained to monophyly. Only relationships among diatoms are shown. SSU and the diatom phylogeny 283 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 9. Discussion Our results weakly reject the hypothesis that the Coscinodiscophyceae, Mediophyceae and Bacillariophyceae are each monophyletic (the CMB hypothesis). Only the Bacillariophyceae (pennate diatoms) were monophyletic, whether we included only closely related outgroups (bolido- phytes only) or distantly related outgroups (boli- dophytes and all other stramenopiles). However, there is little parsimony penalty to constrain trees to the CMB hypothesis for all datasets. Given that greatly different topologies can be obtained from SSU datasets with little penalty, it is not surprising that estimates of the diatom phylogeny based on SSU sequences vary widely between studies using different taxa, alignments, and optimality criteria. For example, the few studies hinting at the possibility of a monophyletic Coscinodiscophyceae and paraphyletic Medio- phyceae or vice versa used relatively few diatom SSU sequences. Very early in the use of SSU data in diatom systematics using 11 diatoms including three Coscinodiscophyceae and one member of the Mediophyceae, Medlin et al. (1993) returned a monophyletic Coscinodisco- phyceae. Using 29 diatom SSU sequences they later (Medlin et al. 1996a, b) returned a monophyletic Coscinodiscophyceae and parap- hyletic Mediophyceae. Kooistra Medlin (1996) analysed that same dataset, experimenting with various approaches to deal with the potential long-branch problem introduced by ‘aberrantly evolving’ diatoms; each approach returned a monophyletic Coscinodiscophyceae and para- phyletic Mediophyceae, although relationships within the mediophytes were dependent upon the method used. Kooistra et al. (2003) used 38 diatom SSU sequences, only two of which were Coscino- discophyceae, both on long branches, returning a monophyletic Coscinodiscophyceae and para- phyletic Mediophyceae. Using 51 diatom SSU sequences, Chepurnov et al. (2008) also returned a monophyletic Coscinodiscophyceae and para- phyletic Mediophyceae. However, they only ran 4 000 000 MCMC generations, so it is unclear if they had reached convergence of topology and posterior probabilities. In contrast, Cavalier-Smith Chao (2006), focusing not on diatoms but on a wide range of Fig. 4. Bipartition partition probability plots of two runs (split runs) from the 1 000 000 Markov Chain Monte Carlo (MCMC) generation Bayesian analysis of our diatom plus bolidophyte dataset (DiatBo: upper plot) and of the Medlin et al. (2008) dataset (M54: lower plot). 90% burn-in used for each. Fig. 5. Bipartition partition probability plots of two runs (split runs) from the 10 000 000 Markov Chain Monte Carlo (MCMC) generation Bayesian analysis of our diatom plus bolidophyte dataset (DiatBo: upper plot) and the 20 000 000 generation Medlin et al. (2008) dataset (M54: lower plot). 90% burn-in used for each. E. C. Theriot et al. 284 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 10. protists including diatoms, used a wide range of outgroups but only 32 diatom SSU sequences in a distance (neighbor-joining) analysis. While they found moderate (70%) BS support for mono- phyly of the Mediophyceae, they also found a paraphyletic Coscinodiscophyceae, with the internode excluding Melosirales from other Coscinodiscophyceae receiving slightly higher sup- port than that found for a monophyletic Mediophyceae (BS ¼ 72%). In perhaps the most extreme case of taxon sampling effects using eleven diatom exemplars when studying relation- ships among alveolates and stramenopiles, Van de Peer et al. (1996) returned monophyly for the centric diatoms as a whole. It is not simply monophyly (or not) of the cen- trics, the Coscinodiscophyceae or Mediophyceae that has proved unstable in different SSU analyses. Three studies, which each included more than 100 diatom sequences, offer the opportunity to compare trees calculated under a single optimality criterion (Bayesian inference). These revealed that taxon sampling differences alone may account for very different tree topologies. Based on 123 diatoms (Medlin Kaczmarska, 2004) the Lithodesmiales grouped with the Thalassiosirales (BPP ¼ 1.0), with 181 diatom SSU sequences (Alverson et al., 2006) they grouped with the Hemiaulales to the exclusion of the Thalassiosirales (BPP ¼ 1.0) and with an unknown number of diatom sequences (Sims et al., 2006) the Lithodesmiales grouped with the Biddulphiales, Triceratiales and Toxarium to the exclusion of the Thalassiosirales (BPP ¼ 1.0). The unpublished dataset for Fig. 2 of Sims et al. (2006) has been characterized as including more than 800 ingroup sequences (Medlin et al., 2008). It should be noted that alignment methods have varied greatly among the many studies using SSU sequences and these could be a possible source of variation that has yet to be fully explored. This variation has the poten- tial to change diatom SSU tree topology radically (Medlin et al., 2008). Among the many trees generated using SSU, the most radical and controversial trees Fig. 7. Bipartition probability plot of two runs from the M54 dataset of Medlin et al. (2008). The upper plot dis- carded the first 22 million Markov Chain Monte Carlo (MCMC) generations (or 44% burn-in, based on initial minimum at ca. 22 million MCMC generations from Fig. 6). The lower plot discarded the first 45 million MCMC generations (or 90% burn-in). Fig. 6. Standard deviation of likelihood scores among inde- pendent runs (split runs) vs number of generations for Bayesian analysis of our diatom plus bolidophyte (DiatBo) dataset (upper plot) and of the Medlin et al. (2008) dataset (M54: lower plot). The line in the upper plot represents a power function estimate of split standard deviations out to 50 000 000 Markov Chain Monte Carlo (MCMC) generations. SSU and the diatom phylogeny 285 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 11. Fig. 8. Majority rule consensus tree (calculated without an outgroup and arbitrarily rooted in the middle of the Coscinodiscophyceae) derived from 50 000 000 Markov Chain Monte Carlo generation Bayesian analysis of the M54 dataset from Medlin et al. (2008) with 90% burn-in. Numbers or symbols below nodes are bipartition posterior probability (BPP) values. Asterisks indicate BPP values 95%. E. C. Theriot et al. 286 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 12. Fig. 9. Unrooted tree of diatom genera as determined by a parsimony analysis of the morphology matrix of Table 2 in Medlin (2004). Strict consensus of eight trees. SSU and the diatom phylogeny 287 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 13. (Williams Kociolek, 2007) are those that support the CMB hypothesis: the MP tree of 8600þ SSU sequences (123 diatoms) by Medlin Kaczmarska (2004), the Bayesian tree (800þ diatom sequences with bolidophyte outgroups) by Sims et al. (2006), and the Bayesian tree (54 diatom SSU sequences with no outgroup) by Medlin et al. (2008). Medlin Kaczmarska (2004) claimed that their MP tree (8600þ sequences, but only 123 diatoms) was more accurate than their Bayesian tree (same 123 diatoms but only three bolidophyte SSU outgroup sequences) because including distantly related outgroups increased the number of parsi- mony informative characters. However, while increased taxon sampling within the scope of the problem (i.e. within diatoms) may increase accuracy, increased taxon sampling outside the scope of the problem (adding distant outgroups) will likely decrease accuracy of phylogenetic infer- ence (Hillis, 1998; Pollock et al., 2002; Hillis et al., 2003; Hedtke et al., 2006; Verbruggen Theriot, 2008). Medlin Kaczmarska (2004) cited Bollback (2002) as support for their position, but that paper studied the effects of adding characters, not taxa (ingroup or outgroup), and only in the context of model-based methods, specifically accuracy of model selection for phylogenetic analysis, and is therefore not pertinent. Thus, contrary to the claims of Medlin Kaczmarska (2004), one could hypothesize that recovery of the CMB tree under parsimony is an artefact of increased error caused by addition of distantly related outgroup taxa. In the light of the literature on taxon sampling, a more substantive claim was made by Sims et al. (2006), who suggested that increased ingroup sampling led to recovery of the CMB hypothesis, this time with high BPP support values. However, we suggest that recovery of the CMB hypothesis in Medlin Kaczmarska (2004), Sims et al. (2006) and Medlin et al. (2008) is probably a result of insufficient tree search effort. Medlin Kaczmarska (2004) used the MP search in ARB, whose most effective heuristic search algorithm employs a combination of Nearest Neighbor Interchange and Kernighan-Lin optimization, which together are less effective than the commonly used Tree–Bisection–Reconnection algorithm, and certainly not as effective as other methods, such as the parsimony ratchet (Nixon, 1999). Given the large number of near-optimal trees that support the CMB hypothesis in our dataset, it is likely that a suboptimal search might find any one of these suboptimal trees. Similarly, the Bayesian inference (Sims et al. (2006)) was probably also confounded by insufficient search of tree space. They only ran 1 000 000 MCMC generations. Our analysis of our DiatBo dataset (673 diatoms plus seven boli- dophytes) had not reached convergence at 10 000 000 generations. Our analysis of the M54 dataset, presumably the same alignment but with far fewer taxa than used by Sims et al. (2006), seems to have required at least 45 million genera- tions for the burn-in alone. The tree in Fig. 1a (Medlin et al. 2008) supporting the CMB hypoth- esis, is clearly an artefact of running far too few MCMC generations, and even if the tree topology is correct, monophyly of the Coscinodiscophyceae is an artefact of arbitrary rooting in the absence of an outgroup. Thus, our results strongly suggest that the choice of optimality criterion has less influence on trees derived from SSU data than does the proper appli- cation of that choice. All methods, alignments and taxon sampling schemes we reviewed or re-analysed returned weak rejection of the CMB hypothesis. Both Medlin Kaczmarska (2004) and Sims et al. (2006) argued that morphological data were congruent with their SSU trees. However, the char- acters discussed are either irrelevant to testing the CMB hypothesis, or ambiguous about it (e.g. sper- matozoid structure [both the Coscinodiscophyceae and Mediophyceae have merogenous and hologen- ous spermatozoids]; pyrenoid structure [one type is apparently symplesiomorphically shared by the Coscinodiscophyceae and Mediophyceae, while pyrenoid structure in the Thalassiosirales is auta- pomorphic for the order]). Using the Medlin Kaczmarska (2004) morphological character matrix, our tree excluded the Thalassiosirales from the Mediophyceae on the basis of auxospore characteristics. Nevertheless it was claimed that the particular pattern of auxospore formation under discussion was retained in the Thalassiosirales (Medlin Kaczmarska, 2004: 267). To make this argument under parsimony, the Thalassiosirales would have to be the sister group to all remaining Mediophyceae, a relationship not recovered in either Medlin Kaczmarska (2004) or Sims et al. (2006). Complicated scenarios are invoked ad hoc to explain the distribution of the four different Golgi body arrangements. Of the two widely dis- tributed arrangements, the so-called Type 1 (sensu Medlin Kaczmarska, 2004) arrangement was attributed to most of the Coscinodiscophyceae, and the Type 2 arrangement was attributed to the Aulacoseirales (Coscinodiscophyceae), Mediophyceae, and Bacillariophyceae. If Type 1 is apomorphic and Type 2 is not, then there is no evidence from the Golgi arrangement that the Aulacoseirales belong to the Coscinodiscophyceae. If Type 2 is apomorphic, regardless of the interpretation of Type 1, the Golgi character is congruent with our SSU trees E. C. Theriot et al. 288 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 14. and rejects the CMB hypothesis by placing the Aulacoseirales with the Mediophyceae and Bacillariophyceae. Nevertheless, Medlin Kaczmarksa (2004) argued away this incongru- ence, explaining the distribution of Golgi body types in terms of ancestral polymorphisms, impli- citly invoking unobserved character conditions in unobserved ancestral species for as far back as the common ancestor to red algae and diatoms (Medlin Kaczmarska, 2004: 265): ‘However, G-ER-M units are known from the oomycetes and the red algae, whereas an association of the Golgi around the nucleus is also known in the Labyrinthuloides. Thus, it would appear that both features are present in ancestors of the dia- toms and the potential host cells of their plastids. It can be argued that the two traits then segregated themselves in the two separate lineages as they evolved.’ Conclusion Medlin Kaczmarska (2004) and Sims et al. (2006) proposed monophyly of each of the Coscinodiscophyceae, Mediophyceae, and Bacillariophyceae. Since the unavailability of the datasets (the only ones to support the CMB hypothesis apart from that of Medlin et al. (2008)) precluded direct reproduction of their results, we assembled datasets of similar size and characteristics. Our results suggest that the CMB hypothesis is rejected by SSU data, albeit very weakly. Similarly, our re-analysis of morphological evidence proposed by Medlin Kaczmarska (2004) also weakly rejects the CMB hypothesis. Medlin Kaczmarska (2004) very likely recovered a suboptimal MP tree for their 8600þ sequence dataset and Sims et al. (2006) very likely failed to converge on the true posterior distribution of trees in their Bayesian analysis. Conversely, if Medlin Kaczmarska (2004) did recover the MP tree or trees and if the Sims et al. (2006) analysis did reach convergence for their dataset, then our results demonstrate that the likelihood of their having done so is highly dependent on taxon sampling and/or sequence alignment. We have demonstrated that the Medlin et al. (2008) tree supporting the CMB hypothesis is an artefact and that it must be concluded that the CMB hypothesis is far from robust, regardless of how one interprets the variation between studies. In summary, pursuit of a well-supported phylo- geny of diatoms seems to be limited as much by the number of characters per taxon as by the number of taxa for which data exist. There is a small but growing rbcL dataset which rejects the CMB hypothesis (Choi et al., 2008). Very limited coxI data supports the CMB hypothesis, but analyses so far only include four species (Ehara et al., 2000). While nSSU data are a useful addition to the difficult problem of inferring diatom phylo- geny, continued use of SSU alone, as Patterson (1994: 185) wrote in a similar context, might simply be an ineffective attempt to ‘. . . wring truth from recalcitrant data.’ Acknowledgements ECT was supported by NSF EF 0629410 and the Jane and Roland Blumberg Centennial Professorship in Molecular Evolution. AJA was sup- ported by an NIH Ruth L. Kirschstein NRSA Postdoctoral Fellowship (1F32GM080079-01A1). Both also acknowledge the Tony Institute. RRG and JJC were supported by NIH GM067317. References ADL, S.M., SIMPSON, A.G.B., FARMER, M.A., ANDERSEN, R.A., ANDERSON, O.R., BARTA, J.R., BOWSER, S.S., BRUGEROLLE, G.U.Y., FENSOME, R.A., FREDERICQ, S., et al. (2005). A new higher level classification of eukaryotes with emphasis on the taxonomy of protists. J. Eukaryotic Microbiol., 52: 399–451. ALVERSON, A.J., CANNONE, J.J., GUTELL, R.R. THERIOT, E.C. (2006). The evolution of elongate shape in diatoms. J. Phycol., 42: 655–668. ALVERSON, A.J. THERIOT, E.C. (2005). Comments on recent pro- gress toward reconstructing the diatom phylogeny. J. Nanosci. Nanotechnol., 5: 57–62. ANDERSEN, R.A. (2004). Biology and systematics of heterokont and haptophyte algae. Am. J. Bot., 91: 1508–1522. BOLLBACK, J.P. (2002). Bayesian model adequacy and choice in phylogenetics. Mol. Biol. Evol., 19: 1171–1180. CANNONE, J.J., SUBRAMANIAN, S., SCHNARE, M.N., COLLETT, J.R., D’SOUZA, L.M., DU, Y., FENG, B., LIN, N., MADABUSI, L.V., MULLER, K.M., et al. (2002). The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinformatics, 3: 2. CAVALIER-SMITH, T. CHAO, E. (2006). Phylogeny and megasyste- matics of phagotrophic heterokonts (Kingdom Chromista). J. Mol. Evol., 62: 388–420. CHEPURNOV, V.A., MANN, D.G., VON DASSOW, P., VANORMELINGEN, P., GILLARD, J., INZE´ , D., SABBE, K. VYVERMAN, W. (2008). In search of new tractable diatoms for experimental biology. BioEssays, 30: 692–702. CHOI, H.-G., JOO, H.M., JUNG, W., HONG, S.S., KANG, J.-S. KANG, S.-H. (2008). Morphology and phylogenetic relationships of some psychrophilic polar diatoms (Bacillariophyta). Nov. Hedwig. Beih., 133: 7–30. DAUGBJERG, N. ANDERSEN, R.A. (1997). A molecular phylogeny of the heterokont algae based on analyses of chloroplast-encoded rbcL sequence data. J. Phycol., 33: 1031–1041. EHARA, M., INAGAKI, Y., WATANABE, K.I. OHAMA, T. (2000). Phylogenetic analysis of diatom coxI genes and implications of a fluctuating GC content on mitochondrial genetic code evo- lution. Curr. Genet., 37: 29–33. GOERTZEN, L.R. THERIOT, E.C. (2003). Effect of taxon sampling, character weighting, and combined data on the interpretation of relationships among the heterokont algae. J. Phycol., 39: 423–439. GOLOBOFF, P., FARRIS, J., NIXON, K. (2003). T.N.T.: Tree analysis using new technology. Available at: www.cladistics.com. SSU and the diatom phylogeny 289 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009
  • 15. GUTELL, R.R., LEE, J.C. CANNONE, J.J. (2002). The accuracy of ribosomal RNA comparative structure models. Curr. Opin. Structural Biol., 12: 301–310. GUTELL, R.R., POWER, A., HERTZ, G.Z., PUTZ, E.J. STORMO, G.D. (1992). Identifying constraints on the higher- order structure of RNA: continued development and application of comparative sequence analysis methods. Nucl. Acids Res., 20: 5785–5795. GUTELL, R.R., WEISER, B., WOESE, C.R. NOLLER, H.F. (1985). Comparative anatomy of 16S-like ribosomal RNA. Prog. Nucl. Acid Res. Mol. Biol., 32: 155–216. HEDTKE, S., TOWNSEND, T. HILLIS, D. (2006). Resolution of phy- logenetic conflict in large data sets by increased taxon sampling. System. Biol., 55: 522–529. HILLIS, D.M. (1998). Taxonomic sampling, phylogenetic accuracy and investigator bias. System. Biol., 47: 3–8. HILLIS, D.M., POLLOCK, D.D., MCGUIRE, J.A. ZWICKL, D.J. (2003). Is sparse taxon sampling a problem for phylogenetic inference? System. Biol., 52: 124–126. KOOISTRA, W., DE STEFANO, M., MANN, D.G., SALMA, N. MEDLIN, L.K. (2003). Phylogenetic position of Toxarium, a pennate-like lineage within centric diatoms (Bacillariophyceae). J. Phycol., 39: 185–197. KOOISTRA, W.H.C.F. MEDLIN, L.K. (1996). Evolution of the diatoms (Bacillariophyta): IV. A reconstruction of their age from small subunit rRNA coding regions and the fossil record. Mol. Phylogenet. Evol., 6: 391–407. LARSEN, N., OLSEN, G.J., MAIDAK, B.L., MCCAUGHEY, M.J., OVERBEEK, R., MACKE, T.J., MARSH, T.L. WOESE, C.R. (1993). The Ribosomal Database Project. Nucl. Acids Res., 21: 3021–3023. MEDLIN, L.K. KACZMARSKA, I. (2004). Evolution of the diatoms V: Morphological and cytological support for the major clades and a taxonomic revision. Phycologia, 43: 245–270. MEDLIN, L.K., KOOISTRA, W.H.C.F., GERSONDE, R. WELLBROCK, U. (1996a). Evolution of the diatoms (Bacillariophyta): II. Nuclear-encoded small-subunit rRNA sequence comparisons confirm a paraphyletic origin for the cen- tric diatoms. Mol. Biol. Evol., 13: 67–75. MEDLIN, L.K., KOOISTRA, W.H.C.F., GERSONDE, R. WELLBROCK, U. (1996b). Evolution of the diatoms (Bacillariophyta): III. Molecular evidence for the origin of the Thalassiosirales. Nov. Hedwig. Beih., 112: 221–234. MEDLIN, L.K., KOOISTRA, W.H.C.F. SCHMID, A.-M.M. (2000). A review of the evolution of the diatoms–a total approach using molecules, morphology and geology. In The Origin and Early Evolution of the Diatoms: Fossil, Molecular and Biogeographical Approaches (Witkowski, A. and Sieminska, J., editors), 13–36. Szafer Institute of Botany, Polish Academy of Sciences, Krako´ w, Poland. MEDLIN, L.K., SATO, S., MANN, D.G. KOOISTRA, W.H.C.F. (2008). Molecular evidence confirms sister relationship of Ardissonea, Climacosphenia and Toxarium within the bipolar cen- tric diatoms (Bacillariophyta, Mediophyceae), and cladistic ana- lyses confirm that extremely elongate shape has arisen twice in the diatoms. J. Phycol., 45: 1340–1348. MEDLIN, L.K., WILLIAMS, D.M. SIMS, P.A. (1993). The evolution of the diatoms (Bacillariophyta). I. Origin of the group and assessment of the monophyly of its major divisions. Eur. J. Phycol., 28: 261–275. NIXON, K. (1999). The parsimony ratchet, a new method for rapid parsimony analysis. Cladistics, 15: 407–414. NYLANDER, J. A. A. (2004). MrModeltest v2. Program distributed by the author. Evolutionary Biology Centre, Uppsala University. PATTERSON, C. (1994). Null or minimal models. In Models in Phylogeny Reconstruction (Scotland, R., Siebert, D.J., and Williams, D.M, editors), 173–192. Oxford University Press, Oxford, UK. POLLOCK, D.D., ZWICKL, D.J., MCGUIRE, J.A. HILLIS, D.M. (2002). Increased taxon sampling is advantageous for phyloge- netic inference. System. Biol., 51: 664–671. ROUND, F.E., CRAWFORD, R.M. MANN, D.G. (1990).The Diatoms: Biology Morphology of the Genera. Cambridge, UK: Cambridge University Press. SIMONSEN, R. (1979). The diatom system: Ideas on phylogeny. Bacillaria, 2: 9–71. SIMS, P.A., MANN, D.G. MEDLIN, L.K. (2006). Evolution of the diatoms: Insights from fossil, biological and molecular data. Phycologia, 45: 361–402. SORHANNUS, U. (2004). Diatom phylogenetics inferred based on direct optimization of nuclear-encoded SSU rRNA sequences. Cladistics, 20: 487–497. SORHANNUS, U. (2007). A nuclear-encoded small-subunit ribosomal RNA timescale for diatom evolution. Mar. Micropaleontol., 65: 1–12. VAN DE PEER, Y., VAN DER AUWERA, G. DE WACHTER, R. (1996). The evolution of stramenopiles and alveolates as derived by ‘‘substitution rate calibration’’ of small ribosomal subunit RNA. J. Mol. Evol., 42: 201–210. VERBRUGGEN, H. THERIOT, E.C. (2008). Building trees of algae: Some advances in phylogenetic and evolutionary analysis. Eur. J. Phycol., 43: 229–252. WILGENBUSCH, J.C., WARREN, D.L., SWOFFORD, D.L. (2004). AWTY: A system for graphical exploration of MCMC conver- gence in Bayesian phylogenetic inference. Available at: http:// ceb.csit.fsu.edu/awty. WILLIAMS, D.M. KOCIOLEK, J.P. (2007). Pursuit of a natural classification of diatoms: History, monophyly and the rejection of paraphyletic taxa. Eur. J. Phycol., 42: 313–319. E. C. Theriot et al. 290 DownloadedBy:[UniversityofTexasatAustin]At:15:4413August2009