1) The document describes research that used whole genome sequence data from 42 fungal species to construct phylogenetic trees of fungal relationships using both supertree and concatenated gene analysis methods.
2) The supertree and concatenated tree methods produced highly congruent results, with both supporting traditional classifications of fungal phyla, sub-phyla, and classes.
3) Within the Ascomycota, the trees resolved the relationships between the classes Leotiomycetes and Sordariomycetes, and identified two clades within the CTG clade of the Saccharomycotina that may correlate with sexual status.
2. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
Background sets used. Phylogenetic inference from sequence data can
Traditional methods of systematics based on morphology also be misled by systematic errors (e.g. compositional
of vegetative cells, sexual states, physiological responses to biases) [14]. These errors can be exacerbated when longer
fermentation and growth tests can assign fungal species to sequences are used, leading to strong support for infer-
particular genera and families. However, higher-level rela- ences that in reality may be false.
tionships amongst these groups are less certain and are
best elucidated using molecular techniques. Today single- A supertree analysis on the other hand generates a phylog-
gene phylogenies (especially 18S ribosomal DNA-based eny from a set of input trees that possess fully or partially
ones) have established many of the accepted relationships overlapping sets of taxa. Because the input trees need only
between fungal organisms. The benefits of the 18S rDNA overlap minimally, each source tree must share at least
approach are the vertical transmission of this gene, its two taxa with one other source tree; more generally, super-
ubiquity and the fact that it has slowly evolving sites. tree methods take as input a set of phylogenetic trees and
However, single-gene analyses are dependent on the gene return one or more phylogenetic trees that represent the
having an evolutionary history that reflects that of the input trees [15]. Because of the way supertrees summarise
entire organism, an assumption that is not always true. It taxonomic congruence, they limit the impact of individ-
has been estimated that there are approximately 1.42 mil- ual genes on the global topology and account for exten-
lion fungi species yet to be discovered [1,2]. It follows that sive differences in evolutionary rates and substitution
it is essential that we develop methods to infer a robust patterns among genes in a gene-by-gene manner [16].
phylogeny of known taxonomic groups, so we can pro- Therefore, we can get a phylogeny that is truly representa-
vide a framework for future studies. tive of the entire genome. Supertree techniques are slowly
becoming commonplace in biology [17-22] and will play
Between 1990 and 2003, 560 fungal research papers an important role in ascertaining the tree of life.
reporting phylogenies were published, of which about
84% were derived using rDNA [3]. Protein coding genes This study undertook a phylogenomic approach [23,24]
are rarely used in fungal phylogenetics but when used they to fungal taxonomy. Using both supertree and concate-
have the ability to resolve deep level phylogenetic rela- nated methods, all available fungal genomic data was
tionships [4]. Phylogeny reconstruction based on a single analysed in an effort to address some long-standing ques-
gene may not be robust, as vital physiological processes tions regarding ancestry and sister group relationships
and basic adaptive strategies do not always correlate with amongst diverse fungal species.
ribosomal derived trees [5]. Individual genes also contain
a limited number of nucleotide sites and therefore limited Results and discussion
resolution. An alternative approach to a single gene phyl- Genome data infers a robust fungal phylogeny
ogeny is to combine all available phylogenetic data. There Our dataset consisted of 345,829 protein-coding genes
are two commonly used methods to do this: multigene from 42 fungal genomes (Table 1). Overall we identified
concatenation and supertree analysis. 4,805 putative orthologous gene families (see methods).
Maximum likelihood (ML) phylogenies were recon-
Multigene concatenation proposes that phylogenetic structed for individual gene families. These 4,805 trees
analysis should always be performed using all available were used as input data for our supertree analysis, con-
character data, essentially sticking many aligned genes structed using three different methods: matrix representa-
together to give a large alignment. Combining the data tion with parsimony (MRP) [25,26], the average
increases its informativeness, helps resolve nodes, basal consensus method (AV) [27], and the most similar super-
branching and improve phylogenetic accuracy [6]. Gene tree analysis (MSSA) method [21]. All three methods
concatenation has been justified on philosophical inferred congruent phylogenies, all supertree results dis-
grounds, as it attempts to maximise the informativeness cussed here are based on the MRP and AV phylogenies
and explanatory power of the character data used in the (Figure 1A&B). The results for the MSSA supertrees can be
analysis [7]. Numerous genome phylogenies have been found in additional material [see additional file 1]. The
derived by concatenation of universally distributed genes YAPTP (yet another permutation tail probability rand-
[8-13]. One advantage of concatenated phylogenies is that omization) test [21], which tests the null hypothesis that
observed branch lengths are comparable across the entire congruence between the input trees is no better than ran-
tree, as they are derived from common proteins. This dom, was used to assess the degree of congruence between
allows an objective, quantitative analysis of the consist- input trees. The distribution of the scores of the 100 opti-
ency of traditional groupings [8]. However, gene concate- mal supertrees from the YAPTP test is between 84,184 –
nation also has some well-documented problems. For 84,464, whereas the original unpermuted data received a
example, erroneous phylogenetic inferences can be made score of 27,686. These scores suggested that congruence
if recombination has occurred within the individual data- across the input trees is greater than expected by chance (P
Page 2 of 15
(page number not for citation purposes)
3. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
> 0.01) [21,22] and we deemed the data suitable for relationship between these classes has been the subject of
supertree analyses. debate. Our supertrees and concatenated phylogenies
infer that the Leotiomycetes and Sordariomycetes are sis-
Presently there is a heated philosophical debate as to what ter classes. This agrees with the poorly supported rDNA
is the best approach for reconstructing genome phyloge- based analysis of Lumbsch et al [30] but is in disagree-
nies. Instead of using supertree methods, some prefer to ment with Lutzoni et al [3], who based on a four gene
concatenate universally distributed genes. In an attempt combined dataset placed the Dothideomycetes as a sister
to circumvent this argument we decided to use a global group to the Sordariomycetes. The grouping of Leotio-
congruence [28] approach, where both ideologies are mycetes and Sordariomycetes in both our phylogenies is
used and the resulting phylogenies are cross-corrobo- highly supported (100% BP) and is likely to represent the
rated. true relationship. Furthermore, a recent phylogenomic
study of 17 Ascomycota genomes by Robbertse et al [12]
From our analysis, we initially located 227 protein fami- reported similar inferences.
lies that were universally distributed between all taxa.
Seven of the genomes present in this analysis have under- There is conflict however between our supertrees and con-
gone a genome duplication. In an effort to minimize the catenated phylogenies regarding the positioning of Stago-
effects of hidden paralogy, we only considered genes that nospora nodorum (the only representative of the
were found in conserved syntenic blocks for selected Dothideomycetes lineage). The supertrees (Figure 1) place
organisms (see methods). Overall 153 of the 227 gene S. nodorum beside the Eurotiomycetes (100% BP), and
families met these criteria, and were used for further anal- supports the analysis of Lutzoni et al [3] who also group
ysis [see additional file 2]. These gene families were indi- the Dothideomycetes and Eurotiomycetes lineages
vidually aligned and concatenated together to give an together. Conversely, our concatenated alignment (Figure
alignment of exactly 38,000 amino acids in length. A ML 2) infers that S. nodorum is more closely related to the Sor-
phylogeny was reconstructed (Figure 2) and compared to dariomycetes and Leotiomycetes lineages (100% BP).
the supertree derived from 4,805 gene families (Figure 1). Based on their concatenated alignment Robbertse et al
In the following discussion we use the phylum, sub-phy- [12] have also reported conflicting inferences regarding
lum and class taxonomic scheme of the NCBI taxonomy the phylogenetic position of S. nodorum [12]. Their phyl-
browser [29]. ogenies reconstructed using neighbor joining and maxi-
mum likelihood methods inferred a sister group
Overall, there is a high degree of congruence between relationship between S. nodorum and Eurotiomycetes in
supertree and concatenated alignment phylogenies (Fig- line with our supertree inference. However a phylogeny
ures 1 &2). Unsurprisingly all phylogenies inferred 3 inferred using maximum parsimony placed S. nodorum at
strongly supported phyla branches, the Zygomycota, the the base of the Pezizomycotina [12]. To confidently
Basidiomycota and the Ascomycota (Figures 1 &2). resolve this incongruence additional Dothideomycetes
genomes will be required.
The Basidiomycota form a well-supported clade. The three
members of the Hymenomycetes class form a robust sub- Within the Eurotiomycetes class there is a clade corre-
group with 100% bootstrap support (BP). Within the sponding to the order Onygenales {Histoplasma capsula-
Hymenomycetes there is a clade containing the two mem- tum, Coccidioides immitis and Uncinocarpus reesii}. The
bers {Coprinus cinereus and Phanerochaete chrysosporium} Onygenales clade is of interest as it contains Coccidioides
of the order Agaricales, separate from Cryptococcus neofor- immitis. This organism was initially classified as a protist
mans, which belongs to the order Tremellales. [31] but further research showed it was fungal, and sepa-
rate studies placed it in three different divisions of Eumy-
The majority of the species studied in this analysis belong cota [32-34]. Subsequent ribosomal phylogeny studies
to the Ascomycota phylum. Within the Ascomycota there [35,36] suggested a close phylogenetic relationship
are two main subphyla, the Pezizomycotina and Saccha- between C. immitis and U. reesii to the exclusion of H. cap-
romycotina. Both these groups form separate well-sup- sulatum. Our supertrees and concatenated phylogenies
ported sub-phyla clades (Figures 1 &2). based on whole genome data concur with the placement
Schizosaccharomyces pombe, the only member of the of C. immitis and U. reesii as sister taxa.
Schizosaccharomycetes, sits outside these two sub-phyla
clades. The Eurotiomycetes branch containing the Aspergillus
clade is also of interest, as supertree and concatenated
Within the Pezizomycotina a number of well-defined phylogenies infer that A. oryzae and A. terreus are each oth-
class-clades are observed, namely the Sordariomycetes, ers closest relatives (Figures 1 &2) (100% BP respectively).
the Leotiomycetes and Eurotiomycetes (Figures 1 &2). The A minor difference between the supertrees and concate-
Page 3 of 15
(page number not for citation purposes)
5. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
(A) Zygomycota (B) Zygomycota
Rhizopus oryzae
Rhizopus oryzae Basidiomycota
Basidiomycota Ustilago maydis Ustilago maydis
Cryptococcus neoformans 99 100 Cryptococcus neoformans
100 Phanerochaete chrysosporium
Hymenomycetes 100 Coprinus cinereus
100 Phanerochaete chrysosporium Hymenomycetes 100 Coprinus cinereus
Schizosaccharomyces pombe Schizosaccharomyces pombe
Eurotiomycetes Stagonospora nodorum
Stagonospora nodorum Histoplasma capsulatum
Histoplasma capsulatum 78 100
100 Uncinocarpus reesii
100 100 Uncinocarpus reesii 100 100 Coccidioides immitis
100 Coccidioides immitis
Pezizomycotina Aspergillus nidulans
Aspergillus nidulans Aspergillus fumigatus
Pezizomycotina 100
100 Eurotiomycetes 100 Aspergillus fumigatus Aspergillus oryzae
60
100 Aspergillus oryzae 60 Aspergillus terreus
100
100 Aspergillus terreus 100 Magnaporthe grisea
Magnaporthe grisea 71
Ascomycota Sordariomycetes Sordariomycetes Neurospora crassa
100
Neurospora crassa Podospora anserina
98
100 Podospora anserina 100 99 Chaetomium globosum
100 100 Chaetomium globosum Trichoderma reesei
Trichoderma reesei 93 Fusarium verticillioides
Fusarium verticillioides 100
100 100 100 Fusarium graminearum
Fusarium graminearum 100 100 Botrytis cinerea
100 Botrytis cinerea
100 Leotiomycetes Sclerotinia sclerotiorum
Leotiomycetes Sclerotinia sclerotiorum Ascomycota
Yarrowia lipolytica Yarrowia lipolytica
Candida lusitaniae 100 Candida lusitaniae
100 Candida guilliermondii
Candida guilliermondii
CTG 98 Debaryomyces hansenii
CTG 90 Debaryomyces hansenii 100 Candida parapsilosis
100 100 Candida parapsilosis Candida tropicalis
Candida tropicalis 76 100
Saccharomycotina 100
Candida dubliniensis Saccharomycotina 100 Candida dubliniensis
100 100 Candida albicans
100 Candida albicans
Candida glabrata
Candida glabrata WGD Saccharomyces castellii
WGD Saccharomyces castellii
86
59 Saccharomyces bayanus
100 Saccharomyces kudriavzevii
100 Saccharomyces bayanus 100
100 100
Saccharomyces kudriavzevii 95 Saccharomyces mikatae
100 100
Saccharomyces mikatae 51 Saccharomyces cerevisiae
100
100 100 Saccharomyces paradoxus 0.1
77 Saccharomyces paradoxus
100 Saccharomyces cerevisiae Kluyveromyces waltii
100
100 Kluyveromyces waltii 100 Saccharomyces kluyveri
Saccharomyces kluyveri Ashbya gossypii
96 Ashbya gossypii 87 Kluyveromyces lactis
98 Kluyveromyces lactis
Figure 1
MRP (A) and AV (B) fungal supertrees derived from 4,805 fungal gene families
MRP (A) and AV (B) fungal supertrees derived from 4,805 fungal gene families. Bootstrap scores for all nodes are
displayed. The AV supertree method makes use of input tree branch lengths. Rhizopus oryzae has been selected as an outgroup.
The Basidiomycota and Ascomycota phyla form distinct clades. Subphyla and class clades are highlighted. Two clades of special
interest include the node that contains the organisms that translate CTG as serine instead of leucine, and the node that con-
tains the genomes that have undergone a genome duplication (WGD). Topological differences between supertree phylogenies
are highlighted in red font.
nated phylogenies regards the phylogenetic position of A. the grouping of {K. waltii, S. kluyveri} and {K. lactis, A. gos-
nidulans and A. fumigatus. The concatenated alignment sypii} as sister branches, while the AV supertree infers that
infers that these organisms are sister taxa (100% BP), the {K. waltii, S. kluyveri} are closer to the WGD clade than to
supertrees fails to make this inference and instead posi- the {K. lactis, A. gossypii} clade. Recently Jeffroy et al [38]
tions A. fumigatus beside the {A. oryzae, A. terreus} clade constructed a multigene phylogeny (using 13 of the 42
with 100% BP. species included in our analysis) that is congruent with
our MRP supertree for these species. They state that K. lac-
A number of subclass clades are evident in the Sordario- tis and A. gossypii are evolving faster than S. kluyveri and K.
mycetes clade. For example Fusarium graminearum, Fusar- waltii and are therefore likely to be "attracted" to long
ium verticilliodes and Trichoderma reesei belong to the branches. The AV method makes use of branch length
subclass Hypocreomycetidae. Similarily Neurospora crassa, information from individual gene trees, and we suspect
Chaetomium globosum and Podospora anserina all belong to the inferred AV supertree phylogeny amongst the {K. lac-
the subclass Sordariomycetidae. The inferred phyloge- tis, A. gossypii} and {S. kluyveri, K. waltii} clades may be
netic relationships amongst the Sordariomycetidae organ- suffering from long-branch attraction artifacts. As addi-
isms concurs with previous phylogenetic studies [37]. tional taxa can help break long branches, it is likely that
stochastic errors will be eradicated with the addition of
Relationships within the Saccharomycotina lineage extra genome data when it becomes available, thus elimi-
Overall the MRP and AV supertree topologies (Figure nating erroneous inferences.
1A&B) are very similar. A noticeable difference occurs in
the branch directly adjacent to the WGD clade. The MRP The sister group relationships amongst the Saccharomyces
tree (and the concatenated phylogeny (Figure 2)) places sensu stricto species also differs between our supertree phy-
Page 5 of 15
(page number not for citation purposes)
6. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
Zygomycota
Rhizopus oryzae
Basidiomycota Ustilago maydis
100 Cryptococcus neoformans
100 Phanerochaete chrysosporium
Hymenomycetes 100 Coprinus cinereus
Schizosaccharomyces pombe
100
Histoplasma capsulatum
Eurotiomycetes
Uncinocarpus reesii
100 Coccidioides immitis
100 100 Aspergillus nidulans
Aspergillus fumigatus
100 Aspergillus oryzae
Pezizomycotina 100 Aspergillus terreus
100 100 Stagonospora nodorum
Magnaporthe grisea
100
Neurospora crassa
100 Podospora anserina
100
Sordariomycetes 100
100 Chaetomium globosum
Trichoderma reesei
100 100 Fusarium verticillioides
100
100 Fusarium graminearum
Ascomycota 100 Botrytis cinerea
Leotiomycetes Sclerotinia sclerotiorum
Yarrowia lipolytica
Candida lusitaniae
100 Candida guilliermondii
100 CTG 60 Debaryomyces hansenii
100 Candida parapsilosis
Saccharomycotina
100 Candida tropicalis
100 Candida dubliniensis
100 Candida albicans
100
Saccharomyces castellii
WGD
100
Candida glabrata
90
Saccharomyces bayanus
Saccharomyces kudriavzevii
100
100 Saccharomyces mikatae
60
100 Saccharomyces cerevisiae
100 Saccharomyces paradoxus
70 Kluyveromyces waltii
Saccharomyces kluyveri
70 Ashbya gossypii
100 Kluyveromyces lactis
0.1
Figure 2
Maximum likelihood phylogeny reconstructed using a concatenated alignment of 153 universally distributed fungal genes
Maximum likelihood phylogeny reconstructed using a concatenated alignment of 153 universally distributed
fungal genes. The concatenated alignment contains 42 taxa and exactly 38,000 amino acid positions. The optimum model
according to ModelGenerator [85] was found to be WAG+I+G. The number of rate categories was 4 (alpha = 0.83) and the
proportion of invariable sites was approximated at 0.20. Bootstrap scores for all nodes are displayed. S. castellii is found at the
base of the WGD node.
Page 6 of 15
(page number not for citation purposes)
7. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
logenies (Figure 1A&B). For example, the MRP phylogeny 3, and again places C. glabrata at the base of the WGD
places S. bayanus at the base of the Saccharomyces sensu clade.
stricto node and infers a ladderised topology amongst the
Saccharomyces sensu stricto species. The MRP inferences To analyse the degree of conflicting phylogenetic signal
(Figure 1A) match those proposed by our multigene phy- within the concatenated alignment, a phylogenetic net-
logeny (Figure 2) and are identical to that proposed by Jef- work was constructed (Figure 4). Numerous alternative
froy et al. Alternatively, the AV supertree infers that S. splits are present (491 in total). A bootstrap analysis was
bayanus and S. kudriavzevii are sister taxa (Figure 1B). preformed on the phylogenetic network [see additional
There is also an interesting difference regarding the rela- file 5]. It is interesting to note that we never observe a split
tive position of Candida glabrata and Saccharomyces castel- that excludes either C. glabrata or S. castellii from the
lii, the supertrees and the multigene phylogeny remaining WGD organisms. This conflicts with the con-
constructed by Jeffroy et al [38] place C. glabrata at the catenated phylogeny (Figure 2), which strongly infers that
base of the clade containing the organisms that have C. glabrata sits beside the remaining WGD organisms to
undergone a WGD (Figure 1A). Alternatively, our concate- the exclusion of S. castellii. It is possible that a systematic
nated alignment infers a phylogeny with S. castellii at the bias [40] may be influencing our supertrees, as synteny
base of the WGD clade (Figure 2), in agreement with syn- information clearly shows that S. castellii diverges from
tenic studies [39]. the Saccharomyces sensu stricto lineage before S. castellii,
[39]. Therefore topologies that place C. glabrata as an out-
It is possible that the differences between the phylogenies group to the Saccharomyces sensu stricto lineage and S. cas-
inferred by the MRP and AV supertrees for the Saccharomy- tellii are unreliable [39] and need closer scrutiny. These
ces sensu stricto group are due the inclusion of paralagous incongruences suggest that genome data for additional
sequences from the WGD species. We therefore con- basal WGD species is required to confidently resolve infer-
structed a supertree based exclusively on the species that ences at the base of the WGD clade.
have undergone the WGD, using 1,368 putative ortholo-
gous gene families (see methods). ML phylogenies were Phylogenetic relationships amongst Candida species
reconstructed for all gene families. The WGD-specific Both super tree (Figure 1) and superalignment (Figure 2)
supertree (Figure 3) concurs with the MRP fungal super- topologies inferred a robust monophyletic clade contain-
tree (Figure 1A) and the phylogeny of Jeffroy et al, suggest- ing organisms which translate CTG as serine instead of
ing this topology is correct. leucine [41-44]. This codon reassignment has been pro-
posed to have occurred ~170 million years ago [45]. Fur-
The placement of C. glabrata as the most basal WGD ther inspection showed that there are two distinct CTG
genome is in disagreement with the tree inferred from the sub-clades, the first contains {Candida lusitaniae, Candida
concatenated alignment (Figure 2). We therefore investi- guilliermondii, Debaromyces hansenii} and the second con-
gated the influence of fast evolving sites. Using a gamma taining {Candida tropicalis, Candida albicans, Candida dub-
distribution, we placed fast-evolving sites for each gene liniensis, Candida parapsilosis} (Figure 1). C. lusitaniae and
family into one of 8 categories, where site class 8 was the C. guilliermondii are haploid yeasts, and are apparently
most heterogeneous, and class 1 were stationary. We sys- fully sexual [46-48]. D. hansenii is homothallic, with a
tematically removed the fastest evolving sites one at a fused mating locus [49,50]. In contrast, members of the
time, and rebuilt ML phylogenies based on these reduced second clade have at best a cryptic sexual cycle and have
alignments. Supertrees were once again reconstructed for never been observed to undergo meiosis [51-55]. We
these new phylogeny sets. When the two fastest classes of decided to investigate this clade in further detail, and per-
sites were removed, (reducing the combined length of all formed specific supertree, spectral and network analyses.
1,368 genes by ~18% and ~30%), the resultant supertrees Trace sequence data for Lodderomyces elongisporus, once
group S. castelli and C. glabrata as a monophyletic group proposed as the sexual form (teleomorph) of C. parapsilo-
and fail to differentiate which is closer to the outgroup sis were also included [56,57].
[see additional file 3]. When we additionally remove the
third fastest evolving site class (reducing the combined We located 2,146 putative orthologous gene families from
length by ~38%), the final supertree [see additional file 3] our CTG database (see methods). ML phylogenies were
again infers C. glabrata at the base of the WGD clade (Fig- reconstructed for all gene families, and a supertree based
ure 3). In an effort to account for compositional biases we on these trees was reconstructed. The resultant CTG spe-
also recoded the underlying amino acid alignments into cific supertree placed L. elongisporus within the asexual
the six Dayhoff groups and inferred individual gene phyl- clade (Figure 5A) with high BP support (100%), in agree-
ogenies using the Bayesian criterion [see additional file 4]. ment with other phylogenetic studies [58,59]. A CTG spe-
The resultant supertree is identical to that shown in Figure cific phylogenetic network was also constructed and infers
that L. elongisporus groups beside C. parapsilosis, although
Page 7 of 15
(page number not for citation purposes)
8. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
Kluyveromyces waltii
Candida glabrata
Saccharomyces castellii
100
100
Saccharomyces bayanus
83
Saccharomyces kudriavzevii
100 Saccharomyces mikatae
100 Saccharomyces cerevisiae
0.1 100
Saccharomyces paradoxus
Figure consensus supertree of WGD-specific clade inferred from 1,368 underlying phylogenies
Average3
Average consensus supertree of WGD-specific clade inferred from 1,368 underlying phylogenies. MRP and MSSA supertrees
are identical. Bootstrap scores are shown at all nodes. Bayesian analysis of recoded protein alignments and further supertree
analysis yielded identical results.
there is a degree of conflict with this inference illustrated niae in a clade beside C. guilliermondii, and inferred a
by a number of alternative splits (Figure 5B). Interestingly closer relationship between the two compared with Deba-
there is no conflict for the grouping of C. albicans and C. ryomyces species. We found 1,208 gene families present in
dubliniensis illustrating their high genotypic similarity all CTG taxa; these were concatenated together to give a
[60]. These results raise interesting questions regarding nucleotide alignment of 1,291,068 sites or 860,712 sites
the sexual status of the Candida species. It is possible that when third codon positions are removed. A phylogenetic
the "asexual" species are in fact fully sexual. C. albicans network based on this nucleotide alignment (Figure 5B)
and C. dubliniensis have been observed to mate [53], and corroborated the CTG-specific supertree regarding the
in addition the C. albicans genome contains most of the grouping of C. guilliermondii and D. hansenii as sister taxa
requirements for meiosis [61]. In contrast the evidence to the exclusion of C. lusitianiae. Subsequent spectral anal-
that L. elongisporus reproduces sexually is sketchy, and is yses (Figure 5C) reinforce our CTG specific supertree and
based on the appearance of asci, with one (or sometimes network inferences. For example, split A (Figure 5C)
two) spores [62]. It is clear that further analysis is shows the relatively high degree of support for the group-
required, which will be greatly aided when the fully anno- ing of three sexual species {C. lusitianiae, C. guilliermondii
tated genome sequences of L. elongisporus and C. parapsilo- and D. hansenii} as sister taxa. Split C groups C. guillier-
sis become available. mondii and D. hansenii together, in agreement with our
CTG supertree and network. However, there is nearly
Our CTG specific supertree also suggests that D. hansenii equal character support for the grouping of C. lusitaniae
and C. guilliermondii are sister taxa, as they are grouped and D. hansenii (0.00609 vs. 0.00501) illustrated by split
together with high support (100% BP) to the exclusion of E (Figure 5C). Therefore, based on whole genome com-
C. lusitaniae. Other studies [58,63] have placed C. lusita- parisons there is only marginal evidence for the grouping
Page 8 of 15
(page number not for citation purposes)
9. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
Coprinus cinereus
Phanerochaete chrysosporium Ustilago maydis
Cryptococcus neoformans
Schizosaccharomyces pombe
Rhizopus oryzae
Yarrowia lipolytica
Candida parapsilosis
Candida dubliniensis
Candida albicans
Candida tropicalis
Magnaporthe grisea
Candida guilliermondii
Neurospora crassa
Podospora anserina Debaryomyces hansenii
Chaetomium globosum
Candida lusitaniae
Fusarium graminearum
Fusarium verticillioides
Trichoderma reesei
Botrytis cinerea Ashbya gossypii
Sclerotinia sclerotiorum
Saccharomyces kluyveri
Kluyveromyces waltii
Stagonospora nodorum
Kluyveromyces lactis
Aspergillus nidulans
Aspergillus fumigatus Candida glabrata
Aspergillus terreus
Aspergillus oryzae Saccharomyces castellii
Uncinocarpus reesii
Coccidioides immitis Saccharomyces cerevisiae
Histoplasma capsulatum Saccharomyces paradoxus
Saccharomyces mikatae
Saccharomyces bayanus
Saccharomyces kudriavzevii
Figure 4
Phylogenetic network reconstructed using a concatenated alignment of 153 universally distributed fungal genes
Phylogenetic network reconstructed using a concatenated alignment of 153 universally distributed fungal
genes. The NeighborNet method was used to infer splits within the alignment. For display purposes bootstrap scores are not
shown [see additional file 5].
of C. guilliermondii with D. hansenii to the exclusion of C. robust phylogenetic markers. Our results suggest that it
lusitianiae. may be possible to piece together the tree of life using
whole genomes. This is of interest as we expect the
Conclusion number of available genomes to increase substantially in
In this study we set out to reconstruct a fungal phylogeny tandem with new sequencing strategies [64], which con-
from whole genome sequences. Two alternative strategies tinue to decrease the costs associated with sequencing.
were chosen (supertrees and concatenated methods), and However, our study also shows that certain nodes of the
overall we observed a high degree of congruence between tree (such as the WGD clade) are difficult to resolve even
both approaches. We recovered robust fungal, phyla, sub- with genome scale data.
phyla and class clades. Overall our inferences agreed with
previous phylogenetic studies based on single genes and Methods
morphological characteristics. Sequence data
The fungal database used in this analysis consisted of 42
The phylogenomic approach undertaken in this study is genomes (Table 1). Of these 28 are complete and gene
novel in fungal phylogenetics as it circumvents problems datasets are available. Gene annotation for genomes with
associated with single gene phylogenies and selection of no annotations was performed using two separate
Page 9 of 15
(page number not for citation purposes)
10. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
(A)
Yarrowia lipolytica
(B)
Candida albicans
Candida dubliniensis Candida tropicalis
Candida lusitaniae
A C
100 Candida guilliermondii Candida parapsilosis
100 Debaryomyces hansenii
Debaryomyces hansenii
Candida tropicalis
D Lodderomyces elongisporus
100 F Candida albicans
Candida guilliermondii
100 Candida dubliniensis
100
Candida parapsilosis Candida lusitaniae
B
100 Lodderomyces elongisporus
0.1
0.1
(C) 0.02
Yarrowia lipolytica
0.015
0.01
A
B
CD
0.005
E F
0
-0.05
-0.1
-0.15
-0.2
Figure consensus supertree of CTG specific clade (A)
Average5
Average consensus supertree of CTG specific clade (A). Y. lipolytica was chosen as an outgroup. Bootstrap scores are shown at
all nodes. (B) A phylogenetic network of 1,208 concatenated genes was inferred with the NeighborNet method. The topolo-
gies of CTG-clade specific supertree and network are congruent. (C) Spectral analysis of the concatenated alignment). Bars
above the x-axis represent frequency of support for each split. Bars below the x-axis represent the sum of all corresponding
conflicts. Letters above columns represent particular splits in the data, and where applicable these have also been mapped onto
the supertree.
Page 10 of 15
(page number not for citation purposes)
11. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
approaches. The first involved a reciprocal best BLAST We were concerned that our strategy for locating ortholo-
[65] search with a cutoff E-value of 10-7 of Candida albicans gous gene families was too liberal. Therefore, we also uti-
protein coding genes against unannotated Candida lised a second stricter database search strategy that located
genomes (Table 1). Top BLAST hits longer than 300 nucle- 809 gene families [see additional file 1 &additional file 6].
otides were retained as putative open reading frames. The
second approach involved a pipeline of analysis that com- Supertree reconstruction
bined several different gene prediction programs includ- In total 4,805 input trees were used as source data for this
ing ab initio programs SNAP [66], Genezilla [67], and supertree analysis. Using the supertree software package
AUGUSTUS [67] with gene models from Exonerate [68] CLANN 3.0.3b1 [78] three supertree methods were used
and Genewise [69] based on alignments of proteins and to reconstruct fungal phylogenies, the average consensus
Expressed sequence tags. The lines of evidence were method (AV) [27], the most similar supertree analysis
merged into a single gene prediction using a combiner (MSSA) method [21], and matrix representation with par-
GLEAN (AJ Mackey, Q Liu, FCN Pereira, DS Roos, unpub- simony (MRP) [25,26]. Using CLANN 3.0.3b1, 100 boot-
lished data). These annotations are freely available for strap resamplings were also carried out on the input data.
download [70]. We tested for the presence of signal within our data using
the YAPTP test.
Reconstruction of individual gene trees
Fungal homologous sequences were identified using the Multigene analysis
BLASTP algorithm [65] with a cutoff E-value of 10-7 by All proteins from the genome sequences were compared
randomly selecting a sequence from the database, finding with FASTP [79] to find orthologous genes via a best bi-
its homologs, and removing the entire family from the directional strategy. The ortholog sets for each pair of spe-
database. Another randomly selected sequence from cies were combined with single-linkage clustering to form
within the reduced database was then used as the new multi-gene clusters of orthologs. In order to identify a set
starting point for the next search. This procedure was of single-copy genes across all organisms, only those clus-
repeated until all sequences had been removed from the ters with exactly one member per species were considered
database. Gene families with more than one representa- for further analysis, we located 227 protein families that
tive from any species were not considered for further anal- contain all fungal taxa. To help identify ohnologs and
ysis. Those remaining families with a minimum of four possible paralogs (with reference to the genomes that
sequences, and longer than 100 amino acids in length have undergone a genome duplication) we used the yeast
were selected for phylogenetic analysis. In total 5,316 pro- genome browser [80,81] to filter out genes that have no
tein families met these criteria. Individual protein families syntenic evidence. Overall 153 gene families were used for
were aligned using ClustalW 1.81 [71] with the default further analysis [see additional file 2]. Individual gene
settings. The average length of each protein alignment was families were aligned, manually edited and concatenated
697 sites. Due to the large number of protein families it together to yield an alignment with 38,000 amino acid
was not possible to manually curate all alignments. We sites. A ML phylogeny was reconstructed for this align-
therefore used only conserved alignment blocks, located ment using the MultiPhyml software. Branch supports
using Gblocks version 0.91 b [72]. This filtering stage were determined via bootstrapping. In an attempt to visu-
reduced the average length of our alignments to 214 sites. alise the degree of phylogenetic conflict within this con-
Permutation tail probability tests (PTP) [73,74] were per- catenated alignment a phylogenetic network was
formed on each alignment to test for the presence of evo- generated using the NeighborNet method [82].
lutionary signal better than random (P < 0.01). We found
that 511 alignments failed the PTP test; therefore 4,805 Investigation of specific clades
were used for phylogenetic reconstruction analysis. Using CTG clade
MultiPhyl [75] appropriate protein substitution models The genomes of C. albicans, C. dubliniensis, C. tropicalis, C.
were selected and used to reconstruct ML phylogenies for parapsilosis, D. hansenii, C. guilliermondii, C. lusitaniae and
each individual gene family. Bootstrap resampling was the outgroup Y. lipolytica were combined to give a CTG
carried out 100 times on each alignment and the results specific database. Data for L. elongisporus was retrieved
were summarised with the majority-rule consensus from the NCBI trace database and coding genes were pre-
method with a threshold of 70%. These phylogenies were dicted using a reciprocal best BLASTP search against C.
used as input data in our supertree analysis. To account albicans. In total 2,146 gene families were longer than 100
for possible compositional biases within our data, neigh- amino acids in length, with evolutionary signal, were
bor joining [76] phylogenies were also reconstructed retained for supertree analysis. ML phylogenies were
based on distances derived from the LogDet transforma- reconstructed for all gene families as described above, and
tion [77]. representative supertrees were reconstructed. A concate-
nated alignment based on 1,208 genes containing all CTG
Page 11 of 15
(page number not for citation purposes)
12. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
taxa was created. Alternative splits in the concatenated
alignment were found using the NeighborNet method Additional File 4
[82], and represented as a phylogenetic network with the Additional Methods and Results.
SplitsTree software [83]. Using the software package Spec- Click here for file
[http://www.biomedcentral.com/content/supplementary/1471-
trum [84] we also performed a spectral analysis on this 2148-6-99-S4.doc]
nucleotide alignment.
Additional File 5
WGD clade Bootstrap scores for phylogenetic Network.
The WGD clade includes the genomes of S. cerevisiae, S. Click here for file
paradoxus, S. mikatae, S. kudriavzevii, S. bayanus, S. castellii [http://www.biomedcentral.com/content/supplementary/1471-
and C. glabrata. K. waltii was selected as an outgroup. For 2148-6-99-S5.doc]
a gene family to be retained, every gene within that family
must locate every other family member (and nothing else)
Additional File 6
Supertrees (AV (A), MRP (B) and MSSA (C)) derived from the strict
in a reciprocal BLASTP search (cutoff E-value of 10-7), be gene family dataset that contains 809 genes. Bootstrap scores are shown
in single copy and contain a minimum of 4 taxa. We at selected nodes. Overall there is agreement with supertrees derived from
found 1,368 single gene families that met our criteria for the larger (liberal) dataset.
supertree analysis. ML phylogenies were reconstructed for Click here for file
individual gene families as explained earlier. Phylogeny [http://www.biomedcentral.com/content/supplementary/1471-
2148-6-99-S6.eps]
sets were also generated using Bayesian and distance
based methods; [see additional file 4].
Authors' contributions
Acknowledgements
DAF and GB were involved in the design phase. MEL & JES The authors wish to acknowledge the Wellcome Trust Sanger Institute and
predicted genes in unannotated genomes. DAF & JES BROAD institute of MIT & Harvard for releasing data ahead of publication.
sourced putative orthologs. DAF performed all phyloge- We thank Dr Chris Creevey for providing software and insight into the
netic analyses. DAF and GB drafted the manuscript. All location of orthologous gene families. Special thanks to NUI Maynooth and
authors read and approved the final manuscript. Thomas Keane for allowing us run data on their distributed phylogenetics
platform. We would like to acknowledge the financial support of the Irish
Additional material Research Council for Science, Engineering and Technology (IRCSET) and
Science Foundation Ireland (SFI). We wish to acknowledge the SFI/HEA
Irish Centre for High-End Computing (ICHEC) for the provision of compu-
Additional File 1 tational facilities and support. J.E.S. was supported by an NSF graduate
MSSA supertree derived from 4,805 fungal gene families. Bootstrap scores research fellowship.
for all nodes are displayed. Rhizopus oryzae has been selected as an out-
group. The Basidiomycota and Ascomycota phyla form distinct clades. References
Subphyla and class clades are highlighted. 1. Hawksworth DL: The fungal dimension of biodiversity: magni-
Click here for file tude, significance, and conservation. Mycol Res 1991, 95:641–
[http://www.biomedcentral.com/content/supplementary/1471- 655..
2. Hawksworth DL: The magnitude of fungal diversity: the 1.5
2148-6-99-S1.eps] million species estimate revisited. Mycol Res 2001, 109:1422–
1432..
Additional File 2 3. Lutzoni F, Kauff F, Cox CJ, McLaughlin D, Celio G, Dentinger B, Pad-
Descriptions of the 153 universally distributed genes. amsee M, Hibbett D, James TY, Baloch E, Grube M, Reeb V, Hofstet-
ter V, Schoch C, Arnold AE, Miadlikowska J, Spatafora J, Johnson D,
Click here for file Hambleton S, Crockett M, Shoemaker R, Sung G, Lucking R, Lumbsch
[http://www.biomedcentral.com/content/supplementary/1471- T, O'Donnell K, Binder M, Diederich P, Ertz D, Gueidan C, Hansen
2148-6-99-S2.doc] K, Harris R, Hosaka K, Lim Y, Matheny B, Nishida H, Pfister D, Rogers
J, Rossman A, Schmitt I, Sipman H, Stone J, Sugiyama J, Yahr R, Vilgalys
R: Assembling the fungal tree of life: progress, classification,
Additional File 3 and evolution of subcellular traits. Am J Bot 2004,
Average consensus supertrees for WGD specific clade. For each of the 91(10):1446-1480.
1,368 underlying gene families, fast evolving sites were categorised into 8 4. Liu YJ, Whelen S, Hall BD: Phylogenetic relationships among
classes. Different site classes were systematically removed and phylogenies ascomycetes: evidence from an RNA polymerse II subunit.
were reconstructed based on reduced alignments. (A) Fastest evolving sites Mol Biol Evol 1999, 16(12):1799-1808.
5. Boucher Y, Douady CJ, Papke RT, Walsh DA, Boudreau ME, Nesbo
(class 8) were removed. (B) The two fastest evolving site classes (classes 7
CL, Case RJ, Doolittle WF: Lateral gene transfer and the origins
and 8) were removed. (C) The three fastest evolving site classes (classes of prokaryotic groups. In Annu Rev Genet Volume 37. United States
6, 7 and 8) were removed. Supertrees A and B group S. castelli and C. ; 2003:283-328.
glabrata together, supertree C places C. glabrata at the base of the WGD 6. Barrett M, Donoghue MJ, Sober E: Against consensus. Systematic
clade. Zoology 1991, 40:486-493.
Click here for file 7. Kluge AG: A concern for evidence and a phylogenetic hypoth-
esis of relationships among Epicrates (Boidae, Serpentes).
[http://www.biomedcentral.com/content/supplementary/1471- Systematic Biology 1989, 38:7-25.
2148-6-99-S3.eps]
Page 12 of 15
(page number not for citation purposes)
13. BMC Evolutionary Biology 2006, 6:99 http://www.biomedcentral.com/1471-2148/6/99
8. Ciccarelli FD, Doerks T, von Mering C, Creevey CJ, Snel B, Bork P: lapping sets of taxa. In Syst Biol Volume 46. Issue 2 England ;
Toward automatic reconstruction of a highly resolved tree 1997:306-312.
of life. Science 2006, 311(5765):1283-1287. 28. Levasseur C, Lapointe FJ: War and peace in phylogenetics: a
9. Brown JR, Douady CJ, Italia MJ, Marshall WE, Stanhope MJ: Univer- rejoinder on total evidence and consensus. Syst Biol 2001,
sal trees based on large combined protein sequence data 50(6):881-891.
sets. In Nat Genet Volume 28. Issue 3 United States ; 2001:281-285. 29. Taxonomy Browser http://130.14.29.110/Taxonomy/. .
10. Baldauf SL, Roger AJ, Wenk-Siefert I, Doolittle WF: A kingdom- 30. Lumbsch HT, Schmitt I, Lindemuth R, Miller A, Mangold A, Fernandez
level phylogeny of eukaryotes based on combined protein F, Huhndorf S: Performance of four ribosomal DNA regions to
data. Science 2000, 290(5493):972-977. infer higher-level phylogenetic relationships of inoperculate
11. Rokas A, Williams BL, King N, Carroll SB: Genome-scale euascomycetes (Leotiomyceta). Mol Phylogenet Evol 2005,
approaches to resolving incongruence in molecular phyloge- 34(3):512-524.
nies. In Nature Volume 425. Issue 6960 England ; 2003:798-804. 31. Rixford E, Gilchrist C: Two cases of protozoan (coccidioidal)
12. Robbertse B, Reeves JB, Schoch CL, Spatafora JW: A phylogenomic infection of the skin and other organs. Johns Hopkins Hospital
analysis of the Ascomycota. Fungal Genet Biol 2006. Report 1896, 1:209-268.
13. James TY, Kauff F, Schoch CL, Matheny PB, Hofstetter V, Cox CJ, 32. Ophuls MD: Further observations on a pathogenic mould for-
Celio G, Gueidan C, Fraker E, Miadlikowska J, Lumbsch HT, Rauhut merly described as a protozoon (Coccidioides immitis, Coc-
A, Reeb V, Arnold AE, Amtoft A, Stajich JE, Hosaka K, Sung GH, John- cidioides pyrogenes). J Exp Med 1905, 6:443-485.
son D, O'Rourke B, Crockett M, Binder M, Curtis JM, Slot JC, Wang 33. Ciferri R, Redaelli P: Morfologia, biologia e posizione sistemat-
Z, Wilson AW, Schussler A, Longcore JE, O'Donnell K, Mozley-Stan- ica di Coccidioides immitis stiles e delle sue varieta, con noti-
dridge S, Porter D, Letcher PM, Powell MJ, Taylor JW, White MM, zie sul granuloma coccidioide. R Accad Ital 1936, 7:399-474.
Griffith GW, Davies DR, Humber RA, Morton JB, Sugiyama J, Ross- 34. Baker EE, Mrak M, Smith CE: The morphology, taxonomy, and
man AY, Rogers JD, Pfister DH, Hewitt D, Hansen K, Hambleton S, distribution of Coccidioides irnmitis rixford and gilchrist
Shoemaker RA, Kohlmeyer J, Volkmann-Kohlmeyer B, Spotts RA, 1896. Farlowia 1943, 1:199-244.
Serdani M, Crous PW, Hughes KW, Matsuura K, Langer E, Langer G, 35. Bowman BH, White TJ, Taylor JW: Human pathogeneic fungi and
Untereiner WA, Lucking R, Budel B, Geiser DM, Aptroot A, Died- their close nonpathogenic relatives. Mol Phylogenet Evol 1996,
erich P, Schmitt I, Schultz M, Yahr R, Hibbett DS, Lutzoni F, McLaugh- 6(1):89-96.
lin DJ, Spatafora JW, Vilgalys R: Reconstructing the early 36. Pan S, Sigler L, Cole GT: Evidence for a phylogenetic connection
evolution of Fungi using a six-gene phylogeny. Nature 2006, between Coccidioides immitis and Uncinocarpus reesii
443(7113):818-822. (Onygenaceae). Microbiology 1994, 140 ( Pt 6):1481-1494.
14. Gadagkar SR, Rosenberg MS, Kumar S: Inferring species phyloge- 37. Berbee ML: The phylogeny of plant and animal pathogens in
nies from multiple genes: concatenated sequence tree ver- the Ascomycota. Physiological and Molecular Plant Pathology 2001,
sus consensus gene tree. J Exp Zoolog B Mol Dev Evol 2005, 59(4):165-187.
304(1):64-74. 38. Jeffroy O, Brinkmann H, Delsuc F, Philippe H: Phylogenomics: the
15. Wilkinson M, Cotton JA, Creevey C, Eulenstein O, Harris SR, beginning of incongruence? Trends Genet 2006, 22(4):225-231.
Lapointe FJ, Levasseur C, McInerney JO, Pisani D, Thorley JL: The 39. Scannell DR, Byrne KP, Gordon JL, Wong S, Wolfe KH: Multiple
shape of supertrees to come: tree shape related properties rounds of speciation associated with reciprocal gene loss in
of fourteen supertree methods. Syst Biol 2005, 54(3):419-431. polyploid yeasts. Nature 2006, 440(7082):341-345.
16. Bull JJ, Huelsenbeck JP, Cunningham CW, Swofford DL, Waddell PJ: 40. Phillips MJ, Delsuc F, Penny D: Genome-scale phylogeny and the
Partitioning and Combining Data in Phylogenetic Analysis. detection of systematic biases. Mol Biol Evol 2004,
Systematic Biology 1993, 42(3):384-397. 21(7):1455-1458.
17. Daubin V, Gouy M, Perriere G: A phylogenomic approach to 41. Kawaguchi Y, Honda H, Taniguchi-Morimura J, Iwasaki S: The codon
bacterial phylogeny: evidence of a core of genes sharing a CUG is read as serine in an asporogenic yeast Candida cylin-
common history. Genome Res 2002, 12(7):1080-1090. dracea. Nature 1989, 341(6238):164-166.
18. Jones KE, Purvis A, MacLarnon A, Bininda_Emonds OR, Simmons NB: 42. Ohama T, Suzuki T, Mori M, Osawa S, Ueda T, Watanabe K, Nakase
A phylogenetic supertree of the bats (Mammalia: Chirop- T: Non-universal decoding of the leucine codon CUG in sev-
tera). In Biol Rev Camb Philos Soc Volume 77. Issue 2 England ; eral Candida species. Nucleic Acids Res 1993, 21(17):4039-4045.
2002:223-259. 43. Santos MA, Tuite MF: The CUG codon is decoded in vivo as ser-
19. Ruta M, Jeffery JE, Coates MI: A supertree of early tetrapods. In ine and not leucine in Candida albicans. Nucleic Acids Res 1995,
Proc R Soc Lond B Biol Sci Volume 270. Issue 1532 England ; 23(9):1481-1486.
2003:2507-2516. 44. Sugita T, Nakase T: Nonuniversal usage of the leucine CUG
20. Pisani D, Yates AM, Langer MC, Benton MJ: A genus-level super- codon in yeasts: Investigation of basidiomycetous yeast. J Gen
tree of the Dinosauria. In Proc R Soc Lond B Biol Sci Volume 269. Appl Microbiol 1999, 45(4):193-197.
Issue 1494 England ; 2002:915-921. 45. Massey SE, Moura G, Beltrao P, Almeida R, Garey JR, Tuite MF, Santos
21. Creevey CJ, Fitzpatrick DA, Philip GK, Kinsella RJ, O'Connell M J, MA: Comparative evolutionary genomics unveils the molec-
Pentony MM, Travers SA, Wilkinson M, McInerney JO: Does a tree- ular mechanism of reassignment of the CTG codon in Cand-
like phylogeny only exist at the tips in the prokaryotes? Proc ida spp. Genome Res 2003, 13(4):544-557.
R Soc Lond B Biol Sci 2004, 271(1557):2551-2558. 46. Rodrigues de Miranda L: Clavispora, a new yeast genus of the
22. Fitzpatrick DA, Creevey CJ, McInerney JO: Genome phylogenies Saccharomycetales. Antonie Van Leeuwenhoek 1979,
indicate a meaningful alpha-proteobacterial phylogeny and 45(3):479-483.
support a grouping of the mitochondria with the Rickett- 47. Wickerham LJ, Burton KA: A clarification of the relationship of
siales. Mol Biol Evol 2006, 23(1):74-85. Candida guilliermondii to other yeasts by a study of their
23. Delsuc F, Brinkmann H, Philippe H: Phylogenomics and the mating types. J Bacteriol 1954, 68(5):594-597.
reconstruction of the tree of life. Nat Rev Genet 2005, 48. Young LY, Lorenz MC, Heitman J: A STE12 homolog is required
6(5):361-375. for mating but dispensable for filamentation in Candida lusi-
24. Eisen JA: Phylogenomics: improving functional predictions for taniae. Genetics 2000, 155(1):17-29.
uncharacterized genes by evolutionary analysis. Genome Res 49. Dujon B, Sherman D, Fischer G, Durrens P, Casaregola S, Lafontaine
1998, 8(3):163-167. I, De Montigny J, Marck C, Neuveglise C, Talla E, Goffard N, Frangeul
25. Baum BR: Combining trees as a way of combining data sets for L, Aigle M, Anthouard V, Babour A, Barbe V, Barnay S, Blanchin S,
phylogenetic inference, and the desirability of combining Beckerich JM, Beyne E, Bleykasten C, Boisrame A, Boyer J, Cattolico
gene trees. Taxon 1992, 41:3-10. L, Confanioleri F, De Daruvar A, Despons L, Fabre E, Fairhead C,
26. Ragan MA: Matrix representation in reconstructing phyloge- Ferry-Dumazet H, Groppi A, Hantraye F, Hennequin C, Jauniaux N,
netic relationships among the eukaryotes. Biosystems 1992, Joyet P, Kachouri R, Kerrest A, Koszul R, Lemaire M, Lesur I, Ma L,
28(1-3):47-55. Muller H, Nicaud JM, Nikolski M, Oztas S, Ozier-Kalogeropoulos O,
27. Lapointe FJ, Cucumel G: The average consensus procedure: Pellenz S, Potier S, Richard GF, Straub ML, Suleau A, Swennen D,
combination of weighted trees containing identical or over- Tekaia F, Wesolowski-Louvel M, Westhof E, Wirth B, Zeniou-Meyer
M, Zivanovic I, Bolotin-Fukuhara M, Thierry A, Bouchier C, Caudron
Page 13 of 15
(page number not for citation purposes)