The Open Tree of Life project aims to build a complete, freely accessible digital tree of life from published phylogenetic studies and taxonomies. Data collection has yielded more than 4,800 phylogenetic trees from more than 2,300 studies, alongside a taxonomy of about 2.6 million names. Synthesis filters and weights the source trees and combines them into a single synthetic tree using a graph database, while the taxonomy supplies a backbone and coverage for species missing from phylogenies. The resulting public tree is refined with user feedback and new data. Goals for years 2 and 3 include refining the draft tree, research into phylogenetic synthesis, and user features such as tree comparison and synthesis on demand.
3. What does it mean to “have” the tree of life?
complete & dynamic
browse, download, query
use for research questions
implies digital access
4. Open Tree of Life
Taxonomy +
Source trees
• filter / weight input trees
• combine into synthetic tree
• feedback
• input new data sets
5. Inputs: Phylogenetic data
Archiving sequence data is a community norm
~4% of all published phylogenetic trees (Stoltzfus et al. 2012)
6. Heroic data collection efforts
Surveyed >7000 phylogenetic studies in plants, fungi, animals, and unicellular organisms
Result: data for >2300 studies, >4800 trees
Poster P133003 tonight!
7. Inputs: Taxonomies
Large fraction of species not represented in phylogenies
taxonomy provides backbone & coverage at tips
2,644,685 names: NCBI (structure) + GBIF (completeness)
https://github.com/OpenTreeOfLife/opentree/wiki/Open-Tree-Taxonomy
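A minimal sketch of the idea on this slide, assuming each taxonomy can be reduced to a child-to-parent map: one source acts as the backbone (structure) and a second source only contributes names the backbone lacks (completeness). The function names `merge_taxonomies` and `lineage` and the toy data are hypothetical illustrations, not the project's actual taxonomy-building code.

```python
# Toy taxonomies as child -> parent maps. The names and placements below
# are illustrative only and are not taken from NCBI or GBIF.
backbone = {            # plays the role of the "structure" source
    "Felidae": "Carnivora",
    "Felis": "Felidae",
    "Felis catus": "Felis",
}
supplement = {          # plays the role of the "completeness" source
    "Felis": "Carnivora",       # conflicts with the backbone placement
    "Felis catus": "Felis",
    "Felis chaus": "Felis",     # name missing from the backbone
}


def merge_taxonomies(backbone, supplement):
    """Keep every backbone placement; add only names the backbone lacks."""
    merged = dict(backbone)
    for name, parent in supplement.items():
        if name not in merged:
            merged[name] = parent
    return merged


def lineage(taxonomy, name):
    """Walk from a name up to the top of the child -> parent map."""
    path = [name]
    while path[-1] in taxonomy:
        path.append(taxonomy[path[-1]])
    return path


merged = merge_taxonomies(backbone, supplement)
print(lineage(merged, "Felis chaus"))
```

In this sketch the backbone always wins a conflict, so the added species inherits the backbone's placement of its genus.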
10. Synthesizing trees and taxonomies
Graph database for phylogenies (treemachine) and taxonomy (taxomachine)
Allows for extremely efficient storage and retrieval
Rules to extract binary tree from highly conflicting graph
More details? Stephen Smith 8:30 am Monday!
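A toy sketch of how a tree might be pulled out of a conflicting graph, assuming the simplest possible rule: every node may have several candidate parents, one per source, and the highest-ranked source wins. An in-memory dictionary stands in for the graph database, and `build_graph`, `extract_tree`, and the source labels are hypothetical. The actual treemachine rules are not described on the slide and are certainly more involved; for example, this sketch does not guard against cycles.

```python
from collections import defaultdict

# Each edge is (child, parent, source). The taxonomy and one hypothetical
# study disagree about where "B" belongs.
edges = [
    ("G1", "root", "taxonomy"),
    ("G2", "root", "taxonomy"),
    ("A", "G1", "taxonomy"),
    ("B", "G1", "taxonomy"),
    ("C", "G2", "taxonomy"),
    ("B", "G2", "study_1"),   # conflicting placement of B
    ("C", "G2", "study_1"),
]


def build_graph(edges):
    """Store every proposed parent of every node, keyed by source."""
    graph = defaultdict(dict)  # child -> {source: parent}
    for child, parent, source in edges:
        graph[child][source] = parent
    return graph


def extract_tree(graph, ranked_sources):
    """Pick one parent per node, preferring earlier sources in the ranking."""
    tree = {}
    for child, parents in graph.items():
        for source in ranked_sources:
            if source in parents:
                tree[child] = parents[source]
                break
    return tree


graph = build_graph(edges)
study_first = extract_tree(graph, ["study_1", "taxonomy"])
taxonomy_first = extract_tree(graph, ["taxonomy", "study_1"])
print(study_first["B"], taxonomy_first["B"])
```

Keeping every proposed edge in the graph and resolving conflict only at extraction time means the ranking can change without reloading the source trees.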
14. Collaborations
providing images and text for public tree
developing methods for subtree extraction
summer student providing links to ToLWeb pages
treeviz project from U Indiana MOOC,
GNOME summer intern
partner for data archiving / harvest
16. Year 2 & 3 goals
Refine draft tree based on user feedback / new data
Research into phylogenetic synthesis
User features
How does my tree compare with others?
Synthesis on demand
Quantifying / visualizing conflict
Suggestions?
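One way the "How does my tree compare with others?" and conflict-quantifying features could be approached, sketched under the assumption that trees are rooted, share a leaf set, and are written as nested tuples. Two clades are counted as conflicting when they overlap without one containing the other. The function names are hypothetical, and this is not a description of a planned Open Tree implementation.

```python
def clades(tree):
    """Return the set of clades (as frozensets of tip names) in a nested-tuple tree."""
    found = set()

    def tips(node):
        if isinstance(node, str):          # a tip
            return frozenset([node])
        below = frozenset().union(*(tips(child) for child in node))
        found.add(below)                   # record the clade at this internal node
        return below

    tips(tree)
    return found


def conflict(a, b):
    """Two clades conflict if they overlap but neither contains the other."""
    return bool(a & b) and not (a <= b or b <= a)


def compare(tree_a, tree_b):
    """Summarise agreement and conflict between two rooted trees."""
    ca, cb = clades(tree_a), clades(tree_b)
    return {
        "shared": ca & cb,
        "only_a": ca - cb,
        "only_b": cb - ca,
        "conflicting_in_a": {a for a in ca if any(conflict(a, b) for b in cb)},
    }


my_tree = (("A", "B"), ("C", "D"))
other_tree = ((("A", "C"), "B"), "D")
report = compare(my_tree, other_tree)
print(len(report["shared"]), len(report["conflicting_in_a"]))
```

The sizes of `only_a` and `only_b` together give a Robinson-Foulds-style distance for rooted trees, while `conflicting_in_a` points at the specific clades that a visualization could highlight.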
17. Gordon Burleigh
Keith Crandall
Karen Cranston
Karl Gude
David Hibbett
Mark Holder
Laura Katz
Rick Ree
Stephen Smith
Doug Soltis
Tiffani Williams