assembling a draft
overall tree of life from
phylogenetic trees and
taxonomic databases
Jonathan A Rees
US National Evolutionary Synthesis Center
Duke University
rees@nescent.org
TDWG, 31 October 2013
software team:
Jim Allman
Joseph Brown
Karen Cranston
Cody Hinchliff
Mark Holder
Jonathan Leto
Emily McTavish
Peter Midford
Rick Ree
Stephen Smith

funding:
US NSF
what is open tree of life?
1. collect phylogenetic trees for best
possible coverage of entire tree of life

Drew BT, Gazis R, Cabezas P, Swithers KS, Deng J, et al. (2013) Lost Branches on the Tree of
Life. PLoS Biol 11(9): e1001636. http://dx.doi.org/10.1371/journal.pbio.1001636
2. normalize tips so that they match
between source trees
label → normalization

Hemsleya amabilis HS454 → 524163 Hemsleya amabilis
Theria → 4267989 Theria in Arthropoda
Nicotiana suaveolans var excelsior → 232354 Nicotiana rotundifolia
Selysia prunifera → 949305 Cayaponia prunifera
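
The kind of tip normalization shown above can be sketched roughly as follows. This is a minimal illustration, not the project's actual matching code: the `TAXA` and `SYNONYMS` tables and the `normalize_tip` function are hypothetical, and only the identifiers and names printed on the slide are used as data. It tries the full label, then drops trailing tokens such as a specimen code, and follows a synonym to the accepted name. Homonyms such as "Theria" need extra context and are deliberately not handled here.

```python
# Toy taxonomy: taxon id -> accepted name (values taken from the slide).
TAXA = {
    524163: "Hemsleya amabilis",
    949305: "Cayaponia prunifera",
}

# Toy synonym table (hypothetical structure): synonym -> taxon id.
SYNONYMS = {
    "Selysia prunifera": 949305,
}


def normalize_tip(label):
    """Return (taxon_id, accepted_name) for a tip label, or None if unmatched."""
    by_name = {name: tid for tid, name in TAXA.items()}
    by_name.update(SYNONYMS)
    words = label.split()
    # Try the full label first, then progressively drop trailing tokens
    # (for example a specimen code). Stop at two words so that a species
    # label is never silently matched to a bare genus name.
    for end in range(len(words), 1, -1):
        candidate = " ".join(words[:end])
        if candidate in by_name:
            tid = by_name[candidate]
            return tid, TAXA[tid]
    return None
```

With these toy tables, `normalize_tip("Hemsleya amabilis HS454")` resolves to taxon 524163, and the synonym "Selysia prunifera" resolves to the accepted name "Cayaponia prunifera".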
3. synthesize a single ‘big tree’
algorithmically from the source trees

Smith SA, Brown JW, Hinchliff CE (2013) Analyzing and Synthesizing Phylogenies Using Tree
Alignment Graphs. PLoS Comput Biol 9(9): e1003223. http://dx.doi.org/10.1371/journal.pcbi.1003223
4. expose source trees and ‘big tree’ in
various ways
exposing provenance
• links to studies
• links to data deposits (e.g. TreeBASE)
• links to taxonomic database records
• methods documentation
• versioning
reference taxonomy
• used for normalization, internal node
labeling, gap-filling

• need NCBI taxonomy
• supplement with GBIF
• patch system
• future: other sources
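
One way to picture the layering described above (a primary source, a supplementary source, then patches) is the sketch below. It is an assumption-laden illustration, not the project's taxonomy builder: the function name, the name-to-parent dictionary format, the patch operations ("add", "move", "remove"), and the rule that the primary source wins on conflict are all hypothetical, and the taxon names are placeholders rather than real data.

```python
def build_reference_taxonomy(primary, supplement, patches):
    """Combine two name -> parent mappings, then apply manual patches.

    primary, supplement: dicts mapping a taxon name to its parent name.
    patches: list of (operation, name, parent) tuples (hypothetical format).
    """
    taxonomy = dict(primary)
    # Only add taxa the primary source lacks; the primary source wins on conflict
    # (an assumption made for this sketch).
    for name, parent in supplement.items():
        taxonomy.setdefault(name, parent)
    # Patches are applied last so that curators can override either source.
    for operation, name, parent in patches:
        if operation in ("add", "move"):
            taxonomy[name] = parent
        elif operation == "remove":
            taxonomy.pop(name, None)
        else:
            raise ValueError(f"unknown patch operation: {operation}")
    return taxonomy


# Placeholder data standing in for NCBI-like and GBIF-like inputs.
primary_source = {"genus_a": "family_x", "genus_b": "family_x"}
supplementary_source = {"genus_b": "family_y", "genus_c": "family_y"}
manual_patches = [("move", "genus_c", "family_x"), ("remove", "genus_a", None)]

reference = build_reference_taxonomy(
    primary_source, supplementary_source, manual_patches
)
```

Keeping patches as a separate, ordered list means each manual correction stays visible and can be re-applied when a source taxonomy is refreshed.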
‘open’
trees are not creative expression
... ergo no © protection
... ergo © licensing is meaningless
... CC0 is nice (and required by Dryad),
but no CC0 for legacy data or NCBI
lessons
• NeXML and BadgerFish are good
• machine-processable tip identity would
be awfully nice

• we were surprised by the tree rooting
problem

• provenance is an uphill battle
• to be seen: GitHub for data curation?
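
For readers unfamiliar with BadgerFish: it is a convention for rendering XML as JSON in which attributes become "@"-prefixed keys, text content is stored under "$", and repeated child elements become arrays. The sketch below is a minimal converter under those three rules only. It ignores namespaces and mixed content, the `badgerfish` function name is my own, and the XML fragment is a made-up, NeXML-like snippet rather than valid NeXML.

```python
import xml.etree.ElementTree as ET


def badgerfish(elem):
    """Convert an ElementTree element to a BadgerFish-style dict.

    Covers attributes ("@name"), text ("$"), and repeated children (lists).
    Namespaces and mixed content are not handled in this sketch.
    """
    obj = {"@" + key: value for key, value in elem.attrib.items()}
    text = (elem.text or "").strip()
    if text:
        obj["$"] = text
    for child in elem:
        value = badgerfish(child)[child.tag]
        if child.tag in obj:
            existing = obj[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                # Second occurrence of this tag: promote to a list.
                obj[child.tag] = [existing, value]
        else:
            obj[child.tag] = value
    return {elem.tag: obj}


# Made-up fragment for illustration only.
fragment = '<otu id="otu1" label="Hemsleya amabilis"><meta>x</meta><meta>y</meta></otu>'
converted = badgerfish(ET.fromstring(fragment))
```

The resulting dictionary can be passed to `json.dumps` to obtain the JSON form.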
© 2013 Jonathan A Rees / CC-BY 3.0
