SlideShare une entreprise Scribd logo
1  sur  17
Télécharger pour lire hors ligne
Bio.Phylo
A unified phylogenetics toolkit for Biopython


                Eric Talevich

            Institute of Bioinformatics
              University of Georgia


               June 29, 2010
Abstract


       Bio.Phylo is a new phylogenetics library for:

• Exploring, modifying and annotating trees
• Reading & writing standard file formats
• Quick visualization
• Gluing together computational pipelines



                 Availability: Biopython 1.54
A quick survey of file formats

   Newick (a.k.a. New Hampshire) is a simple nested-parens
          format:    (A, (B, C), (D, E))
             • Extended & tweaked, led to NHX (and parsing
               problems)

   Nexus is a collection of formats, including Newick trees
             • More than just tree data. . . still tough to parse

PhyloXML is an XML-based replacement for NHX
             • Annotations formalized as XML elements;
               extensible with user-defined element types

  NeXML is an XML-based successor to Nexus
             • Ontology-based — key-value assignments have
               semantic meaning
Demo: What’s in a tree?




1. Read a simple Newick file
                              4. Promote to a PhyloXML tree
2. Inspect through IPython
                              5. Set branch colors
3. Draw with
                              6. Write a PhyloXML file
   PyLab/matplotlib
# In a terminal, make a simple Newick file
# Then launch the IPython interpreter and read the file


% cat > simple.dnd <<EOF
> (((A,B),(C,D)),(E,F,G))
> EOF

% ipython -pylab
>>> from Bio import Phylo
>>> tree = Phylo.read(’simple.dnd’, ’newick’)
# String representation shows the object structure

>>> print tree

Tree(weight=1.0, rooted=False, name=’’)
    Clade(branch_length=1.0)
        Clade(branch_length=1.0)
            Clade(branch_length=1.0)
                Clade(branch_length=1.0, name=’A’)
                Clade(branch_length=1.0, name=’B’)
            Clade(branch_length=1.0)
                Clade(branch_length=1.0, name=’C’)
                Clade(branch_length=1.0, name=’D’)
        Clade(branch_length=1.0)
            Clade(branch_length=1.0, name=’E’)
            Clade(branch_length=1.0, name=’F’)
            Clade(branch_length=1.0, name=’G’)
# Draw an ASCII-art dendrogram

>>> Phylo.draw_ascii(tree, column_width=52)

                                  ______________   A
                  ______________|
                 |               |______________   B
   ______________|
 |               |                ______________   C
 |               |______________|
_|                               |______________   D
 |
 |                 ______________ E
 |               |
 |______________|______________ F
                 |
                 |______________ G
>>> tree.rooted = True
>>> Phylo.draw graphiz(tree)

                                   D
              A


                                           C



       B

                                       G

                  E
                               F
# Promote a basic tree to PhyloXML
>>> from Bio.Phylo.PhyloXML import Phylogeny
>>> phy = Phylogeny.from_tree(tree)
>>> print phy

Phylogeny(rooted=True, name=’’)
    Clade(branch_length=1.0)
        Clade(branch_length=1.0)
            Clade(branch_length=1.0)
                Clade(branch_length=1.0, name=’A’)
                Clade(branch_length=1.0, name=’B’)
            Clade(branch_length=1.0)
                Clade(branch_length=1.0, name=’C’)
                Clade(branch_length=1.0, name=’D’)
        Clade(branch_length=1.0)
            Clade(branch_length=1.0, name=’E’)
            Clade(branch_length=1.0, name=’F’)
            Clade(branch_length=1.0, name=’G’)
Branch color
>>> phy.root.color = (128, 128, 128)
Or:
>>> phy.root.color = ’#808080’
Or:
>>> phy.root.color = ’gray’

Find clades by attribute values:
>>> mrca = phy.common ancestor({’name’:’E’},
                                 {’name’:’F’})
>>> mrca.color = ’salmon’

Directly index a clade:
>>> phy.clade[0,1].color = ’blue’

>>> Phylo.draw graphviz(phy, prog=’neato’)
D               B


C                       A




        G       F

            E
# Save the color annotations in phyloXML

>>> Phylo.write(phy, ’simple-color.xml’, ’phyloxml’)

<phy:phyloxml xmlns:phy="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <clade>
        <branch_length>1.0</branch_length>
        <color>
            <red>128</red>
            <green>128</green>
            <blue>128</blue>
        </color>
        <clade>
            <branch_length>1.0</branch_length>
            <clade>
                 <branch_length>1.0</branch_length>
                 <clade>
                     <name>A</name>
                     ...
Thanks


Holla:
  • Brad Chapman and Christian Zmasek, GSoC 2009 mentors
  • The Biopython developers, feat. Peter J. A. Cock,
    Frank Kauff & Cymon J. Cox
  • Hilmar Lapp & the NESCent Phyloinformatics program
  • Google’s Open Source Programs Office
  • My professor, Dr. Natarajan Kannan
  • Developers like you
Q&A



• Which 3rd-party applications should we wrap in
  Bio.Phylo.Applications? (e.g. RAxML, MrBayes)
• Which other libraries should we support interoperability with?
  (PyCogent, ape)
• What other algorithms are simple, stable and relevant?
  (Consensus, rooting)
• Features for systematics? (Geography, PopGen integration?)
Extra: Tree methods
>>> dir(tree)

collapse                      get terminals
collapse all                  is bifurcating
common ancestor               is monophyletic
count terminals               is parent of
depths                        is preterminal
distance                      ladderize
find any                      prune
find clades                   split
find elements                 total branch length
get nonterminals              trace
get path

   See: http://biopython.org/DIST/docs/api/Bio.Phylo.
             BaseTree.TreeMixin-class.html
Extra: The Bio.Phylo class hierarchy




Figure: Inheritance relationship among the core classes
Extra: PhyloXML classes

 $ pydoc Bio.Phylo.PhyloXML

Accession              Date                 Point
Alphabet               Distribution         Polygon
Annotation             DomainArchitecture   Property
BaseTree               Events               ProteinDomain
BinaryCharacters       Id                   Reference
BranchColor            MolSeq               Sequence
Clade                  Other                SequenceRelation
CladeRelation          Phylogeny            Taxonomy
Confidence              Phyloxml             Uri


            See: http://biopython.org/wiki/PhyloXML

Contenu connexe

En vedette

Goozzy
Goozzy Goozzy
Goozzy alarin
 
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...Juan Carbonell
 
The holy land 2015
The holy land 2015The holy land 2015
The holy land 2015tomdinapoli
 
Brochure Koertse Bouw En Onderhoud
Brochure Koertse Bouw En OnderhoudBrochure Koertse Bouw En Onderhoud
Brochure Koertse Bouw En OnderhoudCees Koertse
 
Русская бизнес-профессионалов
Русская бизнес-профессионаловРусская бизнес-профессионалов
Русская бизнес-профессионаловDavid Dugas
 
Handout sekolah pilot aaa academy
Handout sekolah pilot aaa academyHandout sekolah pilot aaa academy
Handout sekolah pilot aaa academyMuhammad Abdullah
 
Westweaves Profile
Westweaves ProfileWestweaves Profile
Westweaves Profileanantdamani
 
Overzicht syllabus beroepspraktijk 1
Overzicht syllabus beroepspraktijk 1Overzicht syllabus beroepspraktijk 1
Overzicht syllabus beroepspraktijk 1CVO-SSH
 
4de lesdag kindfactoren
4de lesdag kindfactoren4de lesdag kindfactoren
4de lesdag kindfactorenCVO-SSH
 
Moser and Riutta - Partnerships for Student Learning
Moser and Riutta - Partnerships for Student LearningMoser and Riutta - Partnerships for Student Learning
Moser and Riutta - Partnerships for Student Learningoxfordcollegelibrary
 
Shannon Smith Cv 201109
Shannon Smith Cv 201109Shannon Smith Cv 201109
Shannon Smith Cv 201109shagsa
 
Overall complete result indonesia friendly memory championship i
Overall complete result indonesia friendly memory championship iOverall complete result indonesia friendly memory championship i
Overall complete result indonesia friendly memory championship iYudi Lesmana
 
Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...Dimitri Corpakis
 

En vedette (20)

Proton ds userguide
Proton ds userguideProton ds userguide
Proton ds userguide
 
Shikha Verma_Resume
Shikha Verma_ResumeShikha Verma_Resume
Shikha Verma_Resume
 
Goozzy
Goozzy Goozzy
Goozzy
 
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...
 
Latest trends in em
Latest trends in emLatest trends in em
Latest trends in em
 
Small Business Profits Tune-Up
Small Business Profits Tune-UpSmall Business Profits Tune-Up
Small Business Profits Tune-Up
 
2004 04 27_ocpd_casestudies
2004 04 27_ocpd_casestudies2004 04 27_ocpd_casestudies
2004 04 27_ocpd_casestudies
 
cvBarisGomleksizoglu-eng
cvBarisGomleksizoglu-engcvBarisGomleksizoglu-eng
cvBarisGomleksizoglu-eng
 
The holy land 2015
The holy land 2015The holy land 2015
The holy land 2015
 
Brochure Koertse Bouw En Onderhoud
Brochure Koertse Bouw En OnderhoudBrochure Koertse Bouw En Onderhoud
Brochure Koertse Bouw En Onderhoud
 
Русская бизнес-профессионалов
Русская бизнес-профессионаловРусская бизнес-профессионалов
Русская бизнес-профессионалов
 
Chi2015 sig-od
Chi2015 sig-odChi2015 sig-od
Chi2015 sig-od
 
Handout sekolah pilot aaa academy
Handout sekolah pilot aaa academyHandout sekolah pilot aaa academy
Handout sekolah pilot aaa academy
 
Westweaves Profile
Westweaves ProfileWestweaves Profile
Westweaves Profile
 
Overzicht syllabus beroepspraktijk 1
Overzicht syllabus beroepspraktijk 1Overzicht syllabus beroepspraktijk 1
Overzicht syllabus beroepspraktijk 1
 
4de lesdag kindfactoren
4de lesdag kindfactoren4de lesdag kindfactoren
4de lesdag kindfactoren
 
Moser and Riutta - Partnerships for Student Learning
Moser and Riutta - Partnerships for Student LearningMoser and Riutta - Partnerships for Student Learning
Moser and Riutta - Partnerships for Student Learning
 
Shannon Smith Cv 201109
Shannon Smith Cv 201109Shannon Smith Cv 201109
Shannon Smith Cv 201109
 
Overall complete result indonesia friendly memory championship i
Overall complete result indonesia friendly memory championship iOverall complete result indonesia friendly memory championship i
Overall complete result indonesia friendly memory championship i
 
Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...
 

Similaire à Talevich bosc2010 bio-phylo

Phylogeny in R - Bianca Santini Sheffield R Users March 2015
Phylogeny in R - Bianca Santini Sheffield R Users March 2015Phylogeny in R - Bianca Santini Sheffield R Users March 2015
Phylogeny in R - Bianca Santini Sheffield R Users March 2015Paul Richards
 
A search engine for phylogenetic tree databases - D. Fernándes-Baca
A search engine for phylogenetic tree databases - D. Fernándes-BacaA search engine for phylogenetic tree databases - D. Fernándes-Baca
A search engine for phylogenetic tree databases - D. Fernándes-BacaRoderic Page
 
Package-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary ResultsPackage-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary ResultsJie Bao
 
Bioinformatics p5-bioperl v2013-wim_vancriekinge
Bioinformatics p5-bioperl v2013-wim_vancriekingeBioinformatics p5-bioperl v2013-wim_vancriekinge
Bioinformatics p5-bioperl v2013-wim_vancriekingeProf. Wim Van Criekinge
 
Representing and Reasoning with Modular Ontologies
Representing and Reasoning with Modular OntologiesRepresenting and Reasoning with Modular Ontologies
Representing and Reasoning with Modular OntologiesJie Bao
 
Phylogenetics Analysis in R
Phylogenetics Analysis in RPhylogenetics Analysis in R
Phylogenetics Analysis in RKlaus Schliep
 
Querying XML: XPath and XQuery
Querying XML: XPath and XQueryQuerying XML: XPath and XQuery
Querying XML: XPath and XQueryKatrien Verbert
 
Semi-automated Exploration and Extraction of Data in Scientific Tables
Semi-automated Exploration and Extraction of Data in Scientific TablesSemi-automated Exploration and Extraction of Data in Scientific Tables
Semi-automated Exploration and Extraction of Data in Scientific TablesElsevier
 
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docx
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docxCS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docx
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docxannettsparrow
 
PhyloTastic: names-based phyloinformatic data integration
PhyloTastic: names-based phyloinformatic data integrationPhyloTastic: names-based phyloinformatic data integration
PhyloTastic: names-based phyloinformatic data integrationRutger Vos
 
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)Erik Hatcher
 
Plays Well with Others, or What I’ve learned as a data provider in an intero...
Plays Well with Others, or What I’ve learned as a data provider in an intero...Plays Well with Others, or What I’ve learned as a data provider in an intero...
Plays Well with Others, or What I’ve learned as a data provider in an intero...Chris Freeland
 
These questions will be a bit advanced level 2
These questions will be a bit advanced level 2These questions will be a bit advanced level 2
These questions will be a bit advanced level 2sadhana312471
 
Perl%20SYLLABUS%20PB
Perl%20SYLLABUS%20PBPerl%20SYLLABUS%20PB
Perl%20SYLLABUS%20PBtutorialsruby
 

Similaire à Talevich bosc2010 bio-phylo (20)

Phylogeny in R - Bianca Santini Sheffield R Users March 2015
Phylogeny in R - Bianca Santini Sheffield R Users March 2015Phylogeny in R - Bianca Santini Sheffield R Users March 2015
Phylogeny in R - Bianca Santini Sheffield R Users March 2015
 
A search engine for phylogenetic tree databases - D. Fernándes-Baca
A search engine for phylogenetic tree databases - D. Fernándes-BacaA search engine for phylogenetic tree databases - D. Fernándes-Baca
A search engine for phylogenetic tree databases - D. Fernándes-Baca
 
PYTHON 101.pptx
PYTHON 101.pptxPYTHON 101.pptx
PYTHON 101.pptx
 
Package-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary ResultsPackage-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary Results
 
biopython, doctest and makefiles
biopython, doctest and makefilesbiopython, doctest and makefiles
biopython, doctest and makefiles
 
Uncovering Library Features from API Usage on Stack Overflow
Uncovering Library Features from API Usage on Stack OverflowUncovering Library Features from API Usage on Stack Overflow
Uncovering Library Features from API Usage on Stack Overflow
 
Bioinformatica p6-bioperl
Bioinformatica p6-bioperlBioinformatica p6-bioperl
Bioinformatica p6-bioperl
 
Bioinformatics p5-bioperl v2013-wim_vancriekinge
Bioinformatics p5-bioperl v2013-wim_vancriekingeBioinformatics p5-bioperl v2013-wim_vancriekinge
Bioinformatics p5-bioperl v2013-wim_vancriekinge
 
Representing and Reasoning with Modular Ontologies
Representing and Reasoning with Modular OntologiesRepresenting and Reasoning with Modular Ontologies
Representing and Reasoning with Modular Ontologies
 
Phylogenetics Analysis in R
Phylogenetics Analysis in RPhylogenetics Analysis in R
Phylogenetics Analysis in R
 
Querying XML: XPath and XQuery
Querying XML: XPath and XQueryQuerying XML: XPath and XQuery
Querying XML: XPath and XQuery
 
Semi-automated Exploration and Extraction of Data in Scientific Tables
Semi-automated Exploration and Extraction of Data in Scientific TablesSemi-automated Exploration and Extraction of Data in Scientific Tables
Semi-automated Exploration and Extraction of Data in Scientific Tables
 
philogenetic tree
philogenetic treephilogenetic tree
philogenetic tree
 
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docx
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docxCS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docx
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docx
 
i18n and L10n in TYPO3 Flow
i18n and L10n in TYPO3 Flowi18n and L10n in TYPO3 Flow
i18n and L10n in TYPO3 Flow
 
PhyloTastic: names-based phyloinformatic data integration
PhyloTastic: names-based phyloinformatic data integrationPhyloTastic: names-based phyloinformatic data integration
PhyloTastic: names-based phyloinformatic data integration
 
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
 
Plays Well with Others, or What I’ve learned as a data provider in an intero...
Plays Well with Others, or What I’ve learned as a data provider in an intero...Plays Well with Others, or What I’ve learned as a data provider in an intero...
Plays Well with Others, or What I’ve learned as a data provider in an intero...
 
These questions will be a bit advanced level 2
These questions will be a bit advanced level 2These questions will be a bit advanced level 2
These questions will be a bit advanced level 2
 
Perl%20SYLLABUS%20PB
Perl%20SYLLABUS%20PBPerl%20SYLLABUS%20PB
Perl%20SYLLABUS%20PB
 

Plus de BOSC 2010

Langmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomicsLangmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomicsBOSC 2010
 
Schultheiss bosc2010 persistance-web-services
Schultheiss bosc2010 persistance-web-servicesSchultheiss bosc2010 persistance-web-services
Schultheiss bosc2010 persistance-web-servicesBOSC 2010
 
Swertz bosc2010 molgenis
Swertz bosc2010 molgenisSwertz bosc2010 molgenis
Swertz bosc2010 molgenisBOSC 2010
 
Rice bosc2010 emboss
Rice bosc2010 embossRice bosc2010 emboss
Rice bosc2010 embossBOSC 2010
 
Morris bosc2010 evoker
Morris bosc2010 evokerMorris bosc2010 evoker
Morris bosc2010 evokerBOSC 2010
 
Kono bosc2010 pathway_projector
Kono bosc2010 pathway_projectorKono bosc2010 pathway_projector
Kono bosc2010 pathway_projectorBOSC 2010
 
Kanterakis bosc2010 molgenis
Kanterakis bosc2010 molgenisKanterakis bosc2010 molgenis
Kanterakis bosc2010 molgenisBOSC 2010
 
Gautier bosc2010 pythonbioconductor
Gautier bosc2010 pythonbioconductorGautier bosc2010 pythonbioconductor
Gautier bosc2010 pythonbioconductorBOSC 2010
 
Gardler bosc2010 community_developmentattheasf
Gardler bosc2010 community_developmentattheasfGardler bosc2010 community_developmentattheasf
Gardler bosc2010 community_developmentattheasfBOSC 2010
 
Friedberg bosc2010 iprstats
Friedberg bosc2010 iprstatsFriedberg bosc2010 iprstats
Friedberg bosc2010 iprstatsBOSC 2010
 
Fields bosc2010 bio_perl
Fields bosc2010 bio_perlFields bosc2010 bio_perl
Fields bosc2010 bio_perlBOSC 2010
 
Chapman bosc2010 biopython
Chapman bosc2010 biopythonChapman bosc2010 biopython
Chapman bosc2010 biopythonBOSC 2010
 
Bonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_rubyBonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_rubyBOSC 2010
 
Puton bosc2010 bio_python-modules-rna
Puton bosc2010 bio_python-modules-rnaPuton bosc2010 bio_python-modules-rna
Puton bosc2010 bio_python-modules-rnaBOSC 2010
 
Bader bosc2010 cytoweb
Bader bosc2010 cytowebBader bosc2010 cytoweb
Bader bosc2010 cytowebBOSC 2010
 
Zmasek bosc2010 aptx
Zmasek bosc2010 aptxZmasek bosc2010 aptx
Zmasek bosc2010 aptxBOSC 2010
 
Wilkinson bosc2010 moby-to-sadi
Wilkinson bosc2010 moby-to-sadiWilkinson bosc2010 moby-to-sadi
Wilkinson bosc2010 moby-to-sadiBOSC 2010
 
Venkatesan bosc2010 onto-toolkit
Venkatesan bosc2010 onto-toolkitVenkatesan bosc2010 onto-toolkit
Venkatesan bosc2010 onto-toolkitBOSC 2010
 
Taylor bosc2010
Taylor bosc2010Taylor bosc2010
Taylor bosc2010BOSC 2010
 
Robinson bosc2010 bio_hdf
Robinson bosc2010 bio_hdfRobinson bosc2010 bio_hdf
Robinson bosc2010 bio_hdfBOSC 2010
 

Plus de BOSC 2010 (20)

Langmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomicsLangmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomics
 
Schultheiss bosc2010 persistance-web-services
Schultheiss bosc2010 persistance-web-servicesSchultheiss bosc2010 persistance-web-services
Schultheiss bosc2010 persistance-web-services
 
Swertz bosc2010 molgenis
Swertz bosc2010 molgenisSwertz bosc2010 molgenis
Swertz bosc2010 molgenis
 
Rice bosc2010 emboss
Rice bosc2010 embossRice bosc2010 emboss
Rice bosc2010 emboss
 
Morris bosc2010 evoker
Morris bosc2010 evokerMorris bosc2010 evoker
Morris bosc2010 evoker
 
Kono bosc2010 pathway_projector
Kono bosc2010 pathway_projectorKono bosc2010 pathway_projector
Kono bosc2010 pathway_projector
 
Kanterakis bosc2010 molgenis
Kanterakis bosc2010 molgenisKanterakis bosc2010 molgenis
Kanterakis bosc2010 molgenis
 
Gautier bosc2010 pythonbioconductor
Gautier bosc2010 pythonbioconductorGautier bosc2010 pythonbioconductor
Gautier bosc2010 pythonbioconductor
 
Gardler bosc2010 community_developmentattheasf
Gardler bosc2010 community_developmentattheasfGardler bosc2010 community_developmentattheasf
Gardler bosc2010 community_developmentattheasf
 
Friedberg bosc2010 iprstats
Friedberg bosc2010 iprstatsFriedberg bosc2010 iprstats
Friedberg bosc2010 iprstats
 
Fields bosc2010 bio_perl
Fields bosc2010 bio_perlFields bosc2010 bio_perl
Fields bosc2010 bio_perl
 
Chapman bosc2010 biopython
Chapman bosc2010 biopythonChapman bosc2010 biopython
Chapman bosc2010 biopython
 
Bonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_rubyBonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_ruby
 
Puton bosc2010 bio_python-modules-rna
Puton bosc2010 bio_python-modules-rnaPuton bosc2010 bio_python-modules-rna
Puton bosc2010 bio_python-modules-rna
 
Bader bosc2010 cytoweb
Bader bosc2010 cytowebBader bosc2010 cytoweb
Bader bosc2010 cytoweb
 
Zmasek bosc2010 aptx
Zmasek bosc2010 aptxZmasek bosc2010 aptx
Zmasek bosc2010 aptx
 
Wilkinson bosc2010 moby-to-sadi
Wilkinson bosc2010 moby-to-sadiWilkinson bosc2010 moby-to-sadi
Wilkinson bosc2010 moby-to-sadi
 
Venkatesan bosc2010 onto-toolkit
Venkatesan bosc2010 onto-toolkitVenkatesan bosc2010 onto-toolkit
Venkatesan bosc2010 onto-toolkit
 
Taylor bosc2010
Taylor bosc2010Taylor bosc2010
Taylor bosc2010
 
Robinson bosc2010 bio_hdf
Robinson bosc2010 bio_hdfRobinson bosc2010 bio_hdf
Robinson bosc2010 bio_hdf
 

Dernier

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 

Dernier (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Talevich bosc2010 bio-phylo

  • 1. Bio.Phylo A unified phylogenetics toolkit for Biopython Eric Talevich Institute of Bioinformatics University of Georgia June 29, 2010
  • 2. Abstract Bio.Phylo is a new phylogenetics library for: • Exploring, modifying and annotating trees • Reading & writing standard file formats • Quick visualization • Gluing together computational pipelines Availability: Biopython 1.54
  • 3. A quick survey of file formats Newick (a.k.a. New Hampshire) is a simple nested-parens format: (A, (B, C), (D, E)) • Extended & tweaked, led to NHX (and parsing problems) Nexus is a collection of formats, including Newick trees • More than just tree data. . . still tough to parse PhyloXML is an XML-based replacement for NHX • Annotations formalized as XML elements; extensible with user-defined element types NeXML is an XML-based successor to Nexus • Ontology-based — key-value assignments have semantic meaning
  • 4. Demo: What’s in a tree? 1. Read a simple Newick file 4. Promote to a PhyloXML tree 2. Inspect through IPython 5. Set branch colors 3. Draw with 6. Write a PhyloXML file PyLab/matplotlib
  • 5. # In a terminal, make a simple Newick file # Then launch the IPython interpreter and read the file % cat > simple.dnd <<EOF > (((A,B),(C,D)),(E,F,G)) > EOF % ipython -pylab >>> from Bio import Phylo >>> tree = Phylo.read(’simple.dnd’, ’newick’)
  • 6. # String representation shows the object structure >>> print tree Tree(weight=1.0, rooted=False, name=’’) Clade(branch_length=1.0) Clade(branch_length=1.0) Clade(branch_length=1.0) Clade(branch_length=1.0, name=’A’) Clade(branch_length=1.0, name=’B’) Clade(branch_length=1.0) Clade(branch_length=1.0, name=’C’) Clade(branch_length=1.0, name=’D’) Clade(branch_length=1.0) Clade(branch_length=1.0, name=’E’) Clade(branch_length=1.0, name=’F’) Clade(branch_length=1.0, name=’G’)
  • 7. # Draw an ASCII-art dendrogram >>> Phylo.draw_ascii(tree, column_width=52) ______________ A ______________| | |______________ B ______________| | | ______________ C | |______________| _| |______________ D | | ______________ E | | |______________|______________ F | |______________ G
  • 8. >>> tree.rooted = True >>> Phylo.draw graphiz(tree) D A C B G E F
  • 9. # Promote a basic tree to PhyloXML >>> from Bio.Phylo.PhyloXML import Phylogeny >>> phy = Phylogeny.from_tree(tree) >>> print phy Phylogeny(rooted=True, name=’’) Clade(branch_length=1.0) Clade(branch_length=1.0) Clade(branch_length=1.0) Clade(branch_length=1.0, name=’A’) Clade(branch_length=1.0, name=’B’) Clade(branch_length=1.0) Clade(branch_length=1.0, name=’C’) Clade(branch_length=1.0, name=’D’) Clade(branch_length=1.0) Clade(branch_length=1.0, name=’E’) Clade(branch_length=1.0, name=’F’) Clade(branch_length=1.0, name=’G’)
  • 10. Branch color >>> phy.root.color = (128, 128, 128) Or: >>> phy.root.color = ’#808080’ Or: >>> phy.root.color = ’gray’ Find clades by attribute values: >>> mrca = phy.common ancestor({’name’:’E’}, {’name’:’F’}) >>> mrca.color = ’salmon’ Directly index a clade: >>> phy.clade[0,1].color = ’blue’ >>> Phylo.draw graphviz(phy, prog=’neato’)
  • 11. D B C A G F E
  • 12. # Save the color annotations in phyloXML >>> Phylo.write(phy, ’simple-color.xml’, ’phyloxml’) <phy:phyloxml xmlns:phy="http://www.phyloxml.org"> <phylogeny rooted="true"> <clade> <branch_length>1.0</branch_length> <color> <red>128</red> <green>128</green> <blue>128</blue> </color> <clade> <branch_length>1.0</branch_length> <clade> <branch_length>1.0</branch_length> <clade> <name>A</name> ...
  • 13. Thanks Holla: • Brad Chapman and Christian Zmasek, GSoC 2009 mentors • The Biopython developers, feat. Peter J. A. Cock, Frank Kauff & Cymon J. Cox • Hilmar Lapp & the NESCent Phyloinformatics program • Google’s Open Source Programs Office • My professor, Dr. Natarajan Kannan • Developers like you
  • 14. Q&A • Which 3rd-party applications should we wrap in Bio.Phylo.Applications? (e.g. RAxML, MrBayes) • Which other libraries should we support interoperability with? (PyCogent, ape) • What other algorithms are simple, stable and relevant? (Consensus, rooting) • Features for systematics? (Geography, PopGen integration?)
  • 15. Extra: Tree methods >>> dir(tree) collapse get terminals collapse all is bifurcating common ancestor is monophyletic count terminals is parent of depths is preterminal distance ladderize find any prune find clades split find elements total branch length get nonterminals trace get path See: http://biopython.org/DIST/docs/api/Bio.Phylo. BaseTree.TreeMixin-class.html
  • 16. Extra: The Bio.Phylo class hierarchy Figure: Inheritance relationship among the core classes
  • 17. Extra: PhyloXML classes $ pydoc Bio.Phylo.PhyloXML Accession Date Point Alphabet Distribution Polygon Annotation DomainArchitecture Property BaseTree Events ProteinDomain BinaryCharacters Id Reference BranchColor MolSeq Sequence Clade Other SequenceRelation CladeRelation Phylogeny Taxonomy Confidence Phyloxml Uri See: http://biopython.org/wiki/PhyloXML