SlideShare une entreprise Scribd logo
1  sur  28
Télécharger pour lire hors ligne
Biopython Project
Update 2013
Peter Cock & the Biopython Developers,
BOSC 2013, Berlin, Germany
Twitter: @pjacock & @biopython
Introduction
2
My Employer
 After PhD joined Scottish Crop Research Institute
 In 2011, SCRI (Dundee) & MLURI (Aberdeen) merged
as The James Hutton Institute
Government funded research institute
I work mainly on the genomics of Plant Pathogens
I use Biopython in my day to day work
More about this in tomorrow’s panel discussion,
“Strategies for Funding and Maintaining Open Source
Software”
3
Biopython
 Open source bioinformatics library for Python
 Sister project to:
 BioPerl
 BioRuby
 BioJava
 EMBOSS
 etc (see OBF Project BOF meeting tonight)
 Long running!
4
Brief History of Biopython
 1999 - Started by Andrew Dalke & Jeff Chang
 2000 - First release, announcement publication
 Chapman & Chang (2000). ACM SIGBIO Newsletter 20, 15-19
 2001 - Biopython 1.00
 2009 - Application note publication
 Cock et al. (2009) DOI:10.1093/bioinformatics/btp163
 2011 - Biopython 1.57 and 1.58
 2012 - Biopython 1.59 and 1.60
 2013 - Biopython 1.61 and 1.62 beta
5
Recap from last BOSC 2012
 Eric Talevich presented in Boston
 Biopython 1.58, 1.59 and 1.60
 Visualization enhancements for chromosome and
genome diagrams, and phylogenetic trees
 More file format parsers
 BGZF compression
 Google Summer of Code students ...
 Bio.Phylo paper submitted and in review ...
 Biopython working nicely under PyPy 1.9 ...
6
Publications
7
Bio.Phylo paper published
 Talevich et al (2012) DOI:10.1186/1471-2105-13-209
8
Talevich et al. BMC Bioinformatics 2012, 13:209
http://www.biomedcentral.com/1471-2105/13/209
SOFTWARE Open Access
Bio.Phylo: A unified toolkit for processing,
analyzing and visualizing phylogenetic
trees in Biopython
Eric Talevich1*, Brandon M Invergo2, Peter JA Cock3 and Brad A Chapman4
Abstract
Background: Ongoing innovation in phylogenetics and evolutionary biology has been accompanied by a
proliferation of software tools, data formats, analytical techniques and web servers. This brings with it the challenge of
integrating phylogenetic and other related biological data found in a wide variety of formats, and underlines the need
for reusable software that can read, manipulate and transform this information into the various forms required to
build computational pipelines.
Results: We built a Python software library for working with phylogenetic data that is tightly integrated with
Biopython, a broad-ranging toolkit for computational biology. Our library, Bio.Phylo, is highly interoperable with
existing libraries, tools and standards, and is capable of parsing common file formats for phylogenetic trees,
performing basic transformations and manipulations, attaching rich annotations, and visualizing trees. We unified the
modules for working with the standard file formats Newick, NEXUS and phyloXML behind a consistent and simple API,
providing a common set of functionality independent of the data source.
Conclusions: Bio.Phylo meets a growing need in bioinformatics for working with heterogeneous types of
phylogenetic data. By supporting interoperability with multiple file formats and leveraging existing Biopython
features, this library simplifies the construction of phylogenetic workflows. We also provide examples of the benefits
of building a community around a shared open-source project. Bio.Phylo is included with Biopython, available
through the Biopython website, http://biopython.org.
Google Summer of Code (GSoC)
9
Google Summer of Code 2012
 Two students under OBF
 Lenna Peterson
 Genomic Variant Toolkit for Biopython
 Mentors: Brad Chapman & James Casbon
 http://arklenna.tumblr.com/tagged/gsoc2012
 Wibowo Arindrarto
 Bio.SearchIO - pairwise sequence search files input/
output, e.g. BLAST, HMMER
 Mentor: Peter Cock
 http://bow.web.id/blog/tag/gsoc/
 Both completed their projects & still contributing
10
Google Summer of Code 2013
 Two students under NESCent
 Zheng Ruan
 Codon alignment and analysis
 Mentors: Eric Talevich & Peter Cock
 http://zr1991.blogspot.de/
 Yanbo Ye
 Bio.Phylo: filling in the gaps
 Mentor: Eric Talevich
 http://blog.yeyanbo.com/tag/gsoc.html
11
Releases since BOSC 2012
12
Biopython 1.61
 Major refresh of sequence motif handling code
 Bio.SearchIO GSoC work as an experimental module
 Contributors:
 Brandon Invergo
 Bryan Lunt (*)
 Christian Brueffer (*)
 David Cain
 Eric Talevich
 Grace Yeo (*)
 Jeffrey Chang
 Jingping Li (*)
 Kai Blin (*) 13
 Leighton Pritchard
 Lenna Peterson
 Lucas Sinclair (*)
 Michiel de Hoon
 Nick Semenkovich (*)
 Peter Cock
 Robert Ernst (*)
 Tiago Antao
 Wibowo 'Bow' Arindrarto
Biopython 1.62
 Beta released
 Final release after BOSC/ISMB/ECCB
 Warning on translating partial codons
 Explicit is better than implicit!
 Parsers for GAF, GPA and GPI from UniProt-GOA
 Reworked feature location object model
 Cleaner handling of multi-region locations
 Linked to GTF/GFF3 parsing and other plans
 Official Python 3 support ...
14
Please test this beta!
Biopython 1.62
 Contributors (as of the beta release):
15
Alexander Campbell (*)
Andrea Rizzi (*)
Anthony Mathelier (*)
Ben Morris (*)
Brad Chapman
Christian Brueffer
David Arenillas (*)
David Martin (*)
Eric Talevich
Iddo Friedberg
Jian-Long Huang (*)
Joao Rodrigues
Kai Blin
Michiel de Hoon
Nate Sutton (*)
Peter Cock
Petra Kubincová (*)
Phillip Garland
Saket Choudhary (*)
Tiago Antao
Wibowo 'Bow' Arindrarto
Xabier Bello (*)
Python 3
16
Python 2 and 3
 Python 2.7 is the final release of Python 2
 Python 3 is similar but different to Python 2
 Most Python 2 code needs updating to run
 Big difference is Python 3 uses unicode for strings
 We need to be explicit about bytes vs unicode in many
of our parsers
 Text file IO defaults to unicode, which is slower
 Most Python libraries are gradually being updated
17
Python 3 strategy
 We’ve been testing under Python 3 for over a year
 Biopython 1.62 will officially support Python 3.3
 Current strategy:
 Develop under Python 2
 Installation under Python 3 uses 2to3 converter
 Test under Python 3
18
Cross Platform Testing
 Nightly tests with BuildBot
 Tests run on volunteer machines
 Covers multiple OS and Python combinations
 More volunteer machines welcome,
especially 64 bit Windows
 Server runs on OBF funded Amazon server
 Continuous integration with TravisCI
 Covers a range of languages using VMs
 Free service for Open Source projects
 Runs tests when code on GitHub updated
 Runs tests for GitHub pull requests (Nice!)
19
Cross Platform Testing
20
Python/OS
Linux
32 bit
Linux
64 bit
Mac OS X
64 bit
Windows
32 bit
C Python 2.5
C Python 2.6
C Python 2.7
C Python 3.1
C Python 3.2
C Python 3.3
PyPy 1.9
PyPy 2.0
Jython 2.5
Jython 2.7b
BB + Travis BB BB BB
BB + Travis BB BB BB
BB + Travis BB BB
BB BB BB BB
BB BB BB BB
BB + Travis BB BB
Travis BB BB
BB
BB BB
BB BB
This test
matrix is
quite big!
Cross Platform Testing
21
Python/OS
Linux
32 bit
Linux
64 bit
Mac OS X
64 bit
Windows
32 bit
C Python 2.5
C Python 2.6
C Python 2.7
C Python 3.1
C Python 3.2
C Python 3.3
PyPy 1.9
PyPy 2.0
Jython 2.5
Jython 2.7b
BB + Travis BB BB BB
BB + Travis BB BB BB
BB + Travis BB BB
BB BB BB BB
BB BB BB BB
BB + Travis BB BB
Travis BB BB
BB
BB BB
BB BB
Dropping
Python 2.5
support
Also
means
Jython 2.5
Cross Platform Testing
22
Python/OS
Linux
32 bit
Linux
64 bit
Mac OS X
64 bit
Windows
32 bit
C Python 2.5
C Python 2.6
C Python 2.7
C Python 3.1
C Python 3.2
C Python 3.3
PyPy 1.9
PyPy 2.0
Jython 2.5
Jython 2.7b
BB + Travis BB BB BB
BB + Travis BB BB BB
BB + Travis BB BB
BB BB BB BB
BB BB BB BB
BB + Travis BB BB
Travis BB BB
BB
BB BB
BB BB
Have been
useful in
Python 3
testing,
but won’t
support
Cross Platform Testing Plan
 Target Python 2.6, 2.7 and 3.3 (or later)
 Volunteer machines needed, especially 64 bit Windows
23
Python/OS
Linux
32 bit
Linux
64 bit
Mac OS X
64 bit
Windows
32 bit
Windows
64 bit
C Python 2.6
C Python 2.7
C Python 3.3
PyPy 1.9
PyPy 2.0
PyPy 2.1b
Jython 2.7b
BB + Travis BB BB BB
BB + Travis BB BB BB
BB + Travis BB BB
Travis BB BB
BB
BB BB
Python
2.7
variants
Python 3 strategy
 Current strategy:
 Develop under Python 2
 Installation under Python 3 uses 2to3 converter
 Test under Python 3
 Future strategy:
 Target Python 2.6, 2.7 and 3.3 (or later)
 Start writing code which works on both at same time
 Continue to use 2to3 on a case-by-case basis
during transition period, or for problem cases
24
Closing Remarks
25
Stability versus Flexibility
 We aim for rigorous cross-platform testing
 We value backwards compatibility in core code
 Flexibility through modularity?
 BioRuby’s Gems http://gems.bioruby.org
 BioPerl sub-packages on CPAN
 Can/should we move towards this with PyPI?
 Would this encourage more contributors?
 We’re trying ‘beta level’ experimental modules within
the monolithic Biopython distribution, e.g. SearchIO
26
Biopython Support: Resources
 OBF hosted website, mailing lists, bug tracker, etc
 GitHub hosted repository
 TravisCI hosted continuous integration testing
 Personal and institute BuildSlave machines for testing
27
Thank you all!
Biopython Support: People
 Google supports summer students via GSoC
 Some of the developers do contribute on work time
 However, Biopython is mainly volunteer funded
 Please help out, e.g.
 Feedback
 Bug reports
 Documentation improvements
 Review code
 More unit tests
 Enhancements or new code
 ...
28
Thank you all!

Contenu connexe

Tendances

Mixed-language Python/C++ debugging with Python Tools for Visual Studio- Pave...
Mixed-language Python/C++ debugging with Python Tools for Visual Studio- Pave...Mixed-language Python/C++ debugging with Python Tools for Visual Studio- Pave...
Mixed-language Python/C++ debugging with Python Tools for Visual Studio- Pave...PyData
 
Python programming | Fundamentals of Python programming
Python programming | Fundamentals of Python programming Python programming | Fundamentals of Python programming
Python programming | Fundamentals of Python programming KrishnaMildain
 
Prins Bio Lib Bosc 2009
Prins Bio Lib Bosc 2009Prins Bio Lib Bosc 2009
Prins Bio Lib Bosc 2009bosc
 
LingPy : A Python Library for Historical Linguistics
LingPy : A Python Library for Historical LinguisticsLingPy : A Python Library for Historical Linguistics
LingPy : A Python Library for Historical LinguisticsDr. Amit Kumar Jha
 
SWIG Hello World
SWIG Hello WorldSWIG Hello World
SWIG Hello Worlde8xu
 
Python 3.5: An agile, general-purpose development language.
Python 3.5: An agile, general-purpose development language.Python 3.5: An agile, general-purpose development language.
Python 3.5: An agile, general-purpose development language.Carlos Miguel Ferreira
 
A commercial open source project in Python
A commercial open source project in PythonA commercial open source project in Python
A commercial open source project in Pythonjbrendel
 
Python 101 for the .NET Developer
Python 101 for the .NET DeveloperPython 101 for the .NET Developer
Python 101 for the .NET DeveloperSarah Dutkiewicz
 
Introduction To Python | Edureka
Introduction To Python | EdurekaIntroduction To Python | Edureka
Introduction To Python | EdurekaEdureka!
 

Tendances (14)

Mixed-language Python/C++ debugging with Python Tools for Visual Studio- Pave...
Mixed-language Python/C++ debugging with Python Tools for Visual Studio- Pave...Mixed-language Python/C++ debugging with Python Tools for Visual Studio- Pave...
Mixed-language Python/C++ debugging with Python Tools for Visual Studio- Pave...
 
Python programming | Fundamentals of Python programming
Python programming | Fundamentals of Python programming Python programming | Fundamentals of Python programming
Python programming | Fundamentals of Python programming
 
Prins Bio Lib Bosc 2009
Prins Bio Lib Bosc 2009Prins Bio Lib Bosc 2009
Prins Bio Lib Bosc 2009
 
What is Python?
What is Python?What is Python?
What is Python?
 
Python Introduction
Python IntroductionPython Introduction
Python Introduction
 
LingPy : A Python Library for Historical Linguistics
LingPy : A Python Library for Historical LinguisticsLingPy : A Python Library for Historical Linguistics
LingPy : A Python Library for Historical Linguistics
 
SWIG Hello World
SWIG Hello WorldSWIG Hello World
SWIG Hello World
 
Python
PythonPython
Python
 
Python 3.5: An agile, general-purpose development language.
Python 3.5: An agile, general-purpose development language.Python 3.5: An agile, general-purpose development language.
Python 3.5: An agile, general-purpose development language.
 
A commercial open source project in Python
A commercial open source project in PythonA commercial open source project in Python
A commercial open source project in Python
 
Python 101 for the .NET Developer
Python 101 for the .NET DeveloperPython 101 for the .NET Developer
Python 101 for the .NET Developer
 
Python Introduction
Python IntroductionPython Introduction
Python Introduction
 
Introduction To Python | Edureka
Introduction To Python | EdurekaIntroduction To Python | Edureka
Introduction To Python | Edureka
 
Python
Python Python
Python
 

Similaire à Biopython Project Update 2013

BioPerl (Poster T02, ISMB 2010)
BioPerl (Poster T02, ISMB 2010)BioPerl (Poster T02, ISMB 2010)
BioPerl (Poster T02, ISMB 2010)Mark Jensen
 
BioPerl (Poster T02, ISMB 2010)
BioPerl (Poster T02, ISMB 2010)BioPerl (Poster T02, ISMB 2010)
BioPerl (Poster T02, ISMB 2010)Mark Jensen
 
Biopython Project Update (BOSC 2012)
Biopython Project Update (BOSC 2012)Biopython Project Update (BOSC 2012)
Biopython Project Update (BOSC 2012)Eric Talevich
 
Python 101 For The Net Developer
Python 101 For The Net DeveloperPython 101 For The Net Developer
Python 101 For The Net DeveloperSarah Dutkiewicz
 
Migrating python.org to buildbot 9 and python 3
Migrating python.org to buildbot 9 and python 3Migrating python.org to buildbot 9 and python 3
Migrating python.org to buildbot 9 and python 3Craig Rodrigues
 
Python Training in Pune - Ethans Tech Pune
Python Training in Pune - Ethans Tech PunePython Training in Pune - Ethans Tech Pune
Python Training in Pune - Ethans Tech PuneEthan's Tech
 
Bio-UnaGrid: Easing bioinformatics workflow execution
Bio-UnaGrid: Easing bioinformatics workflow executionBio-UnaGrid: Easing bioinformatics workflow execution
Bio-UnaGrid: Easing bioinformatics workflow executionMario Jose Villamizar Cano
 
BOSC 2008 Biopython
BOSC 2008 BiopythonBOSC 2008 Biopython
BOSC 2008 Biopythontiago
 
The Onward Journey: Porting Twisted to Python 3
The Onward Journey: Porting Twisted to Python 3The Onward Journey: Porting Twisted to Python 3
The Onward Journey: Porting Twisted to Python 3Craig Rodrigues
 
Biopython at BOSC 2010
Biopython at BOSC 2010Biopython at BOSC 2010
Biopython at BOSC 2010Brad Chapman
 
Chapman bosc2010 biopython
Chapman bosc2010 biopythonChapman bosc2010 biopython
Chapman bosc2010 biopythonBOSC 2010
 
Australian Bioinformatics Conference (ABiC) 2014 Talk - Doing bioinformatics ...
Australian Bioinformatics Conference (ABiC) 2014 Talk - Doing bioinformatics ...Australian Bioinformatics Conference (ABiC) 2014 Talk - Doing bioinformatics ...
Australian Bioinformatics Conference (ABiC) 2014 Talk - Doing bioinformatics ...The University of Queensland
 
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)IPython: A Modern Vision of Interactive Computing (PyData SV 2013)
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)PyData
 
Biopython
BiopythonBiopython
Biopythonbosc
 

Similaire à Biopython Project Update 2013 (20)

BioPerl (Poster T02, ISMB 2010)
BioPerl (Poster T02, ISMB 2010)BioPerl (Poster T02, ISMB 2010)
BioPerl (Poster T02, ISMB 2010)
 
BioPerl (Poster T02, ISMB 2010)
BioPerl (Poster T02, ISMB 2010)BioPerl (Poster T02, ISMB 2010)
BioPerl (Poster T02, ISMB 2010)
 
Biopython Project Update (BOSC 2012)
Biopython Project Update (BOSC 2012)Biopython Project Update (BOSC 2012)
Biopython Project Update (BOSC 2012)
 
Python 101 For The Net Developer
Python 101 For The Net DeveloperPython 101 For The Net Developer
Python 101 For The Net Developer
 
Migrating python.org to buildbot 9 and python 3
Migrating python.org to buildbot 9 and python 3Migrating python.org to buildbot 9 and python 3
Migrating python.org to buildbot 9 and python 3
 
openBIO
openBIOopenBIO
openBIO
 
Python Training in Pune - Ethans Tech Pune
Python Training in Pune - Ethans Tech PunePython Training in Pune - Ethans Tech Pune
Python Training in Pune - Ethans Tech Pune
 
Bio-UnaGrid: Easing bioinformatics workflow execution
Bio-UnaGrid: Easing bioinformatics workflow executionBio-UnaGrid: Easing bioinformatics workflow execution
Bio-UnaGrid: Easing bioinformatics workflow execution
 
BOSC 2008 Biopython
BOSC 2008 BiopythonBOSC 2008 Biopython
BOSC 2008 Biopython
 
The Onward Journey: Porting Twisted to Python 3
The Onward Journey: Porting Twisted to Python 3The Onward Journey: Porting Twisted to Python 3
The Onward Journey: Porting Twisted to Python 3
 
Biopython at BOSC 2010
Biopython at BOSC 2010Biopython at BOSC 2010
Biopython at BOSC 2010
 
Chapman bosc2010 biopython
Chapman bosc2010 biopythonChapman bosc2010 biopython
Chapman bosc2010 biopython
 
Australian Bioinformatics Conference (ABiC) 2014 Talk - Doing bioinformatics ...
Australian Bioinformatics Conference (ABiC) 2014 Talk - Doing bioinformatics ...Australian Bioinformatics Conference (ABiC) 2014 Talk - Doing bioinformatics ...
Australian Bioinformatics Conference (ABiC) 2014 Talk - Doing bioinformatics ...
 
Pybind11 - SciPy 2021
Pybind11 - SciPy 2021Pybind11 - SciPy 2021
Pybind11 - SciPy 2021
 
Python for beginners
Python for beginnersPython for beginners
Python for beginners
 
20101026 ASAP Seminar
20101026 ASAP Seminar20101026 ASAP Seminar
20101026 ASAP Seminar
 
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)IPython: A Modern Vision of Interactive Computing (PyData SV 2013)
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)
 
Biopython
BiopythonBiopython
Biopython
 
Python lecture 01
Python lecture 01Python lecture 01
Python lecture 01
 
python unit2.pptx
python unit2.pptxpython unit2.pptx
python unit2.pptx
 

Dernier

Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Dernier (20)

Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Biopython Project Update 2013

  • 1. Biopython Project Update 2013 Peter Cock & the Biopython Developers, BOSC 2013, Berlin, Germany Twitter: @pjacock & @biopython
  • 3. My Employer  After PhD joined Scottish Crop Research Institute  In 2011, SCRI (Dundee) & MLURI (Aberdeen) merged as The James Hutton Institute Government funded research institute I work mainly on the genomics of Plant Pathogens I use Biopython in my day to day work More about this in tomorrow’s panel discussion, “Strategies for Funding and Maintaining Open Source Software” 3
  • 4. Biopython  Open source bioinformatics library for Python  Sister project to:  BioPerl  BioRuby  BioJava  EMBOSS  etc (see OBF Project BOF meeting tonight)  Long running! 4
  • 5. Brief History of Biopython  1999 - Started by Andrew Dalke & Jeff Chang  2000 - First release, announcement publication  Chapman & Chang (2000). ACM SIGBIO Newsletter 20, 15-19  2001 - Biopython 1.00  2009 - Application note publication  Cock et al. (2009) DOI:10.1093/bioinformatics/btp163  2011 - Biopython 1.57 and 1.58  2012 - Biopython 1.59 and 1.60  2013 - Biopython 1.61 and 1.62 beta 5
  • 6. Recap from last BOSC 2012  Eric Talevich presented in Boston  Biopython 1.58, 1.59 and 1.60  Visualization enhancements for chromosome and genome diagrams, and phylogenetic trees  More file format parsers  BGZF compression  Google Summer of Code students ...  Bio.Phylo paper submitted and in review ...  Biopython working nicely under PyPy 1.9 ... 6
  • 8. Bio.Phylo paper published  Talevich et al (2012) DOI:10.1186/1471-2105-13-209 8 Talevich et al. BMC Bioinformatics 2012, 13:209 http://www.biomedcentral.com/1471-2105/13/209 SOFTWARE Open Access Bio.Phylo: A unified toolkit for processing, analyzing and visualizing phylogenetic trees in Biopython Eric Talevich1*, Brandon M Invergo2, Peter JA Cock3 and Brad A Chapman4 Abstract Background: Ongoing innovation in phylogenetics and evolutionary biology has been accompanied by a proliferation of software tools, data formats, analytical techniques and web servers. This brings with it the challenge of integrating phylogenetic and other related biological data found in a wide variety of formats, and underlines the need for reusable software that can read, manipulate and transform this information into the various forms required to build computational pipelines. Results: We built a Python software library for working with phylogenetic data that is tightly integrated with Biopython, a broad-ranging toolkit for computational biology. Our library, Bio.Phylo, is highly interoperable with existing libraries, tools and standards, and is capable of parsing common file formats for phylogenetic trees, performing basic transformations and manipulations, attaching rich annotations, and visualizing trees. We unified the modules for working with the standard file formats Newick, NEXUS and phyloXML behind a consistent and simple API, providing a common set of functionality independent of the data source. Conclusions: Bio.Phylo meets a growing need in bioinformatics for working with heterogeneous types of phylogenetic data. By supporting interoperability with multiple file formats and leveraging existing Biopython features, this library simplifies the construction of phylogenetic workflows. We also provide examples of the benefits of building a community around a shared open-source project. Bio.Phylo is included with Biopython, available through the Biopython website, http://biopython.org.
  • 9. Google Summer of Code (GSoC) 9
  • 10. Google Summer of Code 2012  Two students under OBF  Lenna Peterson  Genomic Variant Toolkit for Biopython  Mentors: Brad Chapman & James Casbon  http://arklenna.tumblr.com/tagged/gsoc2012  Wibowo Arindrarto  Bio.SearchIO - pairwise sequence search files input/ output, e.g. BLAST, HMMER  Mentor: Peter Cock  http://bow.web.id/blog/tag/gsoc/  Both completed their projects & still contributing 10
  • 11. Google Summer of Code 2013  Two students under NESCent  Zheng Ruan  Codon alignment and analysis  Mentors: Eric Talevich & Peter Cock  http://zr1991.blogspot.de/  Yanbo Ye  Bio.Phylo: filling in the gaps  Mentor: Eric Talevich  http://blog.yeyanbo.com/tag/gsoc.html 11
  • 13. Biopython 1.61  Major refresh of sequence motif handling code  Bio.SearchIO GSoC work as an experimental module  Contributors:  Brandon Invergo  Bryan Lunt (*)  Christian Brueffer (*)  David Cain  Eric Talevich  Grace Yeo (*)  Jeffrey Chang  Jingping Li (*)  Kai Blin (*) 13  Leighton Pritchard  Lenna Peterson  Lucas Sinclair (*)  Michiel de Hoon  Nick Semenkovich (*)  Peter Cock  Robert Ernst (*)  Tiago Antao  Wibowo 'Bow' Arindrarto
  • 14. Biopython 1.62  Beta released  Final release after BOSC/ISMB/ECCB  Warning on translating partial codons  Explicit is better than implicit!  Parsers for GAF, GPA and GPI from UniProt-GOA  Reworked feature location object model  Cleaner handling of multi-region locations  Linked to GTF/GFF3 parsing and other plans  Official Python 3 support ... 14 Please test this beta!
  • 15. Biopython 1.62  Contributors (as of the beta release): 15 Alexander Campbell (*) Andrea Rizzi (*) Anthony Mathelier (*) Ben Morris (*) Brad Chapman Christian Brueffer David Arenillas (*) David Martin (*) Eric Talevich Iddo Friedberg Jian-Long Huang (*) Joao Rodrigues Kai Blin Michiel de Hoon Nate Sutton (*) Peter Cock Petra Kubincová (*) Phillip Garland Saket Choudhary (*) Tiago Antao Wibowo 'Bow' Arindrarto Xabier Bello (*)
  • 17. Python 2 and 3  Python 2.7 is the final release of Python 2  Python 3 is similar but different to Python 2  Most Python 2 code needs updating to run  Big difference is Python 3 uses unicode for strings  We need to be explicit about bytes vs unicode in many of our parsers  Text file IO defaults to unicode, which is slower  Most Python libraries are gradually being updated 17
  • 18. Python 3 strategy  We’ve been testing under Python 3 for over a year  Biopython 1.62 will officially support Python 3.3  Current strategy:  Develop under Python 2  Installation under Python 3 uses 2to3 converter  Test under Python 3 18
  • 19. Cross Platform Testing  Nightly tests with BuildBot  Tests run on volunteer machines  Covers multiple OS and Python combinations  More volunteer machines welcome, especially 64 bit Windows  Server runs on OBF funded Amazon server  Continuous integration with TravisCI  Covers a range of languages using VMs  Free service for Open Source projects  Runs tests when code on GitHub updated  Runs tests for GitHub pull requests (Nice!) 19
  • 20. Cross Platform Testing 20 Python/OS Linux 32 bit Linux 64 bit Mac OS X 64 bit Windows 32 bit C Python 2.5 C Python 2.6 C Python 2.7 C Python 3.1 C Python 3.2 C Python 3.3 PyPy 1.9 PyPy 2.0 Jython 2.5 Jython 2.7b BB + Travis BB BB BB BB + Travis BB BB BB BB + Travis BB BB BB BB BB BB BB BB BB BB BB + Travis BB BB Travis BB BB BB BB BB BB BB This test matrix is quite big!
  • 21. Cross Platform Testing 21 Python/OS Linux 32 bit Linux 64 bit Mac OS X 64 bit Windows 32 bit C Python 2.5 C Python 2.6 C Python 2.7 C Python 3.1 C Python 3.2 C Python 3.3 PyPy 1.9 PyPy 2.0 Jython 2.5 Jython 2.7b BB + Travis BB BB BB BB + Travis BB BB BB BB + Travis BB BB BB BB BB BB BB BB BB BB BB + Travis BB BB Travis BB BB BB BB BB BB BB Dropping Python 2.5 support Also means Jython 2.5
  • 22. Cross Platform Testing 22 Python/OS Linux 32 bit Linux 64 bit Mac OS X 64 bit Windows 32 bit C Python 2.5 C Python 2.6 C Python 2.7 C Python 3.1 C Python 3.2 C Python 3.3 PyPy 1.9 PyPy 2.0 Jython 2.5 Jython 2.7b BB + Travis BB BB BB BB + Travis BB BB BB BB + Travis BB BB BB BB BB BB BB BB BB BB BB + Travis BB BB Travis BB BB BB BB BB BB BB Have been useful in Python 3 testing, but won’t support
  • 23. Cross Platform Testing Plan  Target Python 2.6, 2.7 and 3.3 (or later)  Volunteer machines needed, especially 64 bit Windows 23 Python/OS Linux 32 bit Linux 64 bit Mac OS X 64 bit Windows 32 bit Windows 64 bit C Python 2.6 C Python 2.7 C Python 3.3 PyPy 1.9 PyPy 2.0 PyPy 2.1b Jython 2.7b BB + Travis BB BB BB BB + Travis BB BB BB BB + Travis BB BB Travis BB BB BB BB BB Python 2.7 variants
  • 24. Python 3 strategy  Current strategy:  Develop under Python 2  Installation under Python 3 uses 2to3 converter  Test under Python 3  Future strategy:  Target Python 2.6, 2.7 and 3.3 (or later)  Start writing code which works on both at same time  Continue to use 2to3 on a case-by-case basis during transition period, or for problem cases 24
  • 26. Stability versus Flexibility  We aim for rigorous cross-platform testing  We value backwards compatibility in core code  Flexibility through modularity?  BioRuby’s Gems http://gems.bioruby.org  BioPerl sub-packages on CPAN  Can/should we move towards this with PyPI?  Would this encourage more contributors?  We’re trying ‘beta level’ experimental modules within the monolithic Biopython distribution, e.g. SearchIO 26
  • 27. Biopython Support: Resources  OBF hosted website, mailing lists, bug tracker, etc  GitHub hosted repository  TravisCI hosted continuous integration testing  Personal and institute BuildSlave machines for testing 27 Thank you all!
  • 28. Biopython Support: People  Google supports summer students via GSoC  Some of the developers do contribute on work time  However, Biopython is mainly volunteer funded  Please help out, e.g.  Feedback  Bug reports  Documentation improvements  Review code  More unit tests  Enhancements or new code  ... 28 Thank you all!