Open Babel
Noel M. O’Boyle
An open chemical toolbox
Open Babel development team and NextMove Software, Cambridge, UK
EMBL-EBI May 2016
MIOSS – Molecular Informatics Open-Source Software
J. Cheminf. 2011, 3, 33.
http://openbabel.org
Image credit: AJ Cann (AJC1 on Flickr)
File format A
Image credit: Jon Osborne (jonno101101 on Flickr)
File format B
What is Open Babel?
• A programming library in C++
– With access from Perl, Python, Java, Ruby, .NET/Mono, R,
PHP
• A set of command-line applications
– Most famously obabel for interconverting chemical file formats
• A graphical user interface for interconverting chemical file
formats
• Available on Win/Mac/Lin, through
conda/pip/brew/apt/yum/dnf, or from http://openbabel.org
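The obabel conversion mentioned above can also be driven from a script. What follows is a minimal Python sketch, assuming the obabel executable is installed and on the PATH. The helper names build_obabel_command and convert, and any file names passed to them, are hypothetical; the -i<format>, -o<format> and -O <outfile> options follow the obabel command-line syntax.

```python
import shutil
import subprocess
from typing import List, Optional


def build_obabel_command(infile: str, outfile: str,
                         in_format: Optional[str] = None,
                         out_format: Optional[str] = None) -> List[str]:
    """Build the argument list for a single obabel conversion.

    If no formats are given, obabel is left to infer them from the
    file extensions.
    """
    cmd = ["obabel"]
    if in_format:
        cmd.append("-i" + in_format)   # e.g. -isdf
    cmd.append(infile)
    if out_format:
        cmd.append("-o" + out_format)  # e.g. -osmi
    cmd += ["-O", outfile]
    return cmd


def convert(infile: str, outfile: str, **formats: Optional[str]) -> None:
    """Run the conversion; fails loudly if obabel is not installed."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel executable not found on PATH")
    subprocess.run(build_obabel_command(infile, outfile, **formats), check=True)
```

Calling build_obabel_command("mol.sdf", "mol.smi") returns the argument list for "obabel mol.sdf -O mol.smi". Building the list separately from running it keeps the command easy to inspect and log before anything is executed.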
History
Sources: Andrew Dalke
http://www.dalkescientific.com/writings/diary/archive/2004/01/03/available_toolkits.html, Roger Sayle
• 1992
– Matt Stahl and Pat Walters wrote Babel (an open source
molecule converter) at the University of Arizona
• 1999
– Matt joined OpenEye Scientific and based their cheminformatics
library OELib on Babel – this was also open source
• 2001
– OpenEye decided to rewrite their cheminformatics library as a
proprietary library, OEChem
– OELib was renamed to Open Babel, and continued as a
community project led by Geoff Hutchison
• 2002 (Dec)
– First release (1.0)
Features
• Multiple chemical file formats (+ options) and utility
formats
• 2D coordinate generation and depiction (PNG and SVG)
• 3D coordinate generation, forcefield minimisation,
conformer generation
• Binary fingerprints (path-based, substructure-based) and
associated “fast search” database
• Bond perception, aromaticity detection and atom-typing
• Canonical labelling, automorphisms, alignment
• Materials science: computational chemistry, molecular
dynamics, crystal structures
• Charge models: MMFF, Gasteiger, EEM, (E)QEq, QTPIE
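To illustrate why binary fingerprints make database searching fast, here is a small Python sketch of two generic fingerprint operations: a substructure pre-screen and a Tanimoto similarity. It shows the general technique only and is not Open Babel's own fingerprint or fast-search code; the function names and the toy 4-bit fingerprints are assumptions made for illustration.

```python
def passes_screen(query_fp: int, target_fp: int) -> bool:
    """Substructure pre-screen on bit fingerprints stored as integers.

    A target can contain the query only if every bit set in the query
    is also set in the target. A pass is not proof of a match, so a
    full substructure search is still needed afterwards.
    """
    return (query_fp & target_fp) == query_fp


def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto similarity: shared bits divided by bits set in either."""
    union = bin(fp_a | fp_b).count("1")
    if union == 0:
        return 0.0  # assumption: two empty fingerprints score 0
    return bin(fp_a & fp_b).count("1") / union


# Toy 4-bit fingerprints (real fingerprints are much longer).
query = 0b0011
target = 0b1011
screen_ok = passes_screen(query, target)
similarity = tanimoto(query, target)
```

Because the screen is a single bitwise comparison per molecule, most of a database can be rejected before any expensive atom-by-atom matching is attempted.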
Known Usage
• 45K downloads (from SF) in last 12 months
– 1.2K downloads of Windows Python bindings
• Paper published in 2011
– 984 citations (Google Scholar)
• Pybel paper published in 2008
– 117 citations
https://github.com/Magnusnorrby/MolecularRift
https://twitter.com/AstraZeneca/status/730775739264536576
Molecular Rift (as used by the King of Sweden) uses Open
Babel
Norrby, Grebner, Eriksson, Boström. J. Chem. Inf. Model., 2015, 55, 2475
Measuring the project’s pulse
• Oct 2012 – Last release and move to GitHub
– 112 “forks” on GitHub
– Commits from 59 developers (12 drive-by, 41 in the
last year)
• 37 pull requests since the start of the year
• 52 emails to the general mailing list this year
– Of these, 45 were replied to at least once
Contributors per month
Most committed developers in last 12 months
• Geoff Hutchison
– Professor, materials chemistry, Uni Pitt, Avogadro
• Dmitriy Fomichev
– PhD student, comp chemistry, Lobachevsky Uni, Russia
• Alexandr Fonari
– Assoc developer, Schrödinger, materials science, NWChem,
Quantum Espresso
• David van der Spoel
– Prof, Cell and Mol Biol, Uppsala Uni, Gromacs
• David Koes
– Assistant Prof, Comp and Sys Biology, Uni Pittsburgh,
3DMol.js, pharmit, pharmer
• Jeff Janes
– PI, Calibr (California Institute for Biomed Res), PostgreSQL
Chemistry file formats
• Chemists love inventing new file formats
• Every new chemistry application has its own file format
– Some exceptions: e.g. Avogadro
– De facto standards such as Daylight SMILES and
MDL/Symyx/Accelrys/Biovia/Dassault MOL
• The ability to read and interconvert chemical file formats is
important, both for scientific and economic reasons
– To unlock chemical data for analysis
– To avoid vendor lock-in
– To develop workflows/pipelines
Formats: most recent additions
• Siesta [read]
– ab initio molecular dynamics
• STL [write]
– (STereoLithography) 3D
printing
• Point cloud format [write]
– Write VdW surface as points
• AOForce [read]
– Turbomole vibrational freqs
• MDFF [read/write]
– MD fitting to density maps
• EXYZ [read/write]
– Extended XYZ
git log --pretty=oneline --name-status | grep "^A" | grep src/formats | grep -v inchi | grep -v libxml | less
• Orca [read/write]
– QM package
• JSON formats [read/write]
– ChemDoodle JSON
– PubChem JSON
• Confab report [write]
– Conformation generation
• Dalton [read]
– QM package
• LPMD [read/write]
– MD with interatomic potentials
• Smiley [read]
– Validating SMILES parser
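The git log pipeline shown on this slide can be mirrored in a few lines of Python, which may be easier to extend than a chain of grep calls. This is a minimal sketch that filters text already produced by "git log --pretty=oneline --name-status"; the function name added_format_files is hypothetical.

```python
from typing import List


def added_format_files(log_text: str) -> List[str]:
    """Return paths of files added under src/formats.

    Mirrors the grep chain: keep lines starting with "A" (added) that
    mention src/formats, and drop any mentioning inchi or libxml.
    """
    added = []
    for line in log_text.splitlines():
        if not line.startswith("A"):
            continue  # commit lines, or files that were modified/deleted
        if "src/formats" not in line:
            continue
        if "inchi" in line or "libxml" in line:
            continue
        # A name-status line is "<status><whitespace><path>".
        added.append(line.split(None, 1)[-1].strip())
    return added
```

To use it on a real checkout, the text could be captured with subprocess.run(["git", "log", "--pretty=oneline", "--name-status"], capture_output=True, text=True).stdout and passed straight to the function.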
Consider rolling your own plugins
• The Open Babel library itself is fairly compact and
much of the functionality is implemented as plugins
– File formats, descriptors, fingerprints, and arbitrary
operations that take molecules and do something
• Relatively straightforward to add your own plugins,
even if you have never programmed in C++ before
– Easier to add a plugin than write your own C++ application
– Can use the obabel command-line to call it
– Can optionally donate the plugin to the community
• Almost anything can be a plugin
– I have written an entire conformation generator as a plugin
(Confab)
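The plugin idea above can be pictured as a registry that maps a plugin kind and an ID to a piece of code, so that a front end such as a command-line tool can find a plugin by name. The Python sketch below shows that general pattern only. It is not Open Babel's C++ plugin API, and every name in it (REGISTRY, register, run_plugin, the two toy plugins and the dictionary used as a molecule record) is hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registry: plugin kind -> plugin ID -> callable.
REGISTRY: Dict[str, Dict[str, Callable]] = {}


def register(kind: str, plugin_id: str) -> Callable:
    """Decorator that files a function under a kind and an ID."""
    def decorator(func: Callable) -> Callable:
        REGISTRY.setdefault(kind, {})[plugin_id] = func
        return func
    return decorator


def run_plugin(kind: str, plugin_id: str, molecule: dict):
    """Look a plugin up by kind and ID, then apply it to a molecule record."""
    try:
        plugin = REGISTRY[kind][plugin_id]
    except KeyError:
        raise KeyError(f"no {kind} plugin named {plugin_id!r}") from None
    return plugin(molecule)


@register("descriptors", "atom_count")
def atom_count(molecule: dict) -> int:
    """Toy descriptor: number of atoms listed in the record."""
    return len(molecule["atoms"])


@register("ops", "strip_title")
def strip_title(molecule: dict) -> dict:
    """Toy operation: return a copy of the record with an empty title."""
    return {**molecule, "title": ""}
```

With this arrangement the core never needs to know which plugins exist: adding a new descriptor or operation means writing one function and registering it, which is the property the slide highlights.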
The GPL and industry
• Companies can use or modify Open Babel, add
plugins, and write their own code using it without any
problem
• If they distribute the resulting software outside the
company then they need to provide the source code
under the GPL
– This clause really only affects software companies
developing their own products, not end users in companies
Industry involvement
Code
• OpenEye
• eMolecules
• Silicos-IT
• Kitware
• Dalke Scientific
• Acpharis
• Astex
• Materials Design
• Schrödinger
• Vernalis
Emails to list
• Acellera
• AMRI
• ArQule
• Avant-garde materials sim
• Avesthagen
• Basilea
• Bayer
• CambridgeSoft
• Constellation Pharma
• Culgi
• Digital Chemistry
• Evotec
• Givaudan
• Global Phasing
• GreenPharma
• Inhibox
• Ingenuity
• Invitrogen (now ThermoFisher)
• Jubilant Biosys
• Lexicon
• Ligon Discovery
• LHASA
• Merck(.de)
• Molplex
• OmegaChem
• PeakDale
• Prometic
• PsychoGenics
• Specs
• Symyx/Accelrys
• Syngenta
• Takasago
• Targacept
• Thomson Reuters
Note: based on email addresses
Supporting open source
• When emailing a list, please give your affiliation
– It’s nice to know companies find it useful
• Spread the word, give credit in talks
• Give feedback
– What we’re doing right/wrong
– Can help reorder our priorities/reality check
• Bug bounty?
Future outlook
• Dude, there’s a plan??
• New features are driven by needs/interests of individuals
– Research interests
– Gaps in functionality
– Features needed ‘downstream’ by software using the library
• Avogadro is driving improved support for QM/MD
packages
• Generation of 3D structures based on distance geometry
• Housekeeping: Kekulization rewrite, implicit valency
• Improved performance? Has historically been low on the
agenda.
• Would be nice to have meetings like RDKit does
• What do *you* think we should be focusing on?
ASCII depiction
A cry for help
Like mailing lists?
openbabel-discuss@lists.sf.net
Like forums?
http://forums.openbabel.org
Like to email a developer
directly?
Step away from the keyboard
:-)
Don’t forget to read the
docs first and Google it
http://openbabel.org/docs
Image: Tintin44 (Flickr)

Editor's notes

  1. OB is like a Swiss army knife, not a…
  2. …spork!
  3. “The 70s are calling. They want their depiction back.”