Informatics for All

The Open Source and Freeware Revolution in Chemical Biology

Applications to Selected SRI Projects
Proprietary software: black box model
● Exclusive code control
● Exclusive development / customization control

User-initiated customization must await proprietor implementation,
or users must reinvent all requisite wheels
Open source empowers users to pursue novel
enhancements according to their needs and their
timelines!
Open source = philosophically good

But in practice, can you replace well-established proprietary tools
with open source and still sustain

● Effective
● Accurate
● Efficient

science?
Yes! (mostly)

Sometimes it helps to have a guinea pig = me
July – Oct.: 16 distinct projects for 8 clients, 98+% open source
Scope
[Diagram: Synthesis & Procurement; Intellectual Property; Assay Development; WWW; Chemical Structures, Screening Data, Omics Data, Meta Data; Target Discovery; Structure-Based Design; SAR, ADME, Tox, PK]
Acquire & manage data
[Diagram: Chemical Structures, Screening Data, Omics Data, Meta Data]
Chemical specification, drawing & editing:

Marvin (http://www.chemaxon.com/products/marvin/) functionality approaching that of ChemDraw; good drawing options; can embed into office documents
Enumerate combinatorial libraries




SmiLib (http://gecco.org.chemie.uni-frankfurt.de/smilib/) Efficient and flexible

Marvin (http://www.chemaxon.com) Intuitive but slower
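
To make "enumerate combinatorial libraries" concrete, here is a minimal sketch of the idea in plain Python: every combination of building blocks is substituted into a scaffold template. This is not SmiLib's or Marvin's input format or algorithm. The `{R1}`/`{R2}` placeholder syntax, the scaffold, and the building blocks are all hypothetical, and the code does pure string templating with no valence or syntax checking.

```python
from itertools import product


def enumerate_library(scaffold, building_blocks):
    """Return one SMILES-like string per combination of building blocks.

    scaffold: template containing placeholders such as "{R1}" (hypothetical syntax).
    building_blocks: dict mapping a slot name to a list of fragment strings.
    """
    slots = sorted(building_blocks)
    library = []
    # product() walks every combination, one fragment per slot.
    for combo in product(*(building_blocks[slot] for slot in slots)):
        smiles = scaffold
        for slot, fragment in zip(slots, combo):
            smiles = smiles.replace("{" + slot + "}", fragment)
        library.append(smiles)
    return library


# Hypothetical example: a para-disubstituted benzene with two attachment slots.
SCAFFOLD = "{R1}c1ccc({R2})cc1"
BLOCKS = {"R1": ["C", "CC"], "R2": ["O", "N", "F"]}

for smiles in enumerate_library(SCAFFOLD, BLOCKS):
    print(smiles)
```

The library size is the product of the building-block counts per slot (2 x 3 = 6 here), which is why enumeration speed matters for real libraries.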
Molecular Structure Conversion




Molconverter (http://www.chemaxon.com/products/marvin/molconverter/)      Fast

OpenBabel (http://www.openbabel.org)   Excellent functionality but slow
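
As an illustration of batch structure conversion, the sketch below wraps the Open Babel command-line program `obabel` from Python. It assumes Open Babel is installed and on the PATH. The file names are hypothetical, and the wrapper functions are illustrative helpers, not part of Open Babel itself.

```python
import shutil
import subprocess


def build_obabel_command(input_path, output_path):
    """Return the argument list for an Open Babel file conversion.

    Open Babel infers the input and output formats from the file extensions.
    """
    return ["obabel", input_path, "-O", output_path]


def convert(input_path, output_path):
    """Run the conversion; return True only if obabel exists and exits with status 0."""
    if shutil.which("obabel") is None:
        return False  # Open Babel is not installed, so nothing is run
    result = subprocess.run(build_obabel_command(input_path, output_path), check=False)
    return result.returncode == 0


# Hypothetical file names: convert an SD file to a SMILES file.
print(" ".join(build_obabel_command("library.sdf", "library.smi")))
```

Wrapping the command this way makes it easy to time a conversion on your own library and check the speed comparison above for yourself.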
Store / analyze libraries and screening data




Screening Assistant SA2 (http://sa2.sourceforge.net/) Powerful, enterprise-like
software: capable of handling internal data management for serious operations
WWW

Need: External Knowledgebases
Caveat: logged query = disclosure!
Chemical Data / Meta Data




ChemSpider (http://www.chemspider.com/) structure, literature, suppliers

PubChem (http://pubchem.ncbi.nlm.nih.gov/) structure, screening data

SureChem (https://surechem.com/) patent searches
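
These knowledgebases can also be queried programmatically. As a hedged sketch, the code below builds a request URL for PubChem's PUG REST interface, looking up a compound by name and asking for two properties as JSON. The helper function and the example compound are illustrative assumptions. The code only constructs the URL and sends nothing, which matters given the caveat that a logged query amounts to disclosure.

```python
from urllib.parse import quote

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(name, properties=("MolecularFormula", "MolecularWeight")):
    """Build a PUG REST URL for a by-name compound property lookup (JSON output)."""
    # quote() percent-encodes spaces and other characters that are unsafe in a URL path.
    return "{}/compound/name/{}/property/{}/JSON".format(
        PUG_REST, quote(name, safe=""), ",".join(properties)
    )


# Illustrative query; fetching this URL would send the compound name to PubChem.
print(property_url("aspirin"))
```

Keeping URL construction separate from the network call makes it easy to review exactly what would leave the building before any proprietary structure or name is submitted.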
ADME/Tox profiling; target identification




PASS (http://www.pharmaexpert.ru/passonline/index.php) offered online only; free;
     surprisingly accurate predictions on 300+ endpoints
ADME profiling




iLab2 (https://ilab.acdlabs.com/iLab2/) good range of ADME endpoints,
      online only, one compound at a time
Need: Modeling, Informatics
[Diagram: Target Discovery; Structure-Based Design; SAR, ADME, Tox, PK]
Molecular Structure Prediction / Characterization




Avogadro (http://avogadro.openmolecules.net/)
   Great builder; good graphics; built-in molecular mechanics; hooks to free quantum codes
Molecular Structure Prediction / Characterization




VMD (http://www.ks.uiuc.edu/Research/vmd/)
  Good graphics; excellent analytical tools; hooks to NAMD (molecular dynamics)
Molecular Structure Prediction / Characterization




PyMol (http://www.openpymol.org)
  Great graphics; decent builder; good analytical tools
Protein Structure Prediction




SwissModel (http://swissmodel.expasy.org/)      good control, must have close homolog

Modeller (http://salilab.org/modeller/)   use this for optimal control and efficient relaxation
Structure-Based Design




PyRx / AutoDock (http://pyrx.sourceforge.net/)      easy to use; good predictions

Surflex (http://www.jainlab.org/contact.html)   fast; accurate; no free interface
QSAR: Descriptors




CDK (http://rguha.net/code/java/cdkdesc.html)
          Good descriptor selection, easy to use

SA2 (http://sa2.sourceforge.net/)
          Better descriptor selection, but harder to navigate
QSAR: modeling




BuildQSAR (http://profanderson.net/files/buildqsar.php)
                           Fast, flexible, easy to use
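
For readers new to QSAR, the simplest possible model is an ordinary least-squares line relating one descriptor to activity. The sketch below shows that calculation in plain Python. It is not BuildQSAR's code, and the descriptor and activity values are made-up toy numbers chosen only so the arithmetic is easy to follow.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination for the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Toy data (hypothetical, not from any project): one descriptor vs. measured activity.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(descriptor, activity)
print(slope, intercept, r_squared(descriptor, activity, slope, intercept))
```

Real QSAR work typically uses many descriptors and validates on held-out compounds; this single-descriptor fit only shows the core regression step.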
Toxicology profiling




ToxTree (http://toxtree.sourceforge.net/) fast, easy to use, clear logic,
        good array of toxicological endpoints
Information Flow
[Diagram: Synthesis & Procurement; Assay Development; Target Discovery; Structure-Based Design; SAR, ADME, Tox, PK]
Workflows (i.e., seamless process integration)
That's enough for now ...

Thank you!
Any questions?
