Collaborative Database and Computational Models for Tuberculosis Drug Discovery
1. Collaborative Database and Computational Models for Tuberculosis Drug Discovery Sean Ekins Collaborations in Chemistry, Fuquay Varina, NC. Collaborative Drug Discovery, Burlingame, CA. Department of Pharmacology, University of Medicine & Dentistry of New Jersey-Robert Wood Johnson Medical School, Piscataway, NJ. School of Pharmacy, Department of Pharmaceutical Sciences, University of Maryland, Baltimore, MD.
2. In the long history of human kind (and animal kind, too) those who have learned to collaborate and improvise most effectively have prevailed. Charles Darwin
3.
4. Some Definitions
Open Innovation: Open innovation is a paradigm that assumes that firms can and should use external ideas as well as internal ideas, and internal and external paths to market, as the firms look to advance their technology. Chesbrough, H.W. (2003). Open Innovation: The new imperative for creating and profiting from technology. Boston: Harvard Business School Press, p. xxiv
Collaborative Innovation: A strategy in which groups partner to create a product and to drive the efficient allocation of R&D resources. By collaborating with outsiders, including customers, vendors and even competitors, a company is able to import lower-cost, higher-quality ideas from the best sources in the world.
Open Source: While open source and open innovation might conflict on patent issues, they are not mutually exclusive, as participating companies can donate their patents to an independent organization, put them in a common pool or grant unlimited license use to anybody. Hence some open source initiatives can merge the two concepts.
5.
6. Collaboration is everywhere
- Major collaborative grants in EU: Framework, IMI … NIH moving in same direction?
- Cross-continent collaboration: CROs in China, India etc.; pharmas in US / Europe
- More industry-academia collaboration: ‘not invented here’ a thing of the past
- More effort to go after rare and neglected diseases
- Globalization and connectivity of scientists will be key
- Current pace of change in pharma may not be enough. Need to rethink how we use all technologies & resources…
7. Hardware is getting smaller [Figure: computing hardware from the 1930s through the 1980s and 1990s to the 2000s: room size, desktop size, laptop, netbook, phone, watch. Not to scale and not equivalent computing power; illustrates mobility.]
8. Free tools are proliferating. Models and software are becoming more accessible: free, precompetitive efforts, collaboration.
9. Typical Lab: The Data Explosion Problem & Collaborations DDT Feb 2009
15. ~20 public datasets for TB, including Novartis data on TB hits. >300,000 cpds. Patents, papers. Annotated by CDD. Open to browse by anyone: http://www.collaborativedrug.com/register Molecules with activity against
16. More Medicines for Tuberculosis. CDD is a partner on a 5-year project supporting >20 labs and providing cheminformatics support. www.mm4tb.org
17. Ekins et al, Trends in Microbiology 19: 65-74, 2011 Fitting into the drug discovery process
18. Searching for TB molecular mimics; collaboration. Modeling: CDD. Biology: Johns Hopkins. Chemistry: Texas A&M. Lamichhane G, et al., mBio, 2: e00301-10, 2011
20. Bayesian Classification Models for TB. Good / Bad. Active compounds with MIC < 5 µM. Laplacian-corrected Bayesian classifier models were generated using FCFP-6 and simple descriptors. 2 models: 220,000 and >2000 compounds. Ekins et al., Mol BioSyst, 6: 840-851, 2010
22. Bayesian machine learning. Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
Bayesian classification is a simple probabilistic classification model. It is based on Bayes’ theorem:
$$p(h \mid d) = \frac{p(d \mid h)\, p(h)}{p(d)}$$
where
h is the hypothesis or model
d is the observed data
p(h) is the prior belief (probability of hypothesis h before observing any data)
p(d) is the data evidence (marginal probability of the data)
p(d|h) is the likelihood (probability of data d if hypothesis h is true)
p(h|d) is the posterior probability (probability of hypothesis h being true given the observed data d)
A weight is calculated for each feature using a Laplacian-adjusted probability estimate to account for the different sampling frequencies of different features. The weights are summed to provide a probability estimate.
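The per-feature weighting described on this slide can be sketched as follows. This is a minimal illustration, not the implementation used in the cited work: it assumes one commonly described form of the Laplacian correction, weight = log((A_F + 1) / (N_F × base rate + 1)), where N_F is the number of training compounds containing feature F and A_F the number of those that are active. The function names and the toy feature labels are hypothetical; real models would use fingerprint features such as FCFP-6.

```python
import math
from collections import Counter


def train_laplacian_bayes(samples):
    """Learn one weight per feature from (feature_set, is_active) pairs.

    Assumed form of the Laplacian-corrected estimate (not confirmed by the slide):
        weight(F) = log((A_F + 1) / (N_F * base_rate + 1))
    """
    n_total = len(samples)
    n_active = sum(1 for _, active in samples if active)
    base_rate = n_active / n_total

    seen = Counter()         # N_F: compounds containing feature F
    seen_active = Counter()  # A_F: active compounds containing feature F
    for features, active in samples:
        for feature in set(features):
            seen[feature] += 1
            if active:
                seen_active[feature] += 1

    return {
        feature: math.log((seen_active[feature] + 1) / (seen[feature] * base_rate + 1))
        for feature in seen
    }


def score(weights, features):
    """Sum the weights of the features present; unseen features contribute 0."""
    return sum(weights.get(feature, 0.0) for feature in set(features))


# Toy training set with hypothetical feature labels.
toy_samples = [
    ({"a", "b"}, True),
    ({"a"}, True),
    ({"b", "c"}, False),
    ({"c"}, False),
]
toy_weights = train_laplacian_bayes(toy_samples)
```

In this toy set, feature "a" appears only in actives and gets a positive weight, "c" appears only in inactives and gets a negative weight, and "b" is uninformative. The correction pulls rarely seen features toward zero, so a feature observed once carries less weight than one observed many times with the same active fraction.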
23. Bayesian Classification TB Models. Leave out 50% x 100. Ekins et al., Mol BioSyst, 6: 840-851, 2010
Dataset (number of molecules) | External ROC Score | Internal ROC Score | Concordance | Specificity | Sensitivity
MLSMR All single point screen (N = 220463) | 0.86 ± 0 | 0.86 ± 0 | 78.56 ± 1.86 | 78.59 ± 1.94 | 77.13 ± 2.26
MLSMR dose response set (N = 2273) | 0.73 ± 0.01 | 0.75 ± 0.01 | 66.85 ± 4.06 | 67.21 ± 7.05 | 65.47 ± 7.96
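A "leave out 50% x 100" validation of the kind named on this slide can be sketched as below: the data are split in half 100 times, a model is fitted on one half and evaluated on the other, and the mean and standard deviation of each metric are reported. This is an illustration under assumptions: the split here is stratified by class and uses a fixed seed, neither of which the slide states, and `fit` and `predict` are hypothetical callables supplied by the user.

```python
import random
import statistics


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and concordance (accuracy), in percent."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "concordance": 100.0 * (tp + tn) / len(y_true),
    }


def leave_half_out(samples, fit, predict, repeats=100, seed=0):
    """Repeat a stratified 50/50 split; return {metric: (mean, stdev)}."""
    rng = random.Random(seed)
    actives = [s for s in samples if s[1]]
    inactives = [s for s in samples if not s[1]]
    runs = []
    for _ in range(repeats):
        train, test = [], []
        # Split each class separately so both halves contain both classes.
        for group in (actives, inactives):
            shuffled = group[:]
            rng.shuffle(shuffled)
            half = len(shuffled) // 2
            train += shuffled[:half]
            test += shuffled[half:]
        model = fit(train)
        y_true = [label for _, label in test]
        y_pred = [predict(model, x) for x, _ in test]
        runs.append(confusion_metrics(y_true, y_pred))
    return {
        name: (
            statistics.mean([run[name] for run in runs]),
            statistics.stdev([run[name] for run in runs]),
        )
        for name in runs[0]
    }


# Toy usage: a fixed-threshold "model" on perfectly separable data.
toy_data = [(x, x >= 10) for x in range(20)]
toy_result = leave_half_out(
    toy_data,
    fit=lambda train: 10,                    # hypothetical: ignore training data
    predict=lambda threshold, x: x >= threshold,
)
```

Each entry of `toy_result` is a (mean, standard deviation) pair, which is the "value ± spread" form used in the table above.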
24. Additional test sets. Suggests models can predict data from the same and independent labs. Initial enrichment enables screening few compounds to find actives.
100K library: 1702 hits in >100K cpds
Novartis data: 34 hits in 248 cpds
FDA drugs: 21 hits in 2108 cpds
Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011. Ekins et al., Mol BioSyst, 6: 840-851, 2010
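"Initial enrichment" can be made concrete with an enrichment factor: the hit rate among the top-ranked compounds divided by the hit rate of the whole set. The sketch below is a generic illustration with a hypothetical function name and toy labels; it does not reproduce the enrichment calculations in the cited papers.

```python
def enrichment_factor(ranked_labels, top_n):
    """Hit rate in the top_n ranked compounds divided by the overall hit rate.

    ranked_labels: 1 for an active, 0 for an inactive, ordered from the
    highest model score to the lowest.
    """
    if top_n <= 0 or top_n > len(ranked_labels):
        raise ValueError("top_n must be between 1 and the number of compounds")
    total_hits = sum(ranked_labels)
    if total_hits == 0:
        return 0.0
    top_rate = sum(ranked_labels[:top_n]) / top_n
    overall_rate = total_hits / len(ranked_labels)
    return top_rate / overall_rate


# Toy ranking: 3 actives among 10 compounds, 2 of them ranked first.
toy_ranking = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
toy_ef = enrichment_factor(toy_ranking, top_n=2)
```

A value above 1 means the model places actives near the top more often than random selection would, so fewer compounds need to be screened to find the same number of hits.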
25.
26. Models with SRI kinase library data; refining data with cytotoxicity. Adding cytotoxicity data improves models.
ROC XV AUC: Model 1 (N 23797) = 0.89; Model 2 (N 1248) = 0.72; Model 3 (N 1248) = 0.77
Leave out 50% x 100:
Dataset (number of molecules) | External ROC Score | Internal ROC Score | Concordance | Specificity | Sensitivity
Model 1 (N = 23797) | 0.87 ± 0 | 0.88 ± 0 | 76.77 ± 2.14 | 76.49 ± 2.41 | 81.7 ± 2.96
Model 2 (N = 1248) | 0.65 ± 0.01 | 0.70 ± 0.01 | 61.58 ± 1.56 | 61.85 ± 8.45 | 61.30 ± 8.24
Model 3 (N = 1248) | 0.74 ± 0.02 | 0.75 ± 0.02 | 68.67 ± 6.88 | 69.28 ± 9.84 | 64.84 ± 12.11
27. Original TB Models: refining data with cytotoxicity. Ekins et al., Mol BioSyst, 6: 840-851, 2010
ROC XV AUC: Single pt = 0.88; Dose resp = 0.78; Dose resp + cyto = 0.86
Leave out 50% x 100:
Dataset (number of molecules) | External ROC Score | Internal ROC Score | Concordance | Specificity | Sensitivity
MLSMR All single point screen (N = 220463) | 0.86 ± 0 | 0.86 ± 0 | 78.56 ± 1.86 | 78.59 ± 1.94 | 77.13 ± 2.26
MLSMR dose response set (N = 2273) | 0.73 ± 0.01 | 0.75 ± 0.01 | 66.85 ± 4.06 | 67.21 ± 7.05 | 65.47 ± 7.96
NEW Dose resp and cytotoxicity (N = 2273) | 0.82 ± 0.02 | 0.84 ± 0.02 | 82.61 ± 4.68 | 83.91 ± 5.48 | 65.99 ± 7.47
30. TB compound libraries and filter failures. Filtering using SMARTS filters to remove thiol reactives, false positives etc. at University of New Mexico (http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter). Ekins et al., Mol Biosyst, 6: 2316-2324, 2010
32. Filter Correlations with Rule of 5. Correlation between the number of SMARTS filter failures and the number of Lipinski violations for different types of rule sets with FDA drug set from CDD (N = 2804). Suggests # of Lipinski violations may also be an indicator of undesirable chemical features that result in reactivity. Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011.
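The comparison on this slide needs two numbers per compound: a count of Lipinski Rule of 5 violations and a count of SMARTS filter failures. The sketch below counts violations from precomputed descriptors using the standard Rule of 5 thresholds and computes a Pearson correlation. It is an illustration only: the slide does not say which correlation statistic was used, the descriptor values are assumed to come from elsewhere, and the toy rows are invented for demonstration.

```python
import math


def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count Rule of 5 violations: MW > 500, logP > 5, HBD > 5, HBA > 10."""
    return sum([mol_weight > 500, logp > 5, h_donors > 5, h_acceptors > 10])


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical rows: (MW, logP, HBD, HBA, number of SMARTS filter failures).
toy_compounds = [
    (320.0, 2.1, 1, 4, 0),
    (480.0, 4.8, 3, 9, 1),
    (560.0, 5.6, 2, 8, 2),
    (710.0, 6.3, 6, 12, 4),
]
toy_violations = [lipinski_violations(*row[:4]) for row in toy_compounds]
toy_failures = [row[4] for row in toy_compounds]
toy_r = pearson_r(toy_violations, toy_failures)
```

With real data, the filter-failure counts would come from running each rule set's SMARTS patterns against every structure, which requires a cheminformatics toolkit and is outside this sketch.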
33. Summary
- Computational models based on whole-cell TB data could improve efficiency of screening
- Collaborations get us to interesting compounds quickly
- Availability of datasets enables analysis that could suggest simple rules
- A high proportion of compounds fail the Abbott filters for reactivity when compared to drugs and antimalarials
- Understanding the chemical properties and characteristics of compounds = better compounds for lead optimization
34. Could all pharmas share their data as models with each other? Increasing Data & Model Access Ekins and Williams, Lab On A Chip, 10: 13-22, 2010.
35.
36.
37.
38. RRCK Permeability and MDR. Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010
Open descriptors results almost identical to commercial descriptors, across many datasets and quantitative and qualitative data. Smaller solubility datasets give similar results. Provides confidence that open models could be viable.
MDCK: training 25,000, testing 25,000. MDR: training 25,000, testing 18,400.
Descriptors (C5.0 models) | RRCK Permeability | MDR
CDK and SMARTS Keys | Kappa = 0.50, Sensitivity = 0.62, Specificity = 0.94, PPV = 0.68 | Kappa = 0.65, Sensitivity = 0.86, Specificity = 0.78, PPV = 0.84
MOE2D and SMARTS Keys (Baseline) | Kappa = 0.53, Sensitivity = 0.64, Specificity = 0.94, PPV = 0.72 | Kappa = 0.67, Sensitivity = 0.86, Specificity = 0.80, PPV = 0.85
CDK descriptors | Kappa = 0.47, Sensitivity = 0.59, Specificity = 0.93, PPV = 0.67 | Kappa = 0.62, Sensitivity = 0.85, Specificity = 0.77, PPV = 0.83
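The four statistics reported on this slide all derive from a 2x2 confusion matrix. The sketch below shows the standard definitions, including Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The function name and the toy counts are hypothetical and are not taken from the cited study.

```python
def binary_model_stats(tp, fp, fn, tn):
    """Cohen's kappa, sensitivity, specificity and PPV from confusion counts."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement: product of marginal frequencies, summed over both classes.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "kappa": (observed - expected) / (1 - expected),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Toy counts: 100 compounds, balanced classes, 80% agreement.
toy_stats = binary_model_stats(tp=40, fp=10, fn=10, tn=40)
```

Kappa is useful next to sensitivity and specificity because a model can score high on either one alone by favoring a single class, whereas kappa stays low unless agreement exceeds what the class frequencies would produce by chance.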
39. Model coverage of chemistry space. Combining models may give greater coverage of ADME/Tox chemistry space and improve predictions? [Figure labels: Lundbeck, Pfizer, Merck, GSK, Novartis, Lilly, BMS, Allergan, Bayer, AZ, Roche, BI, Merck KGaA]
40.
41. A complex ecosystem of collaborations: A new business model. [Figure: collaborators inside company, inside academia, inside foundation and inside government, each with molecules, models, data and IP, linked by collaborative platform/s with shared IP.] Bunin & Ekins DDT 16: 643-645, 2011
42.
43.
44. Repurpose FDA drugs in silico. Key databases of structures and bioactivity data; FDA drugs database.
- 2D similarity search with “hit” from screening
- Export database and use for 3D searching with a pharmacophore or other model
- Suggest approved drugs for testing; may also indicate other uses if it is present in more than one database
- Suggest in silico hits for in vitro screening
Ekins S, Williams AJ, Krasowski MD and Freundlich JS, Drug Disc Today, 16: 298-310, 2011
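A 2D similarity search of the kind listed on this slide is commonly done by comparing fingerprint bit sets with the Tanimoto coefficient. The sketch below is a generic illustration, not the search used in the cited work: fingerprints are represented as plain sets of integer bit positions, the drug names and bit sets are invented, and the 0.7 cutoff is an arbitrary assumption.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_search(query_fp, library, threshold=0.7):
    """Return (name, similarity) pairs at or above threshold, most similar first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints as sets of "on" bit positions.
toy_query = {1, 2, 3, 4}
toy_library = {
    "drug_A": {1, 2, 3, 4, 5},  # 4 shared of 5 total: 0.8
    "drug_B": {1, 2, 9},        # 2 shared of 5 total: 0.4
    "drug_C": {1, 2, 3, 4},     # identical: 1.0
}
toy_hits = similarity_search(toy_query, toy_library)
```

In practice the fingerprints would be generated from structures by a cheminformatics toolkit, and the screening hit would be the query against a database of approved drugs.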
45. Crowdsourcing Project “Off the Shelf R&D”
All pharmas have assets on shelf that reached clinic. Get the crowd to help in repurposing / repositioning these assets.
How can software help?
- Create communities to test
- Provide informatics tools that are accessible to the crowd; enlarge user base
- Data storage on cloud; integration with public data
- Crowd becomes virtual pharma-CROs and the “customer” for enabling services
46.
47. 2020: A Drug Discovery Odyssey. Could our pharma R&D look like this? Massive collaboration networks, software enabled. We are in “Generation App”. Crowdsourcing will have a role in R&D. Drug discovery possible by anyone with “app access”. Ekins & Williams, Pharm Res, 27: 393-395, 2010.
48.
49. How do you find scientific mobile apps? www.scimobileapps.com Development of wikis to track developments in tools.
50.
51. Novartis aerobic and anaerobic TB hits Anaerobic compounds showed statistically different and higher mean descriptor property values compared with the aerobic hits (e.g. molecular weight, logP, hydrogen bond donor, hydrogen bond acceptor, polar surface area and rotatable bond number) The mean molecular properties for the Novartis compounds are in a similar range to the MLSMR and TAACF-NIAID CB2 hits Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011.
Editor's notes
CDD Experienced Team Innovates and Executes Barry Bunin, PhD (Pres. & Cofounder as first Eli Lilly EIR) Libraria (CEO, Pres.-CSO), Arris Pharmaceuticals (Sr. Scientist), Genentech, UC Berkeley (Ellman), Columbia University, author. Moses Hohman, PhD (Director Software Engineering) Northwestern Assoc. Director of Bioinformatics, Thoughtworks, Inc., U of Chicago (PhD), Harvard ( magna cum laude, Physics) Sylvia Ernst, PhD (Director Community Growth & Sales) Left 800-lb Gorillas: Accelrys-Scitegic, MDL-Elsevier-Beilstein Peter Cohan (BOD & Overall Sales Strategy) Symyx (VP Bus Dev & President-Discovery Tools), MDL (VP Customer Marketing), www.secondderivative.com, author. Omidyar Network, Founders Fund, & Lilly (BOD observers) WSGR (Corporate Counsel), Rina Accountancy (GAAP compliance) Partners: Hub Consortium Members, ChemAxon, DNDi, MMV, Sandler Center… CDD SAB: Christopher Lipinski PhD, James McKerrow, MD PhD, David Roos PhD, Adam Renslo PhD, Wes Van Voorhis, MD PhD
Added Massive collaboration networks – software enabled. We are in “Generation App”. Crowdsourcing will have a role in R&D. Drug discovery possible by anyone with “app access”