
Molecular and data visualization in drug discovery

  1. Molecular and Data Visualization in Drug Discovery. Deepak Bandyopadhyay, GlaxoSmithKline
  2. Intro: Human Body & Disease Biology
     • From Wikipedia:
       – Abnormal condition that affects part or all of an organism.
       – Associated with specific symptoms and signs.
     • Causes:
       – Single cause, e.g. pathogen, poison, nutrient deficiency, genetics
       – Multiple factors including environment, lifestyle, genetics
     http://www.biologyguide.net/biol1/1_disease.htm
     [Figures: Mycobacterium tuberculosis; chest X-ray showing lung cancer]
  3. Drug Discovery Parts/Timeline
     • Focus of drug discovery: narrow down on one or a few substances to test in humans and develop into a drug that treats a disease
     • Components:
       – Target Selection and Validation
       – In Vitro Biology
       – Medicinal Chemistry (Lead Optimization)
       – Lead Discovery (a.k.a. Screening)
       – In Vivo Biology
     [Diagram labels: genome, protein, link to disease, disease genetics, pathology, biological target]
  4. Molecular and Data Visualization
     • The two parts of my job at GSK!
     • Molecules:
       – small (drugs/peptides) and large (proteins/DNA/RNA/lipids)
       – visualized in 1D (SMILES), 2D (structure), 3D (coords / conformations), 4D (Mol. Dynamics)
     • Data:
       – Format: numeric / text, continuous / categorical, delimited / database / XML / proprietary
       – Source: instruments, manual entry, calculation
       – About drug discovery projects (key: molecule ID), genomics/proteomics (key: gene/protein ID), clinical studies (key: anon. patient ID), …
     [Figures: Ibuprofen (drug); EGFR (protein), ball and stick; EGFR ribbons]
  5. Movie: Introduction to Drug Design, by Schrödinger (molecular modeling software company): https://www.youtube.com/watch?v=u49k72rUdyc
  6. Bioactivity 101
     • Concentration-response curve and IC50
     • Structure-Activity Relationship (SAR)
     • pIC50 = −log10(IC50), with IC50 expressed in molar units
     • Example: IC50 = 12.8 µM (micromolar), so pIC50 = 6 − log10(12.8) = 4.89
     • Think Avogadro, pH…
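The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes the IC50 is supplied in micromolar; the function name `pic50_from_ic50_um` is illustrative and does not come from the slides.

```python
import math

def pic50_from_ic50_um(ic50_um: float) -> float:
    """Convert an IC50 given in micromolar to pIC50 (-log10 of the IC50 in molar)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    # 1 uM = 1e-6 M, so -log10(ic50_um * 1e-6) = 6 - log10(ic50_um)
    return 6.0 - math.log10(ic50_um)

# The slide's example: 12.8 uM
print(round(pic50_from_ic50_um(12.8), 2))  # → 4.89
```

Because the scale is logarithmic, a tenfold drop in IC50 raises pIC50 by exactly one unit, which is why the slide compares it to pH.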
  7. Molecular Visualization Deconstructed
     • Representations
     • Navigation
     • Interaction
     • What would you add?
     [Representations: aspirin (ligand), Cox-1 (protein); binding pocket surface colored as polar, +ve charge, hydrophobic, -ve charge]
     [Navigation: XY translate, Z zoom; rotate about X/Y or Z]
     [Interaction, e.g. in the program MOE: F1, F2, F3; Save/restore scenes; Select; Hide/Show; Center; Prev/Next Scene; Expand Sel.; Import/Export; Align; Compute…]
  8. Purposes of Molecule Visualization
     • Understand and rationalize “SAR” in 3D
     • (Protein) Structure-Based Drug Design. E.g.:
       – Aspirin binds COX1/2, Celebrex binds COX2 only
     • Clearly illustrate biological systems / processes
     • What other tasks can you think of?
  9. Case Study 1: Protein-Protein Interactions
     • HIV-1 coat protein gp120 bound to antibody 17b (Light, Heavy) and CD4
     [Figure panels: gp120/CD4 interface; gp120/antibody L/H interface; surfaces colored by rank]
     Ban, Y. E. A., Edelsbrunner, H., & Rudolph, J. (2006). Interface surfaces for protein-protein complexes. J. ACM, 53(3), 361-378.
  10. Case Study 2: Molecular Dynamics
      • Simulation of a drug entering into the binding site of a target protein
      Decherchi et al., Nature Comms. 6(6155), 2015. https://www.youtube.com/watch?v=ckTqh50r_2w
  11. From Molecules to Data
      [Figures: mol spreadsheets, visualizations; hybrid molecule/data visualization]
      StarDrop Glowing Molecules™ image from http://www.asteris-app.com/technical-info.htm
  12. Software Systems: Spotfire
      • Feature set / distinguishing factors:
        – Handling large datasets via filtering and memory management
        – Tabular file (CSV, Excel) or database input
        – Multiple, configurable visualization types
        – Easy enough for domain experts to use / share
        – Life science add-ons
          • Molecule depiction
          • Specialized –omics packages
      [Figures: binned pIC50 trellised by HBA and HBD; pIC50 vs. % inh]
  13. Software Systems: LiveDesign
      • Consolidate multiple disconnected tools for molecule design
        – Integrated single platform
        – Intuitive UI
        – 2D, 3D, data & visuals
        – Social aspect
  14. Dimensions, dimensions…
      • Molecules: 1D (SMILES, e.g. c1ccccc1), 2D (depiction), 3D (coords), 4D (motion)
      • Data:
        – 100s of activities, measured and predicted properties per row (compound)
        – ~100K for gene expression, clinical trial data
        – Millions for –omics, next-gen sequencing
        – Then there’s systems biology…
      • Dimensionality reduction is a key capability
        – PCA, SOM, Stochastic Proximity Embedding, …
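To make "dimensionality reduction" concrete, here is a minimal standard-library sketch of PCA, the first technique the slide lists, restricted to the first principal component and computed by power iteration. The function name, the descriptor columns and the compound values are all invented for illustration; in practice one would standardize the columns first and use a numerical library.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (unit direction, per-row scores) for the first principal component."""
    n, d = len(rows), len(rows[0])
    # Center each column so the component describes variance, not the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("no variance along the starting direction")
        v = [x / norm for x in w]
    # Project each centered row onto the component to get a 1D coordinate.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Invented descriptor rows (molecular weight, logP, H-bond donors); not from the slides.
compounds = [[180.2, 1.2, 1], [206.3, 3.5, 1], [151.2, 0.5, 2], [308.3, 2.7, 0]]
direction, scores = first_principal_component(compounds)
print([round(s, 1) for s in scores])
```

With unscaled columns like these, molecular weight dominates the variance, so the first component mostly tracks molecular weight. That is a reminder to scale descriptors before reducing them.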
  15. Challenges / Types of Visualization
      • Key capabilities for data visualization
        – Large data → human comprehension
        – High-level summary + drill-down
        – Quickly (auto?) isolate interesting data points
      • Types of visualization: 2D, 3D, nD, hierarchical (http://guides.library.duke.edu/datavis/vis_types)
      [Figure examples: map, SOM, parallel coords, heat map, protein, volume rendering, radar plot, box plot, sunburst, dendrogram, network/graph layout]
      [Image sources: http://flagshipbio.com/amino-acid-structure-properties-using-self-organizing-maps/, Wikipedia]
  16. All the Data at Once: Vlaaivis
      T. J. Howe, G. Mahieu, P. Marichal, T. Tabruyn and P. Vugts. Data reduction and representation in drug discovery. Drug Discovery Today 12(1/2):45-53, Jan 2007.
  17. All the Data at Once (cont’d): Radar Plots
      • Circular histogram for viewing multi-parameter results
      Paul D. Leeson & Stephen A. St-Gallay. The influence of the 'organizational factor' on compound quality in drug discovery. Nature Reviews Drug Discovery 10, 749-765 (October 2011).
      [Figure note: Property differences are scaled to either +1, whereby the company with a positive ('best') property value had the highest magnitude, or −1, whereby the company with the lowest ('worst') value had the highest magnitude.]
  18. Visualizing Large Datasets
      • Dimensionality reduction
      • Graph layout
      • Activity landscape
      • Probabilistic property plots
      • Scaffold abstraction
      [Figures: molecule cloud; network-like similarity graph (Bajorath et al.); probability of success (crossing cell membrane) plotted against molecular property 1 and molecular property 2 (Steven Muchmore, Abbott Labs, now AbbVie)]
      P. Ertl & B. Rohde, J. Cheminformatics 4(12), 2012. Gaspar et al., J. Chem. Inf. Model., 2015, 55 (1), pp 84–94.
  19. SAR Tables
      • SAR: Structure-Activity Relationship
        – Split molecule: core/scaffold, pendant R-groups
        – SAR Table: molecule spreadsheet with R-groups and activity data
      [Example R-groups shown: (-OH), (-COOH)]
  20. SAR Maps: R1 vs. R2 on a Core
      [Figure: R1 vs. R2 grid on a core “scaffold”; color scale is pIC50 2 − pIC50 1, running from selective for protein 1 to selective for protein 2]
      D. K. Agrafiotis et al. SAR Maps: A New SAR Visualization Technique for Medicinal Chemists. J. Med. Chem., 2007, 50 (24), 5926–5937.
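The grid layout of an SAR map can be sketched as a simple pivot. The following is a hedged illustration only, not the method of the cited paper: the function name `sar_map`, the record fields and every activity value are invented, and the cell value is the selectivity pIC50 2 − pIC50 1 named on the slide's color scale.

```python
def sar_map(records):
    """Arrange compounds in an R1 x R2 grid of selectivity (pIC50_2 - pIC50_1)."""
    r1_groups = sorted({r["r1"] for r in records})
    r2_groups = sorted({r["r2"] for r in records})
    cells = {(r["r1"], r["r2"]): r["pic50_2"] - r["pic50_1"] for r in records}
    # None marks R1/R2 combinations that are absent from the data set.
    grid = [[cells.get((r1, r2)) for r2 in r2_groups] for r1 in r1_groups]
    return r1_groups, r2_groups, grid

records = [  # invented values, for illustration only
    {"r1": "-OH", "r2": "-CH3", "pic50_1": 6.0, "pic50_2": 7.5},
    {"r1": "-OH", "r2": "-Cl", "pic50_1": 7.0, "pic50_2": 6.5},
    {"r1": "-COOH", "r2": "-CH3", "pic50_1": 5.5, "pic50_2": 5.5},
]

rows, cols, grid = sar_map(records)
for r1, row in zip(rows, grid):
    print(r1, row)
```

Positive cells indicate selectivity for protein 2 and negative cells selectivity for protein 1. Empty cells show which R-group combinations have not yet been explored.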
  21. Clustering
      • Based on chemical descriptors, biological activity, etc…
      • Agglomerative or hierarchical
      [Figures: molecules (http://chemmine.ucr.edu/help/); genes]
      Hoek, Keith S. et al.: Metastatic potential of melanomas defined by specific gene expression profiles with no BRAF signature. Pigment Cell Research 19 (4), 290-302.
  22. Limitations of Clustering
      • Molecule → single cluster, can be limiting
      • Similar molecules ≠ same cluster (e.g. Cluster 3 vs. Cluster 10)
      • Many singletons
      [Example: seals (fur)? singleton? ducks (bill)? penguins (flipper)?]
      [Plot: complete-link clustering, cluster ID vs. cluster size]
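Both limitations can be reproduced on a toy example. The sketch below implements complete-link agglomerative clustering with Tanimoto similarity over feature sets; the feature sets stand in for molecular fingerprints and are invented, as are the function names and the 0.5 threshold.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def complete_link_clusters(features, threshold):
    """Merge clusters while the least similar cross-cluster pair meets the threshold."""
    clusters = [[name] for name in features]
    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster similarity is the minimum pairwise similarity.
                sim = min(tanimoto(features[x], features[y])
                          for x in clusters[i] for y in clusters[j])
                if sim >= threshold and (best is None or sim > best[0]):
                    best = (sim, i, j)
        if best is None:
            return clusters
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

# Invented feature sets standing in for fingerprints of molecules A-D.
features = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 5}, "C": {1, 2, 5, 6}, "D": {7, 8, 9}}
print(complete_link_clusters(features, threshold=0.5))
# → [['A', 'B'], ['C'], ['D']]
```

Molecule C is as similar to B (0.6) as A is, yet it ends up as a singleton because it is too dissimilar to A. This is the "similar molecules ≠ same cluster" effect, and it motivates assigning molecules to several overlapping groups instead.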
  23. Automatic Decomposition into (All) Overlapping Scaffolds
      [Table columns: Molecule | Scaffold(s) | Related Molecules]
      [Example: malarial parasite assay, pIC50 8.1; related-molecule counts per scaffold: … 49 total, … 226 total, 2 total]
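  24. Next Step: Combine with Activities and Properties
      [Table columns: Molecule | Scaffold(s) | Annotation | Related Molecules]
      [Annotations shown: 8.2; Avg pIC50 8.15; Avg pIC50 7.8; Avg pIC50 7.8]
      [Related-molecule counts per scaffold: … 49 total, … 226 total, 2 total; pIC50 values shown: 8.5, 8.2, 8.0, 7.5, 7.7, 8.5, 7.4, 7.9, 7.7, 8.2]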
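The annotation step can be sketched as a group-by in which a molecule contributes to every scaffold it contains. This is an assumption-laden illustration: the scaffold assignments are taken as given (the decomposition itself needs a cheminformatics toolkit and is not shown), and the identifiers and pIC50 values are invented.

```python
from collections import defaultdict
from statistics import mean

def scaffold_summary(molecules):
    """Average pIC50 and member count for every scaffold, allowing overlaps."""
    by_scaffold = defaultdict(list)
    for mol in molecules:
        # A molecule may contain several scaffolds and is counted under each one.
        for scaffold in mol["scaffolds"]:
            by_scaffold[scaffold].append(mol["pic50"])
    return {s: {"count": len(v), "avg_pic50": round(mean(v), 2)}
            for s, v in by_scaffold.items()}

molecules = [  # hypothetical IDs and activities, for illustration only
    {"id": "m1", "scaffolds": ["S1", "S2"], "pic50": 8.5},
    {"id": "m2", "scaffolds": ["S1"], "pic50": 8.0},
    {"id": "m3", "scaffolds": ["S2"], "pic50": 7.5},
]

for scaffold, stats in scaffold_summary(molecules).items():
    print(scaffold, stats)
```

Unlike a clustering, the counts here can sum to more than the number of molecules, because membership overlaps by design.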
  25. Case Study: Linking Molecules By Scaffolds
      • Use aggregate properties for decision making
      • Find related molecules with improved properties
      [Example 1: aggregate (scaffold) view, then drill down (8 molecules); axes: improving property 1, improving activity 2]
      [Example 2: keep top half of molecule, substitute bottom half; axes: improving activity 3, improving property 4]
  26. Summary and Lessons Learned
      • Drug discovery has specialized types of data that are best understood by visualization
      • Good visualizations can support the making of good decisions (and the converse: GIGO…)
      • The human element is important: visuals and analytics should be creatable/usable by scientists
      • As new visual analytics experts, consider careers in an industry where you can add value and be creative
        – Subtle plug for drug discovery
  27. Future Directions and Challenges in Data Visualization for Drug Discovery
      • Human vs. Machine, or Human + Machine?
      • Automate the tedious parts of data prep/integration
      • Intuitiveness by design
      • Interconnection by design
      • Integration of the latest visualization techniques developed for other domains
      • Using emerging media, e.g. VR, Kinect
      • What can you think of?
  28. Questions?
