This document discusses protein-ligand docking and its applications in drug discovery. It notes that drug discovery is a lengthy and costly process with high attrition rates. Protein-ligand docking can be used to aid rational drug design by predicting how drug molecules may interact with protein targets. The document outlines different types of docking problems in biology and summarizes concepts and challenges in protein-ligand docking, including accounting for flexibility. It provides examples of how docking has been used to design new protease inhibitors and for virtual screening in drug development pipelines.
2. Drug discovery is a lengthy, cost-intensive process with a high attrition rate.
Nat Rev Drug Discov 2010, 9 (3), 203-1
4. A.L. Hopkins Nat. Chem. Biol. 2008 4:682-690
Single gene knockouts only affect phenotype in 10-20% of cases
35% of biologically active compounds bind to two or more targets that do not have similar sequences or global shapes
Paolini et al. Nat. Biotechnol. 2006 24:805–815
Kaiser et al. Nature 462 (2009) 175-81
Motivators
Predict side effects
Repurpose drugs
6. [Figure: yearly release statistics. Panels: number of entries and total number of polymer chains released per year; average molecular weight released per year. x-axis: year.]
7. [Figure: traditional approach vs. systems-based approach to a physiological process. Labels: understanding of dynamics and kinetics of protein-ligand interactions; prediction of molecular interaction network on a large/genome scale; reconstruction, analysis and simulation of biological networks; knowledge representation and discovery & model integration.]
Motivators
10. 08feb10
Why dock? Enzymology, Drug design
Interested in designing a new protease inhibitor?
Common approach: Begin by determining the 3D crystal structures of protein•drug complexes, thereby guiding further rounds of “rational” structure-based drug design (SBDD)
11. Consider the cellular interior… the cytosol…
Proteins live in a densely crowded environment of molecular
interactions (not in isolation, not at infinite dilution…)
McGuffee & Elcock, PLoS Comput Biol (2010)
12. Consider the cellular interior… the cytosol…
These molecular contacts occur across many length-scales…
[Figure panels: the cytoplasm; an antibiotic bound to the ribosome; an anti-cancer drug bound to ABL kinase]
Mura & McAnany (2014), Molecular Simulation
13. P•L complexes: Intermolecular interactions
P should bind L, but (maybe?) not too tightly: too low a ΔG ==> extremely tight binding [low koff]
The usual types of (noncovalent) interactions defining a P•L complex:
1) apolar (vdW / Lennard-Jones / dipole•••induced dipole, etc.)
2) polar (H-bonds, dipole•••dipole)
3) electrostatic (Coulombic)
Recurring theme (& basis of docking!)
Biomolecular interactions exhibit extremely high levels of geometric/steric & chemical (e.g., δ−D–Hδ+····Aδ−) complementarity
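As a rough illustration of the apolar and electrostatic terms listed above, the sketch below sums a 12-6 Lennard-Jones term and a Coulomb term over protein-ligand atom pairs. The atom dictionary layout, the mixing rules, the unit system (kcal/mol, Å, elementary charges) and the dielectric value are all assumptions made for the example; real docking scoring functions differ in detail.

```python
import math

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2); assumed unit system.
COULOMB_CONSTANT = 332.06


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy; zero at r = sigma, minimum -epsilon at 2**(1/6)*sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def coulomb(r, q1, q2, dielectric=1.0):
    """Coulomb energy between two point charges separated by r."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def interaction_energy(protein_atoms, ligand_atoms, dielectric=4.0):
    """Sum LJ + Coulomb over all protein-ligand atom pairs.

    Each atom is a dict with keys 'xyz', 'charge', 'epsilon', 'sigma'
    (a hypothetical layout chosen for this sketch).
    """
    total = 0.0
    for p in protein_atoms:
        for lig in ligand_atoms:
            r = math.dist(p["xyz"], lig["xyz"])
            epsilon = math.sqrt(p["epsilon"] * lig["epsilon"])  # geometric mean
            sigma = 0.5 * (p["sigma"] + lig["sigma"])           # arithmetic mean
            total += lennard_jones(r, epsilon, sigma)
            total += coulomb(r, p["charge"], lig["charge"], dielectric)
    return total
```

A more negative total corresponds to a more favorable pose under this toy model; it ignores solvation, entropy and explicit H-bond geometry.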
14. Why dock in silico, in general?
Prediction of bio-molecular interactions? (A test of our basic
molecular/physicochemical models and theories...)
Computer aided analysis saves resources (time, $$)
Measuring the relative strengths of interactions in some milieu
of potentially interacting proteins…
Automated prediction of molecular interactions to aid in
rational drug design (SBDD, CADD pipelines) ?
Drug design: Virtual Screening (VS)
Drug molecule database growth
(A whole new area is opening up now, with Deep Learning)
15. Protein•Ligand Docking, in context
Molecular recognition is a (the!) central
phenomenon in biology
• Enzymes ↔ Substrates
• Receptors ↔ Signal inducing ligands
• Antibodies ↔ Antigens
Classifying docking problems in biology
• Protein•Ligand docking
– Rigid docking
– Flexible docking
• Protein•protein docking
• Protein•{DNA, RNA} docking
• {DNA, RNA}•ligand docking
Ligand•Protein Docking
• Proteins ↔ Drugs (SBDD/CADD)
• Proteins ↔ Natural small-molecule substrates (metabolomics, lipidomics)
OPTIONAL
slide?
16. The Molecular Docking Problem
Given two molecules with 3D conformations at
atomic resolution:
• Do the molecules bind to each other? …If yes,
• What does the inter-molecular complex look like (3D) ?
• What is the binding affinity (Kd)?
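The third question connects to thermodynamics through the standard relation ΔG° = RT ln(Kd / c°), with c° = 1 M. A minimal sketch of that conversion follows; the function names are hypothetical, and the units (kcal/mol, Kd in mol/L) and default temperature are assumptions.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def kd_to_delta_g(kd_molar, temperature=298.15):
    """Standard binding free energy (kcal/mol) from Kd in mol/L (1 M standard state)."""
    return GAS_CONSTANT * temperature * math.log(kd_molar)


def delta_g_to_kd(delta_g, temperature=298.15):
    """Inverse conversion: Kd in mol/L from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (GAS_CONSTANT * temperature))


if __name__ == "__main__":
    for kd in (1e-3, 1e-6, 1e-9):
        print(f"Kd = {kd:.0e} M -> dG = {kd_to_delta_g(kd):.2f} kcal/mol")
```

Each factor of ten in Kd changes ΔG° by roughly 1.4 kcal/mol at room temperature, which is why small scoring errors translate into large affinity errors.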
Structures of protein-ligand complexes
• X-ray (PDB: ~80,000 entries from X-ray crystallography, cryo-EM,
and possibly other diffraction-based methods, as well as NMR)
• NMR structure bundles
Importance of protein 3D structures
• Resolution < 2.5Å
• Homology models (as receptor model) can be problematic
Very OPTIONAL slide?
18. 17sep10
Generation of Cavity Model
[Figure panels: X-ray structure of HIV protease; molecular surface model at active site]
Active site filled with spheres. Sphere centers become potential locations for
ligand atoms.
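One simple way to use sphere centers as candidate atom positions is to ask what fraction of a ligand pose's atoms sit close to some sphere center. The sketch below does only that; it is not the matching algorithm of any particular docking program, and the function name, coordinate format and 1.0 Å tolerance are assumptions.

```python
import math


def sphere_match_score(ligand_atoms, sphere_centers, tolerance=1.0):
    """Fraction of ligand atoms lying within `tolerance` of any sphere center.

    ligand_atoms and sphere_centers are sequences of (x, y, z) tuples in Angstroms.
    """
    matched = 0
    for atom in ligand_atoms:
        nearest = min(math.dist(atom, center) for center in sphere_centers)
        if nearest <= tolerance:
            matched += 1
    return matched / len(ligand_atoms)


if __name__ == "__main__":
    centers = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    pose = [(0.1, 0.0, 0.0), (2.0, 0.5, 0.0), (5.0, 5.0, 5.0)]
    print(sphere_match_score(pose, centers))
```

A pose generator could rank candidate orientations by this fraction before applying a more expensive energy-based score.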
Very OPTIONAL slide?
20. Flexibility—an added complication(/opportunity)
At T > 0, molecules are not static entities… They move!
Mura & McAnany (2014), Molecular Simulation
E_system = U + K (potential + kinetic energy)
Simple idea, simple functional forms… to achieve a balance between accuracy and computational simplicity
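To make "simple functional forms" concrete, the sketch below evaluates a total energy as a harmonic bond potential plus the classical kinetic energy. The harmonic form is one common choice in molecular mechanics, used here as an assumption; the function names and the unit-free numbers are illustrative only.

```python
def harmonic_bond_energy(r, r0, k):
    """Harmonic bond-stretch term: U = 1/2 * k * (r - r0)**2."""
    return 0.5 * k * (r - r0) ** 2


def kinetic_energy(masses, velocities):
    """K = sum over particles of 1/2 * m * |v|**2."""
    return sum(0.5 * m * sum(v * v for v in vel) for m, vel in zip(masses, velocities))


def total_energy(bond_lengths, bond_params, masses, velocities):
    """E_system = U + K, with U restricted to bond terms in this sketch.

    bond_params holds one (r0, k) pair per bond length.
    """
    potential = sum(
        harmonic_bond_energy(r, r0, k) for r, (r0, k) in zip(bond_lengths, bond_params)
    )
    return potential + kinetic_energy(masses, velocities)
```

A full force field would add angle, torsion and nonbonded terms to U, but each term keeps a similarly cheap closed form.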
21. Can docking account for dynamics?—Yes…
In many approaches, both clever and brute-force…
• Lifting the rigidly frozen receptor constraint (ligand almost always treated as flexible, for years now); allowing receptor DoFs to be sampled (at least at level of sampling over side-chain rotamers)
• A "relaxed complex" family of approaches, pioneered by McCammon & colleagues
• Ensemble-based methods (e.g., biased MD to generate the docking ensemble)
• Simulate the receptor∙∙∙drug binding process, in physical terms! (Recent, less brute-force approaches to this cleverly combine BD and MD.)
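The ensemble-based idea can be sketched as: dock each ligand against every receptor conformer and keep the best score. In the sketch below, `score_pose` stands in for whatever docking engine is used (it is a hypothetical callable, not a real API), and lower scores are assumed to be better.

```python
def ensemble_dock(ligands, receptor_conformers, score_pose):
    """Return {ligand: (best_score, best_conformer)} over a receptor ensemble.

    score_pose(conformer, ligand) is a user-supplied scoring callable
    (hypothetical here); lower scores are treated as more favorable.
    """
    results = {}
    for ligand in ligands:
        scored = [(score_pose(conf, ligand), conf) for conf in receptor_conformers]
        results[ligand] = min(scored, key=lambda pair: pair[0])
    return results


if __name__ == "__main__":
    # Toy score table standing in for real docking runs.
    table = {
        ("snapshot_1", "lig_a"): -6.1,
        ("snapshot_2", "lig_a"): -8.3,
        ("snapshot_1", "lig_b"): -7.0,
        ("snapshot_2", "lig_b"): -5.2,
    }
    best = ensemble_dock(
        ["lig_a", "lig_b"],
        ["snapshot_1", "snapshot_2"],
        lambda conf, lig: table[(conf, lig)],
    )
    print(best)
```

Taking the minimum is only one way to combine ensemble scores; averaging or Boltzmann weighting are alternatives with different bias toward rare conformers.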
27. Integrating chemical genomics and structural systems biology
[Figure: workflow. Labels: query chemical; 3D model of annotated target; 3D model of novel target; SMAP; initial interaction model; protein-ligand docking; MD simulation; refined interaction model; network modeling; experimental support; Generalized Network Enrichment of Structure-Activity Relationships. Node labels: Mi, Mj, Q.]
Xie & Bourne 2008 PNAS 105(14):5441-6
Xie et al 2012 Ann Rev Pharm & Tox 52:361-79
Xie et al 2016 Ann Rev Pharm & Tox in press
28. Similar binding sites may bind similar ligands
A 3D object recognition problem
• Globally different, but locally similar
• Dynamic
• Scalable
SMAP – Determining Binding Site Similarity
Across Protein Space
29. Why? Large search space
Challenge: inherent flexibility and errors in predicted structures
Representation of the protein structure
- Cα atoms only
- Delaunay tessellation
- Graph representation
Geometric Potential (GP)
$$GP = P + \sum_{i \in \text{neighbors}} \frac{P_i}{D_i + 1.0} \times \frac{\cos(\alpha_i) + 1.0}{2.0}$$
[Figure: Geometric Potential Scale (0 to 100). Plot: geometric potential (x-axis, 0 to 99) for binding site vs. non-binding site.]
Algorithm
Xie & Bourne 2007 BMC Bioinformatics 8(Suppl 4):S9
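The geometric potential on this slide appears to have the form GP = P + Σ over neighbors of P_i / (D_i + 1.0) × (cos(α_i) + 1.0) / 2.0. The sketch below evaluates that expression for one Cα atom. Several details are assumptions made only for illustration: that D_i is the distance to neighbor i, that α_i is the angle between a per-residue direction vector and the vector pointing to neighbor i, and that neighbors are selected with a 10 Å cutoff.

```python
import math


def geometric_potential(index, positions, directions, base_potential, cutoff=10.0):
    """Evaluate GP for the atom at `index`.

    positions: list of (x, y, z) C-alpha coordinates.
    directions: per-atom reference direction vectors (assumed definition).
    base_potential: per-atom initial value P.
    cutoff: neighbor distance cutoff in Angstroms (assumed value).
    """
    center = positions[index]
    direction = directions[index]
    direction_norm = math.sqrt(sum(d * d for d in direction))
    total = base_potential[index]
    for i, position in enumerate(positions):
        if i == index:
            continue
        offset = [p - c for p, c in zip(position, center)]
        distance = math.sqrt(sum(o * o for o in offset))
        if distance == 0.0 or distance > cutoff:
            continue
        cos_alpha = sum(o * d for o, d in zip(offset, direction)) / (distance * direction_norm)
        total += base_potential[i] / (distance + 1.0) * (cos_alpha + 1.0) / 2.0
    return total
```

Under this reading, neighbors lying along the reference direction contribute fully, and neighbors directly behind it contribute nothing.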
30. SMAP - Sequence-order Independent Profile-Profile Alignment (SOIPPA)
[Figure: alignment of residues from Structure A and Structure B (example residues L E R and V K D L), with candidate alignments scored S = 8 and S = 4.]
Algorithm
Xie & Bourne 2008 PNAS 105(14):5441-6
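To illustrate what "sequence-order independent" means, the sketch below pairs residues from two small sites in whatever order maximizes a total similarity score, by brute force over assignments. This is not the SOIPPA algorithm itself; the function name, the identity-based scoring and the example residues are hypothetical, and the exhaustive search is only practical for very small sites.

```python
from itertools import permutations


def best_order_independent_match(site_a, site_b, score):
    """Best one-to-one pairing of site_a residues onto site_b, ignoring sequence order.

    Assumes len(site_a) <= len(site_b). score(a, b) is a user-supplied
    similarity function (hypothetical here). Returns (total_score, pairs),
    where pairs is a list of (index_in_a, index_in_b).
    """
    best_total = float("-inf")
    best_pairs = []
    for perm in permutations(range(len(site_b)), len(site_a)):
        total = sum(score(site_a[i], site_b[j]) for i, j in enumerate(perm))
        if total > best_total:
            best_total = total
            best_pairs = list(enumerate(perm))
    return best_total, best_pairs


if __name__ == "__main__":
    def identity(a, b):
        return 1 if a == b else 0

    print(best_order_independent_match("LER", "RVLE", identity))
```

A sequence-order dependent aligner could not pair L, E and R here, because they occur in a different order in the second site.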
31. [Figure: two plots of true positive ratio vs. false positive ratio comparing PSI-Blast, CE, and SOIPPA. Panels: proteins with the same global shape; proteins with different global shape.]
Xie & Bourne, PNAS, 105(2008):5441
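Curves of this kind can be built by ranking hits by score and sweeping a threshold down the list. A minimal sketch follows; the function name and toy data are assumptions, ties between scores are not treated specially, and at least one positive and one negative label are required.

```python
def roc_points(scores, labels):
    """Return [(false_positive_ratio, true_positive_ratio), ...] for a ranked list.

    scores: higher means more confident; labels: 1 for true match, 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = false_pos = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


if __name__ == "__main__":
    print(roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

Comparing methods at low false positive ratios, as the plots do, focuses on the top of the ranked list, which is what matters in database searches.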
32. • Tykerb – Breast cancer
• Gleevec – Leukemia, GI cancers
• Nexavar – Kidney and liver cancer
• Staurosporine – natural product (alkaloid) with many uses, e.g., antifungal, antihypertensive
Collins and Workman 2006 Nature Chemical Biology 2 689-700
Motivators
36. Drug repurposing to target Ebola virus SSP Pipeline
Z Zhao, L Xie, P. E. Bourne et al. BMC Bioinform.
[Figure: pipeline. Labels: data collection (literature; chemical space); proteome; compound lib; 3D structure; docking; MD simulation; similarity; profiling; druggability; drug-likeness; target(s); interaction network(s); candidate compound(s).]
40. [Figure: workflow. Labels: proteome; drug binding site alignments (SMAP); predicted drug targets; drug and endogenous substrate binding site analysis; competitively inhibitable targets; inhibition simulations in context-specific model (COBRA Toolbox); predicted causal targets and genetic risk factors; metabolic network; scientific literature; tissue and biofluid localization data; gene expression data; physiological objectives; system exchange constraints (influx, efflux); flux states optimizing objective; physiological context-specific model; drug response phenotypes; drug targets; causal drug targets; all targets; 336 genes; 1587 reactions.]
Chang et al PLOS Comp. Biol. 2010 6(9): e1000938
41. Lei Xie (CUNY CS): methodological developments
J Guler & J Papin labs (UVa): new anti-malarial drug design project, using a systems-bio approach
Bourne lab (Eli Draizen, Daniel Mietchen, Stella Veretnik): discussions and scientific feedback
Others? (Zheng??)
Include an Acknowledgements slide……?…
42. Heart disease (cardiac systems biology; Saucerman)
Idea 1 ?...
Idea 2 ?...
Communicable diseases (systems biology of infectious disease and microbial networks; Papin)
Anti-malarial drug-design collaboration w/ Guler and Papin
Cancers/neoplasms (signal transduction networks; Janes)
Maybe…?
Others?
Add this Slide/idea as way to open doors to potential future collaborations.... ? Could be useful to end on a slide that’s something like this.