Jack Tuszynski Accelerating Chemotherapy Drug Discovery with Analytics and High Performance Computing
1. Jack Tuszynski
Cross Cancer Institute
Department of Physics
University of Alberta
Edmonton, Canada
http://www.phys.ualberta.ca/~jtus
“Accelerating Chemotherapy
Drug Discovery with High
Performance Computing and
Analytics”
3. Modern Drug Development
Success Rate 1:100,000!
[Figure: development timeline, 0 to 16 years: Discovery, Preclinical testing, Phase I, Phase II, Phase III, Approval, Post market. Compound counts along the funnel: 100,000, 100, 5, 1. Axis: time in years. Cost: $1B.]
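The attrition figures on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the four compound counts from the funnel (100,000, 100, 5, 1) in order and does not assume which pipeline stage each count belongs to, since that mapping is not recoverable from the slide text. The function names are hypothetical.

```python
from fractions import Fraction

# Compound counts read off the slide's funnel, in order. Which stage each
# count belongs to is not stated here, so they are treated as checkpoints.
counts = [100_000, 100, 5, 1]

def stepwise_survival(counts):
    """Fraction of compounds surviving each successive step of the funnel."""
    return [Fraction(after, before) for before, after in zip(counts, counts[1:])]

def overall_survival(counts):
    """Fraction of the starting compounds that reach the end of the funnel."""
    return Fraction(counts[-1], counts[0])

if __name__ == "__main__":
    for before, after, frac in zip(counts, counts[1:], stepwise_survival(counts)):
        print(f"{before:>7,} -> {after:>3,}: {float(frac):.3%} survive")
    print(f"overall: 1 in {1 / overall_survival(counts)}")
```

Exact fractions are used so that the overall rate reproduces the slide's 1:100,000 without floating-point rounding.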
4. The Evolution in Drug Design and Development
Identify disease → Isolate protein → Find drug → Preclinical testing
• GENOMICS, PROTEOMICS & BIOPHARM.: Potentially producing many more targets and “personalized” targets
• HIGH THROUGHPUT SCREENING: Screening up to 100,000 compounds a day for activity against a target protein
• VIRTUAL SCREENING: Using a computer to predict activity
• COMBINATORIAL CHEMISTRY: Rapidly producing vast numbers of compounds
• MOLECULAR MODELING: Computer graphics & models help improve activity
• IN VITRO & IN SILICO ADME MODELS: Tissue and computer models begin to replace animal testing
5. Integration of biological data impacts drug development
• information stored in the genetic code (DNA)
• protein sequences
• 3D structures of biomolecules
• experimental results from various sources (Kd, IC50, expression)
• clinical data
• patient statistics
• scientific literature
6. The Push to Systems Biology
…and leads to computational explosion
An avalanche of data: sequences, functional relations, structures
This requires computational approaches
• 100s of completed genomes
• 1000s of known reactions
• 10,000s of known 3D structures
• 100,000s of protein-ligand interactions
• 1,000,000s of known proteins & enzymes
• Decades of biological/chemical know-how
• Computational & mathematical resources
7. Key areas of bioinformatics
organisation of knowledge (sequences, structures, functional data), e.g. homology searches
8. Specifically for drug discovery:
PDB: 50,000 proteins + homologs
1500 targets (human proteins)
Approx. 400 (80 in cancer) utilized
Orange Book: 1800 medicinal drugs
DrugBank: 4900 drugs
Cancer chemotherapy drugs: 103
Protein-drug interactions but also
Protein-protein interactions
9. Molecular Targets: Cancer Cell Network
A very complex but algorithmic system
Based on a lock-and-key principle
We will find keys to all these locks by 2061
10. CANCER CHEMOTHERAPY DRUGS
Approximately 100 standard chemotherapeutic drugs:
1) Alkylating agents: Genotoxic (20-25)
2) Plant alkaloids: Inhibition of mitosis (10-15)
3) Antimetabolites: Inhibition of base synthesis (15-20)
4) Antibiotics: Derived from Streptomyces (10-15)
5) Targeted antibodies: Bind cell surface receptors (5-10)
6) Hormones: Inhibit or stimulate hormone signaling (15-20)
7) Directly targeting small molecules
8) Other indirect effects: Angiogenesis or immune modulators (10-15)
Number of current chemotherapy targets: 10^1
Number of chemotherapy drugs: 10^2
Potential Targets (Pharmacogenomics): 10^3
Paclitaxel
Cisplatin
Methotrexate
Trastuzumab
Imatinib
Tamoxifen
Doxorubicin
Bevacizumab
11. Cancer chemotherapy is based on cell cycle arrest
[Figure: cell-cycle diagram (G0, G1, S, G2, M) placing ~80 drugs and drug candidates next to their molecular targets. The target-to-compound pairing is not recoverable from the extracted labels, which are listed below.]
Targets shown: tyrosine kinases; DNA synthesis; topoisomerase I; topoisomerase II; CDK1; CDK2; CDK4; Chk1; Chk2; tubulin polymerisation/depolymerisation; actin; kinesin Eg5; AhR; ATM/ATR; ROCK; CDC25; HMGA; Plk1; Aurora; ODC/SAMDC; Pin1; GSK-3; Cdc7; nucleotide excision repair; Raf; mTOR/FRAP; proteasome; PKC; histone deacetylase; HSP90; cytosolic phospholipase A2; phospholipase D; choline kinase; MEK1/Erk-1/2; farnesyl transferase; phosphatases; Wee1; p53/MDM2.
Compounds shown: Vinca alkaloids*; taxol/taxotere; halichondrin*; spongistatin*; rhizoxin*; cryptophycin; sarcodictyin; eleutherobin; epothilones; discodermolide; D-24851 ?; dolastatin*; combretastatin*; camptothecin; flavopiridol; (R)-roscovitine (CYC202); paullones; indirubins; gleevec; iressa; OSI774; hydroxyurea; cytarabine; antifolates; 5-fluorouracil; 6-mercaptopurine; nitrogen mustards; nitrosoureas; mitomycin C; UCN-01; SB-218078; debromohymenialdisine; isogranulatimide; monastrol; ecteinascidin 743; podophyllotoxin; doxorubicin; etoposide; mitoxantrone; R115777; SCH66336; Y-27632; DF203; FK317; wortmannin; caffeine; cytochalasins; latrunculin A; scytophycins; dolastatin 11; jasplakinolide; BAY-43-9006; fumagillin; TNP-470; PRIMA-1; pifithrin-α; rapamycin; PS-341; bryostatin; PKC412; trichostatin; FK228; geldanamycin; 17-AAG; ATK; MAFP; hexadecylphosphocholine; CT-2584; PD98059; U0126; menadione (K3); okadaic acid; fostriecin; calyculin A; PD0166285; polyamine analogues.
Source: Cell cycle laboratory, L. Meijer, Roscoff, France
12. CAUSES OF FAILURE IN DRUG DEVELOPMENT
ADME
ANIMAL TOXICITY
LACK OF EFFICACY
ADVERSE EFFECTS IN HUMANS
More than 50% of this failure can be predicted computationally in 2011
In 2061: six sigma will be achieved in silico
13. WET LAB: High-throughput screening (HTS)
• Experimental technique
• 384-well microplates, fluorescence-based detection & desktop robots
• Up to 1M compounds per target
DRY LAB: Virtual screening (VS)
• Ligand-based methods
– 2D structures, substructures, fingerprints
– Volume/surface matching
– 3D pharmacophores, fingerprints
• Receptor-based methods
– Docking
• Even 100B compounds per target tried
• Receptor flexibility
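As an illustration of the ligand-based, fingerprint route listed above, here is a minimal sketch of similarity screening with the Tanimoto coefficient. It assumes each fingerprint is already available as a set of "on" bit indices; the compound names and bit sets are invented for the example, and the 0.5 threshold is an arbitrary choice, not a value from the talk.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:          # two empty fingerprints: define similarity as 0
        return 0.0
    return len(a & b) / len(union)

def screen(query, library, threshold=0.5):
    """Return (name, score) pairs scoring at least `threshold`, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical toy data: bit indices stand in for substructure features.
query = {1, 2, 3, 4}
library = {
    "cmpd_A": {1, 2, 3, 4, 5},
    "cmpd_B": {1, 2, 7, 8},
    "cmpd_C": {3, 4, 9},
}

if __name__ == "__main__":
    for name, score in screen(query, library):
        print(f"{name}: {score:.2f}")
```

In a real screen the fingerprints would come from a cheminformatics toolkit, and the same comparison would be run over millions of library entries.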
16. Molecular Dynamics
• Treats molecules classically:
– Point charges and masses
– Spring-like bonds
– Numerical integration of equations of motion
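The three ingredients above can be sketched in a toy setting: two point masses joined by one spring-like (harmonic) bond in one dimension, advanced with the velocity Verlet integrator. This is a minimal sketch, not the scheme used in the talk: the choice of integrator, the reduced units, and all parameter values are assumptions, and charges are left out to keep the example short.

```python
def bond_forces(x, k, r0):
    """Forces on the two particles from the bond U = 0.5 * k * (r - r0)**2, r = x[1] - x[0]."""
    stretch = (x[1] - x[0]) - r0
    f = k * stretch            # a stretched bond pulls particle 0 toward +x
    return [f, -f]

def total_energy(x, v, m, k, r0):
    """Kinetic plus bond potential energy."""
    kinetic = sum(0.5 * mi * vi * vi for mi, vi in zip(m, v))
    potential = 0.5 * k * ((x[1] - x[0]) - r0) ** 2
    return kinetic + potential

def velocity_verlet(x, v, m, k, r0, dt, n_steps):
    """Integrate Newton's equations with the velocity Verlet scheme."""
    x, v = list(x), list(v)
    f = bond_forces(x, k, r0)
    for _ in range(n_steps):
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]   # half kick
        x = [xi + dt * vi for xi, vi in zip(x, v)]                     # drift
        f = bond_forces(x, k, r0)                                      # new forces
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]   # half kick
    return x, v

if __name__ == "__main__":
    m, k, r0 = [1.0, 1.0], 100.0, 1.0          # arbitrary reduced units
    x0, v0 = [0.0, 1.2], [0.0, 0.0]            # bond starts stretched by 0.2
    e0 = total_energy(x0, v0, m, k, r0)
    x1, v1 = velocity_verlet(x0, v0, m, k, r0, dt=0.001, n_steps=1000)
    print("energy drift:", total_energy(x1, v1, m, k, r0) - e0)
```

A useful sanity check for any such integrator is that total energy and total momentum stay nearly constant over the run.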
17. Drug binding sites in tubulin
Of the more than 100 approved cancer chemotherapy drugs on the market, approximately 15% target tubulin directly.
None are specific for cancer cells, hence the associated side effects.
18. Drug Action: Inhibition of Protein-Protein Interactions
[Figure labels: Drug / Ligand, Protein, Cavity]
19. The computational toolbox
The three-fold way:
• rational design and in silico testing of derivatives of known agents
• brute-force computational search using existing libraries (pharma-matrix)
• de novo design from common pharmacophores for best space-filling properties
A pocketome data bank
Reverse docking makes it possible to predict side effects
21. Contents
• Compound data sources (PubChem, ZINC, NCI, SciFinder; ~65M compounds)
• Drug data sources (DrugBank, Orange Book, CMC, WDI, MDDR; ~250k drugs)
• Molecular data toolkits (OpenEye, Open Babel)
• Computational methods (MM, MD, QM/MM)
• Molecule file formats (PDB, SMILES)
• Docking (AutoDock, DOCK); parallel (DOVIS)
22. Pharma-matrix apps: eRx
• 100 million targets (100,000 proteins x 100 pockets x 10 mutants): pocketome
• 100 billion chemical compounds
• 10^19 potential interactions (filtering)
• Hand-in-glove match by brute computational screening
• pharmagoogle
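The size of the pharma-matrix follows directly from the numbers on this slide, and a quick calculation shows why filtering is needed before any exhaustive screen. In the sketch below the counts come from the slide; the evaluation rate passed to `years_needed` is a purely hypothetical figure chosen for illustration, and the names are invented.

```python
# Counts taken from the slide.
PROTEINS = 100_000
POCKETS_PER_PROTEIN = 100
MUTANTS_PER_POCKET = 10
COMPOUNDS = 100_000_000_000          # 100 billion

TARGETS = PROTEINS * POCKETS_PER_PROTEIN * MUTANTS_PER_POCKET   # the "pocketome"
INTERACTIONS = TARGETS * COMPOUNDS                              # pharma-matrix size

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_needed(n_pairs, pairs_per_second):
    """Wall-clock years to evaluate n_pairs at a given (assumed) throughput."""
    return n_pairs / pairs_per_second / SECONDS_PER_YEAR

if __name__ == "__main__":
    print(f"targets:      {TARGETS:.0e}")
    print(f"interactions: {INTERACTIONS:.0e}")
    # Hypothetical throughput: one million target-compound evaluations per second.
    print(f"years at 1e6 pairs/s: {years_needed(INTERACTIONS, 1e6):,.0f}")
```

Even at an optimistic assumed throughput, the full matrix is far out of reach by brute force alone, which is why the slide pairs the 10^19 figure with filtering.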
25. Personalized eDx and eRx
In a few decades a personal genome will cost $10 and will be our ID at birth, included in our eRx app.
26. The Virtual Human: Multi-Scale Modeling
[Figure labels: drug molecules, interaction matrix, hepatocyte, lobule, liver, whole body]
Editor's notes
Let’s take a small part of the picture here. There are many areas of research in this picture. In this presentation I will focus on these two subjects, which are the main interests of our research group.
The problem is even worse. Proteins may undergo post-translational modifications, associations with other molecules or prosthetic groups, and formation of multimeric complexes. In medicine, a prosthesis (plural prostheses) is an artificial extension that replaces a missing body part. It is part of the field of biomechatronics, the science of fusing mechanical devices with human muscle, skeleton, and nervous systems to assist or enhance motor control lost by trauma, disease, or defect.
Here is a close-up view of the secondary structure of tubulin as it is observed in a protofilament. Several drug binding sites have been characterized crystallographically. Shown as VDW spheres are three of the best-characterized tubulin-binding drugs: Taxol, vinblastine and colchicine. There are currently about 100 approved chemotherapy drugs on the market, several of which are monoclonal antibodies. Of these drugs, about 15% specifically target tubulin in some way, reducing or increasing its dynamicity. Unfortunately, none of the tubulin-binding chemotherapy drugs are specific for cancer cells. I will focus on colchicine for this talk.
When the structure of the target protein is known, the drug discovery process usually follows a well-established procedure, shown schematically in Figure 1. Virtual screening techniques are applied early in the docking protocol to reduce the size of large compound libraries. Initially, libraries are “pre-filtered” using a series of simple physicochemical descriptors to eliminate compounds not expected to be suitable drugs. Typical filters include pharmacophore analysis, neural nets, similarity analysis, scaffold analysis, and Lipinski’s rule of five. This procedure, which reduces the library to a group of molecules more likely to bind the target receptor, is known as enrichment. Once an optimum library has been produced, the molecules are docked to the target receptor to further reduce the number of candidates. This initial screening uses fast, but not very accurate, ranking functions to evaluate the relative stability of the docked complexes. The selected candidates, usually a few hundred, are then subjected to further docking experiments using more sophisticated scoring functions.
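One of the pre-filters named in this note, Lipinski’s rule of five, is simple enough to sketch directly. The example below assumes the four descriptors (molecular weight, logP, hydrogen-bond donors and acceptors) have already been computed by some cheminformatics toolkit; the class and function names are hypothetical, the descriptor values are approximate and for illustration only, and allowing at most one violation is a common convention rather than something stated in the talk.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Compound:
    name: str
    mol_weight: float     # g/mol
    logp: float           # octanol-water partition coefficient
    h_donors: int         # hydrogen-bond donors
    h_acceptors: int      # hydrogen-bond acceptors

def lipinski_violations(c):
    """Count how many of the four rule-of-five limits a compound exceeds."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])

def prefilter(library, max_violations=1):
    """Keep compounds with at most `max_violations` rule-of-five violations."""
    return [c for c in library if lipinski_violations(c) <= max_violations]

# Illustrative library; descriptor values are approximate, not measured data.
library = [
    Compound("small_druglike", 180.2, 1.2, 1, 4),
    Compound("large_natural_product", 853.9, 3.0, 4, 14),
    Compound("lipophilic_candidate", 450.0, 6.5, 2, 5),
]

if __name__ == "__main__":
    for c in prefilter(library):
        print(c.name, lipinski_violations(c))
```

A filter like this is cheap enough to run over an entire compound library before the more expensive docking stages described above.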