The document describes a computational study aimed at expediting drug discovery by identifying novel protein-protein interaction (PPI) ligands. The study used computational chemistry programs to test ligands against protein structures and identify those that overlay protein secondary structures with a root-mean-square deviation (RMSD) below 0.5 angstroms. Many ligands were found to successfully mimic protein secondary structures with low RMSD values, supporting the hypothesis that novel PPI ligands can be identified in this manner. The results indicate this computational approach may speed up drug discovery by targeting PPIs rather than single protein inhibition.
BCSRCv1.3
1. Utilizing Computational Chemistry for Expedited Drug Discovery
Abstract Results
For years, scientists have sought novel ligands that successfully inhibit
viral disease. Discovering novel ligands and determining their efficacy remains a
central problem of modern drug discovery. If a ligand disrupts the secondary
structure of a protein by acting as a secondary structure mimetic, it is
hypothesized that the protein-protein interaction (PPI) of the target will break
down. Using this knowledge, I hypothesized that there exist novel PPI ligands which
overlay protein structures with an RMSD value below 0.5 at a proximity below
0.5 Å (angstroms). To test this, I used a multistep procedure. First, I obtained
ligands for testing and entered them into the computer. I then enumerated the
chiral configurations of each isomer (DDD…LLL) and created α-β vectors for
calculating the proximity of the overlay. Using ChemDraw, Maestro, and Exploring
Key Orientations (EKO), I systematically tested each ligand against various
structures to determine its effectiveness by RMSD value, as detailed in the
“Procedure” section. Following this, I modified the best-performing ligands in
Maestro to better mimic the secondary structure of the protein of interest. The
data were promising, as seen in the “Results” section: many of the hits were below
an RMSD value of 0.5. The results indicated the strengths and weaknesses of
certain ligands on certain structures. I ran a similar procedure for the basic
secondary structures of proteins. From my data, I was able to confirm my
hypothesis: there exist novel ligands which successfully overlay proteins (in this
project, specifically those related to cholesterol) with an RMSD below 0.5 at 0.5 Å.
Goals & Hypothesis
Goal: To harness the power of computation, coupled with knowledge of
the chemical properties of Secondary Structure Mimetics, to speed up
the process of identifying novel PPI ligands for use in drugs.
Hypothesis: There are novel PPI ligands which overlay protein structures
with an RMSD value below 0.5 at a proximity below 0.5 Å.
Materials
• Obtain molecules (2D sketches)
• Upload to ChemDraw, convert each stereoisomer (R/S at 3 locations,
8 combinations for each molecule) to a SMILES string, and label the
atom numbers as indexes
• Compile the SMILES strings with indexes into a text file and run QMD
(models F=ma on the micro scale to create thousands of
conformations of the same molecule)
• Gather the generated files and recheck the atom numbers from the LT3 file
• Get the text.in file and modify the chains for matching (on a new protein
interface, e.g. chain A and chain E on 3gcx)
• Run the Matching program
• Gather RMSD values and sort the results quantitatively to find the best molecules
• Upload all molecules below 0.5 RMSD to Maestro and modify each molecule to
optimize the mimetics (add carbon/sulfur bonds)
• Mount molecules with a variation of 120 chiral groups
• Establish key molecules, then preclinical, clinical, and production stages.
Phase 1, Phase 2, Phase 3
Figure 1.1: 3gcx PDB structure in Maestro (chains shown: PCSK9 and EGFA)
Figure 1.2: 4ne9 PDB structure in Maestro (chains shown: PCSK9 and Peptide)
Figure 1.3: 4nmx PDB structure in Maestro (chains shown: PCSK9 and Peptide)
Procedure
The following data are RMSD values measuring the proximity and efficacy of the overlay of each ligand (denoted KB__) at an
Å value below 0.5. Overlay refers to how closely the ligand molecule mimics the secondary structure of the protein at a
specific chain: PCSK9, EGFA, or Peptide (see Figures 1.7 and 1.8).
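As a reminder of what the RMSD numbers measure, here is a minimal sketch of the calculation: the square root of the mean squared distance between matched points. It assumes the two point sets are already paired and superimposed, which is a simplification; the in-house matching program may compute its values differently. The coordinates and the names `rmsd`, `ligand_points`, and `protein_points` are hypothetical and only illustrate the arithmetic.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    squared_sum = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))

# Hypothetical coordinates in angstroms (not taken from the project data).
ligand_points = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
protein_points = [(0.0, 0.3, 0.0), (1.5, 0.3, 0.0), (3.0, 0.3, 0.0)]

value = rmsd(ligand_points, protein_points)
is_hit = value < 0.5  # cutoff used in the hypothesis
print(f"RMSD = {value:.3f}, hit = {is_hit}")
```

Every point in this toy example is offset by 0.3 Å, so the RMSD is about 0.3 and the overlay would count as a hit under the 0.5 cutoff.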
I also ran matching of each ligand against the basic
secondary structures. Sample results for KB36
are shown in the table below. These secondary
structures are common to all proteins. From these
results, one can observe which molecule works
best on a given basic protein structure.
KB36 | 3₁₀-helix | α-helix | π-helix | β-strand | parallel β-sheet | anti-parallel β-sheet | sheet-turn-sheet
DDD | 0.633197 | 0.64119 | 0.694639 | 0.308973 | 0.271456 | 0.231659 | 0.231659
DDL | 0.502959 | 0.744518 | 0.733763 | 0.606267 | 0.337057 | 0.489057 | 0.361595
DLD | 0.640683 | 0.664521 | 0.747776 | 0.393544 | 0.327744 | 0.328627 | 0.328627
DLL | 0.783648 | 0.686561 | 0.612491 | 0.767724 | 0.398959 | 0.401956 | 0.374109
LDD | 0.243934 | 0.104826 | 0.298479 | 0.704939 | 0.374607 | 0.384152 | 0.247558
LDL | 0.726969 | 0.488727 | 0.345833 | 0.741867 | 0.343061 | 0.695175 | 0.195279
LLD | 0.465762 | 0.481516 | 0.65398 | 0.776903 | 0.408635 | 0.416962 | 0.416962
LLL | 0.610761 | 0.728474 | 0.77998 | 0.360785 | 0.237269 | 0.235628 | 0.235628
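The sorting step ("gather RMSD values, quantitatively sort results") can be sketched on the KB36 values above. This is an illustration only, not the script used in the project: the function names and the ASCII structure labels are my own, and the 0.5 cutoff is the one from the hypothesis.

```python
# RMSD values for KB36, copied from the table above (one row per isomer).
STRUCTURES = [
    "3_10-helix", "alpha-helix", "pi-helix", "beta-strand",
    "parallel beta-sheet", "anti-parallel beta-sheet", "sheet-turn-sheet",
]
RMSD_BY_ISOMER = {
    "DDD": [0.633197, 0.64119, 0.694639, 0.308973, 0.271456, 0.231659, 0.231659],
    "DDL": [0.502959, 0.744518, 0.733763, 0.606267, 0.337057, 0.489057, 0.361595],
    "DLD": [0.640683, 0.664521, 0.747776, 0.393544, 0.327744, 0.328627, 0.328627],
    "DLL": [0.783648, 0.686561, 0.612491, 0.767724, 0.398959, 0.401956, 0.374109],
    "LDD": [0.243934, 0.104826, 0.298479, 0.704939, 0.374607, 0.384152, 0.247558],
    "LDL": [0.726969, 0.488727, 0.345833, 0.741867, 0.343061, 0.695175, 0.195279],
    "LLD": [0.465762, 0.481516, 0.65398, 0.776903, 0.408635, 0.416962, 0.416962],
    "LLL": [0.610761, 0.728474, 0.77998, 0.360785, 0.237269, 0.235628, 0.235628],
}
THRESHOLD = 0.5  # hit cutoff from the hypothesis

def best_isomer_per_structure(table, structures):
    """For each structure (column), return the isomer with the lowest RMSD."""
    best = {}
    for col, structure in enumerate(structures):
        isomer = min(table, key=lambda name: table[name][col])
        best[structure] = (isomer, table[isomer][col])
    return best

def hits_below(table, structures, threshold):
    """List (isomer, structure, rmsd) for every value under the cutoff, lowest first."""
    hits = [
        (isomer, structure, value)
        for isomer, row in table.items()
        for structure, value in zip(structures, row)
        if value < threshold
    ]
    return sorted(hits, key=lambda hit: hit[2])

best = best_isomer_per_structure(RMSD_BY_ISOMER, STRUCTURES)
hits = hits_below(RMSD_BY_ISOMER, STRUCTURES, THRESHOLD)

for structure, (isomer, value) in best.items():
    print(f"{structure:26s} {isomer}  {value:.6f}")
```

On these values, LDD gives the lowest RMSD on all three helix types (0.104826 on the α-helix is the lowest value in the table), while the sheet-type structures are best matched by DDD, LLL, and LDL.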
Most of this experiment was conducted on a computer, where the margin for error is small. One major
concern with Quenched Molecular Dynamics (QMD) is floating-point rounding error. Another possible source of error is
the numbering of the chiral locations shown in Figure 1.4. The atom numbering is used to create the α-β (alpha-beta)
vectors. Often, when a molecule is imported into Maestro, its atom numbers differ
from those assigned in ChemDraw, making it difficult to sift through and correct them for phase 2, matching. This could have caused
slight variations in the final RMSD values.
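The rounding concern can be seen in miniature with ordinary double-precision numbers. The sketch below is a general illustration of accumulated floating-point error, not a model of how GROMACS or the QMD runs handle arithmetic; the variable names are my own.

```python
import math

# Ten steps of 0.1 should add up to exactly 1.0 in real arithmetic.
steps = [0.1] * 10

# Plain running total: each addition is rounded to the nearest double.
running_total = 0.0
for step in steps:
    running_total += step

# math.fsum tracks the lost low-order bits and returns a correctly rounded sum.
compensated_total = math.fsum(steps)

print(running_total == 1.0)      # False: rounding error has accumulated
print(compensated_total == 1.0)  # True
print(abs(running_total - 1.0))  # size of the accumulated error
```

The error here is on the order of 1e-16, far smaller than an RMSD cutoff of 0.5, but a simulation repeats such additions many times over thousands of conformations, which is why the concern is worth noting.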
In the end, I can confirm my hypothesis that there are novel PPI ligands which overlay protein structures with an RMSD
value below 0.5 at a proximity below 0.5 Å. Thus, my data support my assertion of the efficacy of Secondary Structure Mimetics. In
future work, running trials on several different chains of the protein may show whether the inhibition of a
certain chain correlates with a greater level of remediation.
Future work also includes a docking simulation in which I add various R groups to the ends of the chiral locations and
check the overlay RMSD values. This is usually done after removing the target chain, to see the impact on
the conformation of the original protein structure. Then, once a primed ligand is obtained, I can send it to be
produced by synthetic chemists, tested in preclinical and clinical trials, and ultimately sold as a drug.
Computational Device
Programs
• WinSCP, PuTTY (Linux terminal), Quenched Molecular Dynamics with
GROMACS, protein matching with an in-house program (EKO), PDB,
Maestro, ChemDraw.
Ligand Molecules (KB__)
Secondary Structure Matching PDB
• 310_helix, Antisheet1, β-Strand, α-helix, Parsheet,
π-helix, Sheet Turn Sheet
Proteins for Matching (Cholesterol Inhibitors)
• 3gcx, 4ne9, 4nmx
KB36DDD O=C([C@H](N1)C)N[C@@H](C(N[C@@H](C2=CN(C[C@H]3O[C@H]([C@@H]([C@@H]3O)O)CC1=O)N=N2)C)=O)C 10,27,7,29,3,5
KB36DDL O=C([C@@H](N1)C)N[C@@H](C(N[C@@H](C2=CN(C[C@H]3O[C@H]([C@@H]([C@@H]3O)O)CC1=O)N=N2)C)=O)C 10,27,7,29,3,5
KB36DLL O=C([C@@H](N1)C)N[C@@H](C(N[C@@H](C2=CN(C[C@H]3O[C@H]([C@@H]([C@@H]3O)O)CC1=O)N=N2)C)=O)C 10,27,7,29,3,5
KB36DLD O=C([C@H](N1)C)N[C@H](C(N[C@@H](C2=CN(C[C@H]3O[C@H]([C@@H]([C@@H]3O)O)CC1=O)N=N2)C)=O)C 10,27,7,29,3,5
KB36LLD O=C([C@H](N1)C)N[C@H](C(N[C@H](C2=CN(C[C@H]3O[C@H]([C@@H]([C@@H]3O)O)CC1=O)N=N2)C)=O)C 10,27,7,29,3,5
KB36LDD O=C([C@H](N1)C)N[C@@H](C(N[C@H](C2=CN(C[C@H]3O[C@H]([C@@H]([C@@H]3O)O)CC1=O)N=N2)C)=O)C 10,27,7,29,3,5
KB36LDL O=C([C@@H](N1)C)N[C@@H](C(N[C@H](C2=CN(C[C@H]3O[C@H]([C@@H]([C@@H]3O)O)CC1=O)N=N2)C)=O)C 10,27,7,29,3,5
KB36LLL O=C([C@@H](N1)C)N[C@H](C(N[C@H](C2=CN(C[C@H]3O[C@H]([C@@H]([C@@H]3O)O)CC1=O)N=N2)C)=O)C 10,27,7,29,3,5
Three Chiral Centers on KB36
Varying these gives us the 8 isomers
DDD, DDL, DLL, DLD, LLD, LDD, LDL, LLL
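Because atom numbering and stereo labels are listed as a possible source of error, a quick consistency check on the SMILES listing may be useful. The sketch below enumerates the eight D/L labels and checks whether the eight listed strings are all distinct. The stereo tags are transcribed from the listing above ("@" for [C@H], "@@" for [C@@H] at the three variable centers, in order of appearance); the template and function names are my own, and the check makes no claim about which tag corresponds to D or L.

```python
from collections import defaultdict
from itertools import product

# KB36 SMILES with the three variable stereocenters left as placeholders.
TEMPLATE = (
    "O=C([C{0}H](N1)C)N[C{1}H](C(N[C{2}H](C2=CN(C[C@H]3O[C@H]"
    "([C@@H]([C@@H]3O)O)CC1=O)N=N2)C)=O)C"
)

# Tags at the three centers, transcribed from the listing above.
LISTED_TAGS = {
    "DDD": ("@", "@@", "@@"),
    "DDL": ("@@", "@@", "@@"),
    "DLL": ("@@", "@@", "@@"),
    "DLD": ("@", "@", "@@"),
    "LLD": ("@", "@", "@"),
    "LDD": ("@", "@@", "@"),
    "LDL": ("@@", "@@", "@"),
    "LLL": ("@@", "@", "@"),
}

def smiles_for(label):
    """Rebuild the listed SMILES string for one isomer label."""
    return TEMPLATE.format(*LISTED_TAGS[label])

# All 2**3 = 8 D/L labels for three chiral centers.
expected_labels = {"".join(combo) for combo in product("DL", repeat=3)}

# Group labels by their SMILES string to find exact duplicates.
labels_by_smiles = defaultdict(list)
for label in LISTED_TAGS:
    labels_by_smiles[smiles_for(label)].append(label)
shared = [labels for labels in labels_by_smiles.values() if len(labels) > 1]

# Tag combinations that never appear in the listing.
all_tag_combos = set(product(("@", "@@"), repeat=3))
missing_tags = all_tag_combos - set(LISTED_TAGS.values())

print("labels complete:", expected_labels == set(LISTED_TAGS))
print("labels sharing one string:", shared)
print("tag combinations not listed:", missing_tags)
```

Run on the listing as printed, all eight labels are present, but DDL and DLL resolve to the same string and one tag combination is never used. That suggests one of those two entries may have been mistranscribed; the check cannot say which.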
Phase 1
Figure 1.5: KB36 low RMSD on Maestro
Figure 1.6: KB36 modified on Maestro
Figure 1.7: Close-up of overlay, chiral locations circled
Figure 1.4: SMILES strings and the 2D drawing of KB36, with emphasis on chiral locations.
Phase 2
Figure 1.8: Close-up of modified overlay, chiral locations circled
Introduction
In the era of modern drug discovery and exploratory drug enhancement, time and
cost are major, if not the main, factors. Traditionally, drug discovery is done by High
Throughput Screening (HTS), a method for assaying the biochemical activity of thousands
of drug-like compounds. This method is costly and time-consuming.
In this project, a different approach was used:
computational power coupled with knowledge of
Secondary Structure Mimetics. Instead of attempting to
find a single molecule which inhibits the activity of a viral
disease, a different type of target was chosen:
Protein-Protein Interactions (PPIs). Being able to interrupt a PPI, or
to “adapt nature’s protein recognition principles”(1), offers a
new class of therapeutic intervention points. In addition to
being an expedited process, the discovery of these novel
PPI ligands was done at the cost of simply powering the
computational device. This process could make the
discovery of drugs to treat HIV, Hepatitis B, and high cholesterol
a significantly faster and cheaper endeavor, making
treatment a more affordable option. Ultimately, the
premise of this project is to bring humanity one step closer
to saving more lives.
References
Amino Acid chain created by me, secondary,
tertiary and quaternary structures sketched
with mentor.
1. High Throughput Screening (HTS). The Scripps Research Institute. Scripps Florida, 21 Jan. 2015. Web. 10 Feb. 2016.
<https://www.scripps.edu/florida/technologies/hts/>.
2. Ko, Eunhwa, Arjun Raghuraman, and Lisa Perez. Exploring Key Orientations at Protein-Protein Interfaces with Small
Molecule Probes. Journal of the American Chemical Society,
3. Pierce, Ben. Overview of Protein Protein Interaction Analysis. ThermoFisher. ThermoFisher Scientific, 10 Nov. 2015. Web. 10 Feb. 2016.
<https://www.thermofisher.com/us/en/home/life-science/protein-biology/protein-biology-learning-center/protein-biology-resource-library/pierce-protein-methods/overview-protein-protein-interaction-analysis.html>.
4. Raj, Monika, Brooke Bullock, and Paramjit Arora. Plucking the High Hanging Fruit: A Systematic Approach for Targeting
Protein-protein Interactions. Department of Chemistry, NYU, 10 Feb. 2016.
5. Ross, Nathan, William Katt, and Andrew Hamilton. "Synthetic Mimetics of Protein Secondary Structure Domains." Royal
Society Publishing. The Royal Society, 1 Feb. 2010. Web. 10 Feb. 2016.
<http://rsta.royalsocietypublishing.org/content/368/1914/989>.
Discussion and Conclusion