What is QSAR?, introduction to 3D QSAR, CoMFA, CoMSIA, Case Study on CoMFA contour maps analysis and CoMSIA interactive forces between ligand and receptor, various Statistical techniques involved in QSAR
2. CONTENTS
Introduction to QSAR
QSAR Analysis models
3D QSAR
CoMFA
CoMSIA
Applications
Case study
3. INTRODUCTION TO QSAR
QSAR relates the biological activity of a series of compounds to their
physicochemical parameters in a quantitative fashion using a
mathematical formula.
The fundamental principle is that differences in structural properties
are responsible for variations in the biological activities of compounds.
Physico-chemical parameters:
1. Hydrophobicity of substituents
2. Hydrophobicity of the molecule
3. Electronic properties of substituents
4. Steric properties of substituents
4. QSAR Analysis models
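Hansch Analysis: Correlates biological activity with physico-chemical
parameters.
log(1/C) = k1 log P + k2 σ + k3 Es + k4
where C is the concentration needed to produce a defined biological effect, P is the partition coefficient, σ is the Hammett substituent constant, Es is the Taft steric parameter, and k1 to k4 are constants obtained by regression.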
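As a minimal sketch of how the constants in the Hansch equation could be estimated, the Python snippet below fits k1 to k4 by ordinary least squares. The descriptor values and the "true" coefficients are made up for illustration, and the activities are generated from them so the fit can be checked; none of this is experimental data. NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical substituent descriptors for six analogues (illustrative values only).
log_p = np.array([1.2, 1.8, 2.3, 2.9, 3.4, 0.7])
sigma = np.array([0.00, 0.23, -0.17, 0.54, 0.06, 0.78])
e_s = np.array([0.00, -0.97, -1.24, -0.51, -2.78, -0.55])

# Assumed coefficients k1, k2, k3, k4 used only to generate synthetic activities.
true_k = np.array([0.60, 1.10, 0.25, 2.00])


def design_matrix(log_p, sigma, e_s):
    """One row per compound: [log P, sigma, Es, 1] (the 1 carries the intercept k4)."""
    return np.column_stack([log_p, sigma, e_s, np.ones_like(log_p)])


X = design_matrix(log_p, sigma, e_s)
log_inv_c = X @ true_k  # synthetic log(1/C) values

# Least-squares estimate of k1..k4 from descriptors and activities.
fitted_k, *_ = np.linalg.lstsq(X, log_inv_c, rcond=None)
print("fitted k1..k4:", np.round(fitted_k, 3))
```

With real data the activities would carry experimental noise, so the fitted constants would be reported together with statistics such as the correlation coefficient and standard error.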
Free-Wilson Analysis: Correlates biological activity with certain
structural features of the compound.
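The additive idea behind Free-Wilson analysis can be illustrated with a short Python sketch: the activity of an analogue is estimated as the activity of the parent structure plus one contribution per substituent. The positions, groups, contribution values and parent activity below are hypothetical and chosen only to show the arithmetic.

```python
def free_wilson_activity(substituents, contributions, parent_activity):
    """Additive estimate: parent activity plus one term per substituent."""
    total = parent_activity
    for position, group in substituents.items():
        total += contributions[(position, group)]
    return total


# Hypothetical group contributions to log(1/C); hydrogen is the reference (0.0).
contributions = {
    ("R1", "H"): 0.0,
    ("R1", "Cl"): 0.40,
    ("R2", "H"): 0.0,
    ("R2", "OCH3"): -0.20,
}
parent_activity = 5.0  # assumed log(1/C) of the unsubstituted parent

analogue = {"R1": "Cl", "R2": "OCH3"}
predicted = free_wilson_activity(analogue, contributions, parent_activity)
print(f"predicted log(1/C) = {predicted:.2f}")
```

In practice the contributions are not known in advance; they are obtained by regression on an indicator matrix that records which group is present at each position.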
Limitations of these classical models:
1. They do not consider 3D structure.
2. They give no graphical output, which frequently makes interpreting the
results in familiar chemical terms difficult, if not impossible.
5. 3D QSAR
3D QSAR is an extension of classical QSAR that exploits the three-dimensional
properties of the ligands to predict their biological activity using robust statistical
analysis such as PLS, G/PLS and ANN.
3D-QSAR uses probe-based sampling within a molecular lattice to determine three-
dimensional properties of molecules and can then correlate these 3D descriptors with
biological activity.
No QSAR model can replace experimental assays, though experimental
techniques are not free from errors either.
Some major factors that contribute to the overall free energy of binding, such as
desolvation energetics, temperature, diffusion, transport, pH and salt concentration,
are difficult to handle and are therefore usually ignored.
Despite these problems, 3D-QSAR remains a useful alternative approach.
7. CoMFA (Comparative Molecular Field Analysis)
In 1987, Cramer developed the predecessor of 3D approaches called Dynamic
Lattice-Oriented Molecular Modeling System (DYLOMMS) that involves the
use of PCA to extract vectors from the molecular interaction fields, which are
then correlated with biological activities.
Soon after, he modified it by combining the two existing techniques, GRID and
PLS, to develop a powerful 3D QSAR methodology, Comparative Molecular
Field Analysis (CoMFA).
The underlying idea of CoMFA is that differences in a target property, e.g.,
biological activity, are often closely related to equivalent changes in shapes
and strengths of non-covalent interaction fields surrounding the molecules.
Hence, the molecules are placed in a cubic grid and the interaction energies
between the molecule and a defined probe are calculated for each grid point.
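A minimal Python sketch of the grid-point calculation described above is given here, assuming a Lennard-Jones 6-12 term for the steric field and a Coulomb term for the electrostatic field. The well depth, contact distance, constant dielectric and the 30 kcal/mol truncation limit are illustrative assumptions, not the parameters of any particular force field or software package.

```python
import math

COULOMB_CONSTANT = 332.06  # converts e^2/Å to kcal/mol
ENERGY_CUTOFF = 30.0       # kcal/mol, assumed truncation limit


def clamp(energy, limit=ENERGY_CUTOFF):
    """Truncate an energy to the range [-limit, +limit]."""
    return max(-limit, min(limit, energy))


def probe_fields(grid_point, atoms, probe_charge=1.0, well_depth=0.1, r_min=3.4):
    """Return (steric, electrostatic) probe energies at one grid point.

    atoms is a list of (x, y, z, partial_charge) tuples; well_depth and r_min
    are hypothetical Lennard-Jones parameters shared by all atom pairs.
    """
    steric = 0.0
    electrostatic = 0.0
    for x, y, z, charge in atoms:
        # Guard against a probe sitting exactly on an atom.
        r = max(math.dist(grid_point, (x, y, z)), 1e-3)
        ratio6 = (r_min / r) ** 6
        steric += well_depth * (ratio6 * ratio6 - 2.0 * ratio6)
        electrostatic += COULOMB_CONSTANT * probe_charge * charge / r
    return clamp(steric), clamp(electrostatic)


# Toy two-atom "molecule" with made-up coordinates and charges.
molecule = [(0.0, 0.0, 0.0, -0.4), (1.5, 0.0, 0.0, 0.4)]
for point in [(4.0, 0.0, 0.0), (0.5, 0.0, 0.0)]:
    print(point, probe_fields(point, molecule))
```

Both terms grow without bound as the probe approaches an atom, which is why the energies are truncated here.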
8. Protocol for CoMFA:
A standard CoMFA procedure, as implemented in the Sybyl software, comprises the following
sequential steps:
1. The bioactive conformation of each molecule is determined.
2. All the molecules are superimposed or aligned, using either manual or automated methods, in
a manner defined by the supposed mode of interaction with the receptor.
3. The overlaid molecules are placed in the center of a lattice grid with a spacing of 2 Å.
4. The algorithm compares, in three dimensions, the steric and electrostatic fields calculated
around the molecules with different probe groups positioned at all intersections of the lattice.
5. The interaction energy or field values are correlated with the biological activity data using
the PLS technique, which identifies and extracts the quantitative influence of specific chemical
features of the molecules on their biological activity.
6. The results are expressed as correlation equations with a number of latent variable terms,
each of which is a linear combination of the original independent lattice descriptors.
7. For visual understanding, the PLS output is presented as interactive graphics
consisting of colored contour plots of the coefficients of the corresponding field variables at each
lattice intersection, showing the important favorable and unfavorable regions in three-dimensional
space that are significantly associated with the biological activity.
9. DRAWBACKS AND LIMITATIONS OF CoMFA
CoMFA has several pitfalls and imperfections:
1. Too many adjustable parameters like overall orientation, lattice placement, step size,
probe atom type etc.
2. Uncertainty in selection of compounds and variables
3. Fragmented contour maps with variable selection procedures
4. Hydrophobicity not well-quantified
5. Cut-off limits used
6. Low signal to noise ratio due to many useless field variables
7. Imperfections in potential energy functions
8. Various practical problems with PLS
9. Applicable only to in vitro data
10. Comparative Molecular Similarity Indices Analysis (CoMSIA)
CoMSIA was developed to overcome certain limitations of CoMFA.
In CoMSIA, molecular similarity indices calculated from modified SEAL similarity
fields are employed as descriptors to simultaneously consider steric, electrostatic,
hydrophobic and hydrogen bonding properties.
These indices are estimated indirectly by comparing the similarity of each
molecule in the dataset with a common probe atom (having a radius of 1 Å,
charge of +1 and hydrophobicity of +1) positioned at the intersections of a
surrounding grid/lattice.
For computing similarity at all grid points, the mutual distances between the probe
atom and the atoms of the molecules in the aligned dataset are also taken into
account.
11. To describe this distance-dependence and calculate the molecular properties, Gaussian-type
functions are employed. Since the underlying Gaussian-type functional forms are ‘smooth’
with no singularities, their slopes are not as steep as the Coulombic and Lennard-Jones
potentials in CoMFA; therefore, no arbitrary cut-off limits are required to be defined.
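The Gaussian distance dependence can be sketched in a few lines of Python. The function below sums, over all atoms, the product of the probe property and the atom property weighted by exp(-alpha * r^2), with a leading minus sign as in the commonly cited form of the similarity index. The attenuation factor alpha = 0.3 and the atom property values are assumptions made for illustration.

```python
import math


def comsia_index(grid_point, atoms, probe_value=1.0, alpha=0.3):
    """Similarity index at one grid point for one property.

    atoms is a list of (x, y, z, property_value) tuples; alpha is the
    assumed attenuation factor of the Gaussian.
    """
    total = 0.0
    for x, y, z, prop in atoms:
        r_squared = sum((g - a) ** 2 for g, a in zip(grid_point, (x, y, z)))
        total += probe_value * prop * math.exp(-alpha * r_squared)
    return -total


# A single atom with a made-up property value of 1.0 at the origin.
atom = [(0.0, 0.0, 0.0, 1.0)]
for distance in (0.0, 1.0, 2.0, 4.0):
    print(distance, round(comsia_index((distance, 0.0, 0.0), atom), 4))
```

The index stays finite even when the probe sits on top of the atom (distance 0), which is the reason no cut-off is needed.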
CoMSIA is provided by Tripos Inc. in the Sybyl software, along with CoMFA.
The Comparison of CoMFA and CoMSIA
12. APPLICATIONS:
1. QSAR in Chromatography: Quantitative Structure–Retention
Relationships (QSRRs)
2. The Use of QSAR and Computational Methods in Drug Design.
3. In Silico Approaches for Predicting ADME Properties.
4. Prediction of Harmful Human Health Effects of Chemicals from Structure.
5. Chemometric Methods and Theoretical Molecular Descriptors in Predictive
QSAR Modeling of the Environmental Behavior of Organic Pollutants.
6. The Role of QSAR Methodology in the Regulatory Assessment of
Chemicals.
7. Nanomaterials – the Great Challenge for QSAR Modelers
13. Case study: Human Eosinophil Phosphodiesterase
Phosphodiesterase type IV (PDE4) plays an important role
in regulating intracellular levels of cAMP and cGMP.
PDE4 is highly expressed in inflammatory and immune cells
and in airway smooth muscle, where it degrades cAMP and lowers its
concentration.
Inhibition of PDE4 increases the intracellular cAMP
concentration, which kills inflammatory cells and relaxes airway
smooth muscle.
The development of selective PDE4 inhibitors as anti-inflammatory
and anti-asthmatic drugs has attracted extensive research.
QSAR studies of PDE4 inhibitors have also
been carried out using the CoMFA and CoMSIA methods.
14. To obtain more potent and selective PDE4 inhibitors, a series of
5,6-dihydro-(9H)-pyrazolo[3,4-c]-1,2,4-triazolo[4,3-a]pyridines was developed and synthesized
based on the structure of 7-oxo-4,5,6,7-tetrahydro-1H-pyrazolo[3,4-c]pyridine.
To study the interaction mechanism of PDE4 with the 31 new
compounds, a QSAR model was built using CoMFA.
Structures of the 5,6-dihydro-(9H)-pyrazolo[3,4-c]-1,2,4-triazolo[4,3-a]pyridines.
Superposition of the 31 structures of the
5,6-dihydro-(9H)-pyrazolo[3,4-c]-1,2,4-triazolo[4,3-a]pyridines.
15. Four compounds were randomly selected as the test set, and the other twenty-seven
compounds were used as the training set.
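A random 27/4 split of this kind could be reproduced along the following lines in Python. The compound numbering from 1 to 31 and the random seed are assumptions; the slides do not state which four compounds were actually held out.

```python
import random


def split_dataset(compound_ids, n_test=4, seed=0):
    """Randomly hold out n_test compounds; the rest form the training set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    test = sorted(rng.sample(compound_ids, n_test))
    train = [c for c in compound_ids if c not in test]
    return train, test


compound_ids = list(range(1, 32))  # hypothetical labels for the 31 compounds
train_set, test_set = split_dataset(compound_ids)
print("training set size:", len(train_set))
print("test set:", test_set)
```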
16. Method:
The structures of the 31 compounds were built with a molecular sketch program.
Gasteiger-Hückel charges were then assigned to each atom, and the
energy of each molecule was minimized.
All the compounds studied share a common rigid substructure. Therefore,
alignment on this common rigid substructure was carried out using a database
alignment tool.
The most active compound, which bears a cyclobutyl substituent, was used as the template
molecule.
All aligned molecules were placed in a 3D cubic lattice extending at least
4 Å beyond the volumes of all investigated molecules on all axes.
In the 3D lattice, the grid spacing was set to 2.0 Å in the x, y, and z
directions.
An sp3-hybridized carbon atom with a charge of +1 was used as the probe
atom to calculate the CoMFA steric and electrostatic interaction fields.
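The lattice described in this method can be sketched in Python as follows: the bounding box of the aligned atoms is padded by 4 Å on every side and filled with points 2.0 Å apart. The atom coordinates below are made up, and atomic radii are ignored for simplicity, so this is only an approximation of "beyond the volumes" of the molecules.

```python
import itertools
import math


def build_lattice(coords, margin=4.0, spacing=2.0):
    """Return grid points covering the atoms plus a margin on every axis."""
    axes = []
    for axis in range(3):
        low = min(c[axis] for c in coords) - margin
        high = max(c[axis] for c in coords) + margin
        # Round the number of steps up so the last point reaches at least `high`.
        n_steps = math.ceil((high - low) / spacing)
        axes.append([low + i * spacing for i in range(n_steps + 1)])
    return list(itertools.product(*axes))


# Hypothetical atom positions (in Å) of the aligned molecules.
atom_coords = [(0.0, 0.0, 0.0), (4.0, 2.0, 0.0)]
lattice = build_lattice(atom_coords)
print("number of grid points:", len(lattice))
```

Each grid point then receives one steric and one electrostatic probe energy per molecule, so the number of field descriptors is twice the number of lattice points.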
17. Method contd.
The partial least squares (PLS) method was used to build the 3D-QSAR models.
Leave-one-out (LOO) cross-validated PLS analysis was used to check the
predictive ability of the models and to determine the optimal number of components
to be used in the final QSAR models.
The PLS analysis gave a CoMFA model with a cross-validated q2 value of 0.565 for 3
optimal components. The non-cross-validated PLS analysis of these compounds
was repeated with the optimal number of components, giving an R2 value of 0.867.
The correlation between the experimental and predicted values is shown in the plot of
predicted pIC50 vs experimental pIC50.
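To make the q2 and R2 statistics concrete, here is a minimal Python sketch of single-response PLS (NIPALS) with leave-one-out cross-validation, written with NumPy only. It is not the Sybyl implementation; the data are random numbers generated for illustration, and q2 is computed here as 1 - PRESS/SS with SS taken about the mean of all activities, which is one common convention.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS (NIPALS); return (x_mean, y_mean, coefficients)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    X_res, y_res = X - x_mean, y - y_mean
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_components):
        w = X_res.T @ y_res
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # nothing left in X that covaries with y
            break
        w = w / norm
        t = X_res @ w            # scores of this latent variable
        tt = t @ t
        p = X_res.T @ t / tt     # X loadings
        c = (y_res @ t) / tt     # y loading
        X_res = X_res - np.outer(t, p)
        y_res = y_res - c * t
        weights.append(w)
        loadings.append(p)
        y_loadings.append(c)
    if not weights:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P = np.array(weights).T, np.array(loadings).T
    coef = W @ np.linalg.solve(P.T @ W, np.array(y_loadings))
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return y_mean + (X - x_mean) @ coef


def r_squared(y, y_pred):
    return 1.0 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)


def loo_q2(X, y, n_components):
    """Cross-validated q2: each compound is predicted by a model built without it."""
    preds = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = pls1_fit(X[keep], y[keep], n_components)
        preds[i] = pls1_predict(model, X[i:i + 1])[0]
    return r_squared(y, preds)


# Synthetic stand-in for 27 training compounds with 50 field descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(27, 50))
y = X[:, :3] @ np.array([1.0, -0.5, 0.8]) + rng.normal(scale=0.2, size=27)

for n in (1, 2, 3, 4):
    print(n, "components: q2 =", round(loo_q2(X, y, n), 3))
final_model = pls1_fit(X, y, 3)
print("R2 =", round(r_squared(y, pls1_predict(final_model, X)), 3))
```

The number of components would normally be chosen where q2 is highest, and the final model is then refitted on all training compounds to report R2.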
18. • The interaction mode of a compound with the
protein.
• The inhibitors and the residues important for
inhibitor-protein interaction are represented
by ball-and-stick models.
• The green dashed lines denote hydrogen
bonds.
Structures of ERβ protein binding with
compounds obtained from molecular
docking
19. Contour Maps of CoMFA Model
Contour plots of the CoMFA steric fields (left) and electrostatic fields (right) of
compound No. 24
21. CONCLUSION
CoMFA and CoMSIA are useful techniques for understanding the
pharmacological properties of the studied compounds, and they
have been successfully used in modern drug design.
Despite its pitfalls, 3D-QSAR is now used worldwide in drug
discovery. Based on well-established principles of statistics, it is
intrinsically a valuable and viable medicinal chemistry tool
whose application domain ranges from explaining
structure-activity relationships quantitatively and retrospectively to
providing synthetic guidance that leads to logical and
experimentally testable hypotheses.
Apart from synthetic applications, it has also been used in
various other fields.
22. References
Graham L. Patrick, An Introduction to Medicinal Chemistry, Fifth Edition.
Yuhong Xiang, Zhuoyong Zhang, Aijing Xiao and Jinxu Huo, “Recent Studies of QSAR on Inhibitors of Estrogen Receptor and Human Eosinophil Phosphodiesterase”, Department of Chemistry, Capital Normal University, Beijing 100048, P.R. China.
Jitender Verma, Vijay M. Khedkar and Evans C. Coutinho, “3D-QSAR in Drug Design - A Review”, Department of Pharmaceutical Chemistry, Bombay College of Pharmacy, Kalina, Santacruz (E), Mumbai 400 098, India.