2. Contents of Seminar
Introduction
Molecular modeling
Types of molecular modeling
Applications of molecular modeling
Proteins in brief
Purpose of protein structure prediction
Types of PSP
Conclusion
4. Molecular Modeling
The science (or art) of representing molecular structures numerically
and simulating their behavior with the equations of quantum and
classical physics.
Combination of computational chemistry and computer graphics.
Allows scientists to generate and present molecular data including
geometries (bond lengths, bond angles, torsion angles), energies
(heat of formation, activation energy, etc.), electronic properties
(moments, charges, ionization potential, electron affinity),
spectroscopic properties (vibrational modes, chemical shifts) and bulk
properties (volumes, surface areas, diffusion, viscosity, etc.).
Thomas L L, David A W, Victoria K(1999), Foye's principles of Medicinal Chemistry, Lippincott Williams &
Wilkins publications, 6th edition, 3, pp. 55-63.
5. Potential energy variation
Thomas L L, David A W, Victoria K(1999), Foye's principles of Medicinal Chemistry, Lippincott Williams &
Wilkins publications, 6th edition, 3, pp. 55-63.
6. Molecular Modeling methods
The two most common computational methods
Molecular mechanics
Quantum mechanics
Both these methods produce equations for the total energy (E) of the
structure.
MOLECULAR MECHANICS:
Calculation of energy of atoms, force on atoms and
their resulting motion.
Used to model the geometry of the molecule, motion of
molecule and to get the global minimum energy
structure.
Thomas L L, David A W, Victoria K(1999), Foye's principles of Medicinal Chemistry, Lippincott Williams &
Wilkins publications, 6th edition, 3, pp. 55-63.
7. Molecular mechanics
Considers a molecule as a system of rigid balls (atoms) connected
by springs (bonds).
Depends strongly on the concepts of bonding.
Follows Newton's laws of motion.
Neglects the electronic degrees of freedom.
Leach A R(2001), Molecular Modelling: Principles and Applications, Oxford press, 4th edition, 1, pp. 1-3.
8. Methods for Molecular mechanics study
Potential surface
Study of force field
Study of Electrostatics
Molecular dynamics
Conformational Analysis
Leach A R(2001), Molecular Modelling: Principles and Applications, Oxford press, 4th edition, 1, pp. 1-3.
9. Methods for Molecular mechanics study
A force field describes the total potential energy of a molecule or
system as a function of its geometry. The set of parameters it
requires is called the “force field parameters”. The total energy is
a sum of Taylor series expansions for the stretching of every pair of
bonded atoms, plus additional potential energy terms for angle
bending, torsional energy, van der Waals energy, electrostatics and
cross terms.
The study of electrostatics involves the interactions between
various dipoles.
Leach A R(2001), Molecular Modelling: Principles and Applications, Oxford press, 4th edition, 1, pp. 1-3.
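As a rough illustration of the force field idea on this slide, the sketch below sums simple functional forms for each term. The harmonic, periodic, Lennard-Jones and Coulomb forms are common textbook choices rather than anything the slide specifies; the function names, the unit system (kcal/mol, Å, elementary charges) and any parameter values passed in are assumptions for illustration only, and cross terms are left out.

```python
import math

# Assumed unit system: kcal/mol, Angstrom, elementary charge.
COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2)


def bond_energy(r, r0, k):
    """Harmonic stretch: leading term of a Taylor expansion about r0."""
    return 0.5 * k * (r - r0) ** 2


def angle_energy(theta, theta0, k):
    """Harmonic angle bend (angles in radians)."""
    return 0.5 * k * (theta - theta0) ** 2


def torsion_energy(phi, barrier, periodicity, phase=0.0):
    """Periodic torsional term (angles in radians)."""
    return 0.5 * barrier * (1.0 + math.cos(periodicity * phi - phase))


def van_der_waals_energy(r, epsilon, sigma):
    """Lennard-Jones 12-6 potential for one non-bonded pair."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def electrostatic_energy(r, q1, q2):
    """Coulomb interaction between two point charges."""
    return COULOMB_CONSTANT * q1 * q2 / r


def total_energy(bonds, angles, torsions, pairs):
    """Sum all terms; cross terms are omitted in this sketch.

    bonds:    iterable of (r, r0, k)
    angles:   iterable of (theta, theta0, k)
    torsions: iterable of (phi, barrier, periodicity)
    pairs:    iterable of (r, epsilon, sigma, q1, q2)
    """
    energy = sum(bond_energy(*b) for b in bonds)
    energy += sum(angle_energy(*a) for a in angles)
    energy += sum(torsion_energy(*t) for t in torsions)
    for r, epsilon, sigma, q1, q2 in pairs:
        energy += van_der_waals_energy(r, epsilon, sigma)
        energy += electrostatic_energy(r, q1, q2)
    return energy
```

In a real force field the parameters (r0, k, epsilon, sigma, partial charges) come from a fitted parameter set; here they are simply arguments.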
10. 10
Leach A R(2001), Molecular Modelling: Principles and Applications, Oxford press, 4th edition, 1, pp. 1-3.
11. Methods for Molecular mechanics study
Molecular dynamics programs allow the model to show the natural
motion of atoms in the structure. This is achieved by including the
kinetic energy of the atoms alongside the force field energy and
solving equations based on Newton's laws of motion.
Conformational analysis involves determining or analyzing the
spatial arrangement of the functional groups of a molecule.
Strategies used in conformational analysis are the rigid geometry
approximation, rigid body rotation and conformational clustering.
Leach A R(2001), Molecular Modelling: Principles and Applications, Oxford press, 4th edition, 1, pp. 1-3.
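To make the molecular dynamics idea concrete, here is a minimal sketch that integrates Newton's equation of motion for a single particle on a harmonic "spring" using the velocity Verlet scheme. The choice of integrator, the one-dimensional system, and all names and values (`velocity_verlet`, `spring_force`, unit mass and force constant) are assumptions made for illustration; the slide does not name a specific algorithm.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate one particle in 1D; returns a list of (x, v) per step."""
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update
        a = a_new
        trajectory.append((x, v))
    return trajectory


# Hypothetical test system: unit mass on a unit-force-constant spring.
MASS = 1.0
FORCE_CONSTANT = 1.0


def spring_force(x):
    return -FORCE_CONSTANT * x


def total_energy(x, v):
    """Kinetic plus potential energy of the spring system."""
    return 0.5 * MASS * v * v + 0.5 * FORCE_CONSTANT * x * x


trajectory = velocity_verlet(1.0, 0.0, spring_force, MASS, dt=0.01, steps=1000)
start_energy = total_energy(*trajectory[0])
end_energy = total_energy(*trajectory[-1])
print(start_energy, end_energy)
```

The total energy should stay nearly constant over the run, which is a common sanity check for an integrator and time step.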
12. QUANTUM MECHANICS
Provides information about both nuclear positions and electron distribution.
Based on study of arrangement and interaction of electrons and
nuclei of a molecular system.
It does not require the use of parameters similar to those used in
molecular mechanics.
It is based on the wave properties of electrons and all material
particles.
Griffith S, David J(2004), Introduction to Quantum Mechanics, Prentice Hall press, 2nd edition, pp. 1-4.
13. QUANTUM MECHANICS
HΨ = (U + K)Ψ = EΨ
Where,
H = Hamiltonian for the system,
Ψ (“psi”) = wave function,
E = energy,
U = potential energy,
K = kinetic energy.
Simply put, the Hamiltonian is an “operator,” a
mathematical construct that operates on the wave function
(molecular orbital), Ψ, to determine the energy.
Thomas L L, David A W, Victoria K(1999), Foye's principles of Medicinal Chemistry, Lippincott Williams &
Wilkins publications, 6th edition, 3, pp. 55-63.
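For orientation, the equation on this slide can be written out for the simplest case of a single particle. The mass m, the reduced Planck constant ħ and the position-dependent potential U(r) are symbols introduced here as assumptions; the slide itself only names H, Ψ, E, U and K.

```latex
\hat{H}\,\Psi = (\hat{K} + \hat{U})\,\Psi = E\,\Psi,
\qquad
\hat{K} = -\frac{\hbar^{2}}{2m}\nabla^{2},
\qquad
\hat{U} = U(\mathbf{r})
```

For a molecule, the same structure holds with kinetic terms for every electron and nucleus and potential terms for every pair of charged particles.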
14. ADVANTAGES
To calculate ionization potentials, electron affinities, heats of
formation, dipole moments and other physical properties
To find the electron density in a structure
To determine the points at which a structure will react with
electrophiles and nucleophiles
To determine the shape and electron density of a molecule
Rajkumar B, Branson K, Giddy J, Abramson D(2003), The Virtual Laboratory : A toolset to enable distributed
molecular modeling for drug design on the World-Wide Grid, Concurrency and computation, 15, pp. 1–25.
16. Proteins….
If there is a job to be done in the molecular world of our cells,
usually that job is done by a protein.
Catalase: an enzyme which removes hydrogen peroxide from your body
so it does not become toxic.
Insulin: a protein hormone which helps to regulate your blood sugar
levels.
http://courses.washington.edu/conj/protein/insulin2.gif
http://www.biochem.ucl.ac.uk/bsm/pdbsum/1gwf/main.html
17. Proteins for cell motility
Myosin and actin filaments work in coordination
for the proper muscle contraction
http://www.biochem.ucl.ac.uk/bsm/pdbsum/1gwf/main.html
18. Cell structures
Microtubules
Tubulin framework for the cytoskeleton
Cellular coat
Eukaryotic cytoskeleton
http://www.fz-juelich.de/ibi/ibi-1/Cellular_signaling/
http://www.biochem.ucl.ac.uk/bsm/pdbsum/1gwf/main.html
http://cpmcnet.columbia.edu/dept/gsas/anatomy/Faculty/Gundersen/main.html
19. Enzymes
[Figure: reaction energy diagram. Vertical axis: Energy; horizontal axis: Progress of reaction; labels: Substrate, Product]
http://www.biochem.ucl.ac.uk/bsm/pdbsum/1gwf/main.html
23. Fibrous proteins
• Collagen is the most abundant protein in
vertebrates. Collagen fibers are a major
component of tendons, bone and skin. Three
helical collagen chains wind together into a
triple helix, which makes it tough and flexible.
• Fibroin fibers make the silk spun by spiders
and silkworms stronger, weight for weight,
than steel! The soft and flexible properties
come from the beta structure.
• Keratin is a tough, insoluble protein that
makes up the quills of the echidna, your hair
and nails, and the rattle of a rattlesnake. Its
structure comes from alpha helices that are
cross-linked by disulfide bonds.
http://opbs.okstate.edu/~petracek/2002%20protein%20structure%20function/CH06.gif
http://my.webmd.com/hw/health_guide_atoz/zm2662.asp?printing=true
24. Globular proteins
Cell motility – proteins link together to form filaments which make
movement possible.
Organic catalysts in biochemical reactions – enzymes
Regulatory proteins – hormones, transcription factors
Membrane proteins – protein channels
Defense against pathogens – poisons/toxins, antibodies,
complement
Transport and storage – hemoglobin
http://my.webmd.com/hw/health_guide_atoz/zm2662.asp?printing=true
25. Molecular Logic of Life is Same
English                        Genome
26-letter alphabet             4-letter alphabet
Only one grammar               Only one grammar
Extremely diverse literature   Extremely diverse organisms
26. Gene Expression
Gene (DNA): G T A C T A, located on a chromosome in the cell nucleus.
The order of bases in DNA is a code for making proteins. The code is
read in groups of three.
Cell machinery copies the code, making an mRNA molecule. This moves
into the cytoplasm.
mRNA: AUGAGUAAAGGAGAAGAACUUUUCACUGGAUA
Ribosomes read the code and accurately join amino acids together to
make a protein.
Amino acids (figure labels): M S K E E L F T
The protein folds to form its working shape.
http://my.webmd.com/hw/health_guide_atoz/zm2662.asp?printing=true
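The "read in groups of three" step on this slide can be sketched in a few lines. The codon table below is deliberately partial: it contains only the codons that occur in the slide's mRNA string plus the three stop codons, all taken from the standard genetic code. The names `CODON_TABLE` and `translate` are illustrative.

```python
# Partial standard genetic code: only the codons needed for the slide's mRNA,
# plus the three stop codons (marked "*").
CODON_TABLE = {
    "AUG": "M", "AGU": "S", "AAA": "K", "GGA": "G",
    "GAA": "E", "CUU": "L", "UUC": "F", "ACU": "T",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def translate(mrna):
    """Translate an mRNA string codon by codon, starting at position 0.

    Trailing bases that do not fill a complete codon are ignored.
    Raises KeyError for codons missing from the partial table.
    """
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3]
        residue = CODON_TABLE[codon]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


print(translate("AUGAGUAAAGGAGAAGAACUUUUCACUGGAUA"))  # MSKGEELFTG
```

The final two bases (UA) do not form a full codon and are skipped; the two GGA codons translate to glycine (G).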
27. Hallmark of Proteins: Specificity
Know exactly which small molecule (ligand) they should
bind to or interact with.
Also know which part of a macromolecule they should
bind to.
One Aspect of Genome Sequence Analysis is to
Assign Functions to Proteins
(Reverse Genetics)
Function is critically dependent on
structure
Schween G, Egener T, Fritzkowsky D, Granado J, Guitton M C(2005), Large-scale analysis of Physcomitrella
plants transformed with different gene disruption libraries: Production parameters and mutant phenotypes,
Plant Biology, 7 (3), pp. 228–237.
29. How Does Sequence Specify Structure?
Sequence → ? → Structure → Function (Functional Genomics)
The Protein Folding Problem
(second half of the genetic code)
Structure has to be determined experimentally
30. Protein Structure
• Levels of organization
– Primary Sequence
– Secondary Structure (Modular building
blocks)
• α-helices
• β-sheets
– Tertiary Structure
– Quaternary Structure
• Hydrophobic/Hydrophilic Organization.
Lubert S, Tymoczko J L, Jeremy M B(2003). Text book of biochemistry, W. H. Freeman and company
press, 5th edition, pp. 198-230.
31. Protein Structure
Lubert S, Tymoczko J L, Jeremy M B(2003). Text book of biochemistry, W. H. Freeman and company
press, 5th edition, pp. 198-230.
32. Secondary Structure: α-helix
Lubert S, Tymoczko J L, Jeremy M B(2003). Text book of biochemistry, W. H. Freeman and company
press, 5th edition, pp. 198-230.
33. Secondary Structure: β-sheets
Lubert S, Tymoczko J L, Jeremy M B(2003). Text book of biochemistry, W. H. Freeman and company
press, 5th edition, pp. 198-230.
34. Definition of β-turn
A β-turn consists of four consecutive residues i, i+1, i+2 and i+3 that
do not form a helix; the turn leads to a reversal of the protein chain.
The conformation of a β-turn is defined in terms of its two central
residues, i+1 and i+2, and turns can be classified into different types
on the basis of this conformation.
[Figure: β-turn with residues i, i+1, i+2 and i+3; H-bond between i and i+3; distance D < 7 Å between i and i+3]
Lubert S, Tymoczko J L, Jeremy M B(2003). Text book of biochemistry, W. H. Freeman and company press, 5th edition,
pp. 198-230.
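The distance criterion in the β-turn figure (D < 7 Å between residues i and i+3) can be checked as sketched below. Measuring the distance between Cα atoms is an assumption, since the slide does not say which atoms define D, and the function name is hypothetical. The "does not form a helix" condition is not tested here, so a True result only marks a candidate turn.

```python
import math


def is_beta_turn_candidate(ca_coords, i, max_distance=7.0):
    """Return True if residues i and i+3 are closer than max_distance.

    ca_coords: list of (x, y, z) positions in Angstrom, one per residue
    (assumed here to be C-alpha atoms). The helix exclusion from the
    definition is not checked in this sketch.
    """
    distance = math.dist(ca_coords[i], ca_coords[i + 3])
    return distance < max_distance


# Made-up coordinates: a chain that folds back on itself, and a straight one.
folded = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.5, 3.4, 0.0), (3.0, 5.5, 0.0)]
extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
print(is_beta_turn_candidate(folded, 0), is_beta_turn_candidate(extended, 0))
```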
35. Biology/Chemistry of Protein Structure
STRUCTURE     PROCESS
Primary       Assembly
Secondary     Folding
Tertiary      Packing
Quaternary    Interaction
Lubert S, Tymoczko J L, Jeremy M B(2003). Text book of biochemistry, W. H. Freeman and company press, 5th edition, pp.
198-230.
36. 3 main questions…
1. Why predict the structure?
2. Methods for structure prediction
3. What next?
37. Purpose of PSP
Explaining phenotype of existing mutations
(experimental or patient-derived)
Designing mutants to disrupt or alter specific
functions (leaving others unaffected)
Hints at function
Drug design (at high sequence identity)
Hypothesis generation
Mateusz K, Michał J, Andrzej K(2011), Multiscale Approaches to Protein Modeling, Springer Science
Business Media, 12th edition, pp. 21-32
38. Anfinsen's (1973) thermodynamic hypothesis
Proteins are not assembled into their native
structures by a biological process; folding is a
purely physical process that depends only on the
specific amino acid sequence of the protein.
Anfinsen C B(1973), Principles that govern the folding of protein chains, Science, 181, pp. 223–230
39. The Prediction Problem
Can we predict the final 3D protein structure
knowing only its amino acid sequence?
• Studied for 4 Decades
• “Holy Grail” in Biological Sciences
• Primary Motivation for Bioinformatics
• Based on this 1-to-1 Mapping of
Sequence to Structure
• Still very much an OPEN PROBLEM
Mateusz K, Michał J, Andrzej K(2011), Multiscale Approaches to Protein Modeling, Springer Science
Business Media, 12th edition, pp. 21-32
40. PSP: Major Hurdles
Energetics
We don't know all the forces involved in detail
Too computationally expensive BY FAR!
Conformational search impossibly large
100 AA protein, 2 moving dihedrals per residue, 2 possible positions
for each dihedral: 2^200 conformations!
Levinthal's Paradox
Longer than the age of the universe to search
Proteins fold in a couple of seconds??
Mateusz K, Michał J, Andrzej K(2011), Multiscale Approaches to Protein Modeling, Springer Science
Business Media, 12th edition, pp. 21-32
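The arithmetic behind the conformational search estimate on this slide can be worked through directly. The residue, dihedral and position counts come from the slide; the sampling rate of 10^13 conformations per second and the 13.8-billion-year age of the universe are assumed round figures used only to show the scale.

```python
# Counts taken from the slide.
RESIDUES = 100
DIHEDRALS_PER_RESIDUE = 2
POSITIONS_PER_DIHEDRAL = 2

# Assumptions for illustration only.
SAMPLING_RATE_PER_SECOND = 1e13          # conformations tried per second
AGE_OF_UNIVERSE_YEARS = 13.8e9
SECONDS_PER_YEAR = 365.25 * 24 * 3600

conformations = POSITIONS_PER_DIHEDRAL ** (RESIDUES * DIHEDRALS_PER_RESIDUE)
search_seconds = conformations / SAMPLING_RATE_PER_SECOND
age_of_universe_seconds = AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR

print(f"conformations: {conformations:.3e}")
print(f"exhaustive search: {search_seconds:.3e} s")
print(f"age of universe:   {age_of_universe_seconds:.3e} s")
print(f"ratio: {search_seconds / age_of_universe_seconds:.3e}")
```

Even with this generous sampling rate, an exhaustive search takes many orders of magnitude longer than the age of the universe, which is the core of Levinthal's paradox.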
41. PSP: Goals
Accurate 3D structures. But not there yet.
Good “guesses”
Working models for researchers
Understand the FOLDING PROCESS
Get into the Black Box
Only hope for some proteins
25% won't crystallize, too big for NMR
Best hope for novel protein engineering
Drug design, etc.
Mateusz K, Michał J, Andrzej K(2011), Multiscale Approaches to Protein Modeling, Springer Science
Business Media, 12th edition, pp. 21-32
42. Comparative Modeling--Basic Protocol
1. Identification of homologue for target sequence
2. Alignment of target sequence to template sequence and
structure
3. Side-chain modeling, copy the backbone of the template
and model the new side chains onto this backbone
4. Loop modeling, for insertions and deletions in the
alignment
5. Refinement of model -- moving template closer to target
6. Assessment of (predicted) model quality
7. Using the model to explain experiments and guide new
ones
David F B, Charlotte M D, Hampapathalu A N, Nuria C, An Iterative Structure-Assisted Approach to Sequence
Alignment and Comparative Modeling, PROTEINS: Structure, Function, and Genetics Supplementations, 3,
pp. 55-60.
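Step 2 of the protocol above produces a target-template alignment. A minimal sketch of how such an alignment might be inspected follows: it computes percent sequence identity and maps target residues onto template residues (the positions whose backbone would be copied in step 3). The function names, the example sequences and the choice to count identity over gap-free columns only are assumptions; other definitions of identity exist.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def map_residues(aligned_target, aligned_template, gap="-"):
    """Map each target residue index to a template residue index.

    Target residues aligned to a gap (insertions) map to None; these are
    the regions that would need loop modeling.
    """
    mapping = {}
    target_index = 0
    template_index = 0
    for a, b in zip(aligned_target, aligned_template):
        if a != gap:
            mapping[target_index] = template_index if b != gap else None
            target_index += 1
        if b != gap:
            template_index += 1
    return mapping


# Hypothetical alignment, for illustration only.
target = "MSK-GEELFT"
template = "MSKAGDELF-"
print(percent_identity(target, template))
print(map_residues(target, template))
```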
43. Experimental techniques for structure determination
X-ray Crystallography
Nuclear Magnetic Resonance
spectroscopy (NMR)
Electron Microscopy/Diffraction
Free electron lasers.
David F B, Charlotte M D, Hampapathalu A N, Nuria C, An Iterative Structure-Assisted Approach to Sequence
Alignment and Comparative Modeling, PROTEINS: Structure, Function, and Genetics Supplementations, 3,
pp. 55-60.
44. X-ray Crystallography
From small molecules to viruses
Information about the positions of
individual atoms
Limited information about dynamics
Requires crystals
Saville W B, Shearer G (1925), An X-ray Investigation of Saturated Aliphatic Ketones,
Journal of the Chemical Society, 127, pp. 591.
45. NMR
Limited to molecules up to ~50 kDa
(good quality up to 30 kDa)
Distances between pairs of
hydrogen atoms
Lots of information about dynamics
Requires soluble, non-aggregating
material
Assignment problem
Addess M, Kenneth J, Feigon J(1996). Introduction to 1H NMR Spectroscopy of DNA. Bioorganic
Chemistry: Nucleic Acids, Oxford University Press, 8th edition, pp. 238.
46. Electron Microscopy/ Diffraction
Low to medium resolution
Limited information about
dynamics
Can use very small
crystals (nm range)
Can be used for very large
molecules and complexes
47. Tertiary Structure Prediction
Template Modeling
Homology Modeling
Threading
Template-Free Modeling
ab initio Methods
Physics-Based
Knowledge-Based
Thomas L, Ralf Z(2000), Protein structure prediction methods for drug design, Briefings in Bioinformatics,
3, pp. 275-288.
48. HOMOLOGY MODELING
Constructing an atomic-resolution model of the "target" protein from its
amino acid sequence and an experimental 3D structure of a related
homologous protein (the "template").
Homology modeling relies on the identification of one or more known
protein structures likely to resemble the structure of the query
sequence, and on the production of an alignment that maps residues
in the query sequence to residues in the template sequence.
This approach can be complicated by alignment gaps (commonly called
indels), which indicate a structural region present in the target but
not in the template, and by structure gaps in the template that arise
from poor resolution in the experimental procedure (usually X-ray
crystallography).
Thomas L, Ralf Z(2000), Protein structure prediction methods for drug design, Briefings in Bioinformatics, 3, pp.
275-288
49. HOMOLOGY MODELING
Homology modeling can produce high-quality
structural models when the target and template
are closely related, which has inspired the
formation of a structural genomics consortium.
Key subproblems are the analysis and prediction of
loop structures for small and medium-sized loops and
the positioning of side chains, given the
conformation of the protein's backbone.
Thomas L, Ralf Z(2000), Protein structure prediction methods for drug design, Briefings in Bioinformatics, 3, pp.
275-288
50. Threading or Fold Recognition Method
A computational method for protein structure prediction.
“Threading" means placing (aligning) each amino acid in the target
sequence at a position in the template structure and evaluating how
well the target fits the template. After the best-fit template is selected,
the structural model of the sequence is built from the alignment
with the chosen template.
Two fold recognition scenarios are distinguished:
Homologous folds share the same structure through divergent
evolution from a common ancestor.
Analogous folds also share the same structure, but there is
insufficient evidence for an evolutionary relationship.
Thomas L, Ralf Z(2000), Protein structure prediction methods for drug design, Briefings in Bioinformatics, 3, pp.
275-288
51. Threading or Fold Recognition Method
One popular model for protein folding assumes a
sequence of events:
Hydrophobic collapse
Local interactions stabilize secondary structures
Secondary structures interact to form motifs
Motifs aggregate to form tertiary structure
Thomas L, Ralf Z(2000), Protein structure prediction methods for drug design, Briefings in Bioinformatics, 3, pp.
275-288
52. Ab-initio method
It calculates the energetics involved in the process of folding.
It searches for the structure with the lowest free energy.
It is based on the 'thermodynamic hypothesis', which states that the
native structure of a protein is the one for which the free energy
achieves the global minimum.
Two components of ab initio prediction:
1. A scoring (i.e., energy) function that can distinguish correct
(native or native-like) structures from incorrect ones.
2. A search method to explore the conformational space.
The most difficult, but most useful approach.
Richard B, David B(2001), AB INITIO PROTEIN STRUCTURE PREDICTION: Progress and Prospects,
Annual review of biophysical and biomolecular structures, 30, pp. 73-88.
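The two components listed on this slide, a scoring function and a search method, can be illustrated with a toy model. The sketch below uses the 2D hydrophobic-polar (HP) lattice model, which is a swapped-in simplification and not the method of the cited review: residues are H or P, the "energy" is minus the number of non-bonded H-H lattice contacts, and the search exhaustively enumerates self-avoiding walks. All names are hypothetical, and exhaustive search is only feasible for very short chains.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def contact_energy(sequence, coords):
    """Scoring function: -1 per H-H contact between non-bonded neighbours."""
    index_at = {position: i for i, position in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so each lattice pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                energy -= 1
    return energy


def lowest_energy_fold(sequence):
    """Search method: enumerate every self-avoiding walk on a 2D lattice."""
    best = {"energy": float("inf"), "coords": None}

    def extend(coords, occupied):
        if len(coords) == len(sequence):
            energy = contact_energy(sequence, coords)
            if energy < best["energy"]:
                best["energy"] = energy
                best["coords"] = list(coords)
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                coords.append(nxt)
                occupied.add(nxt)
                extend(coords, occupied)
                occupied.remove(nxt)
                coords.pop()

    extend([(0, 0)], {(0, 0)})
    return best["energy"], best["coords"]


energy, coords = lowest_energy_fold("HPPH")
print(energy, coords)
```

The number of walks grows exponentially with chain length, which is the same search problem, in miniature, that makes ab initio prediction hard for real proteins.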
53. Ab-initio method
Sequence → (prediction) → Secondary structure → Tertiary structure
→ (energy minimization) → Low energy structures
→ (validation, mean field potentials) → Predicted structure
Richard B, David B(2001), AB INITIO PROTEIN STRUCTURE PREDICTION: Progress and Prospects,
Annual review of biophysical and biomolecular structures, 30, pp. 73-88.
54. Secondary Structure Prediction
• Existing SSP methods
– Statistical methods (Chou, GOR)
– Physico-chemical methods
– A.I. (neural network approach)
– Consensus and multiple alignment
• Our SSP method: APSSP
– Neural network
– Example-based learning
– Multiple alignment
• Steps involved in APSSP
– BLAST search against protein sequence database (NR)
– Multiple alignment (ClustalW)
– Profile by HMMER, result by email
• Recognition: CASP, CAFASP, LiveBench, MetaServer
Thomas L, Ralf Z(2000), Protein structure prediction methods for drug design, Briefings in Bioinformatics, 3, pp.
275-288
55. Web servers for structure prediction
JPRED: http://www.compbio.dundee.ac.uk/~www-jpred/
PHD: http://cubic.bioc.columbia.edu/predictprotein/
PSIPRED: http://bioinf.cs.ucl.ac.uk/psipred/
Chou and Fasman: http://fasta.bioch.virginia.edu/fasta_www/chofas.htm
56. Future technologies
Modeling of biologically relevant states of proteins using all available
templates
Homooligomers
Heterooligomers
Amino acid modifications
Bound ligands (small molecules, nucleic acids)
Modeling of specific classes of proteins
Antibodies
Repeat proteins (ARM/HEAT, WD repeats)
Ram S, Yu Xia, Enoch H, Michael L(1999), Ab Initio Protein Structure Prediction Using a Combined
Hierarchical Approach, PROTEINS: Structure, Function, and Genetics, 3, pp. 194–198.
57. Available databases, software, and services
Rotamer library
ProtBuD -- biological units database across families
PISCES -- non-redundant sequences in PDB
MolIDE 1.5
ArboDraw -- drawing phylogenetic trees
BioDownloader -- automatic updating of biological
databases
Ram S, Yu Xia, Enoch H, Michael L(1999), Ab Initio Protein Structure Prediction Using a Combined
Hierarchical Approach, PROTEINS: Structure, Function, and Genetics, 3, pp. 194–198.
58. Assessment of accuracy of PSP
P = (N - total incorrect) / N
total incorrect = total number of residues whose conformations are
predicted incorrectly
N = the number of residues in the protein.
Ram S, Yu Xia, Enoch H, Michael L(1999), Ab Initio Protein Structure Prediction Using a Combined
Hierarchical Approach, PROTEINS: Structure, Function, and Genetics, 3, pp. 194–198.
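The accuracy measure P defined on this slide can be computed per residue as sketched below. Representing each residue's conformation as one character (for example H, E, C for helix, strand, coil) is an assumption made for the example, as are the function name and the two strings.

```python
def prediction_accuracy(predicted, observed):
    """P = (N - total incorrect) / N, compared residue by residue."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    n = len(observed)
    total_incorrect = sum(1 for p, o in zip(predicted, observed) if p != o)
    return (n - total_incorrect) / n


# Hypothetical per-residue states for a 10-residue protein.
predicted = "HHHECCCEEE"
observed = "HHHHCCCEEC"
print(prediction_accuracy(predicted, observed))  # 0.8
```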
59. Applications of PSP:
Drug targeting.
Pharmacogenetics.
Pharmacogenomics.
Mechanism of action (MOA).
Dosage regimen.
60. Conclusion
Pharmacist    Biotechnologist
Molecular modeling
61. Conclusion
Pharmacist    Biotechnologist
Protein structure prediction
62. References:
1. Lubert S, Tymoczko J L, Jeremy M B(2003). Text book of biochemistry, W. H. Freeman
and company press, 5th edition, pp. 198-230.
2. Thomas L L, David A W, Victoria K(1999), Foye's principles of Medicinal Chemistry,
Lippincott Williams & Wilkins publications, 6th edition, 3, pp. 55-63.
3. Mateusz K, Michał J, Andrzej K(2011), Multiscale Approaches to Protein Modeling,
Springer Science+Business Media, 12th edition, pp. 21-32.
4. Zhan Y Z, Tom L B(1996), The Use of Amino Acid Patterns of Classified Helices and
Strands in Secondary Structure Prediction, Journal of Molecular biology, 260, pp. 61–76.
5. Schween G, Egener T, Fritzkowsky D, Granado J, Guitton M C(2005), Large-scale
analysis of Physcomitrella plants transformed with different gene disruption libraries:
Production parameters and mutant phenotypes, Plant Biology, 7 (3), pp. 228–237.
6. David F B, Charlotte M D, Hampapathalu A N, Nuria C, An Iterative Structure-Assisted
Approach to Sequence Alignment and Comparative Modeling, PROTEINS: Structure,
Function, and Genetics Supplementations, 3, pp. 55-60.
7. Thomas L, Ralf Z(2000), Protein structure prediction methods for drug design, Briefings in
Bioinformatics, 3, pp. 275-288.
8. Anfinsen C B(1973), Principles that govern the folding of protein chains, Science,
181, pp. 223–230.
63. References:
9. Richard B, David B(2001), AB INITIO PROTEIN STRUCTURE PREDICTION: Progress
and Prospects, Annual review of biophysical and biomolecular structures, 30, pp. 73-88.
10. Saville W B, Shearer G (1925), An X-ray Investigation of Saturated Aliphatic Ketones,
Journal of the Chemical Society, 127, pp. 591.
11. Addess M, Kenneth J, Feigon J(1996). Introduction to 1H NMR Spectroscopy of DNA.
Bioorganic Chemistry: Nucleic Acids, Oxford University Press, 8th edition, pp. 238.
12. Ram S, Yu Xia, Enoch H, Michael L(1999), Ab Initio Protein Structure Prediction Using a
Combined Hierarchical Approach, PROTEINS: Structure, Function, and Genetics, 3, pp.
194–198.
13. Rajkumar B, Branson K, Giddy J, Abramson D(2003), The Virtual Laboratory : A toolset to
enable distributed molecular modeling for drug design on the World-Wide Grid,
Concurrency and computation, 15, pp. 1–25.
14. Griffith S, David J(2004), Introduction to Quantum Mechanics, Prentice Hall press, 2nd
edition, 1, pp. 1-4.
15. Leach A R(2001), Molecular Modelling: Principles and Applications, Oxford press, 4th
edition, 1, pp. 1-3.
16. LEONOR C H, PAULO A S(2001), Protein folding : thermodynamic versus kinetic control,
Journal of biological physics, 27, pp. 6-8.