A fast and accurate computational approach to protein ionization: combining the Generalized Born model with an iterative mobile cluster method
Velin Z Spassov, Accelrys
3. INTRODUCTION
Protein Ionization and pK
Scientific Needs
• To provide a fast and convenient way to study the effects
of pH changes on a wide range of important
mechanisms, such as enzyme catalysis, ligand binding and
protein stability.
• In protein modeling, a correct assignment of protonation
states and hydrogen atom positions is critical for:
» Accurate docking of small molecules to receptors
» Accurate protein-protein docking
» Stable, convergent molecular dynamics simulations
© 2008 Accelrys, Inc. 3
4. Introduction
Calculate Protein Ionization and Residue pK
A new Discovery Studio computational protocol to calculate the pH dependent
electrostatic effects in protein molecules*.
Calculates:
– the titration curves and pK1/2 of the titratable residues.
– the electrostatic contribution to the protein free energy as a function of pH.
– the pH dependency of the folding energy of the protein and the pH optimum
of protein stability.
– pI of the protein.
Optimizes the positions of all hydrogen atoms and
– automatically sets the protonation state of each residue at a given pH, based
on the calculated pK1/2 .
– finds the optimal proton binding sites for tautomeric ASP, GLU and HIS
residues.
– flips the O and N atoms of ASN and GLN residues to find an optimal
conformation.
*Spassov, V.Z. and Yan, L. (2008) Protein Science, 17, 1955-1969.
5. Protein Ionization and pK: Background
• Titratable residues exist in protonated and deprotonated forms
• A titration curve gives the fractional protonation of a titratable group as a function of pH

HA + H2O ⇌ H3O+ + A-
pH = pKa + log10{[A-]/[HA]}

[Figure: Titratable groups in proteins, drawn in deprotonated and protonated forms: Arg, Lys, Asp, Glu, Tyr, His, Cys, N-ter, C-ter.]
[Figure: Titration curve of B:ASP30 (fractional protonation vs. pH, 0 to 16); pK1/2 = 3.9.]
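The relation pH = pKa + log10{[A-]/[HA]} can be turned into a titration curve for a single, isolated site. The following is a minimal sketch of that arithmetic only, not the Discovery Studio protocol; the function name is hypothetical, and the pK1/2 value is the one quoted for B:ASP30 on this slide.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of an isolated titratable site in the protonated (HA) form."""
    # From pH = pKa + log10([A-]/[HA]):  [A-]/[HA] = 10**(pH - pKa),
    # so [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PK_HALF_ASP30 = 3.9  # pK1/2 quoted for B:ASP30 on the slide

for ph in (2.0, 3.9, 6.0):
    frac = fraction_protonated(ph, PK_HALF_ASP30)
    print(f"pH {ph:4.1f}: fraction protonated = {frac:.3f}")
```

At pH = pK1/2 the site is half protonated, which is the definition of pK1/2 used throughout the deck. Sites in a real protein interact, so their curves can deviate from this ideal shape.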
6. THEORY
Calculate Protein Ionization and Residue pK
Components of the method:
• CHARMM force field
• Ionization model [1]
• Extended GB/IM [2,3,4,5] instead of grid-based PB solvers
• IMC [6] instead of Monte Carlo
• Library of pentapeptide model compounds and pKmod data [7] instead of monopeptides
• CHARMm-based protocol for preliminary optimization

Probability of protonation state X_l at a given pH:
\[
\rho(X_l, pH) = \frac{\exp[-G(X_l, pH)/RT]}{\sum_{l=1}^{2^N} \exp[-G(X_l, pH)/RT]}
\]
Free energy of a protonation state:
\[
G(X, pH) = 2.303\,RT \sum_{i}^{N} x_i \left( pH - pK_{intr,i} \right) + \frac{1}{2} \sum_{i,j} W_{ij}(x_i, x_j)
\]
Intrinsic pK:
\[
pK_{intr} = pK_{mod} + (2.303\,RT)^{-1} \left[ \Delta\Delta G(PH, P) - \Delta\Delta G(MH, M) \right]
\]

References:
1. Bashford D., Karplus M. (1990) Biochemistry, 29, 10219-10225.
2. Still W.C. et al. (1990) J. Am. Chem. Soc., 112, 6127-6129.
3. Dominy B.N., Brooks III C.L. (1999) J. Phys. Chem. B, 103, 3765-3773.
4. Onufriev A. et al. (2000) J. Phys. Chem. B, 104, 3712-3720.
5. Spassov V.Z. et al. (2002) J. Phys. Chem. B, 106, 8726-8738.
6. Spassov V.Z., Bashford D. (1999) J. Comput. Chem., 20, 1091-1111.
7. Thurlkill et al. (2006) Protein Science, 15, 1214-1218.
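For a handful of sites, the Boltzmann average over all 2^N protonation states can be evaluated by brute force. The sketch below is illustrative only and is not the IMC algorithm: energies are expressed in units of RT, the function names are hypothetical, and the pair term is simplified to W_ij(x_i, x_j) = w[i][j] * x_i * x_j, which is an assumption made here for brevity.

```python
import itertools
import math

LN10 = math.log(10.0)  # the 2.303 factor


def state_energy(x, ph, pk_intr, w):
    """G(X, pH) in units of RT for one state x (x[i] = 1 if site i is protonated)."""
    n = len(x)
    # First term: 2.303 * sum_i x_i * (pH - pK_intr,i)
    g = LN10 * sum(x[i] * (ph - pk_intr[i]) for i in range(n))
    # Second term: 1/2 * sum over i != j of W_ij(x_i, x_j).
    # Assumption: W_ij(x_i, x_j) = w[i][j] * x_i * x_j, i.e. two sites
    # interact only when both are protonated.
    g += 0.5 * sum(w[i][j] * x[i] * x[j]
                   for i in range(n) for j in range(n) if i != j)
    return g


def mean_protonation(ph, pk_intr, w):
    """Boltzmann-averaged protonation of each site over all 2**N states."""
    n = len(pk_intr)
    states = list(itertools.product((0, 1), repeat=n))
    weights = [math.exp(-state_energy(x, ph, pk_intr, w)) for x in states]
    z = sum(weights)  # denominator of rho(X_l, pH)
    return [sum(wt * x[i] for wt, x in zip(weights, states)) / z
            for i in range(n)]


# Two hypothetical sites with pK_intr = 7.0 and a repulsive coupling of 2 RT.
pk = [7.0, 7.0]
coupled = [[0.0, 2.0], [2.0, 0.0]]
print(mean_protonation(7.0, pk, coupled))
```

The number of states doubles with every added site, so this enumeration is practical only for small N. That cost is the reason sampling or clustering schemes such as Monte Carlo or IMC are used for whole proteins.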
7. Protein Ionization and pK: Solution
• New method1 to ‘Calculate Protein Ionization and pK’
– Predicts pK1/2 and titration curves for each titratable residue using 3D environment of protein
– Automatically protonates the residues at a given pH according to predicted pK1/2.
• For HIS, ASP, and GLU residues the hydrogens are added to yield the lowest CHARMm energy
• The N and O atoms on the side-chain of ASN and GLN residues are flipped if necessary to give the lower
energy conformation
– Calculates the following as a function of pH
• Electrostatic contribution to the free energy
• Estimate of relative folding energy (electrostatic contribution)
• Total charge of system
– Based on CHARMm Generalized-Born methods
• Strength of Solution
– More accurate and rigorous than rule-based methods
– Faster and more accurate than existing Poisson-Boltzmann/Monte Carlo methods
– Consistent CHARMm force field used throughout
[Figure: Titration curves (fractional protonation vs. pH, 0 to 14) for the GLU residues of a protein: GLU23, GLU38, GLU77, GLU97, GLU104, GLU107, GLU119, GLU129, GLU133, GLU135, GLU140, GLU145, GLU165, GLU183, GLU186, GLU219, GLU239.]
1. Spassov et al., Protein Sci. 2008, 17, 1955-1969.
8. Model Compounds
MEAD, UHBD and others
– Structure: monopeptide
– pK data: standard set. Nozaki Y, Tanford C. 1967. Examination of titration behavior. Methods Enzymol 11:715–734.
DS Protein Ionization
– Structure: blocked pentapeptides, Ala-Ala-X-Ala-Ala
– pK data: Thurlkill et al. 2006. Protein Science, 15, 1214-1218.
9. IMC (Iterative Mobile Clustering) Approach
Mean-field approach to protein ionization:
Spassov V.Z., Bashford D. (1999) J. Comput. Chem., 20, 1091-1111.
• One site / single conformer: Tanford C., Roxby R. (1972), 11, 2192-2198.
• Clustering / distance criterion / single conformer: Yang A.S. et al. (1993) Proteins, 15, 252-265; Gilson M.K. (1993) Proteins, 15, 266-282.
• Clustering / energy criterion / single or multiple conformers: Spassov & Bashford (1999)
IMC:
\[
N_{tot}(\mathrm{cluster}) = N_{global}\, 3^{N_{clstr}}\, 2^{N_{clstr}}
\]
\[
\rho(C, X) = f_g(k)\, f_\Gamma(c, x \mid k)\, f_{out}(c', x' \mid k)
\]
10. Protein Ionization and pK: Method
• Electrostatic interaction energies are calculated using an implementation of
the Generalized Born solvation model in CHARMm
– atomic parameters from either the CHARMm or the CHARMM polar hydrogen force field
• The energies of the protonated and deprotonated states are calculated, and the
fraction of each residue that is protonated at a given pH is predicted from the
Boltzmann distribution
• Relative folding energy is estimated from the protonation energy of the protein and
the protonation energy of the model compounds
• Current implementation treats protein as a single conformer embedded in a dielectric
medium
– A dielectric constant of 10-11 for the protein interior gives the lowest RMSD compared to
experimentally obtained pK data.
– This dielectric constant is the only parametrized variable in the method
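The dielectric constant is chosen by minimizing the RMSD between calculated and experimental pK values. A minimal sketch of that error measure follows; the function name is hypothetical, and the sample numbers are the first four lysozyme (2lzt) sites from the parameterization table (experimental and CHARMM columns), used here only as an illustration.

```python
import math


def pk_rmsd(calculated, experimental):
    """Root-mean-square deviation between two equal-length lists of pK values."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(c - e) ** 2 for c, e in zip(calculated, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# LYS1_NTR, LYS1, GLU7, LYS13 (2lzt)
experimental = [7.9, 10.6, 2.9, 10.3]
calculated = [7.81, 10.01, 3.17, 10.49]
print(f"RMSD over these four sites: {pk_rmsd(calculated, experimental):.2f}")
```

In a dielectric scan, the same function would be evaluated once per trial value of the protein dielectric constant, and the value giving the lowest RMSD retained.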
11. Parameterization of the model
In contrast to some popular pK prediction programs based on multi-parameter empirical models,
the only fitting parameter in our method is the value of intra-molecular dielectric constant, εm, while
all other parameters are kept at their standard CHARMm force-field values.
\[
\Delta G_{elec} = 332 \sum_i \sum_{j>i} \frac{q_i q_j}{\varepsilon_m r_{ij}} - 166 \left( \frac{1}{\varepsilon_m} - \frac{D(I, \alpha_i, \alpha_j)}{\varepsilon_{slv}} \right) \sum_i \sum_j \frac{q_i q_j}{\sqrt{r_{ij}^2 + \alpha_i \alpha_j \exp(-r_{ij}^2 / 4 \alpha_i \alpha_j)}}
\]

Hen egg lysozyme, 2lzt.pdb: pK1/2

Residue       Experimental*   CHARMM   CHARMM polar H
LYS1_NTR      7.9             7.81     8.00
LYS1          10.6            10.01    10.01
GLU7          2.9             3.17     3.39
LYS13         10.3            10.49    10.56
HIS15         5.4             6.20     5.87
ASP18         2.7             2.87     3.11
TYR20         10.3            10.85    11.18
TYR23         9.8             10.16    10.87
LYS33         10.4            10.58    10.79
GLU35         6.2             5.05     5.90
ASP48         2.5             2.96     2.91
ASP52         3.7             4.32     4.67
TYR53         >12             11.71    >12
ASP66         <2.0            2.15     2.87
ASP87         2.1             2.43     2.97
LYS96         10.7            11.18    11.42
LYS97         10.1            10.79    10.85
ASP101        4.1             3.89     3.92
LYS116        10.2            10.12    10.09
ASP119        3.2             3.08     3.28
LEU129_CTR    2.8             2.73     2.83
rmsd                          0.45     0.57

* Bartik et al., 1994, Kuramitsu and Hamaguchi 1980.

[Figure: RMSD (axis 0 to 1.2) vs. dielectric constant (axis 0 to 25).]
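The two sums in the expression for ΔGelec can be read as a Coulomb term inside the protein dielectric plus a Generalized Born polarization term. The sketch below evaluates single pair terms under stated assumptions: charges in units of e, distances and Born radii α in Å, energies in kcal/mol, and the ionic-strength factor D(I, α_i, α_j) passed in as a plain number `d` (set to 1 here). The function names are hypothetical and this is not the CHARMm implementation.

```python
import math


def f_gb(r, alpha_i, alpha_j):
    """Effective distance sqrt(r^2 + a_i*a_j*exp(-r^2 / (4*a_i*a_j)))."""
    prod = alpha_i * alpha_j
    return math.sqrt(r * r + prod * math.exp(-r * r / (4.0 * prod)))


def coulomb_pair(q_i, q_j, r, eps_m):
    """One term of the first sum: 332 * q_i*q_j / (eps_m * r_ij)."""
    return 332.0 * q_i * q_j / (eps_m * r)


def gb_pair(q_i, q_j, r, alpha_i, alpha_j, eps_m, eps_slv, d=1.0):
    """One term of the second sum; d stands in for D(I, a_i, a_j) (assumed 1)."""
    prefactor = -166.0 * (1.0 / eps_m - d / eps_slv)
    return prefactor * q_i * q_j / f_gb(r, alpha_i, alpha_j)


# Self term (r = 0) of a unit charge with Born radius 2 Å, eps_m = 10, eps_slv = 80.
print(gb_pair(1.0, 1.0, 0.0, 2.0, 2.0, eps_m=10.0, eps_slv=80.0))
# Pair of opposite unit charges 5 Å apart.
print(coulomb_pair(1.0, -1.0, 5.0, eps_m=10.0)
      + gb_pair(1.0, -1.0, 5.0, 2.0, 2.0, eps_m=10.0, eps_slv=80.0))
```

At r = 0 the effective distance reduces to the Born radius, giving the Born self-energy; at large separations it approaches r, so the pair term becomes a screened Coulomb interaction.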
12. Results: pK Prediction of Selected Proteins
• Comparison of experimental pK1/2 with calculated values for select PDB files
• All computations take about 1 minute per system on a single CPU

RMSD between calculated and experimental pK1/2:

#    PDB code            Sites with experimental pK data   CHARMm polar hydrogens   CHARMm all hydrogens   PROPKA   MCCE (ε=8)
1    4pti                14                                0.36                     0.36                   0.6      0.47
2    2lzt                21                                0.45                     0.57                   0.66     0.74
3    2rn2                25                                0.59                     0.68                   0.72     0.87
4    3rn3                16                                0.47                     0.71                   0.67     0.66
5    1pga                15                                0.50                     0.57                   0.72     0.63
6    3icb                10                                0.33                     0.35                   0.9      0.38
7    1hng                14                                0.55                     0.53                   0.83     0.76
8    1a2p                12                                0.60                     0.49                   0.68     0.89
9    1omu                15                                0.64                     0.70                   0.44     1.10
10   9rnt                14                                0.54                     0.65                   1.51
11   1bi6-heavy chain    18                                0.54                     0.53                   0.56
12   1bi6-light chain    4                                 0.18                     0.27                   0.38
13   1rgg                24                                0.84                     0.89                   0.97
14   1igd                16                                0.35                     0.36                   0.62
15   135l                11                                0.63                     0.65                   0.66
16   1div                6                                 0.26                     0.32                   0.74
17   1xnb*               13                                0.70                     1.09                   0.62
18   1kxi                3                                 0.57                     0.50                   0.66
19   1beo                10                                0.46                     0.56                   0.98
20   1trs                17                                0.88                     0.86                   0.94
21   1qbs                16                                0.34                     0.34                   0.78
22   1de3                25                                0.66                     0.70                   1.33
23   2bus                4                                 0.46                     0.49                   0.23
24   1egf                9                                 0.49                     0.53                   0.49
     Total sites         331                               331                      331                    331      141
     Average RMSD                                          0.508                    0.548                  0.742    0.720

PROPKA: Li et al. (2005) Proteins, 61, 704-721.
MCCE: Georgescu et al. (2002) Biophysical Journal, 83, 1731-1748.
13. Results: pK Prediction of Selected Proteins
[Figure: Calculated pK vs. experimental pK (both axes 0 to 14); linear fit y = 0.9868x + 0.0282, R2 = 0.9672.]
[Figure: Computation time [min] (0 to 6) vs. number of residues (0 to 800) on an Intel Pentium 4 3.0 GHz machine.]
• Predicted results correlate well with the experimental measurements
• Computation time scales roughly linearly with the number of residues
• Most systems take about 1 to 2 minutes on a single CPU
14. Comparison of the accuracy of pK predictions with other methods

PDB code   Sites   GB/IMC   MCCE   Const. pH   FD/DH   SCP    PROPKA
4pti       14      0.36     0.47   NA          0.35    0.33   0.6
2lzt       21      0.45     0.76   0.6         0.47    0.49   0.66
2rn2       25      0.59     0.87   0.9         1.17    0.57   0.72
3rn3       16      0.44     0.66   1.2         0.87    0.55   0.94
1pga       15      0.42     0.63   NA          0.80    0.59   0.72
3icb       10      0.33     0.38   NA          0.37    0.39   0.9
3rnt       4       0.28     0.54   NA          NA      0.41   NA
Average            0.41     0.63   -           0.67    0.49   0.76
15. pK1/2 Prediction – Applications
• Application 1: Optimize the protonation state of proteins and hydrogen coordinates
– Prepare the protein for other calculations, such as more stable Molecular Dynamics
simulations
• Application 2: Estimate maximum stability by studying the pH dependent folding energy
of proteins
• Application 3: Calculate the electrostatic component of protein-ligand binding energies
or protein-protein binding energy
• Application 4: Use unusual titration curves to find functionally relevant residues
• Application 5: Estimate the effect of mutation
– pK and titration curve changes at other titratable sites when a residue is mutated
– Shift of the protein's stability to a different pH when a residue is mutated
[Figure: Titration curves (fractional protonation vs. pH, 0 to 14) for HIS26, HIS95, HIS100, HIS115, HIS185, HIS195, HIS224 and HIS248; the His 95 curve is labeled.]
16. Application – Protonation and Hydrogen Coordinates
Rubredoxin from Pyrococcus furiosus at pH 8; 1vcx.pdb
Comparison of the predicted hydrogen positions with the neutron diffraction structure
17. Application – Protonation and Hydrogen Coordinates
• Protonation state of HEWL: comparison with neutron diffraction data at pH 4.7 (file: 1lzn.pdb)
• Asn and Gln flips: 13 successfully predicted out of 17 residues in the structure (76%)

Protonation state (P: protonated, D: deprotonated) and pK1/2:

Residue       Neutron diffraction   Predicted   Experimental pK1/2 (NMR*)   Calculated pK1/2
LYS1_NTR      P                     P           7.9                         8.172
LYS1          P                     P           10.6                        10.840
GLU7          P                     D           2.9                         3.701
LYS13         P                     P           10.3                        11.120
HIS15         P                     P           5.4                         7.380
ASP18         D                     D           2.7                         3.674
TYR20         P                     P           10.3                        11.271
TYR23         P                     P           9.8                         10.886
LYS33         P                     P           10.4                        11.669
GLU35         P                     P           6.2                         5.691
ASP48         D                     D           2.5                         2.818
ASP52         D                     D           3.7                         4.604
TYR53         P                     P           >12                         12.000
ASP66         D                     D           <2.0                        3.526
ASP87         D                     D           2.1                         3.389
LYS96         P                     P           10.7                        11.456
LYS97         P                     P           10.1                        10.933
ASP101        D                     D           4.1                         3.916
LYS116        P                     P           10.2                        10.220
ASP119        D                     D           3.2                         3.456
LEU129_CTR    D                     D           2.8                         2.984

* Bartik et al., 1994, Kuramitsu and Hamaguchi 1980.
18. Myoglobin 1l2k.pdb: Neutron Diffraction Structure at pH 6.8
The protonation and tautomeric states of histidine residues.
[Figure: A. Predicted structure. B. Neutron-diffraction structure.]
19. Application – Protonation and Hydrogen Coordinates
Comparison between calculated and experimental protonation states in neutron-diffraction structures.
For each residue: the computed pKhalf value, the computed fractional protonation, and the state in the crystal structure.
P – residue protonated in crystal structure; D – deprotonated; NA – more than one polar hydrogen is missing.
** – completely incorrect prediction; * – underpredicted, but close. On the slide, bold type marks accurately predicted structures.

Structure, pH    Residue   pKhalf   Fractional protonation   Crystal state
1lzn, pH 4.7     ASP18     3.66     0.13                     D
1lzn, pH 4.7     ASP48     2.80     0.03                     D
1lzn, pH 4.7     ASP52     4.54     0.47                     D
1lzn, pH 4.7     ASP66     3.67     0.13                     D
1lzn, pH 4.7     ASP87     3.33     0.07                     D
1lzn, pH 4.7     ASP101    3.90     0.18                     D
1lzn, pH 4.7     ASP119    3.45     0.08                     D
1lzn, pH 4.7     GLU7      3.70     0.13                     P**
1lzn, pH 4.7     GLU35     5.67     0.89                     P
1lzn, pH 4.7     HIS15     7.50     0.99                     P
1lzn, pH 4.7     CTR129    2.90     0.03                     D
1l2k, pH 6.8     NTR1      7.30     0.75                     NA
1l2k, pH 6.8     HIS12     6.76     0.48                     D
1l2k, pH 6.8     HIS24     6.69     0.47                     D
1l2k, pH 6.8     HIS36     7.19     0.69                     P
1l2k, pH 6.8     HIS48     6.22     0.22                     P**
1l2k, pH 6.8     HIS64     4.47     0.02                     D
1l2k, pH 6.8     HIS81     6.37     0.31                     NA
1l2k, pH 6.8     HIS82     6.41     0.33                     D
1l2k, pH 6.8     HIS97     6.28     0.26                     D
1l2k, pH 6.8     HIS113    5.60     0.10                     NA
1l2k, pH 6.8     HIS116    6.71     0.46                     NA
1l2k, pH 6.8     HIS119    4.94     0.19                     D
2gve, pH 8.0     NTR1      7.6      0.30                     P*
2gve, pH 8.0     HIS49     6.17     0.02                     D
2gve, pH 8.0     HIS54     7.6      0.30                     P*
2gve, pH 8.0     HIS71     7.03     0.11                     D
2gve, pH 8.0     HIS96     5.13     0.03                     D
2gve, pH 8.0     HIS198    6.64     0.06                     P**
2gve, pH 8.0     HIS220    7.08     0.15                     P**
2gve, pH 8.0     HIS230    6.67     0.06                     P**
2gve, pH 8.0     HIS243    6.40     0.07                     D
2gve, pH 8.0     HIS285    9.35     0.93                     P
2gve, pH 8.0     HIS382    7.54     0.29                     P*
6rsa, pH 6.6     NTR1      7.40     0.86                     P
6rsa, pH 6.6     HIS12     6.86     0.62                     P
6rsa, pH 6.6     HIS48     8.70     0.99                     P
6rsa, pH 6.6     HIS105    6.95     0.68                     P
6rsa, pH 6.6     HIS119    6.50     0.43                     P*
1vcx, pH 8       NTR1      9.22     0.94                     P
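The comparison on this slide can be reproduced mechanically by turning each fractional protonation into a predicted state and checking it against the crystal state. The sketch below assumes a cutoff of 0.5, which is an assumption made here and is not stated on the slide; the function name is hypothetical, and the five rows are copied from the slide as examples.

```python
def predicted_state(fraction_protonated, threshold=0.5):
    """Label a site 'P' (protonated) or 'D' (deprotonated); the 0.5 cutoff is assumed."""
    return "P" if fraction_protonated >= threshold else "D"


# (structure, residue, computed fractional protonation, state in crystal structure)
rows = [
    ("1lzn", "GLU35", 0.89, "P"),
    ("1lzn", "GLU7", 0.13, "P"),    # marked ** on the slide
    ("1l2k", "HIS36", 0.69, "P"),
    ("6rsa", "HIS119", 0.43, "P"),  # marked * on the slide
    ("2gve", "HIS243", 0.07, "D"),
]

for structure, residue, fraction, observed in rows:
    verdict = "agrees" if predicted_state(fraction) == observed else "differs"
    print(f"{structure} {residue}: predicted {predicted_state(fraction)}, "
          f"observed {observed} ({verdict})")
```

With this cutoff, the rows flagged * or ** on the slide are the ones whose computed fraction falls below 0.5 while the crystal shows a proton; the single asterisk marks fractions that come close to the cutoff.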
20. Application - Optimized Protonation for
Stable Molecular Dynamics
• HIV protease dimer has two Asp 25 residues in the binding pocket
• Run CHARMm MD (100 ps, GBSW solvent model) on two forms of the protein (PDB ID 1kzk)
– Protein with default protonation
– Protein with pK-optimized protonation (Asp 25 B protonated)
• Optimized protonation of the Asp 25 residues in HIV protease leads to more stable MD trajectories
[Figure: RMSD of select residues to the starting conformation, in two panels: default protonation of the Asp 25 residues, and optimized protonation.]
21. Application – Unfolding Energy
• HIV protease apo form; 1hhp.pdb
• Folding energy calculated using the zero model and the β-model
– β-model: extended conformation
– Zero model: ∆G0 computed with pKint,i = pKmod and Wij = 0
∆G(unfld) = - (Relative Folding Energy)
∆G(unfld) = ∆G0 – ∆G(fld)
[Figure: 1HHP predicted unfolding energy, ∆G(unfold) (axis 6 to 15) vs. pH (2 to 6), shown together with unfolding in urea, Todd et al. (1998) J Mol Biol, 283, 475-488.]
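In the zero model the sites do not interact (Wij = 0) and each keeps its model-compound pK, so the sum over protonation states factorizes into a product over sites. The sketch below evaluates the resulting closed form, -RT Σ ln(1 + 10^(pKmod - pH)), relative to the fully deprotonated state. The function name, the temperature and the pKmod values listed are illustrative assumptions, not values taken from the slides.

```python
import math

RT_KCAL = 0.593  # kcal/mol, roughly 298 K (assumed temperature)


def zero_model_free_energy(ph, pk_mod_values, rt=RT_KCAL):
    """Ionization free energy of non-interacting sites, relative to all sites deprotonated."""
    # For independent sites with G = 2.303*RT*sum_i x_i*(pH - pK_i), the partition
    # function is the product over sites of (1 + 10**(pK_i - pH)).
    return -rt * sum(math.log(1.0 + 10.0 ** (pk - ph)) for pk in pk_mod_values)


# Illustrative model-compound pK values (placeholders for a small set of sites).
pk_mod = [3.7, 4.3, 6.5, 10.4]

for ph in (2.0, 4.0, 6.0):
    print(f"pH {ph:.1f}: G0 = {zero_model_free_energy(ph, pk_mod):.2f} kcal/mol")
```

The unfolding energy on the slide is then the difference between this reference term and the corresponding quantity computed for the folded protein with its intrinsic pK values and site-site interactions.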
22. Application – Ligand Binding Energy
Energy of binding of KNI-272 to HIV-1 protease – 1hpx.pdb
[Figure: Calculated energy [kcal/mol] (axis 0 to 14) vs. pH (0 to 12).]
Calculated pH optimum of binding at pH ~ 5.0
The association constant is maximal between pH 5 and pH 6
Velazquez-Campoy et al. (2000) Protein Science, 9, 1801-1809.
23. MEMBRANE PROTEINS
Bacteriorhodopsin: 1c3w.pdb [1]

pK1/2

Residue          Calculated, GBIM   Calculated without membrane   Calculated using MEAD with PB and membrane [2]   Experiment
ARG82            >14                >14                           >15                                              >13.8
ASP85            2.96               7.1                           1.7                                              2.6
ASP96            8.80               8.7                           >15                                              >12
ASP115           6.54               8.1                           8.4                                              >9.5
GLU194           9.69               8.6                           >15                                              †
GLU204           3.35               8.7                           <0                                               †
ASP212           <0.00              7.1                           <0                                               <2.5
Schiff base216   >14                12.1                          >15                                              >12

† GLU194 and GLU204: proton release group keeps one proton.

1. Luecke et al. (1999) J. Mol. Biol., 291, 899-911.
2. Spassov et al. (2001) J. Mol. Biol., 312, 203-219.
24. MEMBRANE PROTEINS
β2-adrenergic G Protein-coupled Receptor: 2rh1.pdb1
antagonist: carazolol
agonist: epinephrine (adrenaline, a catecholamine)

Calculated pK1/2

Residue                 Carazolol, unbound   Carazolol, bound   Adrenaline, unbound   Adrenaline, bound
Asp 113                 9.4                  2.6                9.4                   2.4
Asp 79                  8.2                  8.4                8.2                   8.2
Glu 122                 11.0                 10.5               11.0                  10.8
Ligand: -NH2-           9.0                  12.7               8.9                   13.
Ligand: catechol -OH                                            10.4                  14.
25. MEMBRANE PROTEINS
β2-adrenergic G Protein-coupled Receptor:
Electrostatic contribution to the free energy of ligand binding.
[Figure: ∆∆G binding (axis -4 to 8) vs. pH (0 to 16), with curves for adrenaline and carazolol.]
26. MEMBRANE PROTEINS
MD simulation of the β2-adrenergic G protein-coupled receptor – adrenaline complex.

Preliminary preparation of the structure before MD simulations:
1. Use the Discovery Studio Create and Edit Membrane tool to add a membrane object to the input protein structure.
2. Run the Discovery Studio Calculate Protein Ionization and Residue pK protocol to assign the protonation state of all acidic and basic titratable groups at a selected pH.
3. Run the Add Membrane and Orient Molecule protocol for a preliminary optimization of the position of the protein relative to the membrane.

Steps 2 and 3 can be critical for the success of the MD simulations: with the default protonation state, the simulation on the 2rh1 structure was compromised at an early stage because of significant overheating of the system.

Selected parameters of the production run:

Parameter                              Value
Production Steps                       500000
Production Time Step                   0.002
Production Target Temperature          300.0
Implicit Solvent Model                 Generalized Born with Implicit Membrane (GBIM)
Dielectric Constant                    2
Implicit Solvent Dielectric Constant   80
Minimum Hydrogen Radius                1.0
Use Non-polar Surface Area             True
Non-polar Surface Constant             0
Non-polar Surface Coefficient          0.00542
Nonbond List Radius                    12.0
Nonbond Higher Cutoff Distance         11.0
Nonbond Lower Cutoff Distance          11.0
Dynamics Integrator                    Leapfrog Verlet
Apply SHAKE Constraint                 False
Random Number Seed                     314159
Number of Processors                   1
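A quick consistency check on the production settings can be written down as follows. This is only a bookkeeping sketch: the dictionary is a plain Python structure, not a Discovery Studio or CHARMm input format, and the units (picoseconds for the time step, kelvin for the temperature, Å for distances) are assumptions, since the slide lists bare numbers.

```python
production_run = {
    "Production Steps": 500000,
    "Production Time Step": 0.002,           # assumed to be in picoseconds
    "Production Target Temperature": 300.0,  # assumed to be in kelvin
    "Implicit Solvent Model": "Generalized Born with Implicit Membrane (GBIM)",
    "Dielectric Constant": 2,
    "Implicit Solvent Dielectric Constant": 80,
    "Nonbond List Radius": 12.0,             # assumed to be in Å
    "Nonbond Higher Cutoff Distance": 11.0,
    "Nonbond Lower Cutoff Distance": 11.0,
    "Apply SHAKE Constraint": False,
    "Random Number Seed": 314159,
}

# Total simulated time = number of steps * time step.
total_ps = production_run["Production Steps"] * production_run["Production Time Step"]

# The nonbond cutoff should not exceed the radius used to build the pair list.
cutoff_within_list = (production_run["Nonbond Higher Cutoff Distance"]
                      <= production_run["Nonbond List Radius"])

print(f"Total simulated time: {total_ps:.0f} ps ({total_ps / 1000.0:.1f} ns)")
print(f"Cutoff within list radius: {cutoff_within_list}")
```

Under the picosecond assumption, the step count and time step correspond to a production run of about 1 ns.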
27. MEMBRANE PROTEINS
A 1 ns MD simulation of the β2-adrenergic G protein-coupled receptor complex with adrenaline.
[Figure: RMSD values of CA atoms along the MD trajectory, in two panels: all CA atoms, and CA atoms inside the membrane (helix 1 excluded).]
The low dielectric environment of the membrane stabilizes the structure of the transmembrane helices.
28. Conclusions
• Combining the GB calculations with the IMC approach dramatically increases the speed of the calculations and makes it possible to treat very large structures of arbitrary shape, which are difficult to handle with methods that rely on grid-based solution of the Poisson-Boltzmann equation and Monte Carlo sampling schemes.
• The results of the tests indicate that the method returns very accurate pK values, comparable to the best results previously reported in the literature.
• Compared with crystallographic data at a given pH, the tests show high accuracy of the predicted protonation states and hydrogen coordinates.
• The GBIM CHARMm module makes it possible to study not only water-soluble proteins but also protein-membrane complexes.
• The Discovery Studio implementation provides an easy way to integrate the protein ionization calculations with many other molecular modeling protocols, such as pH-dependent MD simulations, ligand docking, protein docking and ion binding. It also makes it easy to study pH-dependent protein stability and the effect of mutation on protein stability.
29. Acknowledgments
Lisa Yan
Paul Flook
Don Bashford