Protein Structure Prediction using Coarse Grain Force Fields

Nasir Mahmood

12.02.2010

Overview

• Introduction

• Probabilistic Ab Initio – Standard
– Score Function
– Search Method
– Results

• Probabilistic Ab Initio – Extended
– Score Function: Introducing Solvation
– Search Method: Bias Fix
– Results

• Outlook

• Summary

“All the information required by a protein to adopt its final conformation is encoded in its sequence”
Christian B. Anfinsen (1916 - 1995)
Source: http://nobelprize.org/

• The information he referred to has not been decoded yet

• Interestingly, these days we also know about proteins like ‘prions’

Experimental Methods
• X-Ray Crystallography
• NMR Spectroscopy
• Cryo-EM

[Figure: N plotted against time (year) for the experimental methods]

• More than 3 decades and only 60000+ structures

Sequence Database Growth

[Figure: N plotted against time (year), vertical axis from 10 × 10^6 to 100 × 10^6; sequence database growth shown alongside the experimental methods (X-Ray Crystallography, NMR Spectroscopy, Cryo-EM)]

Methods
• Experimental Methods (Experimental Data): X-Ray Crystallography, NMR Spectroscopy, Cryo-EM
• PDB
• Computational Methods: Homology Modeling, Fold Recognition, Ab Initio Modeling
• Physical Principles

[Figure: the methods arranged by accuracy, computation cost and PDB dependence]

Ab Initio Methods

• Monte Carlo Methods

• Molecular Dynamics

• Physics-based
– Best but most difficult (Force fields)
– Computationally expensive

• Statistics-based
– Boltzmann distributions
– Statistical mechanical ensembles
– $P_i = e^{-\Delta E / k_B T}$

• We use Descriptive Statistics
– Bayesian formulation
– No hidden approximations
– No energies but find distributions

Our Ab Initio Method

• Coarse Grained
– reduced dimensionality
– relies on dihedral angles
– no side chains
– 5-atoms representation

• Simulated Annealing / Monte Carlo
– Move set: biased & unbiased
– Acceptance criterion: ratio of probabilities

• Fragment Assembly

• Purely Probabilistic Force Field
– Mixture of Probabilities: Sequence, Structure, Solvation
– No energies
– No Boltzmann statistics

Probabilistic Score Function

1. Sequence
• Multi-way Bernoulli

[Figure: amino acid one-letter codes]

2. Structure
• Representation:
– Reduced, Simplified
– 5-atoms per amino acid
– dihedral angles (phi, psi)
• Bivariate Gaussian
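The two terms above can be sketched in code. This is a minimal illustration only: it assumes that each position of a fragment class carries a table of residue probabilities (the multi-way Bernoulli term) and one bivariate Gaussian over (phi, psi), and that the terms multiply. The names `fragment_log_prob`, `residue_probs`, `mean` and `cov` are hypothetical, and the periodicity of the angles is ignored.

```python
import math

def bivariate_gaussian_log_pdf(phi, psi, mean, cov):
    """Log-density of (phi, psi) under a 2-D Gaussian with a full 2x2 covariance."""
    (sxx, sxy), (syx, syy) = cov
    det = sxx * syy - sxy * syx
    dx, dy = phi - mean[0], psi - mean[1]
    # Quadratic form d^T cov^-1 d, using the closed-form inverse of a 2x2 matrix.
    quad = (syy * dx * dx - (sxy + syx) * dx * dy + sxx * dy * dy) / det
    return -0.5 * quad - math.log(2.0 * math.pi) - 0.5 * math.log(det)

def fragment_log_prob(residues, angles, positions):
    """Sum the sequence and structure log-terms over all fragment positions.

    `positions` is a hypothetical per-position parameter list for one class;
    angle periodicity is ignored in this sketch.
    """
    total = 0.0
    for residue, (phi, psi), pos in zip(residues, angles, positions):
        total += math.log(pos["residue_probs"][residue])  # multi-way Bernoulli term
        total += bivariate_gaussian_log_pdf(phi, psi, pos["mean"], pos["cov"])
    return total
```

Working with log-probabilities keeps the product of many small terms numerically stable.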

[Figure: panels (A), (B) and (C) showing overlapping fragments starting at residues i, i+1, i+2, ...; one panel is labeled 1.5 × 10^6]

Fragment   Sequence         Structure
i          A S T C W R I    -3.1 -2.0 -0.5 -1.7 -2.0 -1.5 -2.2
                            -1.1 -0.9 -0.7 -0.5 -0.3 -0.8 -1.0
i+1        S T C W R I M    -2.0 -0.5 -1.7 -2.0 -1.5 -2.2 -1.1
                            -0.9 -0.7 -0.5 -0.3 -0.8 -1.0 -1.1
i+2        T C W R I M F    -0.5 -1.7 -2.0 -1.5 -2.2 -1.1 -2.1
                            -0.7 -0.5 -0.3 -0.8 -1.0 -1.1 -0.4
…
N          P L E N R R V     3.1  2.0  1.5  1.7 -2.0 -1.5 -1.2
                             1.1  0.9 -2.5  2.3 -0.9 -1.2 -0.8
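The overlapping fragments shown on this slide can be produced with a sliding window. The sketch below reuses the residues and the two rows of structure values for fragments i to i+2 from the slide. Treating the two rows as one pair of values per residue is an assumption, and `overlapping_fragments` is a hypothetical helper name.

```python
def overlapping_fragments(sequence, angles, length=7):
    """Yield (start, residues, angle pairs) for every window of `length` residues."""
    if len(sequence) != len(angles):
        raise ValueError("sequence and angles must have the same length")
    for start in range(len(sequence) - length + 1):
        yield start, sequence[start:start + length], angles[start:start + length]

# Values taken from the slide (fragments i, i+1 and i+2).
sequence = "ASTCWRIMF"
first_row = [-3.1, -2.0, -0.5, -1.7, -2.0, -1.5, -2.2, -1.1, -2.1]
second_row = [-1.1, -0.9, -0.7, -0.5, -0.3, -0.8, -1.0, -1.1, -0.4]
angles = list(zip(first_row, second_row))  # assumed pairing: one pair per residue

fragments = list(overlapping_fragments(sequence, angles))
for start, residues, _ in fragments:
    print(start, residues)
```

Each residue therefore appears in up to seven fragments, which is what lets neighbouring fragments constrain one another.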

Fragment Generation

• Fragment Library → Expectation Maximization → Statistical Models → Bayesian Classifier → Classified fragments

[Figure: example fragments ACAD.., CCAD.., WFTG.., STST.., STDC.., WFDC.., DCWF.., GAEG.., GGGG.. before and after classification]
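The classification step can be illustrated with Bayes' rule. This sketch assumes that each class has a weight and that the likelihood of a fragment under each class is already available as a log-value. It is the same computation as the E-step of expectation maximization for a mixture model. The function name `class_posteriors` is hypothetical.

```python
import math

def class_posteriors(log_weights, log_likelihoods):
    """Posterior probability of each class given one fragment (Bayes' rule)."""
    joint = [w + l for w, l in zip(log_weights, log_likelihoods)]
    top = max(joint)
    # Log-sum-exp: subtract the largest term before exponentiating to avoid underflow.
    log_total = top + math.log(sum(math.exp(j - top) for j in joint))
    return [math.exp(j - log_total) for j in joint]

def most_probable_class(log_weights, log_likelihoods):
    """Index of the class with the highest posterior probability."""
    posteriors = class_posteriors(log_weights, log_likelihoods)
    return max(range(len(posteriors)), key=posteriors.__getitem__)
```

A fragment does not have to be assigned to a single class; the full posterior vector can be kept as a soft assignment.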

[Figure: overlapping 4-residue fragments along the sequence (A S L T, S L T M, L T L T, T L T I, L T T A, ...), each shown with its structure values and assigned to one of class 0 to class 6; the classified library fragments (ACAD.., CCAD.., WFTG.., STDC.., STST.., DCWF.., GAEG.., WFDC.., GGGG..) are grouped by class]
• Relative probabilities: $P_i = \frac{p(x_i)}{p(x_{i-1})}$

• Normal methods: $P_i = e^{-\Delta E / k_B T}$

[Figure: probability over conformational space, from the initial (random) conformation through steps (i-1) and (i) to the final model]
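An acceptance step based on a ratio of probabilities, combined with a cooling schedule, might look like the sketch below. Two things here are assumptions rather than part of the slides: the temperature enters as an exponent 1/T on the ratio, and the proposals are treated as symmetric. The names `accept_move` and `anneal` and the toy one-angle target are hypothetical.

```python
import math
import random

def accept_move(log_p_new, log_p_old, temperature=1.0, rng=random):
    """Accept with probability min(1, (p_new / p_old) ** (1 / temperature))."""
    log_ratio = (log_p_new - log_p_old) / temperature
    return log_ratio >= 0.0 or rng.random() < math.exp(log_ratio)

def anneal(log_prob, propose, state, temperatures, rng):
    """Run one proposal per temperature and return the best state seen."""
    current_lp = log_prob(state)
    best, best_lp = state, current_lp
    for temperature in temperatures:
        candidate = propose(state, rng)
        candidate_lp = log_prob(candidate)
        if accept_move(candidate_lp, current_lp, temperature, rng):
            state, current_lp = candidate, candidate_lp
            if current_lp > best_lp:
                best, best_lp = state, current_lp
    return best, best_lp

# Toy target: one angle (degrees) with a single probability peak at -60.
def toy_log_prob(angle):
    return -0.5 * ((angle + 60.0) / 20.0) ** 2

def toy_propose(angle, rng):
    return angle + rng.uniform(-30.0, 30.0)

schedule = [10.0 * 0.99 ** step for step in range(500)]  # geometric cooling
best_angle, best_lp = anneal(toy_log_prob, toy_propose, 150.0, schedule, random.Random(0))
```

No energy appears anywhere: only the ratio of two probabilities is needed, so the score function never has to be converted into a Boltzmann weight.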

Move set

• Unbiased: Random Angle Generator (e.g. 93 177 66 14 167 73 31 54)

• Biased: Fragment Library from the PDB, ≈ 2 × 10^6 fragments

[Figure: phi/psi plots for the unbiased and the biased moves, both axes running from -180 to 180]
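The two kinds of move can be sketched as follows, assuming a conformation is stored as a list of (phi, psi) pairs in degrees. The fragment library in the usage example is a hypothetical stand-in with made-up values. A real biased move would presumably also match the fragment to the local sequence, which this sketch leaves out.

```python
import random

def unbiased_move(conformation, rng):
    """Replace the (phi, psi) pair of one random residue with uniform random angles."""
    new = list(conformation)
    index = rng.randrange(len(new))
    new[index] = (rng.uniform(-180.0, 180.0), rng.uniform(-180.0, 180.0))
    return new

def biased_move(conformation, library, rng):
    """Copy the angles of a randomly chosen library fragment into a random window."""
    fragment = rng.choice(library)
    start = rng.randrange(len(conformation) - len(fragment) + 1)
    new = list(conformation)
    new[start:start + len(fragment)] = fragment
    return new

# Hypothetical usage: a 6-residue chain and a one-entry library of 3-residue fragments.
rng = random.Random(0)
chain = [(-60.0, -45.0)] * 6
library = [[(-120.0, 130.0), (-115.0, 125.0), (-125.0, 135.0)]]
after_unbiased = unbiased_move(chain, rng)
after_biased = biased_move(chain, library, rng)
```

Both functions return a new list and leave the input untouched, so a rejected move costs nothing to undo.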

Interplay of Cartesian Coordinates & Dihedral Angles

Choi, V.: 2005, On Updating torsion angles of molecular conformations, J Chem Inf Model 46, 438–444.
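One direction of this interplay, reading a dihedral angle from four Cartesian positions, can be written with the usual projection formula. The sketch is a generic textbook calculation and not the update scheme of the cited paper. The helper names are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points, about the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    axis = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the two outer bonds onto the plane perpendicular to the central axis.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))
```

Using `atan2` rather than `acos` keeps the sign of the angle and stays well behaved near 0 and 180 degrees.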

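Results: 2hfq
[Figure: Model and Native structures]

Results: 2hd3
[Figure: Model and Native structures]

Results: 2gzv
[Figure: Model and Native structures; plot with axes Phi and Psi]

Results: 2hj1
[Figure: Model and Native structures; plot labels Score, Time, Temperature]

Results
[Figure: plot with axes Phi and Psi; plot labels Score, Time, Temperature]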

Score Function: Introducing Solvation

[Figure: residues Trp, Gly, Lys and Ser, taken from the PDB]

1. Sequence
• Multi-way Bernoulli

[Figure: amino acid one-letter codes]

2. Structure
• Representation:
– Reduced, Simplified
– 5-atoms per amino acid
– dihedral angles (phi, psi)
• Bivariate Gaussian

3. Solvation
• Simple Gaussian

• Mixture Models: Re-Classified

• PDB: Connections, Residues, Geometry, Location in protein

Sequence    Structure              Solvation
A S L T     -3.1 -2.0 -0.5 -1.7    12 07 08 11
            -1.1 -0.9 -0.7 -0.5
S L T I     -2.0 -0.5 -1.7 -1.2    07 08 11 09
            -0.9 -0.7 -0.5 -0.4

• Fragment Library → Expectation Maximization → Statistical Models → Bayesian Classifier → Re-Classified fragments

[Figure: example fragments ACAD.., CCAD.., WFTG.., STST.., STDC.., WFDC.., DCWF.., GAEG.., GGGG..]
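A mixture over classes with sequence, structure and solvation terms can be evaluated as sketched below. The sketch assumes that the three terms are independent given the class, so their log-values add, and that the solvation term is a one-dimensional Gaussian over a per-residue number such as those in the Solvation column. The function names are hypothetical.

```python
import math

def gaussian_log_pdf(x, mean, var):
    """Log-density of a one-dimensional Gaussian (the assumed solvation term)."""
    return -0.5 * ((x - mean) ** 2 / var + math.log(2.0 * math.pi * var))

def mixture_log_prob(log_weights, per_class_terms):
    """log p(x) = log sum_c w_c * p_seq(x|c) * p_struct(x|c) * p_solv(x|c).

    `per_class_terms` holds, for each class, the log-values of the three terms.
    """
    joint = [lw + sum(terms) for lw, terms in zip(log_weights, per_class_terms)]
    top = max(joint)
    # Log-sum-exp keeps the sum stable when every term is very small.
    return top + math.log(sum(math.exp(j - top) for j in joint))
```

Adding a new kind of information then only means adding one more log-term per class, without reweighting the existing terms.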

Search Method: Bias Fix & Combining Fragments

Combining Fragments and Probabilities

Results: 1fsv, 2hep
[Figure: Native and Model structures]

Results: 2k4x, 1agt
[Figure: Native and Model structures]

Results: 2k53, 2k4n
[Figure: Native and Model structures]

Results: 2hf1
[Figure: Native and Model structures]

Future Outlook

• Introduce hydrogen bonds as a probabilistic term

• Hydrogen bond energies have a normal distribution

• Use a Simple Gaussian model

[Figure: histogram of N against hydrogen bond energy (kcal/mol)]
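Fitting a simple Gaussian to a set of hydrogen bond energies and turning it into a log-probability term could look like this. The energies in the usage note are placeholders chosen only to show the call and are not measured data. The function names are hypothetical.

```python
import math
import statistics

def fit_gaussian(values):
    """Maximum-likelihood mean and variance of a one-dimensional Gaussian."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    return mean, var

def gaussian_log_pdf(x, mean, var):
    """Log-density of x under the fitted Gaussian, usable as a probabilistic term."""
    return -0.5 * ((x - mean) ** 2 / var + math.log(2.0 * math.pi * var))
```

For example, `fit_gaussian([-1.0, -2.0, -3.0])` yields the mean and variance of three placeholder energies in kcal/mol, which `gaussian_log_pdf` can then score new values against.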

Summary

• Purely Probabilistic Approach for Protein Structure
Prediction
• Score function consists of a set of probability distributions
• Conformation probabilities - mixture of probabilities, no
energies at all

• Generates protein and protein-like conformations
• Long-range interactions are not well represented
• In future, a hydrogen bond term could improve results

• Application to sequence optimization
• Rapid sampling – combine with other score functions