This document discusses moment closure inference for stochastic kinetic models. It begins with an introduction to moment closure techniques using a simple birth-death process as a case study. It then discusses how to derive moment equations from the chemical master equation and how the deterministic model can be viewed as an approximation of the stochastic model by setting the variance to zero. The document also examines some limitations of moment closure approximations using examples of heat shock and p53-Mdm2 oscillation models. Finally, it presents a case study of using moment closure to model cotton aphid populations based on field data.
Moment closure inference for stochastic kinetic models
1. Moment closure inference for stochastic kinetic models
Colin Gillespie
School of Mathematics & Statistics
2. Talk outline
An introduction to moment closure
Case study: Aphids
Conclusion
3. Birth-death process
Birth-death model
X −→ 2X and X −→ ∅
which has the propensity functions λX and µX .
Deterministic representation
The deterministic model is
$$\frac{dX(t)}{dt} = (\lambda - \mu) X(t),$$
which can be solved to give X(t) = X(0) exp[(λ − µ)t].
5. Stochastic representation
In the stochastic framework, each reaction has a probability of occurring
The analogous version of the birth-death process is the difference equation
$$\frac{dp_n}{dt} = \lambda (n-1) p_{n-1} + \mu (n+1) p_{n+1} - (\lambda + \mu) n p_n$$
Usually called the forward Kolmogorov equation or chemical master equation
[Figure: Population against Time, 0 to 4]
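A single realisation of the birth-death process can be drawn with Gillespie's direct method. The sketch below is illustrative only: the function name, the starting population and the rate values are assumptions chosen for the example, not values from the talk.

```python
import random

def simulate_birth_death(x0, lam, mu, t_max, rng):
    """Simulate one birth-death path with Gillespie's direct method.

    Returns the event times and the population after each event.
    """
    t, x = 0.0, x0
    times, states = [t], [x]
    while x > 0:
        birth, death = lam * x, mu * x   # propensities lambda*X and mu*X
        total = birth + death
        t += rng.expovariate(total)      # waiting time to the next event
        if t > t_max:
            break
        # Pick the reaction that fires, in proportion to its propensity
        x += 1 if rng.random() * total < birth else -1
        times.append(t)
        states.append(x)
    return times, states

# Hypothetical values, chosen only to illustrate the call
rng = random.Random(1)
times, states = simulate_birth_death(x0=10, lam=1.0, mu=0.5, t_max=4.0, rng=rng)
```

Averaging many such paths gives a Monte Carlo estimate of the mean that the moment equations describe directly.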
6. Moment equations
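Multiply the CME by e^{nθ} and sum over n, to obtain
$$\frac{\partial M}{\partial t} = \left[\lambda (e^{\theta} - 1) + \mu (e^{-\theta} - 1)\right] \frac{\partial M}{\partial \theta}$$
where
$$M(\theta; t) = \sum_{n=0}^{\infty} e^{n\theta} p_n(t)$$
If we differentiate this p.d.e. w.r.t. θ and set θ = 0, we get
$$\frac{dE[N(t)]}{dt} = (\lambda - \mu) E[N(t)]$$
where E[N(t)] is the mean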
7. The mean equation
$$\frac{dE[N(t)]}{dt} = (\lambda - \mu) E[N(t)]$$
This ODE is solvable; the associated forward Kolmogorov equation is also solvable
The equation for the mean and the deterministic ODE are identical
When the rate laws are linear, the stochastic mean and deterministic solution always correspond
8. The variance equation
If we differentiate the p.d.e. w.r.t. θ twice and set θ = 0, we get:
$$\frac{dE[N(t)^2]}{dt} = (\lambda + \mu) E[N(t)] + 2(\lambda - \mu) E[N(t)^2]$$
and hence the variance Var[N(t)] = E[N(t)^2] − E[N(t)]^2.
Differentiating three times gives an expression for the third moment and hence the skewness, etc
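As a numerical check, the two moment ODEs can be integrated and compared with their closed-form solutions. The sketch below uses the second-moment equation dE[N²]/dt = (λ + µ)E[N] + 2(λ − µ)E[N²], which follows from differentiating the p.d.e. twice at θ = 0. The hand-written Runge-Kutta integrator, the function names and the parameter values are assumptions made for the example.

```python
import math

def moment_rhs(state, lam, mu):
    """Right-hand side of the ODEs for E[N] and E[N^2]."""
    m1, m2 = state
    dm1 = (lam - mu) * m1
    dm2 = (lam + mu) * m1 + 2.0 * (lam - mu) * m2
    return (dm1, dm2)

def rk4(rhs, state, t_end, steps, *args):
    """Integrate an ODE system from 0 to t_end with classical Runge-Kutta."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state, *args)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), *args)
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), *args)
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)), *args)
        state = tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

# Hypothetical values: N(0) = 10 is known exactly, so E[N^2] starts at 100
lam, mu, n0, t_end = 1.0, 0.5, 10.0, 2.0
mean, second = rk4(moment_rhs, (n0, n0 ** 2), t_end, 2000, lam, mu)
variance = second - mean ** 2

# Closed-form solutions of the same linear ODEs, for comparison
r = lam - mu
growth = math.exp(r * t_end)
exact_mean = n0 * growth
exact_var = n0 * (lam + mu) / r * growth * (growth - 1.0)
```

Because the rate laws are linear, these two equations are closed and no distributional assumption is needed.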
10. Dimerisation moment equations
We formulate the dimer model in terms of moment equations
$$\frac{dE[X_1]}{dt} = 0.5 k_1 (E[X_1^2] - E[X_1]) - k_2 E[X_1]$$
$$\frac{dE[X_1^2]}{dt} = k_1 (E[X_1^2 X_2] - E[X_1 X_2]) + 0.5 k_1 (E[X_1^2] - E[X_1]) + k_2 (E[X_1] - 2E[X_1^2])$$
where E[X_1] is the mean of X_1 and E[X_1^2] − E[X_1]^2 is the variance
The i-th moment equation depends on the (i + 1)-th moment
11. Deterministic approximates stochastic
Rewriting
$$\frac{dE[X_1]}{dt} = 0.5 k_1 (E[X_1^2] - E[X_1]) - k_2 E[X_1]$$
in terms of its variance, i.e. E[X_1^2] = Var[X_1] + E[X_1]^2, we get
$$\frac{dE[X_1]}{dt} = 0.5 k_1 E[X_1](E[X_1] - 1) + 0.5 k_1 \mathrm{Var}[X_1] - k_2 E[X_1] \qquad (1)$$
Setting Var[X_1] = 0 in (1) recovers the deterministic equation
So we can consider the deterministic model as an approximation to the stochastic model
When we have polynomial rate laws, setting the variance to zero results in the deterministic equation
13. Simple dimerisation model
To close the equations, we assume an underlying distribution
The easiest option is to assume an underlying Normal distribution, i.e.
$$E[X_1^3] = 3 E[X_1^2] E[X_1] - 2 E[X_1]^3$$
But we could also use the Poisson
$$E[X_1^3] = E[X_1] + 3 E[X_1]^2 + E[X_1]^3$$
or the Log normal
$$E[X_1^3] = \left( \frac{E[X_1^2]}{E[X_1]} \right)^3$$
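The three closures are simple functions of the lower-order moments, so they are easy to compare side by side. The sketch below evaluates each one on the first two moments of a Poisson distribution with mean 4 (an arbitrary test case chosen for this example); the function names are hypothetical.

```python
def third_moment_normal(m1, m2):
    """Normal closure: the third central moment is set to zero."""
    return 3.0 * m2 * m1 - 2.0 * m1 ** 3

def third_moment_poisson(m1):
    """Poisson closure: every moment is a function of the mean alone."""
    return m1 + 3.0 * m1 ** 2 + m1 ** 3

def third_moment_lognormal(m1, m2):
    """Log normal closure."""
    return (m2 / m1) ** 3

# Exact first two moments of a Poisson(4): E[X] = 4, E[X^2] = 4 + 4^2 = 20
m1, m2 = 4.0, 20.0
normal = third_moment_normal(m1, m2)        # 112.0
poisson = third_moment_poisson(m1)          # 116.0, the exact third moment here
lognormal = third_moment_lognormal(m1, m2)  # 125.0
```

In this test case the Normal closure underestimates the exact third moment and the Log normal closure overestimates it, which illustrates why the choice of closure can matter.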
15. Heat shock model
Proctor et al, 2005. Stochastic kinetic model of the heat shock system
twenty-three reactions
seventeen chemical species
A single stochastic simulation up to t = 2000 takes about 35 minutes.
If we convert the model to moment equations, we get 139 equations
[Figure: ADP and Native Protein populations against Time, 0 to 2000]
Gillespie, CS, 2009
16. Density plots: heat shock model
[Figure: density of the ADP population at Time t=200 and Time t=2000]
17. P53-Mdm2 oscillation model
Proctor and Gray, 2008
16 chemical species
Around a dozen reactions
The model contains an event
At t = 1, set X = 0
If we convert the model to moment equations, we get 139 equations.
However, in this case the moment closure approximation doesn’t do too well!
[Figure: Population against Time, 0 to 30]
20. What went wrong?
Moment closure tends to fail when there is a large difference between the deterministic and stochastic formulations
In this particular case, the species are strongly correlated
Typically when the MC approximation fails, it gives a negative variance
The MC approximation does work well for other parameter values for the p53 model
22. Cotton aphids
Aphid infestation (Gillespie & Golightly, 2010)
A cotton aphid infestation of a cotton plant can result in:
leaves that curl and pucker
seedling plants that become stunted and may die
stained cotton, when the infestation occurs late in the season
Cotton aphids have developed resistance to many chemical treatments and so can be difficult to treat
Basically it costs someone a lot of money
24. Cotton aphids
The data consists of
five observations at each plot
the sampling times are t=0, 1.14, 2.29, 3.57 and 4.57 weeks (i.e.
every 7 to 8 days)
three blocks, each being in a distinct area
three irrigation treatments (low, medium and high)
three nitrogen levels (blanket, variable and none)
27. Some notation
Let
n(t) be the size of the aphid population at time t
c(t) be the cumulative aphid population at time t
1. We observe n(t) at discrete time points
2. We don’t observe c(t)
3. c(t) ≥ n(t)
28. The model
We assume, based on previous modelling (Matis et al., 2004)
An aphid birth rate of λn(t)
An aphid death rate of µn(t)c(t)
So extinction is certain, as eventually µnc > λn for large t
29. The model
Deterministic representation
Previous modelling efforts have focused on deterministic models:
$$\frac{dN(t)}{dt} = \lambda N(t) - \mu C(t) N(t)$$
$$\frac{dC(t)}{dt} = \lambda N(t)$$
Some problems
Initial and final aphid populations are quite small
No allowance for ‘natural’ random variation
Solution: use a stochastic model
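The deterministic aphid model is easy to integrate numerically. The sketch below uses a hand-written Runge-Kutta scheme with N(0) = C(0) = 1, λ = 1.7 and µ = 0.001, the values quoted for the simulations elsewhere in these slides; the function names and step count are assumptions. Dividing the two ODEs gives dN/dC = 1 − (µ/λ)C, so N = N(0) + (C − C(0)) − (µ/2λ)(C² − C(0)²) along any solution, which provides a check on the integration.

```python
def aphid_rhs(state, lam, mu):
    """Deterministic aphid model: dN/dt and dC/dt."""
    n, c = state
    return (lam * n - mu * c * n, lam * n)

def rk4_path(rhs, state, t_end, steps, *args):
    """Classical Runge-Kutta; returns the state at every step."""
    h = t_end / steps
    path = [state]
    for _ in range(steps):
        k1 = rhs(state, *args)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), *args)
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), *args)
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)), *args)
        state = tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        path.append(state)
    return path

lam, mu = 1.7, 0.001
path = rk4_path(aphid_rhs, (1.0, 1.0), 10.0, 10000, lam, mu)

peak_n = max(n for n, _ in path)   # the peak occurs where mu * C = lam
final_n, final_c = path[-1]
# Value of N implied by the invariant, given the final C
implied_n = final_c - mu / (2.0 * lam) * (final_c ** 2 - 1.0)
```

With these values the invariant places the peak population at roughly 850 aphids, reached when C = λ/µ = 1700.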
31. The model
Stochastic representation
Let p_{n,c}(t) denote the probability that, at time t, there are n aphids in the population and the cumulative population size is c
This gives the forward Kolmogorov equation
$$\frac{dp_{n,c}(t)}{dt} = \lambda (n-1) p_{n-1,c-1}(t) + \mu c (n+1) p_{n+1,c}(t) - n(\lambda + \mu c) p_{n,c}(t)$$
Even though this equation is fairly simple, it still can’t be solved exactly.
32. Some simulations
[Figure: simulated Aphid pop. against Time (days), 0 to 10]
Parameters: n(0) = c(0) = 1, λ = 1.7 and µ = 0.001
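Realisations like these can be generated with Gillespie's direct method applied to the two aphid reactions: a birth moves (n, c) to (n + 1, c + 1) at rate λn, and a death moves (n, c) to (n − 1, c) at rate µnc. The following is a minimal sketch under those assumptions; the function name and the seed are hypothetical, and this is not the code used to produce the original figure.

```python
import random

def simulate_aphids(n0, c0, lam, mu, t_max, rng):
    """One stochastic path of (time, n, c) for the aphid model."""
    t, n, c = 0.0, n0, c0
    path = [(t, n, c)]
    while n > 0:
        birth = lam * n        # propensity of a birth
        death = mu * n * c     # propensity of a death
        total = birth + death
        t += rng.expovariate(total)
        if t > t_max:
            break
        if rng.random() * total < birth:
            n += 1
            c += 1             # every birth also adds to the cumulative count
        else:
            n -= 1
        path.append((t, n, c))
    return path

rng = random.Random(2024)
path = simulate_aphids(n0=1, c0=1, lam=1.7, mu=0.001, t_max=10.0, rng=rng)
```

Starting from a single aphid, some runs go extinct almost immediately, which is one of the behaviours the deterministic model cannot represent.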
35. Stochastic parameter estimation
Let X(t_u) = (n(t_u), c(t_u)) be the vector of observed aphid counts and unobserved cumulative population size at time t_u
To infer λ and µ, we need to estimate
$$\Pr[\mathbf{X}(t_u) \mid \mathbf{X}(t_{u-1}), \lambda, \mu]$$
i.e. the solution of the forward Kolmogorov equation
We will use moment closure to estimate this distribution
37. Moment equations for the means
$$\frac{dE[n(t)]}{dt} = \lambda E[n(t)] - \mu \left( E[n(t)] E[c(t)] + \mathrm{Cov}[n(t), c(t)] \right)$$
$$\frac{dE[c(t)]}{dt} = \lambda E[n(t)]$$
The equation for E[n(t)] depends on Cov[n(t), c(t)]
Setting Cov[n(t), c(t)] = 0 gives the deterministic model
We obtain similar equations for higher-order moments
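The second-order equations are not written out on the slide. The sketch below uses a set derived here from the two reactions (birth: n and c both increase by one at rate λn; death: n decreases by one at rate µnc), with the third-order moments E[n²c] and E[nc²] replaced by a Normal closure. Treat the equations, the function names and the integrator as assumptions of this example rather than the author's implementation.

```python
import math

def aphid_moment_rhs(m, lam, mu):
    """ODEs for E[n], E[c], E[n^2], E[nc], E[c^2] under a Normal closure."""
    en, ec, enn, enc, ecc = m
    # Normal closure: third-order central moments are set to zero
    en2c = 2.0 * en * enc + ec * enn - 2.0 * en * en * ec
    enc2 = 2.0 * ec * enc + en * ecc - 2.0 * en * ec * ec
    return (
        lam * en - mu * enc,
        lam * en,
        2.0 * lam * enn + lam * en - 2.0 * mu * en2c + mu * enc,
        lam * (enn + enc + en) - mu * enc2,
        2.0 * lam * enc + lam * en,
    )

def rk4(rhs, state, t_end, steps, *args):
    """Integrate an ODE system from 0 to t_end with classical Runge-Kutta."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state, *args)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), *args)
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), *args)
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)), *args)
        state = tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def closed_moments(n0, c0, lam, mu, t_end, steps=2000):
    """Approximate mean vector and covariance matrix of (n, c) at t_end."""
    start = (n0, c0, n0 * n0, n0 * c0, c0 * c0)
    en, ec, enn, enc, ecc = rk4(aphid_moment_rhs, start, t_end, steps, lam, mu)
    cov_nc = enc - en * ec
    return (en, ec), ((enn - en * en, cov_nc), (cov_nc, ecc - ec * ec))

mean, cov = closed_moments(n0=1.0, c0=1.0, lam=1.7, mu=0.001, t_end=1.0)
```

With µ = 0 the closure terms drop out and the system reduces to a pure birth process, whose mean and variance are known in closed form; that special case is a convenient check on the equations.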
39. Parameter inference
Given
the parameters: {λ, µ}
the initial states: X(t_{u−1}) = (n(t_{u−1}), c(t_{u−1}))
We have
$$\mathbf{X}(t_u) \mid \mathbf{X}(t_{u-1}), \lambda, \mu \sim N(\psi_{u-1}, \Sigma_{u-1})$$
where ψ_{u−1} and Σ_{u−1} are calculated using the moment closure approximation
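Each transition therefore contributes a bivariate Normal density to the likelihood. A minimal sketch of that density on the log scale is shown below; the function name and the numbers in the usage lines are hypothetical. The explicit check on the covariance matrix matters in practice, because a failed closure can return a negative variance.

```python
import math

def bivariate_normal_logpdf(x, mean, cov):
    """Log density of a bivariate Normal at x = (n, c)."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    if s11 <= 0.0 or det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    d1, d2 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form d^T Sigma^{-1} d, using the explicit 2x2 inverse
    quad = (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

# Hypothetical closure output (psi, Sigma) and next state, for illustration
psi = (30.0, 35.0)
sigma = ((25.0, 20.0), (20.0, 30.0))
log_density = bivariate_normal_logpdf((28.0, 33.0), psi, sigma)
```

Summing this quantity over the observation intervals gives the log-likelihood term of the posterior.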
40. Parameter inference
We summarise our beliefs about {λ, µ} and the unobserved cumulative population c(t_0) via priors p(λ, µ) and p(c(t_0))
The joint posterior for parameters and unobserved states (for a single data set) is
$$p(\lambda, \mu, \mathbf{c} \mid \mathbf{n}) \propto p(\lambda, \mu)\, p(c(t_0)) \prod_{u=1}^{4} p(\mathbf{x}(t_u) \mid \mathbf{x}(t_{u-1}), \lambda, \mu)$$
For the results shown, we used a simple random walk MH step to explore the parameter and state spaces
For more complicated models, we can use a Durham & Gallant style bridge (Milner, Gillespie & Wilkinson, 2012).
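A random walk Metropolis-Hastings step needs only a function that returns the log of the unnormalised target. The sketch below is generic: in the aphid application the target would be the joint posterior above, but here a two-dimensional Gaussian stands in for it so that the example is self-contained. The function name, step size and iteration count are assumptions.

```python
import math
import random

def random_walk_metropolis(log_target, start, step, n_iter, rng):
    """Random walk Metropolis sampler with Gaussian proposals."""
    current = list(start)
    current_lp = log_target(current)
    samples = []
    for _ in range(n_iter):
        proposal = [x + rng.gauss(0.0, step) for x in current]
        proposal_lp = log_target(proposal)
        # The proposal is symmetric, so the acceptance ratio is the target ratio
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(list(current))
    return samples

def toy_log_target(theta):
    """Stand-in target: independent unit-variance Normals centred at (1, -1)."""
    return -0.5 * ((theta[0] - 1.0) ** 2 + (theta[1] + 1.0) ** 2)

rng = random.Random(7)
samples = random_walk_metropolis(toy_log_target, [0.0, 0.0], 1.0, 20000, rng)
kept = samples[2000:]   # discard an initial burn-in
post_mean = [sum(s[j] for s in kept) / len(kept) for j in range(2)]
```

Working on the log scale avoids numerical underflow when the likelihood is a product of many small densities.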
42. Simulation study
Three treatments & two blocks
Baseline birth and death rates: {λ = 1.75, µ = 0.00095}
Treatment 2 increases µ by 0.0004
Treatment 3 increases λ by 0.35
The block effect reduces µ by 0.0003
          Treatment 1       Treatment 2       Treatment 3
Block 1   {1.75, 0.00095}   {1.75, 0.00135}   {2.1, 0.00095}
Block 2   {1.75, 0.00065}   {1.75, 0.00105}   {2.1, 0.00065}
47. Parameter structure
Let i and k index the block and treatment levels, with i ∈ {1, 2} and k ∈ {1, 2, 3}
For each data set, we assume birth rates of the form:
$$\lambda_{ik} = \lambda + \alpha_i + \beta_k$$
where α_1 = β_1 = 0
So for block 1, treatment 1 we have:
$$\lambda_{11} = \lambda$$
and for block 2, treatment 1 we have:
$$\lambda_{21} = \lambda + \alpha_2$$
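The additive structure can be checked against the simulation-study settings (baseline λ = 1.75 and µ = 0.00095, treatment 2 adds 0.0004 to µ, treatment 3 adds 0.35 to λ, block 2 subtracts 0.0003 from µ). The sketch below assumes the death rates follow the same additive form as the birth rates; the dictionary layout and function name are hypothetical.

```python
# Baseline rates and additive effects from the simulation study
baseline = {"lam": 1.75, "mu": 0.00095}
block_effect = {
    1: {"lam": 0.0, "mu": 0.0},       # alpha_1 = 0
    2: {"lam": 0.0, "mu": -0.0003},
}
treatment_effect = {
    1: {"lam": 0.0, "mu": 0.0},       # beta_1 = 0
    2: {"lam": 0.0, "mu": 0.0004},
    3: {"lam": 0.35, "mu": 0.0},
}

def rates(i, k):
    """Birth and death rates for block i and treatment k."""
    lam = baseline["lam"] + block_effect[i]["lam"] + treatment_effect[k]["lam"]
    mu = baseline["mu"] + block_effect[i]["mu"] + treatment_effect[k]["mu"]
    # Round away floating-point noise so the values print cleanly
    return round(lam, 5), round(mu, 5)

table = {(i, k): rates(i, k) for i in (1, 2) for k in (1, 2, 3)}
```

The six entries of `table` reproduce the {λ, µ} pairs of the simulation study.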
48. MCMC scheme
Using the MCMC scheme described previously, we generated 2M iterates and thinned by 1K
This took a few hours and convergence was fairly quick
We used independent proper uniform priors for the parameters
For the initial unobserved cumulative population, we had
c(t_0) = n(t_0) + ε
where ε has a Gamma distribution with shape 1 and scale 10.
This setup mirrors the scheme that we used for the real data set
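Two details of this scheme are easy to illustrate: the prior on the initial cumulative population and the thinning. The sketch below draws the Gamma excess (shape 1, scale 10, so with mean 10) with the standard library and shows that thinning 2M iterates by 1K leaves 2000 draws. The observed count n(t_0) = 5 and the function names are hypothetical.

```python
import random

def draw_initial_cumulative(n_t0, rng, shape=1.0, scale=10.0):
    """Draw the initial cumulative population as n(t0) plus a Gamma excess."""
    return n_t0 + rng.gammavariate(shape, scale)

def thin(chain, every):
    """Keep every `every`-th iterate of a chain."""
    return chain[every - 1::every]

rng = random.Random(42)
n_t0 = 5   # hypothetical observed count at the first sampling time
draws = [draw_initial_cumulative(n_t0, rng) for _ in range(20000)]
mean_excess = sum(d - n_t0 for d in draws) / len(draws)   # close to shape * scale

# Iterate indices stand in for the stored chain
kept = thin(range(2_000_000), 1000)
n_kept = len(kept)
```

Because the Gamma excess is non-negative, every draw respects the constraint c(t) ≥ n(t).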
49. Marginal posterior distributions for λ and µ
[Figure: marginal posterior densities of the Birth Rate and the Death Rate]
50. Marginal posterior distributions for birth rates
[Figure: marginal posterior densities of the Birth Rate effects for Block 2, Treatment 2 and Treatment 3]
We obtained similar densities for the death rates.
51. Application to the cotton aphid data set
Recall that the data consists of
five observations on twenty randomly chosen leaves in each plot;
three blocks, each being in a distinct area;
three irrigation treatments (low, medium and high);
three nitrogen levels (blanket, variable and none);
the sampling times are t=0, 1.14, 2.29, 3.57 and 4.57 weeks (i.e.
every 7 to 8 days).
Following in the same vein as the simulated data, we are estimating 38
parameters (including interaction terms) and the latent cumulative aphid
population.
52. Cotton aphid data
Marginal posterior distributions
[Figure: marginal posterior densities of the Birth Rate and the Death Rate]
53. Does the model fit the data?
We simulate predictive distributions from the MCMC output, i.e. we randomly sample parameter values (λ, µ) and the unobserved state c and simulate forward
We simulate forward using the Gillespie simulator, not the moment closure approximation
54. Does the model fit the data?
Predictive distributions for 6 of the 27 Aphid data sets
[Figure: predictive distributions of the Aphid Population against Time (1.14, 2.29, 3.57 and 4.57) for data sets D 123, D 121, D 131, D 112, D 122 and D 113]
55. Summarising the results
Consider the additional number of aphids per treatment combination
Set c (0) = n (0) = 1 and tmax = 6
We now calculate the number of aphids we would see for each
parameter combination in addition to the baseline
For example, the effect due to medium water:
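$$\lambda_{211} = \lambda + \alpha_{\text{Water (M)}} \quad \text{and} \quad \mu_{211} = \mu + \alpha^{*}_{\text{Water (M)}}$$
So
$$\text{Additional aphids} = c^{i}_{\text{Water (M)}} - c^{i}_{\text{baseline}}$$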
56. Aphids over baseline
[Figure: Main Effects. Densities of the additional Aphids over baseline (0 to 10000) for Nitrogen (V), Water (H), Water (M), Block 3, Block 2 and Nitrogen (Z)]
58. Conclusions
The 95% credible intervals for the baseline birth and death rates are
(1.64, 1.86) and (0.00090, 0.00099).
Main effects have little effect by themselves
However block 2 appears to have a very strong interaction with
nitrogen
Moment closure parameter inference is a very useful technique for
estimating parameters in stochastic population models
59. Future work
Aphid model
Other data sets suggest that there is aphid immigration in the early
stages
Model selection for stochastic models
Incorporate measurement error
Moment closure
Better closure techniques
Assessing the fit
60. Acknowledgements
Andrew Golightly
Richard Boys
Peter Milner
Darren Wilkinson
Jim Matis (Texas A & M)
References
Gillespie, CS. Moment closure approximations for mass-action models. IET Systems Biology, 2009.
Gillespie, CS, Golightly, A. Bayesian inference for generalized stochastic population growth models with application to aphids. Journal of the Royal Statistical Society, Series C, 2010.
Milner, P, Gillespie, CS, Wilkinson, DJ. Moment closure approximations for stochastic kinetic models with rational rate laws. Mathematical Biosciences, 2011.
Milner, P, Gillespie, CS, Wilkinson, DJ. Moment closure based parameter inference of stochastic kinetic models. Statistics and Computing, 2012.