Moment Closure Based Parameter Inference of Stochastic Kinetic Models
1. Moment Closure Based Parameter Inference of Stochastic Kinetic Models
Colin Gillespie
School of Mathematics & Statistics
2. Overview
Talk outline
An introduction to moment closure
Parameter inference
Conclusion
3. Birth-death process
Birth-death model
$X \to 2X$ and $2X \to X$,
which have the propensity functions $\lambda X$ and $\mu X$.
Deterministic representation
The deterministic model is
$$\frac{dX(t)}{dt} = (\lambda - \mu) X(t),$$
which can be solved to give $X(t) = X(0) \exp[(\lambda - \mu) t]$.
5. Stochastic representation
In the stochastic framework, each reaction has a probability of occurring
The analogous version of the birth-death process is the difference equation
$$\frac{dp_n}{dt} = \lambda (n-1) p_{n-1} + \mu (n+1) p_{n+1} - (\lambda + \mu) n p_n$$
Usually called the forward Kolmogorov equation or chemical master equation
[Figure: stochastic realisations of the birth-death process, Population (0 to 50) against Time (0 to 4)]
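A minimal sketch of how realisations of this process could be simulated, using Gillespie's direct method with the propensities λX and µX. The function name, the rate values and the initial population are illustrative assumptions, not values taken from the talk.

```python
import random

def simulate_birth_death(x0, lam, mu, t_max, rng):
    """Simulate one path of the birth-death process (Gillespie's direct method).

    Returns a list of (time, population) pairs, one per reaction event.
    """
    t, x = 0.0, x0
    path = [(t, x)]
    while x > 0:                            # X = 0 is absorbing
        birth, death = lam * x, mu * x      # propensities of the two reactions
        total = birth + death
        t += rng.expovariate(total)         # waiting time to the next reaction
        if t > t_max:
            break
        # Pick the reaction that fires, in proportion to its propensity
        x += 1 if rng.random() * total < birth else -1
        path.append((t, x))
    return path

# Hypothetical settings: X(0) = 10, lambda = 1.0, mu = 1.1, simulate to t = 4
rng = random.Random(1)
finals = [simulate_birth_death(10, 1.0, 1.1, 4.0, rng)[-1][1] for _ in range(2000)]
sample_mean = sum(finals) / len(finals)
```

Averaging the final populations over many paths should land close to the deterministic value X(0) exp[(λ − µ)t], which is the point made on the mean-equation slide.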
6. Moment equations
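Multiply the CME by $e^{n\theta}$ and sum over n, to obtain
$$\frac{\partial M}{\partial t} = [\lambda (e^{\theta} - 1) + \mu (e^{-\theta} - 1)] \frac{\partial M}{\partial \theta}$$
where
$$M(\theta; t) = \sum_{n=0}^{\infty} e^{n\theta} p_n(t)$$
If we differentiate this p.d.e. w.r.t. $\theta$ and set $\theta = 0$, we get
$$\frac{dE[N(t)]}{dt} = (\lambda - \mu) E[N(t)]$$
where $E[N(t)]$ is the mean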
7. The mean equation
$$\frac{dE[N(t)]}{dt} = (\lambda - \mu) E[N(t)]$$
This ODE is solvable - the associated forward Kolmogorov equation is
also solvable
The equation for the mean and deterministic ODE are identical
When the rate laws are linear, the stochastic mean and deterministic
solution always correspond
8. The variance equation
If we differentiate the p.d.e. w.r.t. $\theta$ twice and set $\theta = 0$, we get:
$$\frac{dE[N(t)^2]}{dt} = (\lambda + \mu) E[N(t)] + 2 (\lambda - \mu) E[N(t)^2]$$
and hence the variance $\mathrm{Var}[N(t)] = E[N(t)^2] - E[N(t)]^2$.
Differentiating three times gives an expression for the skewness, etc
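As a rough check of the first two moment equations, the sketch below integrates them numerically and compares the result with the closed-form mean and variance of the linear birth-death process. Differentiating the p.d.e. twice and setting θ = 0 gives a second-moment equation with coefficients (λ + µ) on E[N] and 2(λ − µ) on E[N²], and that is what the code uses. The function name, step count and parameter values are assumptions made for illustration.

```python
import math

def birth_death_moments(x0, lam, mu, t_end, steps=1000):
    """Integrate the mean and second-moment ODEs with classical RK4.

    State is (m, s) with m = E[N(t)] and s = E[N(t)^2].
    Returns the mean and the variance s - m^2 at t_end.
    """
    def rhs(m, s):
        dm = (lam - mu) * m
        ds = (lam + mu) * m + 2.0 * (lam - mu) * s
        return dm, ds

    h = t_end / steps
    m, s = float(x0), float(x0) ** 2        # N(0) = x0 is known exactly
    for _ in range(steps):
        k1 = rhs(m, s)
        k2 = rhs(m + 0.5 * h * k1[0], s + 0.5 * h * k1[1])
        k3 = rhs(m + 0.5 * h * k2[0], s + 0.5 * h * k2[1])
        k4 = rhs(m + h * k3[0], s + h * k3[1])
        m += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        s += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return m, s - m * m

# Hypothetical settings: X(0) = 10, lambda = 1.0, mu = 1.1, t = 4
mean, var = birth_death_moments(x0=10, lam=1.0, mu=1.1, t_end=4.0)

# Closed forms for the linear birth-death process (valid for lambda != mu)
r = 1.0 - 1.1
exact_mean = 10 * math.exp(r * 4.0)
exact_var = 10 * (1.0 + 1.1) / r * math.exp(r * 4.0) * (math.exp(r * 4.0) - 1.0)
```

Because the rate laws are linear, no closure assumption is needed here: the two ODEs are already a closed system.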
10. Dimerisation moment equations
We formulate the dimer model in terms of moment equations
$$\frac{dE[X_1]}{dt} = 0.5 k_1 (E[X_1^2] - E[X_1]) - k_2 E[X_1]$$
$$\frac{dE[X_1^2]}{dt} = k_1 (E[X_1^2 X_2] - E[X_1 X_2]) + 0.5 k_1 (E[X_1^2] - E[X_1]) + k_2 (E[X_1] - 2 E[X_1^2])$$
where $E[X_1]$ is the mean of $X_1$ and $E[X_1^2] - E[X_1]^2$ is the variance
The ith moment equation depends on the (i + 1)th moment
11. Deterministic approximates stochastic
Rewriting
$$\frac{dE[X_1]}{dt} = 0.5 k_1 (E[X_1^2] - E[X_1]) - k_2 E[X_1]$$
in terms of its variance, i.e. $E[X_1^2] = \mathrm{Var}[X_1] + E[X_1]^2$, we get
$$\frac{dE[X_1]}{dt} = 0.5 k_1 E[X_1] (E[X_1] - 1) + 0.5 k_1 \mathrm{Var}[X_1] - k_2 E[X_1] \qquad (1)$$
Setting $\mathrm{Var}[X_1] = 0$ in (1) recovers the deterministic equation
So we can consider the deterministic model as an approximation to
the stochastic
When we have polynomial rate laws, setting the variance to zero
results in the deterministic equation
13. Simple dimerisation model
To close the equations, we assume an underlying distribution
The easiest option is to assume an underlying Normal distribution, i.e.
$$E[X_1^3] = 3 E[X_1^2] E[X_1] - 2 E[X_1]^3$$
But we could also use the Poisson
$$E[X_1^3] = E[X_1] + 3 E[X_1]^2 + E[X_1]^3$$
or the Log normal
$$E[X_1^3] = \left( \frac{E[X_1^2]}{E[X_1]} \right)^3$$
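The three closures can be written as small functions that return the third raw moment from the lower moments. This is only a sketch: the function names are made up for illustration, and the example values check each formula against a distribution whose third moment is known in closed form.

```python
import math

def third_moment_normal(m1, m2):
    """Normal closure: E[X^3] from E[X] = m1 and E[X^2] = m2."""
    return 3.0 * m2 * m1 - 2.0 * m1 ** 3

def third_moment_poisson(m1):
    """Poisson closure: E[X^3] from the mean alone."""
    return m1 + 3.0 * m1 ** 2 + m1 ** 3

def third_moment_lognormal(m1, m2):
    """Log normal closure: E[X^3] from E[X] = m1 and E[X^2] = m2."""
    return (m2 / m1) ** 3

# Normal with mean 5 and variance 4: E[X^2] = 29, E[X^3] = 5^3 + 3*5*4 = 185
normal_check = third_moment_normal(5.0, 29.0)

# Poisson with mean 4: E[X^3] = 4 + 3*16 + 64 = 116
poisson_check = third_moment_poisson(4.0)

# Log normal with log-mean a and log-sd b: E[X^k] = exp(k*a + k^2 * b^2 / 2)
a, b = 0.5, 0.3
raw = lambda k: math.exp(k * a + 0.5 * (k * b) ** 2)
lognormal_check = third_moment_lognormal(raw(1), raw(2))
```

Each closure is exact for its own distribution, so the choice amounts to a modelling assumption about the shape of the population distribution.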
15. Heat shock model
Proctor et al, 2005. Stochastic kinetic model of the heat shock system
twenty-three reactions
seventeen chemical species
A single stochastic simulation up to t = 2000 takes about 35 minutes.
If we convert the model to moment equations, we get 139 equations
[Figure: ADP and Native Protein populations against Time, 0 to 2000]
Gillespie, CS, 2009
16. Density plots: heat shock model
[Figure: density of the ADP population at Time t=200 and Time t=2000]
17. P53-Mdm2 oscillation model
Proctor and Gray, 2008
16 chemical species
Around a dozen reactions
The model contains an event
At t = 1, set X = 0
If we convert the model to moment equations, we get 139 equations.
However, in this case the moment closure approximation doesn't do too well!
[Figure: Population (0 to 300) against Time (0 to 30)]
20. What went wrong?
The moment closure approximation tends to fail when there is a large difference
between the deterministic and stochastic formulations
In this particular case, the species are strongly correlated
Typically when the MC approximation fails, it gives a negative
variance
The MC approximation does work well for other parameter values for
the p53 model
23. Parameter inference
Simple immigration-death process
$$R_1: \emptyset \xrightarrow{k_1} X, \qquad R_2: X \xrightarrow{k_2} \emptyset$$
The CME can be solved
Discrete time course data
The likelihood can be very flat
[Figure: Population against Time (0 to 50) with the discrete observations marked; second panel with axes k1 and k2, both 0 to 10]
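For the immigration-death process the CME solution is available in closed form: given X(t) = x0, the population a time dt later is the sum of a Binomial(x0, exp(−k2 dt)) number of survivors and an independent Poisson number of new arrivals with mean (k1/k2)(1 − exp(−k2 dt)). The sketch below uses that standard result to evaluate the likelihood of discrete time course data. The function names and the data values are hypothetical, not the data shown in the talk.

```python
import math

def transition_pmf(x, x0, k1, k2, dt):
    """Pr[X(t + dt) = x | X(t) = x0] for the immigration-death process."""
    p = math.exp(-k2 * dt)                  # survival probability of one individual
    m = k1 / k2 * (1.0 - p)                 # mean number of surviving immigrants
    total = 0.0
    for j in range(min(x, x0) + 1):         # j of the x0 individuals survive
        survivors = math.comb(x0, j) * p ** j * (1.0 - p) ** (x0 - j)
        arrivals = math.exp(-m) * m ** (x - j) / math.factorial(x - j)
        total += survivors * arrivals
    return total

def log_likelihood(k1, k2, times, counts):
    """Log-likelihood of the observed counts, using the Markov property."""
    ll = 0.0
    for u in range(1, len(counts)):
        dt = times[u] - times[u - 1]
        ll += math.log(transition_pmf(counts[u], counts[u - 1], k1, k2, dt))
    return ll

# Hypothetical observations, widely spaced in time
times = [0, 10, 20, 30, 40, 50]
counts = [0, 1, 3, 1, 2, 0]
ll_a = log_likelihood(1.0, 1.0, times, counts)
ll_b = log_likelihood(2.0, 2.0, times, counts)
```

When the observations are far apart relative to 1/k2, the survival probability is close to zero and the transition distribution is close to Poisson with mean k1/k2, which depends only on the ratio of the two rates. That is one way a flat ridge in the likelihood can arise.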
24. Lotka-Volterra model
The Lotka-Volterra predator prey system describes the time evolution of two species, Y1 and Y2
Prey birth: Y1 → 2Y1
Interaction: Y1 + Y2 → 2Y2
Predator death: Y2 → ∅
Since the Lotka-Volterra model contains a non-linear rate law, the ith moment equation depends on the (i + 1)th moment.
[Figure: Predator and Prey populations (0 to 400) against Time (0 to 40)]
28. Parameter estimation
Let $Y(t_u) = (Y_1(t_u), Y_2(t_u))$ be the vector of the observed prey and predator populations
To infer $c_1$, $c_2$ and $c_3$, we need to estimate
$$\Pr[Y(t_u) \mid Y(t_{u-1}), c]$$
i.e. the solution of the forward Kolmogorov equation
We will use moment closure to estimate this distribution:
$$Y(t_u) \mid Y(t_{u-1}), c \sim N(\psi_{u-1}, \Sigma_{u-1})$$
where $\psi_{u-1}$ and $\Sigma_{u-1}$ are calculated using the moment closure approximation
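A sketch of how ψ and Σ might be computed for the Lotka-Volterra model under stated assumptions: mass-action propensities c1 Y1, c2 Y1 Y2 and c3 Y2, a Normal closure (third-order cumulants set to zero), and the previous observation treated as known exactly. The function names, the RK4 integrator and the numerical values are illustrative choices and are not taken from the talk.

```python
import math

def lv_moment_rhs(state, c):
    """Moment ODEs for (E[Y1], E[Y2], E[Y1^2], E[Y2^2], E[Y1 Y2])."""
    m1, m2, s11, s22, s12 = state
    c1, c2, c3 = c
    # Normal closure for the third-order moments
    e112 = s11 * m2 + 2.0 * m1 * s12 - 2.0 * m1 * m1 * m2   # E[Y1^2 Y2]
    e122 = s22 * m1 + 2.0 * m2 * s12 - 2.0 * m2 * m2 * m1   # E[Y1 Y2^2]
    return (
        c1 * m1 - c2 * s12,
        c2 * s12 - c3 * m2,
        2.0 * c1 * s11 + c1 * m1 - 2.0 * c2 * e112 + c2 * s12,
        2.0 * c2 * e122 + c2 * s12 + c3 * m2 - 2.0 * c3 * s22,
        c1 * s12 + c2 * (e112 - e122 - s12) - c3 * s12,
    )

def closure_moments(y_prev, c, dt, steps=200):
    """Return (psi, Sigma) at time dt after observing y_prev, via RK4."""
    y1, y2 = float(y_prev[0]), float(y_prev[1])
    state = (y1, y2, y1 * y1, y2 * y2, y1 * y2)   # zero initial variance
    h = dt / steps
    for _ in range(steps):
        k1 = lv_moment_rhs(state, c)
        k2 = lv_moment_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), c)
        k3 = lv_moment_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), c)
        k4 = lv_moment_rhs(tuple(s + h * k for s, k in zip(state, k3)), c)
        state = tuple(
            s + h * (a + 2.0 * b + 2.0 * d + e) / 6.0
            for s, a, b, d, e in zip(state, k1, k2, k3, k4)
        )
    m1, m2, s11, s22, s12 = state
    cov = s12 - m1 * m2
    return (m1, m2), ((s11 - m1 * m1, cov), (cov, s22 - m2 * m2))

def log_transition_density(y_new, y_prev, c, dt):
    """Bivariate Normal log density of y_new given y_prev and rates c."""
    psi, sigma = closure_moments(y_prev, c, dt)
    d1, d2 = y_new[0] - psi[0], y_new[1] - psi[1]
    (v11, v12), (_, v22) = sigma
    det = v11 * v22 - v12 * v12
    quad = (v22 * d1 * d1 - 2.0 * v12 * d1 * d2 + v11 * d2 * d2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

# Hypothetical rates and observations (prey, predator), one time unit apart
c = (0.5, 0.0025, 0.3)
log_p = log_transition_density((95, 70), (71, 79), c, dt=1.0)
```

If the closure produces a covariance matrix that is not positive definite, the log density is undefined, which ties in with the negative-variance failure mode described on the "What went wrong?" slide.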
30. Bayesian parameter inference
We summarise our beliefs about c and the unobserved predator
population $Y_2(0)$ via uninformative priors
The joint posterior for parameters and unobserved states (for a single
data set) is
$$p(y_2, c \mid y_1) \propto p(c) \, p(y_2(0)) \prod_{u=1}^{40} p(y(t_u) \mid y(t_{u-1}), c)$$
For the results shown, we used a vanilla Metropolis-Hastings step to
explore the parameter and state spaces
For more complicated models, we can use a Durham & Gallant style
bridge (Milner, Gillespie & Wilkinson, 2012)
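A generic random-walk Metropolis-Hastings sampler, as a sketch of the kind of step described above. In practice log_post would add the log priors to the sum of moment-closure log transition densities; here a two-dimensional standard Normal stands in for the posterior so that the example is self-contained. All names, the step size and the chain length are assumptions.

```python
import math
import random

def metropolis_hastings(log_post, theta0, n_iter, step, rng):
    """Random-walk Metropolis-Hastings with independent Gaussian proposals."""
    theta = list(theta0)
    current = log_post(theta)
    chain = []
    for _ in range(n_iter):
        proposal = [x + rng.gauss(0.0, step) for x in theta]
        candidate = log_post(proposal)
        # The proposal is symmetric, so the acceptance ratio reduces to
        # the ratio of posterior densities. 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        chain.append(list(theta))
    return chain

def toy_log_post(theta):
    """Stand-in target: independent standard Normals (up to a constant)."""
    return -0.5 * sum(x * x for x in theta)

chain = metropolis_hastings(toy_log_post, [3.0, -3.0], 20000, 1.0, random.Random(42))
kept = chain[2000:]                         # discard burn-in
mean0 = sum(t[0] for t in kept) / len(kept)
var0 = sum((t[0] - mean0) ** 2 for t in kept) / len(kept)
```

For rate constants, which must be positive, one common choice is to run the random walk on the log scale. That is a design option and not something stated in the talk.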
33. Auto regulation system
This system contains twelve reactions and six species
The species populations range from zero (for species i) to around
65,000 for species G
The moment closure approximation yields a closed set of
twenty-seven ODEs
Six ODEs for the means
Six ODEs for the variances
Fifteen ODEs for the covariance terms
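The count of twenty-seven follows from closing at second order: one ODE per mean, one per variance and one per unordered pair of species. A small sketch, with a hypothetical function name:

```python
def count_moment_equations(n_species):
    """Number of ODEs when the moment equations are closed at second order."""
    means = n_species
    variances = n_species
    covariances = n_species * (n_species - 1) // 2   # unordered pairs
    return means, variances, covariances, means + variances + covariances

print(count_moment_equations(6))  # (6, 6, 15, 27)
```

The total grows quadratically with the number of species, so this count assumes no conservation laws are used to eliminate species.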
34. Stochastic realisation
[Figure: stochastic realisation of species g, i, r_g and r_i, Population (0 to 30) against Time (0 to 50); separate panels for I (0 to 15) and G (around 65000 to 65100) against Time]
37. Parameter inference
Posterior distributions for c1 to c8: mean ± 2 sd. True values in red
Given information on all species, inference is reasonable
For most of the parameters, fewer data points result in larger credible regions
But not in all cases!
[Figure: panels Fully Obs. and Partially Obs., Parameter (c1 to c8) against Parameter value (0.0 to 2.0)]
41. Future work
Techniques for assessing the moment closure approximation
Better closure techniques
Computer emulation for moments
Using the moment closure approximation as a proposal distribution in
an MCMC algorithm
The proposal can be (almost) anything we want
The likelihood can be calculated using anything we want
42. Acknowledgements
Peter Milner, Darren Wilkinson
References
Gillespie, CS. Moment closure approximations for mass-action models. IET Systems Biology, 2009.
Gillespie, CS, Golightly, A. Bayesian inference for generalized stochastic population growth models with application to aphids. Journal of the Royal Statistical Society, Series C, 2010.
Milner, P, Gillespie, CS, Wilkinson, DJ. Moment closure approximations for stochastic kinetic models with rational rate laws. Mathematical Biosciences, 2011.
Milner, P, Gillespie, CS, Wilkinson, DJ. Moment closure based parameter inference of stochastic kinetic models. Statistics and Computing, 2012.