Estimation of the score vector and observed
information matrix in intractable models
Arnaud Doucet (University of Oxford)
Pierre E. Jacob (University of Oxford)
Sylvain Rubenthaler (Université Nice Sophia Antipolis)
June 15th, 2015
Pierre Jacob Derivative estimation 1/ 25
Outline
1 Context: intractable derivatives
2 Shift estimators
3 Monte Carlo shift estimators
Motivation
Derivatives of the likelihood help with optimization and sampling.
They become crucial when the parameter space is high-dimensional.
For many models, these derivatives are intractable.
One can resort to approximation techniques.
Motivation: hidden Markov models
Figure: Graph representation of a general hidden Markov model, with hidden states X_0, X_1, ..., X_T, observations y_1, ..., y_T, and parameter θ.
Motivation: hidden Markov models
The associated log-likelihood is
\ell(\theta) = \log p(y_{1:T}; \theta) = \log \int_{\mathsf{X}^{T+1}} \prod_{t=1}^{T} g(y_t \mid x_t, \theta) \, \mu(dx_1 \mid \theta) \prod_{t=2}^{T} f(dx_t \mid x_{t-1}, \theta).
This high-dimensional integral can be approximated efficiently using particle filters.
However, the approximation itself is "discontinuous" as a function of the parameter θ, even when the random "seed" is fixed, as the next figures illustrate.
Motivation: hidden Markov models
Figure: Approximation of the log-likelihood by particle filters, using a different seed for each value of θ, compared to the true log-likelihood.
Motivation: hidden Markov models
Figure: Approximation of the log-likelihood by particle filters, using the same seed for each value of θ, compared to the true log-likelihood.
Motivation: general context
Let \ell(\theta) denote a "log-likelihood", though it could be any function.
Let L(\theta) = \exp \ell(\theta) denote the "likelihood".
Assume that we have access to estimators \hat{L}(\theta) of L(\theta) such that
E[\hat{L}(\theta)] = L(\theta)
and
V\left[ \frac{\hat{L}(\theta)}{L(\theta)} \right] = \frac{v(\theta)}{M} \approx \frac{v}{M} \quad (\forall \theta).
Finite difference
First derivative:
\hat{\ell}'(\theta^\star) = \frac{\log \hat{L}(\theta^\star + h) - \log \hat{L}(\theta^\star - h)}{2h}
converges to \ell'(\theta^\star) when M \to \infty and h \to 0.
Second derivative:
\hat{\ell}''(\theta^\star) = \frac{\log \hat{L}(\theta^\star + h) - 2 \log \hat{L}(\theta^\star) + \log \hat{L}(\theta^\star - h)}{h^2}
converges to \ell''(\theta^\star) when M \to \infty and h \to 0.
Finite difference
Bias:
E[\hat{\ell}'(\theta^\star)] - \ell'(\theta^\star) = O(h^2).
Variance:
V[\hat{\ell}'(\theta^\star)] = O\left( \frac{1}{M h^2} \right).
Optimal rate of convergence for the first derivative: h \sim M^{-1/6}, leading to MSE \sim M^{-2/3}.
For the second derivative: h \sim M^{-1/8}, leading to MSE \sim M^{-1/2}.
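As a toy illustration of these estimators and rates, the sketch below applies central finite differences to a noisy log-likelihood. The specific model (a single observation y from N(θ, 1)) and the additive Gaussian noise standing in for log L̂(θ) are assumptions made for this example, not part of the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

y, theta = 0.3, 0.5              # one observation of Y ~ N(theta, 1), evaluation point

def loglik(t):
    # exact log-likelihood of Y ~ N(t, 1)
    return -0.5 * (y - t) ** 2 - 0.5 * np.log(2 * np.pi)

def noisy_loglik(t, M):
    # stand-in for the log of a likelihood estimator: exact value plus noise of variance v / M
    v = 1.0
    return loglik(t) + rng.normal(0.0, np.sqrt(v / M))

M = 10_000
h1 = M ** (-1.0 / 6.0)           # optimal rate for the first derivative
score_fd = (noisy_loglik(theta + h1, M) - noisy_loglik(theta - h1, M)) / (2 * h1)

h2 = M ** (-1.0 / 8.0)           # optimal rate for the second derivative
hess_fd = (noisy_loglik(theta + h2, M) - 2 * noisy_loglik(theta, M)
           + noisy_loglik(theta - h2, M)) / h2 ** 2

print(score_fd)                  # close to the true score y - theta = -0.2
print(hess_fd)                   # close to the true second derivative -1
```

Since this log-likelihood is quadratic, the h² bias term vanishes and only the Monte Carlo noise remains.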
Outline
1 Context: intractable derivatives
2 Shift estimators
3 Monte Carlo shift estimators
Iterated Filtering
Ionides, Bhadra, Atchadé, King, Iterated filtering, AoS, 2011.
Given a log-likelihood \ell and a given point \theta^\star, consider a "prior" distribution centered on \theta^\star, for instance \Theta \sim N(\theta^\star, \tau^2 \Sigma).
We can consider "posterior" expectations, when they exist:
E_{\text{post}}[\varphi(\Theta)] = \frac{E[\varphi(\Theta) \exp \ell(\Theta)]}{E[\exp \ell(\Theta)]},
where the expectations E are with respect to the "prior".
Iterated Filtering
Under conditions on the prior distribution and on the log-likelihood, one gets the following result.
Posterior expectation when the prior variance goes to zero:
For any \theta^\star, there exists a C such that, for \tau small enough,
\left| \tau^{-2} \Sigma^{-1} \left( E_{\text{post}}[\Theta] - \theta^\star \right) - \nabla \ell(\theta^\star) \right| \leq C \tau.
Phrased simply,
\frac{\text{posterior mean} - \text{prior mean}}{\text{prior variance}} \approx \text{score}.
Result from Ionides, Bhadra, Atchadé, King, Iterated filtering, 2011.
Normal priors
Let us focus on the normal case: \Theta \sim N(\theta^\star, \tau^2 \Sigma).
Stein's Lemma says...
Assume that E[|\nabla_i L(\Theta)|] < \infty and E[|\nabla^2_{ij} L(\Theta)|] < \infty for i, j \in \{1, \ldots, d\}. Then we have
E_{\text{post}}[\Theta - \theta^\star] = \tau^2 \Sigma \, E_{\text{post}}[\nabla \ell(\Theta)],
and
E_{\text{post}}\left[ (\Theta - \theta^\star)(\Theta - \theta^\star)^T \right] = \tau^2 \Sigma + \tau^4 \Sigma \, E_{\text{post}}\left[ \nabla^2 \ell(\Theta) + \nabla \ell(\Theta) \nabla \ell(\Theta)^T \right] \Sigma.
Expected derivatives and derivatives
We do not want to estimate E_{\text{post}}[\nabla \ell(\Theta)] but rather \nabla \ell(\theta^\star).
Posterior moment expansion:
When \tau \to 0, we have, for \varphi satisfying some conditions,
E_{\text{post}}[\varphi(\Theta)] = \varphi(\theta^\star) + \frac{\tau^2}{2} \sum_{i=1}^{d} \sum_{j=1}^{d} \left[ \nabla^2_{ij} \varphi(\theta^\star) + 2 \nabla_i \varphi(\theta^\star) \nabla_j \ell(\theta^\star) \right] \Sigma_{ij} + O(\tau^4).
Expected derivatives and derivatives
Conditions for this to hold:
Let \varphi : \mathbb{R}^d \to \mathbb{R} be a four times continuously differentiable function. Assume that there exist a constant M < \infty and \delta > 0 such that \left| \frac{\partial^4 \varphi(\theta)}{\partial\theta_i \partial\theta_j \partial\theta_k \partial\theta_l} \right| \leq M for all \theta \in B_\Sigma(\theta^\star, \delta) and all (i, j, k, l) \in \{1, \ldots, d\}^4.
The likelihood function L is such that both \theta \mapsto L(\theta) and \theta \mapsto \varphi(\theta) L(\theta) satisfy the same assumption as \varphi.
There exists \tau_0 > 0 such that E_{\tau_0}[|\varphi(\Theta)|] < \infty, E_{\tau_0}[|L(\Theta)|] < \infty and E_{\tau_0}[|\varphi(\Theta) L(\Theta)|] < \infty.
Shift estimators
S_\tau(\theta^\star) = \tau^{-2} \Sigma^{-1} E_{\text{post}}[\Theta - \theta^\star] = \nabla \ell(\theta^\star) + O(\tau^2),
I_\tau(\theta^\star) = \tau^{-4} \Sigma^{-1} \left( V_{\text{post}}[\Theta] - \tau^2 \Sigma \right) \Sigma^{-1} = \nabla^2 \ell(\theta^\star) + O(\tau^2).
Using Monte Carlo approximations of E_{\text{post}}[\Theta] and V_{\text{post}}[\Theta], these formulas provide estimators of \nabla \ell(\theta^\star) and \nabla^2 \ell(\theta^\star).
Example: normal likelihood
The parameter θ represents a d-dimensional location, and Y \sim N(\theta, \Lambda_y^{-1}).
Thus the derivatives of the log-likelihood are
\nabla \ell(\theta) = -\Lambda_y (\theta - y), \qquad \nabla^2 \ell(\theta) = -\Lambda_y.
Using a prior N(\theta^\star, \tau^2 \Sigma), the posterior is conjugate and the shift estimators are given by
S_\tau(\theta^\star) = \frac{1}{1 + \tau^2 \Sigma \Lambda_y} \left( -\Lambda_y (\theta^\star - y) \right),
I_\tau(\theta^\star) = \frac{1}{1 + \tau^2 \Sigma \Lambda_y} \left( -\Lambda_y \right).
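These closed-form expressions can be checked against the exact conjugate posterior in a minimal one-dimensional sketch; the numerical values below are arbitrary choices made for illustration.

```python
# Hypothetical 1-d instance: y | theta ~ N(theta, 1/lam_y), prior N(theta_star, tau2 * Sigma).
y, lam_y = 0.7, 2.0
theta_star, tau2, Sigma = 0.5, 0.01, 1.0

# Exact conjugate posterior moments.
prior_var = tau2 * Sigma
post_prec = 1.0 / prior_var + lam_y
post_mean = (theta_star / prior_var + lam_y * y) / post_prec
post_var = 1.0 / post_prec

# Shift estimators computed from the posterior moments.
S = (post_mean - theta_star) / prior_var
I = (post_var - prior_var) / prior_var ** 2

# Closed-form expressions from the slide.
c = 1.0 + prior_var * lam_y
S_formula = -lam_y * (theta_star - y) / c
I_formula = -lam_y / c

print(S, S_formula)   # equal; close to the true score -lam_y * (theta_star - y) = 0.4
print(I, I_formula)   # equal; close to the true second derivative -lam_y = -2
```

The O(τ²) bias appears here as the factor 1/(1 + τ²ΣΛy) multiplying the exact derivatives.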
Outline
1 Context: intractable derivatives
2 Shift estimators
3 Monte Carlo shift estimators
Monte Carlo shift estimators
Draw \theta_i \sim N(\theta^\star, \tau^2 \Sigma), for each i \in \{1, \ldots, N\}.
Draw \hat{w}_i = \hat{L}(\theta_i), for each i \in \{1, \ldots, N\}.
Normalize the weights: for each i \in \{1, \ldots, N\}, \hat{W}_i = \hat{w}_i / \sum_{j=1}^{N} \hat{w}_j.
Then we can estimate \tau^{-2} \Sigma^{-1} (E_{\text{post}}[\Theta] - \theta^\star) by
\tau^{-2} \Sigma^{-1} \left( \sum_{i=1}^{N} \hat{W}_i \theta_i - \theta^\star \right),
or, even better,
\tau^{-2} \Sigma^{-1} \left( \sum_{i=1}^{N} \hat{W}_i \theta_i - \frac{1}{N} \sum_{i=1}^{N} \theta_i \right).
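A minimal sketch of this importance sampling recipe, assuming a one-dimensional toy model (one observation y from N(θ, 1)) and log-normal noise standing in for the unbiased likelihood estimator L̂; these modelling choices are illustrations, not part of the talk.

```python
import numpy as np

rng = np.random.default_rng(2)

y, theta_star = 0.7, 0.5         # one observation of Y ~ N(theta, 1), evaluation point
tau, Sigma = 0.1, 1.0
N, M = 100_000, 100

def lik_hat(theta):
    # unbiased estimator of L(theta): exact likelihood times mean-one log-normal noise
    L = np.exp(-0.5 * (y - theta) ** 2) / np.sqrt(2 * np.pi)
    s2 = 1.0 / M                 # relative variance ~ v / M
    return L * np.exp(rng.normal(-0.5 * s2, np.sqrt(s2), size=theta.shape))

theta = theta_star + tau * np.sqrt(Sigma) * rng.normal(size=N)   # prior draws
w = lik_hat(theta)
W = w / w.sum()                                                  # normalized weights

# Score estimator with the empirical prior mean as control variate.
score_est = (W @ theta - theta.mean()) / (tau ** 2 * Sigma)
print(score_est)                 # close to the true score y - theta_star = 0.2
```

Using the empirical prior mean instead of θ* cancels most of the Monte Carlo noise shared between the weighted and unweighted averages.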
Monte Carlo shift estimators
For the second-order derivative, recall that:
\tau^{-4} \Sigma^{-1} \left( V_{\text{post}}[\Theta] - \tau^2 \Sigma \right) \Sigma^{-1} = \nabla^2 \ell(\theta^\star) + O(\tau^2).
An importance sampling strategy (with control variates) leads to:
\tau^{-4} \Sigma^{-1} \left[ \sum_{i=1}^{N} \hat{W}_i \left( \theta_i - \sum_{j=1}^{N} \hat{W}_j \theta_j \right)^2 - \frac{1}{N} \sum_{i=1}^{N} \left( \theta_i - \frac{1}{N} \sum_{j=1}^{N} \theta_j \right)^2 \right] \Sigma^{-1},
with the obvious extension in the multivariate case.
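Continuing the same toy setting (one observation y from N(θ, 1), log-normal noise for L̂, both assumptions of this example), the second-order estimator can be sketched as follows:

```python
import numpy as np

rng = np.random.default_rng(3)

y, theta_star = 0.7, 0.5         # Y ~ N(theta, 1): the true second derivative is -1
tau, Sigma = 0.1, 1.0
N, M = 100_000, 100

def lik_hat(theta):
    # unbiased likelihood estimator: exact N(y; theta, 1) density times mean-one noise
    L = np.exp(-0.5 * (y - theta) ** 2) / np.sqrt(2 * np.pi)
    s2 = 1.0 / M
    return L * np.exp(rng.normal(-0.5 * s2, np.sqrt(s2), size=theta.shape))

theta = theta_star + tau * np.sqrt(Sigma) * rng.normal(size=N)   # prior draws
W = lik_hat(theta)
W /= W.sum()                                                     # normalized weights

post_var = W @ (theta - W @ theta) ** 2           # weighted (posterior) variance
prior_var = np.mean((theta - theta.mean()) ** 2)  # empirical prior variance as control variate

info_est = (post_var - prior_var) / (tau ** 4 * Sigma ** 2)
print(info_est)                  # close to the true second derivative -1
```

The empirical prior variance plays the role of the τ²Σ term, and its Monte Carlo noise largely cancels against that of the weighted variance.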
Monte Carlo shift estimators
Score estimator:
S_{N,\tau}(\theta^\star) = \tau^{-2} \Sigma^{-1} \left( \sum_{i=1}^{N} \hat{W}_i \theta_i - \frac{1}{N} \sum_{i=1}^{N} \theta_i \right).
Bias:
E[S_{N,\tau}(\theta^\star)] - \nabla \ell(\theta^\star) = O(\tau^2).
Variance:
V[S_{N,\tau}(\theta^\star)] = O\left( \frac{1}{N \tau^2} \right).
⇒ same rate of convergence as the finite difference method!
Similar results hold for the estimator of \nabla^2 \ell(\theta^\star).
Monte Carlo shift estimators
The shift estimators are competitive with the gold
standard!
The shift estimators are not better than the gold standard!
So why bother?
Why is Iterated Filtering used in practice?
Let’s have a look at the non-asymptotic behaviour.
Robustness
Recall:
S_{N,\tau}(\theta^\star) = \tau^{-2} \Sigma^{-1} \left( \sum_{i=1}^{N} \hat{W}_i \theta_i - \frac{1}{N} \sum_{i=1}^{N} \theta_i \right),
where
\hat{W}_i = \frac{\hat{L}(\theta_i)}{\sum_{j=1}^{N} \hat{L}(\theta_j)},
and V[\hat{L}(\theta_i) / L(\theta_i)] = v / M.
Scenario: v is very large. Then \sum_{i=1}^{N} \hat{W}_i \theta_i \approx \theta_j for some j.
We obtain, for all v:
V[S_{N,\tau}(\theta^\star)] \leq C \tau^{-2}.
On the other hand, the variance of finite difference estimators increases to \infty with v.
Example: normal likelihood
Latent variable model:
X \sim N(\theta, \Lambda_y^{-1} - \lambda \Lambda_{y|x}^{-1}), \qquad Y \mid X = x \sim N(x, \lambda \Lambda_{y|x}^{-1}),
for fixed matrices \Lambda_y and \Lambda_{y|x}, and \lambda \in (0, 1).
This still corresponds to Y \sim N(\theta, \Lambda_y^{-1}), but a naive Monte Carlo approach has a relative variance that diverges as \lambda goes to zero.
The following figure shows the behaviour of the MSE of the Monte Carlo shift estimator and of the finite difference estimator as \lambda goes to zero, i.e. as v \to \infty.
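The divergence of the relative variance can be sketched numerically in a hypothetical one-dimensional instance with Λ_y = Λ_{y|x} = 1 (an assumption made for this example); the naive estimator averages the densities N(y; x_m, λ) over draws x_m of the latent variable:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical 1-d instance: X ~ N(theta, 1 - lam), Y | X = x ~ N(x, lam),
# so marginally Y ~ N(theta, 1) for every lam in (0, 1).
y, theta, M = 0.7, 0.5, 1_000

def normal_pdf(z, mean, var):
    return np.exp(-0.5 * (z - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def relative_variance(lam, reps=1_000):
    # relative variance of the naive estimator L_hat = average of N(y; x_m, lam)
    ests = []
    for _ in range(reps):
        x = theta + np.sqrt(1.0 - lam) * rng.normal(size=M)
        ests.append(normal_pdf(y, x, lam).mean())
    ests = np.asarray(ests)
    return ests.var() / normal_pdf(y, theta, 1.0) ** 2

for lam in [0.5, 0.05, 0.005]:
    print(lam, relative_variance(lam))   # grows as lam decreases
```

As λ shrinks, the observation density concentrates and only rare draws x_m near y carry weight, so the relative variance of L̂ blows up even though its expectation stays fixed.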
Robustness
Figure: Robustness of the Monte Carlo shift estimator when the variance of the likelihood estimator increases: MSE of the Monte Carlo shift and finite difference estimators as \lambda decreases from 10^{-1} to 10^{-6}.
Discussion
Monte Carlo shift estimators and finite difference estimators are equivalent in mean squared error (N \to \infty).
Monte Carlo shift estimators are robust to the noise of the likelihood estimators (v \to \infty). This might explain the observed gains within optimization procedures.
There are specific forms of shift estimators for hidden Markov models (e.g. Iterated Filtering).
The behaviour of the posterior distribution when the prior is a concentrated normal distribution may be interesting in its own right.
Bibliography
Main references:
Inference for nonlinear dynamical systems, Ionides, Bretó, King, PNAS, 2006.
Iterated filtering, Ionides, Bhadra, Atchadé, King, Annals of Statistics, 2011.
Derivative-Free Estimation of the Score Vector and Observed Information Matrix, Doucet, Jacob, Rubenthaler (on arXiv, to be updated).
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
American Type Culture Collection (ATCC).pptx
American Type Culture Collection (ATCC).pptxAmerican Type Culture Collection (ATCC).pptx
American Type Culture Collection (ATCC).pptxabhishekdhamu51
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...Lokesh Kothari
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 

Dernier (20)

Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
American Type Culture Collection (ATCC).pptx
American Type Culture Collection (ATCC).pptxAmerican Type Culture Collection (ATCC).pptx
American Type Culture Collection (ATCC).pptx
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 

Estimation of the score vector and observed information matrix in intractable models

  • 1. Estimation of the score vector and observed information matrix in intractable models. Arnaud Doucet (University of Oxford), Pierre E. Jacob (University of Oxford), Sylvain Rubenthaler (Université Nice Sophia Antipolis). June 15th, 2015. Pierre Jacob, Derivative estimation, 1/25.
  • 2. Outline. 1. Context: intractable derivatives; 2. Shift estimators; 3. Monte Carlo shift estimators.
  • 3. Outline. 1. Context: intractable derivatives; 2. Shift estimators; 3. Monte Carlo shift estimators.
  • 4. Motivation. Derivatives of the likelihood help with optimization and sampling, and they become crucial when the parameter space is high-dimensional. For many models these derivatives are intractable, so one has to resort to approximation techniques.
  • 5. Motivation: hidden Markov models. [Diagram: latent states $X_0, X_1, X_2, \ldots, X_T$, observations $y_1, y_2, \ldots, y_T$, parameter $\theta$.] Figure: Graph representation of a general hidden Markov model.
  • 6. Motivation: hidden Markov models. The associated log-likelihood is $\ell(\theta) = \log p(y_{1:T}; \theta) = \log \int_{\mathsf{X}^{T}} \mu(dx_1 \mid \theta) \prod_{t=2}^{T} f(dx_t \mid x_{t-1}, \theta) \prod_{t=1}^{T} g(y_t \mid x_t, \theta)$. This high-dimensional integral can be approximated efficiently using particle filters. However, the approximation itself is "discontinuous" as a function of the parameter $\theta$, even when the random "seed" is fixed, as the next figures show.
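The two points above can be illustrated with a minimal bootstrap particle filter for a toy linear-Gaussian model (the model, data and function names below are illustrative assumptions, not the ones used in the talk). Fixing the seed makes the estimate reproducible, but the resampling step still makes it a discontinuous function of $\theta$:

```python
import math
import random

def pf_loglik(theta, ys, n_particles, seed):
    # Bootstrap particle filter log-likelihood estimate for the toy model
    #   x_t = theta * x_{t-1} + N(0, 1),   y_t = x_t + N(0, 1),
    # a hypothetical example. Fixing `seed` fixes all the randomness.
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    ll = 0.0
    for y in ys:
        # propagate particles through the transition
        xs = [theta * x + rng.gauss(0.0, 1.0) for x in xs]
        # weight by the observation density g(y | x)
        ws = [math.exp(-0.5 * (y - x) ** 2) / math.sqrt(2 * math.pi) for x in xs]
        ll += math.log(sum(ws) / n_particles)
        # multinomial resampling: this discrete choice is what makes the
        # estimate a discontinuous function of theta even for a fixed seed
        xs = rng.choices(xs, weights=ws, k=n_particles)
    return ll

ys = [0.3, -0.1, 0.5]
a = pf_loglik(0.5, ys, n_particles=100, seed=42)
b = pf_loglik(0.5, ys, n_particles=100, seed=42)  # same seed, same estimate
```

Evaluating at nearby values of $\theta$ with the same seed reproduces the jagged curve of the "same seed" figure.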
  • 7. Motivation: hidden Markov models. [Plot of particle filter log-likelihood approximations against $\theta$ over $[0.5, 0.6]$, with a different seed for each evaluation, compared with the true log-likelihood.] Figure: Approximation of the log-likelihood by particle filters.
  • 8. Motivation: hidden Markov models. [Plot of particle filter log-likelihood approximations against $\theta$ over $[0.5, 0.6]$, with the same seed for every evaluation, compared with the true log-likelihood.] Figure: Approximation of the log-likelihood by particle filters.
  • 9. Motivation: general context. Let $\ell(\theta)$ denote a "log-likelihood", but it could be any function. Let $L(\theta) = \exp \ell(\theta)$, the "likelihood". Assume that we have access to estimators $\hat{L}(\theta)$ of $L(\theta)$ such that $\mathbb{E}[\hat{L}(\theta)] = L(\theta)$ and $\mathbb{V}[\hat{L}(\theta)/L(\theta)] = v(\theta)/M \approx v/M$ for all $\theta$.
  • 10. Finite difference. First derivative: $\widehat{\nabla \ell}(\theta^\star) = \{\log \hat{L}(\theta^\star + h) - \log \hat{L}(\theta^\star - h)\} / (2h)$ converges to $\nabla \ell(\theta^\star)$ when $M \to \infty$ and $h \to 0$. Second derivative: $\widehat{\nabla^2 \ell}(\theta^\star) = \{\log \hat{L}(\theta^\star + h) - 2 \log \hat{L}(\theta^\star) + \log \hat{L}(\theta^\star - h)\} / h^2$ converges to $\nabla^2 \ell(\theta^\star)$ when $M \to \infty$ and $h \to 0$.
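These two formulas take only a few lines. In the sketch below the noisy $\log \hat{L}$ is replaced by an exact Gaussian log-likelihood (an assumption for illustration only), so the central differences can be checked against the known derivatives:

```python
import math

def loglik(theta, y=1.2, sigma=1.0):
    # Exact Gaussian log-likelihood, standing in for log L-hat(theta);
    # in the intractable case this would be a particle filter estimate.
    return -0.5 * math.log(2 * math.pi * sigma**2) - 0.5 * (y - theta) ** 2 / sigma**2

def fd_score(ell, theta, h):
    # central difference: {ell(theta + h) - ell(theta - h)} / (2h)
    return (ell(theta + h) - ell(theta - h)) / (2 * h)

def fd_hessian(ell, theta, h):
    # second difference: {ell(theta + h) - 2 ell(theta) + ell(theta - h)} / h^2
    return (ell(theta + h) - 2 * ell(theta) + ell(theta - h)) / h**2

score = fd_score(loglik, 0.5, 1e-4)   # exact score is (y - theta)/sigma^2 = 0.7
hess = fd_hessian(loglik, 0.5, 1e-3)  # exact second derivative is -1/sigma^2 = -1
```

Since the log-likelihood here is quadratic, both differences are exact up to floating-point error; with a noisy $\hat{L}$ the bias-variance trade-off of the next slide kicks in.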
  • 11. Finite difference. Bias: $\mathbb{E}[\widehat{\nabla \ell}(\theta^\star)] - \nabla \ell(\theta^\star) = O(h^2)$. Variance: $\mathbb{V}[\widehat{\nabla \ell}(\theta^\star)] = O(1/(Mh^2))$. Optimal rate of convergence for the first derivative: $h \sim M^{-1/6}$, leading to MSE $\sim M^{-2/3}$. For the second derivative: $h \sim M^{-1/8}$, leading to MSE $\sim M^{-1/2}$.
  • 12. Outline. 1. Context: intractable derivatives; 2. Shift estimators; 3. Monte Carlo shift estimators.
  • 13. Iterated filtering. Ionides, Bhadra, Atchadé, King, Iterated filtering, Annals of Statistics, 2011. Given a log-likelihood and a given point $\theta^\star$, consider a "prior" distribution centered on $\theta^\star$, for instance $\Theta \sim \mathcal{N}(\theta^\star, \tau^2 \Sigma)$. We can consider "posterior" expectations, when they exist: $\mathbb{E}_{\mathrm{post}}[\varphi(\Theta)] = \mathbb{E}[\varphi(\Theta) \exp \ell(\Theta)] / \mathbb{E}[\exp \ell(\Theta)]$, where the expectations $\mathbb{E}$ are with respect to the "prior".
  • 14. Iterated filtering. Under conditions on the prior distribution and on the log-likelihood, one gets the following result. Posterior expectation when the prior variance goes to zero: for any $\theta^\star$, there exists a $C$ such that, for $\tau$ small enough, $|\tau^{-2} \Sigma^{-1} (\mathbb{E}_{\mathrm{post}}[\Theta] - \theta^\star) - \nabla \ell(\theta^\star)| \leq C\tau$. Phrased simply: $(\text{posterior mean} - \text{prior mean}) / \text{prior variance} \approx \text{score}$. Result from Ionides, Bhadra, Atchadé, King, Iterated filtering, 2011.
  • 15. Normal priors. Let us focus on the normal case: $\Theta \sim \mathcal{N}(\theta^\star, \tau^2 \Sigma)$. Stein's lemma says: assume that $\mathbb{E}[|\nabla_i L(\Theta)|] < \infty$ and $\mathbb{E}[|\nabla^2_{ij} L(\Theta)|] < \infty$ for $i, j \in \{1, \ldots, d\}$. Then we have $\mathbb{E}_{\mathrm{post}}[\Theta - \theta^\star] = \tau^2 \Sigma \, \mathbb{E}_{\mathrm{post}}[\nabla \ell(\Theta)]$, and $\mathbb{E}_{\mathrm{post}}[(\Theta - \theta^\star)(\Theta - \theta^\star)^T] = \tau^2 \Sigma + \tau^4 \Sigma \, \mathbb{E}_{\mathrm{post}}[\nabla^2 \ell(\Theta) + \nabla \ell(\Theta) \nabla \ell(\Theta)^T] \, \Sigma$.
  • 16. Expected derivatives and derivatives. We do not want to estimate $\mathbb{E}_{\mathrm{post}}[\nabla \ell(\Theta)]$ but rather $\nabla \ell(\theta^\star)$. Posterior moment expansion: when $\tau \to 0$, we have, for $\varphi$ satisfying some conditions, $\mathbb{E}_{\mathrm{post}}[\varphi(\Theta)] = \varphi(\theta^\star) + \frac{\tau^2}{2} \sum_{i=1}^{d} \sum_{j=1}^{d} \{\nabla^2_{ij} \varphi(\theta^\star) + 2 \nabla_i \varphi(\theta^\star) \nabla_j \ell(\theta^\star)\} \Sigma_{ij} + O(\tau^4)$.
  • 17. Expected derivatives and derivatives. Conditions for this to hold: let $\varphi : \mathbb{R}^d \to \mathbb{R}$ be a four times continuously differentiable function. Assume there exist a constant $M < \infty$ and $\delta > 0$ such that $|\partial^4 \varphi(\theta) / \partial\theta_i \partial\theta_j \partial\theta_k \partial\theta_l| \leq M$ for all $\theta \in B_\Sigma(\theta^\star, \delta)$ and all $(i, j, k, l) \in \{1, \ldots, d\}^4$. The likelihood function $L$ is such that both $\theta \mapsto L(\theta)$ and $\theta \mapsto \varphi(\theta) L(\theta)$ satisfy the same assumption as $\varphi$. There exists $\tau_0 > 0$ such that $\mathbb{E}_{\tau_0}[|\varphi(\Theta)|] < \infty$, $\mathbb{E}_{\tau_0}[|L(\Theta)|] < \infty$ and $\mathbb{E}_{\tau_0}[|\varphi(\Theta) L(\Theta)|] < \infty$.
  • 18. Shift estimators. $S_\tau(\theta^\star) = \tau^{-2} \Sigma^{-1} \mathbb{E}_{\mathrm{post}}[\Theta - \theta^\star] = \nabla \ell(\theta^\star) + O(\tau^2)$, and $I_\tau(\theta^\star) = \tau^{-4} \Sigma^{-1} (\mathbb{V}_{\mathrm{post}}[\Theta] - \tau^2 \Sigma) \Sigma^{-1} = \nabla^2 \ell(\theta^\star) + O(\tau^2)$. Using Monte Carlo approximations of $\mathbb{E}_{\mathrm{post}}[\Theta]$ and $\mathbb{V}_{\mathrm{post}}[\Theta]$, these formulas provide estimators of $\nabla \ell(\theta^\star)$ and $\nabla^2 \ell(\theta^\star)$.
  • 19. Example: normal likelihood. The parameter $\theta$ represents a $d$-dimensional location, and $Y \sim \mathcal{N}(\theta, \Lambda_y^{-1})$. Thus the derivatives of the log-likelihood are $\nabla \ell(\theta) = -\Lambda_y (\theta - y)$ and $\nabla^2 \ell(\theta) = -\Lambda_y$. Using a prior $\mathcal{N}(\theta^\star, \tau^2 \Sigma)$, the posterior is conjugate and the shift estimators are given by $S_\tau(\theta^\star) = (I + \tau^2 \Sigma \Lambda_y)^{-1} (-\Lambda_y (\theta^\star - y))$ and $I_\tau(\theta^\star) = (I + \tau^2 \Sigma \Lambda_y)^{-1} (-\Lambda_y)$.
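In the scalar case ($d = 1$, with $\Lambda_y$ and $\Sigma$ scalars) this conjugate computation is easy to reproduce. The sketch below (illustrative names, not from the talk) forms the posterior mean and variance in closed form and plugs them into the shift-estimator formulas of slide 18:

```python
def shift_estimators(theta0, y, lam_y, tau, sigma=1.0):
    # Scalar conjugate-normal case: prior Theta ~ N(theta0, tau^2 * sigma),
    # likelihood Y ~ N(theta, 1 / lam_y). All names are illustrative.
    prior_var = tau**2 * sigma
    post_var = 1.0 / (1.0 / prior_var + lam_y)
    post_mean = post_var * (theta0 / prior_var + lam_y * y)
    # Shift estimators: rescaled posterior mean shift and excess variance.
    s = (post_mean - theta0) / prior_var                  # S_tau(theta0)
    i = (post_var - prior_var) / prior_var**2             # I_tau(theta0)
    return s, i

# For tau -> 0 these converge to the exact derivatives
# -lam_y * (theta0 - y) and -lam_y.
s, i = shift_estimators(theta0=0.5, y=1.0, lam_y=2.0, tau=0.1)
```

Working out the algebra, `s` equals $-\Lambda_y(\theta^\star - y) / (1 + \tau^2 \Sigma \Lambda_y)$ and `i` equals $-\Lambda_y / (1 + \tau^2 \Sigma \Lambda_y)$, matching the slide's closed forms.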
  • 20. Outline. 1. Context: intractable derivatives; 2. Shift estimators; 3. Monte Carlo shift estimators.
  • 21. Monte Carlo shift estimators. Draw $\theta_i \sim \mathcal{N}(\theta^\star, \tau^2 \Sigma)$ for each $i \in \{1, \ldots, N\}$. Draw $\hat{w}_i = \hat{L}(\theta_i)$ for each $i \in \{1, \ldots, N\}$. Normalize the weights: for each $i$, $\hat{W}_i = \hat{w}_i / \sum_{j=1}^{N} \hat{w}_j$. Then we can estimate $\tau^{-2} \Sigma^{-1} (\mathbb{E}_{\mathrm{post}}[\Theta] - \theta^\star)$ by $\tau^{-2} \Sigma^{-1} (\sum_{i=1}^{N} \hat{W}_i \theta_i - \theta^\star)$, or, even better, by $\tau^{-2} \Sigma^{-1} (\sum_{i=1}^{N} \hat{W}_i \theta_i - \frac{1}{N} \sum_{i=1}^{N} \theta_i)$.
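A scalar version of this recipe ($\Sigma = 1$, with an exact Gaussian likelihood standing in for $\hat{L}$, purely for illustration) might look like:

```python
import math
import random

def mc_shift_score(lhat, theta0, tau, N, rng):
    # Draw theta_i ~ N(theta0, tau^2), weight by (possibly noisy) likelihood
    # estimates, and apply the shift formula with the empirical prior mean
    # as control variate.
    thetas = [theta0 + tau * rng.gauss(0.0, 1.0) for _ in range(N)]
    w = [lhat(t) for t in thetas]
    total = sum(w)
    W = [wi / total for wi in w]
    post_mean = sum(Wi * ti for Wi, ti in zip(W, thetas))
    prior_mean = sum(thetas) / N
    return (post_mean - prior_mean) / tau**2

# Gaussian likelihood with observation y = 1.2, so the exact score
# at theta0 = 0.5 is y - theta0 = 0.7 (illustrative choice).
lhat = lambda t: math.exp(-0.5 * (1.2 - t) ** 2)
score = mc_shift_score(lhat, 0.5, tau=0.1, N=20000, rng=random.Random(1))
```

In the intractable setting, `lhat` would return an unbiased likelihood estimate (e.g. from a particle filter) instead of an exact value.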
  • 22. Monte Carlo shift estimators. For the second-order derivative, recall that $\tau^{-4} \Sigma^{-1} (\mathbb{V}_{\mathrm{post}}[\Theta] - \tau^2 \Sigma) \Sigma^{-1} = \nabla^2 \ell(\theta^\star) + O(\tau^2)$. An importance sampling strategy (with control variates) leads to $\tau^{-4} \Sigma^{-1} \{\sum_{i=1}^{N} \hat{W}_i (\theta_i - \sum_{j=1}^{N} \hat{W}_j \theta_j)^2 - \frac{1}{N} \sum_{i=1}^{N} (\theta_i - \frac{1}{N} \sum_{j=1}^{N} \theta_j)^2\} \Sigma^{-1}$, with an obvious extension in the multivariate case.
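The second-order formula can also be sketched in the scalar case ($\Sigma = 1$), again with an exact Gaussian likelihood standing in for $\hat{L}$ (an illustrative assumption only):

```python
import math
import random

def mc_shift_info(lhat, theta0, tau, N, rng):
    # Weighted posterior variance minus the empirical prior variance
    # (the control variate), rescaled by tau^4: estimates the second
    # derivative of the log-likelihood at theta0.
    thetas = [theta0 + tau * rng.gauss(0.0, 1.0) for _ in range(N)]
    w = [lhat(t) for t in thetas]
    total = sum(w)
    W = [wi / total for wi in w]
    m_post = sum(Wi * ti for Wi, ti in zip(W, thetas))
    v_post = sum(Wi * (ti - m_post) ** 2 for Wi, ti in zip(W, thetas))
    m_prior = sum(thetas) / N
    v_prior = sum((ti - m_prior) ** 2 for ti in thetas) / N
    return (v_post - v_prior) / tau**4

# Gaussian likelihood with unit curvature: the exact second derivative is -1.
lhat = lambda t: math.exp(-0.5 * (1.2 - t) ** 2)
info = mc_shift_info(lhat, 0.5, tau=0.2, N=50000, rng=random.Random(2))
```

The estimate carries the $O(\tau^2)$ bias and Monte Carlo noise discussed on the next slide, so it is only close to the exact value, not equal to it.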
  • 23. Monte Carlo shift estimators. Score estimator: $S_{N,\tau}(\theta^\star) = \tau^{-2} \Sigma^{-1} (\sum_{i=1}^{N} \hat{W}_i \theta_i - \frac{1}{N} \sum_{i=1}^{N} \theta_i)$. Bias: $\mathbb{E}[S_{N,\tau}(\theta^\star)] - \nabla \ell(\theta^\star) = O(\tau^2)$. Variance: $\mathbb{V}[S_{N,\tau}(\theta^\star)] = O(1/(N\tau^2))$. Hence the same rate of convergence as the finite difference method! Similar results hold for the estimator of $\nabla^2 \ell(\theta^\star)$.
  • 24. Monte Carlo shift estimators. The shift estimators are competitive with the gold standard! The shift estimators are not better than the gold standard! So why bother? Why is Iterated Filtering used in practice? Let's have a look at the non-asymptotic behaviour.
  • 25. Robustness. Recall: $S_{N,\tau}(\theta^\star) = \tau^{-2} \Sigma^{-1} (\sum_{i=1}^{N} \hat{W}_i \theta_i - \frac{1}{N} \sum_{i=1}^{N} \theta_i)$, where $\hat{W}_i = \hat{L}(\theta_i) / \sum_{j=1}^{N} \hat{L}(\theta_j)$ and $\mathbb{V}[\hat{L}(\theta_i)/L(\theta_i)] = v/M$. Scenario: $v$ is very large. Then $\sum_{i=1}^{N} \hat{W}_i \theta_i \approx \theta_j$ for some $j$. We obtain, for all $v$, $\mathbb{V}[S_{N,\tau}(\theta^\star)] \leq C\tau^{-2}$. On the other hand, the variance of finite difference estimators increases to $\infty$ with $v$.
  • 26. Example: normal likelihood. Latent variable model: $X \sim \mathcal{N}(\theta, \Lambda_y^{-1} - \lambda \Lambda_{y|x}^{-1})$ and $Y \mid X = x \sim \mathcal{N}(x, \lambda \Lambda_{y|x}^{-1})$, for fixed matrices $\Lambda_y$ and $\Lambda_{y|x}$, and $\lambda \in (0, 1)$. This still corresponds to $Y \sim \mathcal{N}(\theta, \Lambda_y^{-1})$, but a naive Monte Carlo approach has a relative variance that diverges when $\lambda$ goes to zero. The following figure represents the behaviour of the MSE for the Monte Carlo shift estimator and the finite difference estimator as $\lambda$ goes to zero, i.e. $v \to \infty$.
  • 27. Robustness. [Log-log plot of MSE against $\lambda$ decreasing from $10^{-1}$ to $10^{-6}$, comparing the Monte Carlo shift estimator with the finite difference estimator.] Figure: Robustness of the Monte Carlo shift estimator when the variance of the likelihood estimator increases.
  • 28. Discussion. Monte Carlo shift estimators and finite difference estimators are equivalent in mean squared error ($N \to \infty$). Monte Carlo shift estimators are robust to the noise of the likelihood estimators ($v \to \infty$); this might explain the observed gain within optimization procedures. There are specific forms of shift estimators for hidden Markov models (e.g. Iterated Filtering). The behaviour of the posterior distribution when the prior is a concentrated normal distribution is interesting in its own right.
  • 29. Bibliography. Main references: Inference for nonlinear dynamical systems, Ionides, Breto, King, PNAS, 2006. Iterated filtering, Ionides, Bhadra, Atchadé, King, Annals of Statistics, 2011. Derivative-Free Estimation of the Score Vector and Observed Information Matrix, Doucet, Jacob, Rubenthaler (on arXiv, to be updated).