Vanilla Rao–Blackwellisation of
        Metropolis–Hastings algorithms

                    Christian P. Robert
         Université Paris-Dauphine, IuF, and CREST
Joint works with Randal Douc, Pierre Jacob and Murray Smith

              xian@ceremade.dauphine.fr



                    November 16, 2011


Main themes



   1   Rao–Blackwellisation on MCMC
   2   Can be performed in any Metropolis–Hastings algorithm
   3   Asymptotically more efficient than usual MCMC, with a
       controlled additional computing cost
   4   Takes advantage of parallel capacities at a very basic level
       (GPUs)




Metropolis Hastings revisited
                          Rao–Blackwellisation
                       Rao-Blackwellisation (2)


Outline


  1   Metropolis Hastings revisited

  2   Rao–Blackwellisation
        Formal importance sampling
        Variance reduction
        Asymptotic results
        Illustrations

  3   Rao-Blackwellisation (2)
        Independent case
        General MH algorithms



Metropolis–Hastings algorithm

    1   We wish to approximate

        $$I = \frac{\int h(x)\,\pi(x)\,dx}{\int \pi(x)\,dx} = \int h(x)\,\bar\pi(x)\,dx$$

    2   $\pi(x)$ is known but not $\int \pi(x)\,dx$.
    3   Approximate $I$ with $\delta = \frac{1}{n}\sum_{t=1}^{n} h(x^{(t)})$, where $(x^{(t)})$ is a Markov
        chain with limiting distribution $\bar\pi$.
    4   Convergence obtained from the Law of Large Numbers or the CLT for
        Markov chains.


Metropolis–Hastings algorithm
  Suppose that $x^{(t)}$ is drawn.
    1   Simulate $y_t \sim q(\cdot \mid x^{(t)})$.
    2   Set $x^{(t+1)} = y_t$ with probability

        $$\alpha(x^{(t)}, y_t) = \min\left\{1, \frac{\pi(y_t)}{\pi(x^{(t)})}\,\frac{q(x^{(t)} \mid y_t)}{q(y_t \mid x^{(t)})}\right\}$$

        Otherwise, set $x^{(t+1)} = x^{(t)}$.
    3   $\alpha$ is such that the detailed balance equation is satisfied:

        $$\pi(x)\,q(y \mid x)\,\alpha(x, y) = \pi(y)\,q(x \mid y)\,\alpha(y, x).$$

        Hence $\bar\pi$ is the stationary distribution of $(x^{(t)})$.
    The accepted candidates are simulated with the rejection algorithm.
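
The two steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the target (a standard normal known up to a constant), the Gaussian random-walk proposal, and the names `log_target` and `metropolis_hastings` are hypothetical choices, not taken from the talk.

```python
import math
import random


def log_target(x):
    # Unnormalised log pi(x); a standard normal is assumed purely for illustration.
    return -0.5 * x * x


def metropolis_hastings(n_iter, x0=0.0, scale=1.0, seed=0):
    """Random-walk Metropolis-Hastings; returns the chain and the accept flags."""
    rng = random.Random(seed)
    x = x0
    chain, accepted = [], []
    for _ in range(n_iter):
        # Step 1: simulate y_t ~ q(.|x^(t)), here a Gaussian centred at x.
        y = x + scale * rng.gauss(0.0, 1.0)
        # Step 2: the proposal is symmetric, so the q terms cancel in alpha.
        log_alpha = min(0.0, log_target(y) - log_target(x))
        u = 1.0 - rng.random()  # uniform on (0, 1], keeps log(u) finite
        move = math.log(u) <= log_alpha
        if move:
            x = y
        chain.append(x)
        accepted.append(move)
    return chain, accepted


if __name__ == "__main__":
    chain, accepted = metropolis_hastings(20000)
    print("estimate of E[x^2]:", sum(v * v for v in chain) / len(chain))
    print("acceptance rate:", sum(accepted) / len(accepted))
```

Working on the log scale avoids overflow in the ratio; only the unnormalised density is ever evaluated, which is the point of item 2 on the earlier slide.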
Some properties of the MH algorithm

    1   Alternative representation of the estimator $\delta$ is

        $$\delta = \frac{1}{n}\sum_{t=1}^{n} h(x^{(t)}) = \frac{1}{N}\sum_{i=1}^{M_N} n_i\,h(z_i),$$

        where
             the $z_i$'s are the accepted $y_j$'s,
             $M_N$ is the number of accepted $y_j$'s up to time $N$ (with $N = n$ the length of the chain),
             $n_i$ is the number of times $z_i$ appears in the sequence $(x^{(t)})_t$.
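
As a small check of this representation, the following sketch collapses a chain into its $(z_i, n_i)$ pairs and compares the two averages. The chain values and the function `h` are made up for illustration, and the helper `accepted_values_and_counts` is a hypothetical name; it assumes a continuous proposal, so that equal consecutive values can only come from rejections.

```python
import math
from itertools import groupby


def accepted_values_and_counts(chain):
    """Collapse a chain into (z_i, n_i): distinct consecutive values and run lengths."""
    return [(z, sum(1 for _ in run)) for z, run in groupby(chain)]


def h(x):
    return x * x  # arbitrary test function, chosen for illustration


chain = [0.3, 0.3, 1.2, 1.2, 1.2, -0.4, 0.9, 0.9]  # made-up chain, N = 8
pairs = accepted_values_and_counts(chain)

# Left-hand side: plain ergodic average over the chain.
ergodic_average = sum(h(x) for x in chain) / len(chain)
# Right-hand side: accepted values weighted by their occupation counts.
weighted_average = sum(n * h(z) for z, n in pairs) / len(chain)

print(pairs)
print(math.isclose(ergodic_average, weighted_average))
```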




$$\tilde q(\cdot \mid z_i) = \frac{\alpha(z_i, \cdot)\,q(\cdot \mid z_i)}{p(z_i)} \le \frac{q(\cdot \mid z_i)}{p(z_i)},$$

where $p(z_i) = \int \alpha(z_i, y)\,q(y \mid z_i)\,dy$. To simulate from $\tilde q(\cdot \mid z_i)$:
  1   Propose a candidate $y \sim q(\cdot \mid z_i)$.
  2   Accept with probability

      $$\tilde q(y \mid z_i) \bigg/ \frac{q(y \mid z_i)}{p(z_i)} = \alpha(z_i, y).$$

      Otherwise, reject it and start again.
  This is the transition of the MH algorithm. The transition kernel $\tilde q$
admits $\tilde\pi$ as a stationary distribution:

$$\begin{aligned}
\tilde\pi(x)\,\tilde q(y \mid x)
  &= \underbrace{\frac{\pi(x)\,p(x)}{\int \pi(u)\,p(u)\,du}}_{\tilde\pi(x)}\;\underbrace{\frac{\alpha(x, y)\,q(y \mid x)}{p(x)}}_{\tilde q(y \mid x)} \\
  &= \frac{\pi(x)\,\alpha(x, y)\,q(y \mid x)}{\int \pi(u)\,p(u)\,du} \\
  &= \frac{\pi(y)\,\alpha(y, x)\,q(x \mid y)}{\int \pi(u)\,p(u)\,du} \\
  &= \tilde\pi(y)\,\tilde q(x \mid y).
\end{aligned}$$


Lemma (Douc & X., AoS, 2011)

The sequence $(z_i, n_i)$ satisfies
  1   $(z_i, n_i)_i$ is a Markov chain;
  2   $z_{i+1}$ and $n_i$ are independent given $z_i$;
  3   $n_i$ is distributed as a geometric random variable with probability
      parameter

      $$p(z_i) := \int \alpha(z_i, y)\,q(y \mid z_i)\,dy; \tag{1}$$

  4   $(z_i)_i$ is a Markov chain with transition kernel $\tilde Q(z, dy) = \tilde q(y \mid z)\,dy$
      and stationary distribution $\tilde\pi$ such that

      $$\tilde q(\cdot \mid z) \propto \alpha(z, \cdot)\,q(\cdot \mid z) \quad\text{and}\quad \tilde\pi(\cdot) \propto \pi(\cdot)\,p(\cdot).$$
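
Item 3 of the lemma can be checked empirically. The sketch below fixes a state $z$, draws the next accepted value by rejection, records the number of proposals needed, and compares the average count with a Monte Carlo estimate of $1/p(z)$. The $N(0,1)$ target, the Gaussian random-walk proposal, the value $z = 1.5$ and all function names are assumptions made for illustration only.

```python
import math
import random


def alpha(z, y):
    # MH acceptance probability for an N(0,1) target and a symmetric proposal.
    return 1.0 if y * y <= z * z else math.exp(0.5 * (z * z - y * y))


def propose(z, rng):
    return z + rng.gauss(0.0, 1.0)  # q(.|z): Gaussian random walk (assumption)


def next_accepted(z, rng):
    """Rejection sampling from q~(.|z); returns (z_next, n), n = number of proposals."""
    n = 1
    while True:
        y = propose(z, rng)
        if rng.random() < alpha(z, y):
            return y, n
        n += 1


def check(z=1.5, n_rep=50000, seed=1):
    """Return (empirical mean of n, Monte Carlo estimate of 1/p(z))."""
    rng = random.Random(seed)
    counts = [next_accepted(z, rng)[1] for _ in range(n_rep)]
    p_hat = sum(alpha(z, propose(z, rng)) for _ in range(n_rep)) / n_rep
    return sum(counts) / n_rep, 1.0 / p_hat


if __name__ == "__main__":
    print(check())
```

The two returned numbers should agree up to Monte Carlo error, in line with a geometric count whose mean is $1/p(z)$.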


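Old bottle, new wine [or vice-versa]

  [Diagram: dependence graph of the accepted values $z_{i-1}$, $z_i$, $z_{i+1}$ and the counts $n_{i-1}$, $n_i$, with links labelled "indep".]

$$\delta = \frac{1}{n}\sum_{t=1}^{n} h(x^{(t)}) = \frac{1}{N}\sum_{i=1}^{M_N} n_i\,h(z_i).$$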


Importance sampling perspective




    1   A natural idea:
                                                       MN
                                                   1         h(zi )
                                        δ∗ =                        ,
                                                   N         p(zi )
                                                       i=1




                                                                                    12 / 36
Formal importance sampling
                   Metropolis Hastings revisited
                                                   Variance reduction
                          Rao–Blackwellisation
                                                   Asymptotic results
                       Rao-Blackwellisation (2)
                                                   Illustrations


Importance sampling perspective



    1   A natural idea:

                                      MN    h(zi )       MN   π(zi )
                                      i=1                i=1          h(zi )
                                            p(zi )            π (zi )
                                                               ˜
                          δ∗                       =                         .
                                      MN      1              MN π(zi )
                                      i=1                    i=1
                                            p(zi )               π (zi )
                                                                 ˜




                                                                                 12 / 36
Formal importance sampling
                   Metropolis Hastings revisited
                                                   Variance reduction
                          Rao–Blackwellisation
                                                   Asymptotic results
                       Rao-Blackwellisation (2)
                                                   Illustrations


Importance sampling perspective



    1   A natural idea:

                                      MN    h(zi )       MN   π(zi )
                                      i=1                i=1          h(zi )
                                            p(zi )            π (zi )
                                                               ˜
                          δ∗                       =                         .
                                      MN      1              MN π(zi )
                                      i=1                    i=1
                                            p(zi )               π (zi )
                                                                 ˜

    2   But p not available in closed form.




                                                                                 12 / 36
Formal importance sampling
                   Metropolis Hastings revisited
                                                   Variance reduction
                          Rao–Blackwellisation
                                                   Asymptotic results
                       Rao-Blackwellisation (2)
                                                   Illustrations


Importance sampling perspective


    1   A natural idea:

        \[
          \delta^* = \frac{\sum_{i=1}^{M_N} h(z_i)/p(z_i)}{\sum_{i=1}^{M_N} 1/p(z_i)}
                   = \frac{\sum_{i=1}^{M_N} \{\pi(z_i)/\tilde\pi(z_i)\}\, h(z_i)}{\sum_{i=1}^{M_N} \pi(z_i)/\tilde\pi(z_i)} .
        \]

    2   But $p$ is not available in closed form.
    3   The geometric $n_i$ is the obvious replacement, and it is the one used in
        the original Metropolis–Hastings estimate, since $E[n_i] = 1/p(z_i)$.
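
  The importance sampling reading of the Metropolis–Hastings average can be checked numerically. The sketch below is only an illustration under stated assumptions: a standard normal target, a Gaussian random walk proposal, and hypothetical helper names (`random_walk_mh`, `compress`, `weighted_estimate`) that do not come from the slides. It collapses the chain into accepted values $z_i$ with occupation counts $n_i$ and compares the plain average with $\sum_i n_i h(z_i) / \sum_i n_i$.

```python
import math
import random


def log_target(x):
    # Assumed toy target: standard normal density, up to a constant.
    return -0.5 * x * x


def random_walk_mh(n_iter, scale, rng, x0=0.0):
    """Random-walk Metropolis chain x^(1), ..., x^(N) with a symmetric Gaussian proposal."""
    chain, x = [], x0
    for _ in range(n_iter):
        y = x + scale * rng.gauss(0.0, 1.0)
        accept_prob = math.exp(min(0.0, log_target(y) - log_target(x)))
        if rng.random() < accept_prob:
            x = y
        chain.append(x)
    return chain


def compress(chain):
    """Collapse runs of repeated states into accepted values z_i and counts n_i."""
    zs, ns = [], []
    for x in chain:
        if zs and x == zs[-1]:
            ns[-1] += 1
        else:
            zs.append(x)
            ns.append(1)
    return zs, ns


def weighted_estimate(zs, ns, h):
    """Self-normalised form sum_i n_i h(z_i) / sum_i n_i."""
    return sum(n * h(z) for z, n in zip(zs, ns)) / sum(ns)


rng = random.Random(1)
chain = random_walk_mh(5000, 2.0, rng)
zs, ns = compress(chain)
h = lambda x: x * x
plain = sum(h(x) for x in chain) / len(chain)
weighted = weighted_estimate(zs, ns, h)
print(plain, weighted, len(zs))
```

  Both numbers agree up to floating-point summation order, which is the sense in which the usual estimate already uses $n_i$ as a crude stand-in for $1/p(z_i)$.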




The Bernoulli factory
  The crude estimate of $1/p(z_i)$,
  \[
    n_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\} ,
  \]
  can be improved:

  Lemma (Douc & X., AoS, 2011)
  If $(y_j)_j$ is an iid sequence with distribution $q(y|z_i)$, the quantity
  \[
    \hat\xi_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \{1 - \alpha(z_i, y_\ell)\}
  \]
  is an unbiased estimator of $1/p(z_i)$ whose variance, conditional on $z_i$, is
  lower than the conditional variance of $n_i$, $\{1 - p(z_i)\}/p^2(z_i)$.
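
  A minimal simulation sketch of the lemma, assuming a standard normal target and a symmetric Gaussian random walk proposal (both are illustrative choices, as are the names `draw_n`, `draw_xi` and the truncation tolerance `tol`). It draws $n_i$ and $\hat\xi_i$ repeatedly at a fixed $z_i$ and compares their empirical means and variances.

```python
import math
import random


def alpha(z, y):
    # Acceptance probability for a symmetric proposal and an assumed N(0,1) target.
    return min(1.0, math.exp(0.5 * (z * z - y * y)))


def draw_n(z, scale, rng):
    """Geometric count n_i: 1 + number of rejections before the first acceptance."""
    n = 1
    while rng.random() >= alpha(z, z + scale * rng.gauss(0.0, 1.0)):
        n += 1
    return n


def draw_xi(z, scale, rng, tol=1e-12):
    """xi_hat_i = 1 + sum_j prod_{l<=j} (1 - alpha(z, y_l)).

    The running product is non-increasing; the sum is stopped once the product
    is zero or below tol (the tolerance is an assumption made for this sketch).
    """
    total, prod = 1.0, 1.0
    while True:
        prod *= 1.0 - alpha(z, z + scale * rng.gauss(0.0, 1.0))
        if prod <= tol:
            return total
        total += prod


def mean_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)


rng = random.Random(0)
ns = [draw_n(1.0, 1.0, rng) for _ in range(20000)]
xis = [draw_xi(1.0, 1.0, rng) for _ in range(20000)]
print(mean_var(ns), mean_var(xis))
```

  Both empirical means estimate the same quantity $1/p(z_i)$, while the spread of $\hat\xi_i$ is visibly smaller than that of $n_i$.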

Rao-Blackwellised, for sure?
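  \[
    \hat\xi_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \{1 - \alpha(z_i, y_\ell)\}
  \]

    1   Infinite sum, but finite with at least positive probability: the product
        vanishes as soon as one acceptance probability

        \[
          \alpha(x^{(t)}, y_t) = \min\left\{ 1, \frac{\pi(y_t)}{\pi(x^{(t)})} \, \frac{q(x^{(t)}|y_t)}{q(y_t|x^{(t)})} \right\}
        \]

        is equal to 1. For example: take a symmetric random walk as a proposal.
    2   What if we wish to be sure that the sum is finite?
  Finite horizon improvement:
  \[
    \hat\xi_i^k = 1 + \sum_{j=1}^{\infty} \prod_{1 \le \ell \le k \wedge j} \{1 - \alpha(z_i, y_\ell)\}
                  \prod_{k+1 \le \ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\}
  \]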

Variance improvement

  Proposition (Douc & X., AoS, 2011)
  If $(y_j)_j$ is an iid sequence with distribution $q(y|z_i)$ and $(u_j)_j$ is an iid
  uniform sequence, for any $k \ge 0$, the quantity
  \[
    \hat\xi_i^k = 1 + \sum_{j=1}^{\infty} \prod_{1 \le \ell \le k \wedge j} \{1 - \alpha(z_i, y_\ell)\}
                  \prod_{k+1 \le \ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\}
  \]
  is an unbiased estimator of $1/p(z_i)$ with an almost surely finite number of
  terms. Moreover, for $k \ge 1$,
  \[
    \mathbb{V}[\hat\xi_i^k \,|\, z_i] = \frac{1 - p(z_i)}{p^2(z_i)}
      - \frac{1 - (1 - 2p(z_i) + r(z_i))^k}{2p(z_i) - r(z_i)} \; \frac{2 - p(z_i)}{p^2(z_i)} \; (p(z_i) - r(z_i)) ,
  \]
  where $p(z_i) := \int \alpha(z_i, y)\, q(y|z_i)\, dy$ and $r(z_i) := \int \alpha^2(z_i, y)\, q(y|z_i)\, dy$.
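
  The variance expression can be explored numerically. The helper below is a sketch with a hypothetical name (`var_xi_k`) and hypothetical argument checks; it simply evaluates the formula of the proposition from given values of $p(z_i)$ and $r(z_i)$. Two sanity checks follow from the formula itself: at $k = 0$ it returns the geometric variance $(1-p)/p^2$, and when $\alpha$ is constant (so that $r = p^2$) it tends to 0 as $k$ grows.

```python
def var_xi_k(p, r, k):
    """Conditional variance of xi_hat_i^k given z_i, from p = p(z_i) and r = r(z_i)."""
    if not (0.0 < p <= 1.0 and 0.0 <= r <= p):
        raise ValueError("need 0 < p <= 1 and 0 <= r <= p")
    geometric_var = (1.0 - p) / p**2          # variance of n_i (the k = 0 case)
    rho = 1.0 - 2.0 * p + r                   # equals E[(1 - alpha)^2]
    gain = (1.0 - rho**k) / (2.0 * p - r) * (2.0 - p) / p**2 * (p - r)
    return geometric_var - gain


# Illustrative values only: p = 0.7, r = 0.6 satisfies p^2 <= r <= p.
for k in range(6):
    print(k, var_xi_k(0.7, 0.6, k))
```

  Since $1 - 2p + r = E[(1-\alpha)^2] < 1$, the gain term increases with $k$ at a geometric rate, so most of the reduction is obtained for small horizons.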


  Therefore, for any $k \ge 0$, we have
  \[
    \mathbb{V}[\hat\xi_i \,|\, z_i] \le \mathbb{V}[\hat\xi_i^k \,|\, z_i] \le \mathbb{V}[\hat\xi_i^0 \,|\, z_i] = \mathbb{V}[n_i \,|\, z_i] .
  \]
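
  A simulation sketch of the finite-horizon weight $\hat\xi_i^k$, again assuming a standard normal target and a symmetric Gaussian random walk proposal (illustrative choices; `draw_xi_k` is a hypothetical name). The first $k$ factors use $1 - \alpha$, the later ones use the rejection indicators, so the sum stops at the first acceptance after the horizon, or earlier if some $\alpha$ equals 1.

```python
import math
import random


def alpha(z, y):
    # Assumed setting: N(0,1) target, symmetric random-walk proposal.
    return min(1.0, math.exp(0.5 * (z * z - y * y)))


def draw_xi_k(z, k, scale, rng):
    """One draw of xi_hat_i^k at the current value z."""
    total, prod, ell = 1.0, 1.0, 0
    while prod > 0.0:
        ell += 1
        a = alpha(z, z + scale * rng.gauss(0.0, 1.0))
        u = rng.random()
        if ell <= k:
            prod *= 1.0 - a      # Rao-Blackwellised factor within the horizon
        elif u < a:
            prod = 0.0           # acceptance beyond the horizon: indicator is 0
        total += prod
    return total


rng = random.Random(0)
stats = {}
for k in (0, 1, 5):
    draws = [draw_xi_k(1.0, k, 1.0, rng) for _ in range(20000)]
    m = sum(draws) / len(draws)
    v = sum((d - m) ** 2 for d in draws) / len(draws)
    stats[k] = (m, v)
    print(k, m, v)
```

  With $k = 0$ the draws are integer counts distributed as $n_i$; increasing $k$ leaves the mean unchanged and lowers the variance, in line with the ordering stated above.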




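  [Diagram: the accepted values $z_{i-1}$, $z_i$, $z_{i+1}$ in a row, each pair of
  neighbours marked "not indep"; the weights $\hat\xi_{i-1}^k$ and $\hat\xi_i^k$ in a
  row below, with the links between the two rows also marked "not indep".]

  \[
    \delta_M^k = \frac{\sum_{i=1}^{M} \hat\xi_i^k \, h(z_i)}{\sum_{i=1}^{M} \hat\xi_i^k} .
  \]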
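
  To make the estimator $\delta_M^k$ concrete, here is a minimal end-to-end sketch. It rests on several assumptions that are not spelled out on the slides: a standard normal target, a Gaussian random walk proposal, and one possible coupling in which the proposals that drive the chain are reused inside the weights (extra proposals are simulated only when the weight needs them). The names `rb_chain` and `delta_hat` are hypothetical.

```python
import math
import random


def alpha(z, y):
    # Assumed setting: N(0,1) target, symmetric random-walk proposal.
    return min(1.0, math.exp(0.5 * (z * z - y * y)))


def rb_chain(m, k, scale, rng, z0=0.0):
    """Accepted values z_1..z_M with weights xi_hat_i^k and raw counts n_i."""
    zs, xis, ns = [], [], []
    z = z0
    for _ in range(m):
        total, prod, ell = 1.0, 1.0, 0
        n, nxt = None, None
        while prod > 0.0:
            ell += 1
            y = z + scale * rng.gauss(0.0, 1.0)
            a = alpha(z, y)
            u = rng.random()
            if n is None and u < a:
                n, nxt = ell, y      # first acceptance gives n_i and z_{i+1}
            if ell <= k:
                prod *= 1.0 - a
            elif u < a:
                prod = 0.0
            total += prod
        zs.append(z)
        xis.append(total)
        ns.append(n)
        z = nxt
    return zs, xis, ns


def delta_hat(zs, weights, h):
    """Self-normalised estimate sum_i w_i h(z_i) / sum_i w_i."""
    return sum(w * h(z) for z, w in zip(zs, weights)) / sum(weights)


rng = random.Random(0)
zs, xis, ns = rb_chain(20000, 5, 2.0, rng)
print(delta_hat(zs, ns, lambda x: x * x), delta_hat(zs, xis, lambda x: x * x))
```

  Using the counts `ns` as weights gives back the usual Metropolis–Hastings average over the same accepted values, while `xis` gives $\delta_M^k$; with $k = 0$ the two sets of weights coincide.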




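Let
\[
  \delta_M^k = \frac{\sum_{i=1}^{M} \hat\xi_i^k \, h(z_i)}{\sum_{i=1}^{M} \hat\xi_i^k} .
\]

For any positive function $\phi$, we denote $C_\phi = \{h;\ |h/\phi|_\infty < \infty\}$. Assume
that there exists a positive function $\phi \ge 1$ such that
\[
  \forall h \in C_\phi , \qquad \frac{\sum_{i=1}^{M} h(z_i)/p(z_i)}{\sum_{i=1}^{M} 1/p(z_i)} \xrightarrow{P} \pi(h) .
\]

Theorem (Douc & X., AoS, 2011)

Under the assumption that $\pi(p) > 0$, the following convergence property holds:
   i) If $h$ is in $C_\phi$, then
      \[
        \delta_M^k \xrightarrow[M \to \infty]{P} \pi(h) \qquad \text{(Consistency)}
      \]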



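Assume that there exists a positive function $\psi$ such that
\[
  \forall h \in C_\psi , \qquad \sqrt{M} \left( \frac{\sum_{i=1}^{M} h(z_i)/p(z_i)}{\sum_{i=1}^{M} 1/p(z_i)} - \pi(h) \right) \xrightarrow{L} \mathcal{N}(0, \Gamma(h)) .
\]

Theorem (Douc & X., AoS, 2011)

Under the assumption that $\pi(p) > 0$, the following convergence property
holds:
 ii) If, in addition, $h^2/p \in C_\phi$ and $h \in C_\psi$, then
     \[
       \sqrt{M} \, (\delta_M^k - \pi(h)) \xrightarrow[M \to \infty]{L} \mathcal{N}(0, V_k[h - \pi(h)]) , \qquad \text{(CLT)}
     \]
     where $V_k(h) := \pi(p) \int \pi(dz)\, \mathbb{V}[\hat\xi_i^k \,|\, z]\, h^2(z)\, p(z) + \Gamma(h)$.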
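We will need some additional assumptions. Assume a maximal inequality
for the Markov chain $(z_i)_i$: there exists a measurable function $\zeta$ such that
for any starting point $x$,
\[
  \forall h \in C_\zeta , \qquad P_x\left( \sup_{0 \le i \le N} \left| \sum_{j=0}^{i} [h(z_j) - \tilde\pi(h)] \right| > \epsilon \right) \le \frac{N C_h(x)}{\epsilon^2} .
\]

Moreover, assume that $\exists \varphi \ge 1$ such that for any starting point $x$,
\[
  \forall h \in C_\varphi , \qquad \tilde Q^n(x, h) \xrightarrow{P} \tilde\pi(h) = \pi(ph)/\pi(p) .
\]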






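Theorem (Douc & X., AoS, 2011)

Assume that $h$ is such that $h/p \in C_\zeta$ and $\{C_{h/p}, h^2/p^2\} \subset C_\varphi$. Assume
moreover that
\[
  \sqrt{M} \, (\delta_M^0 - \pi(h)) \xrightarrow{L} \mathcal{N}(0, V_0[h - \pi(h)]) .
\]
Then, for any starting point $x$,
\[
  \sqrt{M_N} \left( \frac{\sum_{t=1}^{N} h(x^{(t)})}{N} - \pi(h) \right) \xrightarrow{N \to +\infty} \mathcal{N}(0, V_0[h - \pi(h)]) ,
\]
where $M_N$ is defined by
\[
  \sum_{i=1}^{M_N} \hat\xi_i^0 \le N < \sum_{i=1}^{M_N + 1} \hat\xi_i^0 .
\]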
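
The definition of $M_N$ is easy to read in code: since $\hat\xi_i^0 = n_i$, it counts the accepted values whose occupation times fit entirely within the first $N$ iterations. A small sketch, with a hypothetical function name and made-up counts used purely as an example:

```python
from bisect import bisect_right
from itertools import accumulate


def m_n(counts, n):
    """Largest M such that counts[0] + ... + counts[M-1] <= n.

    counts are the weights xi_hat_i^0 = n_i (positive integers). If n is at
    least the total of counts, the answer is capped at len(counts).
    """
    partial = list(accumulate(counts))   # increasing, since every n_i >= 1
    return bisect_right(partial, n)


counts = [2, 1, 3, 1]                    # illustrative occupation counts
print([m_n(counts, n) for n in range(1, 8)])
```

With these counts the partial sums are 2, 3, 6, 7, so for instance $N = 5$ gives $M_N = 2$ because $3 \le 5 < 6$.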





Variance gain (1)



                  h(x)         x        x^2      I_{X>0}    p(x)
                  τ = 0.1      0.971    0.953    0.957      0.207
                  τ = 2        0.965    0.942    0.875      0.861
                  τ = 5        0.913    0.982    0.785      0.826
                  τ = 7        0.899    0.982    0.768      0.820

  Ratios of the empirical variances of $\delta^\infty$ and $\delta$ estimating $E[h(X)]$:
  100 MCMC iterations over $10^3$ replications of a random walk Gaussian
  proposal with scale $\tau$.






Illustration (1)




  Figure: Overlay of the variations of 250 iid realisations of the estimates
  $\delta$ (gold) and $\delta^\infty$ (grey) of $E[X] = 0$ for 1000 iterations, along with the
  90% interquantile range for the estimates $\delta$ (brown) and $\delta^\infty$ (pink), in
  the setting of a random walk Gaussian proposal with scale $\tau = 10$.


Extra computational effort



                              median    mean    q_{0.8}    q_{0.9}    time
                τ = 0.25      0.0       8.85    4.9        13         4.2
                τ = 0.50      0.0       6.76    4          11         2.25
                τ = 1.0       0.25      6.15    4          10         2.5
                τ = 2.0       0.20      5.90    3.5        8.5        4.5

  Additional computing effort: median and mean numbers of additional
  iterations, 80% and 90% quantiles for the additional iterations, and ratio
  of the average R computing times, obtained over $10^5$ simulations.






Illustration (2)




  Figure: Overlay of the variations of 500 iid realisations of the estimates
  $\delta$ (deep grey), $\delta^\infty$ (medium grey) and of the importance sampling version
  (light grey) of $E[X] = 10$ when $X \sim \mathrm{Exp}(.1)$ for 100 iterations, along
  with the 90% interquantile ranges (same colour code), in the setting of
  an independent exponential proposal with scale $\mu = 0.02$.


Outline


  1   Metropolis Hastings revisited

  2   Rao–Blackwellisation
        Formal importance sampling
        Variance reduction
        Asymptotic results
        Illustrations

  3   Rao-Blackwellisation (2)
        Independent case
        General MH algorithms





Integrating out white noise
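  In the Casella and X. (1996) paper, possible past and future histories are
  averaged (by integrating out the uniforms) to improve the weights of the
  accepted values.
  Rao–Blackwellised weight on proposed values $y_t$:
  \[
    \phi_t^{(i)} = \delta_t \sum_{j=t}^{p} \xi_{tj}
  \]
  with
  \[
    \delta_0 = 1 , \qquad \delta_t = \sum_{j=0}^{t-1} \delta_j \, \xi_{j(t-1)} \, \rho_{jt}
  \]
  and
  \[
    \xi_{tt} = 1 , \qquad \xi_{tj} = \prod_{u=t+1}^{j} (1 - \rho_{tu}) ,
  \]
  the occurrence survivals of the $y_t$'s, associated with the Metropolis–Hastings ratio
  \[
    \omega_t = \pi(y_t)/\mu(y_t) , \qquad \rho_{tu} = \omega_u/\omega_t \wedge 1 .
  \]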
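
  The recursions translate directly into a quadratic-cost computation. The sketch below makes two assumptions for illustration: index 0 stands for the starting value of the chain (consistent with $\delta_0 = 1$), and the importance ratios $\omega_t$ are supplied as a list of positive numbers. The function name `rb_weights` and the sample ratios are hypothetical.

```python
def rb_weights(omega):
    """Return (delta, phi) from importance ratios omega[0..p].

    omega[0] is the ratio of the starting value (an assumption of this sketch),
    omega[1:] those of the proposed values y_1..y_p. Cost is O(p^2).
    """
    p = len(omega) - 1

    def rho(t, u):
        return min(omega[u] / omega[t], 1.0)

    # xi[t][j] = prod_{u=t+1}^{j} (1 - rho_tu), with xi[t][t] = 1
    xi = [[0.0] * (p + 1) for _ in range(p + 1)]
    for t in range(p + 1):
        xi[t][t] = 1.0
        for j in range(t + 1, p + 1):
            xi[t][j] = xi[t][j - 1] * (1.0 - rho(t, j))

    # delta[t] = sum_{j<t} delta_j xi_{j(t-1)} rho_jt, with delta_0 = 1
    delta = [1.0] + [0.0] * p
    for t in range(1, p + 1):
        delta[t] = sum(delta[j] * xi[j][t - 1] * rho(j, t) for j in range(t))

    phi = [delta[t] * sum(xi[t][t:]) for t in range(p + 1)]
    return delta, phi


delta, phi = rb_weights([1.0, 0.5, 2.0, 1.5])   # made-up ratios
print(delta, phi, sum(phi))
```

  Under this reading $\delta_t$ is the probability that $y_t$ is accepted and $\delta_t \xi_{tj}$ the probability that the chain sits at $y_t$ at time $j$, so the weights $\phi_t$ add up to the number of time points, $p + 1$.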



  Potentially large variance improvement, but at a cost of $O(T^2)$...
  Possible recovery of efficiency thanks to parallelisation:
  moving from $(\epsilon_1, \ldots, \epsilon_p)$ towards

       \[ (\epsilon_{(1)}, \ldots, \epsilon_{(p)}) \]

  by averaging over "all" possible orders.




Case of the independent Metropolis–Hastings algorithm




  Starting at time $t$ with $p$ processors and a pool of $p$ proposed values,

       \[ (y_1, \ldots, y_p) , \]

  use the processors to examine in parallel $p$ different "histories".




Improvement


  The standard estimator $\hat\tau_1$ of $E_\pi[h(X)]$

      $$\hat\tau_1(x_t, y_{1:p}) = \frac{1}{p} \sum_{k=1}^{p} h(x_{t+k})$$

  is necessarily dominated by the average

      $$\hat\tau_2(x_t, y_{1:p}) = \frac{1}{p^2} \sum_{k=0}^{p} n_k h(y_k)$$

  where $y_0 = x_t$ and $n_0$ is the number of times $x_t$ is repeated.
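
  The two estimators can be illustrated on one parallel block with a minimal Python sketch. Several choices here are assumptions made for the example and are not taken from the slides: the N(0, 1) target with an N(0, 2^2) independent proposal, the use of circular shifts of the proposal pool to define the p histories, and the helper names `log_weight`, `accept_prob`, `run_history` and `block_estimates`.

```python
import math
import random


def log_weight(x, scale=2.0):
    """Log importance weight log pi(x) - log q(x), up to a constant.

    Assumed setting: target N(0, 1), independent proposal N(0, scale^2).
    """
    return -0.5 * x * x + 0.5 * (x / scale) ** 2


def accept_prob(x, y):
    """IMH acceptance probability rho(x, y) = min(1, w(y) / w(x))."""
    d = log_weight(y) - log_weight(x)
    return 1.0 if d >= 0 else math.exp(d)


def run_history(pool, order, rng):
    """Run one history: propose pool[i] for i in order, starting from pool[0].

    Returns the indices (into pool) of the states x_{t+1}, ..., x_{t+p}.
    """
    current = 0  # index 0 stands for y_0 = x_t
    path = []
    for i in order:
        if rng.random() < accept_prob(pool[current], pool[i]):
            current = i
        path.append(current)
    return path


def block_estimates(x_t, proposals, h, rng):
    """Return (tau1, tau2, counts) for one block of p proposals."""
    p = len(proposals)
    pool = [x_t] + list(proposals)
    base = list(range(1, p + 1))
    counts = [0] * (p + 1)  # counts[k] = n_k over all p histories
    tau1 = 0.0
    for shift in range(p):
        order = base[shift:] + base[:shift]  # circular shift (assumed choice)
        path = run_history(pool, order, rng)
        for k in path:
            counts[k] += 1
        if shift == 0:
            # Standard estimator: the single history in the original order.
            tau1 = sum(h(pool[k]) for k in path) / p
    tau2 = sum(n * h(y) for n, y in zip(counts, pool)) / p ** 2
    return tau1, tau2, counts


rng = random.Random(1)
proposals = [rng.gauss(0.0, 2.0) for _ in range(8)]
tau1, tau2, counts = block_estimates(0.0, proposals, lambda x: x, rng)
```

  The counts always sum to p^2, since each of the p histories contributes p states, which is why the normalisation of the second estimator is 1/p^2.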




Further Rao-Blackwellisation
  E.g., use of the Metropolis–Hastings weights $w_j$: $j$ being the index such
  that $x_{t+i-1} = y_j$, update of the weights at each time $t + i$:

      $$w_j = w_j + 1 - \rho(x_{t+i-1}, y_i)$$
      $$w_i = w_i + \rho(x_{t+i-1}, y_i)$$

  resulting in a more stable estimator

      $$\hat\tau_3(x_t, y_{1:p}) = \frac{1}{p^2} \sum_{k=0}^{p} w_k h(y_k)$$

  E.g., Casella+X. (1996)

      $$\hat\tau_4(x_t, y_{1:p}) = \frac{1}{p^2} \sum_{k=0}^{p} \varphi_k h(y_k)$$
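
  The weight update can be sketched in Python as follows. This is an illustration under assumptions that are not in the slides: an N(0, 1) target with an N(0, 2^2) independent proposal, circular orders for the p histories, and the helper names `accept_prob`, `rao_blackwell_weights` and `tau3`. When a history visits the proposals in a permuted order, the proposal examined at a given step is the pool entry given by that order, which plays the role of y_i in the update.

```python
import math
import random


def accept_prob(x, y, scale=2.0):
    """rho(x, y) for the assumed setting: N(0, 1) target, N(0, scale^2) proposal."""
    def log_w(z):
        return -0.5 * z * z + 0.5 * (z / scale) ** 2
    d = log_w(y) - log_w(x)
    return 1.0 if d >= 0 else math.exp(d)


def rao_blackwell_weights(pool, orders, rng):
    """Accumulate the weights w_k over all histories; pool[0] is y_0 = x_t."""
    weights = [0.0] * len(pool)
    for order in orders:
        j = 0  # index of the current state x_{t+i-1}
        for i in order:
            rho = accept_prob(pool[j], pool[i])
            weights[j] += 1.0 - rho  # mass kept by the current value
            weights[i] += rho        # mass given to the proposed value
            if rng.random() < rho:   # the chain itself still moves as usual
                j = i
    return weights


def tau3(pool, weights, h):
    """Weighted average (1 / p^2) * sum_k w_k h(y_k)."""
    p = len(pool) - 1
    return sum(w * h(y) for w, y in zip(weights, pool)) / p ** 2


rng = random.Random(0)
p = 6
pool = [0.0] + [rng.gauss(0.0, 2.0) for _ in range(p)]
base = list(range(1, p + 1))
orders = [base[s:] + base[:s] for s in range(p)]
weights = rao_blackwell_weights(pool, orders, rng)
estimate = tau3(pool, weights, lambda x: x)
```

  Each step adds a total mass of one, split between the current and the proposed value, so the weights sum to p^2 just as the counts n_k do.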


Markovian continuity




  The Markov validity of the chain is not jeopardised! The chain continues
  by picking one sequence at random and taking the corresponding
  $x^{(j)}_{t+p}$ as starting point of the next parallel block.
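
  The restart rule can be sketched as a small driver loop in Python. The interface is hypothetical: `run_block` stands for any routine that runs the p parallel histories from a given start and returns a block estimate together with the final state of each history. `toy_block` is only a stand-in so that the example runs, and averaging the block estimates with equal weights is an assumption.

```python
import random


def run_blocks(x0, n_blocks, run_block, rng):
    """Chain successive parallel blocks.

    run_block(x, rng) is a hypothetical callable returning
    (block_estimate, final_states), where final_states[j] is the last
    state of history j in the block.
    """
    x = x0
    estimates = []
    for _ in range(n_blocks):
        estimate, final_states = run_block(x, rng)
        estimates.append(estimate)
        j = rng.randrange(len(final_states))  # uniform choice of one history
        x = final_states[j]                   # start of the next block
    return sum(estimates) / n_blocks, x


def toy_block(x, rng, p=4):
    """Stand-in block for illustration only: p one-step random-walk moves."""
    finals = [x + rng.gauss(0.0, 1.0) for _ in range(p)]
    return sum(finals) / p, finals


overall, last_state = run_blocks(0.0, 10, toy_block, random.Random(3))
```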




Impact of Rao-Blackwellisations



  Comparison of
      $\hat\tau_1$, basic IMH estimator of $E_\pi[h(X)]$,
      $\hat\tau_2$, improving by averaging over permutations of proposed values and
      using p times more uniforms,
      $\hat\tau_3$, improving upon $\hat\tau_2$ by basic Rao-Blackwell argument,
      $\hat\tau_4$, improving upon $\hat\tau_2$ by integrating out ancillary uniforms, at a cost
      of $O(p^2)$.




Illustration


  Variations of estimates based on RB and standard versions of parallel
  chains and on a standard MCMC chain for the mean and variance of the
  target $N(0, 1)$ distribution (based on 10,000 independent replicas).




Impact of the order


  Parallelisation allows for the partial integration of the uniforms
  What about the permutation order?
  Comparison of
       $\hat\tau_{2N}$ with no permutation,
       $\hat\tau_{2C}$ with circular permutations,
       $\hat\tau_{2R}$ with random permutations,
       $\hat\tau_{2H}$ with half-random permutations,
       $\hat\tau_{2S}$ with stratified permutations.
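
  Three of these orderings are simple enough to sketch in Python. The function names are hypothetical, and proposals are indexed 1 to p so that index 0 can be kept for the current state. The half-random and stratified schemes are not specified on the slide, so they are left out rather than guessed.

```python
import random


def identity_orders(p):
    """No permutation: every processor sees the proposals in the same order."""
    return [list(range(1, p + 1)) for _ in range(p)]


def circular_orders(p):
    """Circular permutations: processor s starts from proposal s + 1."""
    base = list(range(1, p + 1))
    return [base[s:] + base[:s] for s in range(p)]


def random_orders(p, rng):
    """Random permutations: an independent uniform shuffle per processor."""
    orders = []
    for _ in range(p):
        order = list(range(1, p + 1))
        rng.shuffle(order)
        orders.append(order)
    return orders


orders_n = identity_orders(3)
orders_c = circular_orders(3)
orders_r = random_orders(3, random.Random(5))
```

  With no permutation, the histories differ only through the uniforms drawn for the accept/reject steps. With circular permutations, each proposal appears exactly once in every position across the p processors.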




Importance target


  Comparison with the ultimate importance sampling




Extension to the general case



  Same principle can be applied to any Markov update: if

      $$x_{t+1} = \Psi(x_t, \epsilon_t)$$

  then generate

      $$(\epsilon_1, \ldots, \epsilon_p)$$

  in advance and distribute to the p processors in different permutation
  orders
  Plus use of Douc & X’s (2011) Rao–Blackwellisation $\hat\xi_i^k$
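
  A minimal Python sketch of this idea follows, with assumptions stated up front: the update `psi` is an AR(1) step chosen only because it is a simple Markov update leaving N(0, 1) invariant, the orders are circular shifts, and the names `psi` and `parallel_paths` are hypothetical. In a Metropolis–Hastings setting the noise term would bundle the proposal randomness and the uniform used in the accept/reject step.

```python
import math
import random


def psi(x, eps, rho=0.5):
    """Assumed example update: AR(1) step leaving N(0, 1) invariant."""
    return rho * x + math.sqrt(1.0 - rho ** 2) * eps


def parallel_paths(x_t, noises, orders, update):
    """Apply the same pre-generated noises in a different order per processor."""
    paths = []
    for order in orders:
        x = x_t
        path = []
        for i in order:
            x = update(x, noises[i])
            path.append(x)
        paths.append(path)
    return paths


rng = random.Random(4)
p = 5
noises = [rng.gauss(0.0, 1.0) for _ in range(p)]        # generated in advance
orders = [[(s + i) % p for i in range(p)] for s in range(p)]  # circular shifts
paths = parallel_paths(0.0, noises, orders, psi)
```

  The loop over `orders` is written sequentially for clarity. Each iteration is independent of the others, which is what makes the scheme suitable for p processors.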




Implementation



  Similar run of p parallel chains $(x^{(j)}_{t+i})$, use of averages

      $$\hat\tau_2(x^{(1:p)}_{1:p}) = \frac{1}{p^2} \sum_{k=1}^{p} \sum_{j=1}^{p} n_k h(x^{(j)}_{t+k})$$

  and selection of new starting value at random at time $t + p$.
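
  One possible reading of this average is sketched below in Python. It rests on an assumption: the factor n_k is taken to account for repeated values, so that the double sum amounts to a plain average of h over the p^2 states visited by the p chains. The function name `block_average` and the toy paths are hypothetical.

```python
def block_average(paths, h):
    """Average h over every state of every parallel chain in the block."""
    n_states = sum(len(path) for path in paths)
    return sum(h(x) for path in paths for x in path) / n_states


# Three toy chains of three steps each, with repeated values as after rejections.
paths = [[0.1, 0.4, 0.4], [0.1, 0.1, 0.7], [0.4, 0.7, 0.7]]
tau2 = block_average(paths, lambda x: x)
```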




Illustration
  Variations of estimates based on RB and standard versions of parallel
  chains and on a standard MCMC chain for the mean and variance of the
  target distribution (based on p = 64 parallel processors, 50 blocks of p
  MCMC steps and 500 independent replicas).




  [Figure: two panels, each comparing the RB, par and org estimates; vertical
  axes run from −0.10 to 0.10 (left panel) and from 0.9 to 1.3 (right panel).]



                                                                                             36 / 36
Metropolis Hastings revisited
                                                               Independent case
                                Rao–Blackwellisation
                                                               General MH algorithms
                             Rao-Blackwellisation (2)


Illustration
  Variations of estimates based on RB and standard versions of parallel
  chains and on a standard MCMC chain for the mean and variance of the
  target distribution (based on p = 64 parallel processors, 50 blocs of p
  MCMC steps and 500 independent replicas).




                                                         1.3
                 0.10




                                                         1.2
                 0.05




                                                         1.1
                 0.00




                                                         1.0
                 −0.05




                                                         0.9
                 −0.10




                               RB       par      org                RB      par        org



                                                                                             36 / 36

Contenu connexe

Tendances

Particle filtering
Particle filteringParticle filtering
Particle filteringWei Wang
 
Myers_SIAMCSE15
Myers_SIAMCSE15Myers_SIAMCSE15
Myers_SIAMCSE15Karen Pao
 
Parellelism in spectral methods
Parellelism in spectral methodsParellelism in spectral methods
Parellelism in spectral methodsRamona Corman
 
Large-scale structure non-Gaussianities with modal methods (Ascona)
Large-scale structure non-Gaussianities with modal methods (Ascona)Large-scale structure non-Gaussianities with modal methods (Ascona)
Large-scale structure non-Gaussianities with modal methods (Ascona)Marcel Schmittfull
 
Zvonimir Vlah "Lagrangian perturbation theory for large scale structure forma...
Zvonimir Vlah "Lagrangian perturbation theory for large scale structure forma...Zvonimir Vlah "Lagrangian perturbation theory for large scale structure forma...
Zvonimir Vlah "Lagrangian perturbation theory for large scale structure forma...SEENET-MTP
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distancesChristian Robert
 
MLP輪読スパース8章 トレースノルム正則化
MLP輪読スパース8章 トレースノルム正則化MLP輪読スパース8章 トレースノルム正則化
MLP輪読スパース8章 トレースノルム正則化Akira Tanimoto
 
Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University Marco Frasca
 
Resolving the black-hole information paradox
Resolving the black-hole information paradoxResolving the black-hole information paradox
Resolving the black-hole information paradoxFausto Intilla
 
Resource theory of asymmetric distinguishability
Resource theory of asymmetric distinguishabilityResource theory of asymmetric distinguishability
Resource theory of asymmetric distinguishabilityMark Wilde
 
Discrete form of the riccati equation
Discrete form of the riccati equationDiscrete form of the riccati equation
Discrete form of the riccati equationAlberth Carantón
 
System overflow blocking-transients-for-queues-with-batch-arrivals-using-a-fa...
System overflow blocking-transients-for-queues-with-batch-arrivals-using-a-fa...System overflow blocking-transients-for-queues-with-batch-arrivals-using-a-fa...
System overflow blocking-transients-for-queues-with-batch-arrivals-using-a-fa...Cemal Ardil
 
Modeling biased tracers at the field level
Modeling biased tracers at the field levelModeling biased tracers at the field level
Modeling biased tracers at the field levelMarcel Schmittfull
 
Computational methods and vibrational properties applied to materials modeling
Computational methods and vibrational properties applied to materials modelingComputational methods and vibrational properties applied to materials modeling
Computational methods and vibrational properties applied to materials modelingcippo1987Ita
 
About the 2-Banach Spaces
About the 2-Banach Spaces About the 2-Banach Spaces
About the 2-Banach Spaces IJMER
 
A lattice-based consensus clustering
A lattice-based consensus clusteringA lattice-based consensus clustering
A lattice-based consensus clusteringDmitrii Ignatov
 

Tendances (20)

Particle filtering
Particle filteringParticle filtering
Particle filtering
 
Myers_SIAMCSE15
Myers_SIAMCSE15Myers_SIAMCSE15
Myers_SIAMCSE15
 
Parellelism in spectral methods
Parellelism in spectral methodsParellelism in spectral methods
Parellelism in spectral methods
 
Large-scale structure non-Gaussianities with modal methods (Ascona)
Large-scale structure non-Gaussianities with modal methods (Ascona)Large-scale structure non-Gaussianities with modal methods (Ascona)
Large-scale structure non-Gaussianities with modal methods (Ascona)
 
An Improved Quantum-behaved Particle Swarm Optimization Algorithm Based on Ch...
An Improved Quantum-behaved Particle Swarm Optimization Algorithm Based on Ch...An Improved Quantum-behaved Particle Swarm Optimization Algorithm Based on Ch...
An Improved Quantum-behaved Particle Swarm Optimization Algorithm Based on Ch...
 
Zvonimir Vlah "Lagrangian perturbation theory for large scale structure forma...
Zvonimir Vlah "Lagrangian perturbation theory for large scale structure forma...Zvonimir Vlah "Lagrangian perturbation theory for large scale structure forma...
Zvonimir Vlah "Lagrangian perturbation theory for large scale structure forma...
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distances
 
MLP輪読スパース8章 トレースノルム正則化
MLP輪読スパース8章 トレースノルム正則化MLP輪読スパース8章 トレースノルム正則化
MLP輪読スパース8章 トレースノルム正則化
 
Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University
 
Resolving the black-hole information paradox
Resolving the black-hole information paradoxResolving the black-hole information paradox
Resolving the black-hole information paradox
 
1416336962.pdf
1416336962.pdf1416336962.pdf
1416336962.pdf
 
Resource theory of asymmetric distinguishability
Resource theory of asymmetric distinguishabilityResource theory of asymmetric distinguishability
Resource theory of asymmetric distinguishability
 
Discrete form of the riccati equation
Discrete form of the riccati equationDiscrete form of the riccati equation
Discrete form of the riccati equation
 
System overflow blocking-transients-for-queues-with-batch-arrivals-using-a-fa...
System overflow blocking-transients-for-queues-with-batch-arrivals-using-a-fa...System overflow blocking-transients-for-queues-with-batch-arrivals-using-a-fa...
System overflow blocking-transients-for-queues-with-batch-arrivals-using-a-fa...
 
Modeling biased tracers at the field level
Modeling biased tracers at the field levelModeling biased tracers at the field level
Modeling biased tracers at the field level
 
Introduction to DFT Part 1
Introduction to DFT Part 1 Introduction to DFT Part 1
Introduction to DFT Part 1
 
Computational methods and vibrational properties applied to materials modeling
Computational methods and vibrational properties applied to materials modelingComputational methods and vibrational properties applied to materials modeling
Computational methods and vibrational properties applied to materials modeling
 
About the 2-Banach Spaces
About the 2-Banach Spaces About the 2-Banach Spaces
About the 2-Banach Spaces
 
Introduction to DFT Part 2
Introduction to DFT Part 2Introduction to DFT Part 2
Introduction to DFT Part 2
 
A lattice-based consensus clustering
A lattice-based consensus clusteringA lattice-based consensus clustering
A lattice-based consensus clustering
 

Similaire à Talk in Telecom-Paris, Nov. 15, 2011

Vanilla rao blackwellisation
Vanilla rao blackwellisationVanilla rao blackwellisation
Vanilla rao blackwellisationDeb Roy
 
short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018Christian Robert
 
Monte Carlo Statistical Methods
Monte Carlo Statistical MethodsMonte Carlo Statistical Methods
Monte Carlo Statistical MethodsChristian Robert
 
InternshipReport
InternshipReportInternshipReport
InternshipReportHamza Ameur
 
MCMC and likelihood-free methods
MCMC and likelihood-free methodsMCMC and likelihood-free methods
MCMC and likelihood-free methodsChristian Robert
 
Introduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methodsIntroduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methodsChristian Robert
 
Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Fabian Pedregosa
 
Sns mid term-test2-solution
Sns mid term-test2-solutionSns mid term-test2-solution
Sns mid term-test2-solutioncheekeong1231
 
Delayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithmsDelayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithmsChristian Robert
 
Introduction to MCMC methods
Introduction to MCMC methodsIntroduction to MCMC methods
Introduction to MCMC methodsChristian Robert
 
Slides used during my thesis defense "Du typage vectoriel"
Slides used during my thesis defense "Du typage vectoriel"Slides used during my thesis defense "Du typage vectoriel"
Slides used during my thesis defense "Du typage vectoriel"Alejandro Díaz-Caro
 

Similaire à Talk in Telecom-Paris, Nov. 15, 2011 (20)

Vanilla rao blackwellisation
Vanilla rao blackwellisationVanilla rao blackwellisation
Vanilla rao blackwellisation
 
short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018
 
Rdnd2008
Rdnd2008Rdnd2008
Rdnd2008
 
Adc
AdcAdc
Adc
 
Monte Carlo Statistical Methods
Monte Carlo Statistical MethodsMonte Carlo Statistical Methods
Monte Carlo Statistical Methods
 
InternshipReport
InternshipReportInternshipReport
InternshipReport
 
Shanghai tutorial
Shanghai tutorialShanghai tutorial
Shanghai tutorial
 
Demo
DemoDemo
Demo
 
Demo
DemoDemo
Demo
 
Demo
DemoDemo
Demo
 
MCMC and likelihood-free methods
MCMC and likelihood-free methodsMCMC and likelihood-free methods
MCMC and likelihood-free methods
 
Pres metabief2020jmm
Pres metabief2020jmmPres metabief2020jmm
Pres metabief2020jmm
 
Introduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methodsIntroduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methods
 
Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3
 
mcmc
mcmcmcmc
mcmc
 
Sns mid term-test2-solution
Sns mid term-test2-solutionSns mid term-test2-solution
Sns mid term-test2-solution
 
Delayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithmsDelayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithms
 
Hastings 1970
Hastings 1970Hastings 1970
Hastings 1970
 
Introduction to MCMC methods
Introduction to MCMC methodsIntroduction to MCMC methods
Introduction to MCMC methods
 
Slides used during my thesis defense "Du typage vectoriel"
Slides used during my thesis defense "Du typage vectoriel"Slides used during my thesis defense "Du typage vectoriel"
Slides used during my thesis defense "Du typage vectoriel"
 

Plus de Christian Robert

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceChristian Robert
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinChristian Robert
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?Christian Robert
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Christian Robert
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Christian Robert
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking componentsChristian Robert
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihoodChristian Robert
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)Christian Robert
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerChristian Robert
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussionChristian Robert
 

Talk in Telecom-Paris, Nov. 15, 2011

  • 1. Vanilla Rao–Blackwellisation of Metropolis–Hastings algorithms. Christian P. Robert, Université Paris-Dauphine, IuF, and CREST. Joint works with Randal Douc, Pierre Jacob and Murray Smith. xian@ceremade.dauphine.fr. November 16, 2011.
  • 2. Main themes
      1. Rao–Blackwellisation on MCMC.
      2. Can be performed in any Metropolis–Hastings algorithm.
      3. Asymptotically more efficient than usual MCMC, with a controlled additional computing cost.
      4. Takes advantage of parallel capacities at a very basic level (GPUs).
  • 6. Outline
      1. Metropolis Hastings revisited
      2. Rao–Blackwellisation: formal importance sampling; variance reduction; asymptotic results; illustrations
      3. Rao-Blackwellisation (2): independent case; general MH algorithms
  • 10. Metropolis Hastings algorithm
      1. We wish to approximate $I = \dfrac{\int h(x)\pi(x)\,dx}{\int \pi(x)\,dx} = \int h(x)\bar\pi(x)\,dx$.
      2. $\pi(x)$ is known but not $\int \pi(x)\,dx$.
      3. Approximate $I$ with $\delta = \frac{1}{n}\sum_{t=1}^{n} h(x^{(t)})$, where $(x^{(t)})$ is a Markov chain with limiting distribution $\bar\pi$.
      4. Convergence is obtained from the Law of Large Numbers or the CLT for Markov chains.
  • 14. Metropolis Hastings algorithm. Suppose that $x^{(t)}$ is drawn.
      1. Simulate $y_t \sim q(\cdot|x^{(t)})$.
      2. Set $x^{(t+1)} = y_t$ with probability $\alpha(x^{(t)}, y_t) = \min\left\{1, \dfrac{\pi(y_t)}{\pi(x^{(t)})}\,\dfrac{q(x^{(t)}|y_t)}{q(y_t|x^{(t)})}\right\}$. Otherwise, set $x^{(t+1)} = x^{(t)}$.
      3. $\alpha$ is such that the detailed balance equation is satisfied: $\pi(x)q(y|x)\alpha(x,y) = \pi(y)q(x|y)\alpha(y,x)$. Hence $\bar\pi$ is the stationary distribution of $(x^{(t)})$.
      The accepted candidates are simulated with the rejection algorithm.
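The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the standard normal target, the Gaussian random-walk proposal, the seed and the names `metropolis_hastings`, `log_pi`, `propose` and `log_q` are all assumptions chosen for the example.

```python
import math
import random

def metropolis_hastings(log_pi, propose, log_q, x0, n, rng):
    """Run n Metropolis-Hastings steps and return the chain x^(1), ..., x^(n).

    log_pi(x)       : log of the unnormalised target pi(x)
    propose(x, rng) : draws y ~ q(.|x)
    log_q(y, x)     : log q(y|x), up to a constant common to both directions
    """
    chain, x = [], x0
    for _ in range(n):
        y = propose(x, rng)
        # log of pi(y) q(x|y) / (pi(x) q(y|x))
        log_ratio = log_pi(y) + log_q(x, y) - log_pi(x) - log_q(y, x)
        alpha = math.exp(min(0.0, log_ratio))
        if rng.random() < alpha:
            x = y            # accept the candidate
        chain.append(x)      # on rejection the current value is repeated
    return chain

# Illustrative choices (assumptions): N(0,1) target known up to a constant,
# Gaussian random-walk proposal with unit scale.
def log_pi(x):
    return -0.5 * x * x

def propose(x, rng):
    return x + rng.gauss(0.0, 1.0)

def log_q(y, x):
    return -0.5 * (y - x) ** 2

rng = random.Random(0)
chain = metropolis_hastings(log_pi, propose, log_q, 0.0, 20000, rng)
delta = sum(chain) / len(chain)   # estimate of I for h(x) = x
```

Only the unnormalised density enters the acceptance ratio, which is why $\int \pi(x)\,dx$ never needs to be computed.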
  • 18. Some properties of the MH algorithm
      1. An alternative representation of the estimator $\delta$ is $\delta = \frac{1}{n}\sum_{t=1}^{n} h(x^{(t)}) = \frac{1}{N}\sum_{i=1}^{M_N} n_i\, h(z_i)$, where the $z_i$'s are the accepted $y_j$'s, $M_N$ is the number of accepted $y_j$'s till time $N$, and $n_i$ is the number of times $z_i$ appears in the sequence $(x^{(t)})_t$.
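The identity between the two forms of $\delta$ can be checked on a small hand-written chain. The sketch below assumes a continuous state space, so that a run of equal consecutive values corresponds to one accepted value $z_i$ held for $n_i$ iterations; the helper name `accepted_with_counts` and the toy chain are illustrative.

```python
from itertools import groupby

def accepted_with_counts(chain):
    """Collapse a chain into (z_i, n_i) pairs: each distinct run and its length."""
    return [(z, sum(1 for _ in run)) for z, run in groupby(chain)]

# Toy chain (assumption): repeated values are rejections of the proposal.
chain = [0.3, 0.3, 1.2, 1.2, 1.2, -0.5, 0.7, 0.7]
pairs = accepted_with_counts(chain)

def h(x):
    return x * x

delta_chain = sum(h(x) for x in chain) / len(chain)
delta_pairs = sum(n * h(z) for z, n in pairs) / len(chain)
```

Both averages use the same normalisation, the total number of iterations, and only differ in how the terms are grouped.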
  • 19. The accepted values are simulated from $\tilde q(\cdot|z_i) = \dfrac{\alpha(z_i,\cdot)\, q(\cdot|z_i)}{p(z_i)} \le \dfrac{q(\cdot|z_i)}{p(z_i)}$, where $p(z_i) = \int \alpha(z_i,y)\, q(y|z_i)\,dy$. To simulate from $\tilde q(\cdot|z_i)$:
      1. Propose a candidate $y \sim q(\cdot|z_i)$.
      2. Accept with probability $\tilde q(y|z_i) \Big/ \dfrac{q(y|z_i)}{p(z_i)} = \alpha(z_i,y)$. Otherwise, reject it and start again.
      This is the transition of the MH algorithm. The transition kernel $\tilde q$ admits $\tilde\pi$ as a stationary distribution:
      $\tilde\pi(x)\,\tilde q(y|x) = \dfrac{\pi(x)p(x)}{\int \pi(u)p(u)\,du}\,\dfrac{\alpha(x,y)q(y|x)}{p(x)} = \dfrac{\pi(x)\alpha(x,y)q(y|x)}{\int \pi(u)p(u)\,du} = \dfrac{\pi(y)\alpha(y,x)q(x|y)}{\int \pi(u)p(u)\,du} = \tilde\pi(y)\,\tilde q(x|y)$.
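The detailed balance argument can be verified numerically on a finite state space, where the integrals become sums. The target weights and proposal matrix below are arbitrary illustrative numbers, not taken from the talk.

```python
import numpy as np

# Illustrative 4-state example (assumption): unnormalised target and proposal.
pi = np.array([1.0, 3.0, 2.0, 4.0])
q = np.array([[0.10, 0.40, 0.30, 0.20],     # q[x, y] = q(y|x), rows sum to 1
              [0.30, 0.10, 0.20, 0.40],
              [0.25, 0.25, 0.25, 0.25],
              [0.50, 0.20, 0.20, 0.10]])

# alpha[x, y] = min{1, pi(y) q(x|y) / (pi(x) q(y|x))}
ratio = (pi[None, :] * q.T) / (pi[:, None] * q)
alpha = np.minimum(1.0, ratio)

p = (alpha * q).sum(axis=1)            # p(x) = sum_y alpha(x, y) q(y|x)
q_tilde = alpha * q / p[:, None]       # q~(y|x) = alpha(x, y) q(y|x) / p(x)
pi_tilde = pi * p / (pi * p).sum()     # pi~(x) proportional to pi(x) p(x)

flow = pi_tilde[:, None] * q_tilde     # flow[x, y] = pi~(x) q~(y|x)
```

The matrix `flow` should be symmetric, which is the detailed balance relation, and `pi_tilde @ q_tilde` should return `pi_tilde`, which is stationarity.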
  • 26. Lemma (Douc & X., AoS, 2011). The sequence $(z_i, n_i)$ satisfies
      1. $(z_i, n_i)_i$ is a Markov chain;
      2. $z_{i+1}$ and $n_i$ are independent given $z_i$;
      3. $n_i$ is distributed as a geometric random variable with probability parameter $p(z_i) := \int \alpha(z_i,y)\, q(y|z_i)\,dy$; (1)
      4. $(z_i)_i$ is a Markov chain with transition kernel $\tilde Q(z,dy) = \tilde q(y|z)\,dy$ and stationary distribution $\tilde\pi$ such that $\tilde q(\cdot|z) \propto \alpha(z,\cdot)\, q(\cdot|z)$ and $\tilde\pi(\cdot) \propto \pi(\cdot)\,p(\cdot)$.
  • 30. Old bottle, new wine [or vice-versa]. [Diagram: nodes $z_{i-1}$, $z_i$, $z_{i+1}$ and $n_{i-1}$, $n_i$, with links labelled "indep".] $\delta = \frac{1}{n}\sum_{t=1}^{n} h(x^{(t)}) = \frac{1}{N}\sum_{i=1}^{M_N} n_i\, h(z_i)$.
  • 36. Importance sampling perspective
      1. A natural idea: $\delta^* = \dfrac{\sum_{i=1}^{M_N} h(z_i)/p(z_i)}{\sum_{i=1}^{M_N} 1/p(z_i)} = \dfrac{\sum_{i=1}^{M_N} \frac{\pi(z_i)}{\tilde\pi(z_i)}\, h(z_i)}{\sum_{i=1}^{M_N} \frac{\pi(z_i)}{\tilde\pi(z_i)}}$.
      2. But $p$ is not available in closed form.
      3. The geometric $n_i$ is the obvious replacement, used in the original Metropolis–Hastings estimate, since $\mathbb{E}[n_i] = 1/p(z_i)$.
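Why the weights $1/p(z_i)$ are the right ones can be spelled out with a short computation. The following is a sketch that uses only $\tilde\pi(\cdot) \propto \pi(\cdot)p(\cdot)$ from the lemma and assumes the $z_i$'s are approximately distributed from $\tilde\pi$.

```latex
\begin{align*}
\mathbb{E}_{\tilde\pi}\!\left[\frac{h(z)}{p(z)}\right]
  &= \frac{\int \frac{h(z)}{p(z)}\,\pi(z)\,p(z)\,dz}{\int \pi(u)\,p(u)\,du}
   = \frac{\int h(z)\,\pi(z)\,dz}{\int \pi(u)\,p(u)\,du}, \\
\mathbb{E}_{\tilde\pi}\!\left[\frac{1}{p(z)}\right]
  &= \frac{\int \pi(z)\,dz}{\int \pi(u)\,p(u)\,du}, \\
\frac{\mathbb{E}_{\tilde\pi}[h(z)/p(z)]}{\mathbb{E}_{\tilde\pi}[1/p(z)]}
  &= \frac{\int h(z)\,\pi(z)\,dz}{\int \pi(z)\,dz} = I .
\end{align*}
```

The unknown constant $\int \pi(u)p(u)\,du$ cancels in the ratio, so the self-normalised form only requires $1/p(z_i)$ or an unbiased estimate of it.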
  • 40. The Bernoulli factory. The crude estimate of $1/p(z_i)$, $n_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\}$, can be improved:
      Lemma (Douc & X., AoS, 2011). If $(y_j)_j$ is an iid sequence with distribution $q(y|z_i)$, the quantity $\hat\xi_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \{1 - \alpha(z_i, y_\ell)\}$ is an unbiased estimator of $1/p(z_i)$ whose variance, conditional on $z_i$, is lower than the conditional variance of $n_i$, $\{1 - p(z_i)\}/p^2(z_i)$.
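A small simulation can illustrate the lemma by comparing $n_i$ and $\hat\xi_i$ at a fixed $z_i$. The sketch below assumes a N(0,1) target, a unit-scale Gaussian random-walk proposal and $z_i = 1.5$, and draws the two quantities from independent proposal streams for simplicity; the name `n_and_xi` is illustrative.

```python
import math
import random
import statistics

def alpha(z, y):
    """MH acceptance probability for a N(0,1) target and a symmetric proposal."""
    return min(1.0, math.exp(0.5 * (z * z - y * y)))

def n_and_xi(z, rng, scale=1.0):
    """Return one draw of (n_i, xi_i) at state z, with proposals y ~ N(z, scale^2)."""
    # n_i = 1 + number of rejections before the first acceptance.
    n = 1
    while rng.random() >= alpha(z, rng.gauss(z, scale)):
        n += 1
    # xi_i = 1 + sum_j prod_{l <= j} {1 - alpha(z, y_l)}; the sum stops as
    # soon as one factor is zero, i.e. as soon as some alpha(z, y_l) = 1.
    xi, prod = 1.0, 1.0
    while prod > 0.0:
        prod *= 1.0 - alpha(z, rng.gauss(z, scale))
        xi += prod
    return n, xi

rng = random.Random(42)
draws = [n_and_xi(1.5, rng) for _ in range(20000)]
ns = [n for n, _ in draws]
xis = [xi for _, xi in draws]

mean_n, mean_xi = statistics.fmean(ns), statistics.fmean(xis)
var_n, var_xi = statistics.pvariance(ns), statistics.pvariance(xis)
```

Both sample means estimate the same quantity $1/p(z_i)$, while the sample variance of the $\hat\xi_i$ draws should come out smaller than that of the $n_i$ draws.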
  • 41. Rao-Blackwellised, for sure? $\hat\xi_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \{1 - \alpha(z_i, y_\ell)\}$
      1. Infinite sum, but finite with at least positive probability, since $\alpha(x^{(t)}, y_t) = \min\left\{1, \dfrac{\pi(y_t)}{\pi(x^{(t)})}\,\dfrac{q(x^{(t)}|y_t)}{q(y_t|x^{(t)})}\right\}$ can be equal to 1. For example: take a symmetric random walk as a proposal.
      2. What if we wish to be sure that the sum is finite? Finite horizon improvement: $\hat\xi_i^k = 1 + \sum_{j=1}^{\infty} \prod_{1 \le \ell \le k \wedge j} \{1 - \alpha(z_i, y_\ell)\} \prod_{k+1 \le \ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\}$.
  • 43. Variance improvement
    Proposition (Douc & X., AoS, 2011). If $(y_j)_j$ is an iid sequence with distribution $q(y|z_i)$ and $(u_j)_j$ is an iid uniform sequence, then for any $k \ge 0$ the quantity
    $$\hat\xi_i^k = 1 + \sum_{j=1}^{\infty} \prod_{1 \le \ell \le k \wedge j} \{1 - \alpha(z_i, y_\ell)\} \prod_{k+1 \le \ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\}$$
    is an unbiased estimator of $1/p(z_i)$ with an almost surely finite number of terms. Moreover, for $k \ge 1$,
    $$V[\hat\xi_i^k \mid z_i] = \frac{1 - p(z_i)}{p^2(z_i)} - \frac{1 - (1 - 2p(z_i) + r(z_i))^k}{2p(z_i) - r(z_i)} \left(\frac{2 - p(z_i)}{p^2(z_i)}\right) (p(z_i) - r(z_i)),$$
    where
    $$p(z_i) := \int \alpha(z_i, y)\, q(y|z_i)\, dy \quad\text{and}\quad r(z_i) := \int \alpha^2(z_i, y)\, q(y|z_i)\, dy.$$
    Therefore, we have
    $$V[\hat\xi_i \mid z_i] \le V[\hat\xi_i^k \mid z_i] \le V[\hat\xi_i^0 \mid z_i] = V[n_i \mid z_i].$$
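    The variance formula of the proposition can be evaluated directly. The sketch below is only a transcription of that formula, with the hypothetical name `var_xi_hat_k`; `p` and `r` stand for $p(z_i)$ and $r(z_i)$ and are assumed to be known or estimated separately.

```python
def var_xi_hat_k(p, r, k):
    """Conditional variance of xi_hat_i^k given z_i, as in the proposition.

    p = E[alpha] and r = E[alpha^2] under q(.|z_i), with 0 < p <= 1.
    """
    c = 1.0 - 2.0 * p + r            # equals E[(1 - alpha)^2], so 0 <= c < 1
    gain = (1.0 - c**k) / (2.0 * p - r) * (2.0 - p) / p**2 * (p - r)
    return (1.0 - p) / p**2 - gain
```

    At `k = 0` the correction term vanishes and the function returns $(1 - p)/p^2$, the variance of a geometric number of trials; the value decreases as `k` grows.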
  • 46. [Diagram: the chain $z_{i-1}$, $z_i$, $z_{i+1}$ with the weights $\hat\xi_{i-1}^k$, $\hat\xi_i^k$ beneath it; the links between successive $z$'s and between the $z$'s and the weights are each labelled "not indep".]
    $$\hat\xi_i^k = 1 + \sum_{j=1}^{\infty} \prod_{1 \le \ell \le k \wedge j} \{1 - \alpha(z_i, y_\ell)\} \prod_{k+1 \le \ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\}$$
    $$\delta_M^k = \frac{\sum_{i=1}^{M} \hat\xi_i^k\, h(z_i)}{\sum_{i=1}^{M} \hat\xi_i^k}.$$
  • 51. Let
    $$\delta_M^k = \frac{\sum_{i=1}^{M} \hat\xi_i^k\, h(z_i)}{\sum_{i=1}^{M} \hat\xi_i^k}.$$
    For any positive function $\varphi$, we denote $C_\varphi = \{h;\ |h/\varphi|_\infty < \infty\}$.
    Assume that there exists a positive function $\varphi \ge 1$ such that
    $$\forall h \in C_\varphi, \quad \frac{\sum_{i=1}^{M} h(z_i)/p(z_i)}{\sum_{i=1}^{M} 1/p(z_i)} \xrightarrow{P} \pi(h),$$
    and that there exists a positive function $\psi$ such that
    $$\forall h \in C_\psi, \quad \sqrt{M}\left(\frac{\sum_{i=1}^{M} h(z_i)/p(z_i)}{\sum_{i=1}^{M} 1/p(z_i)} - \pi(h)\right) \xrightarrow{L} \mathcal{N}(0, \Gamma(h)).$$
    Theorem (Douc & X., AoS, 2011). Under the assumption that $\pi(p) > 0$, the following convergence properties hold:
    i) If $h$ is in $C_\varphi$, then
    $$\delta_M^k \xrightarrow{P}_{M \to \infty} \pi(h) \qquad \text{(Consistency)}$$
    ii) If, in addition, $h^2/p \in C_\varphi$ and $h \in C_\psi$, then
    $$\sqrt{M}\,(\delta_M^k - \pi(h)) \xrightarrow{L}_{M \to \infty} \mathcal{N}(0, V_k[h - \pi(h)]), \qquad \text{(CLT)}$$
    where
    $$V_k(h) := \pi(p) \int \pi(dz)\, V[\hat\xi_i^k \mid z]\, h^2(z)\, p(z) + \Gamma(h).$$
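    A hedged end-to-end sketch of $\delta_M^k$ for a random walk proposal. The $N(0,1)$ target, the scale `tau` and all function names are illustrative assumptions. It is also an assumption of this sketch, one possible reading of the construction, that the proposals used to compute $\hat\xi_i^k$ are the same as those that produce $z_{i+1}$.

```python
import math
import random

def log_target(x):
    return -0.5 * x * x              # N(0,1) up to a constant (illustrative)

def block(z, k, tau, rng):
    """From accepted value z, return (xi_hat_k, next accepted value)."""
    total, prod, j = 1.0, 1.0, 0
    next_z, alive = None, True       # alive: terms of the sum still non-zero
    while alive or next_z is None:
        j += 1
        y = z + tau * rng.gauss(0.0, 1.0)
        d = log_target(y) - log_target(z)
        a = 1.0 if d >= 0.0 else math.exp(d)
        accepted = rng.random() < a
        if next_z is None and accepted:
            next_z = y               # first accepted proposal moves the chain
        if alive:
            if j <= k:
                prod *= 1.0 - a      # Rao-Blackwellised factor
            elif accepted:
                alive = False        # indicator I{u >= alpha} is 0
            if prod == 0.0:
                alive = False
            if alive:
                total += prod
    return total, next_z

def delta_M_k(h, M, k, tau=2.0, z0=0.0, seed=0):
    """Self-normalised estimate of pi(h) from M accepted values."""
    rng = random.Random(seed)
    z, num, den = z0, 0.0, 0.0
    for _ in range(M):
        xi, z_next = block(z, k, tau, rng)
        num += xi * h(z)
        den += xi
        z = z_next
    return num / den
```

    For instance, `delta_M_k(lambda x: x * x, 4000, 3)` estimates the second moment of the illustrative target, whose true value is 1.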
  • 54. We will need some additional assumptions. Assume a maximal inequality for the Markov chain $(z_i)_i$: there exists a measurable function $\zeta$ such that, for any starting point $x$,
    $$\forall h \in C_\zeta, \quad P_x\left( \sup_{0 \le i \le N} \sum_{j=0}^{i} [h(z_j) - \tilde\pi(h)] > \epsilon \right) \le \frac{N\, C_h(x)}{\epsilon^2}.$$
    Moreover, assume that $\exists \phi \ge 1$ such that, for any starting point $x$,
    $$\forall h \in C_\phi, \quad \tilde{Q}^n(x, h) \xrightarrow{P} \tilde\pi(h) = \pi(ph)/\pi(p).$$
  • 58. Theorem (Douc & X., AoS, 2011). Assume that $h$ is such that $h/p \in C_\zeta$ and $\{C_{h/p}, h^2/p^2\} \subset C_\phi$. Assume moreover that
    $$\sqrt{M}\,(\delta_M^0 - \pi(h)) \xrightarrow{L} \mathcal{N}(0, V_0[h - \pi(h)]).$$
    Then, for any starting point $x$,
    $$\sqrt{M_N}\left(\frac{\sum_{t=1}^{N} h(x^{(t)})}{N} - \pi(h)\right) \xrightarrow{N \to +\infty} \mathcal{N}(0, V_0[h - \pi(h)]),$$
    where $M_N$ is defined by
    $$\sum_{i=1}^{M_N} \hat\xi_i^0 \le N < \sum_{i=1}^{M_N + 1} \hat\xi_i^0.$$
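    The definition of $M_N$ amounts to counting how many accepted values are fully used up by the first $N$ iterations. A small sketch, with the hypothetical name `m_n` and assuming the weights $\hat\xi_i^0$ are positive:

```python
from bisect import bisect_right
from itertools import accumulate

def m_n(weights, N):
    """Largest M such that weights[0] + ... + weights[M-1] <= N."""
    cumulative = list(accumulate(weights))   # strictly increasing partial sums
    return bisect_right(cumulative, N)       # number of partial sums <= N
```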
  • 59. Variance gain (1)
        h(x)        x       x^2     I_{X>0}   p(x)
        τ = 0.1     0.971   0.953   0.957     0.207
        τ = 2       0.965   0.942   0.875     0.861
        τ = 5       0.913   0.982   0.785     0.826
        τ = 7       0.899   0.982   0.768     0.820
    Ratios of the empirical variances of $\delta^\infty$ and $\delta$ estimating $E[h(X)]$: 100 MCMC iterations over $10^3$ replications of a random walk Gaussian proposal with scale $\tau$.
  • 60. Illustration (1)
    Figure: Overlay of the variations of 250 iid realisations of the estimates $\delta$ (gold) and $\delta^\infty$ (grey) of $E[X] = 0$ for 1000 iterations, along with the 90% interquantile range for the estimates $\delta$ (brown) and $\delta^\infty$ (pink), in the setting of a random walk Gaussian proposal with scale $\tau = 10$.
  • 61. Extra computational effort
                    median   mean   q.8   q.9   time
        τ = 0.25    0.0      8.85   4.9   13    4.2
        τ = 0.50    0.0      6.76   4     11    2.25
        τ = 1.0     0.25     6.15   4     10    2.5
        τ = 2.0     0.20     5.90   3.5   8.5   4.5
    Additional computing effort: median and mean numbers of additional iterations, 80% and 90% quantiles for the additional iterations, and ratio of the average R computing times, obtained over $10^5$ simulations.
  • 62. Illustration (2)
    Figure: Overlay of the variations of 500 iid realisations of the estimates $\delta$ (deep grey), $\delta^\infty$ (medium grey) and of the importance sampling version (light grey) of $E[X] = 10$ when $X \sim \mathrm{Exp}(.1)$ for 100 iterations, along with the 90% interquantile ranges (same colour code), in the setting of an independent exponential proposal with scale $\mu = 0.02$.
  • 63. Outline
    1 Metropolis Hastings revisited
    2 Rao–Blackwellisation
        Formal importance sampling
        Variance reduction
        Asymptotic results
        Illustrations
    3 Rao-Blackwellisation (2)
        Independent case
        General MH algorithms
  • 64. Integrating out white noise
    In the Casella & X. (1996) paper, averaging over possible past and future histories (by integrating out the uniforms) improves the weights of accepted values.
    Rao–Blackwellised weight on proposed values $y_t$:
    $$\varphi_t^{(i)} = \delta_t \sum_{j=t}^{p} \xi_{tj}$$
    with
    $$\delta_0 = 1, \qquad \delta_t = \sum_{j=0}^{t-1} \delta_j\, \xi_{j(t-1)}\, \rho_{jt}$$
    and
    $$\xi_{tt} = 1, \qquad \xi_{tj} = \prod_{u=t+1}^{j} (1 - \rho_{tu}),$$
    the occurrence survivals of the $y_t$'s, associated with the Metropolis–Hastings ratio
    $$\omega_t = \pi(y_t)/\mu(y_t), \qquad \rho_{tu} = \omega_u/\omega_t \wedge 1.$$
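    The recursions for $\delta_t$, $\xi_{tj}$ and the resulting weights can be sketched in a few lines. The function name `rb_weights` is hypothetical, and the input is assumed to be the list of ratios $\omega_0, \dots, \omega_p$ (all positive), with index 0 standing for the starting value. The double loop makes the quadratic cost visible.

```python
def rb_weights(omega):
    """Rao-Blackwellised weights phi_t for t = 0..p from the ratios omega_t."""
    n = len(omega)                                   # n = p + 1

    def rho(t, u):
        return min(omega[u] / omega[t], 1.0)         # omega_u / omega_t ^ 1

    # xi[t][j]: probability that y_t, once accepted at time t, survives to j
    xi = [[0.0] * n for _ in range(n)]
    for t in range(n):
        xi[t][t] = 1.0
        for j in range(t + 1, n):
            xi[t][j] = xi[t][j - 1] * (1.0 - rho(t, j))

    # delta[t]: probability that y_t is accepted at time t
    delta = [0.0] * n
    delta[0] = 1.0
    for t in range(1, n):
        delta[t] = sum(delta[j] * xi[j][t - 1] * rho(j, t) for j in range(t))

    return [delta[t] * sum(xi[t][t:]) for t in range(n)]
```

    Each weight is the expected number of times the corresponding value occupies the chain over times $0, \dots, p$, so the weights add up to $p + 1$.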
  • 66. Integrating out white noise
    Potentially large variance improvement, but at a cost of $O(T^2)$...
    Possible recovery of efficiency thanks to parallelisation: moving from $(\epsilon_1, \dots, \epsilon_p)$ towards $(\epsilon_{(1)}, \dots, \epsilon_{(p)})$ by averaging over "all" possible orders.
  • 69. Case of the independent Metropolis–Hastings algorithm
    Starting at time $t$ with $p$ processors and a pool of $p$ proposed values, $(y_1, \dots, y_p)$, use the processors to examine in parallel $p$ different "histories".
  • 71. Improvement
    The standard estimator $\hat\tau_1$ of $E_\pi[h(X)]$,
    $$\hat\tau_1(x_t, y_{1:p}) = \frac{1}{p} \sum_{k=1}^{p} h(x_{t+k}),$$
    is necessarily dominated by the average
    $$\hat\tau_2(x_t, y_{1:p}) = \frac{1}{p^2} \sum_{k=0}^{p} n_k\, h(y_k),$$
    where $y_0 = x_t$ and $n_0$ is the number of times $x_t$ is repeated.
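    A sketch of one parallel block, run sequentially here for simplicity. All names (`imh_block`, `log_pi`, `log_mu`, `draw_mu`) are hypothetical, and the use of circular shifts of the pool as the $p$ histories is an assumption of this sketch; other orderings are possible.

```python
import math
import random

def imh_block(x_t, p, log_pi, log_mu, draw_mu, h, rng):
    """Return (tau1, tau2, counts) for one block of p histories of p IMH steps."""
    ys = [x_t] + [draw_mu(rng) for _ in range(p)]    # y_0 = x_t, then the pool
    log_w = [log_pi(y) - log_mu(y) for y in ys]      # log omega_k, up to a constant
    n = [0] * (p + 1)                                # n_k: occurrences of y_k
    tau1 = 0.0
    for s in range(p):                               # one history per "processor"
        order = [1 + (s + i) % p for i in range(p)]  # circular shift of 1..p
        cur = 0
        for i in order:
            d = log_w[i] - log_w[cur]
            if d >= 0.0 or rng.random() < math.exp(d):
                cur = i                              # proposal y_i accepted
            n[cur] += 1
            if s == 0:
                tau1 += h(ys[cur]) / p               # standard estimator, first history
    tau2 = sum(nk * h(yk) for nk, yk in zip(n, ys)) / p**2
    return tau1, tau2, n
```

    Additive constants in `log_pi` and `log_mu` cancel in the acceptance ratio, so unnormalised log-densities are sufficient.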
  • 72. Further Rao-Blackwellisation
    E.g., use of the Metropolis–Hastings weights $w_j$: $j$ being the index such that $x_{t+i-1} = y_j$, update of the weights at each time $t + i$:
    $$w_j = w_j + 1 - \rho(x_{t+i-1}, y_i), \qquad w_i = w_i + \rho(x_{t+i-1}, y_i),$$
    resulting in a more stable estimator
    $$\hat\tau_3(x_t, y_{1:p}) = \frac{1}{p^2} \sum_{k=0}^{p} w_k\, h(y_k).$$
    E.g., Casella & X. (1996):
    $$\hat\tau_4(x_t, y_{1:p}) = \frac{1}{p^2} \sum_{k=0}^{p} \varphi_k\, h(y_k).$$
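    The weight update behind $\hat\tau_3$ can be sketched for a single history as follows. The function name `history_weights` and its arguments are hypothetical: `omega` holds the ratios $\pi(y_k)/\mu(y_k)$ for $k = 0, \dots, p$, and `order` is the sequence in which this history visits the proposed indices.

```python
import random

def history_weights(omega, order, rng, start=0):
    """Accumulate the Metropolis-Hastings weights w_k along one history."""
    w = [0.0] * len(omega)
    cur = start                                   # index j with x_{t+i-1} = y_j
    for i in order:
        rho = min(1.0, omega[i] / omega[cur])     # IMH acceptance probability
        w[cur] += 1.0 - rho                       # mass kept by the current y_j
        w[i] += rho                               # mass given to the proposed y_i
        if rng.random() < rho:
            cur = i                               # the chain itself still moves at random
    return w
```

    Each step distributes a total mass of 1, so the weights of one history add up to the number of steps; summing over $p$ histories of $p$ steps gives the $p^2$ normalisation.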
  • 74. Markovian continuity
    The Markov validity of the chain is not jeopardised! The chain continues by picking one sequence at random and taking the corresponding $x_{t+p}^{(j)}$ as starting point of the next parallel block.
  • 76. Impact of Rao-Blackwellisations
    Comparison of
        $\hat\tau_1$: basic IMH estimator of $E_\pi[h(X)]$,
        $\hat\tau_2$: improving by averaging over permutations of proposed values and using $p$ times more uniforms,
        $\hat\tau_3$: improving upon $\hat\tau_2$ by a basic Rao-Blackwell argument,
        $\hat\tau_4$: improving upon $\hat\tau_2$ by integrating out ancillary uniforms, at a cost of $O(p^2)$.
  • 77. Illustration
    Variations of estimates based on RB and standard versions of parallel chains and on a standard MCMC chain for the mean and variance of the target $\mathcal{N}(0, 1)$ distribution (based on 10,000 independent replicas).
  • 81. Impact of the order
    Parallelisation allows for the partial integration of the uniforms. What about the permutation order?
    Comparison of
        $\hat\tau_{2N}$ with no permutation,
        $\hat\tau_{2C}$ with circular permutations,
        $\hat\tau_{2R}$ with random permutations,
        $\hat\tau_{2H}$ with half-random permutations,
        $\hat\tau_{2S}$ with stratified permutations.
  • 86. Importance target
    Comparison with the ultimate importance sampling
  • 90. Extension to the general case
    The same principle can be applied to any Markov update: if $x_{t+1} = \Psi(x_t, \epsilon_t)$, then generate $(\epsilon_1, \dots, \epsilon_p)$ in advance and distribute them to the $p$ processors in different permutation orders.
    Plus use of Douc & X.'s (2011) Rao–Blackwellisation $\hat\xi_i^k$.
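    A sketch of this principle for one particular update. Here $\Psi$ is taken to be a random walk Metropolis–Hastings step on an $N(0,1)$ target, with the noise being a pair (Gaussian increment, uniform); this choice, the circular permutations and the names `psi` and `parallel_block` are all assumptions made for illustration.

```python
import math
import random

def psi(x, eps, tau=1.0):
    """Random walk MH update for a N(0,1) target, written as x' = Psi(x, eps)."""
    step, u = eps                       # eps = (Gaussian increment, uniform)
    y = x + tau * step
    d = 0.5 * (x * x - y * y)           # log pi(y) - log pi(x)
    return y if (d >= 0.0 or u < math.exp(d)) else x

def parallel_block(x_t, p, rng):
    """Run p chains of length p from x_t on permutations of one noise pool."""
    eps = [(rng.gauss(0.0, 1.0), rng.random()) for _ in range(p)]
    chains = []
    for s in range(p):                  # processor s
        order = eps[s:] + eps[:s]       # circular permutation of the noise
        x, path = x_t, []
        for e in order:
            x = psi(x, e)
            path.append(x)
        chains.append(path)
    return chains
```

    The next block can then start from the final value of one chain picked at random, for instance `rng.choice(chains)[-1]`.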
  • 92. Implementation
    Similar run of $p$ parallel chains $(x_{t+i}^{(j)})$, use of averages
    $$\hat\tau_2(x_{1:p}^{(1:p)}) = \frac{1}{p^2} \sum_{k=1}^{p} \sum_{j=1}^{p} n_k\, h(x_{t+k}^{(j)})$$
    and selection of the new starting value at random at time $t + p$.
  • 94. Illustration
    Variations of estimates based on RB and standard versions of parallel chains and on a standard MCMC chain for the mean and variance of the target distribution (based on $p = 64$ parallel processors, 50 blocks of $p$ MCMC steps and 500 independent replicas).
    [Figure: two panels of boxplots, each comparing the three estimators labelled RB, par and org.]