MCMC and likelihood-free methods Part/day II: ABC




                       MCMC and likelihood-free methods
                             Part/day II: ABC

                                            Christian P. Robert

                                 Université Paris-Dauphine, IuF, & CREST


                            Monash University, EBS, July 18, 2012
MCMC and likelihood-free methods Part/day II: ABC




Outline
simulation-based methods in Econometrics

Genetics of ABC

Approximate Bayesian computation

ABC for model choice

ABC model choice consistency
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Econ’ections


   simulation-based methods in
   Econometrics

   Genetics of ABC

   Approximate Bayesian computation

   ABC for model choice

   ABC model choice consistency
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Usages of simulation in Econometrics



      Similar exploration of simulation-based techniques in Econometrics
              Simulated method of moments
              Method of simulated moments
              Simulated pseudo-maximum-likelihood
              Indirect inference
                                                    [Gouriéroux & Monfort, 1996]
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Simulated method of moments

      Given observations $y^o_{1:n}$ from a model

            $$y_t = r(y_{1:(t-1)}, \epsilon_t, \theta)\,,\qquad \epsilon_t \sim g(\cdot)$$

      simulate $\epsilon_{1:n}$, derive

            $$y_t(\theta) = r(y_{1:(t-1)}, \epsilon_t, \theta)$$

      and estimate θ by

            $$\arg\min_\theta \sum_{t=1}^n \big(y^o_t - y_t(\theta)\big)^2$$
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Simulated method of moments

      Given observations $y^o_{1:n}$ from a model

            $$y_t = r(y_{1:(t-1)}, \epsilon_t, \theta)\,,\qquad \epsilon_t \sim g(\cdot)$$

      simulate $\epsilon_{1:n}$, derive

            $$y_t(\theta) = r(y_{1:(t-1)}, \epsilon_t, \theta)$$

      and estimate θ by

            $$\arg\min_\theta \Big(\sum_{t=1}^n y^o_t - \sum_{t=1}^n y_t(\theta)\Big)^2$$
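
      As a minimal illustration (not from the original slides), an R sketch
      of this criterion for an assumed AR(1)-type recursion
      y_t = θ y_{t−1} + ε_t , with true value θ = 0.7 and a grid search:

      R code (illustrative sketch)
      set.seed(42)
      n=500
      r=function(ylag,eps,theta) theta*ylag+eps    # assumed recursion r(.)
      epso=rnorm(n); yo=rep(0,n)
      for (t in 2:n) yo[t]=r(yo[t-1],epso[t],0.7)  # observed series y^o
      eps=rnorm(n)                                 # simulated shocks, held fixed
      smm=function(theta){                         # (sum y^o - sum y(theta))^2
        y=rep(0,n)
        for (t in 2:n) y[t]=r(y[t-1],eps[t],theta)
        (sum(yo)-sum(y))^2}
      thetas=seq(-.99,.99,le=199)
      thetas[which.min(sapply(thetas,smm))]        # SMM estimate of theta

      Keeping the simulated shocks fixed across values of θ makes the
      criterion a deterministic, smooth function of θ.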
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Method of simulated moments

      Given a statistic vector $K(y)$ with

            $$\mathbb{E}_\theta[K(Y_t)\,|\,y_{1:(t-1)}] = k(y_{1:(t-1)};\theta)$$

      find an unbiased estimator of $k(y_{1:(t-1)};\theta)$,

            $$\tilde k(\epsilon_t, y_{1:(t-1)};\theta)$$

      Estimate θ by

            $$\arg\min_\theta \sum_{t=1}^n \Big\| K(y_t) - \frac{1}{S}\sum_{s=1}^S \tilde k(\epsilon^s_t, y_{1:(t-1)};\theta)\Big\|$$

                                                                       [Pakes & Pollard, 1989]
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Indirect inference



      Minimise (in θ) the distance between estimators $\hat\beta$ based on
      pseudo-models for genuine observations and for observations
      simulated under the true model and the parameter θ.

                                                    [Gouriéroux, Monfort, & Renault, 1993;
                                                    Smith, 1993; Gallant & Tauchen, 1996]
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Indirect inference (PML vs. PSE)


      Example of the pseudo-maximum-likelihood (PML)

            $$\hat\beta(y) = \arg\max_\beta \sum_t \log f(y_t\,|\,\beta, y_{1:(t-1)})$$

      leading to

            $$\arg\min_\theta \big\|\hat\beta(y^o) - \hat\beta(y_1(\theta),\ldots,y_S(\theta))\big\|^2$$

      when

            $$y_s(\theta) \sim f(y|\theta)\,,\qquad s = 1,\ldots,S$$
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Indirect inference (PML vs. PSE)

      Example of the pseudo-score-estimator (PSE)

            $$\hat\beta(y) = \arg\min_\beta \sum_t \Big(\frac{\partial \log f}{\partial \beta}(y_t\,|\,\beta, y_{1:(t-1)})\Big)^2$$

      leading to

            $$\arg\min_\theta \big\|\hat\beta(y^o) - \hat\beta(y_1(\theta),\ldots,y_S(\theta))\big\|^2$$

      when

            $$y_s(\theta) \sim f(y|\theta)\,,\qquad s = 1,\ldots,S$$
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Consistent indirect inference



                  ...in order to get a unique solution the dimension of
              the auxiliary parameter β must be larger than or equal to
              the dimension of the initial parameter θ. If the problem is
              just identified the different methods become easier...
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Consistent indirect inference



                  ...in order to get a unique solution the dimension of
              the auxiliary parameter β must be larger than or equal to
              the dimension of the initial parameter θ. If the problem is
              just identified the different methods become easier...

      Consistency depending on the criterion and on the asymptotic
      identifiability of θ
                                     [Gouriéroux & Monfort, 1996, p. 66]
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




AR(2) vs. MA(1) example

      true (MA) model

            $$y_t = \epsilon_t - \theta\,\epsilon_{t-1}$$

      and [wrong!] auxiliary (AR) model

            $$y_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + u_t$$

      R code
      x=eps=rnorm(250)                       # MA(1) innovations
      x[2:250]=x[2:250]-0.5*x[1:249]         # observed series, true theta = 0.5
      simeps=rnorm(250)                      # fixed simulated innovations
      propeta=seq(-.99,.99,le=199)           # grid of candidate theta values
      dist=rep(0,199)
      bethat=as.vector(arima(x,c(2,0,0),incl=FALSE)$coef)   # auxiliary AR(2) fit on the data
      for (t in 1:199)                       # AR(2) fit on each simulated MA(1) series
        dist[t]=sum((as.vector(arima(c(simeps[1],simeps[2:250]-propeta[t]*
        simeps[1:249]),c(2,0,0),incl=FALSE)$coef)-bethat)^2)
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




AR(2) vs. MA(1) example
      One sample:

      [Figure: distance (0 to 0.8) between the auxiliary AR(2) estimates,
      against the proposed value of θ over the grid (−1, 1)]
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




AR(2) vs. MA(1) example
      Many samples:

      [Figure: distribution of the indirect-inference estimates of θ across
      many samples, over the range (0.2, 1)]
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




Choice of pseudo-model



      Pick the pseudo-model such that
       1. $\hat\beta(\theta)$ is not flat
          (i.e. sensitive to changes in θ)
       2. $\hat\beta(\theta)$ is not dispersed (i.e. robust against changes in $y_s(\theta)$)
                                                    [Frigessi & Heggland, 2004]
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




ABC using indirect inference (1)


                  We present a novel approach for developing summary statistics
              for use in approximate Bayesian computation (ABC) algorithms by
              using indirect inference (...) In the indirect inference approach to
              ABC the parameters of an auxiliary model fitted to the data become
              the summary statistics. Although applicable to any ABC technique,
              we embed this approach within a sequential Monte Carlo algorithm
              that is completely adaptive and requires very little tuning (...)

                                                    [Drovandi, Pettitt & Faddy, 2011]

             © Indirect inference provides summary statistics for ABC...
MCMC and likelihood-free methods Part/day II: ABC
  simulation-based methods in Econometrics




ABC using indirect inference (2)



                  ...the above result shows that, in the limit as h → 0, ABC will
              be more accurate than an indirect inference method whose auxiliary
              statistics are the same as the summary statistic that is used for
              ABC (...) Initial analysis showed that which method is more
              accurate depends on the true value of θ.

                                                    [Fearnhead and Prangle, 2012]

     © Indirect inference provides estimates rather than global inference...
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC




Genetics of ABC


   simulation-based methods in
   Econometrics

   Genetics of ABC

   Approximate Bayesian computation

   ABC for model choice

   ABC model choice consistency
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC




Genetic background of ABC



      ABC is a recent computational technique that only requires being
      able to sample from the likelihood f (·|θ)
      This technique stemmed from population genetics models, about
      15 years ago, and population geneticists still contribute
      significantly to methodological developments of ABC.
                                [Griffith & al., 1997; Tavaré & al., 1999]
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC




Population genetics

                              [Part derived from the teaching material of Raphael Leblois, ENS Lyon, November 2010]


              Describe the genotypes, estimate the allele frequencies,
              determine their distribution among individuals, within
              populations and between populations;

              Predict and understand the evolution of gene frequencies in
              populations as a result of various factors.

       © Analyses the effect of various evolutionary forces (mutation,
      drift, migration, selection) on the evolution of gene frequencies in
      time and space.
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC




A[B]ctors
      Towards an explanation of the mechanisms of evolutionary change,
      mathematical models of evolution started to be developed in the
      1920s-1930s.




            Ronald A Fisher (1890-1962)        Sewall Wright (1889-1988)   John BS Haldane (1892-1964)
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC




    Le modèle
Wright-Fisher model                            de Wright-Fisher

              A population of constant size, in which individuals
              reproduce at the same time.

              Each gene in a generation is a copy of a gene of the
              previous generation.

              In the absence of mutation and selection, allele frequencies
              drift (rise and fall) inevitably until the fixation of an
              allele; drift thus leads to the loss of genetic variation
              within populations.
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC




Coalescent theory
                                                    [Kingman, 1982; Tajima, Tavaré, &tc]


      Coalescence theory is interested in the genealogy of a sample of
      genes: it traces back in time to the common ancestor of the sample.
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC




Common ancestor



      [Figure: coalescent genealogy, with the times of coalescence (T)]

      The different lineages merge (coalesce) as one goes back into the
      past, up to the common ancestor of the sample of genes.
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC



Neutral mutations

      Under the assumption of neutrality of the genetic markers under
      study, the mutations are independent of the genealogy, i.e. the
      genealogy depends only on the demographic processes.

      The genealogy is thus built according to the demographic parameters
      (e.g. N ), then the mutations are added a posteriori on the
      different branches, from the MRCA down to the leaves of the tree.
      This yields polymorphism data under the demographic and mutational
      models under consideration.
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC




Demo-genetic inference



      Each model is characterized by a set of parameters θ that cover
      historical (divergence times, admixture times, ...), demographic
      (population sizes, admixture rates, migration rates, ...) and genetic
      (mutation rates, ...) factors
      The goal is to estimate these parameters from a dataset of
      polymorphism (DNA sample) y observed at the present time
      Problem: most of the time, we cannot calculate the likelihood of
      the polymorphism data f (y|θ).
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC




Intractable likelihood



      Missing (too missing!) data structure:

            $$f(y|\theta) = \int_{\mathcal{G}} f(y|G,\theta)\,f(G|\theta)\,\mathrm{d}G$$

      The genealogies are considered as nuisance parameters.
      This problem thus differs from the phylogenetic approach, where
      the tree is the parameter of interest.
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC




A genuine !""#$%&'()*+,(-*.&(/+0$'"1)()&$/+2!,03!
          example of application
                    1/+*%*'"4*+56(""4&7()&$/.+.1#+4*.+8-9':*.+




                                                            94


      Pygmies populations: do they have a common origin? Is there a
      lot of exchanges between pygmies and non-pygmies populations?
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC

          !""#$%&'()*+,(-*.&(/+0$'"1)()&$/+2!,03!
Scenarios 1/+*%*'"4*+56(""4&7()&$/.+.1#+4*.+8-9':*.+
           under competition

             Différents scénarios possibles, choix de scenario par ABC




                                                                  Verdu et al. 2009

                                                                         96
1/+*%*'"4*+56(""4&7()&$/.+.1#+4*.+
MCMC and likelihood-free methods Part/day II: ABC


           !""#$%&'()*+,(-*.&(/+0$'"1)()&$/+2!,03
  Genetics of ABC



          1/+*%*'"4*+56(""4&7()&$/.+.1#+4*.+8-9':*.
Différentsresults
 Simulation scénarios possibles, choix de scenari

             Différents scénarios possibles, choix de scenario par ABC




       c Scenario 1A is chosen. largement soutenu par rapport aux
           Le scenario 1a est
            autres ! plaide pour une origine commune des
  Le      scenario 1a est largement soutenu par
            populations pygmées d’Afrique de l’Ouest              rap
                                                                  Verdu e
MCMC and likelihood-free methods Part/day II: ABC
  Genetics of ABC



    !""#$%&'()*+,(-*.&(/+0$'"1)()&$/+2!,03!
Most likely scenario
   1/+*%*'"4*+56(""4&7()&$/.+.1#+4*.+8-9':*.+

   Scénario
   évolutif :
   on « raconte »
   une histoire à
   partir de ces
   inférences




                                                    Verdu et al. 2009


                                                                    99
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation




Approximate Bayesian computation


   simulation-based methods in
   Econometrics

   Genetics of ABC

   Approximate Bayesian computation

   ABC for model choice

   ABC model choice consistency
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Intractable likelihoods


   Cases when the likelihood function
   f (y|θ) is unavailable and when the
   completion step

         $$f(y|\theta) = \int_{\mathcal{Z}} f(y,z|\theta)\,\mathrm{d}z$$

   is impossible or too costly because of
   the dimension of z
    © MCMC cannot be implemented!
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Illustrations



      Example (stochastic volatility)
      Stochastic volatility model: for t = 1, . . . , T ,

            $$y_t = \exp(z_t)\,\epsilon_t\,,\qquad z_t = a + b\,z_{t-1} + \sigma\,\eta_t\,,$$

      T very large makes it difficult to include z within the simulated
      parameters

      [Figure: highest weight trajectories of (z_t ), t = 0, . . . , 1000]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Illustrations



      Example (Potts model)
      Potts model: if y takes values on a grid Y of size k^n and

            $$f(y|\theta) \propto \exp\Big(\theta \sum_{l\sim i} \mathbb{I}_{y_l = y_i}\Big)$$

      where l∼i denotes a neighbourhood relation, a moderately large n
      prohibits the computation of the normalising constant
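
      To make the obstacle concrete, a hedged R sketch that computes the
      normalising constant by brute force on an assumed 2×2 grid with
      k = 2 colours; the k^n terms in the sum are what becomes infeasible
      for realistic grids:

      R code (illustrative sketch)
      theta=0.8; k=2
      neigh=rbind(c(1,2),c(3,4),c(1,3),c(2,4))        # pairs l~i on the 2x2 grid
      confs=as.matrix(expand.grid(rep(list(1:k),4)))  # all k^4 configurations
      unnorm=apply(confs,1,function(y)
        exp(theta*sum(y[neigh[,1]]==y[neigh[,2]])))   # unnormalised f(y|theta)
      Z=sum(unnorm)                                   # k^n terms: hopeless for large n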
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Illustrations


      Example (Genesis)
   Phylogenetic tree: in population
   genetics, reconstitution of a common
   ancestor from a sample of genes via
   a phylogenetic tree that is close to
   impossible to integrate out
   [100 processor days with 4
   parameters]
                                    [Cornuet et al., 2009, Bioinformatics]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


The ABC method

      Bayesian setting: target is π(θ)f (x|θ)
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


The ABC method

      Bayesian setting: target is π(θ)f (x|θ)
      When likelihood f (x|θ) not in closed form, likelihood-free rejection
      technique:
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


The ABC method

      Bayesian setting: target is π(θ)f (x|θ)
      When likelihood f (x|θ) not in closed form, likelihood-free rejection
      technique:
      ABC algorithm
      For an observation y ∼ f (y|θ), under the prior π(θ), keep jointly
      simulating
                           θ′ ∼ π(θ) ,   z ∼ f (z|θ′ ) ,
      until the auxiliary variable z is equal to the observed value, z = y.

                                                       [Tavaré et al., 1997]
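
      For discrete data the exact-matching scheme can be run as stated; a
      minimal R sketch under assumed settings (y ∼ B(10, θ), θ ∼ U(0, 1),
      observed y = 3):

      R code (illustrative sketch)
      y=3
      thetas=runif(1e5)                     # theta' ~ pi(theta)
      zed=rbinom(1e5,size=10,p=thetas)      # z ~ f(z|theta')
      abc=thetas[zed==y]                    # kept iff z = y: exact posterior draws
      mean(abc)                             # close to the Beta(4,8) mean 1/3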
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Why does it work?!



      The proof is trivial:

            $$f(\theta_i) \propto \sum_{z\in\mathcal{D}} \pi(\theta_i)\,f(z|\theta_i)\,\mathbb{I}_y(z)
               \propto \pi(\theta_i)\,f(y|\theta_i) = \pi(\theta_i|y)\,.$$

                                                                             [Accept–Reject 101]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Earlier occurrence


              ‘Bayesian statistics and Monte Carlo methods are ideally
              suited to the task of passing many models over one
              dataset’
                                                    [Don Rubin, Annals of Statistics, 1984]

      Note that Rubin (1984) does not promote this algorithm for
      likelihood-free simulation but as a frequentist intuition on posterior
      distributions: parameters from posteriors are more likely to be those
      that could have generated the data.
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


A as A...pproximative


      When y is a continuous random variable, equality z = y is replaced
      with a tolerance condition,
                                  ρ(y, z) ≤ ε

      where ρ is a distance
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


A as A...pproximative


      When y is a continuous random variable, equality z = y is replaced
      with a tolerance condition,
                                  ρ(y, z) ≤ ε

      where ρ is a distance
      Output distributed from

            $$\pi(\theta)\,P_\theta\{\rho(y,z) < \epsilon\} \propto \pi(\theta\,|\,\rho(y,z) < \epsilon)$$

                                                               [Pritchard et al., 1999]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


ABC algorithm


      Algorithm 1 Likelihood-free rejection sampler
        for i = 1 to N do
          repeat
             generate θ′ from the prior distribution π(·)
             generate z from the likelihood f (·|θ′ )
          until ρ{η(z), η(y)} ≤ ε
          set θi = θ′
        end for

      where η(y) defines a (not necessarily sufficient) statistic
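
      A direct R transcription of Algorithm 1 in an assumed toy setting (a
      N (θ, 1) sample of size 20, η the sample mean, ρ the absolute
      difference, ε = 0.05, prior N (0, 10²)):

      R code (illustrative sketch)
      y=rnorm(20,mean=1.5); etay=mean(y)    # data and observed summary eta(y)
      N=1e3; eps=.05; abc=rep(0,N)
      for (i in 1:N){
        repeat{
          prop=rnorm(1,sd=10)               # generate theta' from the prior
          z=rnorm(20,mean=prop)             # generate z from the likelihood
          if (abs(mean(z)-etay)<=eps) break # until rho{eta(z),eta(y)} <= eps
        }
        abc[i]=prop}                        # set theta_i = theta'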
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Output

      The likelihood-free algorithm samples from the marginal in z of:

            $$\pi_\epsilon(\theta, z|y) = \frac{\pi(\theta)\,f(z|\theta)\,\mathbb{I}_{A_{\epsilon,y}}(z)}{\int_{A_{\epsilon,y}\times\Theta} \pi(\theta)\,f(z|\theta)\,\mathrm{d}z\,\mathrm{d}\theta}\,,$$

      where $A_{\epsilon,y} = \{z \in \mathcal{D} \,|\, \rho(\eta(z), \eta(y)) < \epsilon\}$.
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Output

      The likelihood-free algorithm samples from the marginal in z of:

            $$\pi_\epsilon(\theta, z|y) = \frac{\pi(\theta)\,f(z|\theta)\,\mathbb{I}_{A_{\epsilon,y}}(z)}{\int_{A_{\epsilon,y}\times\Theta} \pi(\theta)\,f(z|\theta)\,\mathrm{d}z\,\mathrm{d}\theta}\,,$$

      where $A_{\epsilon,y} = \{z \in \mathcal{D} \,|\, \rho(\eta(z), \eta(y)) < \epsilon\}$.
      The idea behind ABC is that the summary statistics coupled with a
      small tolerance should provide a good approximation of the
      posterior distribution:

            $$\pi_\epsilon(\theta|y) = \int \pi_\epsilon(\theta, z|y)\,\mathrm{d}z \approx \pi(\theta|y)\,.$$
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Convergence of ABC (first attempt)




      What happens when ε → 0?
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Convergence of ABC (first attempt)



      What happens when ε → 0?
      If f (·|θ) is continuous in y , uniformly in θ [!], given an arbitrary
      δ > 0, there exists ε0 such that ε < ε0 implies

            $$\frac{\pi(\theta)\int f(z|\theta)\,\mathbb{I}_{A_{\epsilon,y}}(z)\,\mathrm{d}z}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)f(z|\theta)\,\mathrm{d}z\,\mathrm{d}\theta} \in \frac{\pi(\theta)f(y|\theta)(1\mp\delta)\,\mu(B_\epsilon)}{\int_\Theta \pi(\theta)f(y|\theta)\,\mathrm{d}\theta\,(1\pm\delta)\,\mu(B_\epsilon)}$$
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Convergence of ABC (first attempt)



      What happens when ε → 0?
      If f (·|θ) is continuous in y , uniformly in θ [!], given an arbitrary
      δ > 0, there exists ε0 such that ε < ε0 implies

            $$\frac{\pi(\theta)\int f(z|\theta)\,\mathbb{I}_{A_{\epsilon,y}}(z)\,\mathrm{d}z}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)f(z|\theta)\,\mathrm{d}z\,\mathrm{d}\theta} \in \frac{\pi(\theta)f(y|\theta)(1\mp\delta)}{\int_\Theta \pi(\theta)f(y|\theta)\,\mathrm{d}\theta\,(1\pm\delta)}$$

      [the $\mu(B_\epsilon)$ factors cancel]

                                [Proof extends to other continuous-in-0 kernels K ]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Convergence of ABC (second attempt)

      What happens when ε → 0?
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Convergence of ABC (second attempt)

      What happens when ε → 0?
      For B ⊂ Θ, we have

      $$\int_B \frac{\int_{A_{\epsilon,y}} f(z|\theta)\,\mathrm{d}z}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)f(z|\theta)\,\mathrm{d}z\,\mathrm{d}\theta}\,\pi(\theta)\,\mathrm{d}\theta
        = \int_{A_{\epsilon,y}} \frac{\int_B f(z|\theta)\pi(\theta)\,\mathrm{d}\theta}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)f(z|\theta)\,\mathrm{d}z\,\mathrm{d}\theta}\,\mathrm{d}z$$
      $$= \int_{A_{\epsilon,y}} \frac{\int_B f(z|\theta)\pi(\theta)\,\mathrm{d}\theta}{m(z)}\,\frac{m(z)}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)f(z|\theta)\,\mathrm{d}z\,\mathrm{d}\theta}\,\mathrm{d}z
        = \int_{A_{\epsilon,y}} \pi(B|z)\,\frac{m(z)}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)f(z|\theta)\,\mathrm{d}z\,\mathrm{d}\theta}\,\mathrm{d}z$$

      which indicates convergence for a continuous π(B|z).
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Probit modelling on Pima Indian women
      Example (R benchmark)
      200 Pima Indian women with observed variables
              plasma glucose concentration in oral glucose tolerance test
              diastolic blood pressure
              diabetes pedigree function
              presence/absence of diabetes
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Probit modelling on Pima Indian women
      Example (R benchmark)
      200 Pima Indian women with observed variables
              plasma glucose concentration in oral glucose tolerance test
              diastolic blood pressure
              diabetes pedigree function
              presence/absence of diabetes


      Probability of diabetes function of above variables
                               P(y = 1|x) = Φ(x1 β1 + x2 β2 + x3 β3 ) ,
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Probit modelling on Pima Indian women
      Example (R benchmark)
      200 Pima Indian women with observed variables
              plasma glucose concentration in oral glucose tolerance test
              diastolic blood pressure
              diabetes pedigree function
              presence/absence of diabetes


      Probability of diabetes function of above variables
                               P(y = 1|x) = Φ(x1 β1 + x2 β2 + x3 β3 ) ,
      Test of H0 : β3 = 0 for 200 observations of Pima.tr based on a
      g -prior modelling:
                               β ∼ N3 (0, n (XT X)−1 )
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Probit modelling on Pima Indian women
      Example (R benchmark)
      200 Pima Indian women with observed variables
              plasma glucose concentration in oral glucose tolerance test
              diastolic blood pressure
              diabetes pedigree function
              presence/absence of diabetes


      Probability of diabetes function of above variables
                               P(y = 1|x) = Φ(x1 β1 + x2 β2 + x3 β3 ) ,
      Test of H0 : β3 = 0 for 200 observations of Pima.tr based on a
      g -prior modelling:
                               β ∼ N3 (0, n (XT X)−1 )
      Use of importance function inspired from the MLE estimate
      distribution
                               β ∼ N (β̂, Σ̂)
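
      A hedged sketch of the corresponding ABC rejection sampler, using
      the probit MLE as summary statistic; the 1% quantile tolerance and
      the simulation size are assumptions, and Pima.tr comes with the
      MASS package:

      R code (illustrative sketch)
      library(MASS)
      x=as.matrix(Pima.tr[,c("glu","bp","ped")])
      y=as.numeric(Pima.tr$type)-1
      eta=function(z) coef(glm(z~x-1,family=binomial("probit")))  # summary = MLE
      etay=eta(y)
      N=1e4
      prior=mvrnorm(N,rep(0,3),200*solve(t(x)%*%x))   # g-prior N3(0,n(X^TX)^{-1})
      dist=rep(0,N)
      for (i in 1:N){
        z=rbinom(nrow(x),1,pnorm(x%*%prior[i,]))      # pseudo-data given beta_i
        dist[i]=sum((eta(z)-etay)^2)}
      abc=prior[dist<=quantile(dist,.01),]            # keep the closest 1%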
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Pima Indian benchmark




      Figure: Comparison between density estimates of the marginals on β1
      (left), β2 (center) and β3 (right) from ABC rejection samples (red) and
      MCMC samples (black).
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


MA example


      Back to the MA(q) model

            $$x_t = \epsilon_t + \sum_{i=1}^q \vartheta_i\,\epsilon_{t-i}$$

      Simple prior: uniform over the inverse [real and complex] roots in

            $$\mathcal{Q}(u) = 1 - \sum_{i=1}^q \vartheta_i u^i$$

      under the identifiability conditions
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


MA example



      Back to the MA(q) model

            $$x_t = \epsilon_t + \sum_{i=1}^q \vartheta_i\,\epsilon_{t-i}$$

      Simple prior: uniform prior over the identifiability zone, e.g.
      triangle for MA(2)
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


MA example (2)
      ABC algorithm thus made of
         1. picking a new value (ϑ1 , ϑ2 ) in the triangle
         2. generating an iid sequence (ε_t )_{−q<t≤T}
         3. producing a simulated series (x′_t )_{1≤t≤T}
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


MA example (2)
      ABC algorithm thus made of
         1. picking a new value (ϑ1 , ϑ2 ) in the triangle
         2. generating an iid sequence ( t )−qt≤T
         3. producing a simulated series (xt )1≤t≤T
      Distance: basic distance between the series
                                                                   T
                           ρ((xt )1≤t≤T , (xt )1≤t≤T ) =                (xt − xt )2
                                                                  t=1

      or distance between summary statistics like the q autocorrelations
                                                      T
                                              τj =           xt xt−j
                                                     t=j+1
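
      A small R sketch of the second distance for an assumed MA(2) model
      with T = 500 and true (ϑ1 , ϑ2 ) = (0.6, 0.2):

      R code (illustrative sketch)
      T=500; q=2
      simMA2=function(t1,t2){eps=rnorm(T+q)           # x_t = eps_t + t1 eps_{t-1} + t2 eps_{t-2}
        eps[(q+1):(T+q)]+t1*eps[2:(T+1)]+t2*eps[1:T]}
      tau=function(x) sapply(1:q,function(j) sum(x[(j+1):T]*x[1:(T-j)]))
      xo=simMA2(.6,.2)                                # "observed" series
      rho2=function(t1,t2) sum((tau(simMA2(t1,t2))-tau(xo))^2)
      rho2(.6,.2); rho2(0,0)                          # distance at the truth vs. at zero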
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Comparison of distance impact




      Evaluation of the tolerance on the ABC sample against both
      distances (ε = 100%, 10%, 1%, 0.1%) for an MA(2) model
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Comparison of distance impact

      [Figure: density estimates of the ABC samples of θ1 (left) and θ2
      (right) for decreasing tolerances]

      Evaluation of the tolerance on the ABC sample against both
      distances (ε = 100%, 10%, 1%, 0.1%) for an MA(2) model
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


Homonymy
      The ABC algorithm is not to be confused with the ABC algorithm
              The Artificial Bee Colony algorithm is a swarm based meta-heuristic
              algorithm that was introduced by Karaboga in 2005 for optimizing
              numerical problems. It was inspired by the intelligent foraging
              behavior of honey bees. The algorithm is specifically based on the
              model proposed by Tereshko and Loengarov (2005) for the foraging
              behaviour of honey bee colonies. The model consists of three
              essential components: employed and unemployed foraging bees, and
              food sources. The first two components, employed and unemployed
              foraging bees, search for rich food sources (...) close to their hive.
              The model also defines two leading modes of behaviour (...):
              recruitment of foragers to rich food sources resulting in positive
              feedback and abandonment of poor sources by foragers causing
              negative feedback.

                                                           [Karaboga, Scholarpedia]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


ABC advances

      Simulating from the prior is often poor in efficiency
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


ABC advances

      Simulating from the prior is often poor in efficiency
      Either modify the proposal distribution on θ to increase the density
      of x’s within the vicinity of y ...
           [Marjoram et al, 2003; Bortot et al., 2007, Sisson et al., 2007]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


ABC advances

      Simulating from the prior is often poor in efficiency
      Either modify the proposal distribution on θ to increase the density
      of x’s within the vicinity of y ...
           [Marjoram et al, 2003; Bortot et al., 2007, Sisson et al., 2007]

      ...or by viewing the problem as a conditional density estimation
      problem and by developing techniques to allow for a larger ε
                                                  [Beaumont et al., 2002]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     ABC basics


ABC advances

      Simulating from the prior is often poor in efficiency
      Either modify the proposal distribution on θ to increase the density
      of x’s within the vicinity of y ...
           [Marjoram et al, 2003; Bortot et al., 2007, Sisson et al., 2007]

      ...or by viewing the problem as a conditional density estimation
      problem and by developing techniques to allow for a larger ε
                                                  [Beaumont et al., 2002]

      .....or even by including ε in the inferential framework [ABCµ ]
                                                                    [Ratmann et al., 2009]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NP
    Better usage of [prior] simulations by
   adjustment: instead of throwing away
   θ′ such that ρ(η(z), η(y)) > ε, replace
   the θ′ s with locally regressed transforms
      (use with BIC)


              θ∗ = θ − {η(z) − η(y)}T β̂                 [Csilléry et al., TEE, 2010]

      where β̂ is obtained by [NP] weighted least square regression on
      (η(z) − η(y)) with weights

                                Kδ {ρ(η(z), η(y))}

                                                     [Beaumont et al., 2002, Genetics]
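
      A hedged R sketch of this adjustment for a scalar θ with
      Epanechnikov weights; the inputs (theta, etaz, etay, delta) are
      assumed to come from a pilot ABC run:

      R code (illustrative sketch)
      adjust=function(theta,etaz,etay,delta){
        dif=sweep(etaz,2,etay)                        # eta(z)-eta(y), one row per draw
        w=pmax(1-(sqrt(rowSums(dif^2))/delta)^2,0)    # K_delta{rho(eta(z),eta(y))}
        fit=lm(theta~dif,weights=w)                   # [NP] weighted least squares
        theta-dif%*%coef(fit)[-1]}                    # theta* = theta - {eta(z)-eta(y)}^T beta-hat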
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NP (regression)


      Also found in the subsequent literature, e.g. in Fearnhead-Prangle (2012):
      weight the simulation directly by

            $$K_\delta\{\rho(\eta(z(\theta)), \eta(y))\}$$

      or

            $$\frac{1}{S}\sum_{s=1}^S K_\delta\{\rho(\eta(z_s(\theta)), \eta(y))\}$$

                                                          [consistent estimate of f (η|θ)]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NP (regression)


      Also found in the subsequent literature, e.g. in Fearnhead-Prangle (2012):
      weight the simulation directly by

            $$K_\delta\{\rho(\eta(z(\theta)), \eta(y))\}$$

      or

            $$\frac{1}{S}\sum_{s=1}^S K_\delta\{\rho(\eta(z_s(\theta)), \eta(y))\}$$

                                          [consistent estimate of f (η|θ)]
      Curse of dimensionality: poor estimate when d = dim(η) is large...
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NP (density estimation)


      Use of the kernel weights

                                             Kδ {ρ(η(z(θ)), η(y))}

      leads to the NP estimate of the posterior expectation

            $$\frac{\sum_i \theta_i\,K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}}{\sum_i K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}}$$

                                                                          [Blum, JASA, 2010]
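
      The corresponding R sketch of this Nadaraya-Watson type estimate,
      reusing the assumed inputs of the adjustment sketch above:

      R code (illustrative sketch)
      postmean=function(theta,etaz,etay,delta){
        w=pmax(1-(sqrt(rowSums(sweep(etaz,2,etay)^2))/delta)^2,0)  # K_delta weights
        sum(theta*w)/sum(w)}                          # weighted average of the theta_i's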
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NP (density estimation)


      Use of the kernel weights

                                          Kδ {ρ(η(z(θ)), η(y))}

      leads to the NP estimate of the posterior conditional density

            $$\frac{\sum_i \tilde K_b(\theta_i - \theta)\,K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}}{\sum_i K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}}$$

                                                                    [Blum, JASA, 2010]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NP (density estimations)


      Other versions incorporating regression adjustments

            $$\frac{\sum_i \tilde K_b(\theta_i^* - \theta)\,K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}}{\sum_i K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}}$$
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NP (density estimations)


      Other versions incorporating regression adjustments

            $$\frac{\sum_i \tilde K_b(\theta_i^* - \theta)\,K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}}{\sum_i K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}}$$

      In all cases, error

            $$\mathbb{E}[\hat g(\theta|y)] - g(\theta|y) = cb^2 + c\delta^2 + O_P(b^2 + \delta^2) + O_P(1/n\delta^d)$$
            $$\mathrm{var}(\hat g(\theta|y)) = \frac{c}{nb\delta^d}\,(1 + o_P(1))$$

                                                                   [Blum, JASA, 2010]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NP (density estimations)


      Other versions incorporating regression adjustments

            $$\frac{\sum_i \tilde K_b(\theta_i^* - \theta)\,K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}}{\sum_i K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}}$$

      In all cases, error

            $$\mathbb{E}[\hat g(\theta|y)] - g(\theta|y) = cb^2 + c\delta^2 + O_P(b^2 + \delta^2) + O_P(1/n\delta^d)$$
            $$\mathrm{var}(\hat g(\theta|y)) = \frac{c}{nb\delta^d}\,(1 + o_P(1))$$

                                                             [standard NP calculations]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NCH

      Incorporating non-linearities and heteroscedasticities:

            $$\theta^* = \hat m(\eta(y)) + [\theta - \hat m(\eta(z))]\,\frac{\hat\sigma(\eta(y))}{\hat\sigma(\eta(z))}$$
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NCH

      Incorporating non-linearities and heteroscedasticities:

            $$\theta^* = \hat m(\eta(y)) + [\theta - \hat m(\eta(z))]\,\frac{\hat\sigma(\eta(y))}{\hat\sigma(\eta(z))}$$

      where
              $\hat m(\eta)$ is estimated by non-linear regression (e.g., a neural network)
              $\hat\sigma(\eta)$ is estimated by non-linear regression on the residuals

                    $$\log\{\theta_i - \hat m(\eta_i)\}^2 = \log \sigma^2(\eta_i) + \xi_i$$

                                                               [Blum & François, 2009]
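
      A hedged sketch of this location-scale correction, with loess
      standing in for the neural-network regressions (the names are
      assumptions; s holds one scalar summary per simulation, sy the
      observed one):

      R code (illustrative sketch)
      nch=function(theta,s,sy){
        mfit=loess(theta~s)                           # m-hat(eta): conditional mean
        vfit=loess(log((theta-fitted(mfit))^2)~s)     # log sigma^2-hat(eta) from residuals
        my=predict(mfit,sy); vy=predict(vfit,sy)
        my+(theta-fitted(mfit))*exp(vy/2)/exp(fitted(vfit)/2)}  # theta* as above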
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NCH (2)


      Why neural network?
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-NCH (2)


      Why neural network?
              fights curse of dimensionality
              selects relevant summary statistics
              provides automated dimension reduction
              offers a model choice capability
              improves upon multinomial logistic
                                                    [Blum & François, 2009]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-MCMC


      Markov chain (θ(t) ) created via the transition function

      $$\theta^{(t+1)} = \begin{cases}
         \theta' \sim K_\omega(\theta'|\theta^{(t)}) & \text{if } x \sim f(x|\theta') \text{ is such that } x = y\\[2pt]
         & \text{and } u \sim \mathcal{U}(0,1) \le \dfrac{\pi(\theta')\,K_\omega(\theta^{(t)}|\theta')}{\pi(\theta^{(t)})\,K_\omega(\theta'|\theta^{(t)})}\,,\\[6pt]
         \theta^{(t)} & \text{otherwise,}
      \end{cases}$$
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-MCMC


      Markov chain (θ(t) ) created via the transition function

      $$\theta^{(t+1)} = \begin{cases}
         \theta' \sim K_\omega(\theta'|\theta^{(t)}) & \text{if } x \sim f(x|\theta') \text{ is such that } x = y\\[2pt]
         & \text{and } u \sim \mathcal{U}(0,1) \le \dfrac{\pi(\theta')\,K_\omega(\theta^{(t)}|\theta')}{\pi(\theta^{(t)})\,K_\omega(\theta'|\theta^{(t)})}\,,\\[6pt]
         \theta^{(t)} & \text{otherwise,}
      \end{cases}$$

      has the posterior π(θ|y ) as stationary distribution
                                                     [Marjoram et al, 2003]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-MCMC (2)

      Algorithm 2 Likelihood-free MCMC sampler
          Use Algorithm 1 to get (θ(0) , z(0) )
          for t = 1 to N do
            Generate θ′ from Kω (·|θ(t−1) ),
            Generate z′ from the likelihood f (·|θ′ ),
            Generate u from U[0,1] ,

            if $u \le \dfrac{\pi(\theta')\,K_\omega(\theta^{(t-1)}|\theta')}{\pi(\theta^{(t-1)})\,K_\omega(\theta'|\theta^{(t-1)})}\,\mathbb{I}_{A_{\epsilon,y}}(z')$ then
               set (θ(t) , z(t) ) = (θ′ , z′ )
            else
               (θ(t) , z(t) ) = (θ(t−1) , z(t−1) ),
            end if
          end for
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


Why does it work?

      Acceptance probability does not involve calculating the likelihood
      and

      $$\frac{\pi_\epsilon(\theta', z'|y)}{\pi_\epsilon(\theta^{(t-1)}, z^{(t-1)}|y)} \times \frac{q(\theta^{(t-1)}|\theta')\,f(z^{(t-1)}|\theta^{(t-1)})}{q(\theta'|\theta^{(t-1)})\,f(z'|\theta')}$$
      $$= \frac{\pi(\theta')\,f(z'|\theta')\,\mathbb{I}_{A_{\epsilon,y}}(z')}{\pi(\theta^{(t-1)})\,f(z^{(t-1)}|\theta^{(t-1)})\,\mathbb{I}_{A_{\epsilon,y}}(z^{(t-1)})} \times \frac{q(\theta^{(t-1)}|\theta')\,f(z^{(t-1)}|\theta^{(t-1)})}{q(\theta'|\theta^{(t-1)})\,f(z'|\theta')}$$
      $$= \frac{\pi(\theta')\,q(\theta^{(t-1)}|\theta')}{\pi(\theta^{(t-1)})\,q(\theta'|\theta^{(t-1)})}\,\mathbb{I}_{A_{\epsilon,y}}(z')$$

      [the f terms cancel, and $\mathbb{I}_{A_{\epsilon,y}}(z^{(t-1)}) = 1$ since the
      previous value was accepted]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


A toy example



      Case of
                        x ∼ 0.5 N (θ, 1) + 0.5 N (−θ, 1)
      under prior θ ∼ N (0, 10)

      ABC sampler
      N=1e5; x=2   # sample size and observed datum (x set to 2, 10, 50 below)
      thetas=rnorm(N,sd=10)                                # simulate from the prior
      zed=sample(c(1,-1),N,rep=TRUE)*thetas+rnorm(N,sd=1)  # simulate pseudo-data
      eps=quantile(abs(zed-x),.01)                         # tolerance: 1% distance quantile
      abc=thetas[abs(zed-x)<eps]                           # keep thetas within tolerance
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


A toy example


      Case of
                        x ∼ 0.5 N (θ, 1) + 0.5 N (−θ, 1)
      under prior θ ∼ N (0, 10)

      ABC-MCMC sampler
      metas=rep(0,N); zed=rep(0,N)             # chain of parameters and pseudo-data
      metas[1]=rnorm(1,sd=10)
      zed[1]=x
      for (t in 2:N){
        metas[t]=rnorm(1,mean=metas[t-1],sd=5)                  # random walk proposal
        zed[t]=rnorm(1,mean=(1-2*(runif(1)<.5))*metas[t],sd=1)  # pseudo-data
        if ((abs(zed[t]-x)>eps)||(runif(1)>dnorm(metas[t],sd=10)/dnorm(metas[t-1],sd=10))){
         metas[t]=metas[t-1]                   # reject: tolerance or prior ratio failed
         zed[t]=zed[t-1]}
        }
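
      The two samplers chain naturally: the rejection run above calibrates eps,
      after which the chain can be monitored (a minimal sketch; the burn-in
      length is an arbitrary choice):

      plot(metas,type="l")                        # trace of the ABC-MCMC chain
      hist(metas[-(1:100)],prob=TRUE,breaks=50)   # histogram after a short burn-in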
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


A toy example


                                                     x = 2

      [Figure: density of the ABC output for θ, plotted over (−4, 4)]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


A toy example


                                                     x = 10

      [Figure: density of the ABC output for θ, plotted over (−10, 10)]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


A toy example


                                                     x = 50

      [Figure: density of the ABC output for θ, plotted over (−40, 40)]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABCµ


                        [Ratmann, Andrieu, Wiuf and Richardson, 2009, PNAS]

      Use of a joint density

                     f (θ, ε|y) ∝ ξ(ε|y, θ) × πθ (θ) × πε (ε)

      where y is the data and ξ(ε|y, θ) is the prior predictive density of
      ρ(η(z), η(y)) given θ and y when z ∼ f (z|θ)
      Warning! Replacement of ξ(ε|y, θ) with a non-parametric kernel
      approximation.
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABCµ details

      Multidimensional distances ρk (k = 1, . . . , K ) and errors
      εk = ρk (ηk (z), ηk (y)), with

        εk ∼ ξk (εk |y, θ) ≈ ξ̂k (εk |y, θ) = (1/Bhk ) Σb K [{εk − ρk (ηk (zb ), ηk (y))}/hk ]

      then used in replacing ξ(ε|y, θ) with mink ξ̂k (εk |y, θ)
      ABCµ involves acceptance probability

        [π(θ′ , ε′ ) q(θ′ , θ) q(ε′ , ε) mink ξ̂k (ε′ |y, θ′ )]
           / [π(θ, ε) q(θ, θ′ ) q(ε, ε′ ) mink ξ̂k (ε|y, θ)]
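
      As a sketch (not the authors' code), ξ̂ can be built by simulating B
      pseudo-data sets and smoothing the errors with a Gaussian kernel; simz
      and rho below are hypothetical stand-ins for the simulator and distance:

      xihat=function(eps,theta,y,B=100,h=.1,
                     simz=function(theta) rnorm(1,theta),  # hypothetical simulator
                     rho=function(z,y) abs(z-y)){          # hypothetical distance
        e=replicate(B,rho(simz(theta),y))                  # B simulated errors
        mean(dnorm((eps-e)/h))/h}                          # (1/Bh) Σb K[(eps-e_b)/h]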
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABCµ multiple errors




      [Figure, © Ratmann et al., PNAS, 2009]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABCµ for model choice




      [Figure, © Ratmann et al., PNAS, 2009]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


Questions about ABCµ


      For each model under comparison, marginal posterior on ε used to
      assess the fit of the model (HPD includes 0 or not).
              Is the data informative about ε? [Identifiability]
              How is the prior π(ε) impacting the comparison?
              How is using both ξ(ε|x0 , θ) and πε (ε) compatible with a
              standard probability model? [remindful of Wilkinson]
              Where is the penalisation for complexity in the model
              comparison?
                                             [X, Mengersen & Chen, 2010, PNAS]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-PRC

      Another sequential version producing a sequence of Markov
                                             (t)          (t)
      transition kernels Kt and of samples (θ1 , . . . , θN ) (1 ≤ t ≤ T )

      ABC-PRC Algorithm
         1. Select θ⋆ at random from the previous θi(t−1) ’s with probabilities
            ωi(t−1) (1 ≤ i ≤ N).
         2. Generate
                        θi(t) ∼ Kt (θ|θ⋆ ) ,   x ∼ f (x|θi(t) ) ,
         3. Check that ρ(x, y) < ε, otherwise start again.

                                                        [Sisson et al., 2007]
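
      In R the three steps collapse to a resample-move-reject loop; the sketch
      below assumes toy-example objects (theta.old, w.old, simx, y, eps) are in
      scope and takes Kt Gaussian with scale tau:

      repeat{                                      # one ABC-PRC particle update
        thstar=sample(theta.old,1,prob=w.old)      # 1. resample from previous pop
        theta.new=rnorm(1,thstar,sd=tau)           # 2. move through Kt(.|thstar)
        if (abs(simx(theta.new)-y)<eps) break}     # 3. tolerance check, else redo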
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


Why PRC?


      Partial rejection control: resample from a population of weighted
      particles by pruning away particles with weights below a threshold c,
      replacing them by new particles obtained by propagating an
      existing particle by an SMC step and modifying the weights
      accordingly.
                                                             [Liu, 2001]
      PRC justification in ABC-PRC:
              Suppose we then implement the PRC algorithm for some
              c > 0 such that only identically zero weights are smaller
              than c
      Trouble is, there is no such c...
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-PRC weight


      Probability ωi(t) computed as

        ωi(t) ∝ π(θi(t) ) Lt−1 (θ⋆ |θi(t) ) {π(θ⋆ ) Kt (θi(t) |θ⋆ )}−1 ,

      where Lt−1 is an arbitrary transition kernel.
      In the case
                        Lt−1 (θ′ |θ) = Kt (θ|θ′ ) ,
      all weights are equal under a uniform prior.
      Inspired from Del Moral et al. (2006), who use backward kernels
      Lt−1 in SMC to achieve unbiasedness
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-PRC bias

                          Lack of unbiasedness of the method
      Joint density of the accepted pair (θ(t−1) , θ(t) ) proportional to

              π(θ(t−1) |y) Kt (θ(t) |θ(t−1) ) f (y|θ(t) ) ,

      so, for an arbitrary function h(θ), E[ωt h(θ(t) )] is proportional to

        ∫ h(θ(t) ) [π(θ(t) ) Lt−1 (θ(t−1) |θ(t) ) / π(θ(t−1) ) Kt (θ(t) |θ(t−1) )]
              π(θ(t−1) |y) Kt (θ(t) |θ(t−1) ) f (y|θ(t) ) dθ(t−1) dθ(t)

        ∝ ∫ h(θ(t) ) [π(θ(t) ) Lt−1 (θ(t−1) |θ(t) ) / π(θ(t−1) ) Kt (θ(t) |θ(t−1) )]
              π(θ(t−1) ) f (y|θ(t−1) ) Kt (θ(t) |θ(t−1) ) f (y|θ(t) ) dθ(t−1) dθ(t)

        ∝ ∫ h(θ(t) ) π(θ(t) |y) [∫ Lt−1 (θ(t−1) |θ(t) ) f (y|θ(t−1) ) dθ(t−1) ] dθ(t) ,

      which is not proportional to E[h(θ(t) )|y] unless the inner integral is
      constant in θ(t) .
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


A mixture example (1)

      Toy model of Sisson et al. (2007): if

              θ ∼ U(−10, 10) ,               x|θ ∼ 0.5 N (θ, 1) + 0.5 N (θ, 1/100) ,

      then the posterior distribution associated with y = 0 is the normal
      mixture
                   θ|y = 0 ∼ 0.5 N (0, 1) + 0.5 N (0, 1/100)
      restricted to [−10, 10].
      Furthermore, true target available as

        π(θ | |x| < ε) ∝ Φ(ε − θ) − Φ(−ε − θ) + Φ(10(ε − θ)) − Φ(−10(ε + θ)) .
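
      This closed form makes the example checkable; a direct transcription in
      R (unnormalised, with the uniform prior support enforced):

      target=function(theta,eps)
        (pnorm(eps-theta)-pnorm(-eps-theta)+
         pnorm(10*(eps-theta))-pnorm(-10*(eps+theta)))*(abs(theta)<=10)
      curve(target(x,eps=.1),-3,3)   # unnormalised ABC target for eps=0.1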
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


“Ugly, squalid graph...”
      [Figure: ABC-PRC output across five iterations, θ ∈ (−3, 3), two rows of panels]

                  Comparison of τ = 0.15 and τ = 1/0.15 in Kt
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


A PMC version
      Use of the same kernel idea as ABC-PRC but with IS correction
      ( check PMC )
      Generate a sample at iteration t by

        π̂t (θ(t) ) ∝ Σj=1..N ωj(t−1) Kt (θ(t) |θj(t−1) )

      modulo acceptance of the associated xt , and use an importance
      weight associated with an accepted simulation θi(t)

        ωi(t) ∝ π(θi(t) ) / π̂t (θi(t) ) .

      © Still likelihood free
                                                                          [Beaumont et al., 2009]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


Our ABC-PMC algorithm
      Given a decreasing sequence of approximation levels ε1 ≥ . . . ≥ εT ,

         1. At iteration t = 1,
               For i = 1, . . . , N
                   Simulate θi(1) ∼ π(θ) and x ∼ f (x|θi(1) ) until ρ(x, y) < ε1
                   Set ωi(1) = 1/N
               Take τ2² as twice the empirical variance of the θi(1) ’s
         2. At iteration 2 ≤ t ≤ T ,
               For i = 1, . . . , N, repeat
                   Pick θi⋆ from the θj(t−1) ’s with probabilities ωj(t−1)
                   Generate θi(t) |θi⋆ ∼ N (θi⋆ , τt² ) and x ∼ f (x|θi(t) )
               until ρ(x, y) < εt
               Set ωi(t) ∝ π(θi(t) ) / Σj=1..N ωj(t−1) ϕ({θi(t) − θj(t−1) }/τt )
               Take τt+1² as twice the weighted empirical variance of the θi(t) ’s
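
      A compact R sketch of this scheme on the Sisson et al. mixture of the
      previous slides (N, T and the ε-sequence are illustrative choices, not
      the values used here):

      N=1000; T=5; y=0; eps=seq(2,.1,length.out=T)
      simx=function(theta)                        # pseudo-data generator
        theta+ifelse(runif(1)<.5,rnorm(1),rnorm(1,sd=.1))
      theta=numeric(N); w=rep(1/N,N)
      for (i in 1:N) repeat{                      # t=1: rejection from the prior
        theta[i]=runif(1,-10,10)
        if (abs(simx(theta[i])-y)<eps[1]) break}
      tau2=2*var(theta)
      for (t in 2:T){
        old=theta; wold=w
        for (i in 1:N) repeat{                    # move a resampled particle
          theta[i]=rnorm(1,sample(old,1,prob=wold),sqrt(tau2))
          if (abs(simx(theta[i])-y)<eps[t]) break}
        w=dunif(theta,-10,10)/                    # IS correction by the mixture
          sapply(theta,function(th) sum(wold*dnorm(th,old,sqrt(tau2))))
        w=w/sum(w)
        tau2=2*sum(w*(theta-sum(w*theta))^2)}     # twice the weighted variance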
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


Sequential Monte Carlo

      SMC is a simulation technique to approximate a sequence of
      related probability distributions πn with π0 “easy” and πT as
      target.
      Iterated IS as PMC: particles moved from time n − 1 to time n via
      kernel Kn and use of a sequence of extended targets π̃n

        π̃n (z0:n ) = πn (zn ) Πj=0..n−1 Lj (zj+1 , zj )

      where the Lj ’s are backward Markov kernels [check that πn (zn ) is a
      marginal]
                              [Del Moral, Doucet & Jasra, Series B, 2006]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


Sequential Monte Carlo (2)

      Algorithm 3 SMC sampler
          sample zi(0) ∼ γ0 (x) (i = 1, . . . , N)
          compute weights wi(0) = π0 (zi(0) )/γ0 (zi(0) )
          for t = 1 to T do
            if ESS(w(t−1) ) < NT then
               resample N particles z(t−1) and set weights to 1
            end if
            generate zi(t) ∼ Kt (zi(t−1) , ·) and set weights to

               wi(t) = wi(t−1) πt (zi(t) ) Lt−1 (zi(t) , zi(t−1) )
                          / [πt−1 (zi(t−1) ) Kt (zi(t−1) , zi(t) )]

          end for
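
      The resampling test uses the effective sample size, a one-liner in R for
      (possibly unnormalised) weights:

      ess=function(w) 1/sum((w/sum(w))^2)   # N for equal weights, 1 when degenerate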
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-SMC
                                        [Del Moral, Doucet & Jasra, 2009]

      True derivation of an SMC-ABC algorithm
      Use of a kernel Kn associated with target πεn and derivation of the
      backward kernel

        Ln−1 (z, z′ ) = πεn (z′ ) Kn (z′ , z) / πεn (z)

      Update of the weights

        win ∝ wi(n−1) [Σm=1..M IAεn (xin^m )] / [Σm=1..M IAεn−1 (xi(n−1)^m )]

      when xin^m ∼ K (xi(n−1) , ·)
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-SMCM



      Modification: make M repeated simulations of the pseudo-data z
      given the parameter, rather than using a single [M = 1] simulation,
      leading to a weight proportional to the number of accepted zi ’s,

        ω(θ) = (1/M) Σi=1..M I{ρ(η(y), η(zi )) < ε}

      [limit in M means exact simulation from (tempered) target]
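
      In code the weight is just the proportion of replicated pseudo-data sets
      within tolerance; simz and eta below are hypothetical placeholders for
      the simulator and summary, with an absolute-difference distance:

      omega=function(theta,y,M=25,eps=.1,
                     simz=function(theta) rnorm(1,theta),  # hypothetical simulator
                     eta=function(x) x)                    # hypothetical summary
        mean(replicate(M,abs(eta(simz(theta))-eta(y)))<=eps)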
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


Properties of ABC-SMC


      The ABC-SMC method properly uses a backward kernel L(z, z′ ) to
      simplify the importance weight and to remove the dependence on
      the unknown likelihood from this weight. Update of importance
      weights is reduced to the ratio of the proportions of surviving
      particles
      Major assumption: the forward kernel K is supposed to be invariant
      against the true target [tempered version of the true posterior]
      Adaptivity in ABC-SMC algorithm only found in on-line
      construction of the thresholds εt , slowly enough to keep a large
      number of accepted transitions
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


A mixture example (2)
      Recovery of the target, whether using a fixed standard deviation of
      τ = 0.15 or τ = 1/0.15, or a sequence of adaptive τt ’s.
      [Figure: recovery of the mixture target across iterations, θ ∈ (−3, 3),
      for the three choices of scale]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


Yet another ABC-PRC
      Another version of ABC-PRC called PRC-ABC with a proposal
      distribution Mt , S replications of the pseudo-data, and a PRC step:
      PRC-ABC Algorithm
         1. Select θ⋆ at random from the θi(t−1) ’s with probabilities ωi(t−1)
         2. Generate θi(t) ∼ Kt , x1 , . . . , xS ∼iid f (x|θi(t) ), set

              ωi(t) = Σs Kt {η(xs ) − η(y)} / Kt (θi(t) ) ,

            and accept with probability p(i) = ωi(t) /ct,N
         3. Set ωi(t) ∝ ωi(t) /p(i) (1 ≤ i ≤ N)

                            [Peters, Sisson & Fan, Stat & Computing, 2012]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


Yet another ABC-PRC



      Another version of ABC-PRC called PRC-ABC with a proposal
      distribution Mt , S replications of the pseudo-data, and a PRC step:
              Use of a NP estimate Mt (θ) = Σi ωi(t−1) ψ(θ|θi(t−1) ) (with a
              fixed variance?!)
              S = 1 recommended for “greatest gains in sampler
              performance”
              ct,N upper quantile of non-zero weights at time t
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


Wilkinson’s exact BC
      ABC approximation error (i.e. non-zero tolerance) replaced with
      exact simulation from a controlled approximation to the target,
      convolution of true posterior with kernel function

        πε (θ, z|y) = π(θ) f (z|θ) Kε (y − z) / ∫ π(θ) f (z|θ) Kε (y − z) dz dθ ,

      with Kε kernel parameterised by bandwidth ε.
                                                           [Wilkinson, 2008]
      Theorem
      The ABC algorithm based on the assumption of a randomised
      observation ỹ = y + ξ, ξ ∼ Kε , and an acceptance probability of

                               Kε (y − z)/M

      gives draws from the posterior distribution π(θ|y).
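
      Under a Gaussian kernel Kε the theorem's acceptance step is immediate,
      since the kernel maximum M is attained at zero (a sketch, not
      Wilkinson's code):

      accept=function(y,z,eps)                    # accept w.p. K_eps(y-z)/M
        runif(1)<dnorm(y-z,sd=eps)/dnorm(0,sd=eps)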
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


How exact a BC?




              “Using ε to represent measurement error is
              straightforward, whereas using ε to model the model
              discrepancy is harder to conceptualize and not as
              commonly used”
                                              [Richard Wilkinson, 2008]
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


How exact a BC?
      Pros
              Pseudo-data from true model and observed data from noisy
              model
              Interesting perspective in that outcome is completely
              controlled
              Link with ABCµ and assuming y is observed with a
              measurement error with density Kε
              Relates to the theory of model approximation
                                                    [Kennedy  O’Hagan, 2001]
      Cons
              Requires Kε to be bounded by M
              True approximation error never assessed
              Requires a modification of the standard ABC algorithm
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC for HMMs


      Specific case of a hidden Markov model

                              Xt+1 ∼ Qθ (Xt , ·)
                              Yt+1 ∼ gθ (·|xt+1 )

      where only y⁰1:n is observed.
                                 [Dean, Singh, Jasra & Peters, 2011]
      Use of specific constraints, adapted to the Markov structure:

                  y1 ∈ B(y⁰1 , ε) × · · · × yn ∈ B(y⁰n , ε)
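
      The constraint is an all-coordinates check rather than a single global
      distance; with y0 the observed series and y a simulated one (a minimal
      sketch):

      hmm.accept=function(y,y0,eps) all(abs(y-y0)<eps)   # every yt in its ball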
MCMC and likelihood-free methods Part/day II: ABC
  Approximate Bayesian computation
     Alphabet soup


ABC-MLE for HMMs

      ABC-MLE defined by

        θ̂n = arg maxθ Pθ (Y1 ∈ B(y⁰1 , ε), . . . , Yn ∈ B(y⁰n , ε))

      Exact MLE for the likelihood [same basis as Wilkinson!]

                         pθ (y⁰1 , . . . , y⁰n )

      corresponding to the perturbed process

                  (xt , yt + εzt )1≤t≤n ,   zt ∼ U(B(0, 1))

                                 [Dean, Singh, Jasra & Peters, 2011]
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 

MCMC & likelihood-free methods II

  • 1. MCMC and likelihood-free methods Part/day II: ABC. Christian P. Robert, Université Paris-Dauphine, IuF, & CREST. Monash University, EBS, July 18, 2012
  • 2. MCMC and likelihood-free methods Part/day II: ABC Outline simulation-based methods in Econometrics Genetics of ABC Approximate Bayesian computation ABC for model choice ABC model choice consistency
  • 3. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Econ’ections simulation-based methods in Econometrics Genetics of ABC Approximate Bayesian computation ABC for model choice ABC model choice consistency
  • 4. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Usages of simulation in Econometrics Similar exploration of simulation-based techniques in Econometrics: simulated method of moments; method of simulated moments; simulated pseudo-maximum-likelihood; indirect inference [Gouriéroux & Monfort, 1996]
  • 5. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Simulated method of moments Given observations $y^o_{1:n}$ from a model $y_t = r(y_{1:(t-1)}, \epsilon_t, \theta)$, $\epsilon_t \sim g(\cdot)$, simulate $\epsilon_{1:n}$, derive $y_t(\theta) = r(y_{1:(t-1)}, \epsilon_t, \theta)$ and estimate $\theta$ by $\arg\min_\theta \sum_{t=1}^n (y^o_t - y_t(\theta))^2$
  • 6. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Simulated method of moments (as above) and estimate $\theta$ by $\arg\min_\theta \big(\sum_{t=1}^n y^o_t - \sum_{t=1}^n y_t(\theta)\big)^2$
  • 7. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Method of simulated moments Given a statistic vector $K(y)$ with $E_\theta[K(Y_t)\,|\,y_{1:(t-1)}] = k(y_{1:(t-1)};\theta)$, find an unbiased estimator of $k(y_{1:(t-1)};\theta)$, $\tilde k(\epsilon_t, y_{1:(t-1)};\theta)$. Estimate $\theta$ by $\arg\min_\theta \sum_{t=1}^n \big\{K(y_t) - \sum_{s=1}^S \tilde k(\epsilon^s_t, y_{1:(t-1)};\theta)/S\big\}$ [Pakes & Pollard, 1989]
  • 8. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Indirect inference Minimise (in $\theta$) the distance between estimators $\hat\beta$ based on pseudo-models for genuine observations and for observations simulated under the true model and the parameter $\theta$. [Gouriéroux, Monfort & Renault, 1993; Smith, 1993; Gallant & Tauchen, 1996]
  • 9. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Indirect inference (PML vs. PSE) Example of the pseudo-maximum-likelihood (PML): $\hat\beta(y) = \arg\max_\beta \sum_t \log f(y_t|\beta, y_{1:(t-1)})$, leading to $\arg\min_\theta \|\hat\beta(y^o) - \hat\beta(y_1(\theta), \ldots, y_S(\theta))\|^2$ when $y_s(\theta) \sim f(y|\theta)$, $s = 1, \ldots, S$
  • 10. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Indirect inference (PML vs. PSE) Example of the pseudo-score-estimator (PSE): $\hat\beta(y) = \arg\min_\beta \sum_t \big(\frac{\partial \log f}{\partial\beta}(y_t|\beta, y_{1:(t-1)})\big)^2$, leading to $\arg\min_\theta \|\hat\beta(y^o) - \hat\beta(y_1(\theta), \ldots, y_S(\theta))\|^2$ when $y_s(\theta) \sim f(y|\theta)$, $s = 1, \ldots, S$
  • 11. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Consistent indirect inference ...in order to get a unique solution the dimension of the auxiliary parameter β must be larger than or equal to the dimension of the initial parameter θ. If the problem is just identified the different methods become easier...
  • 12. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Consistent indirect inference ...in order to get a unique solution the dimension of the auxiliary parameter β must be larger than or equal to the dimension of the initial parameter θ. If the problem is just identified the different methods become easier... Consistency depending on the criterion and on the asymptotic identifiability of θ [Gouriéroux & Monfort, 1996, p. 66]
  • 13. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics AR(2) vs. MA(1) example True (MA) model $y_t = \epsilon_t - \theta\epsilon_{t-1}$ and [wrong!] auxiliary (AR) model $y_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + u_t$. R code:
    x=eps=rnorm(250)
    x[2:250]=x[2:250]-0.5*x[1:249]
    simeps=rnorm(250)
    propeta=seq(-.99,.99,le=199)
    dist=rep(0,199)
    bethat=as.vector(arima(x,c(2,0,0),incl=FALSE)$coef)
    for (t in 1:199)
      dist[t]=sum((as.vector(arima(c(simeps[1],simeps[2:250]-propeta[t]*
        simeps[1:249]),c(2,0,0),incl=FALSE)$coef)-bethat)^2)
  • 14. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics AR(2) vs. MA(1) example One sample: [plot of the indirect-inference distance against the candidate MA coefficient]
  • 15. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics AR(2) vs. MA(1) example Many samples: [density plot of the resulting estimates over repeated samples]
  • 16. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics Choice of pseudo-model Pick model such that 1. $\hat\beta(\theta)$ is not flat (i.e. sensitive to changes in $\theta$) and 2. $\hat\beta(\theta)$ is not dispersed (i.e. robust against changes in $y_s(\theta)$) [Frigessi & Heggland, 2004]
  • 17. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics ABC using indirect inference (1) We present a novel approach for developing summary statistics for use in approximate Bayesian computation (ABC) algorithms by using indirect inference (...) In the indirect inference approach to ABC the parameters of an auxiliary model fitted to the data become the summary statistics. Although applicable to any ABC technique, we embed this approach within a sequential Monte Carlo algorithm that is completely adaptive and requires very little tuning (...) [Drovandi, Pettitt & Faddy, 2011] © Indirect inference provides summary statistics for ABC...
  • 18. MCMC and likelihood-free methods Part/day II: ABC simulation-based methods in Econometrics ABC using indirect inference (2) ...the above result shows that, in the limit as h → 0, ABC will be more accurate than an indirect inference method whose auxiliary statistics are the same as the summary statistic that is used for ABC (...) Initial analysis showed that which method is more accurate depends on the true value of θ. [Fearnhead and Prangle, 2012] © Indirect inference provides estimates rather than global inference...
  • 19. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Genetics of ABC simulation-based methods in Econometrics Genetics of ABC Approximate Bayesian computation ABC for model choice ABC model choice consistency
  • 20. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Genetic background of ABC ABC is a recent computational technique that only requires being able to sample from the likelihood f (·|θ) This technique stemmed from population genetics models, about 15 years ago, and population geneticists still contribute significantly to methodological developments of ABC. [Griffith & al., 1997; Tavaré & al., 1999]
  • 21. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Population genetics [Part derived from the teaching material of Raphael Leblois, ENS Lyon, November 2010] Describe the genotypes, estimate the allele frequencies, determine their distribution among individuals, populations and between populations; predict and understand the evolution of gene frequencies in populations as a result of various factors. © Analyses the effect of various evolutionary forces (mutation, drift, migration, selection) on the evolution of gene frequencies in time and space.
  • 22. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC A[B]ctors The mechanisms of evolution: towards an explanation of the mechanisms of evolutionary change, mathematical models of evolution started to be developed in the years 1920-1930: Ronald A. Fisher (1890-1962), Sewall Wright (1889-1988), John B.S. Haldane (1892-1964)
  • 23. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC The Wright-Fisher model A population of constant size, in which individuals reproduce at the same time. Each gene in a generation is a copy of a gene of the previous generation. In the absence of mutation and selection, the allele frequencies drift (increase and decrease) inevitably until the fixation of an allele; drift thus leads to the loss of genetic variation within populations.
  • 24. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Coalescent theory [Kingman, 1982; Tajima, Tavaré, &tc] Coalescence theory is interested in the genealogy of a sample of genes: it goes back in time to the common ancestor of the sample.
  • 25. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Common ancestor The different lineages merge (coalesce) as one goes back into the past, up to the common ancestor of a sample of genes; this models the genetic-drift process backward in time (time of coalescence T).
  • 26. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Neutral mutations Under the assumption of neutrality of the genetic markers under study, the mutations are independent of the genealogy, i.e. the genealogy depends only on the demographic processes. The genealogy is thus built according to the demographic parameters (e.g. N), then the mutations are added a posteriori on the different branches, from the MRCA down to the leaves of the tree. This produces polymorphism data under the demographic and mutational models under consideration.
  • 27. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Demo-genetic inference Each model is characterized by a set of parameters θ that cover historical (divergence times, admixture times, ...), demographic (population sizes, admixture rates, migration rates, ...) and genetic (mutation rate, ...) factors. The goal is to estimate these parameters from a dataset of polymorphism (DNA sample) y observed at the present time. Problem: most of the time, we cannot calculate the likelihood of the polymorphism data f (y|θ).
  • 28. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Intractable likelihood Missing (too missing!) data structure: $f(y|\theta) = \int_G f(y|G,\theta)\,f(G|\theta)\,dG$. The genealogies are considered as nuisance parameters. This problem thus differs from the phylogenetic approach, where the tree is the parameter of interest.
  • 29. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC A genuine example of application [ABC applied to pygmy populations] Pygmy populations: do they have a common origin? Are there many exchanges between pygmy and non-pygmy populations?
  • 30. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Scenarios under competition Different possible scenarios; scenario choice by ABC [Verdu et al., 2009]
  • 31. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Simulation results © Scenario 1A is chosen: scenario 1a is largely supported relative to the others, which argues for a common origin of the West African pygmy populations [Verdu et al., 2009]
  • 32. MCMC and likelihood-free methods Part/day II: ABC Genetics of ABC Most likely scenario Evolutionary scenario: a story is "told" from these inferences [Verdu et al., 2009]
  • 33. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Approximate Bayesian computation simulation-based methods in Econometrics Genetics of ABC Approximate Bayesian computation ABC for model choice ABC model choice consistency
  • 34. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Intractable likelihoods Cases when the likelihood function $f(y|\theta)$ is unavailable and when the completion step $f(y|\theta) = \int_Z f(y,z|\theta)\,dz$ is impossible or too costly because of the dimension of z © MCMC cannot be implemented!
  • 35. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Intractable likelihoods (slide repeated) Cases when the likelihood function $f(y|\theta)$ is unavailable and when the completion step $f(y|\theta) = \int_Z f(y,z|\theta)\,dz$ is impossible or too costly because of the dimension of z © MCMC cannot be implemented!
  • 36. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Illustrations Example () Stochastic volatility model: for t = 1, ..., T, $y_t = \exp(z_t)\,\epsilon_t$, $z_t = a + b z_{t-1} + \sigma\eta_t$; T very large makes it difficult to include z within the simulated parameters [plot: highest weight trajectories]
  • 37. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Illustrations Example () Potts model: if y takes values on a grid Y of size $k^n$ and $f(y|\theta) \propto \exp\big(\theta \sum_{l\sim i} \mathbb{I}_{y_l = y_i}\big)$ where $l\sim i$ denotes a neighbourhood relation, n moderately large prohibits the computation of the normalising constant
  • 38. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Illustrations Example (Genesis) Phylogenetic tree: in population genetics, reconstitution of a common ancestor from a sample of genes via a phylogenetic tree that is close to impossible to integrate out [100 processor days with 4 parameters] [Cornuet et al., 2009, Bioinformatics]
  • 39. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics The ABC method Bayesian setting: target is π(θ)f (x|θ)
  • 40. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics The ABC method Bayesian setting: target is π(θ)f (x|θ) When likelihood f (x|θ) not in closed form, likelihood-free rejection technique:
  • 41. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics The ABC method Bayesian setting: target is π(θ)f (x|θ) When likelihood f (x|θ) not in closed form, likelihood-free rejection technique: ABC algorithm For an observation y ∼ f (y|θ), under the prior π(θ), keep jointly simulating θ ∼ π(θ), z ∼ f (z|θ ), until the auxiliary variable z is equal to the observed value, z = y. [Tavaré et al., 1997]
  • 42. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Why does it work?! The proof is trivial: $f(\theta_i) \propto \sum_{z\in D} \pi(\theta_i)\,f(z|\theta_i)\,\mathbb{I}_y(z) \propto \pi(\theta_i)\,f(y|\theta_i) = \pi(\theta_i|y)$. [Accept–Reject 101]
  • 43. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Earlier occurrence ‘Bayesian statistics and Monte Carlo methods are ideally suited to the task of passing many models over one dataset’ [Don Rubin, Annals of Statistics, 1984] Note Rubin (1984) does not promote this algorithm for likelihood-free simulation but frequentist intuition on posterior distributions: parameters from posteriors are more likely to be those that could have generated the data.
  • 44. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics A as A...pproximative When y is a continuous random variable, equality z = y is replaced with a tolerance condition, $\rho(y,z) \le \epsilon$, where $\rho$ is a distance
  • 45. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics A as A...pproximative (as above) Output distributed from $\pi(\theta)\,P_\theta\{\rho(y,z) < \epsilon\} \propto \pi(\theta\,|\,\rho(y,z) < \epsilon)$ [Pritchard et al., 1999]
  • 46. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics ABC algorithm Algorithm 1: Likelihood-free rejection sampler
    for i = 1 to N do
      repeat
        generate θ' from the prior distribution π(·)
        generate z from the likelihood f(·|θ')
      until ρ{η(z), η(y)} ≤ ε
      set θi = θ'
    end for
    where η(y) defines a (not necessarily sufficient) statistic
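A minimal R sketch of Algorithm 1 on a toy model (the toy choices are mine, not the slides': prior θ ~ N(0, 10²), data y ~ N(θ, 1), summary η = sample mean, distance ρ = absolute difference):

    # likelihood-free rejection sampler (Algorithm 1), toy version
    abc_reject = function(y, N = 1000, eps = 0.1) {
      out = numeric(N)
      for (i in 1:N) repeat {
        theta = rnorm(1, sd = 10)               # theta' ~ pi(.)
        z = rnorm(length(y), mean = theta)      # z ~ f(.|theta')
        if (abs(mean(z) - mean(y)) <= eps) {    # rho{eta(z), eta(y)} <= eps
          out[i] = theta; break
        }
      }
      out
    }
    post = abc_reject(rnorm(50, mean = 2))      # usage on simulated data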
  • 47. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Output The likelihood-free algorithm samples from the marginal in z of $\pi_\epsilon(\theta,z|y) = \frac{\pi(\theta)\,f(z|\theta)\,\mathbb{I}_{A_{\epsilon,y}}(z)}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)\,f(z|\theta)\,dz\,d\theta}$, where $A_{\epsilon,y} = \{z \in D \,|\, \rho(\eta(z),\eta(y)) < \epsilon\}$.
  • 48. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Output (as above) The idea behind ABC is that the summary statistics coupled with a small tolerance should provide a good approximation of the posterior distribution: $\pi_\epsilon(\theta|y) = \int \pi_\epsilon(\theta,z|y)\,dz \approx \pi(\theta|y)$.
  • 49. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Convergence of ABC (first attempt) What happens when $\epsilon \to 0$?
  • 50. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Convergence of ABC (first attempt) If $f(\cdot|\theta)$ is continuous in y, uniformly in θ [!], given an arbitrary $\delta > 0$, there exists $\epsilon_0$ such that $\epsilon < \epsilon_0$ implies $\frac{\int \pi(\theta)\,f(z|\theta)\,\mathbb{I}_{A_{\epsilon,y}}(z)\,dz}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)\,f(z|\theta)\,dz\,d\theta} \in \frac{\pi(\theta)\,f(y|\theta)\,(1\pm\delta)\,\mu(B_\epsilon)}{\int_\Theta \pi(\theta)\,f(y|\theta)\,d\theta\,(1\pm\delta)\,\mu(B_\epsilon)}$
  • 51. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Convergence of ABC (first attempt) (same bound, with the $\mu(B_\epsilon)$ terms cancelling out)
  • 52. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Convergence of ABC (first attempt) (same bound) [Proof extends to other continuous-in-0 kernels K]
  • 53. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Convergence of ABC (second attempt) What happens when $\epsilon \to 0$?
  • 54. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Convergence of ABC (second attempt) For $B \subset \Theta$, we have $\int_B \frac{\int_{A_{\epsilon,y}} f(z|\theta)\,dz}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)f(z|\theta)\,dz\,d\theta}\,\pi(\theta)\,d\theta = \int_{A_{\epsilon,y}} \frac{\int_B f(z|\theta)\pi(\theta)\,d\theta}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)f(z|\theta)\,dz\,d\theta}\,dz = \int_{A_{\epsilon,y}} \frac{\int_B f(z|\theta)\pi(\theta)\,d\theta}{m(z)}\,\frac{m(z)}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)f(z|\theta)\,dz\,d\theta}\,dz = \int_{A_{\epsilon,y}} \pi(B|z)\,\frac{m(z)}{\int_{A_{\epsilon,y}\times\Theta}\pi(\theta)f(z|\theta)\,dz\,d\theta}\,dz$, which indicates convergence for a continuous $\pi(B|z)$.
  • 55. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Probit modelling on Pima Indian women Example (R benchmark) 200 Pima Indian women with observed variables plasma glucose concentration in oral glucose tolerance test diastolic blood pressure diabetes pedigree function presence/absence of diabetes
  • 56. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Probit modelling on Pima Indian women Example (R benchmark) 200 Pima Indian women with observed variables plasma glucose concentration in oral glucose tolerance test diastolic blood pressure diabetes pedigree function presence/absence of diabetes Probability of diabetes function of above variables P(y = 1|x) = Φ(x1 β1 + x2 β2 + x3 β3 ) ,
  • 57. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Probit modelling on Pima Indian women Example (R benchmark) 200 Pima Indian women with observed variables: plasma glucose concentration in oral glucose tolerance test; diastolic blood pressure; diabetes pedigree function; presence/absence of diabetes. Probability of diabetes function of above variables: $P(y = 1|x) = \Phi(x_1\beta_1 + x_2\beta_2 + x_3\beta_3)$. Test of H0: β3 = 0 for 200 observations of Pima.tr based on a g-prior modelling: $\beta \sim N_3(0, n(X^TX)^{-1})$
  • 58. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Probit modelling on Pima Indian women (as above) Use of importance function inspired from the MLE estimate distribution: $\beta \sim N(\hat\beta, \hat\Sigma)$
  • 59. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Pima Indian benchmark Figure: Comparison between density estimates of the marginals on β1 (left), β2 (center) and β3 (right) from ABC rejection samples (red) and MCMC samples (black).
  • 60. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics MA example Back to the MA(q) model $x_t = \epsilon_t + \sum_{i=1}^q \vartheta_i\epsilon_{t-i}$. Simple prior: uniform over the inverse [real and complex] roots in $Q(u) = 1 - \sum_{i=1}^q \vartheta_i u^i$ under the identifiability conditions
  • 61. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics MA example Back to the MA(q) model $x_t = \epsilon_t + \sum_{i=1}^q \vartheta_i\epsilon_{t-i}$. Simple prior: uniform prior over the identifiability zone, e.g. triangle for MA(2)
  • 62. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics MA example (2) ABC algorithm thus made of 1. picking a new value $(\vartheta_1, \vartheta_2)$ in the triangle, 2. generating an iid sequence $(\epsilon_t)_{-q < t \le T}$, 3. producing a simulated series $(x_t')_{1\le t\le T}$
  • 63. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics MA example (2) (same steps) Distance: basic distance between the series $\rho\big((x_t')_{1\le t\le T}, (x_t)_{1\le t\le T}\big) = \sum_{t=1}^T (x_t - x_t')^2$ or distance between summary statistics like the q autocorrelations $\tau_j = \sum_{t=j+1}^T x_t x_{t-j}$
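A hedged R sketch of these three steps and of both distances for the MA(2) case (the stand-in data, the rejection sampling over the identifiability triangle, and all names are illustrative assumptions, not the slides' own code):

    T = 100; q = 2
    xobs = arima.sim(n = T, list(ma = c(0.6, 0.2)))   # stand-in observed series
    tau = function(x) sapply(1:q, function(j) sum(x[(j+1):T] * x[1:(T-j)]))
    repeat {  # step 1: uniform draw over the MA(2) identifiability triangle
      th = c(runif(1, -2, 2), runif(1, -1, 1))
      if (th[1] + th[2] > -1 && th[2] - th[1] > -1) break
    }
    epsim = rnorm(T + q)                              # step 2: iid noise
    xsim = epsim[3:(T+2)] + th[1]*epsim[2:(T+1)] + th[2]*epsim[1:T]   # step 3
    d_raw = sum((xobs - xsim)^2)                      # distance between series
    d_tau = sum((tau(xobs) - tau(xsim))^2)            # distance between summaries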
  • 64. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Comparison of distance impact Evaluation of the tolerance on the ABC sample against both distances ($\epsilon$ = 100%, 10%, 1%, 0.1%) for an MA(2) model
  • 65. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Comparison of distance impact [density estimates of θ1 and θ2 for the decreasing tolerances] Evaluation of the tolerance on the ABC sample against both distances ($\epsilon$ = 100%, 10%, 1%, 0.1%) for an MA(2) model
  • 66. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Comparison of distance impact (same figure repeated)
  • 67. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics Homonymy The ABC algorithm is not to be confused with the ABC algorithm The Artificial Bee Colony algorithm is a swarm based meta-heuristic algorithm that was introduced by Karaboga in 2005 for optimizing numerical problems. It was inspired by the intelligent foraging behavior of honey bees. The algorithm is specifically based on the model proposed by Tereshko and Loengarov (2005) for the foraging behaviour of honey bee colonies. The model consists of three essential components: employed and unemployed foraging bees, and food sources. The first two components, employed and unemployed foraging bees, search for rich food sources (...) close to their hive. The model also defines two leading modes of behaviour (...): recruitment of foragers to rich food sources resulting in positive feedback and abandonment of poor sources by foragers causing negative feedback. [Karaboga, Scholarpedia]
  • 68. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics ABC advances Simulating from the prior is often poor in efficiency
  • 69. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics ABC advances Simulating from the prior is often poor in efficiency Either modify the proposal distribution on θ to increase the density of x’s within the vicinity of y ... [Marjoram et al, 2003; Bortot et al., 2007, Sisson et al., 2007]
  • 70. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics ABC advances Simulating from the prior is often poor in efficiency Either modify the proposal distribution on θ to increase the density of x's within the vicinity of y... [Marjoram et al, 2003; Bortot et al., 2007; Sisson et al., 2007] ...or by viewing the problem as a conditional density estimation and by developing techniques to allow for larger $\epsilon$ [Beaumont et al., 2002]
  • 71. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation ABC basics ABC advances (as above) ...or even by including $\epsilon$ in the inferential framework [ABCµ] [Ratmann et al., 2009]
  • 72. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NP Better usage of [prior] simulations by adjustment: instead of throwing away θ' such that $\rho(\eta(z), \eta(y)) > \epsilon$, replace θ's with locally regressed transforms (use with BIC) $\theta^* = \theta - \{\eta(z) - \eta(y)\}^T\hat\beta$ [Csilléry et al., TEE, 2010], where $\hat\beta$ is obtained by [NP] weighted least square regression on $(\eta(z) - \eta(y))$ with weights $K_\delta\{\rho(\eta(z), \eta(y))\}$ [Beaumont et al., 2002, Genetics]
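A hedged R sketch of this local-linear adjustment for a scalar summary (the Epanechnikov choice of K_δ and all names are mine):

    # ABC regression adjustment a la Beaumont et al. (2002), scalar eta
    abc_adjust = function(theta, eta_z, eta_y, delta) {
      d = abs(eta_z - eta_y)                     # rho(eta(z), eta(y))
      w = pmax(1 - (d / delta)^2, 0)             # K_delta weights
      keep = w > 0
      fit = lm(theta[keep] ~ I(eta_z[keep] - eta_y), weights = w[keep])
      # theta* = theta - {eta(z) - eta(y)} beta-hat
      theta[keep] - coef(fit)[2] * (eta_z[keep] - eta_y)
    }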
  • 73. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NP (regression) Also found in the subsequent literature, e.g. in Fearnhead-Prangle (2012): weight directly simulation by $K_\delta\{\rho(\eta(z(\theta)), \eta(y))\}$ or $\frac1S\sum_{s=1}^S K_\delta\{\rho(\eta(z_s(\theta)), \eta(y))\}$ [consistent estimate of $f(\eta|\theta)$]
  • 74. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NP (regression) (as above) Curse of dimensionality: poor estimate when $d = \dim(\eta)$ is large...
  • 75. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NP (density estimation) Use of the kernel weights $K_\delta\{\rho(\eta(z(\theta)), \eta(y))\}$ leads to the NP estimate of the posterior expectation $\sum_i \theta_i K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}\big/\sum_i K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}$ [Blum, JASA, 2010]
  • 76. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NP (density estimation) Use of the kernel weights $K_\delta\{\rho(\eta(z(\theta)), \eta(y))\}$ leads to the NP estimate of the posterior conditional density $\sum_i \tilde K_b(\theta_i - \theta)\,K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}\big/\sum_i K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}$ [Blum, JASA, 2010]
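Both estimates are one-liners in R; a hedged sketch with Epanechnikov K_δ and Gaussian K_b (kernel choices and names are mine), where d holds the distances ρ(η(z(θi)), η(y)) over the reference table:

    Kdelta = function(d, delta) pmax(1 - (d / delta)^2, 0)
    # NP (Nadaraya-Watson) estimate of the posterior expectation
    post_mean = function(theta, d, delta)
      sum(Kdelta(d, delta) * theta) / sum(Kdelta(d, delta))
    # NP estimate of the posterior density at a point t
    post_dens = function(t, theta, d, delta, b)
      sum(dnorm((theta - t) / b) * Kdelta(d, delta)) /
        (b * sum(Kdelta(d, delta)))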
  • 77. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NP (density estimations) Other versions incorporating regression adjustments: $\sum_i \tilde K_b(\theta_i^* - \theta)\,K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}\big/\sum_i K_\delta\{\rho(\eta(z(\theta_i)), \eta(y))\}$
  • 78. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NP (density estimations) (as above) In all cases, error $E[\hat g(\theta|y)] - g(\theta|y) = cb^2 + c\delta^2 + O_P(b^2 + \delta^2) + O_P(1/n\delta^d)$, $\mathrm{var}(\hat g(\theta|y)) = \frac{c}{nb\delta^d}(1 + o_P(1))$ [Blum, JASA, 2010]
  • 79. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NP (density estimations) (same expansions) [standard NP calculations]
  • 80. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NCH Incorporating non-linearities and heteroscedasticities: $\theta^* = \hat m(\eta(y)) + [\theta - \hat m(\eta(z))]\,\frac{\hat\sigma(\eta(y))}{\hat\sigma(\eta(z))}$
  • 81. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NCH (as above) where $\hat m(\eta)$ estimated by non-linear regression (e.g., neural network) and $\hat\sigma(\eta)$ estimated by non-linear regression on residuals $\log\{\theta_i - \hat m(\eta_i)\}^2 = \log\sigma^2(\eta_i) + \xi_i$ [Blum & François, 2009]
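A hedged sketch of the NCH adjustment, with loess standing in for the neural-network regressions of Blum & François (2009) (that substitution and all names are my assumptions):

    nch_adjust = function(theta, eta_z, eta_y) {
      m = loess(theta ~ eta_z)                   # m-hat by NP regression
      r2 = log((theta - fitted(m))^2)            # log squared residuals
      s = loess(r2 ~ eta_z)                      # NP regression for log sigma^2
      sig = function(e) exp(predict(s, e) / 2)   # sigma-hat
      # theta* = m-hat(eta(y)) + [theta - m-hat(eta(z))] sigma ratio
      predict(m, eta_y) + (theta - fitted(m)) * sig(eta_y) / sig(eta_z)
    }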
  • 82. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NCH (2) Why neural network?
  • 83. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-NCH (2) Why neural network? fights curse of dimensionality; selects relevant summary statistics; provides automated dimension reduction; offers a model choice capability; improves upon multinomial logistic [Blum & François, 2009]
  • 84. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-MCMC Markov chain $(\theta^{(t)})$ created via the transition function: $\theta^{(t+1)} = \theta'$ if $\theta' \sim K_\omega(\theta'|\theta^{(t)})$, $x \sim f(x|\theta')$ is such that $x = y$ and $u \sim U(0,1) \le \frac{\pi(\theta')\,K_\omega(\theta^{(t)}|\theta')}{\pi(\theta^{(t)})\,K_\omega(\theta'|\theta^{(t)})}$; $\theta^{(t+1)} = \theta^{(t)}$ otherwise
  • 85. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-MCMC (same transition) has the posterior $\pi(\theta|y)$ as stationary distribution [Marjoram et al, 2003]
  • 86. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-MCMC (2) Algorithm 2: Likelihood-free MCMC sampler
    Use Algorithm 1 to get (θ(0), z(0))
    for t = 1 to N do
      Generate θ' from Kω(·|θ(t−1)),
      Generate z' from the likelihood f(·|θ'),
      Generate u from U[0,1],
      if u ≤ [π(θ')Kω(θ(t−1)|θ') / π(θ(t−1))Kω(θ'|θ(t−1))] I_{A_{ε,y}}(z') then
        set (θ(t), z(t)) = (θ', z')
      else
        (θ(t), z(t)) = (θ(t−1), z(t−1)),
      end if
    end for
  • 87. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Why does it work? Acceptance probability does not involve calculating the likelihood, since $\frac{\pi_\epsilon(\theta',z'|y)}{\pi_\epsilon(\theta^{(t-1)},z^{(t-1)}|y)} \times \frac{q(\theta^{(t-1)}|\theta')\,f(z^{(t-1)}|\theta^{(t-1)})}{q(\theta'|\theta^{(t-1)})\,f(z'|\theta')} = \frac{\pi(\theta')\,f(z'|\theta')\,\mathbb{I}_{A_{\epsilon,y}}(z')}{\pi(\theta^{(t-1)})\,f(z^{(t-1)}|\theta^{(t-1)})\,\mathbb{I}_{A_{\epsilon,y}}(z^{(t-1)})} \times \frac{q(\theta^{(t-1)}|\theta')\,f(z^{(t-1)}|\theta^{(t-1)})}{q(\theta'|\theta^{(t-1)})\,f(z'|\theta')}$
  • 88. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Why does it work? ...in which the likelihood terms $f(z'|\theta')$ and $f(z^{(t-1)}|\theta^{(t-1)})$ cancel out
  • 89. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Why does it work? ...leaving, since $\mathbb{I}_{A_{\epsilon,y}}(z^{(t-1)}) = 1$ for the current state, $\frac{\pi(\theta')\,q(\theta^{(t-1)}|\theta')}{\pi(\theta^{(t-1)})\,q(\theta'|\theta^{(t-1)})}\,\mathbb{I}_{A_{\epsilon,y}}(z')$
  • 90. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A toy example Case of $x \sim \frac12 N(\theta,1) + \frac12 N(-\theta,1)$ under prior $\theta \sim N(0,10)$
  • 91. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A toy example (as above) ABC sampler
    thetas=rnorm(N,sd=10)
    zed=sample(c(1,-1),N,rep=TRUE)*thetas+rnorm(N,sd=1)
    eps=quantile(abs(zed-x),.01)
    abc=thetas[abs(zed-x)<eps]
  • 92. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A toy example (as above) ABC-MCMC sampler
    metas=rep(0,N)
    metas[1]=rnorm(1,sd=10)
    zed[1]=x
    for (t in 2:N){
      metas[t]=rnorm(1,mean=metas[t-1],sd=5)
      zed[t]=rnorm(1,mean=(1-2*(runif(1)<.5))*metas[t],sd=1)
      if ((abs(zed[t]-x)>eps)||
          (runif(1)>dnorm(metas[t],sd=10)/dnorm(metas[t-1],sd=10))){
        metas[t]=metas[t-1]
        zed[t]=zed[t-1]}
    }
  • 93. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A toy example x =2
  • 94. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A toy example x = 2 [density of the ABC output against θ]
  • 95. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A toy example x = 10
  • 96. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A toy example x = 10 [density of the ABC output against θ]
  • 97. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A toy example x = 50
  • 98. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A toy example x = 50 [density of the ABC output against θ]
  • 99. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABCµ [Ratmann, Andrieu, Wiuf and Richardson, 2009, PNAS] Use of a joint density $f(\theta,\epsilon|y) \propto \xi(\epsilon|y,\theta) \times \pi_\theta(\theta) \times \pi_\epsilon(\epsilon)$, where y is the data, and $\xi(\epsilon|y,\theta)$ is the prior predictive density of $\rho(\eta(z),\eta(y))$ given θ and y when $z \sim f(z|\theta)$
  • 100. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABCµ (as above) Warning! Replacement of $\xi(\epsilon|y,\theta)$ with a non-parametric kernel approximation.
  • 101. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABCµ details Multidimensional distances $\rho_k$ (k = 1, ..., K) and errors $\epsilon_k = \rho_k(\eta_k(z), \eta_k(y))$, with $\epsilon_k \sim \xi_k(\epsilon|y,\theta) \approx \hat\xi_k(\epsilon|y,\theta) = \frac{1}{Bh_k}\sum_b K[\{\epsilon_k - \rho_k(\eta_k(z_b), \eta_k(y))\}/h_k]$, then used in replacing $\xi(\epsilon|y,\theta)$ with $\min_k \hat\xi_k(\epsilon|y,\theta)$
  • 102. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABCµ details (as above) ABCµ involves acceptance probability $\frac{\pi(\theta',\epsilon')}{\pi(\theta,\epsilon)}\,\frac{q(\theta',\theta)\,q(\epsilon',\epsilon)}{q(\theta,\theta')\,q(\epsilon,\epsilon')}\,\frac{\min_k \hat\xi_k(\epsilon'|y,\theta')}{\min_k \hat\xi_k(\epsilon|y,\theta)}$
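A hedged R sketch of the kernel estimate ξ̂_k above, where rho_zb is the vector of ρ_k(η_k(z_b), η_k(y)) over the B pseudo-datasets and the Gaussian choice of K is mine:

    xi_hat = function(eps_k, rho_zb, h_k)
      mean(dnorm((eps_k - rho_zb) / h_k)) / h_k   # (1/Bh_k) sum_b K[(eps_k - rho_b)/h_k]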
  • 103. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABCµ multiple errors [ c Ratmann et al., PNAS, 2009]
  • 104. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABCµ for model choice [ c Ratmann et al., PNAS, 2009]
  • 105. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Questions about ABCµ For each model under comparison, marginal posterior on $\epsilon$ used to assess the fit of the model (HPD includes 0 or not).
  • 106. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Questions about ABCµ (as above) Is the data informative about $\epsilon$? [Identifiability] How is the prior $\pi(\epsilon)$ impacting the comparison? How is using both $\xi(\epsilon|x_0,\theta)$ and $\pi_\epsilon(\epsilon)$ compatible with a standard probability model? [remindful of Wilkinson] Where is the penalisation for complexity in the model comparison? [X, Mengersen & Chen, 2010, PNAS]
  • 107. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-PRC Another sequential version producing a sequence of Markov transition kernels $K_t$ and of samples $(\theta_1^{(t)}, \ldots, \theta_N^{(t)})$ (1 ≤ t ≤ T)
  • 108. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-PRC Algorithm 1. Select $\theta^\star$ at random from previous $\theta_i^{(t-1)}$'s with probabilities $\omega_i^{(t-1)}$ (1 ≤ i ≤ N). 2. Generate $\theta_i^{(t)} \sim K_t(\theta|\theta^\star)$, $x \sim f(x|\theta_i^{(t)})$. 3. Check that $\rho(x, y) < \epsilon$, otherwise start again. [Sisson et al., 2007]
  • 109. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Why PRC? Partial rejection control: resample from a population of weighted particles by pruning away particles with weights below threshold C, replacing them by new particles obtained by propagating an existing particle by an SMC step, and modifying the weights accordingly. [Liu, 2001]
  • 110. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Why PRC? (as above) PRC justification in ABC-PRC: Suppose we then implement the PRC algorithm for some c > 0 such that only identically zero weights are smaller than c. Trouble is, there is no such c...
  • 111. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-PRC weight Probability $\omega_i^{(t)}$ computed as $\omega_i^{(t)} \propto \pi(\theta_i^{(t)})\,L_{t-1}(\theta^\star|\theta_i^{(t)})\,\{\pi(\theta^\star)\,K_t(\theta_i^{(t)}|\theta^\star)\}^{-1}$, where $L_{t-1}$ is an arbitrary transition kernel.
  • 112. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-PRC weight (as above) In case $L_{t-1}(\theta'|\theta) = K_t(\theta|\theta')$, all weights are equal under a uniform prior.
  • 113. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-PRC weight (as above) Inspired from Del Moral et al. (2006), who use backward kernels $L_{t-1}$ in SMC to achieve unbiasedness
  • 114. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-PRC bias Lack of unbiasedness of the method
  • 115. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-PRC bias Joint density of the accepted pair $(\theta^{(t-1)}, \theta^{(t)})$ proportional to $\pi(\theta^{(t-1)}|y)\,K_t(\theta^{(t)}|\theta^{(t-1)})\,f(y|\theta^{(t)})$. For an arbitrary function h(θ), $E[\omega_t h(\theta^{(t)})]$ proportional to $\int h(\theta^{(t)})\,\frac{\pi(\theta^{(t)})\,L_{t-1}(\theta^{(t-1)}|\theta^{(t)})}{\pi(\theta^{(t-1)})\,K_t(\theta^{(t)}|\theta^{(t-1)})}\,\pi(\theta^{(t-1)}|y)\,K_t(\theta^{(t)}|\theta^{(t-1)})\,f(y|\theta^{(t)})\,d\theta^{(t-1)}\,d\theta^{(t)} \propto \int h(\theta^{(t)})\,\pi(\theta^{(t)}|y)\,\Big[\int L_{t-1}(\theta^{(t-1)}|\theta^{(t)})\,f(y|\theta^{(t-1)})\,d\theta^{(t-1)}\Big]\,d\theta^{(t)}$.
  • 116. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A mixture example (1) Toy model of Sisson et al. (2007): if $\theta \sim U(-10, 10)$, $x|\theta \sim 0.5\,N(\theta, 1) + 0.5\,N(\theta, 1/100)$, then the posterior distribution associated with y = 0 is the normal mixture $\theta|y = 0 \sim 0.5\,N(0, 1) + 0.5\,N(0, 1/100)$ restricted to [−10, 10]. Furthermore, true target available as $\pi(\theta\,\big|\,|x| < \epsilon) \propto \Phi(\epsilon - \theta) - \Phi(-\epsilon - \theta) + \Phi(10(\epsilon - \theta)) - \Phi(-10(\epsilon + \theta))$.
  • 117. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup "Ugly, squalid graph..." [ten density panels] Comparison of τ = 0.15 and τ = 1/0.15 in $K_t$
  • 118. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A PMC version Use of the same kernel idea as ABC-PRC but with IS correction [check PMC]. Generate a sample at iteration t by $\hat\pi_t(\theta^{(t)}) \propto \sum_{j=1}^N \omega_j^{(t-1)}\,K_t(\theta^{(t)}|\theta_j^{(t-1)})$ modulo acceptance of the associated $x_t$, and use an importance weight associated with an accepted simulation $\theta_i^{(t)}$: $\omega_i^{(t)} \propto \pi(\theta_i^{(t)})\big/\hat\pi_t(\theta_i^{(t)})$. © Still likelihood free [Beaumont et al., 2009]
  • 119. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Our ABC-PMC algorithm Given a decreasing sequence of approximation levels $\epsilon_1 \ge \ldots \ge \epsilon_T$:
    1. At iteration t = 1, for i = 1, ..., N: simulate $\theta_i^{(1)} \sim \pi(\theta)$ and $x \sim f(x|\theta_i^{(1)})$ until $\rho(x, y) < \epsilon_1$; set $\omega_i^{(1)} = 1/N$. Take $\tau^2$ as twice the empirical variance of the $\theta_i^{(1)}$'s.
    2. At iteration 2 ≤ t ≤ T, for i = 1, ..., N: repeat: pick $\theta_i^\star$ from the $\theta_j^{(t-1)}$'s with probabilities $\omega_j^{(t-1)}$, generate $\theta_i^{(t)}|\theta_i^\star \sim N(\theta_i^\star, \sigma_t^2)$ and $x \sim f(x|\theta_i^{(t)})$, until $\rho(x, y) < \epsilon_t$. Set $\omega_i^{(t)} \propto \pi(\theta_i^{(t)})\big/\sum_{j=1}^N \omega_j^{(t-1)}\,\varphi\big(\sigma_t^{-1}\{\theta_i^{(t)} - \theta_j^{(t-1)}\}\big)$, and take $\tau_{t+1}^2$ as twice the weighted empirical variance of the $\theta_i^{(t)}$'s.
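A hedged R sketch of this ABC-PMC scheme on the earlier toy target x ~ ½N(θ,1) + ½N(−θ,1) with prior θ ~ N(0,10) (the tolerance schedule, sample size, and the reading σt = τt are my choices):

    abc_pmc = function(x, N = 500, eps = c(2, 1, 0.5, 0.25)) {
      simz = function(th) sample(c(-1, 1), 1) * th + rnorm(1)  # z ~ f(.|theta)
      theta = numeric(N)
      for (i in 1:N) repeat {             # t = 1: rejection from the prior
        th = rnorm(1, sd = 10)
        if (abs(simz(th) - x) < eps[1]) { theta[i] = th; break }
      }
      w = rep(1 / N, N)
      for (t in 2:length(eps)) {
        sig = sqrt(2 * sum(w * (theta - sum(w * theta))^2))    # tau_t
        nth = numeric(N)
        for (i in 1:N) repeat {           # move through N(theta*, sig^2)
          th = rnorm(1, mean = sample(theta, 1, prob = w), sd = sig)
          if (abs(simz(th) - x) < eps[t]) { nth[i] = th; break }
        }
        nw = dnorm(nth, sd = 10) /        # prior over mixture proposal
          sapply(nth, function(th) sum(w * dnorm(th, mean = theta, sd = sig)))
        w = nw / sum(nw); theta = nth
      }
      list(theta = theta, weight = w)
    }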
  • 120. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Sequential Monte Carlo SMC is a simulation technique to approximate a sequence of related probability distributions $\pi_n$ with $\pi_0$ "easy" and $\pi_T$ as target. Iterated IS as PMC: particles moved from time n−1 to time n via kernel $K_n$ and use of a sequence of extended targets $\tilde\pi_n$, $\tilde\pi_n(z_{0:n}) = \pi_n(z_n)\,\prod_{j=0}^{n-1} L_j(z_{j+1}, z_j)$ where the $L_j$'s are backward Markov kernels [check that $\pi_n(z_n)$ is a marginal] [Del Moral, Doucet & Jasra, Series B, 2006]
  • 121. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Sequential Monte Carlo (2) Algorithm 3: SMC sampler
    sample $z_i^{(0)} \sim \gamma_0(x)$ (i = 1, ..., N)
    compute weights $w_i^{(0)} = \pi_0(z_i^{(0)})/\gamma_0(z_i^{(0)})$
    for t = 1 to N do
      if ESS($w^{(t-1)}$) < $N_T$ then resample N particles $z^{(t-1)}$ and set weights to 1 end if
      generate $z_i^{(t)} \sim K_t(z_i^{(t-1)}, \cdot)$ and set weights to $w_i^{(t)} = w_i^{(t-1)}\,\frac{\pi_t(z_i^{(t)})\,L_{t-1}(z_i^{(t)}, z_i^{(t-1)})}{\pi_{t-1}(z_i^{(t-1)})\,K_t(z_i^{(t-1)}, z_i^{(t)})}$
    end for
  • 122. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-SMC [Del Moral, Doucet & Jasra, 2009] True derivation of an SMC-ABC algorithm. Use of a kernel $K_n$ associated with target $\pi_{\epsilon_n}$ and derivation of the backward kernel $L_{n-1}(z, z') = \frac{\pi_{\epsilon_n}(z')\,K_n(z', z)}{\pi_{\epsilon_n}(z)}$. Update of the weights $w_{in} \propto w_{i(n-1)}\,\frac{\sum_{m=1}^M \mathbb{I}_{A_{\epsilon_n}}(x_{in}^m)}{\sum_{m=1}^M \mathbb{I}_{A_{\epsilon_{n-1}}}(x_{i(n-1)}^m)}$ when $x_{in}^m \sim K(x_{i(n-1)}^m, \cdot)$
  • 123. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-SMCM Modification: makes M repeated simulations of the pseudo-data z given the parameter, rather than using a single [M = 1] simulation, leading to a weight that is proportional to the number of accepted $z_i$'s, $\omega(\theta) = \frac{1}{M}\sum_{i=1}^M \mathbb{I}_{\rho(\eta(y),\eta(z_i)) < \epsilon}$ [limit in M means exact simulation from (tempered) target]
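A hedged R sketch of this ABC-SMCM weight, with sim_eta a placeholder simulator of mine returning η(z) for a given θ:

    w_smcm = function(theta, eta_y, M, eps, sim_eta)
      mean(abs(replicate(M, sim_eta(theta)) - eta_y) < eps)  # fraction accepted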
  • 124. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Properties of ABC-SMC The ABC-SMC method properly uses a backward kernel L(z, z ) to simplify the importance weight and to remove the dependence on the unknown likelihood from this weight. Update of importance weights is reduced to the ratio of the proportions of surviving particles Major assumption: the forward kernel K is supposed to be invariant against the true target [tempered version of the true posterior]
  • 125. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Properties of ABC-SMC The ABC-SMC method properly uses a backward kernel L(z, z') to simplify the importance weight and to remove the dependence on the unknown likelihood from this weight. Update of importance weights is reduced to the ratio of the proportions of surviving particles. Major assumption: the forward kernel K is supposed to be invariant against the true target [tempered version of the true posterior]. Adaptivity in ABC-SMC algorithm only found in on-line construction of the thresholds $\epsilon_t$, slowly enough to keep a large number of accepted transitions
  • 126. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup A mixture example (2) Recovery of the target, whether using a fixed standard deviation of τ = 0.15 or τ = 1/0.15, or a sequence of adaptive $\tau_t$'s. [five density panels]
  • 127. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Yet another ABC-PRC Another version of ABC-PRC called PRC-ABC, with a proposal distribution $M_t$, S replications of the pseudo-data, and a PRC step. PRC-ABC Algorithm: 1. Select $\theta^\star$ at random from the $\theta_i^{(t-1)}$'s with probabilities $\omega_i^{(t-1)}$. 2. Generate $\theta_i^{(t)} \sim M_t$, $x_1, \ldots, x_S \sim_{iid} f(x|\theta_i^{(t)})$, set $\omega_i^{(t)} = \sum_s K_t\{\eta(x_s) - \eta(y)\}\big/M_t(\theta_i^{(t)})$, and accept with probability $p(i) = \omega_i^{(t)}/c_{t,N}$. 3. Set $\omega_i^{(t)} \propto \omega_i^{(t)}/p(i)$ (1 ≤ i ≤ N) [Peters, Sisson & Fan, Stat Computing, 2012]
  • 128. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Yet another ABC-PRC (as above) Use of a NP estimate $M_t(\theta) = \sum_i \omega_i^{(t-1)}\,\psi(\theta|\theta_i^{(t-1)})$ (with a fixed variance?!); S = 1 recommended for "greatest gains in sampler performance"; $c_{t,N}$ upper quantile of non-zero weights at time t
  • 129. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Wilkinson's exact BC ABC approximation error (i.e. non-zero tolerance) replaced with exact simulation from a controlled approximation to the target, convolution of true posterior with kernel function $\pi_\epsilon(\theta, z|y) = \frac{\pi(\theta)\,f(z|\theta)\,K_\epsilon(y - z)}{\int \pi(\theta)\,f(z|\theta)\,K_\epsilon(y - z)\,dz\,d\theta}$, with $K_\epsilon$ kernel parameterised by bandwidth $\epsilon$. [Wilkinson, 2008]
  • 130. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup Wilkinson's exact BC (as above) Theorem: The ABC algorithm based on the assumption of a randomised observation $\tilde y = y + \xi$, $\xi \sim K_\epsilon$, and an acceptance probability of $K_\epsilon(y - z)/M$ gives draws from the posterior distribution $\pi(\theta|y)$.
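A hedged R sketch of the acceptance step in this theorem, with a uniform kernel K_ε on [−ε, ε] (my choice), whose maximum is M = 1/(2ε); accepting with probability K_ε(y − z)/M then reduces to the usual check |y − z| < ε, but the output is exact for the convolved model y = z + ξ:

    K_eps = function(u, eps) dunif(u, -eps, eps)       # bounded kernel
    M_eps = function(eps) 1 / (2 * eps)                # its upper bound
    accept = function(y, z, eps)
      runif(1) < K_eps(y - z, eps) / M_eps(eps)        # i.e. abs(y - z) < eps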
  • 131. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup How exact a BC? "Using $\epsilon$ to represent measurement error is straightforward, whereas using $\epsilon$ to model the model discrepancy is harder to conceptualize and not as commonly used" [Richard Wilkinson, 2008]
  • 132. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup How exact a BC? Pros: pseudo-data from true model and observed data from noisy model; interesting perspective in that outcome is completely controlled; link with ABCµ and assuming y is observed with a measurement error with density $K_\epsilon$; relates to the theory of model approximation [Kennedy & O'Hagan, 2001]. Cons: requires $K_\epsilon$ to be bounded by M; true approximation error never assessed; requires a modification of the standard ABC algorithm
  • 133. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC for HMMs Specific case of a hidden Markov model: $X_{t+1} \sim Q_\theta(X_t, \cdot)$, $Y_{t+1} \sim g_\theta(\cdot|x_t)$, where only $y^0_{1:n}$ is observed. [Dean, Singh, Jasra & Peters, 2011]
  • 134. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC for HMMs (as above) Use of specific constraints, adapted to the Markov structure: $y_1 \in B(y^0_1, \epsilon), \ldots, y_n \in B(y^0_n, \epsilon)$
  • 135. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-MLE for HMMs ABC-MLE defined by $\hat\theta_n = \arg\max_\theta P_\theta\big(Y_1 \in B(y^0_1, \epsilon), \ldots, Y_n \in B(y^0_n, \epsilon)\big)$
  • 136. MCMC and likelihood-free methods Part/day II: ABC Approximate Bayesian computation Alphabet soup ABC-MLE for HMMs (as above) Exact MLE for the likelihood, same basis as Wilkinson!: $p_\theta(y^0_1, \ldots, y^0_n)$ corresponding to the perturbed process $(x_t, y_t + z_t)_{1\le t\le n}$, $z_t \sim U(B(0, \epsilon))$ [Dean, Singh, Jasra & Peters, 2011]
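A hedged R sketch of ABC-MLE by grid search, maximising a Monte Carlo estimate of the ball probability above (sim_y, a simulator of y_{1:n} given θ, and all tuning values are placeholders of mine):

    abc_mle = function(y0, grid, eps, R = 500, sim_y) {
      prob = sapply(grid, function(th)
        mean(replicate(R, all(abs(sim_y(th, length(y0)) - y0) < eps))))
      grid[which.max(prob)]   # theta maximising P_theta(all eps-balls hit)
    }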