ABC model choice: beyond summary selection
Christian P. Robert
Université Paris-Dauphine, Paris & University of Warwick, Coventry
JSM 2014, Boston
August 6, 2014
Outline
Joint work with J.-M. Cornuet, A. Estoup, J.-M. Marin, and P.
Pudlo
Intractable likelihoods
ABC methods
ABC model choice via random forests
Conclusion
Intractable likelihood
Case of a well-defined statistical model where the likelihood
function
ℓ(θ|y) = f (y1, . . . , yn|θ)
is (really!) not available in closed form
cannot (easily!) be either completed or demarginalised
cannot be (at all!) estimated by an unbiased estimator
examples of latent variable models of high dimension,
including combinatorial structures (trees, graphs), missing
constant f (x|θ) = g(y, θ)/Z(θ) (e.g. Markov random fields,
exponential graphs, . . . )
Prohibits direct implementation of a generic MCMC algorithm
like Metropolis–Hastings which gets stuck exploring missing
structures
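As a purely illustrative aside, a minimal R sketch of the missing-constant issue for a tiny Ising-type Markov random field on a chain of n nodes (function name and model are ours, not from the slides): the normalising constant Z(θ) is a sum over 2^n configurations, which is exactly why the likelihood is out of reach at realistic sizes.
    ising_Z <- function(theta, n) {
      # all 2^n spin configurations of a chain with n nodes
      configs <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))
      # unnormalised g(x, theta) = exp(theta * sum of neighbouring products)
      sum(apply(configs, 1, function(x) exp(theta * sum(x[-1] * x[-n]))))
    }
    ising_Z(0.5, 10)   # already 2^10 = 1024 terms; hopeless for n in the hundreds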
Getting approximative
Case of a well-defined statistical model where the likelihood
function
ℓ(θ|y) = f (y1, . . . , yn|θ)
is out of reach
Empirical A to the original B problem
Degrading the data precision down to tolerance level ε
Replacing the likelihood with a non-parametric approximation
Summarising/replacing the data with insufficient statistics
Approximate Bayesian computation
Intractable likelihoods
ABC methods
abc of ABC
Summary statistic
ABC for model choice
ABC model choice via random
forests
Conclusion
Genetic background of ABC
ABC is a recent computational technique that only requires being
able to sample from the likelihood f (·|θ)
This technique stemmed from population genetics models, about
15 years ago, and population geneticists still significantly
contribute to methodological developments of ABC.
[Griffith & al., 1997; Tavaré & al., 1999]
ABC methodology
Bayesian setting: target is π(θ)f (x|θ)
When likelihood f (x|θ) not in closed form, likelihood-free rejection
technique:
Foundation
For an observation y ∼ f (y|θ), under the prior π(θ), if one keeps
jointly simulating
θ′ ∼ π(θ) , z ∼ f (z|θ′) ,
until the auxiliary variable z is equal to the observed value, z = y,
then the selected
θ′ ∼ π(θ|y)
[Rubin, 1984; Diggle & Gratton, 1984; Tavaré et al., 1997]
A as A...pproximative
When y is a continuous random variable, strict equality z = y is
replaced with a tolerance zone
ρ(y, z) ≤ ε
where ρ is a distance
Output distributed from
π(θ) Pθ{ρ(y, z) < ε} ∝ π(θ | ρ(y, z) < ε)
[Pritchard et al., 1999]
ABC recap
Likelihood free rejection
sampling
Tavaré et al. (1997) Genetics
1) Set i = 1,
2) Generate θ′ from the
prior distribution,
3) Generate z′ from the
likelihood f (·|θ′),
4) If ρ(η(z′), η(y)) ≤ ε, set
(θi , zi ) = (θ′, z′) and
i = i + 1,
5) If i ≤ N, return to 2).
Keep θ’s such that the distance
between the corresponding
simulated dataset and the
observed dataset is small
enough.
Tuning parameters
ε > 0: tolerance level,
η(z): function that
summarizes datasets,
ρ(η, η′): distance between
vectors of summary
statistics
N: size of the output
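For illustration, a minimal R sketch of the rejection sampler above, for a toy normal-mean model; the prior, summary η, distance ρ and tolerance ε are illustrative choices, not part of the original algorithm.
    abc_reject <- function(y, N = 1000, eps = 0.1) {
      eta <- function(z) mean(z)            # summary statistic
      rho <- function(a, b) abs(a - b)      # distance between summaries
      theta_out <- numeric(N); i <- 1
      while (i <= N) {
        theta <- rnorm(1, 0, 10)            # 2) generate theta' from the prior
        z <- rnorm(length(y), theta, 1)     # 3) generate z' from the likelihood
        if (rho(eta(z), eta(y)) <= eps) {   # 4) keep if close enough to the data
          theta_out[i] <- theta; i <- i + 1
        }
      }
      theta_out                             # approximate sample from pi(theta | y)
    }
    # post <- abc_reject(y = rnorm(50, mean = 2))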
Which summary?
Fundamental difficulty of the choice of the summary statistic when
there is no non-trivial sufficient statistics [except when done by the
experimenters in the field]
Loss of statistical information balanced against gain in data
roughening
Approximation error and information loss remain unknown
Choice of statistics induces choice of distance function
towards standardisation
may be imposed for external/practical reasons (e.g., DIYABC)
may gather several non-B point estimates
can learn about efficient combination
attempts at summaries
How to choose the set of summary statistics?
Joyce and Marjoram (2008, SAGMB)
Nunes and Balding (2010, SAGMB)
Fearnhead and Prangle (2012, JRSS B)
Ratmann et al. (2012, PLOS Comput. Biol)
Blum et al. (2013, Statistical Science)
EP-ABC of Barthelmé & Chopin (2013, JASA)
ABC for model choice
Intractable likelihoods
ABC methods
abc of ABC
Summary statistic
ABC for model choice
ABC model choice via random forests
Conclusion
ABC model choice
ABC model choice
A) Generate large set of
(m, θ, z) from the Bayesian
predictive, π(m)πm(θ)fm(z|θ)
B) Keep particles (m, θ, z) such
that ρ(η(y), η(z)) ≤ ε
C) For each m, return
pm = proportion of m among
remaining particles
If ε is tuned so that end number
of particles is k, then pm is a
k-nearest neighbor estimate of
P{M = m | η(y)}
Approximating posterior
probabilities of models is a
regression problem where
response is 1{M = m},
co-variables are the
summary statistics η(z),
loss is, e.g., L2
(conditional expectation)
Preferred method for
approximating posterior
probabilities in DIYABC is
local polylogistic regression
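A hedged R sketch of steps A)–C): pm as a k-nearest-neighbour estimate of P{M = m | η(y)}, with a reference table `ref` assumed to hold a column `model` plus the summary statistics (the column layout and the Euclidean distance are illustrative choices).
    knn_model_prob <- function(ref, obs_eta, k = 500) {
      S <- as.matrix(ref[, setdiff(names(ref), "model")])
      d <- sqrt(colSums((t(S) - obs_eta)^2))   # distance of each particle to eta(y)
      keep <- order(d)[1:k]                    # k closest particles (sets epsilon implicitly)
      table(ref$model[keep]) / k               # p_m = proportion of each model index
    }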
Machine learning perspective
ABC model choice
A) Generate a large set
of (m, θ, z)’s from
Bayesian predictive,
π(m)πm(θ)fm(z|θ)
B) Use machine learning
tech. to infer on
π(m | η(y))
In this perspective:
(iid) “data set” reference
table simulated during
stage A)
observed y becomes a
new data point
Note that:
predicting m is a
classification problem
⇐⇒ select the best
model based on a
maximum a posteriori rule
computing π(m|η(y)) is
a regression problem
⇐⇒ confidence in each
model
classification much simpler
than regression (e.g., dim. of
objects we try to learn)
Warning
the loss of information induced by using non-sufficient
summary statistics is a real problem
Fundamental discrepancy between the genuine Bayes
factors/posterior probabilities and the Bayes factors based on
summary statistics. See, e.g.,
Didelot et al. (2011, Bayesian analysis)
Robert, Cornuet, Marin and Pillai (2011, PNAS)
Marin, Pillai, Robert, Rousseau (2014, JRSS B)
. . .
We suggest using instead a machine learning approach that can
deal with a large number of correlated summary statistics:
E.g., random forests
ABC model choice via random forests
Intractable likelihoods
ABC methods
ABC model choice via random forests
Random forests
ABC with random forests
Illustrations
Conclusion
Leaning towards machine learning
Main notions:
ABC-MC seen as learning about which model is most
appropriate from a huge (reference) table
exploiting a large number of summary statistics not an issue
for machine learning methods intended to estimate efficient
combinations
abandoning (temporarily?) the idea of estimating posterior
probabilities of the models, poorly approximated by machine
learning methods, and replacing those by posterior predictive
expected loss
Random forests
Technique that stemmed from Leo Breiman’s bagging (or
bootstrap aggregating) machine learning algorithm for both
classification and regression
[Breiman, 1996]
Improved classification performances by averaging over
classification schemes of randomly generated training sets, creating
a “forest” of (CART) decision trees, inspired by Amit and Geman
(1997) ensemble learning
[Breiman, 2001]
Growing the forest
Breiman’s solution for inducing random features in the trees of the
forest:
bootstrap resampling of the dataset and
random subsetting [of size √t] of the covariates driving the
classification at every node of each tree
Covariate xτ that drives the node separation
xτ ≤ cτ
and the separation bound cτ chosen by minimising entropy or Gini
index
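For illustration, a minimal R version of the Gini criterion used to pick the covariate xτ and the threshold cτ at a node (entropy would be handled the same way); function names are ours.
    gini <- function(labels) {
      p <- table(labels) / length(labels)
      1 - sum(p^2)                              # 0 for a pure ("monochrome") node
    }
    split_score <- function(x, labels, c_tau) {
      left <- x <= c_tau
      w <- mean(left)                           # proportion of points sent left
      w * gini(labels[left]) + (1 - w) * gini(labels[!left])
    }
    # the chosen (x_tau, c_tau) minimises split_score over candidate covariates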
Breiman and Cutler’s algorithm
Algorithm 1 Random forests
for t = 1 to T do
//*T is the number of trees*//
Draw a bootstrap sample of size nboot = n
Grow an unpruned decision tree
while leaves are not monochrome do
Select ntry of the predictors at random
Determine the best split from among those predictors
end while
end for
Predict new data by aggregating the predictions of the T trees
ABC with random forests
Idea: Starting with
possibly large collection of summary statistics (s1i , . . . , spi )
(from scientific theory input to available statistical software,
to machine-learning alternatives)
ABC reference table involving model index, parameter values
and summary statistics for the associated simulated
pseudo-data
run the R package randomForest to infer M from (s1i , . . . , spi )
at each step O(√p) indices sampled at random and most
discriminating statistic selected, by minimising entropy or Gini loss
Average of the trees is the resulting summary statistic, a highly
non-linear predictor of the model index
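A hedged sketch of this step with the R package randomForest; the reference table `reftable` (factor column m plus summaries s1, . . . , sp per simulated pseudo-dataset) and the one-row data frame `obs` of observed summaries are assumed to exist.
    library(randomForest)
    rf <- randomForest(m ~ ., data = reftable, ntree = 500)
    # mtry defaults to floor(sqrt(p)) for classification, matching the O(sqrt(p)) subsetting
    predict(rf, newdata = obs)                  # forest (MAP-like) model index for eta(y)
    predict(rf, newdata = obs, type = "prob")   # tree vote frequencies - not posterior probabilities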
Outcome of ABC-RF
Random forest predicts a (MAP) model index, from the observed
dataset: The predictor provided by the forest is “sufficient” to
select the most likely model but not to derive associated posterior
probability
exploit entire forest by computing how many trees lead to
picking each of the models under comparison but variability
too high to be trusted
frequency of trees associated with majority model is no proper
substitute for the true posterior probability
And the usual ABC-MC approximation is equally highly variable and
hard to assess
Posterior predictive expected losses
We suggest replacing unstable approximation of
P(M = m|xo)
with xo observed sample and m model index, by average of the
selection errors across all models given the data xo,
P( ^M(X) ≠ M | xo)
where pair (M, X) generated from the predictive
∫ f (x|θ) π(θ, M|xo) dθ
and ^M(x) denotes the random forest model (MAP) predictor
Posterior predictive expected losses
Arguments:
Bayesian estimate of the posterior error
integrates error over most likely part of the parameter space
gives an averaged error rather than the posterior probability of
the null hypothesis
easily computed: Given ABC subsample of parameters from
reference table, simulate pseudo-samples associated with
those and derive error frequency
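A hedged R sketch of that computation, with `abc_sub` an assumed data frame of (m, θ) pairs retained by ABC near the observed data, and user-supplied simulate() and summarise() functions for the models at hand (summarise() returning a one-row data frame of the same summaries as the reference table).
    post_pred_error <- function(rf, abc_sub, simulate, summarise) {
      err <- vapply(seq_len(nrow(abc_sub)), function(i) {
        z <- simulate(abc_sub$m[i], abc_sub$theta[[i]])      # pseudo-data from f(.|theta)
        predict(rf, newdata = summarise(z)) != abc_sub$m[i]  # selection error of the forest
      }, logical(1))
      mean(err)                                              # estimate of P(^M(X) != M | x_obs)
    }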
toy: MA(1) vs. MA(2)
Comparing MA(1) and MA(2) models:
xt = εt − ϑ1 εt−1 [− ϑ2 εt−2]
Earlier illustration using first two autocorrelations as S(x)
[Marin et al., Stat. & Comp., 2011]
Result #1: values of p(m|x) [obtained by numerical integration]
and p(m|S(x)) [obtained by mixing ABC outcome and density
estimation] differ greatly!
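For reference, a toy R sketch of how such reference-table entries can be produced, with the first two autocorrelations as S(x); sample size and parameter values are illustrative only.
    sim_ma_summary <- function(theta, n = 100) {
      eps <- rnorm(n + 2)
      th2 <- if (length(theta) > 1) theta[2] else 0
      x <- eps[3:(n + 2)] - theta[1] * eps[2:(n + 1)] - th2 * eps[1:n]
      acf(x, lag.max = 2, plot = FALSE)$acf[2:3]   # S(x) = autocorrelations at lags 1 and 2
    }
    sim_ma_summary(c(0.6))        # one MA(1) entry of the reference table
    sim_ma_summary(c(0.6, 0.2))   # one MA(2) entry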
toy: MA(1) vs. MA(2)
Difference between the posterior probability of MA(2) given either
x or S(x). Blue stands for data from MA(1), orange for data from
MA(2)
toy: MA(1) vs. MA(2)
Comparing MA(1) and MA(2) models:
xt = εt − ϑ1 εt−1 [− ϑ2 εt−2]
Earlier illustration using two autocorrelations as S(x)
[Marin et al., Stat. & Comp., 2011]
Result #2: Embedded models, with simulations from MA(1) lying
within those from MA(2), hence linear classification performs poorly
toy: MA(1) vs. MA(2)
Simulations of S(x) under MA(1) (blue) and MA(2) (orange)
toy: MA(1) vs. MA(2)
Comparing MA(1) and MA(2) models:
xt = εt − ϑ1 εt−1 [− ϑ2 εt−2]
Earlier illustration using two autocorrelations as S(x)
[Marin et al., Stat. & Comp., 2011]
Result #3: On such a small dimension problem, random forests
should come second to k-nn or kernel discriminant analyses
toy: MA(1) vs. MA(2)
classification method          prior error rate (in %)
LDA 27.43
Logist. reg. 28.34
SVM (library e1071) 17.17
“naïve” Bayes (with G marg.) 19.52
“naïve” Bayes (with NP marg.) 18.25
ABC k-nn (k = 100) 17.23
ABC k-nn (k = 50) 16.97
Local log. reg. (k = 1000) 16.82
Random Forest 17.04
Kernel disc. ana. (KDA) 16.95
True MAP 12.36
Evolution scenarios based on SNPs
Three scenarios for the evolution of three populations from their
most common ancestor
Evolution scenarios based on SNPs
Model 1 with 6 parameters:
four effective sample sizes: N1 for population 1, N2 for
population 2, N3 for population 3 and, finally, N4 for the
native population;
the time of divergence ta between populations 1 and 3;
the time of divergence ts between populations 1 and 2.
effective sample sizes with independent uniform priors on
[100, 30000]
vector of divergence times (ta, ts) with uniform prior on
{(a, s) ∈ [10, 30000] × [10, 30000] | a < s}
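A hedged sketch of a draw from this prior, with the constrained divergence times obtained by simple rejection (function and variable names are ours):
    draw_prior_m1 <- function() {
      N <- runif(4, 100, 30000)                # N1, N2, N3, N4
      repeat {                                 # uniform on {(a, s) : a < s}
        a <- runif(1, 10, 30000); s <- runif(1, 10, 30000)
        if (a < s) break
      }
      c(N1 = N[1], N2 = N[2], N3 = N[3], N4 = N[4], ta = a, ts = s)
    }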
Evolution scenarios based on SNPs
Model 2 with same parameters as model 1 but the divergence time
ta corresponds to a divergence between populations 2 and 3; prior
distributions identical to those of model 1
Model 3 with extra seventh parameter, admixture rate r. For that
scenario, at time ta admixture between populations 1 and 2 from
which population 3 emerges. Prior distribution on r uniform on
[0.05, 0.95]. In that case models 1 and 2 are not embedded in
model 3. Prior distributions for other parameters the same as in
model 1
Evolution scenarios based on SNPs
Set of 48 summary statistics:
Single sample statistics
proportion of loci with null gene diversity (= proportion of monomorphic
loci)
mean gene diversity across polymorphic loci
[Nei, 1987]
variance of gene diversity across polymorphic loci
mean gene diversity across all loci
Evolution scenarios based on SNPs
Set of 48 summary statistics:
Two sample statistics
proportion of loci with null FST distance between both samples
[Weir and Cockerham, 1984]
mean across loci of non null FST distances between both samples
variance across loci of non null FST distances between both samples
mean across loci of FST distances between both samples
proportion of loci with null Nei’s distance between both samples
[Nei, 1972]
mean across loci of non null Nei’s distances between both samples
variance across loci of non null Nei’s distances between both samples
mean across loci of Nei’s distances between the two samples
Evolution scenarios based on SNPs
Set of 48 summary statistics:
Three sample statistics
proportion of loci with null admixture estimate
mean across loci of non null admixture estimate
variance across loci of non null admixture estimates
mean across all loci of admixture estimates
Evolution scenarios based on SNPs
For a sample of 1000 SNPs measured on 25 diploid individuals
per population, learning ABC reference table with 20,000
simulations, prior predictive error rates:
“naïve Bayes” classifier 33.3%
raw LDA classifier 23.27%
ABC k-nn [Euclidean dist. on summaries normalised by MAD]
25.93%
ABC k-nn [unnormalised Euclidean dist. on LDA components]
22.12%
local logistic classifier based on LDA components with
k = 500 neighbours 22.61%
random forest on summaries 21.03%
(Error rates computed on a prior sample of size 10⁴)
Evolution scenarios based on SNPs
For a sample of 1000 SNPs measured on 25 diploid individuals
per population, learning ABC reference table with 20,000
simulations, prior predictive error rates:
“naïve Bayes” classifier 33.3%
raw LDA classifier 23.27%
ABC k-nn [Euclidean dist. on summaries normalised by MAD]
25.93%
ABC k-nn [unnormalised Euclidean dist. on LDA components]
22.12%
local logistic classifier based on LDA components with
k = 1000 neighbours 22.46%
random forest on summaries 21.03%
(Error rates computed on a prior sample of size 10⁴)
Evolution scenarios based on SNPs
For a sample of 1000 SNPs measured on 25 diploid individuals
per population, learning ABC reference table with 20,000
simulations, prior predictive error rates:
“naïve Bayes” classifier 33.3%
raw LDA classifier 23.27%
ABC k-nn [Euclidean dist. on summaries normalised by MAD]
25.93%
ABC k-nn [unnormalised Euclidean dist. on LDA components]
22.12%
local logistic classifier based on LDA components with
k = 5000 neighbours 22.43%
random forest on summaries 21.03%
(Error rates computed on a prior sample of size 10⁴)
Evolution scenarios based on SNPs
For a sample of 1000 SNPs measured on 25 diploid individuals
per population, learning ABC reference table with 20,000
simulations, prior predictive error rates:
“naïve Bayes” classifier 33.3%
raw LDA classifier 23.27%
ABC k-nn [Euclidean dist. on summaries normalised by MAD]
25.93%
ABC k-nn [unnormalised Euclidean dist. on LDA components]
22.12%
local logistic classifier based on LDA components with
k = 5000 neighbours 22.43%
random forest on LDA components only 23.1%
(Error rates computed on a prior sample of size 10⁴)
Evolution scenarios based on SNPs
For a sample of 1000 SNPs measured on 25 diploid individuals
per population, learning ABC reference table with 20,000
simulations, prior predictive error rates:
“naïve Bayes” classifier 33.3%
raw LDA classifier 23.27%
ABC k-nn [Euclidean dist. on summaries normalised by MAD]
25.93%
ABC k-nn [unnormalised Euclidean dist. on LDA components]
22.12%
local logistic classifier based on LDA components with
k = 5000 neighbours 22.43%
random forest on summaries and LDA components 19.03%
(Error rates computed on a prior sample of size 10⁴)
Evolution scenarios based on SNPs
Posterior predictive error rates
favourable: 0.010 error – unfavourable: 0.104 error
Instance of ecological questions [message in a beetle]
How did the Asian Ladybird
beetle arrive in Europe?
Why do they swarm right
now?
What are the routes of
invasion?
How to get rid of them?
[Lombaert & al., 2010, PLoS ONE]
beetles in forests
Worldwide invasion routes of Harmonia axyridis
[Estoup et al., 2012, Molecular Ecology Res.]
Asian Ladybirds [message in a beetle]
Comparing 10 scenarios of Asian beetle invasion
Asian Ladybirds [message in a beetle]
Comparing 10 scenarios of Asian beetle invasion
classification method          prior error rate∗ (in %)
raw LDA 38.94
“naïve” Bayes (with G margins) 54.02
k-nn (MAD normalised sum stat) 58.47
RF without LDA components 38.84
RF with LDA components 35.32
∗ estimated on pseudo-samples of 10⁴ items drawn from the prior
Back to Asian Ladybirds [message in a beetle]
Comparing 10 scenarios of Asian beetle invasion
Random forest allocation frequencies
scenario    1     2     3     4     5     6     7     8     9     10
frequency   0.168 0.1   0.008 0.066 0.296 0.016 0.092 0.04  0.014 0.2
Posterior predictive error based on 20,000 prior simulations and
keeping 500 neighbours (or 100 neighbours and 10 pseudo-datasets
per parameter)
0.3682
Back to Asian Ladybirds [message in a beetle]
Comparing 10 scenarios of Asian beetle invasion
posterior predictive error 0.368
Conclusion
Key ideas
π(m | η(y)) ≠ π(m | y)
Rather than approximating
π m η(y) , focus on
selecting the best model
(classif. vs regression)
Assess confidence in the
selection with posterior
predictive expected losses
Consequences on
ABC-PopGen
Often, RF outperforms k-NN (less
sensitive to high correlation
in summaries)
RF requires many fewer prior
simulations
RF automatically selects
relevant summaries
Hence can handle much
more complex models
Conclusion
Key ideas
π(m | η(y)) ≠ π(m | y)
Use a seasoned machine
learning technique for
selecting among ABC
simulations: minimising the
0–1 loss mimics the MAP rule
Assess confidence in the
selection with posterior
predictive expected losses
Consequences on
ABC-PopGen
Often, RF outperforms k-NN (less
sensitive to high correlation
in summaries)
RF requires many fewer prior
simulations
RF automatically selects
relevant summaries
Hence can handle much
more complex models
Thank you!
Contenu connexe

Tendances

Convergence of ABC methods
Convergence of ABC methodsConvergence of ABC methods
Convergence of ABC methodsChristian Robert
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationChristian Robert
 
(Approximate) Bayesian computation as a new empirical Bayes (something)?
(Approximate) Bayesian computation as a new empirical Bayes (something)?(Approximate) Bayesian computation as a new empirical Bayes (something)?
(Approximate) Bayesian computation as a new empirical Bayes (something)?Christian Robert
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussionChristian Robert
 
Statistics symposium talk, Harvard University
Statistics symposium talk, Harvard UniversityStatistics symposium talk, Harvard University
Statistics symposium talk, Harvard UniversityChristian Robert
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapterChristian Robert
 
ABC short course: final chapters
ABC short course: final chaptersABC short course: final chapters
ABC short course: final chaptersChristian Robert
 
Discussion of ABC talk by Francesco Pauli, Padova, March 21, 2013
Discussion of ABC talk by Francesco Pauli, Padova, March 21, 2013Discussion of ABC talk by Francesco Pauli, Padova, March 21, 2013
Discussion of ABC talk by Francesco Pauli, Padova, March 21, 2013Christian Robert
 
ABC short course: model choice chapter
ABC short course: model choice chapterABC short course: model choice chapter
ABC short course: model choice chapterChristian Robert
 
better together? statistical learning in models made of modules
better together? statistical learning in models made of modulesbetter together? statistical learning in models made of modules
better together? statistical learning in models made of modulesChristian Robert
 
An overview of Bayesian testing
An overview of Bayesian testingAn overview of Bayesian testing
An overview of Bayesian testingChristian Robert
 
Considerate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionConsiderate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionMichael Stumpf
 
ABC short course: introduction chapters
ABC short course: introduction chaptersABC short course: introduction chapters
ABC short course: introduction chaptersChristian Robert
 
CISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceCISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceChristian Robert
 
Approximating Bayes Factors
Approximating Bayes FactorsApproximating Bayes Factors
Approximating Bayes FactorsChristian Robert
 

Tendances (20)

Convergence of ABC methods
Convergence of ABC methodsConvergence of ABC methods
Convergence of ABC methods
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimation
 
(Approximate) Bayesian computation as a new empirical Bayes (something)?
(Approximate) Bayesian computation as a new empirical Bayes (something)?(Approximate) Bayesian computation as a new empirical Bayes (something)?
(Approximate) Bayesian computation as a new empirical Bayes (something)?
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussion
 
Statistics symposium talk, Harvard University
Statistics symposium talk, Harvard UniversityStatistics symposium talk, Harvard University
Statistics symposium talk, Harvard University
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapter
 
ABC short course: final chapters
ABC short course: final chaptersABC short course: final chapters
ABC short course: final chapters
 
Discussion of ABC talk by Francesco Pauli, Padova, March 21, 2013
Discussion of ABC talk by Francesco Pauli, Padova, March 21, 2013Discussion of ABC talk by Francesco Pauli, Padova, March 21, 2013
Discussion of ABC talk by Francesco Pauli, Padova, March 21, 2013
 
ABC short course: model choice chapter
ABC short course: model choice chapterABC short course: model choice chapter
ABC short course: model choice chapter
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
better together? statistical learning in models made of modules
better together? statistical learning in models made of modulesbetter together? statistical learning in models made of modules
better together? statistical learning in models made of modules
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
An overview of Bayesian testing
An overview of Bayesian testingAn overview of Bayesian testing
An overview of Bayesian testing
 
Considerate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionConsiderate Approaches to ABC Model Selection
Considerate Approaches to ABC Model Selection
 
ABC short course: introduction chapters
ABC short course: introduction chaptersABC short course: introduction chapters
ABC short course: introduction chapters
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 
CISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceCISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergence
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Approximating Bayes Factors
Approximating Bayes FactorsApproximating Bayes Factors
Approximating Bayes Factors
 

Similaire à Boston talk

3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMINGChristian Robert
 
Colloquium in honor of Hans Ruedi Künsch
Colloquium in honor of Hans Ruedi KünschColloquium in honor of Hans Ruedi Künsch
Colloquium in honor of Hans Ruedi KünschChristian Robert
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinChristian Robert
 
Likelihood free computational statistics
Likelihood free computational statisticsLikelihood free computational statistics
Likelihood free computational statisticsPierre Pudlo
 
Dr Chris Drovandi (QUT) - Bayesian Indirect Inference Using a Parametric Auxi...
Dr Chris Drovandi (QUT) - Bayesian Indirect Inference Using a Parametric Auxi...Dr Chris Drovandi (QUT) - Bayesian Indirect Inference Using a Parametric Auxi...
Dr Chris Drovandi (QUT) - Bayesian Indirect Inference Using a Parametric Auxi...QUT_SEF
 
Is ABC a new empirical Bayes approach?
Is ABC a new empirical Bayes approach?Is ABC a new empirical Bayes approach?
Is ABC a new empirical Bayes approach?Christian Robert
 
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013Christian Robert
 
[A]BCel : a presentation at ABC in Roma
[A]BCel : a presentation at ABC in Roma[A]BCel : a presentation at ABC in Roma
[A]BCel : a presentation at ABC in RomaChristian Robert
 
slides of ABC talk at i-like workshop, Warwick, May 16
slides of ABC talk at i-like workshop, Warwick, May 16slides of ABC talk at i-like workshop, Warwick, May 16
slides of ABC talk at i-like workshop, Warwick, May 16Christian Robert
 
Workshop on Bayesian Inference for Latent Gaussian Models with Applications
Workshop on Bayesian Inference for Latent Gaussian Models with ApplicationsWorkshop on Bayesian Inference for Latent Gaussian Models with Applications
Workshop on Bayesian Inference for Latent Gaussian Models with ApplicationsChristian Robert
 
ABC in London, May 5, 2011
ABC in London, May 5, 2011ABC in London, May 5, 2011
ABC in London, May 5, 2011Christian Robert
 
Pittsburgh and Toronto "Halloween US trip" seminars
Pittsburgh and Toronto "Halloween US trip" seminarsPittsburgh and Toronto "Halloween US trip" seminars
Pittsburgh and Toronto "Halloween US trip" seminarsChristian Robert
 
seminar at Princeton University
seminar at Princeton Universityseminar at Princeton University
seminar at Princeton UniversityChristian Robert
 

Similaire à Boston talk (20)

3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
 
BIRS 12w5105 meeting
BIRS 12w5105 meetingBIRS 12w5105 meeting
BIRS 12w5105 meeting
 
Colloquium in honor of Hans Ruedi Künsch
Colloquium in honor of Hans Ruedi KünschColloquium in honor of Hans Ruedi Künsch
Colloquium in honor of Hans Ruedi Künsch
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
 
Likelihood free computational statistics
Likelihood free computational statisticsLikelihood free computational statistics
Likelihood free computational statistics
 
ABC model choice
ABC model choiceABC model choice
ABC model choice
 
the ABC of ABC
the ABC of ABCthe ABC of ABC
the ABC of ABC
 
Dr Chris Drovandi (QUT) - Bayesian Indirect Inference Using a Parametric Auxi...
Dr Chris Drovandi (QUT) - Bayesian Indirect Inference Using a Parametric Auxi...Dr Chris Drovandi (QUT) - Bayesian Indirect Inference Using a Parametric Auxi...
Dr Chris Drovandi (QUT) - Bayesian Indirect Inference Using a Parametric Auxi...
 
ABC in Varanasi
ABC in VaranasiABC in Varanasi
ABC in Varanasi
 
Is ABC a new empirical Bayes approach?
Is ABC a new empirical Bayes approach?Is ABC a new empirical Bayes approach?
Is ABC a new empirical Bayes approach?
 
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
 
DIC
DICDIC
DIC
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
[A]BCel : a presentation at ABC in Roma
[A]BCel : a presentation at ABC in Roma[A]BCel : a presentation at ABC in Roma
[A]BCel : a presentation at ABC in Roma
 
Ab cancun
Ab cancunAb cancun
Ab cancun
 
slides of ABC talk at i-like workshop, Warwick, May 16
slides of ABC talk at i-like workshop, Warwick, May 16slides of ABC talk at i-like workshop, Warwick, May 16
slides of ABC talk at i-like workshop, Warwick, May 16
 
Workshop on Bayesian Inference for Latent Gaussian Models with Applications
Workshop on Bayesian Inference for Latent Gaussian Models with ApplicationsWorkshop on Bayesian Inference for Latent Gaussian Models with Applications
Workshop on Bayesian Inference for Latent Gaussian Models with Applications
 
ABC in London, May 5, 2011
ABC in London, May 5, 2011ABC in London, May 5, 2011
ABC in London, May 5, 2011
 
Pittsburgh and Toronto "Halloween US trip" seminars
Pittsburgh and Toronto "Halloween US trip" seminarsPittsburgh and Toronto "Halloween US trip" seminars
Pittsburgh and Toronto "Halloween US trip" seminars
 
seminar at Princeton University
seminar at Princeton Universityseminar at Princeton University
seminar at Princeton University
 

Plus de Christian Robert

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceChristian Robert
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?Christian Robert
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Christian Robert
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Christian Robert
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking componentsChristian Robert
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihoodChristian Robert
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)Christian Robert
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerChristian Robert
 
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment modelsa discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment modelsChristian Robert
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distancesChristian Robert
 
Poster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conferencePoster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conferenceChristian Robert
 
short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018Christian Robert
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distancesChristian Robert
 
prior selection for mixture estimation
prior selection for mixture estimationprior selection for mixture estimation
prior selection for mixture estimationChristian Robert
 
Coordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerCoordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerChristian Robert
 
ABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified modelsABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified modelsChristian Robert
 

Plus de Christian Robert (20)

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de France
 
discussion of ICML23.pdf
discussion of ICML23.pdfdiscussion of ICML23.pdf
discussion of ICML23.pdf
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?
 
restore.pdf
restore.pdfrestore.pdf
restore.pdf
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?
 
CDT 22 slides.pdf
CDT 22 slides.pdfCDT 22 slides.pdf
CDT 22 slides.pdf
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking components
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihood
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
 
eugenics and statistics
eugenics and statisticseugenics and statistics
eugenics and statistics
 
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment modelsa discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distances
 
Poster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conferencePoster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conference
 
short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distances
 
prior selection for mixture estimation
prior selection for mixture estimationprior selection for mixture estimation
prior selection for mixture estimation
 
Coordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerCoordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like sampler
 
ABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified modelsABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified models
 

Dernier

Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai YoungDubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Youngkajalvid75
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Serviceshivanisharma5244
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxFarihaAbdulRasheed
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professormuralinath2
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and ClassificationsAreesha Ahmad
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedDelhi Call girls
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Silpa
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...chandars293
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformationAreesha Ahmad
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....muralinath2
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flyPRADYUMMAURYA1
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxMohamedFarag457087
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsOrtegaSyrineMay
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 

Dernier (20)

Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai YoungDubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 

Boston talk

  • 1. ABC model choice: beyond summary selection Christian P. Robert Universit´e Paris-Dauphine, Paris & University of Warwick, Coventry JSM 2014, Boston August 6, 2014
  • 2. Outline Joint work with J.-M. Cornuet, A. Estoup, J.-M. Marin, and P. Pudlo Intractable likelihoods ABC methods ABC model choice via random forests Conclusion
  • 3. Intractable likelihood Case of a well-defined statistical model where the likelihood function (θ|y) = f (y1, . . . , yn|θ) is (really!) not available in closed form cannot (easily!) be either completed or demarginalised cannot be (at all!) estimated by an unbiased estimator examples of latent variable models of high dimension, including combinatorial structures (trees, graphs), missing constant f (x|θ) = g(y, θ) Z(θ) (eg. Markov random fields, exponential graphs,. . . ) c Prohibits direct implementation of a generic MCMC algorithm like Metropolis–Hastings which gets stuck exploring missing structures
  • 4. Intractable likelihood Case of a well-defined statistical model where the likelihood function (θ|y) = f (y1, . . . , yn|θ) is (really!) not available in closed form cannot (easily!) be either completed or demarginalised cannot be (at all!) estimated by an unbiased estimator c Prohibits direct implementation of a generic MCMC algorithm like Metropolis–Hastings which gets stuck exploring missing structures
  • 5. Getting approximative Case of a well-defined statistical model where the likelihood function (θ|y) = f (y1, . . . , yn|θ) is out of reach Empirical A to the original B problem Degrading the data precision down to tolerance level ε Replacing the likelihood with a non-parametric approximation Summarising/replacing the data with insufficient statistics
  • 6. Getting approximative Case of a well-defined statistical model where the likelihood function (θ|y) = f (y1, . . . , yn|θ) is out of reach Empirical A to the original B problem Degrading the data precision down to tolerance level ε Replacing the likelihood with a non-parametric approximation Summarising/replacing the data with insufficient statistics
  • 7. Getting approximative Case of a well-defined statistical model where the likelihood function (θ|y) = f (y1, . . . , yn|θ) is out of reach Empirical A to the original B problem Degrading the data precision down to tolerance level ε Replacing the likelihood with a non-parametric approximation Summarising/replacing the data with insufficient statistics
  • 8. Approximate Bayesian computation Intractable likelihoods ABC methods abc of ABC Summary statistic ABC for model choice ABC model choice via random forests Conclusion
  • 9. Genetic background of ABC ABC is a recent computational technique that only requires being able to sample from the likelihood f (·|θ) This technique stemmed from population genetics models, about 15 years ago, and population geneticists still significantly contribute to methodological developments of ABC. [Griffith & al., 1997; Tavar´e & al., 1999]
  • 10. ABC methodology Bayesian setting: target is π(θ)f (x|θ) When likelihood f (x|θ) not in closed form, likelihood-free rejection technique: Foundation For an observation y ∼ f (y|θ), under the prior π(θ), if one keeps jointly simulating θ ∼ π(θ) , z ∼ f (z|θ ) , until the auxiliary variable z is equal to the observed value, z = y, then the selected θ ∼ π(θ|y) [Rubin, 1984; Diggle & Gratton, 1984; Tavar´e et al., 1997]
  • 11. ABC methodology Bayesian setting: target is π(θ)f (x|θ) When likelihood f (x|θ) not in closed form, likelihood-free rejection technique: Foundation For an observation y ∼ f (y|θ), under the prior π(θ), if one keeps jointly simulating θ ∼ π(θ) , z ∼ f (z|θ ) , until the auxiliary variable z is equal to the observed value, z = y, then the selected θ ∼ π(θ|y) [Rubin, 1984; Diggle & Gratton, 1984; Tavar´e et al., 1997]
  • 12. ABC methodology Bayesian setting: target is π(θ)f (x|θ) When likelihood f (x|θ) not in closed form, likelihood-free rejection technique: Foundation For an observation y ∼ f (y|θ), under the prior π(θ), if one keeps jointly simulating θ ∼ π(θ) , z ∼ f (z|θ ) , until the auxiliary variable z is equal to the observed value, z = y, then the selected θ ∼ π(θ|y) [Rubin, 1984; Diggle & Gratton, 1984; Tavar´e et al., 1997]
  • 13. A as A...pproximative When y is a continuous random variable, strict equality z = y is replaced with a tolerance zone ρ(y, z) where ρ is a distance Output distributed from π(θ) Pθ{ρ(y, z) < } def ∝ π(θ|ρ(y, z) < ) [Pritchard et al., 1999]
  • 14. A as A...pproximative When y is a continuous random variable, strict equality z = y is replaced with a tolerance zone ρ(y, z) where ρ is a distance Output distributed from π(θ) Pθ{ρ(y, z) < } def ∝ π(θ|ρ(y, z) < ) [Pritchard et al., 1999]
  • 15. ABC recap Likelihood free rejection sampling Tavar´e et al. (1997) Genetics 1) Set i = 1, 2) Generate θ from the prior distribution, 3) Generate z from the likelihood f (·|θ ), 4) If ρ(η(z ), η(y)) , set (θi , zi ) = (θ , z ) and i = i + 1, 5) If i N, return to 2). Keep θ’s such that the distance between the corresponding simulated dataset and the observed dataset is small enough. Tuning parameters > 0: tolerance level, η(z): function that summarizes datasets, ρ(η, η ): distance between vectors of summary statistics N: size of the output
  • 16. Which summary? Fundamental difficulty of the choice of the summary statistic when there is no non-trivial sufficient statistics [except when done by the experimenters in the field] Loss of statistical information balanced against gain in data roughening Approximation error and information loss remain unknown Choice of statistics induces choice of distance function towards standardisation may be imposed for external/practical reasons (e.g., DIYABC) may gather several non-B point estimates can learn about efficient combination
  • 19. attempts at summaries How to choose the set of summary statistics? Joyce and Marjoram (2008, SAGMB) Nunes and Balding (2010, SAGMB) Fearnhead and Prangle (2012, JRSS B) Ratmann et al. (2012, PLOS Comput. Biol) Blum et al. (2013, Statistical Science) EP-ABC of Barthelmé & Chopin (2013, JASA)
  • 20. ABC for model choice Intractable likelihoods ABC methods abc of ABC Summary statistic ABC for model choice ABC model choice via random forests Conclusion
  • 21. ABC model choice ABC model choice A) Generate a large set of (m, θ, z) from the Bayesian predictive, π(m)πm(θ)fm(z|θ) B) Keep particles (m, θ, z) such that ρ(η(y), η(z)) ≤ ε C) For each m, return pm = proportion of m among remaining particles If ε is tuned so that the final number of particles is k, then pm is a k-nearest-neighbour estimate of P{M = m | η(y)} Approximating posterior probabilities of models is a regression problem where the response is 1{M = m}, the covariates are the summary statistics η(z), and the loss is, e.g., L2 (conditional expectation) Preferred method for approximating posterior probabilities in DIYABC is local polylogistic regression
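A sketch of steps B) and C) in R: given a reference table holding a model index and the simulated summary statistics, keep the k particles closest to η(y) and return the model frequencies among them (the toy table and all object names below are illustrative):

## k-nearest-neighbour estimate of P(M = m | eta(y))
abc_model_probs <- function(obs_stats, ref_model, ref_stats, k = 500) {
  d <- sqrt(rowSums(sweep(ref_stats, 2, obs_stats, "-")^2))
  keep <- order(d)[seq_len(k)]       # epsilon implicitly tuned so that k particles survive
  table(ref_model[keep]) / k         # proportion of each model among the survivors
}

## toy usage with two models and two summary statistics
set.seed(1)
m <- sample(1:2, 1e4, replace = TRUE)
s <- cbind(rnorm(1e4, mean = m), rnorm(1e4, mean = 0.5 * m))
abc_model_probs(obs_stats = c(1.2, 0.7), ref_model = m, ref_stats = s, k = 500)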
  • 22. Machine learning perspective ABC model choice A) Generate a large set of (m, θ, z)'s from the Bayesian predictive, π(m)πm(θ)fm(z|θ) B) Use machine learning techniques to infer on π(m | η(y)) In this perspective: the (iid) “data set” is the reference table simulated during stage A); the observed y becomes a new data point Note that: predicting m is a classification problem ⇐⇒ select the best model based on a maximal a posteriori rule; computing π(m|η(y)) is a regression problem ⇐⇒ confidence in each model; classification is much simpler than regression (e.g., dimension of the objects we try to learn)
  • 23. Warning the loss of information induced by using non-sufficient summary statistics is a real problem Fundamental discrepancy between the genuine Bayes factors/posterior probabilities and the Bayes factors based on summary statistics. See, e.g., Didelot et al. (2011, Bayesian Analysis) Robert, Cornuet, Marin and Pillai (2011, PNAS) Marin, Pillai, Robert, Rousseau (2014, JRSS B) . . . We suggest using instead a machine learning approach that can deal with a large number of correlated summary statistics: e.g., random forests
  • 24. ABC model choice via random forests Intractable likelihoods ABC methods ABC model choice via random forests Random forests ABC with random forests Illustrations Conclusion
  • 25. Leaning towards machine learning Main notions: ABC-MC seen as learning about which model is most appropriate from a huge (reference) table; exploiting a large number of summary statistics is not an issue for machine learning methods intended to estimate efficient combinations; abandoning (temporarily?) the idea of estimating posterior probabilities of the models, poorly approximated by machine learning methods, and replacing those by posterior predictive expected losses
  • 26. Random forests Technique that stemmed from Leo Breiman’s bagging (or bootstrap aggregating) machine learning algorithm for both classification and regression [Breiman, 1996] Improved classification performances by averaging over classification schemes of randomly generated training sets, creating a “forest” of (CART) decision trees, inspired by Amit and Geman (1997) ensemble learning [Breiman, 2001]
  • 27. Growing the forest Breiman’s solution for inducing random features in the trees of the forest: bootstrap resampling of the dataset and random subsetting [of size √t] of the covariates driving the classification at every node of each tree Covariate xτ that drives the node separation xτ ≤ cτ and the separation bound cτ chosen by minimising entropy or Gini index
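A minimal R sketch of what happens at one node under these rules: among a random subset of the covariates, choose the (covariate, cutpoint) pair minimising the Gini index of the two children. Real implementations such as the randomForest package do this in compiled code; the sketch only makes the mechanism explicit.

## Gini impurity of a set of class labels
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

## best split at one node, searching a random subset of 'mtry' covariates
best_split <- function(x, y, mtry = floor(sqrt(ncol(x)))) {
  vars <- sample(ncol(x), mtry)
  best <- list(score = Inf)
  for (j in vars) {
    for (cp in unique(x[, j])) {
      left <- y[x[, j] <= cp]; right <- y[x[, j] > cp]
      if (length(left) == 0 || length(right) == 0) next
      score <- (length(left) * gini(left) + length(right) * gini(right)) / length(y)
      if (score < best$score) best <- list(var = j, cut = cp, score = score)
    }
  }
  best
}

## toy usage: 30 points, 4 covariates, binary labels
set.seed(1)
X <- matrix(rnorm(120), ncol = 4); lab <- rbinom(30, 1, 0.5)
best_split(X, lab)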
  • 28. Breiman and Cutler’s algorithm
Algorithm 1 Random forests
for t = 1 to T do //*T is the number of trees*//
  Draw a bootstrap sample of size nboot = n
  Grow an unpruned decision tree
  while leaves are not monochrome do
    Select ntry of the predictors at random
    Determine the best split from among those predictors
  end while
end for
Predict new data by aggregating the predictions of the T trees
  • 29. ABC with random forests Idea: Starting with a possibly large collection of summary statistics (s1i , . . . , spi ) (from scientific theory input to available statistical software, to machine-learning alternatives) ABC reference table involving model index, parameter values and summary statistics for the associated simulated pseudo-data run the R package randomForest to infer M from (s1i , . . . , spi )
  • 30. ABC with random forests Idea: Starting with a possibly large collection of summary statistics (s1i , . . . , spi ) (from scientific theory input to available statistical software, to machine-learning alternatives) ABC reference table involving model index, parameter values and summary statistics for the associated simulated pseudo-data run the R package randomForest to infer M from (s1i , . . . , spi ) at each step O(√p) indices sampled at random and the most discriminating statistic selected, by minimising an entropy or Gini loss
  • 31. ABC with random forests Idea: Starting with a possibly large collection of summary statistics (s1i , . . . , spi ) (from scientific theory input to available statistical software, to machine-learning alternatives) ABC reference table involving model index, parameter values and summary statistics for the associated simulated pseudo-data run the R package randomForest to infer M from (s1i , . . . , spi ) The average of the trees is the resulting summary statistic, a highly non-linear predictor of the model index
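Putting these pieces together, a possible R sketch of the ABC-RF step with the randomForest package; the reference table below is an artificial stand-in (in practice it is built by simulating from the competing models), and the observed summaries are likewise made up for the example.

library(randomForest)
set.seed(1)

## toy reference table: model index + p summary statistics per simulation
n <- 1e4; p <- 20
m <- factor(sample(1:3, n, replace = TRUE))
sumstats <- matrix(rnorm(n * p, mean = as.numeric(m)), ncol = p,
                   dimnames = list(NULL, paste0("s", 1:p)))
ref <- data.frame(model = m, sumstats)

## grow the forest: ntree trees, about sqrt(p) statistics tried at each node
rf <- randomForest(model ~ ., data = ref, ntree = 500, mtry = floor(sqrt(p)))

## observed summaries (illustrative) and the MAP-type model prediction
obs <- as.data.frame(matrix(rnorm(p, mean = 2), nrow = 1,
                            dimnames = list(NULL, paste0("s", 1:p))))
predict(rf, newdata = obs)    # predicted model index for the observed data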
  • 32. Outcome of ABC-RF Random forest predicts a (MAP) model index, from the observed dataset: The predictor provided by the forest is “sufficient” to select the most likely model but not to derive the associated posterior probability one may exploit the entire forest by computing how many trees lead to picking each of the models under comparison but the variability is too high to be trusted: the frequency of trees associated with the majority model is no proper substitute for the true posterior probability And the usual ABC-MC approximation is equally highly variable and hard to assess
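The tree frequencies mentioned above can be read off the forest directly, for instance with predict(..., type = "vote") in the randomForest package; continuing with the toy objects rf and obs of the previous sketch, and keeping in mind the warning that these frequencies are not posterior probabilities:

## fraction of the trees voting for each model at the observed summaries
votes <- predict(rf, newdata = obs, type = "vote", norm.votes = TRUE)
votes                              # one row, one column per model index
colnames(votes)[which.max(votes)]  # majority model = MAP-type predictor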
  • 34. Posterior predictive expected losses We suggest replacing the unstable approximation of P(M = m | xo), with xo the observed sample and m the model index, by the average of the selection errors across all models given the data xo, P(^M(X) ≠ M | xo) where the pair (M, X) is generated from the predictive ∫ f (x|θ) π(θ, M | xo) dθ and ^M(x) denotes the random forest model (MAP) predictor
  • 35. Posterior predictive expected losses Arguments: Bayesian estimate of the posterior error integrates the error over the most likely part of the parameter space gives an averaged error rather than the posterior probability of the null hypothesis easily computed: given an ABC subsample of parameters from the reference table, simulate pseudo-samples associated with those parameters and derive the error frequency
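A sketch of that computation in R: take the ABC subsample of reference-table entries closest to the observed summaries, simulate one fresh pseudo-sample for each retained (m, θ), and record how often the forest predictor misses the known model index. The function simulate_stats(m, theta), assumed to return the summaries of one pseudo-dataset as a data frame with the same columns as the training table, is a user-supplied ingredient specific to the models at hand.

## estimate P( hat-M(X) != M | x_obs ) by simulation from the predictive
post_pred_error <- function(rf, models, params, dist_to_obs, k = 500, simulate_stats) {
  keep  <- order(dist_to_obs)[seq_len(k)]   # ABC subsample closest to the observed data
  wrong <- 0
  for (i in keep) {
    z <- simulate_stats(models[i], params[i, ])     # fresh pseudo-sample under (m_i, theta_i)
    wrong <- wrong + (predict(rf, newdata = z) != models[i])
  }
  wrong / k                                 # estimated frequency of selection errors
}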
  • 36. toy: MA(1) vs. MA(2) Comparing MA(1) and MA(2) models: xt = εt − ϑ1εt−1 [−ϑ2εt−2] Earlier illustration using the first two autocorrelations as S(x) [Marin et al., Stat. & Comp., 2011] Result #1: the values of p(m|x) [obtained by numerical integration] and p(m|S(x)) [obtained by mixing ABC outcome and density estimation] differ greatly!
  • 37. toy: MA(1) vs. MA(2) Difference between the posterior probability of MA(2) given either x or S(x). Blue stands for data from MA(1), orange for data from MA(2)
  • 38. toy: MA(1) vs. MA(2) Comparing MA(1) and MA(2) models: xt = εt − ϑ1εt−1 [−ϑ2εt−2] Earlier illustration using two autocorrelations as S(x) [Marin et al., Stat. & Comp., 2011] Result #2: Embedded models, with simulations from MA(1) lying within those from MA(2), hence linear classifiers perform poorly
  • 39. toy: MA(1) vs. MA(2) Simulations of S(x) under MA(1) (blue) and MA(2) (orange)
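A scatter of this kind can be reproduced in a few lines of R, simulating series from MA(1) or MA(2) and recording the first two sample autocorrelations as S(x); the uniform prior on the ϑ's below is a placeholder chosen for illustration, not the invertibility-constrained prior of the original experiment.

set.seed(1)
n_obs <- 100

## simulate one series of length n from MA(1) (q = 1) or MA(2) (q = 2)
sim_ma <- function(q, n = n_obs) {
  theta <- runif(q, -1, 1)                  # placeholder prior on the MA coefficients
  eps <- rnorm(n + q)
  x <- eps[(q + 1):(n + q)]
  for (j in 1:q) x <- x - theta[j] * eps[(q + 1 - j):(n + q - j)]
  x
}

## summary statistics: first two sample autocorrelations
S <- function(x) acf(x, lag.max = 2, plot = FALSE)$acf[2:3]

## reference table: model index + S(x) for each simulation
N <- 1e4
mod <- sample(1:2, N, replace = TRUE)
stats <- t(sapply(mod, function(q) S(sim_ma(q))))
plot(stats, col = c("blue", "orange")[mod],
     xlab = "autocorrelation at lag 1", ylab = "autocorrelation at lag 2")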
  • 40. toy: MA(1) vs. MA(2) Comparing MA(1) and MA(2) models: xt = εt − ϑ1εt−1 [−ϑ2εt−2] Earlier illustration using two autocorrelations as S(x) [Marin et al., Stat. & Comp., 2011] Result #3: On such a small-dimension problem, random forests should come second to k-nn or kernel discriminant analyses
  • 41. toy: MA(1) vs. MA(2) classification prior error rates (in %):
LDA                                              27.43
Logistic regression                              28.34
SVM (library e1071)                              17.17
“naïve” Bayes (with Gaussian marginals)          19.52
“naïve” Bayes (with non-parametric marginals)    18.25
ABC k-nn (k = 100)                               17.23
ABC k-nn (k = 50)                                16.97
Local logistic regression (k = 1000)             16.82
Random Forest                                    17.04
Kernel discriminant analysis (KDA)               16.95
True MAP                                         12.36
  • 42. Evolution scenarios based on SNPs Three scenarios for the evolution of three populations from their most common ancestor
  • 43. Evolution scenarios based on SNPs Model 1 with 6 parameters: four effective sample sizes: N1 for population 1, N2 for population 2, N3 for population 3 and, finally, N4 for the native population; the time of divergence ta between populations 1 and 3; the time of divergence ts between populations 1 and 2. Effective sample sizes with independent uniform priors on [100, 30000]; vector of divergence times (ta, ts) with uniform prior on {(a, s) ∈ [10, 30000] × [10, 30000] | a < s}
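A sketch of one prior draw for model 1 under the stated priors (continuous uniforms are used for the effective sizes for simplicity, and the constraint a < s is enforced by rejection):

## one prior draw of the six parameters of model 1
rprior_model1 <- function() {
  N <- runif(4, 100, 30000)                # N1, N2, N3, N4
  repeat {                                 # uniform on {(a, s) in [10, 30000]^2 : a < s}
    tt <- runif(2, 10, 30000)
    if (tt[1] < tt[2]) break
  }
  c(N1 = N[1], N2 = N[2], N3 = N[3], N4 = N[4], ta = tt[1], ts = tt[2])
}
rprior_model1()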
  • 44. Evolution scenarios based on SNPs Model 2 with the same parameters as model 1 but the divergence time ta corresponds to a divergence between populations 2 and 3; prior distributions identical to those of model 1 Model 3 with an extra seventh parameter, the admixture rate r. For that scenario, at time ta there is an admixture between populations 1 and 2 from which population 3 emerges. Prior distribution on r uniform on [0.05, 0.95]. In that case models 1 and 2 are not embedded in model 3. Prior distributions for the other parameters the same as in model 1
  • 45. Evolution scenarios based on SNPs Set of 48 summary statistics: Single sample statistics proportion of loci with null gene diversity (= proportion of monomorphic loci) mean gene diversity across polymorphic loci [Nei, 1987] variance of gene diversity across polymorphic loci mean gene diversity across all loci
  • 46. Evolution scenarios based on SNPs Set of 48 summary statistics: Two sample statistics proportion of loci with null FST distance between both samples [Weir and Cockerham, 1984] mean across loci of non null FST distances between both samples variance across loci of non null FST distances between both samples mean across loci of FST distances between both samples proportion of loci with null Nei’s distance between both samples [Nei, 1972] mean across loci of non null Nei’s distances between both samples variance across loci of non null Nei’s distances between both samples mean across loci of Nei’s distances between the two samples
  • 47. Evolution scenarios based on SNPs Set of 48 summary statistics: Three sample statistics proportion of loci with null admixture estimate mean across loci of non null admixture estimates variance across loci of non null admixture estimates mean across all locus admixture estimates
  • 48. Evolution scenarios based on SNPs For a sample of 1,000 biallelic SNP markers measured on 25 individuals per population, learning the ABC reference table with 20,000 simulations, prior predictive error rates: “naïve Bayes” classifier 33.3% raw LDA classifier 23.27% ABC k-nn [Euclidean dist. on summaries normalised by MAD] 25.93% ABC k-nn [unnormalised Euclidean dist. on LDA components] 22.12% local logistic classifier based on LDA components with k = 500 neighbours 22.61% random forest on summaries 21.03% (Error rates computed on a prior sample of size 10^4)
  • 49. Evolution scenarios based on SNPs For a sample of 1,000 biallelic SNP markers measured on 25 individuals per population, learning the ABC reference table with 20,000 simulations, prior predictive error rates: “naïve Bayes” classifier 33.3% raw LDA classifier 23.27% ABC k-nn [Euclidean dist. on summaries normalised by MAD] 25.93% ABC k-nn [unnormalised Euclidean dist. on LDA components] 22.12% local logistic classifier based on LDA components with k = 1000 neighbours 22.46% random forest on summaries 21.03% (Error rates computed on a prior sample of size 10^4)
  • 50. Evolution scenarios based on SNPs For a sample of 1,000 biallelic SNP markers measured on 25 individuals per population, learning the ABC reference table with 20,000 simulations, prior predictive error rates: “naïve Bayes” classifier 33.3% raw LDA classifier 23.27% ABC k-nn [Euclidean dist. on summaries normalised by MAD] 25.93% ABC k-nn [unnormalised Euclidean dist. on LDA components] 22.12% local logistic classifier based on LDA components with k = 5000 neighbours 22.43% random forest on summaries 21.03% (Error rates computed on a prior sample of size 10^4)
  • 51. Evolution scenarios based on SNPs For a sample of 1,000 biallelic SNP markers measured on 25 individuals per population, learning the ABC reference table with 20,000 simulations, prior predictive error rates: “naïve Bayes” classifier 33.3% raw LDA classifier 23.27% ABC k-nn [Euclidean dist. on summaries normalised by MAD] 25.93% ABC k-nn [unnormalised Euclidean dist. on LDA components] 22.12% local logistic classifier based on LDA components with k = 5000 neighbours 22.43% random forest on LDA components only 23.1% (Error rates computed on a prior sample of size 10^4)
  • 52. Evolution scenarios based on SNPs For a sample of 1,000 biallelic SNP markers measured on 25 individuals per population, learning the ABC reference table with 20,000 simulations, prior predictive error rates: “naïve Bayes” classifier 33.3% raw LDA classifier 23.27% ABC k-nn [Euclidean dist. on summaries normalised by MAD] 25.93% ABC k-nn [unnormalised Euclidean dist. on LDA components] 22.12% local logistic classifier based on LDA components with k = 5000 neighbours 22.43% random forest on summaries and LDA components 19.03% (Error rates computed on a prior sample of size 10^4)
  • 53. Evolution scenarios based on SNPs Posterior predictive error rates
  • 54. Evolution scenarios based on SNPs Posterior predictive error rates favourable: 0.010 error – unfavourable: 0.104 error
  • 55. Instance of ecological questions [message in a beetle] How did the Asian Ladybird beetle arrive in Europe? Why do they swarm right now? What are the routes of invasion? How to get rid of them? [Lombaert & al., 2010, PLoS ONE] [figure: beetles in forests]
  • 56. Worldwide invasion routes of Harmonia axyridis [Estoup et al., 2012, Molecular Ecology Res.]
  • 57. Asian Ladybirds [message in a beetle] Comparing 10 scenarios of Asian beetle invasion [figure: beetle moves]
  • 58. Asian Ladybirds [message in a beetle] Comparing 10 scenarios of Asian beetle invasion [figure: beetle moves]
classification method                        prior error rate* (in %)
raw LDA                                      38.94
“naïve” Bayes (with Gaussian margins)        54.02
k-nn (MAD normalised summary statistics)     58.47
RF without LDA components                    38.84
RF with LDA components                       35.32
* estimated on pseudo-samples of 10^4 items drawn from the prior
  • 59. Back to Asian Ladybirds [message in a beetle] Comparing 10 scenarios of Asian beetle invasion [figure: beetle moves]
Random forest allocation frequencies:
scenario    1      2     3      4      5      6      7      8     9      10
frequency   0.168  0.1   0.008  0.066  0.296  0.016  0.092  0.04  0.014  0.2
Posterior predictive error based on 20,000 prior simulations and keeping 500 neighbours (or 100 neighbours and 10 pseudo-datasets per parameter): 0.3682
  • 60. Back to Asian Ladybirds [message in a beetle] Comparing 10 scenarios of Asian beetle invasion: posterior predictive error 0.368
  • 61. Conclusion Key ideas π(m | η(y)) ≠ π(m | y) Rather than approximating π(m | η(y)), focus on selecting the best model (classification vs. regression) Assess confidence in the selection with posterior predictive expected losses Consequences on ABC-PopGen Often, RF outperforms k-NN (less sensitive to high correlation in the summaries) RF requires many fewer prior simulations RF automatically selects the relevant summaries Hence can handle much more complex models
  • 62. Conclusion Key ideas π(m | η(y)) ≠ π(m | y) Use a seasoned machine learning technique for selecting from the ABC simulations: minimising the 0–1 loss mimics the MAP rule Assess confidence in the selection with posterior predictive expected losses Consequences on ABC-PopGen Often, RF outperforms k-NN (less sensitive to high correlation in the summaries) RF requires many fewer prior simulations RF automatically selects the relevant summaries Hence can handle much more complex models