Bayesian Case Studies, week 1

          Robin J. Ryder


          7 January 2013




About this course




  Two aims:
    1   Implement computational algorithms
    2   Analyse real datasets
  6 × 3 hours.
  E-mail: ryder@ceremade.dauphine.fr. Office B627.
  Evaluation: written-up analysis of a dataset, to hand in by end of
  March. The project topic will be given in February.




Exponential family

  A family of distributions (=a model) is an exponential family if the
  density can be written as

                 fX(x|θ) = h(x) exp[η(θ) · T(x) − A(θ)]

  where h, η, T and A are known functions.

  Then T(x) is a sufficient statistic. For iid x1 , . . . , xn , the sum
  Σi T(xi) is a sufficient statistic for the sample: it encapsulates
  all the information about the parameter contained in the data. The
  posterior depends on the sample only through the sufficient
  statistic.
  η(θ) is called the natural parameter.
  A(θ) is the log-partition function, the log of the normalizing constant.
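
  A quick worked example (standard, not on the original slide): the
  Poisson(λ) distribution fits this form, since

      fX(x|λ) = λ^x e^(−λ) / x! = (1/x!) exp[x log λ − λ],

  so h(x) = 1/x!, η(λ) = log λ, T(x) = x and A(λ) = λ; for an iid
  sample, Σ xi is then sufficient.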


Conjugate prior




  A family of distributions is a conjugate prior for a given model if
  the posterior belongs to the same family of distributions.
  This is mostly a computational advantage.
  If the model is an exponential family, then a conjugate prior exists.
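
  For concreteness (a standard example, added here): the Gamma family
  is conjugate for the Poisson model. If x1 , . . . , xn are iid
  Poisson(λ) and λ ∼ Gamma(a, b), then

      π(λ|x) ∝ λ^(a−1) e^(−bλ) · λ^(Σ xi) e^(−nλ)
             = λ^(a+Σ xi−1) e^(−(b+n)λ),

  so λ|x ∼ Gamma(a + Σ xi, b + n): the posterior stays in the Gamma
  family, with updated parameters.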




Jeffreys’ prior


  Jeffreys’ prior, also called the uninformative prior, is invariant
  under reparameterization. In the one-dimensional case, it is defined as

                                 π(θ) ∝ √I(θ)

  where I(θ) is the Fisher information, which is defined as a function
  of the log-likelihood:

      I(θ) = EX[(∂/∂θ log fX(X|θ))² | θ] = −EX[∂²/∂θ² log fX(X|θ) | θ]

  (under certain regularity conditions)
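
  A worked example (standard, added for illustration): for the
  Poisson(λ) model, log fX(x|λ) = x log λ − λ − log x!, so
  ∂²/∂λ² log fX(x|λ) = −x/λ² and I(λ) = E[X]/λ² = 1/λ. Jeffreys’
  prior is therefore π(λ) ∝ λ^(−1/2).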




Jeffreys’ prior (contd)




  Jeffreys’ prior may be improper, which means that it integrates to
  infinity.
  This is not an issue as long as the corresponding posterior is
  proper. This point should always be checked.
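
  Continuing the Poisson example: π(λ) ∝ λ^(−1/2) integrates to
  infinity on (0, ∞), so it is improper. Given observations
  x1 , . . . , xn , however, the posterior is ∝ λ^(Σ xi−1/2) e^(−nλ),
  a proper Gamma(Σ xi + 1/2, n) distribution as soon as n ≥ 1.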




Data: Ship accidents




  The dataset ShipAccidents includes data on accidents for 40
  classes of ships. Each row corresponds to one class. Each class of
  ship is defined by 3 attributes: type of ship (5 levels), period
  of construction (4 levels), period of operation (2 levels).

  For each class of ship, we are given the cumulative number of
  months in operation and the cumulative number of incidents, which
  we model as following a Poisson distribution.
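
  A minimal sketch of the conjugate analysis this model invites,
  assuming a single rate λ with the months in operation as Poisson
  exposure and a Jeffreys-type prior; the data values below are
  illustrative placeholders, not the actual ShipAccidents numbers:

      import numpy as np

      # Illustrative placeholder values, NOT the real dataset.
      months = np.array([127, 63, 1095, 1512, 3353])   # months in operation
      incidents = np.array([0, 0, 3, 4, 6])            # incident counts

      # Model: incidents_i ~ Poisson(lam * months_i), prior
      # pi(lam) ∝ lam^(-1/2), i.e. an improper Gamma(1/2, 0).
      # Conjugacy: lam | data ~ Gamma(a + sum(incidents), b + sum(months)).
      a, b = 0.5, 0.0
      a_post = a + incidents.sum()
      b_post = b + months.sum()

      print("posterior mean:", a_post / b_post)
      rng = np.random.default_rng(0)
      draws = rng.gamma(shape=a_post, scale=1.0 / b_post, size=10_000)
      print("95% credible interval:", np.quantile(draws, [0.025, 0.975]))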




ABC


 Approximate Bayesian Computation is a computational method to
 draw approximate samples from a posterior distribution in cases
 where the likelihood is intractable, but where it is easy to simulate
 new datasets.
  Given observed data Dobs and a prior π(θ), we wish to sample θ
  from the posterior, which is proportional to π(θ)L(Dobs | θ).
  The non-approximate version of the algorithm is (a toy
  implementation follows the list):
   1   Simulate θ from the prior π.
   2   Simulate a new dataset Dsim from the model, with parameter
       θ.
   3   If Dobs = Dsim , then accept θ; else reject θ.
   4   Repeat until we get a large enough sample of θ’s.
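
  A minimal sketch of this exact-rejection version, on a toy Poisson
  model with a Gamma prior (all names and values are illustrative,
  not from the slides); even on five data points the acceptance rate
  is tiny, which is the point of the next slide:

      import numpy as np

      rng = np.random.default_rng(1)
      d_obs = np.array([2, 1, 4, 0, 3])   # toy observed dataset
      n = len(d_obs)

      def exact_abc(n_samples, a=1.0, b=1.0):
          # Keep theta only when the simulated dataset equals the
          # observed one exactly (feasible only for small discrete data).
          accepted = []
          while len(accepted) < n_samples:
              theta = rng.gamma(a, 1.0 / b)        # 1. draw from the prior
              d_sim = rng.poisson(theta, size=n)   # 2. simulate a dataset
              if np.array_equal(d_sim, d_obs):     # 3. accept iff identical
                  accepted.append(theta)           # 4. repeat until enough
          return np.array(accepted)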



ABC (contd)


  It is clear that this algorithm gives samples which follow exactly
  the posterior distribution, but the acceptance probability at step 3
  is very small, making the algorithm very slow. Instead, an
  approximate version is used, introducing a distance d on datasets
  and a tolerance parameter ε:
    1   Simulate θ from the prior π.
    2   Simulate a new dataset Dsim from the model, with parameter
        θ.
    3   If d(Dobs , Dsim ) < ε, then accept θ; else reject θ.
    4   Repeat until we get a large enough sample of θ’s.
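
  A sketch of the approximate version, continuing the toy example
  above (the L1 distance on raw datasets and the tolerance value are
  illustrative choices):

      def abc_reject(n_samples, eps, a=1.0, b=1.0):
          # Accept theta when the simulated dataset falls within
          # distance eps of the observed dataset.
          accepted = []
          while len(accepted) < n_samples:
              theta = rng.gamma(a, 1.0 / b)         # 1. draw from the prior
              d_sim = rng.poisson(theta, size=n)    # 2. simulate a dataset
              dist = np.abs(d_sim - d_obs).sum()    # distance d on datasets
              if dist < eps:                        # 3. accept if close
                  accepted.append(theta)
          return np.array(accepted)

      samples = abc_reject(1000, eps=3)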




ABC (contd)




  In the limit ε → 0, this algorithm is exact.
  In practice, the distance is usually computed on a summary
  statistic of the data. Ideally, the summary statistic is sufficient,
  thus incurring no loss of information.
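
  In the toy Poisson example above, the sum of the counts is a
  sufficient statistic, so the distance would be computed on sums
  rather than on full datasets; a one-line change to the sketch
  (again an illustrative choice):

      # Distance on the sufficient summary statistic T(x) = sum(x)
      # instead of on the raw datasets:
      dist = abs(d_sim.sum() - d_obs.sum())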



