
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R


What are probabilistic programming and Bayesian statistics? What are their strengths and limitations? In his talk, Marco located Bayesian networks in the current AI landscape, gently introduced Bayesian reasoning and computation, and explained how to implement generative models in R.

Published in: Science


  1. 1. Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R Geneva R Users Group Speaker: Marco Wirthlin @marcowirthlin Image Source: https://i.stack.imgur.com/GONoV.jpg Uploaded Version
  2. 2. This talk was made for Geneva R Users: Image Source: https://i.stack.imgur.com/GONoV.jpg
  3. 3. After a long Search! You got a Job!
  4. 4. Your Boss: “Can you give us a hand?” “Look at this complex machine. Sometimes it malfunctions and produces items that will have faults difficult to spot. Can you predict when and why this happens?”
  5. 5. The Data: 10 TB of Joy
  6. 6. How would you solve this? (Discriminative Edition) Pipeline: Raw Data → Tidy Data → ML-Ready Data → Trained Classifier → Prediction / Classification. Steps along the way: ● Cleaning ● Munging ● Exploratory analysis ● KNN ● PCA/ICA ● Random forest ● Feature engineering ● Regularization ● Model tuning ● Training ● Validation
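A minimal sketch of such a discriminative pipeline in base R, using simulated stand-in data (the data frame, column names and threshold below are illustrative, not from the talk):

```r
# Hypothetical data: sensor readings plus a binary fault label (illustrative only).
set.seed(1)
items <- data.frame(matrix(rnorm(500 * 10), ncol = 10))
items$faulty <- rbinom(500, 1, plogis(0.8 * items$X1 - 0.5 * items$X3))

# Train/test split.
idx   <- sample(nrow(items), 0.8 * nrow(items))
train <- items[idx, ]
test  <- items[-idx, ]

# Feature engineering: PCA on the predictors.
pca      <- prcomp(train[, 1:10], scale. = TRUE)
train_pc <- data.frame(pca$x[, 1:5], faulty = train$faulty)
test_pc  <- data.frame(predict(pca, test[, 1:10])[, 1:5], faulty = test$faulty)

# Discriminative classifier: model p(faulty | features) directly.
fit  <- glm(faulty ~ ., data = train_pc, family = binomial())
pred <- predict(fit, test_pc, type = "response") > 0.5

# Validation: simple accuracy on held-out data.
mean(pred == (test_pc$faulty == 1))
```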
  7. 7. How would you solve this? (Generative Edition) Pipeline: Raw Data → Tidy Data → Candidate Model(s) + Domain Knowledge → Simulations + Understanding of the Phenomenon → Fix Problem (?). Steps along the way: ● Cleaning ● Munging ● Exploratory analysis ● KNN ● PCA/ICA ● Random forest ● Feature engineering ● Regularization ● (Re)parametrization ● Refinement ● Prior/posterior simulations ● Model selection ● Scientific communication ● Apply knowledge ● Know uncertainty
  8. 8. What is a generative model? A hypothesis about the underlying mechanisms, AKA "learning the class". No shortcuts! =D Data [2, 8, ..., 9] ↔ Parameters (θ, μ, ξ, ...), connected by the Model and Bayesian Inference. ● Ng, A. Y. and Jordan, M. I. (2002). On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. In Advances in Neural Information Processing Systems, pages 841–848. ● Rasmus Bååth, Video Introduction to Bayesian Data Analysis, Part 1: What is Bayes?: https://www.youtube.com/watch?time_continue=366&v=3OJEae7Qb_o
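To make the generative, "learning the class" idea concrete, here is a small R sketch (all numbers are illustrative): parameters feed a model that generates data, and inference runs the same arrow backwards, scoring candidate parameters against observed data:

```r
# Generative direction: parameters -> model -> data.
mu    <- 5      # hypothetical "true" parameters
sigma <- 2
simulated_data <- rnorm(50, mean = mu, sd = sigma)   # e.g. [2, 8, ..., 9]

# Inference direction: data -> plausible parameters.
# The likelihood scores how well candidate parameters explain the data.
log_lik <- function(mu, sigma, data) sum(dnorm(data, mu, sigma, log = TRUE))
log_lik(5, 2, simulated_data)   # high for parameters close to the truth
log_lik(0, 2, simulated_data)   # much lower for implausible parameters
```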
  9. 9. Recap: When to use which approach*
     Data: Statistical models: little / expensive / inaccessible; Machine learning: abundant
     Uncertainty: Statistical models: is relevant; Machine learning: not relevant
     Num. of param.: Statistical models: isolate effects of few predictors; Machine learning: many
     Interpretability: Statistical models: transparent; Machine learning: black box
     Assumptions: Statistical models: many, explicit; Machine learning: some, implicit
     Goal: Statistical models: understanding; Machine learning: overall prediction
     * Very general guidelines! ● E.g. Bayesian models scale well with many parameters and also with data, due to inter- and intra-chain GPU parallelization. ● Example hybrid methods: Deep (Hierarchical) Bayesian Neural Networks, Bayesian Optimization, Gaussian Mixture Models.
     ● http://www.fharrell.com/post/stat-ml/ ● http://www.fharrell.com/post/stat-ml2/
  10. 10. Bayesian Inference (BI). Background: likelihoods, frequentist statistics, graphical models. Implementation: probabilistic programming.
  11. 11. Likelihoods. Normal distribution: X ~ N(μ, σ²), "X is normally distributed". Likelihood: L = p(D | θ); for the Normal, L = p(D | μ, σ²), "the probability that D belongs to a distribution with mean μ and SD σ". PDF: fix the parameters, vary the data. L: fix the data, vary the parameters. ● https://www.youtube.com/watch?v=ScduwntrMzc ● Applet: https://seneketh.shinyapps.io/Likelihood_Intuition
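The PDF-versus-likelihood distinction can be seen directly with dnorm in R (a small sketch; the numbers are arbitrary):

```r
# PDF view: fix the parameters, vary the data.
x_grid   <- seq(-4, 4, length.out = 100)
pdf_vals <- dnorm(x_grid, mean = 0, sd = 1)       # density of many x under N(0, 1)

# Likelihood view: fix the data, vary the parameters.
D        <- c(1.2, -0.3, 0.7)                     # observed data, held fixed
mu_grid  <- seq(-2, 2, length.out = 100)
lik_vals <- sapply(mu_grid, function(m) prod(dnorm(D, mean = m, sd = 1)))

# L(mu) = p(D | mu, sigma^2) peaks near the sample mean of D.
mu_grid[which.max(lik_vals)]
mean(D)
```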
  12. 12. Interlude: Frequentist Inference. Data: Y = [7, ..., 2], X = [2, ..., 9], seen as a sample from a "true" population. Model: Y = a * X + b, i.e. Y ~ N(a * X + b, σ²), so L = p(D | a, b, σ²). MLE: argmax over a, b, σ² of Σ ln p(D | a, b, σ²), giving "true" unique values for a, b and σ².
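A hedged sketch of that MLE in R: simulate data from a linear model (the "true" values here are made up) and maximize Σ ln p(D | a, b, σ²) numerically with optim:

```r
set.seed(42)
X <- runif(100, 0, 10)
Y <- 1.5 * X + 4 + rnorm(100, sd = 2)     # "true" a = 1.5, b = 4, sigma = 2

# Negative log-likelihood of Y ~ N(a*X + b, sigma^2).
nll <- function(par) {
  a <- par[1]; b <- par[2]; sigma <- exp(par[3])   # log-sigma keeps sigma > 0
  -sum(dnorm(Y, mean = a * X + b, sd = sigma, log = TRUE))
}

mle <- optim(c(0, 0, 0), nll)
c(a = mle$par[1], b = mle$par[2], sigma = exp(mle$par[3]))

# The (a, b) point estimates match ordinary least squares:
coef(lm(Y ~ X))
```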
  13. 13. Interlude: Frequentist Inference. A sample of N = 3, D = [7, 3, 2], is drawn from the "true" population. Repeating the sampling (in the limit, infinitely often) yields the sampling distribution of a test statistic, e.g. the F distribution for inter-group variance / intra-group variance, centred on the H0 mean by the Central Limit Theorem. Probability is "long-run" probability. ● Sampling distribution applet: http://onlinestatbook.com/stat_sim/sampling_dist/index.html
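The sampling distribution can also be built by brute force in R: draw many samples of N = 3 from an assumed "true" population and collect the statistic each time (a sketch; the Normal population and its parameters are illustrative):

```r
set.seed(7)
population_mean <- 5
population_sd   <- 2

# Repeatedly sample N = 3 and record the sample mean (the test statistic here).
sample_means <- replicate(10000, mean(rnorm(3, population_mean, population_sd)))

# The sampling distribution is centred on the population mean (CLT),
# with spread sd / sqrt(N).
hist(sample_means, breaks = 50)
c(mean = mean(sample_means), se = sd(sample_means), theory = population_sd / sqrt(3))
```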
  14. 14. Interlude: Frequentist Inference ● Sampling distribution applet: http://onlinestatbook.com/stat_sim/sampling_dist/index.html "When a frequentist says that the probability of "heads" in a coin toss is 0.5 (50%), she means that in infinitely many such coin tosses, 50% of the coins will show "heads"."
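That long-run notion of probability is easy to simulate (a toy sketch):

```r
set.seed(1)
tosses <- rbinom(1e6, size = 1, prob = 0.5)   # 0 = tails, 1 = heads

# As the number of tosses grows, the observed proportion of heads approaches 0.5.
running_freq <- cumsum(tosses) / seq_along(tosses)
running_freq[c(10, 100, 1e4, 1e6)]
```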
  15. 15. Bayesian Inference ● John Kruschke: Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan Chapter 5 ● Rasmus Bååth: http://www.sumsar.net/blog/2017/02/introduction-to-bayesian-data-analysis-part-two/ ● Richard McElreath: Statistical Rethinking book and lectures (https://www.youtube.com/watch?v=4WVelCswXo4)
  16. 16. Bayesian Inference
  17. 17. Bayesian Inference. Discrete values: just sum it up! :) Continuous values: integration over the complete parameter space... :(
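For a toy model with a single continuous parameter, the integral can still be approximated by brute force: discretize the parameter space so that the integral becomes a sum. A minimal R sketch, assuming a Normal likelihood with known σ = 2 and an illustrative prior:

```r
set.seed(3)
D <- rnorm(20, mean = 5, sd = 2)               # observed data; sigma = 2 assumed known

mu_grid <- seq(0, 10, length.out = 1000)       # discretized parameter space
prior   <- dnorm(mu_grid, mean = 0, sd = 10)   # weakly informative prior on mu
lik     <- sapply(mu_grid, function(m) exp(sum(dnorm(D, m, 2, log = TRUE))))

# Bayes' rule on the grid: posterior proportional to likelihood x prior,
# normalized by a sum that stands in for the integral over the parameter space.
posterior <- lik * prior / sum(lik * prior)

mu_grid[which.max(posterior)]                  # posterior mode, close to mean(D)
sum(mu_grid * posterior)                       # posterior mean
```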
  18. 18. Bayesian Inference: averaging over the complete parameter space via integration is impractical! Solution: we sample from the posterior distribution (proportional to the joint) with smart MCMC algorithms! (Subject of another talk.)
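To show what an MCMC sampler does (though not the Hamiltonian Monte Carlo / NUTS sampler Stan actually uses), here is a bare-bones random-walk Metropolis sketch for the same one-parameter toy model; data, prior and tuning values are illustrative:

```r
set.seed(4)
D <- rnorm(20, mean = 5, sd = 2)

# Unnormalized log posterior: log-likelihood + log prior (sigma = 2 assumed known).
log_post <- function(mu) sum(dnorm(D, mu, 2, log = TRUE)) + dnorm(mu, 0, 10, log = TRUE)

n_iter   <- 5000
chain    <- numeric(n_iter)
chain[1] <- 0                                    # arbitrary starting value
for (i in 2:n_iter) {
  proposal <- chain[i - 1] + rnorm(1, sd = 0.5)  # random-walk proposal
  log_accept_ratio <- log_post(proposal) - log_post(chain[i - 1])
  chain[i] <- if (log(runif(1)) < log_accept_ratio) proposal else chain[i - 1]
}

# Discard warm-up draws; the rest approximate the posterior of mu.
posterior_draws <- chain[-(1:1000)]
c(mean = mean(posterior_draws), sd = sd(posterior_draws))
```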
  19. 19. Bayesian Inference. Let's compute this and sample from it!
  20. 20. Quantify all model parts with uncertainty. Model: Y = a * X + b, Y ~ N(a * X + b, σ²), likelihood p(D | a, b, σ²) = p(D | θ). Priors: a ~ N(1, 0.1), b ~ N(4, 0.5), σ² ~ G(1, 0.1), i.e. p(a), p(b), p(σ²). Joint: p(D, a, b, σ²) = p(D | a, b, σ²) * p(a) * p(b) * p(σ²)
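Written as R code, the joint density is just the sum of the log-likelihood and the log priors. A minimal sketch, assuming the slide's N(m, s) notation gives a standard deviation and that G(1, 0.1) means a Gamma with shape 1 and rate 0.1 (both are assumptions):

```r
# Unnormalized log joint: log p(D | a, b, sigma2) + log p(a) + log p(b) + log p(sigma2).
log_joint <- function(a, b, sigma2, X, Y) {
  sum(dnorm(Y, mean = a * X + b, sd = sqrt(sigma2), log = TRUE)) +  # likelihood
    dnorm(a, 1, 0.1, log = TRUE) +                                  # a ~ N(1, 0.1)
    dnorm(b, 4, 0.5, log = TRUE) +                                  # b ~ N(4, 0.5)
    dgamma(sigma2, shape = 1, rate = 0.1, log = TRUE)               # sigma2 ~ G(1, 0.1)
}

# Example evaluation on simulated stand-in data.
set.seed(5)
X <- runif(50, 0, 10)
Y <- rnorm(50, mean = 1 * X + 4, sd = 1)
log_joint(a = 1, b = 4, sigma2 = 1, X, Y)
```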
  21. 21. From model to code: Y ~ N(a * X + b, σ²), with priors a ~ N(1, 0.1), b ~ N(4, 0.5), σ² ~ G(1, 0.1). ● More examples: https://mc-stan.org/users/documentation/case-studies
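As a hedged sketch (not code from the talk), this model could be written as a Stan program held in an R string. One assumption: Stan parameterizes the Normal by its standard deviation, so sigma below plays the role of the noise scale rather than σ²:

```r
# Stan program for Y ~ N(a*X + b, sigma), with the priors from the slide.
stan_code <- "
data {
  int<lower=1> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real a;
  real b;
  real<lower=0> sigma;
}
model {
  a ~ normal(1, 0.1);            // prior on the slope
  b ~ normal(4, 0.5);            // prior on the intercept
  sigma ~ gamma(1, 0.1);         // prior on the noise scale
  y ~ normal(a * x + b, sigma);  // likelihood
}
"
```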
  22. 22. Implementation in R
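A sketch of how that program might be compiled and fit from R with the rstan package (assuming stan_code from the previous sketch; the data here are simulated stand-ins, and rstan plus a C++ toolchain must be installed):

```r
library(rstan)

# Simulated data in the list format Stan expects.
set.seed(6)
X <- runif(50, 0, 10)
Y <- rnorm(50, mean = 1 * X + 4, sd = 1)
stan_data <- list(N = length(X), x = X, y = Y)

# Compile and sample: 4 chains of MCMC draws from the posterior.
fit <- stan(model_code = stan_code, data = stan_data,
            chains = 4, iter = 2000, warmup = 1000)

# Posterior summaries and diagnostics (Rhat, effective sample size).
print(fit, pars = c("a", "b", "sigma"))

# Posterior draws for further simulation or plotting.
draws <- extract(fit)
hist(draws$a)
```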
  23. 23. Example: Deep Bayesian Neural Nets ● https://alexgkendall.com/computer_vision/bayesian_deep_learning_for_safe_ai/ ● https://twiecki.io/blog/2018/08/13/hierarchical_bayesian_neural_network/
  24. 24. Example: Bayesian Inference and Volatility Modeling Using Stan https://luisdamiano.github.io/personal/volatility_stan2018.pdf Credit: Michael Weylandt, Luis Damiano
  25. 25. Example: Bayesian Inference and Volatility Modeling Using Stan https://luisdamiano.github.io/personal/volatility_stan2018.pdf Credit: Michael Weylandt, Luis Damiano
  26. 26. https://luisdamiano.github.io/personal/volatility_stan2018.pdf Credit: Michael Weylandt, Luis Damiano
  27. 27. Thank you for your attention… and endurance!
  28. 28. Additional Slides Sources, links and more!
  29. 29. All sources in one place!
     About generative vs. discriminative models: Ng, A. Y. and Jordan, M. I. (2002). On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. In Advances in Neural Information Processing Systems, pages 841–848. Rasmus Bååth: Video Introduction to Bayesian Data Analysis, Part 1: What is Bayes?: https://www.youtube.com/watch?time_continue=366&v=3OJEae7Qb_o
     When to use ML vs. statistical modelling: Frank Harrell's blog: http://www.fharrell.com/post/stat-ml/ and http://www.fharrell.com/post/stat-ml2/
     Frequentist approach: How do sampling distributions work (applet): http://onlinestatbook.com/stat_sim/sampling_dist/index.html
     Bayesian inference and computation: John Kruschke: Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan, Chapter 5. Rasmus Bååth: http://www.sumsar.net/blog/2017/02/introduction-to-bayesian-data-analysis-part-two/ Richard McElreath: Statistical Rethinking book and lectures (https://www.youtube.com/watch?v=4WVelCswXo4)
     Many model examples in Stan: https://mc-stan.org/users/documentation/case-studies
     About Bayesian neural networks: https://alexgkendall.com/computer_vision/bayesian_deep_learning_for_safe_ai/ and https://twiecki.io/blog/2018/08/13/hierarchical_bayesian_neural_network/
     Volatility examples: Hidden Markov Models: https://github.com/luisdamiano/rfinance17 Volatility GARCH model and Bayesian workflow: https://luisdamiano.github.io/personal/volatility_stan2018.pdf
     Dictionary: Stats ↔ ML: https://ubc-mds.github.io/resources_pages/terminology/
     The Bayesian workflow: https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html
     Algorithm explanation applet for MCMC exploration of the parameter space: http://elevanth.org/blog/2017/11/28/build-a-better-markov-chain/
     Probabilistic programming conference talks: https://www.youtube.com/watch?v=crvNIGyqGSU
  30. 30. Who to follow on Twitter? ● Chris Fonnesbeck @fonnesbeck (pyMC3) ● Thomas Wiecki @twiecki (pyMC3) Blog: https://twiecki.io/ (nice intros) ● Bayes Dose @BayesDose (general info and papers) ● Richard McElreath @rlmcelreath (ecology, Bayesian statistics expert) All his lectures: https://www.youtube.com/channel/UCNJK6_DZvcMqNSzQdEkzvzA ● Michael Betancourt @betanalpha (Stan) Blog: https://betanalpha.github.io/writing/ Specifically: https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html ● Rasmus Bååth @rabaath Great video series: http://www.sumsar.net/blog/2017/02/introduction-to-bayesian-data-analysis-part-one/ ● Frank Harrell @f2harrell (statistics sage) Great Blog: http://www.fharrell.com/ ● Andrew Gelman @StatModeling (statistics sage) https://statmodeling.stat.columbia.edu/ ● Judea Pearl @yudapearl Book of Why: http://bayes.cs.ucla.edu/WHY/ (more about causality, BN and DAG) ● AND MANY MORE!
  31. 31. Dictionary: Stats ↔ ML. Check https://ubc-mds.github.io/resources_pages/terminology/ for more terminology. Statistics ~ Machine learning / AI:
     Estimation/Fitting ~ Learning
     Hypothesis ~ Classification rule
     Data Point ~ Example / Instance
     Regression ~ Supervised Learning
     Classification ~ Supervised Learning
     Covariates ~ Features
     Parameters ~ Features
     Response ~ Label
     Factor ~ Factor (categorical variables)
     Likelihood ~ Cost Function (sometimes)
  32. 32. Data Science + AI + ML + Stats Credit: Zoubin Ghahramani, CTO UBER. Talk: "Probabilistic Machine Learning: From theory to industrial impact"
