2. • What is Pyro?
• Introduction to Bayesian modeling
• Example 1: linear regression
• Bayesian inference with Pyro
• Example 2: deep Markov model for music
14. Posterior approximation
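• The approximation is referred to as the variational distribution ("guide" in Pyro):
  $p(W, b \mid D) \approx q_\phi(W, b)$
• Simple (factorized) version:
  $p(W, b \mid D) \approx q_{\phi_W}(W)\, q_{\phi_b}(b)$
• Minimize $\mathrm{KL}\left[\, q_\phi(W, b) \,\|\, p(W, b \mid D) \,\right]$ wrt the variational parameters $\phi$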
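The link between this KL objective and the ELBO can be sketched with a standard identity. Here $\Theta = (W, b)$ collects the latent variables, and $\phi$ is assumed to denote the variational parameters (the symbol was lost in the extracted slide, so the notation is a guess):

```latex
\log p(D)
  = \underbrace{\mathbb{E}_{q_\phi(\Theta)}\left[ \log p(D, \Theta) - \log q_\phi(\Theta) \right]}_{\text{ELBO}}
  + \mathrm{KL}\left[\, q_\phi(\Theta) \,\|\, p(\Theta \mid D) \,\right]
```

Since $\log p(D)$ does not depend on $\phi$, minimizing the KL term is equivalent to maximizing the ELBO, which needs only the joint $p(D, \Theta)$ and not the intractable posterior.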
16. Monte Carlo (MC) approximation
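$$
\mathbb{E}_{q_\phi(\Theta)}\left[ \log p(D, \Theta) - \log q_\phi(\Theta) \right]
\approx \frac{1}{N} \sum_{i=1}^{N} \left[ \log p(D, \Theta_i) - \log q_\phi(\Theta_i) \right],
\qquad \Theta_i \sim q_\phi(\Theta)
$$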
• Reparametrization is applied to reduce the variance of the stochastic gradient: https://stats.stackexchange.com/questions/199605
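The estimator above can be checked numerically on a toy problem. The sketch below is not from the slides: it assumes a hypothetical conjugate model (theta ~ N(0, 1), x_j ~ N(theta, 1)) and a Gaussian guide q(theta) = N(m, s^2), chosen only because the ELBO then has a closed form to compare against. Samples are drawn in reparametrized form, theta_i = m + s * eps_i.

```python
import math
import random

def log_normal(x, mean, std):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2

def log_joint(data, theta):
    """log p(D, theta) for the assumed toy model: theta ~ N(0, 1), x_j ~ N(theta, 1)."""
    return log_normal(theta, 0.0, 1.0) + sum(log_normal(x, theta, 1.0) for x in data)

def elbo_mc(data, m, s, num_samples, rng):
    """Monte Carlo ELBO estimate using reparametrized samples theta_i = m + s * eps_i."""
    total = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)
        theta = m + s * eps
        total += log_joint(data, theta) - log_normal(theta, m, s)
    return total / num_samples

def elbo_exact(data, m, s):
    """Closed-form ELBO, available only because this toy model is conjugate."""
    expected_log_prior = -0.5 * math.log(2 * math.pi) - 0.5 * (m ** 2 + s ** 2)
    expected_log_lik = sum(
        -0.5 * math.log(2 * math.pi) - 0.5 * ((x - m) ** 2 + s ** 2) for x in data
    )
    entropy = 0.5 * math.log(2 * math.pi * math.e * s ** 2)
    return expected_log_prior + expected_log_lik + entropy

data = [0.8, 1.3, 0.4, 1.1]  # made-up observations
rng = random.Random(0)
estimate = elbo_mc(data, m=0.7, s=0.5, num_samples=20000, rng=rng)
exact = elbo_exact(data, m=0.7, s=0.5)
print(f"MC estimate: {estimate:.3f}, closed form: {exact:.3f}")
```

In realistic models no closed form exists, which is why the MC estimate is used in its place.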
17. Stochastic variational inference
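1. Draw samples of the RVs in the model (latent RVs are sampled from the guide)
2. Compute the ELBO with the Monte Carlo approximation
3. Compute the stochastic gradient of the ELBO wrt the parameters
4. Apply a gradient-based update (gradient descent on the negative ELBO)
5. Go back to step 1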
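The loop above can be sketched without any library. This example is an illustration, not Pyro's implementation: it reuses an assumed toy model (theta ~ N(0, 1), x_j ~ N(theta, 1)) with guide q(theta) = N(m, s^2), derives the reparametrized gradient by hand, and uses the analytic Gaussian entropy instead of sampling log q. The step size and sample counts are arbitrary choices.

```python
import math
import random

def grad_log_joint(data, theta):
    """d/dtheta of log p(D, theta) for theta ~ N(0, 1), x_j ~ N(theta, 1)."""
    return -theta + sum(x - theta for x in data)

def svi(data, num_steps=2000, num_samples=20, lr=0.01, seed=0):
    """Fit q(theta) = N(m, s^2) by stochastic gradient ascent on the ELBO."""
    rng = random.Random(seed)
    m, log_s = 0.0, 0.0  # variational parameters; s = exp(log_s) stays positive
    for _ in range(num_steps):
        s = math.exp(log_s)
        grad_m, grad_log_s = 0.0, 0.0
        for _ in range(num_samples):
            # Step 1: draw a reparametrized sample theta = m + s * eps.
            eps = rng.gauss(0.0, 1.0)
            theta = m + s * eps
            # Steps 2-3: pathwise gradient of the MC estimate of E_q[log p(D, theta)].
            g = grad_log_joint(data, theta)
            grad_m += g            # d theta / d m = 1
            grad_log_s += g * s * eps  # d theta / d log_s = s * eps
        grad_m /= num_samples
        # The Gaussian entropy is log(s) + const, so it adds exactly 1 to d/d(log_s).
        grad_log_s = grad_log_s / num_samples + 1.0
        # Step 4: ascent on the ELBO, i.e. descent on the negative ELBO.
        m += lr * grad_m
        log_s += lr * grad_log_s
        # Step 5: repeat.
    return m, math.exp(log_s)

data = [0.8, 1.3, 0.4, 1.1]  # made-up observations
m, s = svi(data)
# Exact posterior for this conjugate model: mean sum(x) / (n + 1), std 1 / sqrt(n + 1).
print(f"fitted q: mean={m:.3f}, std={s:.3f}")
print(f"exact posterior: mean={sum(data) / 5:.3f}, std={1 / math.sqrt(5):.3f}")
```

Because the guide family contains the true posterior here, the fitted parameters should land close to the exact values, up to the noise of the stochastic gradient.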
18. Bayesian inference with Pyro
1. Prepare data
2. Implement model and guide
3. Run SVI
4. Draw samples from posterior for prediction
24. Probabilistic model of music
• Model polyphonic music
• Sequences of 88-dimensional binary vectors
• Nonlinear dynamics
• Sequences have different lengths
25. Deep Markov model
• Latent variable models
• Nonlinear transformation (dynamics)
• The linear-Gaussian model behind the Kalman filter is a special case (linear dynamics, Gaussian noise)
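The generative side of such a model can be sketched as a toy nonlinear state-space sampler. This is a simplified stand-in, not the deep Markov model itself: the matrices `A` and `C`, the `tanh` transition, the noise scale, and the bias are hypothetical placeholders for what would be trained neural networks.

```python
import math
import random

NUM_KEYS = 88  # dimensionality of each binary observation vector

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def sample_sequence(length, latent_dim=4, seed=0):
    """Sample one sequence of binary vectors from a toy nonlinear state-space model.

    z_t ~ Normal(tanh(A z_{t-1}), 0.1^2)
    x_t[k] ~ Bernoulli(sigmoid(C[k] . z_t + bias))
    A, C and bias are random placeholders standing in for trained networks.
    """
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)] for _ in range(latent_dim)]
    C = [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)] for _ in range(NUM_KEYS)]
    bias = -2.0  # assumption: biases most entries toward 0
    z = [0.0] * latent_dim
    sequence = []
    for _ in range(length):
        # Nonlinear transition; with the identity instead of tanh the dynamics are linear.
        mean = [math.tanh(sum(a * zj for a, zj in zip(row, z))) for row in A]
        z = [rng.gauss(mu, 0.1) for mu in mean]
        # Independent Bernoulli emission for each of the 88 dimensions.
        probs = [sigmoid(sum(c * zj for c, zj in zip(row, z)) + bias) for row in C]
        sequence.append([1 if rng.random() < p else 0 for p in probs])
    return sequence

# Sequences may have different lengths.
short_piece = sample_sequence(length=5, seed=1)
long_piece = sample_sequence(length=8, seed=2)
print(len(short_piece), len(long_piece), len(short_piece[0]))
```

Replacing `tanh` with the identity and the Bernoulli emission with a linear-Gaussian one recovers the setting in which the Kalman filter gives exact inference.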
30. Amortized inference
• Instead of a separately parametrized posterior for each latent RV, introduce a neural network that mimics inference on each latent RV by outputting its variational parameters given the other RVs
• Learning-to-learn
• Variational autoencoder (VAE)
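The idea can be sketched with a tiny encoder that maps an input to the variational parameters of a Gaussian q(z | x). The class name `AmortizedGuide`, the layer sizes, and the random untrained weights are all hypothetical; in practice the weights would be learned by SVI, and in Pyro the network would typically be a `torch.nn.Module` called inside the guide.

```python
import math
import random

class AmortizedGuide:
    """Tiny encoder mapping an input x to the parameters (loc, scale) of q(z | x)."""

    def __init__(self, input_dim, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        # Shared weights: one set for all data points, instead of one (loc, scale) per point.
        self.w_hidden = [
            [rng.gauss(0.0, 0.1) for _ in range(input_dim)] for _ in range(hidden_dim)
        ]
        self.w_loc = [rng.gauss(0.0, 0.1) for _ in range(hidden_dim)]
        self.w_log_scale = [rng.gauss(0.0, 0.1) for _ in range(hidden_dim)]

    def variational_params(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.w_hidden]
        loc = sum(w * h for w, h in zip(self.w_loc, hidden))
        # exp keeps the scale strictly positive.
        scale = math.exp(sum(w * h for w, h in zip(self.w_log_scale, hidden)))
        return loc, scale

    def num_parameters(self):
        return sum(len(row) for row in self.w_hidden) + len(self.w_loc) + len(self.w_log_scale)

guide = AmortizedGuide(input_dim=3)
loc, scale = guide.variational_params([0.2, -0.1, 0.5])
print(loc, scale, guide.num_parameters())
```

The number of guide parameters depends only on the network size, not on the number of data points or time steps, and the same network yields variational parameters for inputs it has not seen before.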