Bayesian Models in R 
Vivian Zhang | SupStat Inc. 
Copyright SupStat Inc., All rights reserved 
Outline 
1. Introduction to Bayes and Bayes' Theorem 
2. Distribution estimation 
3. Conditional probability 
4. Bayesian models 
Introduction to Bayes and 
Bayes' Theorem 
The story behind the Bayesian model
Thomas Bayes
· 18th-century English statistician
· Best known for Bayes' Theorem
· Essential contributor to the early development of probability theory
Source: http://www.bioquest.org/products/auth_images/422_bayes.gif 
The Models
1. Models using Bayes' theorem (based on conditional probability)
   · Naive Bayes, Association Rules
2. Bayes Decision Theory
   · Classical Bayesian model for decision theory
3. Models implementing Bayesian thinking
   · Treat all parameters as random variables, especially in hierarchical models
Distribution Estimation 
Distribution Estimation
Probability Density Function
· In statistics, the Probability Density Function (PDF) of a continuous random variable describes the relative likelihood that the variable takes a value near a given point.
· Example: plot of the PDF of the Normal distribution
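The plot itself did not survive the page capture. A minimal sketch of how such a figure could be produced in base R (assuming the slide simply showed the standard Normal density via dnorm()):

# Minimal sketch (not from the original slide): plot the standard Normal PDF
x = seq(-4, 4, by = 0.01)
plot(x, dnorm(x), type = "l",
     main = "PDF of the Normal distribution", xlab = "x", ylab = "density")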
Distribution Estimation
Probability Density Function
· The PDF has an important place in statistics
  - It contains all the information about the random variable
· Knowing the PDF, we can calculate the
  - Mean
  - Variance
  - Median
  - etc.
Distribution Estimation
Probability Density Function
Once you have the PDF, you have everything about the random variable. This allows you to perform:
· Bayesian Hypothesis Tests
· Bayesian Interval Estimation
· Bayesian Regression Models
· Bayesian Logistic Models
· etc.
Distribution Estimation
Probability Density Function
· Example, Bayesian Regression: Y = Xβ + ϵ, ϵ ∼ N(0, σ²)
· Estimation methods for the regression model:
  - OLS (Ordinary Least Squares): β̂ = (X′X)⁻¹X′Y is the estimator of β
  - Bayesian: β ∼ N((X′X)⁻¹X′Y, σ²(X′X)⁻¹)
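To make the two estimation routes concrete, here is a small hedged sketch (not part of the original slides): it simulates data, computes the OLS estimate both with lm() and with the closed form (X′X)⁻¹X′Y, which is also the mean of the Bayesian posterior above under a flat prior on β.

# Hedged sketch with simulated data; the true beta and sigma are assumptions for illustration
set.seed(1)
n = 100
X = cbind(1, rnorm(n))               # design matrix with an intercept
beta = c(2, 3)
Y = drop(X %*% beta + rnorm(n, 0, 1))
coef(lm(Y ~ X - 1))                  # OLS via lm()
solve(t(X) %*% X) %*% t(X) %*% Y     # (X'X)^{-1} X'Y: the OLS estimate and the posterior mean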
Distribution Estimation
The Bayesian Model
· Before obtaining data, one has beliefs about the value of the parameter (e.g., a proportion) and models those beliefs in terms of a prior distribution.
· After data have been observed, one updates those beliefs by computing the posterior distribution.
Distribution Estimation
The Bayesian Model
· Building a Bayesian model begins with Bayesian thinking (every quantity has its own distribution).
· Steps to build a Bayesian model:
  - Specify the prior distribution
  - Calculate the parameters of the posterior distribution
  - Finish the statistical task (interval estimation, statistical decision, etc.)
Inferring from the posterior distribution
· Posterior inference is the core of Bayesian modeling, because we do not actually know the population distribution which generated our data; we use the conditional distribution to address this gap indirectly. A certain degree of mathematical sophistication is required here, without which we cannot easily implement the model computationally.
Essentials:
· Bayes' theorem
· A conditional distribution
  - For example: ϵ in regression comes from a normal distribution
· A prior distribution
  - Chosen even when no prior information is given
Calculating the posterior distribution
The most difficult part is calculating the posterior distribution, which requires integration.
· Markov chain Monte Carlo (MCMC)
  - Gibbs sampling
  - Metropolis-Hastings (MH) method
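As a concrete illustration (not from the original slides), here is a tiny random-walk Metropolis-Hastings sampler for the posterior of a normal mean θ; the prior θ ∼ N(0, 10²) and data x ∼ N(θ, 5²) are assumptions chosen only for this sketch.

# Hedged sketch: random-walk MH for the posterior of a normal mean
set.seed(42)
x = rnorm(50, mean = 3, sd = 5)
log_post = function(theta) {
  sum(dnorm(x, theta, 5, log = TRUE)) + dnorm(theta, 0, 10, log = TRUE)
}
n_iter = 5000
theta = numeric(n_iter)
theta[1] = 0
for (i in 2:n_iter) {
  prop = theta[i - 1] + rnorm(1, 0, 1)                        # random-walk proposal
  if (log(runif(1)) < log_post(prop) - log_post(theta[i - 1])) {
    theta[i] = prop                                           # accept
  } else {
    theta[i] = theta[i - 1]                                   # reject, keep current value
  }
}
mean(theta[-(1:1000)])   # posterior mean estimate after discarding burn-in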
Conditional probability 
Conditional probability
What is conditional probability?
· The probability that event A will occur given that event B has occurred. This probability is written as P(A|B).
  P(A|B) = P(AB) / P(B)
· A and B are two events
· P(AB) is the probability that both A and B occur
· P(B) is the probability that B occurs
Conditional probability
Why conditional probability?
Example
· Suppose
  - A: the event of getting a cold
  - B: the event of a rainy day (p = 0.2)
  - AB: the event that it rains and you get a cold (p = 0.1)
  P(A|B) = P(AB) / P(B) = 0.1 / 0.2 = 0.5
· Interpretation:
  - When it rains, the probability of getting a cold is 50%
Conditional probability
Exercise
· There are two kids in a family.
  - If (at least) one of the kids is a boy, the probability that the other one is also a boy is... 1/3
  - If the first one is a boy, the probability that the other one is a boy is... 1/2
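A quick Monte Carlo check of these two answers (added here, not in the original slides), simulating many two-child families with boys and girls equally likely:

# Hedged sketch: verify the exercise answers by simulation
set.seed(1)
kids = matrix(sample(c("boy", "girl"), 2e5, replace = TRUE), ncol = 2)
at_least_one_boy = kids[, 1] == "boy" | kids[, 2] == "boy"
both_boys = kids[, 1] == "boy" & kids[, 2] == "boy"
mean(both_boys[at_least_one_boy])            # approximately 1/3
mean(kids[kids[, 1] == "boy", 2] == "boy")   # approximately 1/2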
Conditional Probability
Models related to conditional probability
· Apriori
  - Mining association rules
  - The association (confidence) from A to B is defined as: A ⇒ B := P(B|A) = P(AB) / P(A)
· In R, use the arules package
Conditional Probability
Apriori
· Goal: find the items with strong relationships
· First, load the data:
library(arules)
data = read.csv("data/BASKETS1n")
names(data)
[1] cardid value pmethod sex homeown income
[7] age fruitveg freshmeat dairy cannedveg cannedmeat
[13] frozenmeal beer wine softdrink fish confectionery
Conditional Probability
Apriori
basket = data[, 8:18]                 # keep only the 11 item columns
names(basket)[which(basket[1, ] == T)]
[1] freshmeat dairy confectionery
tbs2 = apply(basket, 1, function(x) names(basket)[which(x==T)])  # items in each basket
len = sapply(tbs2, length)
require(arules)
trans.code = rep(1:1000, len)                        # transaction id for every item occurrence
trans.items = unname(unlist(tbs2))
trans.code.ind = match(trans.code, unique(trans.code))
trans.items.ind = match(trans.items, unique(trans.items))
Conditional Probability
Apriori
mat = sparseMatrix(i = trans.items.ind,
                   j = trans.code.ind,
                   x = 1,
                   dims = c(length(unique(trans.items)),
                            length(unique(trans.code))))
mat = as(mat, 'ngCMatrix')
# after setting the parameters we get the model:
trans.res = apriori(mat, parameter = list(confidence=0.05,
                                          support=0.05,
                                          minlen=2, maxlen=3))
Conditional Probability
Apriori
parameter specification: 
confidence minval smax arem aval originalSupport support minlen maxlen target ext 
0.05 0.1 1 none FALSE TRUE 0.05 2 3 rules FALSE 
algorithmic control: 
filter tree heap memopt load sort verbose 
0.1 TRUE TRUE FALSE TRUE 2 TRUE 
apriori - find association rules with the apriori algorithm 
version 4.21 (2004.05.09) (c) 1996-2004 Christian Borgelt 
set item appearances ...[0 item(s)] done [0.00s]. 
set transactions ...[11 item(s), 940 transaction(s)] done [0.00s]. 
sorting and recoding items ... [11 item(s)] done [0.00s]. 
creating transaction tree ... done [0.00s]. 
checking subsets of size 1 2 3 done [0.00s]. 
writing ... [108 rule(s)] done [0.00s].
Conditional Probability
· At last, we have the item pairs with the strongest relationships within a basket
#let's see these rules: 
lhs.generic = unique(trans.items)[trans.res@lhs@data@i+1] 
rhs.generic = unique(trans.items)[trans.res@rhs@data@i+1] 
cbind(lhs.generic, rhs.generic)[1:10, ] 
lhs.generic rhs.generic 
[1,] dairy confectionery 
[2,] confectionery dairy 
[3,] dairy fish 
[4,] fish dairy 
[5,] dairy fruitveg 
[6,] fruitveg dairy 
[7,] dairy frozenmeal 
[8,] frozenmeal dairy 
[9,] freshmeat confectionery 
[10,] confectionery freshmeat 
Conditional Probability
Models related to conditional probability
· Naive Bayes
  - Used in recommendation systems and classification problems
  - Compute the posterior probability P(C|A1, A2, …, An) for all values of C using Bayes' theorem:
    P(C|A1 A2 ⋯ An) = P(A1 A2 ⋯ An |C) × P(C) / P(A1 A2 ⋯ An)
  - Choose the value of C that maximizes P(C|A1, A2, …, An)
  - Equivalent to choosing the value of C that maximizes P(A1, A2, …, An |C) P(C)
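To connect the formula to the e1071 code on the next slides, here is a hedged sketch (not from the original slides) that reproduces the posterior computation "by hand" for a single iris flower, assuming Gaussian class-conditional densities for each feature, which is what naiveBayes() fits for numeric predictors:

# Hedged sketch: naive Bayes posterior for one observation, computed by hand
data(iris)
x_new = iris[1, 1:4]
post = sapply(levels(iris$Species), function(cl) {
  sub = iris[iris$Species == cl, 1:4]
  lik = prod(mapply(function(v, m, s) dnorm(v, m, s),
                    as.numeric(x_new), colMeans(sub), apply(sub, 2, sd)))
  lik * (1 / 3)                      # times the prior P(C) = 1/3
})
post / sum(post)                     # normalize to get P(C | A1, ..., An)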
Naive Bayes
library(e1071)   # provides naiveBayes()
data(iris)
m = naiveBayes(Species ~ ., data=iris)
## alternatively:
m = naiveBayes(iris[, -5], iris[, 5])
Naive Bayes
Model: 
m 
Naive Bayes Classifier for Discrete Predictors 
Call: 
naiveBayes.default(x = iris[, -5], y = iris[, 5]) 
A-priori probabilities: 
iris[, 5] 
setosa versicolor virginica 
0.33333 0.33333 0.33333 
Conditional probabilities: 
Sepal.Length 
iris[, 5] [,1] [,2] 
setosa 5.006 0.35249
Naive Bayes
Predict: 
table(predict(m, iris), iris[,5]) 
setosa versicolor virginica 
setosa 50 0 0 
versicolor 0 47 3 
virginica 0 3 47 
From conditional probability to Bayes' Theorem
· We have: P(B|A) = P(AB) / P(A)
· So: P(AB) = P(B|A)P(A)
· Substituting into the conditional probability: P(A|B) = P(AB) / P(B) = P(B|A)P(A) / P(B)
Bayes' Theorem
P(A|B) = P(B|A)P(A) / P(B)
· Bayes' theorem relates the conditional probability to the marginal distribution of a random variable.
· Bayes' theorem tells us how to update our thinking after obtaining new data.
· Harold Jeffreys claimed that Bayes' theorem is to statistics what the Pythagorean theorem is to geometry.
Bayes' Theorem
Continuous situation
· The Bayes' theorem given above is in discrete form
· In the real world we often work with continuous random variables
· Bayes' theorem can be written in continuous form as:
  π(θ|x) = f(x|θ)π(θ) / m(x)
Bayes' Theorem
Continuous form
π(θ|x) = f(x|θ)π(θ) / m(x)
· Here
  - θ is an unknown parameter
  - x is the observed data
  - We move from π(θ) to π(θ|x)
  - That is, from our original knowledge of θ to our updated knowledge after observing x
Bayes' Theorem
Continuous form
π(θ|x) = f(x|θ)π(θ) / m(x)
· Based on the properties of continuous random variables, it can be written as:
  π(θ|x) = f(x|θ)π(θ) / ∫ f(x|θ)π(θ)dθ
Bayes' Theorem
Continuous form
Important distributions:
π(θ|x) = f(x|θ)π(θ) / m(x) = f(x|θ)π(θ) / ∫ f(x|θ)π(θ)dθ
· π(θ): the prior distribution
· π(θ|x): the posterior distribution
Bayes' Theorem
Continuous form
Other distributions:
π(θ|x) = f(x|θ)π(θ) / m(x) = f(x|θ)π(θ) / ∫ f(x|θ)π(θ)dθ
· m(x) = ∫ f(x|θ)π(θ)dθ: the marginal distribution
· f(x|θ)π(θ) = f(x, θ): the joint distribution
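A small numeric illustration (not from the original slides) of these quantities, evaluating the prior, likelihood, marginal m(x), and posterior on a grid; the specific prior N(0, 2²), likelihood x|θ ∼ N(θ, 1), and observation x = 4 are assumptions for this sketch only:

# Hedged sketch: the continuous form of Bayes' theorem evaluated on a grid
theta = seq(-10, 10, by = 0.01)
prior = dnorm(theta, 0, 2)          # pi(theta)
lik   = dnorm(4, theta, 1)          # f(x | theta) at the observation x = 4
m_x   = sum(lik * prior) * 0.01     # marginal m(x), approximating the integral
post  = lik * prior / m_x           # pi(theta | x)
sum(theta * post) * 0.01            # posterior mean, close to the exact value 3.2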
Bayesian Models 
Bayesian Models
Bayesian thinking 
data(iris) 
head(iris) 
Sepal.Length Sepal.Width Petal.Length Petal.Width Species 
1 5.1 3.5 1.4 0.2 setosa 
2 4.9 3.0 1.4 0.2 setosa 
3 4.7 3.2 1.3 0.2 setosa 
4 4.6 3.1 1.5 0.2 setosa 
5 5.0 3.6 1.4 0.2 setosa 
6 5.4 3.9 1.7 0.4 setosa 
· Data are random variables with a mean of μ 
Bayesian Models
Bayesian thinking
· The frequentist perspective: the mean μ is a constant
colMeans(iris[, 1:3]) 
Sepal.Length Sepal.Width Petal.Length 
5.8433 3.0573 3.7580 
Bayesian Models
Bayesian thinking 
· The Bayesian perspective: The mean μ is a random variable 
PROB SEPAL LENGTH SEPAL WIDTH PETAL LENGTH 
90% 5.843333 3.057333 3.758000 
10% Others Others Others 
Bayesian Models
· In fact, nearly all of modern Bayesian modeling uses Bayesian thinking
· Nearly all statistical models can be implemented in Bayesian form
· Even some non-parametric models can be transformed into Bayesian versions
· Bayes Cluster
· Bayes Regression
  - Logit, Probit, Tobit, Quantile, LASSO...
· Bayes Neural Net
· Non-parametric Bayes
· Hierarchical models
· etc.
Bayesian Modeling Example
Question
· For a sample X1, X2, ..., Xn ∼ N(θ, σ²) from a normal distribution, we want to know its mean.
· Frequentists think: θ̂ = mean(x)
· Bayesians think: θ is a random variable with its own distribution; suppose θ ∼ N(μ, τ²)
  - Infer the posterior distribution
  - Calculate the posterior distribution
  - Estimate the mean of the sample
Bayesian Modeling Example
Inference
Inferring the posterior distribution using Bayes' Theorem in continuous form:
π(θ|x) = f(x|θ)π(θ) / m(x) = f(x|θ)π(θ) / ∫ f(x|θ)π(θ)dθ
· Plug the distributions into the theorem to calculate the posterior distribution
  - Prior distribution: θ ∼ N(μ, τ²)
  - Conditional distribution: x|θ ∼ N(θ, σ²)
Bayesian Modeling Example
Inference 
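The derivation shown on this slide did not survive as text. For reference, the standard conjugate normal-normal result it relies on can be written as follows (treating x as a single observation, which matches the posterior mean and variance formulas used on the next slides):

% Assumed setup: prior theta ~ N(mu, tau^2), likelihood x | theta ~ N(theta, sigma^2)
\pi(\theta \mid x) \propto f(x \mid \theta)\,\pi(\theta)
  \propto \exp\!\left(-\frac{(x-\theta)^2}{2\sigma^2}\right)
          \exp\!\left(-\frac{(\theta-\mu)^2}{2\tau^2}\right)
  \propto \exp\!\left(-\frac{(\theta-\mu_1)^2}{2\tau_1^2}\right),
\qquad
\mu_1 = \frac{\sigma^2\mu + \tau^2 x}{\sigma^2 + \tau^2},
\qquad
\tau_1^2 = \frac{\sigma^2\tau^2}{\sigma^2 + \tau^2}.

So the posterior is again normal, θ | x ∼ N(μ₁, τ₁²).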
Bayesian Modeling Example
Calculating the posterior distribution
According to the theorem, we know the mean and the variance of the posterior of θ for a normal distribution.
postDis = function(miu=2, tau=4, n=100) {
  x = rnorm(n, 3, 5)                                    # simulate data with true mean 3, sd 5
  a = list(0)
  a[[1]] = (var(x)*miu+tau^2*mean(x))/(var(x)+tau^2)    # posterior mean, using var(x) for sigma^2
  a[[2]] = var(x)*tau^2/(var(x)+tau^2)                  # posterior variance
  a
}
postDis(3, 5, 1000)
[[1]]
[1] 2.9284
[[2]]
[1] 12.254
Bayesian Modeling Example
Estimating the mean
· In ordinary statistics, the MLE and moment estimators of μ in a normal distribution are both the sample mean.
· For the Bayes posterior distribution:
  - Use the posterior maximum likelihood estimator
  - It can be considered as the MLE of the posterior distribution
  - The posterior distribution is normal too, so its mean is: (σ²μ + τ²x̄)/(σ² + τ²)
Bayesian Modeling Example
Estimating the mean
· x ∼ N(μ, σ) = N(3, 5)
  - The true mean is 3
· We now use different prior distributions
· and observe the estimation error in each situation
Bayesian Modeling Example
· Prior distribution: N(3, 1)
library(ggplot2)
plot_dif = function(miu=3, tau=1) {
  i = seq(100, 10000, by=10)
  set.seed(123)
  meanCompare = function(n=100, miu=3, tau=1) {
    x = rnorm(n, 3, 5)
    (var(x)*miu+tau^2*mean(x))/(var(x)+tau^2) - 3       # error of the Bayes estimator
  }
  aa = sapply(i, meanCompare, miu=miu, tau=tau)         # Bayes estimator error by sample size
  bb = sapply(i, function(i) mean(rnorm(i,3,5)) - 3)    # MLE (sample mean) error by sample size
  g = ggplot(data.frame(i=i, a=aa, b=bb)) +
    geom_line(aes(x=i, y=b), col="blue") +              # MLE in blue
    geom_line(aes(x=i, y=a), col="red")                 # Bayes estimator in red
  print(g)
}
Bayesian Modeling Example
· Prior distribution: N(3, 1) (Bayes estimator in red, MLE in blue) 
plot_dif(3, 1) 
Bayesian Modeling Example
· Prior distribution: N(2, 1) (Bayes estimator in red, MLE in blue) 
plot_dif(2,1) 
Bayesian Modeling Example
· Prior distribution: N(2, 4) (Bayes estimator in red, MLE in blue) 
plot_dif(2,4) 
Bayesian Modeling Example
· Prior distribution: N(2, 100) (Bayes estimator in red, MLE in blue) 
plot_dif(2,100) 
Bayesian Modeling Example
1. As we can see, if the prior distribution is accurate, the Bayes estimator is better than the ordinary estimator.
2. If the prior distribution is not accurate enough:
   · A larger prior variance is better
   · For a suitable variance, more data is better
Bayesian Modeling Example
Choosing the prior distribution
· When choosing a prior distribution...
  - If you are confident in it, it can improve the accuracy of the estimator
  - If you are not, choose a larger prior variance to make the estimator more robust