PRML 2.4-2.5

The exponential family
&
Nonparametric methods	
 
June 11, 2014
by Shinichi TAMURA
NONPARAMETRIC METHODS	
 THE EXPONENTIAL FAMILY	
 
Today's topics

1. The exponential family
   1. What is the exponential family?
   2. Maximum likelihood for EF
   3. How to decide priors for EF

2. Nonparametric methods
   1. What is the point of nonparametric methods?
   2. Kernel density estimator
   3. Nearest-neighbour methods
 
 
 
The Exponential Family
Almost all of the distributions we have studied so far belong to a single class, namely the exponential family.

Parametric distributions
•  The exponential family: Bernoulli, multinomial, Gaussian, beta, gamma, von Mises, etc.
•  Outside the exponential family: Gaussian mixture, etc.
 
 
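The Exponential Family
The exponential family over $\mathbf{x}$, given the parameter $\boldsymbol{\eta}$, is the class of distributions of the form

$$ p(\mathbf{x}|\boldsymbol{\eta}) = h(\mathbf{x})\, g(\boldsymbol{\eta}) \exp\{\boldsymbol{\eta}^{\mathrm{T}} \mathbf{u}(\mathbf{x})\} $$

•  $\boldsymbol{\eta}$: the natural parameter
•  $g(\boldsymbol{\eta})$: the normalizing constant
•  $\exp\{\boldsymbol{\eta}^{\mathrm{T}} \mathbf{u}(\mathbf{x})\}$: the factor where $\mathbf{x}$ and $\boldsymbol{\eta}$ meet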
 
 
The Exponential Family
E.g. 1) The Bernoulli Distribution

$$ p(x|\eta) = \mu^{x}(1-\mu)^{1-x} = \sigma(-\eta)\exp(\eta x), $$

where

$$ \eta = \ln\frac{\mu}{1-\mu}, \qquad u(x) = x, \qquad h(x) = 1, \qquad g(\eta) = \sigma(-\eta), $$

and $\sigma(a) = 1/(1+\exp(-a))$ is the logistic sigmoid.
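The Bernoulli example can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names are made up for illustration. It evaluates the standard form and the natural-parameter form side by side.

```python
import math

def sigmoid(a):
    """Logistic sigmoid sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def bernoulli_standard(x, mu):
    """Standard form: mu^x (1 - mu)^(1 - x), for x in {0, 1}."""
    return mu ** x * (1.0 - mu) ** (1 - x)

def bernoulli_natural(x, eta):
    """Exponential-family form with h(x) = 1, g(eta) = sigmoid(-eta), u(x) = x."""
    return sigmoid(-eta) * math.exp(eta * x)

mu = 0.3
eta = math.log(mu / (1.0 - mu))  # natural parameter (log-odds)
for x in (0, 1):
    print(x, bernoulli_standard(x, mu), bernoulli_natural(x, eta))
```

Both columns should agree up to floating-point rounding for any $\mu$ strictly between 0 and 1.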
 
 
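The Exponential Family
E.g. 2) The Multinomial Distribution

$$ p(\mathbf{x}|\boldsymbol{\eta}) = \prod_{k=1}^{M} \mu_k^{x_k} = \exp(\boldsymbol{\eta}^{\mathrm{T}}\mathbf{x}), $$

where

$$ \boldsymbol{\eta} = (\ln\mu_1, \ldots, \ln\mu_M)^{\mathrm{T}}, \qquad \mathbf{u}(\mathbf{x}) = \mathbf{x}, \qquad h(\mathbf{x}) = 1, \qquad g(\boldsymbol{\eta}) = 1. $$

The components of $\boldsymbol{\eta}$ are not independent, because

$$ \sum_{k=1}^{M}\exp(\eta_k) = \sum_{k=1}^{M}\mu_k = 1. $$

This constraint is inconvenient!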
 
 
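The Exponential Family
E.g. 2) The Multinomial Distribution

Remove the constraint by substituting

$$ \mu_M = 1 - \sum_{k=1}^{M-1}\mu_k, \qquad x_M = 1 - \sum_{k=1}^{M-1}x_k. $$

$$
\begin{aligned}
p(\mathbf{x}|\boldsymbol{\mu})
&= \exp\Bigl\{\sum_{k=1}^{M-1} x_k\ln\mu_k + \Bigl(1-\sum_{k=1}^{M-1}x_k\Bigr)\ln\Bigl(1-\sum_{k=1}^{M-1}\mu_k\Bigr)\Bigr\} \\
&= \exp\Bigl\{\sum_{k=1}^{M-1} x_k\ln\frac{\mu_k}{1-\sum_{j=1}^{M-1}\mu_j} + \ln\Bigl(1-\sum_{k=1}^{M-1}\mu_k\Bigr)\Bigr\} \\
&= \Bigl(1-\sum_{k=1}^{M-1}\mu_k\Bigr)\exp\Bigl\{\sum_{k=1}^{M-1} x_k\ln\frac{\mu_k}{1-\sum_{j=1}^{M-1}\mu_j}\Bigr\}.
\end{aligned}
$$

Therefore...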
 
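The Exponential Family
E.g. 2') The Multinomial Distribution w/o constraint

$$ p(\mathbf{x}|\boldsymbol{\eta}) = \prod_{k=1}^{M}\mu_k^{x_k} = \Bigl(1 + \sum_{k=1}^{M-1}\exp(\eta_k)\Bigr)^{-1}\exp(\boldsymbol{\eta}^{\mathrm{T}}\mathbf{x}), $$

where

$$ \boldsymbol{\eta} = \Bigl(\ln\frac{\mu_1}{1-\sum_j\mu_j},\ \ldots,\ \ln\frac{\mu_{M-1}}{1-\sum_j\mu_j},\ 0\Bigr)^{\mathrm{T}} \quad \text{(sums over } j = 1, \ldots, M-1\text{)}, $$

$$ \mathbf{u}(\mathbf{x}) = \mathbf{x}, \qquad h(\mathbf{x}) = 1, \qquad g(\boldsymbol{\eta}) = \Bigl(1 + \sum_{k=1}^{M-1}\exp(\eta_k)\Bigr)^{-1}. $$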
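As a quick sanity check of the unconstrained parameterization, here is a small sketch that maps probabilities to $\boldsymbol{\eta}$ (with the last component fixed to 0) and recovers them again through $g(\boldsymbol{\eta})\exp(\boldsymbol{\eta}^{\mathrm{T}}\mathbf{x})$. The helper names are illustrative only, not from the slides.

```python
import math

def natural_params(mu):
    """Map probabilities (mu_1, ..., mu_M) to eta, with eta_M = 0.

    Uses mu_M = 1 - sum_{j<M} mu_j, so eta_k = ln(mu_k / mu_M).
    """
    mu_last = mu[-1]
    return [math.log(m / mu_last) for m in mu]

def g(eta):
    """Normalizer g(eta) = (1 + sum_{k<M} exp(eta_k))^(-1)."""
    return 1.0 / (1.0 + sum(math.exp(e) for e in eta[:-1]))

def pmf(x, eta):
    """p(x|eta) = g(eta) exp(eta^T x) for a one-hot vector x."""
    return g(eta) * math.exp(sum(e * xi for e, xi in zip(eta, x)))

mu = [0.2, 0.5, 0.3]
eta = natural_params(mu)
one_hots = ([1, 0, 0], [0, 1, 0], [0, 0, 1])
recovered = [pmf(x, eta) for x in one_hots]
print(recovered)  # should match mu up to rounding
```

The recovery step is the softmax function restricted to $M-1$ free components.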
 
 
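The Exponential Family
E.g. 3) The Gaussian Distribution

$$
\begin{aligned}
p(x|\boldsymbol{\eta})
&= \frac{1}{(2\pi\sigma^2)^{1/2}}\exp\Bigl\{-\frac{1}{2\sigma^2}(x-\mu)^2\Bigr\} \\
&= (2\pi)^{-1/2}\,(-2\eta_2)^{1/2}\exp\Bigl(\frac{\eta_1^2}{4\eta_2}\Bigr)\exp\Bigl\{\begin{pmatrix}\eta_1 & \eta_2\end{pmatrix}\begin{pmatrix}x \\ x^2\end{pmatrix}\Bigr\},
\end{aligned}
$$

where

$$ \boldsymbol{\eta} = \Bigl(\frac{\mu}{\sigma^2},\ -\frac{1}{2\sigma^2}\Bigr)^{\mathrm{T}}, \qquad \mathbf{u}(x) = (x, x^2)^{\mathrm{T}}, $$

$$ h(x) = (2\pi)^{-1/2}, \qquad g(\boldsymbol{\eta}) = (-2\eta_2)^{1/2}\exp\Bigl(\frac{\eta_1^2}{4\eta_2}\Bigr). $$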
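The Gaussian factorization can be verified numerically. This sketch is an illustration with hypothetical function names. It takes $h(x) = (2\pi)^{-1/2}$ and $g(\boldsymbol{\eta}) = (-2\eta_2)^{1/2}\exp(\eta_1^2/4\eta_2)$ and compares the result with the usual density.

```python
import math

def gaussian_standard(x, mu, sigma2):
    """Usual Gaussian density with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def gaussian_natural(x, eta1, eta2):
    """Exponential-family form h(x) g(eta) exp(eta1 * x + eta2 * x^2)."""
    h = 1.0 / math.sqrt(2.0 * math.pi)
    g = math.sqrt(-2.0 * eta2) * math.exp(eta1 ** 2 / (4.0 * eta2))
    return h * g * math.exp(eta1 * x + eta2 * x * x)

mu, sigma2 = 1.5, 0.8
eta1, eta2 = mu / sigma2, -1.0 / (2.0 * sigma2)  # natural parameters
for x in (-1.0, 0.0, 2.0):
    print(x, gaussian_standard(x, mu, sigma2), gaussian_natural(x, eta1, eta2))
```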
 
 
Maximum likelihood for EF

OK, we know what the EF looks like.
Then how do we estimate the parameter?

Maximize the likelihood!
(This is the frequentist way.)
 
 
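Maximum likelihood for EF

Suppose we have i.i.d. data $\mathbf{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$.
The log-likelihood of $\boldsymbol{\eta}$ is

$$
\begin{aligned}
\ln p(\mathbf{X}|\boldsymbol{\eta})
&= \ln\prod_{n=1}^{N} p(\mathbf{x}_n|\boldsymbol{\eta}) \\
&= \ln\prod_{n=1}^{N} h(\mathbf{x}_n)\,g(\boldsymbol{\eta})\exp\{\boldsymbol{\eta}^{\mathrm{T}}\mathbf{u}(\mathbf{x}_n)\} \\
&= \sum_{n=1}^{N}\ln h(\mathbf{x}_n) + N\ln g(\boldsymbol{\eta}) + \boldsymbol{\eta}^{\mathrm{T}}\sum_{n=1}^{N}\mathbf{u}(\mathbf{x}_n).
\end{aligned}
$$

$$ \therefore\ \nabla_{\boldsymbol{\eta}}\ln p(\mathbf{X}|\boldsymbol{\eta}) = N\,\nabla_{\boldsymbol{\eta}}\ln g(\boldsymbol{\eta}) + \sum_{n=1}^{N}\mathbf{u}(\mathbf{x}_n). $$

We set this gradient to zero.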
 
Maximum likelihood for EF

Therefore

$$ -\nabla_{\boldsymbol{\eta}}\ln g(\boldsymbol{\eta}_{\mathrm{ML}}) = \frac{1}{N}\sum_{n=1}^{N}\mathbf{u}(\mathbf{x}_n). $$

Here, $\boldsymbol{\eta}_{\mathrm{ML}}$ depends on the data only through $\sum_n \mathbf{u}(\mathbf{x}_n)$, so $\sum_n \mathbf{u}(\mathbf{x}_n)$ is called the "sufficient statistic".

We need to store only $\sum_n \mathbf{u}(\mathbf{x}_n)$ for estimation.
 
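Maximum likelihood for EF

E.g.) Gaussian distribution

By $g(\boldsymbol{\eta}) = (-2\eta_2)^{1/2}\exp(\eta_1^2/4\eta_2)$ and $\mathbf{u}(x) = (x, x^2)^{\mathrm{T}}$,

$$ -\nabla\ln g(\boldsymbol{\eta}) = \begin{pmatrix} -\dfrac{\eta_1}{2\eta_2} \\[2mm] -\dfrac{1}{2\eta_2} + \dfrac{\eta_1^2}{4\eta_2^2} \end{pmatrix} = \begin{pmatrix} \mu \\ \sigma^2 + \mu^2 \end{pmatrix}. $$

$$ \therefore\ \mu_{\mathrm{ML}} = \frac{1}{N}\sum_n x_n, \qquad \sigma^2_{\mathrm{ML}} = \frac{1}{N}\sum_n x_n^2 - \Bigl(\frac{1}{N}\sum_n x_n\Bigr)^2. $$

That's what we already know.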
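To illustrate that only the sufficient statistic needs to be stored, the sketch below keeps running sums of $x$ and $x^2$ and derives the Gaussian maximum likelihood estimates from them. The class name and its methods are assumptions made for this example, not something defined in the slides.

```python
class GaussianSufficientStats:
    """Accumulates N, sum of x and sum of x^2; the raw data can be discarded."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_x2 = 0.0

    def update(self, x):
        # Add u(x) = (x, x^2) to the running sufficient statistic.
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    def ml_estimates(self):
        # mu_ML = mean of x; sigma^2_ML = mean of x^2 minus (mean of x)^2.
        mean = self.sum_x / self.n
        variance = self.sum_x2 / self.n - mean ** 2
        return mean, variance

stats = GaussianSufficientStats()
for x in [2.0, 4.0, 4.0, 6.0]:
    stats.update(x)
print(stats.ml_estimates())  # (4.0, 2.0)
```

Note that subtracting two large, nearly equal sums can lose precision in floating point; for serious use a numerically stabler update may be preferable.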
 
Maximum likelihood for EF

By the way, we want to know the relation between $\boldsymbol{\eta}_{\mathrm{ML}}$ and $\boldsymbol{\eta}$.
 
 
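Maximum likelihood for EF

Taking the gradient of the normalization condition

$$ \int h(\mathbf{x})\,g(\boldsymbol{\eta})\exp\{\boldsymbol{\eta}^{\mathrm{T}}\mathbf{u}(\mathbf{x})\}\,d\mathbf{x} = 1 $$

with respect to $\boldsymbol{\eta}$ gives

$$ \nabla g(\boldsymbol{\eta})\int h(\mathbf{x})\exp\{\boldsymbol{\eta}^{\mathrm{T}}\mathbf{u}(\mathbf{x})\}\,d\mathbf{x} + \int h(\mathbf{x})\,g(\boldsymbol{\eta})\exp\{\boldsymbol{\eta}^{\mathrm{T}}\mathbf{u}(\mathbf{x})\}\,\mathbf{u}(\mathbf{x})\,d\mathbf{x} = 0 $$

$$ \iff -\nabla\ln g(\boldsymbol{\eta}) = \mathbb{E}[\mathbf{u}(\mathbf{x})]. $$

(The first integral equals $1/g(\boldsymbol{\eta})$, and $\nabla g/g = \nabla\ln g$.)

This is similar to

$$ -\nabla_{\boldsymbol{\eta}}\ln g(\boldsymbol{\eta}_{\mathrm{ML}}) = \frac{1}{N}\sum_{n=1}^{N}\mathbf{u}(\mathbf{x}_n). $$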
 
 
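Maximum likelihood for EF

According to the law of large numbers (LLN), the sample mean converges to the expectation, so $\boldsymbol{\eta}_{\mathrm{ML}}$ converges to $\boldsymbol{\eta}$:

$$ -\nabla_{\boldsymbol{\eta}}\ln g(\boldsymbol{\eta}_{\mathrm{ML}}) = \frac{1}{N}\sum_{n=1}^{N}\mathbf{u}(\mathbf{x}_n) \ \xrightarrow{\ N\to\infty\ }\ \mathbb{E}[\mathbf{u}(\mathbf{x})] = -\nabla\ln g(\boldsymbol{\eta}). $$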
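A small simulation can make the convergence concrete. The sketch below uses the Bernoulli case as an illustration chosen here; the slides do not work this example. With $g(\eta) = \sigma(-\eta)$ we get $-\frac{d}{d\eta}\ln g(\eta) = \sigma(\eta)$, so $\eta_{\mathrm{ML}}$ is the log-odds of the sample mean. The seed and sample sizes are arbitrary.

```python
import math
import random

def eta_ml_bernoulli(samples):
    """ML natural parameter: solve sigmoid(eta_ML) = sample mean of u(x) = x."""
    mean_u = sum(samples) / len(samples)
    return math.log(mean_u / (1.0 - mean_u))

def eta_error(n, mu_true=0.3, seed=0):
    """|eta_ML - eta| for n simulated Bernoulli(mu_true) draws."""
    rng = random.Random(seed)
    samples = [1 if rng.random() < mu_true else 0 for _ in range(n)]
    eta_true = math.log(mu_true / (1.0 - mu_true))
    return abs(eta_ml_bernoulli(samples) - eta_true)

for n in (100, 10_000, 100_000):
    print(n, eta_error(n))
```

The printed error typically shrinks as $N$ grows, although for any single random draw it need not decrease monotonically.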
 
 
Priors for EF

If you want to use Bayesian inference, a prior distribution is needed.

Then how should we choose it if we know nothing about the parameter?
 
 
Priors for EF

Three candidates:

1. Conjugate priors
   ... Easy to handle
2. Uniform distributions
   ... Principle of indifference
3. Noninformative priors
   ... Keep the effect of the prior small
 
 
 
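Priors for EF – Conjugate priors

Distributions in the EF have a factor of $g(\boldsymbol{\eta})\exp(\boldsymbol{\eta}^{\mathrm{T}}\mathbf{u})$, so the conjugate prior is

$$ p(\boldsymbol{\eta}|\boldsymbol{\chi},\nu) = f(\boldsymbol{\chi},\nu)\bigl\{g(\boldsymbol{\eta})\exp(\boldsymbol{\eta}^{\mathrm{T}}\boldsymbol{\chi})\bigr\}^{\nu} = f(\boldsymbol{\chi},\nu)\,g(\boldsymbol{\eta})^{\nu}\exp\{\nu\boldsymbol{\eta}^{\mathrm{T}}\boldsymbol{\chi}\}, $$

where $f(\boldsymbol{\chi},\nu)$ is the normalizing constant.

It gives the following posterior, which has the same form as the prior:

$$
\begin{aligned}
p(\boldsymbol{\eta}|\mathbf{X},\boldsymbol{\chi},\nu)
&\propto \prod_{n=1}^{N} h(\mathbf{x}_n)\,g(\boldsymbol{\eta})\exp\{\boldsymbol{\eta}^{\mathrm{T}}\mathbf{u}(\mathbf{x}_n)\} \times g(\boldsymbol{\eta})^{\nu}\exp\{\nu\boldsymbol{\eta}^{\mathrm{T}}\boldsymbol{\chi}\} \\
&\propto g(\boldsymbol{\eta})^{N+\nu}\exp\Bigl\{\boldsymbol{\eta}^{\mathrm{T}}\Bigl(\sum_{n=1}^{N}\mathbf{u}(\mathbf{x}_n) + \nu\boldsymbol{\chi}\Bigr)\Bigr\}.
\end{aligned}
$$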
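Matching the posterior against the prior form $g(\boldsymbol{\eta})^{\nu'}\exp\{\nu'\boldsymbol{\eta}^{\mathrm{T}}\boldsymbol{\chi}'\}$ gives $\nu' = \nu + N$ and $\boldsymbol{\chi}' = (\nu\boldsymbol{\chi} + \sum_n \mathbf{u}(\mathbf{x}_n))/(\nu + N)$. The sketch below implements that bookkeeping for a scalar sufficient statistic; the function name and the example numbers are hypothetical.

```python
def posterior_hyperparams(chi, nu, u_values):
    """Update (chi, nu) of the prior g(eta)^nu exp(nu * eta * chi).

    u_values holds the scalar sufficient statistics u(x_n) of the observed data.
    Returns (chi_post, nu_post) such that the posterior has the same form.
    """
    nu_post = nu + len(u_values)
    chi_post = (nu * chi + sum(u_values)) / nu_post
    return chi_post, nu_post

# Bernoulli-style example with u(x) = x: prior acts like nu = 2 pseudo-observations
# with mean chi = 0.5, followed by four real observations.
chi_post, nu_post = posterior_hyperparams(chi=0.5, nu=2, u_values=[1, 1, 0, 1])
print(chi_post, nu_post)
```

Read this way, $\nu$ acts as an effective number of prior pseudo-observations and $\boldsymbol{\chi}$ as their average sufficient statistic.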
 
 
Priors for EF – Uniform distributions

The uniform distribution is a common choice for a discrete, bounded variable.
Cf. the principle of insufficient reason (or principle of indifference).

But two problems arise when it is applied to continuous variables:

1.  The normalization problem
2.  The transformation problem
 
 
Priors for EF – Uniform distributions

1. Normalization problem

If the parameter is unbounded,

$$ \int_{-\infty}^{\infty} p(\lambda)\,d\lambda = \int_{-\infty}^{\infty} \mathrm{const}\,d\lambda \to \infty. $$

Such priors are called "improper".

Note that these priors can still give proper posteriors: with a constant prior the posterior is proportional to the likelihood, which can often be normalized.
 
 
Priors for EF – Uniform distributions

2. Transformation problem

A non-linear transformation gives a non-constant prior.

E.g.)

$$ p(\lambda) = 1, \quad \eta = \sqrt{\lambda} \quad\Longrightarrow\quad p(\eta) = p(\lambda)\left|\frac{d\lambda}{d\eta}\right| = 2\eta. $$

This is not constant in $\eta$. Think "constant with respect to what?"

(Sometimes the posterior is not sensitive to the difference.)
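The change of variables can also be seen empirically. This sketch assumes $\lambda$ is uniform on $[0, 1]$, an assumption made here so the prior is proper and can be sampled. It draws samples, transforms them to $\eta = \sqrt{\lambda}$, and compares a normalized histogram of $\eta$ with $2\eta$ at the bin centres. The sample size, bin count and seed are arbitrary choices.

```python
import math
import random

def density_of_sqrt_uniform(num_samples=200_000, num_bins=10, seed=0):
    """Histogram estimate of the density of eta = sqrt(lambda), lambda ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    counts = [0] * num_bins
    for _ in range(num_samples):
        eta = math.sqrt(rng.random())
        index = min(int(eta * num_bins), num_bins - 1)
        counts[index] += 1
    width = 1.0 / num_bins
    return [c / (num_samples * width) for c in counts]

densities = density_of_sqrt_uniform()
for i, estimate in enumerate(densities):
    centre = (i + 0.5) / len(densities)
    print(f"eta={centre:.2f}  estimated={estimate:.3f}  2*eta={2 * centre:.3f}")
```

The estimates rise roughly linearly with $\eta$, even though the prior over $\lambda$ was flat.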
 
Priors for EF – Uniform distributions

Keep these problems in mind:

1.  The normalization problem
2.  The transformation problem
 
 
Priors for EF – Noninformative priors

Two examples of noninformative priors:

1. Priors for location parameters
2. Priors for scale parameters

These are constructed to affect the posterior as little as possible, so that the inference is as objective as possible.
 
 
Priors for EF – Noninformative priors

1. Priors for location parameters

If the density has the form

$$ p(x|\mu) = f(x-\mu), $$

then a constant shift $\hat{x} = x + c$ gives a density of the same form:

$$ p(\hat{x}|\hat{\mu}) = f(\hat{x}-\hat{\mu}), \qquad \hat{\mu} = \mu + c. $$

This property is called "translation invariance", and such a parameter is called a "location parameter".
 
 
Priors for EF – Noninformative priors

1. Priors for location parameters

To reflect the translation invariance, the prior should satisfy

$$ \int_{B}^{A} p(\mu)\,d\mu = \int_{B}^{A} p(\mu-c)\,d\mu \quad \text{for all } A, B $$

$$ \iff p(\mu) = p(\mu-c) $$

$$ \iff p(\mu) = \text{constant}. $$

We obtain a uniform distribution after all.
But unlike before, we now know when to use it.
 
 
Priors for EF – Noninformative priors

1. Priors for location parameters

E.g.) The mean of a Gaussian

$$ p(x|\mu) = \frac{1}{(2\pi\sigma^2)^{1/2}}\exp\Bigl\{-\frac{1}{2\sigma^2}(x-\mu)^2\Bigr\} $$

This has the form $f(x-\mu)$.

This prior is also obtained as a limit of the conjugate prior:

$$ p(\mu) = \mathcal{N}(\mu|\mu_0,\sigma_0^2) \ \xrightarrow{\ \sigma_0^2\to\infty\ }\ \text{const.}, $$

$$ \mu_N = \frac{\sigma^2}{N\sigma_0^2+\sigma^2}\,\mu_0 + \frac{N\sigma_0^2}{N\sigma_0^2+\sigma^2}\,\mu_{\mathrm{ML}} \ \to\ \mu_{\mathrm{ML}}, $$

$$ \frac{1}{\sigma_N^2} = \frac{1}{\sigma_0^2} + \frac{N}{\sigma^2} \ \to\ \frac{N}{\sigma^2}. $$
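The limit can be watched numerically. The following sketch plugs increasingly broad priors into the posterior formulas for the Gaussian mean; the function name and the example values of $\mu_0$, $\sigma^2$, $N$ and $\mu_{\mathrm{ML}}$ are arbitrary and chosen only for illustration.

```python
def posterior_mean_and_precision(mu0, sigma0_sq, sigma_sq, n, mu_ml):
    """Posterior mean mu_N and precision 1/sigma_N^2 for a Gaussian mean
    with known variance sigma_sq and prior N(mu | mu0, sigma0_sq)."""
    denom = n * sigma0_sq + sigma_sq
    mu_n = (sigma_sq / denom) * mu0 + (n * sigma0_sq / denom) * mu_ml
    precision_n = 1.0 / sigma0_sq + n / sigma_sq
    return mu_n, precision_n

mu0, sigma_sq, n, mu_ml = 0.0, 1.0, 10, 2.0
for sigma0_sq in (1.0, 1e2, 1e4, 1e8):
    print(sigma0_sq, posterior_mean_and_precision(mu0, sigma0_sq, sigma_sq, n, mu_ml))
```

As the prior variance grows, the posterior mean approaches $\mu_{\mathrm{ML}}$ and the posterior precision approaches $N/\sigma^2$.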
 
 
Priors for EF – Noninformative priors

2. Priors for scale parameters

If the density has the form

$$ p(x|\sigma) = \frac{1}{\sigma}f\Bigl(\frac{x}{\sigma}\Bigr), $$

then a constant scaling $\hat{x} = cx$ gives a density of the same form:

$$ p(\hat{x}|\hat{\sigma}) = \frac{1}{\hat{\sigma}}f\Bigl(\frac{\hat{x}}{\hat{\sigma}}\Bigr), \qquad \hat{\sigma} = c\sigma. $$

This property is called "scale invariance", and such a parameter is called a "scale parameter".
 
Priors for EF – Noninformative priors

2. Priors for scale parameters

To reflect the scale invariance, the prior should satisfy

$$ \int_{B}^{A} p(\sigma)\,d\sigma = \int_{B}^{A} p\Bigl(\frac{1}{c}\sigma\Bigr)\frac{d\sigma}{d(c\sigma)}\,d\sigma \quad \text{for all } A, B $$

$$ \iff p(\sigma) = \frac{1}{c}\,p\Bigl(\frac{1}{c}\sigma\Bigr) $$

$$ \iff p(\sigma) \propto \frac{1}{\sigma} $$

$$ \iff p(\ln\sigma) = \text{const.} $$
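One way to see why $p(\sigma) \propto 1/\sigma$ is the scale-invariant choice is to compare the unnormalized prior mass of an interval before and after rescaling. The sketch below does this with a simple midpoint rule; the integration helper and the chosen interval and scale factor are illustrative assumptions. A flat prior is included as a contrast.

```python
def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) for i in range(steps)) * h

def scale_prior(sigma):
    return 1.0 / sigma  # unnormalized p(sigma) proportional to 1/sigma

def flat_prior(sigma):
    return 1.0  # unnormalized constant prior

lo, hi, c = 1.0, 2.0, 5.0
mass = integrate(scale_prior, lo, hi)
mass_rescaled = integrate(scale_prior, lo / c, hi / c)
flat_mass = integrate(flat_prior, lo, hi)
flat_mass_rescaled = integrate(flat_prior, lo / c, hi / c)
print(mass, mass_rescaled)            # both close to ln 2
print(flat_mass, flat_mass_rescaled)  # differ by the factor c
```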
 
 
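Priors for EF – Noninformative priors

2. Priors for scale parameters

E.g.) The standard deviation of a Gaussian

$$ p(x|\sigma) = \frac{1}{(2\pi\sigma^2)^{1/2}}\exp\Bigl\{-\frac{1}{2\sigma^2}x^2\Bigr\} $$

This has the form $\frac{1}{\sigma}f\bigl(\frac{x}{\sigma}\bigr)$.

This prior is also obtained as a limit of the conjugate prior on the precision $\lambda = 1/\sigma^2$:

$$ p(\lambda) = \mathrm{Gam}(\lambda|a_0,b_0) \ \xrightarrow{\ a_0,\,b_0\to 0\ }\ \frac{\text{const}}{\lambda}, $$

$$ a_N = a_0 + \frac{N}{2} \ \to\ \frac{N}{2}, $$

$$ b_N = b_0 + \frac{N}{2}\sigma^2_{\mathrm{ML}} \ \to\ \frac{N}{2}\sigma^2_{\mathrm{ML}}. $$

Here $p(\lambda) \propto 1/\lambda$ corresponds to $p(\sigma) \propto 1/\sigma$.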
 
Priors for EF – Noninformative priors

Two examples of noninformative priors:

1. Priors for location parameters

$$ p(x|\mu) = f(x-\mu) \ \Longrightarrow\ p(\mu) = \text{const.} $$

2. Priors for scale parameters

$$ p(x|\sigma) = \frac{1}{\sigma}f\Bigl(\frac{x}{\sigma}\Bigr) \ \Longrightarrow\ p(\sigma) \propto \frac{1}{\sigma} $$
 
 
 
Nonparametric methods

So far we have learned the "parametric approach".
vs.
Now we will learn the "nonparametric approach".

What is the difference?
 
 
Nonparametric methods

Parametric                                   | Nonparametric
---------------------------------------------|---------------------------------------------------------
Assumes a specific form of the distribution  | Makes few assumptions about the form of the distribution
Simple                                       | Complex (depends on data size)
Poor                                         | Rich / Flexible
Efficient                                    | Inefficient
 
 
Nonparametric methods

We will learn:

1. Histogram methods
2. Kernel density estimators
3. Nearest-neighbour methods
 
 
Nonparametric methods

1. Histogram methods

Split the space into grids (or bins), and count the data points in each:

$$ p(x) = p_i = \frac{n_i}{N\Delta_i} \quad (x \in i\text{-th bin}), $$

where

•  $\Delta_i$ = width of the $i$-th bin (usually the same for all $i$),
•  $n_i$ = number of observations assigned to the $i$-th bin,
•  $N$ = total number of observations.

This estimate is piecewise constant, hence discontinuous.
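The histogram estimator is short enough to write out directly. This is a minimal one-dimensional sketch with equal-width bins; the function name, its parameters and the toy data are assumptions for illustration, and it expects every data point to lie inside `[lo, hi]`.

```python
def histogram_density(data, lo, hi, num_bins):
    """Return (densities, delta) with densities[i] = n_i / (N * delta).

    Assumes all points lie in [lo, hi]; a point equal to hi goes into the last bin.
    """
    delta = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        index = min(int((x - lo) / delta), num_bins - 1)
        counts[index] += 1
    total = len(data)
    return [c / (total * delta) for c in counts], delta

data = [0.1, 0.15, 0.4, 0.45, 0.5, 0.9]
densities, delta = histogram_density(data, lo=0.0, hi=1.0, num_bins=4)
print(densities)
print(sum(d * delta for d in densities))  # total probability mass, should be 1
```

Changing `num_bins` changes $\Delta$ and therefore how spiky or smooth the estimate looks.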
 
 
Nonparametric methods

1. Histogram methods – Example

[Figure: histogram density estimates on x ∈ [0, 1] for three bin widths: Δ = 0.04, Δ = 0.08, Δ = 0.25]

Δ is...

•  Δ = 0.04: too narrow to catch enough points, so the estimate is too spiky (noisy)
•  Δ = 0.08: a good intermediate value
•  Δ = 0.25: too wide to express the data, so the estimate is too smooth (less information)

Finding a good value of Δ is very important!

# of bins = $M^D$ for $M$ bins per dimension in $D$ dimensions (curse of dimensionality)
 
Nonparametric methods	
 
Lessons from histogram methods	

June 11, 2014
 PRML 2.4-2.5
 Shinichi TAMURA
Estimate density at a particular point
from data points of small local region.
NONPARAMETRIC METHODS	
 THE EXPONENTIAL FAMILY	
 
Nonparametric methods	
 
Lessons from histogram methods	

June 11, 2014
 PRML 2.4-2.5
 Shinichi TAMURA
Estimate density at a particular point
from data points of small local region.	

	

The regions are defined by “smoothing
parameter”, which control the
complexity in relation with data size.
Nonparametric methods

Lessons from histogram methods

Estimate the density at a particular point from the data points in a small local region around it.

The regions are defined by a “smoothing parameter”, which controls the complexity of the model relative to the data size.

Other problems
•  Discontinuity
•  Not scalable (curse of dimensionality)
Nonparametric methods

Lessons from histogram methods

Let's consider a small local region $\mathcal{R}$ containing x; then

$$\Pr(K \text{ out of } N \text{ data} \in \mathcal{R}) = \frac{N!}{K!\,(N-K)!}\, P^K (1-P)^{N-K},$$

where $P = \int_{\mathcal{R}} p(x)\,\mathrm{d}x$.

If

1.  K is large enough (smoother not too small)
2.  p(x) is roughly constant over $\mathcal{R}$ (smoother small enough)

then

$$p(x) = \frac{K}{NV},$$

where V is the volume of $\mathcal{R}$.

The two conditions are contradictory, and the right trade-off between them depends on the data size.
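The step from the binomial distribution to the density estimate can be spelled out as follows. This is a sketch of the usual argument; the mean and variance of K/N are standard binomial facts rather than something shown on the slide, and V denotes the volume of $\mathcal{R}$.

```latex
\begin{align}
\mathbb{E}[K/N] &= P, \qquad \operatorname{var}[K/N] = \frac{P(1-P)}{N}
  && \text{(so } K \simeq NP \text{ when } N \text{ is large)} \\
P &= \int_{\mathcal{R}} p(x')\,\mathrm{d}x' \simeq p(x)\,V
  && \text{(} p \text{ roughly constant over } \mathcal{R}\text{)} \\
\Rightarrow\quad p(x) &\simeq \frac{K}{NV}.
\end{align}
```

The first line needs a region large enough to hold many points and the second needs a region small enough for p to be flat, which is the tension between the two conditions.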
Kernel density estimators

Fix a region (e.g., a hypercube centred on x with side h) and count the data points in it with a kernel function k(u) (Parzen window):

$$k(u) = \begin{cases} 1, & |u_i| \le 1/2 \quad (i = 1, \dots, D), \\ 0, & \text{otherwise,} \end{cases}$$

which is the unit cube (side 1) centred on the origin. This kernel is discontinuous.

$$K = \sum_{n=1}^{N} k\!\left(\frac{x - x_n}{h}\right), \qquad V = h^D,$$

$$\therefore\; p(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{h^D}\, k\!\left(\frac{x - x_n}{h}\right).$$
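The Parzen-window estimate can be sketched in a few lines. This assumes data points are given as equal-length tuples of floats; `hypercube_kernel` and `parzen_density` are hypothetical names chosen for illustration.

```python
def hypercube_kernel(u):
    """k(u) = 1 if every |u_i| <= 1/2, else 0 (unit cube centred on the origin)."""
    return 1.0 if all(abs(ui) <= 0.5 for ui in u) else 0.0


def parzen_density(x, data, h):
    """p(x) = (1/N) * sum_n (1/h^D) * k((x - x_n) / h)."""
    d = len(x)
    # K: number of data points inside the cube of side h centred on x
    k_count = sum(
        hypercube_kernel([(xi - xni) / h for xi, xni in zip(x, xn)])
        for xn in data
    )
    return k_count / (len(data) * h ** d)
```

Every evaluation loops over all N data points, so the whole data set has to be kept.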
Kernel density estimators

The symmetry of k(u), that is k(u) = k(−u), lets us re-interpret the result. The following two views give the same count:

•  N data points counted in a single cube centred on x
•  N cubes, each centred on a data point $x_n$, evaluated at x
Kernel density estimators

Other choice of k(u): Gaussian

$$k(u) = \frac{1}{(2\pi)^{D/2}} \exp\!\left(-\frac{\|u\|^2}{2}\right).$$

This kernel gives a continuous density.

You can use any kernel as long as it satisfies

$$k(u) \ge 0, \qquad \int k(u)\,\mathrm{d}u = 1.$$
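A sketch of the same estimator with the Gaussian kernel, assuming data points are equal-length tuples of floats. The names `gaussian_kernel` and `kde` are illustrative; any kernel that is non-negative and integrates to one could be passed in its place.

```python
import math


def gaussian_kernel(u):
    """k(u) = (2*pi)^(-D/2) * exp(-||u||^2 / 2)."""
    d = len(u)
    sq_norm = sum(ui * ui for ui in u)
    return math.exp(-0.5 * sq_norm) / (2.0 * math.pi) ** (d / 2.0)


def kde(x, data, h, kernel=gaussian_kernel):
    """p(x) = (1/N) * sum_n (1/h^D) * kernel((x - x_n) / h)."""
    d = len(x)
    total = sum(
        kernel([(xi - xni) / h for xi, xni in zip(x, xn)])
        for xn in data
    )
    return total / (len(data) * h ** d)
```

Because each term is a properly normalized density scaled by 1/N, the estimate itself integrates to one for any h > 0.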
Kernel density estimators

Example

Again, we can see that the smoothing parameter h controls the outcome of the estimation.

[Figure: kernel density estimates for h = 0.005, 0.07 and 0.2; horizontal axis 0 to 1, vertical axis 0 to 5.]
Nearest-neighbour methods

Use as the region a sphere that is centred on x and contains a fixed number K of data points:

$$p(x) = \frac{K}{N\,V(x)},$$

where V(x) denotes the volume of the sphere.
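A minimal sketch of the K-nearest-neighbour density estimate, assuming Euclidean distance and data points given as equal-length tuples. The function name `knn_density` is illustrative; the volume of a D-dimensional sphere of radius r is taken to be $\pi^{D/2} r^D / \Gamma(D/2 + 1)$.

```python
import math


def knn_density(x, data, k):
    """p(x) = K / (N * V(x)), with V(x) the sphere reaching the k-th neighbour."""
    d = len(x)
    dists = sorted(math.dist(x, xn) for xn in data)
    r = dists[k - 1]                       # radius that just contains k points
    volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1) * r ** d
    if volume == 0.0:                      # x coincides with its k-th neighbour
        return math.inf
    return k / (len(data) * volume)
```

With k = 1 the radius is zero at every data point, so the estimate is infinite there.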
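Nearest-neighbour methods

Note that this density cannot be normalized.

In one dimension, beyond a point $x^*$ that is far away from all the data points, the radius r(x) of the sphere grows linearly with x, so the density falls off only as 1/r(x) and its integral diverges:

$$\int_{-\infty}^{\infty} \frac{\mathrm{d}x}{r(x)} \ge \int_{x^*}^{\infty} \frac{\mathrm{d}x}{r(x)} \ge \int_{x^*}^{\infty} \frac{\mathrm{d}x}{x - x^\dagger} \to \infty.$$

$$\therefore\; \int_{\mathbb{R}^D} \frac{K}{N\,V(x)}\,\mathrm{d}x \propto \int_{\mathbb{R}^D} \frac{\mathrm{d}x}{r(x)^D} \to \infty.$$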
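The divergence of the last one-dimensional integral can be checked directly. This worked step assumes, since the slide does not define it, that $x^\dagger$ is a fixed point with $x^\dagger < x^*$, and it introduces an upper limit L only for the calculation:

```latex
\int_{x^*}^{L} \frac{\mathrm{d}x}{x - x^\dagger}
  = \ln\frac{L - x^\dagger}{x^* - x^\dagger}
  \;\xrightarrow[L \to \infty]{}\; \infty .
```

The growth is only logarithmic, but it is unbounded. The D-dimensional case behaves the same way far from the data: in polar coordinates the volume element contributes a factor $r^{D-1}$, so the integrand $1/r^D$ again reduces to $1/r$.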
Nearest-neighbour estimators

Example

Here again, the smoothing parameter K controls the outcome of the estimation.

Furthermore, we can observe that $p(x) \to \infty$ in the K = 1 case (at each data point the sphere has zero radius).

[Figure: K-nearest-neighbour density estimates for K = 1, 5 and 30; horizontal axis 0 to 1, vertical axis 0 to 5.]
Nonparametric methods

Another problem of kernels and NNs

These methods need all of the observed data for estimation, so both the time and space complexity are O(N), which is very inefficient.

On that point, parametric methods are quite efficient (cf. sufficient statistics).

Histograms are also efficient.
Nonparametric methods

                 Histograms   Kernels      NNs
K                Not fixed    Not fixed    Fixed
V                Not fixed    Fixed        Not fixed
Smoother         Δ            h            K
Continuity       No           It depends   Yes*
Dimensionality   Suffer       Scalable     Scalable
Normalization    Proper       Proper       Improper
Data set         Discard      Keep         Keep

* If K = 1, not continuous
Nearest-neighbour methods

Use NNs as classifier

To do this, use a sphere that contains K points irrespective of their class:

$$p(x \mid C_k) = \frac{K_k}{N_k V}, \qquad p(x) = \frac{K}{NV},$$

where $K_k$ is the number of points of class $C_k$ inside the sphere and $N_k$ is the total number of points in class $C_k$. The class priors are $p(C_k) = N_k/N$, so

$$p(C_k \mid x) = \frac{p(x \mid C_k)\,p(C_k)}{p(x)} = \frac{K_k}{K}.$$
Nearest-neighbour methods

Use NNs as classifier

Therefore, x is assigned to the class that holds the majority among the K nearest neighbours of x.

If K = 1, this is called the “nearest-neighbour rule”.
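The majority-vote rule can be sketched as follows, assuming Euclidean distance, training points given as equal-length tuples, and one label per point. The names `knn_posteriors` and `knn_classify` are hypothetical; ties are broken by whichever class is met first among the neighbours, which is an arbitrary choice.

```python
import math
from collections import Counter


def knn_posteriors(x, points, labels, k):
    """Estimate p(C_k | x) = K_k / K from the k nearest training points."""
    nearest = sorted(range(len(points)), key=lambda i: math.dist(x, points[i]))[:k]
    counts = Counter(labels[i] for i in nearest)
    return {c: n / k for c, n in counts.items()}


def knn_classify(x, points, labels, k):
    """Assign x to the class with the largest estimated posterior."""
    post = knn_posteriors(x, points, labels, k)
    return max(post, key=post.get)
```

Calling `knn_classify` with k = 1 gives the nearest-neighbour rule.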
Nearest-neighbour methods

Use NNs as classifier – Example

As in the discussion so far, here K acts as the smoothing parameter.

[Figure: K-nearest-neighbour classification for K = 1, 3 and 31; axes x6 and x7, each from 0 to 2.]
PRML 2.4-2.5 Exponential Family & Nonparametric Methods

  • 1. PRML 2.4-2.5 The exponential family & Nonparametric methods June 11, 2014 by Shinichi TAMURA
  • 2. NONPARAMETRIC METHODS THE EXPONENTIAL FAMILY Today's topics 1. The exponential family 1.  What is exponential family? 2.  Maximum likelihood for EF 3.  How to decide priors for EF 2. Nonparametric methods 1.  What is the point of nonparametric methods ? 2.  Kernel density estimator 3.  Nearest-neighbour methods June 11, 2014 PRML 2.4-2.5 Shinichi TAMURA
  • 3. NONPARAMETRIC METHODS THE EXPONENTIAL FAMILY Today's topics 1. The exponential family 1.  What is exponential family? 2.  Maximum likelihood for EF 3.  How to decide priors for EF 2. Nonparametric methods 1.  What is the point of nonparametric methods ? 2.  Kernel density estimator 3.  Nearest-neighbour methods June 11, 2014 PRML 2.4-2.5 Shinichi TAMURA
  • 4. NONPARAMETRIC METHODS THE EXPONENTIAL FAMILY Today's topics 1. The exponential family 1.  What is exponential family? 2.  Maximum likelihood for EF 3.  How to decide priors for EF 2. Nonparametric methods 1.  What is the point of nonparametric methods ? 2.  Kernel density estimator 3.  Nearest-neighbour methods June 11, 2014 PRML 2.4-2.5 Shinichi TAMURA
  • 12. The Exponential Family: Almost all of the distributions we have studied so far belong to a single class, namely the exponential family. Among parametric distributions, the exponential family includes the Bernoulli, multinomial, Gaussian, beta, gamma, and von Mises distributions, among others; parametric distributions outside it include Gaussian mixtures.
  • 16. The Exponential Family: The exponential family over $x$ given $\eta$ is the class of distributions of the form $p(x|\eta) = h(x)\,g(\eta)\exp\{\eta^T u(x)\}$. Here $\eta$ is the natural parameter, $g(\eta)$ is the normalizing constant, and the exponent $\eta^T u(x)$ is the only place where $x$ and $\eta$ interact.
  • 18. The Exponential Family, e.g. 1) the Bernoulli distribution: $p(x|\eta) = \mu^x (1-\mu)^{1-x} = \sigma(-\eta)\exp(\eta x)$, where $\eta = \ln\frac{\mu}{1-\mu}$, so that $u(x) = x$, $h(x) = 1$, and $g(\eta) = \sigma(-\eta)$, with $\sigma$ the logistic sigmoid.
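The Bernoulli identity on this slide can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names `sigmoid`, `bernoulli_standard`, and `bernoulli_natural` are illustrative assumptions.

```python
import math

def sigmoid(a):
    """Logistic sigmoid, sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def bernoulli_standard(x, mu):
    """Bernoulli probability in the usual parameterization mu^x (1 - mu)^(1 - x)."""
    return mu ** x * (1.0 - mu) ** (1 - x)

def bernoulli_natural(x, eta):
    """Same probability in exponential-family form: h(x) = 1, g(eta) = sigmoid(-eta), u(x) = x."""
    return sigmoid(-eta) * math.exp(eta * x)

mu = 0.3
eta = math.log(mu / (1.0 - mu))  # natural parameter (log-odds)
for x in (0, 1):
    # The two columns should agree up to floating-point rounding.
    print(x, bernoulli_standard(x, mu), bernoulli_natural(x, eta))
```

Both parameterizations describe the same distribution; only the coordinate used for the parameter changes.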
  • 21. The Exponential Family, e.g. 2) the multinomial distribution: $p(x|\eta) = \prod_{k=1}^{M} \mu_k^{x_k} = \exp(\eta^T x)$, where $\eta = (\ln\mu_1, \ldots, \ln\mu_M)^T$, so that $u(x) = x$, $h(x) = 1$, and $g(\eta) = 1$. However, the components of $\eta$ are not independent, because $\sum_k \exp(\eta_k) = \sum_k \mu_k = 1$. This constraint is inconvenient.
  • 24. The Exponential Family, e.g. 2) the multinomial distribution: Remove the constraint by substituting $\mu_M = 1 - \sum_{k=1}^{M-1}\mu_k$ and $x_M = 1 - \sum_{k=1}^{M-1}x_k$. Then $p(x|\mu) = \exp\left\{\sum_{k=1}^{M-1} x_k\ln\mu_k + \left(1 - \sum_{k=1}^{M-1}x_k\right)\ln\left(1 - \sum_{k=1}^{M-1}\mu_k\right)\right\} = \exp\left\{\sum_{k=1}^{M-1} x_k\ln\frac{\mu_k}{1 - \sum_{j=1}^{M-1}\mu_j} + \ln\left(1 - \sum_{k=1}^{M-1}\mu_k\right)\right\} = \left(1 - \sum_{k=1}^{M-1}\mu_k\right)\exp\left\{\sum_{k=1}^{M-1} x_k\ln\frac{\mu_k}{1 - \sum_{j=1}^{M-1}\mu_j}\right\}$. Therefore the distribution can be written with $M-1$ unconstrained natural parameters.
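  • 25. The Exponential Family, e.g. 2') the multinomial distribution without the constraint: $p(x|\eta) = \prod_{k=1}^{M}\mu_k^{x_k} = \left(1 + \sum_{k=1}^{M-1}\exp(\eta_k)\right)^{-1}\exp(\eta^T x)$, where $\eta = \left(\ln\frac{\mu_1}{1 - \sum_j \mu_j}, \ldots, \ln\frac{\mu_{M-1}}{1 - \sum_j \mu_j}, 0\right)^T$ with the sums over $j = 1, \ldots, M-1$, so that $u(x) = x$, $h(x) = 1$, and $g(\eta) = \left(1 + \sum_{k=1}^{M-1}\exp(\eta_k)\right)^{-1}$.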
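The unconstrained parameterization can be exercised with a short round trip between $\mu$ and $\eta$. This is a sketch under stated assumptions: the helper names `natural_params`, `mu_from_natural`, and `density` are hypothetical, and $x$ is taken to be a one-hot list of length $M$.

```python
import math

def natural_params(mu):
    """Map M probabilities (summing to 1) to the M-1 free natural parameters.

    Uses 1 - sum_{j<M} mu_j = mu_M, so eta_k = ln(mu_k / mu_M)."""
    mu_last = mu[-1]
    return [math.log(m / mu_last) for m in mu[:-1]]

def mu_from_natural(eta):
    """Inverse map (softmax with the last natural parameter fixed at 0)."""
    denom = 1.0 + sum(math.exp(e) for e in eta)
    return [math.exp(e) / denom for e in eta] + [1.0 / denom]

def density(x_onehot, eta):
    """p(x | eta) = g(eta) exp(eta^T x), with g(eta) = 1 / (1 + sum_k exp(eta_k))."""
    g = 1.0 / (1.0 + sum(math.exp(e) for e in eta))
    # The M-th component of eta is 0, so only the first M-1 entries of x contribute.
    return g * math.exp(sum(e * x for e, x in zip(eta, x_onehot)))

mu = [0.2, 0.3, 0.5]
eta = natural_params(mu)
print(mu_from_natural(eta))                      # recovers mu up to rounding
print([density(x, eta) for x in ([1, 0, 0], [0, 1, 0], [0, 0, 1])])
```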
  • 27. The Exponential Family, e.g. 3) the Gaussian distribution: $p(x|\eta) = \frac{1}{(2\pi\sigma^2)^{1/2}}\exp\left\{-\frac{1}{2\sigma^2}(x-\mu)^2\right\} = (2\pi)^{-1/2}(-2\eta_2)^{1/2}\exp\left(\frac{\eta_1^2}{4\eta_2}\right)\exp\left\{\eta^T u(x)\right\}$, where $\eta = \left(\frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2}\right)^T$, $u(x) = (x, x^2)^T$, $h(x) = (2\pi)^{-1/2}$, and $g(\eta) = (-2\eta_2)^{1/2}\exp\left(\frac{\eta_1^2}{4\eta_2}\right)$.
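As a sanity check on the Gaussian example, the two forms of the density can be compared at an arbitrary point. This is an illustrative sketch; the function names are assumptions, and the constant $(2\pi)^{-1/2}$ is placed in $h(x)$.

```python
import math

def gaussian_standard(x, mu, var):
    """Univariate Gaussian density in the (mu, sigma^2) parameterization."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def gaussian_natural(x, eta1, eta2):
    """Same density as h(x) g(eta) exp(eta^T u(x)) with u(x) = (x, x^2)."""
    h = (2.0 * math.pi) ** -0.5
    g = math.sqrt(-2.0 * eta2) * math.exp(eta1 ** 2 / (4.0 * eta2))
    return h * g * math.exp(eta1 * x + eta2 * x ** 2)

mu, var = 1.5, 0.8
eta1, eta2 = mu / var, -1.0 / (2.0 * var)   # natural parameters
print(gaussian_standard(0.3, mu, var), gaussian_natural(0.3, eta1, eta2))
```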
  • 30. Maximum likelihood for EF: We now know what the exponential family looks like. How, then, do we estimate the parameter? The frequentist answer is to maximize the likelihood.
  • 33. Maximum likelihood for EF: Suppose we have i.i.d. data $X = \{x_1, \ldots, x_N\}$. The log-likelihood of $\eta$ is $\ln p(X|\eta) = \ln\prod_{n=1}^{N} p(x_n|\eta) = \ln\prod_{n=1}^{N} h(x_n)\,g(\eta)\exp\{\eta^T u(x_n)\} = \sum_{n=1}^{N}\ln h(x_n) + N\ln g(\eta) + \eta^T\sum_{n=1}^{N} u(x_n)$. Therefore $\nabla_\eta \ln p(X|\eta) = N\,\nabla_\eta\ln g(\eta) + \sum_{n=1}^{N} u(x_n)$. Setting this gradient to zero gives the maximum likelihood estimate.
  • 34. Maximum likelihood for EF: Therefore $-\nabla_\eta \ln g(\eta_{ML}) = \frac{1}{N}\sum_{n=1}^{N} u(x_n)$. Here $\eta_{ML}$ depends on the data only through $\sum_n u(x_n)$, which is therefore called the sufficient statistic. We need to store only $\sum_n u(x_n)$ for estimation.
  • 35. Maximum likelihood for EF, e.g. the Gaussian distribution: With $g(\eta) = (-2\eta_2)^{1/2}\exp\left(\frac{\eta_1^2}{4\eta_2}\right)$ and $u(x) = (x, x^2)^T$, we have $-\nabla_\eta \ln g(\eta) = \left(-\frac{\eta_1}{2\eta_2},\; -\frac{1}{2\eta_2} + \frac{\eta_1^2}{4\eta_2^2}\right)^T = \left(\mu,\; \sigma^2 + \mu^2\right)^T$. Therefore $\mu_{ML} = \frac{1}{N}\sum_n x_n$ and $\sigma^2_{ML} = \frac{1}{N}\sum_n x_n^2 - \left(\frac{1}{N}\sum_n x_n\right)^2$. This is the result we already know.
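The point about sufficient statistics can be made concrete: for the Gaussian, only the count, $\sum_n x_n$, and $\sum_n x_n^2$ need to be kept. The sketch below is illustrative; `accumulate` and `gaussian_ml` are hypothetical helper names.

```python
def accumulate(data):
    """Running sufficient statistics for a univariate Gaussian: N, sum x, sum x^2."""
    n, sum_x, sum_x2 = 0, 0.0, 0.0
    for x in data:
        n += 1
        sum_x += x
        sum_x2 += x * x
    return n, sum_x, sum_x2

def gaussian_ml(n, sum_x, sum_x2):
    """ML estimates of mean and variance computed from the sufficient statistics only."""
    mean = sum_x / n
    var = sum_x2 / n - mean ** 2
    return mean, var

n, s1, s2 = accumulate([2.0, 4.0, 4.0, 6.0])
mu_ml, var_ml = gaussian_ml(n, s1, s2)
print(mu_ml, var_ml)  # prints 4.0 2.0
```

The raw data can be discarded once the three accumulators are updated. Note that the difference-of-squares formula can lose precision when the variance is small relative to the mean.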
  • 36. Maximum likelihood for EF: Next, we want to know the relation between the estimate $\eta_{ML}$ and the true parameter $\eta$.
  • 38. Maximum likelihood for EF: Taking the gradient of the normalization condition $\int h(x)\,g(\eta)\exp\{\eta^T u(x)\}\,dx = 1$ with respect to $\eta$ gives $\nabla g(\eta)\int h(x)\exp\{\eta^T u(x)\}\,dx + \int h(x)\,g(\eta)\exp\{\eta^T u(x)\}\,u(x)\,dx = 0$. The first integral equals $1/g(\eta)$ and the second is $E[u(x)]$, so $-\nabla\ln g(\eta) = E[u(x)]$. This has the same form as $-\nabla_\eta\ln g(\eta_{ML}) = \frac{1}{N}\sum_{n=1}^{N} u(x_n)$.
  • 41. Maximum likelihood for EF: According to the law of large numbers (LLN), the sample mean $\frac{1}{N}\sum_{n=1}^{N} u(x_n)$ converges to the expectation $E[u(x)]$ as $N \to \infty$. Comparing $-\nabla_\eta\ln g(\eta_{ML}) = \frac{1}{N}\sum_{n=1}^{N} u(x_n)$ with $-\nabla\ln g(\eta) = E[u(x)]$, it follows that $\eta_{ML}$ converges to the true $\eta$.
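The convergence of $\eta_{ML}$ can be illustrated by simulation for the Bernoulli case, where $-\nabla\ln g(\eta) = \sigma(\eta) = \mu$, so $\eta_{ML}$ is the log-odds of the sample mean. This is a rough sketch, not a proof; the seed, sample sizes, and the name `eta_ml_bernoulli` are arbitrary assumptions.

```python
import math
import random

def eta_ml_bernoulli(samples):
    """ML natural parameter for Bernoulli data: log-odds of the sample mean of u(x) = x."""
    mean_u = sum(samples) / len(samples)
    return math.log(mean_u / (1.0 - mean_u))

rng = random.Random(0)          # fixed seed so the run is repeatable
mu_true = 0.3
eta_true = math.log(mu_true / (1.0 - mu_true))

errors = {}
for n in (100, 10_000, 100_000):
    samples = [1 if rng.random() < mu_true else 0 for _ in range(n)]
    errors[n] = abs(eta_ml_bernoulli(samples) - eta_true)
    print(n, errors[n])         # the error typically shrinks as n grows
```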
  • 44. Priors for EF: Bayesian inference requires a prior distribution. How should we choose it if we know nothing about the parameter?
  • 49. Priors for EF: Three candidates: 1. Conjugate priors: easy to handle. 2. Uniform distributions: follow the principle of indifference. 3. Noninformative priors: keep the effect of the prior small. We start with conjugate priors.
  • 53. Priors for EF – Conjugate priors: Distributions in the exponential family have a factor of the form $g(\eta)\exp(\eta^T u)$, so the conjugate prior is $p(\eta|\chi, \nu) = f(\chi, \nu)\left[g(\eta)\exp\{\eta^T\chi\}\right]^{\nu} = f(\chi, \nu)\,g(\eta)^{\nu}\exp\{\nu\,\eta^T\chi\}$, where $f(\chi, \nu)$ is the normalizing constant and the bracketed factor corresponds to the factor in the likelihood.
  • 55. Priors for EF – Conjugate priors: With the conjugate prior $p(\eta|\chi, \nu) = f(\chi, \nu)\,g(\eta)^{\nu}\exp\{\nu\,\eta^T\chi\}$, the posterior is $p(\eta|X, \chi, \nu) \propto \prod_{n=1}^{N} h(x_n)\,g(\eta)\exp\{\eta^T u(x_n)\} \times g(\eta)^{\nu}\exp\{\nu\,\eta^T\chi\} \propto g(\eta)^{N+\nu}\exp\left\{\eta^T\left(\sum_{n=1}^{N} u(x_n) + \nu\chi\right)\right\}$. This has the same functional form as the prior.
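Because the posterior keeps the prior's form, the update reduces to arithmetic on the hyperparameters: $\nu' = \nu + N$ and $\nu'\chi' = \sum_n u(x_n) + \nu\chi$. The sketch below assumes sufficient statistics are given as lists of equal length; `conjugate_update` is a hypothetical name.

```python
def conjugate_update(chi, nu, suff_stats):
    """Update (chi, nu) of the prior g(eta)^nu exp(nu eta^T chi) given observed u(x_n).

    The posterior is g(eta)^(nu + N) exp(eta^T (sum_n u(x_n) + nu chi)), which is
    rewritten as g(eta)^nu' exp(nu' eta^T chi') with nu' = nu + N."""
    n = len(suff_stats)
    dim = len(chi)
    total = [sum(u[d] for u in suff_stats) for d in range(dim)]
    nu_post = nu + n
    chi_post = [(total[d] + nu * chi[d]) / nu_post for d in range(dim)]
    return chi_post, nu_post

# Bernoulli example: u(x) = x, prior worth nu = 2 pseudo-observations with mean 0.5.
chi_post, nu_post = conjugate_update([0.5], 2, [[1], [1], [0], [1]])
print(chi_post, nu_post)
```

In this reading, $\nu$ acts as an effective number of prior pseudo-observations and $\chi$ as their average sufficient statistic.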
  • 58. Priors for EF – Uniform distributions: The uniform distribution is a common choice for a discrete, bounded variable (cf. the principle of insufficient reason, also called the principle of indifference). But two problems arise when it is applied to continuous variables: 1. the normalization problem; 2. the transformation problem.
  • 60. Priors for EF – Uniform distributions: 1. Normalization problem. If the parameter is unbounded, a constant prior cannot be normalized: $\int_{-\infty}^{\infty} p(\lambda)\,d\lambda = \int_{-\infty}^{\infty} \text{const}\,d\lambda \to \infty$. Such priors are called "improper". Note that improper priors can still give proper posteriors: with a constant prior the posterior is proportional to the likelihood, which can often be normalized.
  • 63. Priors for EF – Uniform distributions: 2. Transformation problem. A non-linear change of variables turns a constant prior into a non-constant one. E.g., if $p(\lambda) = 1$ and $\eta = \sqrt{\lambda}$, then $p(\eta) = p(\lambda)\left|\frac{d\lambda}{d\eta}\right| = 2\eta$, which is not constant in $\eta$. So always ask: "constant with respect to what?" (Sometimes the posterior is not sensitive to the difference.)
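The transformation problem can be seen empirically by sampling $\lambda$ uniformly on $(0, 1)$ and looking at $\eta = \sqrt{\lambda}$. Under the density $2\eta$, $\Pr(\eta < 0.5) = 0.25$, whereas a uniform $\eta$ would give $0.5$. This is a simulation sketch with an arbitrary seed and sample size.

```python
import math
import random

rng = random.Random(1)   # fixed seed for repeatability
n = 200_000

# lambda ~ Uniform(0, 1); eta = sqrt(lambda) then has density 2 * eta on (0, 1).
etas = [math.sqrt(rng.random()) for _ in range(n)]

frac_below_half = sum(1 for e in etas if e < 0.5) / n
print(frac_below_half)   # close to 0.25, not the 0.5 a uniform eta would give
```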
  • 64. Priors for EF – Uniform distributions: Keep these two problems in mind: 1. the normalization problem; 2. the transformation problem.
  • 66. Priors for EF – Noninformative priors: Two examples of noninformative priors: 1. priors for location parameters; 2. priors for scale parameters. These are constructed to influence the posterior as little as possible, so that the inference is as objective as possible.
  • 69. Priors for EF – Noninformative priors: 1. Priors for location parameters. If the density has the form $p(x|\mu) = f(x - \mu)$, then a constant shift $\hat{x} = x + c$ gives a density of the same form, $p(\hat{x}|\hat{\mu}) = f(\hat{x} - \hat{\mu})$ with $\hat{\mu} = \mu + c$. This property is called "translation invariance", and such a parameter is called a "location parameter".
  • 71. Priors for EF – Noninformative priors: 1. Priors for location parameters. To reflect translation invariance, the prior should satisfy $\int_A^B p(\mu)\,d\mu = \int_A^B p(\mu - c)\,d\mu$ for all $A, B$ $\iff p(\mu) = p(\mu - c) \iff p(\mu) = \text{constant}$. We obtain the uniform distribution after all, but unlike before, we now know when to use it.
  • 74. Priors for EF – Noninformative priors: 1. Priors for location parameters. E.g., the mean of a Gaussian: $p(x|\mu) = \frac{1}{(2\pi\sigma^2)^{1/2}}\exp\left\{-\frac{1}{2\sigma^2}(x - \mu)^2\right\}$ has the form $f(x - \mu)$. The constant prior is also obtained as a limit of conjugate priors: $p(\mu) = \mathcal{N}(\mu|\mu_0, \sigma_0^2) \to \text{const}$ as $\sigma_0^2 \to \infty$, in which case $\mu_N = \frac{\sigma^2}{N\sigma_0^2 + \sigma^2}\mu_0 + \frac{N\sigma_0^2}{N\sigma_0^2 + \sigma^2}\mu_{ML} \to \mu_{ML}$ and $\frac{1}{\sigma_N^2} = \frac{1}{\sigma_0^2} + \frac{N}{\sigma^2} \to \frac{N}{\sigma^2}$.
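The limiting behaviour of the conjugate posterior can be checked numerically by making the prior variance very large. The sketch below evaluates the two posterior formulas from this slide; the function name and the example numbers are assumptions.

```python
def posterior_mean_precision(mu0, var0, var, n, mu_ml):
    """Posterior mean mu_N and precision 1/sigma_N^2 for a Gaussian mean with known variance.

    mu0, var0: prior mean and variance; var: known data variance;
    n: number of observations; mu_ml: sample mean."""
    denom = n * var0 + var
    mu_n = var / denom * mu0 + n * var0 / denom * mu_ml
    precision_n = 1.0 / var0 + n / var
    return mu_n, precision_n

for var0 in (1.0, 1e3, 1e9):
    # As var0 grows, mu_N approaches mu_ML = 2.0 and the precision approaches N / sigma^2 = 10.
    print(var0, posterior_mean_precision(mu0=0.0, var0=var0, var=1.0, n=10, mu_ml=2.0))
```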
  • 77. Priors for EF – Noninformative priors: 2. Priors for scale parameters. If the density has the form $p(x|\sigma) = \frac{1}{\sigma}f\left(\frac{x}{\sigma}\right)$, then a constant rescaling $\hat{x} = cx$ gives a density of the same form, $p(\hat{x}|\hat{\sigma}) = \frac{1}{\hat{\sigma}}f\left(\frac{\hat{x}}{\hat{\sigma}}\right)$ with $\hat{\sigma} = c\sigma$. This property is called "scale invariance", and such a parameter is called a "scale parameter".
  • 78. Priors for EF – Noninformative priors: 2. Priors for scale parameters. To reflect scale invariance, the prior should satisfy $\int_A^B p(\sigma)\,d\sigma = \int_A^B p\left(\frac{1}{c}\sigma\right)\frac{1}{c}\,d\sigma$ for all $A, B$ $\iff p(\sigma) = \frac{1}{c}\,p\left(\frac{1}{c}\sigma\right) \iff p(\sigma) \propto \frac{1}{\sigma} \iff p(\ln\sigma) = \text{const}$.
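The last two equivalences on this slide are stated without intermediate steps. One way to fill them in, written here as a short worked derivation rather than as the presenter's own argument, is to use the fact that the functional equation holds for every $c > 0$:

```latex
\begin{align*}
p(\sigma) &= \frac{1}{c}\, p\!\left(\frac{\sigma}{c}\right) \quad \text{for all } c > 0
  && \text{(scale invariance)} \\
p(\sigma) &= \frac{1}{\sigma}\, p(1) \;\propto\; \frac{1}{\sigma}
  && \text{(choose } c = \sigma\text{)} \\
p(\ln \sigma) &= p(\sigma)\left|\frac{d\sigma}{d\ln\sigma}\right|
  = \sigma\, p(\sigma) = \text{const}
  && \text{(change of variables)}
\end{align*}
```

Like the constant prior for a location parameter, $p(\sigma) \propto 1/\sigma$ is improper on $0 < \sigma < \infty$.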
  • 81. Priors for EF – Noninformative priors: 2. Priors for scale parameters. E.g., the standard deviation of a Gaussian: $p(x|\sigma) = \frac{1}{(2\pi\sigma^2)^{1/2}}\exp\left\{-\frac{1}{2\sigma^2}x^2\right\}$ has the form $\frac{1}{\sigma}f\left(\frac{x}{\sigma}\right)$. This prior is also obtained as a limit of conjugate priors. In terms of the precision $\lambda = 1/\sigma^2$: $p(\lambda) = \text{Gam}(\lambda|a_0, b_0) \to \frac{\text{const}}{\lambda}$ as $a_0, b_0 \to 0$, in which case $a_N = a_0 + \frac{N}{2} \to \frac{N}{2}$ and $b_N = b_0 + \frac{N}{2}\sigma^2_{ML} \to \frac{N}{2}\sigma^2_{ML}$. A prior $p(\lambda) \propto 1/\lambda$ corresponds to $p(\sigma) \propto 1/\sigma$.
  • 82. Priors for EF – Noninformative priors: Two examples of noninformative priors: 1. Priors for location parameters: $p(x|\mu) = f(x - \mu) \implies p(\mu) = \text{const}$. 2. Priors for scale parameters: $p(x|\sigma) = \frac{1}{\sigma}f\left(\frac{x}{\sigma}\right) \implies p(\sigma) \propto \frac{1}{\sigma}$.
  • 88. Nonparametric methods: So far we have learned the "parametric approach". Now we will learn the "nonparametric approach". What is the difference?
  • 90. Nonparametric methods: Parametric vs. nonparametric. Parametric methods assume a specific form for the distribution; they are simple and efficient, but poor in expressive power. Nonparametric methods make few assumptions about the form of the distribution; they are rich and flexible, but complex (the complexity depends on the data size) and inefficient.
  • 92. Nonparametric methods: We will learn: 1. Histogram methods. 2. Kernel density estimators. 3. Nearest-neighbour methods.
  • 95. Nonparametric methods: 1. Histogram methods. Split the space into grid cells (bins) and count the data points in each: $p(x) = p_i = \frac{n_i}{N\Delta_i}$ for $x$ in the $i$-th bin, where $\Delta_i$ is the width of the $i$-th bin (usually the same for all $i$), $n_i$ is the number of observations assigned to the $i$-th bin, and $N$ is the total number of observations. The estimate is piecewise constant, hence discontinuous.
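The histogram estimate $p_i = n_i / (N\Delta_i)$ is simple enough to write down directly. The following one-dimensional sketch assumes equal-width bins starting at a lower edge `lo`; the function name and the data are illustrative.

```python
import math

def histogram_density(data, x, delta, lo=0.0):
    """Histogram density estimate at x: n_i / (N * delta) for the bin containing x.

    Bins are [lo + i*delta, lo + (i+1)*delta); all bins share the same width delta."""
    idx = math.floor((x - lo) / delta)
    n_i = sum(1 for v in data if math.floor((v - lo) / delta) == idx)
    return n_i / (len(data) * delta)

data = [0.1, 0.2, 0.3, 0.6]
print(histogram_density(data, 0.25, delta=0.5))  # bin [0, 0.5) holds 3 of 4 points
print(histogram_density(data, 0.70, delta=0.5))  # bin [0.5, 1) holds 1 of 4 points
```

The estimate jumps at every bin edge, which is the discontinuity mentioned on the slide.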
  • 101. Nonparametric methods: 1. Histogram methods, example. (Figure: histogram density estimates on $[0, 1]$ for bin widths $\Delta = 0.04$, $\Delta = 0.08$, and $\Delta = 0.25$.) With $\Delta = 0.04$ the bins are too narrow to catch enough points, so the estimate is too spiky (noisy). $\Delta = 0.08$ is a good intermediate value. With $\Delta = 0.25$ the bins are too wide to express the data, so the estimate is too smooth (less information). Finding a good value of $\Delta$ is very important. Also, with $M$ bins per dimension in $D$ dimensions, the number of bins is $M^D$ (curse of dimensionality).
  • 104. Nonparametric methods: Lessons from histogram methods. Estimate the density at a particular point from the data points in a small local region. The regions are defined by a "smoothing parameter", which controls the model complexity relative to the data size. Other problems: discontinuity; poor scalability (curse of dimensionality).
  • 109. Nonparametric methods: Lessons from histogram methods. Consider a small local region $R$ containing $x$. Then $\Pr(K \text{ out of } N \text{ data} \in R) = \frac{N!}{K!(N-K)!}P^K(1-P)^{N-K}$, where $P = \int_R p(x)\,dx$. If (1) $K$ is large enough (smoothing parameter not too small), then $K \simeq NP$; and if (2) $p(x)$ is roughly constant over $R$ (smoothing parameter small enough), then $P \simeq p(x)V$, where $V$ is the volume of $R$. The two conditions pull in opposite directions, and the right balance depends on the data size. Combining them gives $p(x) = \frac{K}{NV}$.
  • 112-115. Kernel density estimators. Fix a region (e.g., a hypercube centred on x with side h) and count the data points inside it with a kernel function k(u) (Parzen window): $k(u) = \begin{cases} 1, & |u_i| \le 1/2 \quad (i = 1, \dots, D), \\ 0, & \text{otherwise.} \end{cases}$ This k(u) is the indicator of a cube centred on the origin with side 1, and it is a discontinuous kernel. With it, $K = \sum_{n=1}^{N} k\!\left(\frac{x - x_n}{h}\right)$ and $V = h^{D}$, $\therefore\ p(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{h^{D}}\, k\!\left(\frac{x - x_n}{h}\right)$.
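The hypercube estimator can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names `hypercube_kernel` and `parzen_density` and the sample points are made up for the example, and points are assumed to be tuples of floats.

```python
def hypercube_kernel(u):
    """Parzen window: 1 if u lies in the unit cube centred on the origin, else 0."""
    return 1.0 if all(abs(ui) <= 0.5 for ui in u) else 0.0


def parzen_density(x, data, h):
    """Estimate p(x) = (1/N) * sum_n (1/h^D) * k((x - x_n) / h)."""
    dim = len(x)
    # K: number of data points inside the cube of side h centred on x.
    count = sum(
        hypercube_kernel([(xi - xni) / h for xi, xni in zip(x, xn)])
        for xn in data
    )
    volume = h ** dim  # V = h^D
    return count / (len(data) * volume)


# Hypothetical one-dimensional sample, chosen only for illustration.
data = [(0.15,), (0.2,), (0.28,), (0.8,)]
print(parzen_density((0.2,), data, h=0.2))
```

With these values three of the four points fall inside the window around x = 0.2, so the estimate is K/(NV) with K = 3, N = 4, V = 0.2.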
  • 116-117. Kernel density estimators. The symmetry of k(u) lets us re-interpret the result: instead of counting the N data points that fall in a single cube centred on x, we can place N cubes, one centred on each data point x_n, and sum their contributions at x.
  • 118-120. Kernel density estimators. Another choice of k(u) is the Gaussian: $k(u) = \frac{1}{(2\pi)^{D/2}} \exp\!\left(-\frac{\|u\|^{2}}{2}\right)$. This kernel gives a continuous density. Any kernel can be used as long as it satisfies $k(u) \ge 0$ and $\int k(u)\,du = 1$.
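A minimal Python sketch of the same estimator with a pluggable kernel, assuming points are tuples of floats. The names `gaussian_kernel`, `kernel_density`, and `samples` are illustrative and do not come from the slides.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel k(u) = (2*pi)^(-D/2) * exp(-||u||^2 / 2)."""
    dim = len(u)
    sq_norm = sum(ui * ui for ui in u)
    return math.exp(-0.5 * sq_norm) / (2.0 * math.pi) ** (dim / 2.0)


def kernel_density(x, data, h, kernel=gaussian_kernel):
    """Estimate p(x) = (1/N) * sum_n (1/h^D) * k((x - x_n) / h)."""
    dim = len(x)
    total = sum(
        kernel([(xi - xni) / h for xi, xni in zip(x, xn)])
        for xn in data
    )
    return total / (len(data) * h ** dim)


# Hypothetical one-dimensional sample, chosen only for illustration.
samples = [(0.2,), (0.5,), (0.9,)]
for h in (0.005, 0.07, 0.2):
    print(h, kernel_density((0.5,), samples, h))
```

Because the kernel is non-negative and integrates to one, the estimate is itself a proper density for any h > 0; only its smoothness changes with h.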
  • 121. Kernel density estimators: Example. Again, we can see that the smoothing parameter h controls the outcome of the estimate. [Figure: three kernel density estimates on the interval 0 to 1, for h = 0.005, h = 0.07, and h = 0.2.]
  • 124-125. Nearest-neighbour methods. As the region, use a sphere centred on x that contains exactly K data points, where K is a fixed number. Then $p(x) = \frac{K}{N V(x)}$, where V(x) denotes the volume of the sphere.
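A minimal Python sketch of this estimator, assuming Euclidean distance and points given as tuples of floats. The helper names `sphere_volume` and `knn_density` are illustrative; the volume formula is the standard one for a D-dimensional ball.

```python
import math


def sphere_volume(radius, dim):
    """Volume of a D-dimensional ball: pi^(D/2) * r^D / Gamma(D/2 + 1)."""
    return math.pi ** (dim / 2.0) * radius ** dim / math.gamma(dim / 2.0 + 1.0)


def knn_density(x, data, k):
    """Estimate p(x) = K / (N * V(x)), with V(x) the smallest sphere around x holding K points."""
    distances = sorted(
        math.sqrt(sum((xi - xni) ** 2 for xi, xni in zip(x, xn)))
        for xn in data
    )
    radius = distances[k - 1]  # distance to the K-th nearest neighbour
    if radius == 0.0:
        # x coincides with at least K data points: the sphere has zero volume.
        return math.inf
    return k / (len(data) * sphere_volume(radius, len(x)))


# Hypothetical one-dimensional sample, chosen only for illustration.
data = [(0.0,), (1.0,), (3.0,)]
print(knn_density((0.5,), data, k=2))
```

In one dimension the "sphere" is an interval of length 2r, so at x = 0.5 with K = 2 the radius is 0.5, the volume is 1, and the estimate is 2/3.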
  • 126. Nearest-neighbour methods. Note that this density cannot be normalized. For x beyond a point x* that is far away from all data points, the radius r(x) of the sphere grows linearly with x, so the density decays only like 1/x and its integral diverges. In one dimension: $\int_{-\infty}^{\infty} \frac{dx}{r(x)} \ge \int_{x^{*}}^{\infty} \frac{dx}{r(x)} \simeq \int_{x^{*}}^{\infty} \frac{dx}{x - x^{\dagger}} \to \infty$. $\therefore\ \int_{\mathbb{R}^{D}} \frac{K}{N V(x)}\,dx \propto \int_{\mathbb{R}^{D}} \frac{dx}{r(x)^{D}} \to \infty$.
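The divergence can be checked numerically. The sketch below assumes one-dimensional data, where V(x) = 2 r(x), and integrates the estimate over a tail [start, stop] with a midpoint rule in log-space. The names `knn_density_1d` and `tail_mass` and the sample are made up for the illustration.

```python
import math


def knn_density_1d(x, data, k):
    """One-dimensional K-NN density: p(x) = K / (N * 2 * r(x))."""
    radius = sorted(abs(x - xn) for xn in data)[k - 1]
    return k / (len(data) * 2.0 * radius)


def tail_mass(data, k, start, stop, steps=2000):
    """Integrate the density over [start, stop] using the substitution x = exp(t)."""
    t0, t1 = math.log(start), math.log(stop)
    dt = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        x = math.exp(t0 + (i + 0.5) * dt)  # midpoint of the i-th cell in t
        total += knn_density_1d(x, data, k) * x * dt  # dx = x dt
    return total


# Hypothetical sample; start = 2.0 lies to the right of every data point.
data = [0.1, 0.4, 0.9]
for upper in (1e2, 1e4, 1e6):
    print(upper, tail_mass(data, k=1, start=2.0, stop=upper))
```

To the right of all data points the K-th nearest neighbour is a fixed point, so the tail mass grows like (K / 2N) times the logarithm of the upper limit and never levels off.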
  • 127-128. Nearest-neighbour estimators: Example. Here again, the smoothing parameter K controls the outcome of the estimate. Furthermore, in the K = 1 case we can observe that $p(x) \to \infty$ as x approaches a data point, because the sphere shrinks to zero volume. [Figure: three nearest-neighbour density estimates on the interval 0 to 1, for K = 1, K = 5, and K = 30.]
  • 129. Nonparametric methods: Another problem of kernels and NNs. These methods need all of the observed data at estimation time, so both the time per density evaluation and the storage are O(N), which is very inefficient. In that respect, parametric methods are quite efficient (cf. sufficient statistics). Histograms are also efficient.
  • 130-133. Nonparametric methods

                        Histograms   Kernels      NNs
        K               Not fixed    Not fixed    Fixed
        V               Not fixed    Fixed        Not fixed
        Smoother        ∆            h            K
        Continuity      No           It depends   Yes*
        Dimensionality  Suffer       Scalable     Scalable
        Normalization   Proper       Proper       Improper
        Data set        Discard      Keep         Keep

        * If K=1, not continuous
  • 134-136. Nearest-neighbour methods: Use NNs as a classifier. To do this, use the sphere centred on x that contains K points irrespective of their class. Then $p(x \mid C_k) = \frac{K_k}{N_k V}$ and $p(x) = \frac{K}{NV}$, where K_k is the number of points in the sphere that belong to class k and N_k is the total number of points in class k. The class priors are $p(C_k) = N_k / N$, so $p(C_k \mid x) = \frac{p(x \mid C_k)\, p(C_k)}{p(x)} = \frac{K_k}{K}$.
  • 137-138. Nearest-neighbour methods: Use NNs as a classifier. Therefore, x is assigned to the class with the most members among its K nearest neighbours. If K = 1, this is called the "nearest-neighbour rule".
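The majority-vote rule can be sketched as follows. This is a minimal illustration assuming Euclidean distance and points given as tuples; the function name `knn_classify` and the toy data are made up for the example.

```python
from collections import Counter


def knn_classify(x, points, labels, k):
    """Return (predicted class, posteriors), where the posterior of class c is K_c / K."""
    def sq_dist(n):
        # Squared Euclidean distance gives the same neighbour ordering as the distance.
        return sum((xi - pi) ** 2 for xi, pi in zip(x, points[n]))

    nearest = sorted(range(len(points)), key=sq_dist)[:k]
    votes = Counter(labels[n] for n in nearest)  # K_c for each class in the sphere
    posteriors = {c: count / k for c, count in votes.items()}
    # On a tie, most_common returns the tied class that was counted first.
    return votes.most_common(1)[0][0], posteriors


# Hypothetical two-class toy data, chosen only for illustration.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
labels = ["a", "a", "a", "b", "b"]
print(knn_classify((4, 4), points, labels, k=3))
```

For the query point (4, 4) with K = 3, two of the three nearest neighbours belong to class "b", so the posterior for "b" is 2/3 and the point is assigned to "b". An odd K avoids ties in the two-class case.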
  • 139. Nearest-neighbour methods: Use NNs as a classifier (Example). As in the discussion so far, K acts as the smoothing parameter here. [Figure: K-nearest-neighbour classification in the (x6, x7) plane, for K = 1, K = 3, and K = 31.]