Confidence Intervals
Exact Intervals, Jackknife, and Bootstrap
Francesco Casalegno
1/20
Why do we need Confidence Intervals?
• Very common use case
we have a few samples x1, ..., xn from an unknown distribution F
we need to estimate some parameter θ of the underlying distribution
• A single value or an interval of values?
use x1, ..., xn to compute our best guess for the parameter θ
but this single value does not take into account the intrinsic uncertainty
due to our limited information on F (finite number of samples n)
so use x1, ..., xn to also compute an interval that likely contains the true θ
• The frequentist solution
there are several ways to compute an interval estimate for θ
we follow the frequentist approach: computing a confidence interval
2/20
Point Estimates and Interval Estimates
• Given x1, ..., xn from a distribution F, estimate an unknown parameter θ of F.
E.g. given x1, ..., xn drawn from N(µ, σ²), we want to estimate the variance σ².
E.g. given x1, ..., xn and y1, ..., yn, we want to estimate the correlation ρ(X, Y).
• A point estimate is a statistic ˆθ = T(X1, ..., Xn) estimating the unknown θ.
A classical example is the maximum likelihood estimator (MLE).
• Important properties of an estimator are bias and variance.
Bias(ˆθ) = E[ˆθ] − θ        Var(ˆθ) = E[(ˆθ − E[ˆθ])²]
Given x1, ..., xn drawn from N(µ, σ²), the MLE for σ² is ˆσ² = (1/n) Σi (xi − x̄)².
This estimator has bias −σ²/n and variance 2σ⁴/n.
• An interval estimate is an interval statistic
I(X1, ..., Xn) = [L(X1, ..., Xn), U(X1, ..., Xn)]
containing possible values for the unknown θ.
Two classical examples are the confidence interval and the credible interval.
3/20
Confidence Intervals
• I(X1, ..., Xn) is a confidence interval for θ with confidence level α if, for
any fixed value of the unknown parameter θ,
P(θ ∈ I(X1, ..., Xn)) = α
• If α = 0.95, this means that for any fixed θ, if we repeatedly draw n samples
X1, ..., Xn ∼ Fθ, say 100 times, and compute a confidence interval each time, then
on average 95 of those intervals contain the true value of θ.
• This does not mean that, given the samples x1, ..., xn, the probability that
θ ∈ I(x1, ..., xn) is α! This is a common misunderstanding: the probability in our
definition is over the samples X1, ..., Xn, not over θ.
Indeed, in the frequentist approach θ is a fixed (albeit unknown) value, not a
random variable with an associated probability.
The Bayesian approach instead fixes x1, ..., xn and assigns a posterior distribution
to θ. This yields a credible interval, which contains θ with probability α.
4/20
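The long-run coverage property above can be checked empirically. Here is a minimal simulation sketch (assuming numpy is available; the parameters µ, σ, n and the seed are arbitrary choices of ours) that repeatedly draws samples from N(µ, σ²) with σ known and counts how often the resulting interval covers µ:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 3.0, 2.0, 30      # arbitrary true parameters and sample size
z = 1.96                         # approx. standard normal quantile z[0.975]

n_trials = 10_000
covered = 0
for _ in range(n_trials):
    x = rng.normal(mu, sigma, size=n)
    half = z * sigma / np.sqrt(n)    # sigma known -> exact interval for mu
    covered += (x.mean() - half <= mu <= x.mean() + half)

coverage = covered / n_trials        # should be close to 0.95
```

On average about 95% of the intervals cover µ; for any single realized interval, the frequentist statement is only about this long-run frequency.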
Confidence Intervals: Example
90% Confidence Interval (frequentist approach)
θ is fixed, but unknown
X1, ..., Xn are drawn from Fθ 10 times
build an interval for each sample (X1, ..., Xn)
9/10 of the intervals contain the true θ
90% Credible Interval (Bayesian approach)
associate to θ a probability measuring our belief
x1, ..., xn are fixed observations; update the posterior belief on θ
build an interval containing θ with probability 90%
5/20
How do we compute Confidence Intervals?
• Depending on the situation, we have to use a different approach
Exact method: based on a known distribution of ˆθ
Asymptotic method: based on asymptotic normality of the MLE
Jackknife method: simple resampling technique
Bootstrap method: more elaborate resampling technique
6/20
Exact method
• The value of θ is fixed, but ˆθ = T(X1, ..., Xn) is a random variable. If we
know the exact distribution of ˆθ we can compute an exact confidence interval.
Example: Normal distribution N(µ, σ²)
The MLE for the mean µ is the sample mean ˆµ = x̄, and we have
    (ˆµ − µ) / (σ/√n) ∼ N(0, 1)
so if σ² is known we can compute an exact confidence interval for µ.
The MLE for the variance is the sample variance ˆσ² = (1/n) Σi (xi − x̄)². This
estimator is biased, so we consider instead the Bessel correction yielding
s² = (1/(n−1)) Σi (xi − x̄)², and we have
    (n − 1) s² / σ² ∼ χ²(n−1).
If σ² is unknown, we can use s² to compute an exact confidence interval for µ
by using
    (ˆµ − µ) / (s/√n) ∼ t(n−1).
7/20
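The t-based interval above can be written, for instance, as follows (a sketch assuming numpy and scipy; the helper name exact_mean_ci is ours):

```python
import numpy as np
from scipy import stats

def exact_mean_ci(x, alpha=0.95):
    """Exact confidence interval for the mean of a normal sample, sigma unknown."""
    n = len(x)
    m = np.mean(x)
    s = np.std(x, ddof=1)                      # Bessel-corrected estimate s
    t = stats.t.ppf((1 + alpha) / 2, df=n - 1)
    half = t * s / np.sqrt(n)
    return m - half, m + half

rng = np.random.default_rng(42)
x = rng.normal(10.0, 3.0, size=25)
lo, hi = exact_mean_ci(x)                      # interval centered at the sample mean
```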
Exact Method: Pros and Cons
Pros
+ confidence level is exactly α
+ closed-form expression allows fast computation
+ works for any sample size n
Cons
– if we do not know the distribution F or the family of distributions it
belongs to (non-parametric statistics), we cannot compute the exact
distribution of ˆθ
– even if F is known, the exact distribution of ˆθ is often impossible to
compute: θ = ρ(X, Y ), θ = Median(X), ...
8/20
Asymptotic Method
• In many cases we choose ˆθ as the MLE for θ. This estimator has (under
reasonable assumptions) the key property of asymptotic normality
√n (ˆθ − θ) →d N(0, I(θ)⁻¹)
where I(θ) = −EX[ℓ″(θ; x)] is the Fisher information and ℓ(θ; x) = log p(x; θ).
Example: Exponential distribution Exp(λ)
The p.d.f. is p(x; λ) = λe^(−λx), so we have
ℓ(λ; x) = log(λ) − λx and ℓ(λ; x1, ..., xn) = n log(λ) − λ Σi xi
so that the MLE is ˆλ = 1/x̄. The Fisher information is
−EX[ℓ″(λ; x)] = −EX[−1/λ²] = 1/λ²
so we can use the asymptotic approximation
√n (ˆλ − λ) ≈ N(0, λ²)  ⇒  ˆλ/λ ≈ N(1, 1/n)
Example: Bernoulli distribution Bernoulli(p)
The p.m.f. is p(x; p) = pˣ (1 − p)^(1−x), so we have
ℓ(p; x) = x log(p) + (1 − x) log(1 − p)
so that the MLE is ˆp = x̄. The Fisher information is
−EX[ℓ″(p; x)] = −EX[−x/p² − (1 − x)/(1 − p)²] = p/p² + (1 − p)/(1 − p)² = 1/(p(1 − p))
so we can use the asymptotic approximation
√n (ˆp − p) ≈ N(0, p(1 − p))  ⇒  ˆp − p ≈ N(0, ˆp(1 − ˆp)/n)
9/20
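For the Bernoulli case, the asymptotic approximation yields the familiar Wald interval. A sketch (assuming numpy; wald_ci is our own name, and z = 1.96 is hard-coded for a 95% level):

```python
import numpy as np

def wald_ci(x, z=1.96):
    """Asymptotic (Wald) interval for Bernoulli p, variance p_hat(1-p_hat)/n."""
    n = len(x)
    p_hat = np.mean(x)
    half = z * np.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

rng = np.random.default_rng(1)
x = rng.binomial(1, 0.3, size=500)   # 500 coin flips with true p = 0.3
lo, hi = wald_ci(x)
```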
Asymptotic Method: Example
Consider Exp(λ) and the estimator ˆk = Σi Xi / n for k = 1/λ = 1/3.
It can be shown that the exact distribution is 2n ˆk / k ∼ χ²(2n).
We have seen that the asymptotic distribution is ˆk ≈ N(k, k²/n).
10/20
Asymptotic Method: Pros and Cons
Pros
+ easier computation if sampling distribution F is known
+ expected information I(θ) may be replaced by observed information I(ˆθ)
Cons
– works well only for n sufficiently large (typically at least n > 50)
– neglects the skewness of the distribution of ˆθ
– requires knowing F or the family of distributions it belongs to
– can be applied only if ˆθ is asymptotically normal (typically the MLE)
11/20
Jackknife Method
• Given any estimator ˆθ, the jackknife is based on the n leave-1-out estimators
ˆθ(i) = T(X1, ..., Xi−1, Xi+1, ..., Xn), with mean ˆθ(·) = (1/n) Σi ˆθ(i).
We also consider the n pseudo-values
˜θi = n ˆθ − (n − 1) ˆθ(i)
• A first use of the jackknife is bias correction. Indeed,
biasjack = (n − 1)(ˆθ(·) − ˆθ)
is a linear estimator of Bias(ˆθ) (i.e. its error is O(1/n²)). Then a bias-corrected
estimator is given by the mean of the pseudo-values
ˆθjack = n ˆθ − (n − 1) ˆθ(·) = (1/n) Σi ˜θi
• Similarly, the jackknife is used for the estimation of other properties of ˆθ.
E.g. the variance estimator for Var(ˆθ) given by
varjack = ˜s²/n, where ˜s² = (1/(n−1)) Σi (˜θi − ˆθjack)² is the sample variance of
the pseudo-values, or equivalently
varjack = ((n−1)/n) Σi (ˆθ(i) − ˆθ(·))²
is a linear estimator, assuming that ˆθ = T(X1, ..., Xn) is smooth.
• We may use varjack and ˆθjack to compute a jackknife approximated
confidence interval using the asymptotic approximation
(ˆθjack − θ) / (˜s/√n) ≈ t(n−1)
but in practice this approximation is often too crude.
12/20
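The leave-1-out recipe above can be sketched in a few lines (assuming numpy; jackknife is our own helper name). Applied to the biased plug-in variance (1/n) Σ (xi − x̄)², the jackknife bias estimate comes out negative, as expected:

```python
import numpy as np

def jackknife(x, stat):
    """Leave-1-out jackknife estimates of the bias and variance of stat."""
    n = len(x)
    theta_hat = stat(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])  # theta_(i)
    theta_dot = loo.mean()                                     # theta_(.)
    bias = (n - 1) * (theta_dot - theta_hat)
    var = (n - 1) / n * np.sum((loo - theta_dot) ** 2)
    return bias, var

rng = np.random.default_rng(0)
x = rng.normal(size=50)
bias, var = jackknife(x, np.var)   # np.var is the biased MLE (ddof=0)
```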
Jackknife Method: Limitations
• If ˆθ is non-smooth, the jackknife variance estimator may be non-consistent.
If ˆθ is the sample median, it can be proved that
varjack / Var(ˆθ) →d (χ²(2) / 2)²
• To fix that we introduce an extension of the jackknife. This time we consider
the C(n, d) leave-d-out estimators obtained by computing the statistic T on every
possible subset of X1, ..., Xn obtained by removing d elements.
For ˆθ = sample median, choosing √n < d < n − 1 yields a consistent variance
estimator
varjack = ((n − d) / d) · (1 / C(n, d)) Σs (ˆθ(s) − ˆθ(·))²
where the sum runs over all subsets s of size n − d.
13/20
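The leave-d-out estimator can be sketched by enumerating all C(n, d) subsets (assuming numpy; jackknife_d_var is our own name, and this exhaustive loop is feasible only for small n and d):

```python
import itertools
import numpy as np

def jackknife_d_var(x, stat, d):
    """Leave-d-out jackknife variance: ((n-d)/d) * mean squared deviation
    of the statistic over all C(n, d) subsets of size n - d."""
    n = len(x)
    reps = np.array([stat(np.delete(x, list(idx)))
                     for idx in itertools.combinations(range(n), d)])
    return (n - d) / d * np.mean((reps - reps.mean()) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=12)
v = jackknife_d_var(x, np.median, d=4)   # C(12, 4) = 495 subsets
```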
Jackknife Method: Example
Consider (X, Y) ∼ F for some F and the Pearson correlation coefficient ρ.
It can be shown that, with
ˆρ = Σi (xi − x̄)(yi − ȳ) / (√(Σi (xi − x̄)²) · √(Σi (yi − ȳ)²)),
the estimator ˆρ² is biased.
F =⇒ (x1, y1), (x2, y2), (x3, y3), ..., (x10, y10) −→ ˆρ²
leave-1-out:
(x2, y2), (x3, y3), (x4, y4), ..., (x10, y10) −→ ˆρ²(1)
(x1, y1), (x3, y3), (x4, y4), ..., (x10, y10) −→ ˆρ²(2)
...
(x1, y1), (x2, y2), (x3, y3), ..., (x9, y9) −→ ˆρ²(10)
The jackknife estimator ˆρ²jack = 10 ˆρ² − 9 ˆρ²(·) has its bias corrected up to O(1/n²).
14/20
Jackknife Method: Pros and Cons
Pros
+ can be used for non-parametric statistics
+ fast computation
+ bias correction up to O(1/n²)
+ leave-d-out provides a consistent variance estimator
Cons
– leave-1-out may be non-consistent
– leave-d-out is more expensive
– confidence intervals are based on crude approximations; bootstrap is better
15/20
Bootstrap Method
• The bootstrap consists of B resamplings with replacement from x1, ..., xn.
This is equivalent to sampling from the empirical CDF ˆF.
• For each of the B resamples (x1⁽ᵇ⁾, ..., xn⁽ᵇ⁾), compute the estimator ˆθ*(b). We
use the values ˆθ*(b) to estimate the distribution of ˆθ* = ˆθ*(ˆF), which in turn is
an approximation of the distribution of interest ˆθ = ˆθ(F).
• To compute point estimates for the properties of ˆθ we use the pair (ˆθ*, ˆθ)
to approximate the pair (ˆθ, θ).
The bootstrap bias estimator for Bias(ˆθ) = E[ˆθ] − θ is given by
biasboot = E[ˆθ*] − ˆθ = (1/B) Σb ˆθ*(b) − ˆθ
so that the bias-corrected bootstrap estimator reads
ˆθboot = ˆθ − biasboot = 2 ˆθ − (1/B) Σb ˆθ*(b)
Similarly, the bootstrap variance estimator for Var(ˆθ) = E[(ˆθ − E[ˆθ])²] is
varboot = (1/(B−1)) Σb (ˆθ*(b) − (1/B) Σb' ˆθ*(b'))²
• Notice that the bootstrap is more general than the jackknife, since it estimates
the whole distribution of ˆθ and not only its bias and variance.
Actually, one can prove that the jackknife is a first-order approximation of the
bootstrap.
16/20
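The bias and variance estimators above translate directly into code (a sketch assuming numpy; bootstrap_bias_var is our own name):

```python
import numpy as np

def bootstrap_bias_var(x, stat, B=2000, seed=0):
    """Bootstrap estimates of Bias(theta_hat) and Var(theta_hat)."""
    rng = np.random.default_rng(seed)
    theta_hat = stat(x)
    # B resamples with replacement, i.e. samples from the empirical CDF
    reps = np.array([stat(rng.choice(x, size=len(x), replace=True))
                     for _ in range(B)])
    bias = reps.mean() - theta_hat               # E[theta*] - theta_hat
    var = reps.var(ddof=1)
    theta_boot = 2 * theta_hat - reps.mean()     # bias-corrected estimator
    return bias, var, theta_boot

rng = np.random.default_rng(7)
x = rng.normal(size=40)
bias, var, theta_boot = bootstrap_bias_var(x, np.median)
```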
Bootstrap Method: Example
Real World                 Bootstrap World
F                          ˆF
⇓                          ⇓
x1, ..., xn                x*1, ..., x*n
↓                          ↓
ˆθ                         ˆθ*
17/20
Bootstrap Method: Confidence Intervals
• Different techniques are available to compute bootstrap interval estimates.
Here p[α] denotes the α-quantile of distribution p, with z[α] for standard normal.
• The pivotal interval comes from P(l < ˆθ − ˆθ* < u) ≈ P(l < θ − ˆθ < u)
CI = (2ˆθ − ˆθ*[1 − α/2], 2ˆθ − ˆθ*[α/2])
• The studentized interval takes an approach similar to the jackknife's
CI = (ˆθjack − t(n−1)[1 − α/2] √varjack, ˆθjack + t(n−1)[1 − α/2] √varjack)
• The BCa interval (bias-corrected and accelerated)
CI = (ˆθ*[g(α)], ˆθ*[g(1 − α)]), with g(α) = Φ(z0 + (z0 + z[α]) / (1 − a(z0 + z[α])))
where z0 = Φ⁻¹(#{ˆθ*(b) < ˆθ}/B) is the bias correction and the acceleration
a = (1/6) Skew(ℓ′(θ; x)) ≈ Σi (ˆθ(·) − ˆθ(i))³ / (6 [Σi (ˆθ(·) − ˆθ(i))²]^(3/2))
is approximated using the jackknife. The BCa interval has an excellent O(1/n)
coverage error, so it is preferred to the other bootstrap methods.
18/20
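As an illustration, the pivotal interval can be sketched as follows (assuming numpy; here alpha is the confidence level, e.g. 0.95, and pivotal_ci is our own name):

```python
import numpy as np

def pivotal_ci(x, stat, alpha=0.95, B=2000, seed=0):
    """Pivotal bootstrap CI: (2*theta_hat - q_hi, 2*theta_hat - q_lo)."""
    rng = np.random.default_rng(seed)
    theta_hat = stat(x)
    reps = np.array([stat(rng.choice(x, size=len(x), replace=True))
                     for _ in range(B)])
    # quantiles of the bootstrap distribution of theta*
    q_lo, q_hi = np.quantile(reps, [(1 - alpha) / 2, (1 + alpha) / 2])
    return 2 * theta_hat - q_hi, 2 * theta_hat - q_lo

rng = np.random.default_rng(3)
x = rng.normal(size=60)
lo, hi = pivotal_ci(x, np.mean)
```

The BCa interval is implemented in scipy.stats.bootstrap (method="BCa"), which is preferable in practice to hand-rolled versions.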
Bootstrap Method: Pros and Cons
Pros
+ can be used for non-parametric statistics
+ more powerful than jackknife, it approximates whole distribution of ˆθ
+ more accurate than jackknife for computing variance of ˆθ
+ BCa interval has O(1/n) coverage error
Cons
– more expensive than jackknife (B should be large enough)
– if n is very small bootstrap may fail
– if the family of F is known, exact or asymptotic methods give much better results
19/20
Conclusions
• We want to estimate a parameter θ, using samples X1, ..., Xn ∼ Fθ.
• Confidence intervals are needed to express uncertainty of estimator ˆθ.
• If distribution F is known, preferably use exact or asymptotic methods.
• If F is unknown or distribution of ˆθ is complex, use jackknife or bootstrap.
• Use the jackknife to estimate properties of ˆθ. Not so good for confidence intervals.
• Use the bootstrap to estimate the distribution of ˆθ. Good for confidence intervals (BCa).
20/20

More Related Content

What's hot

Maximum likelihood estimation
Maximum likelihood estimationMaximum likelihood estimation
Maximum likelihood estimationzihad164
 
Estimation in statistics
Estimation in statisticsEstimation in statistics
Estimation in statisticsRabea Jamal
 
Covariance and correlation
Covariance and correlationCovariance and correlation
Covariance and correlationRashid Hussain
 
Logistic regression (blyth 2006) (simplified)
Logistic regression (blyth 2006) (simplified)Logistic regression (blyth 2006) (simplified)
Logistic regression (blyth 2006) (simplified)MikeBlyth
 
Binomial distribution
Binomial distributionBinomial distribution
Binomial distributionSonamWadhwa3
 
Binomial probability distribution
Binomial probability distributionBinomial probability distribution
Binomial probability distributionhamza munir
 
Ppt for 1.1 introduction to statistical inference
Ppt for 1.1 introduction to statistical inferencePpt for 1.1 introduction to statistical inference
Ppt for 1.1 introduction to statistical inferencevasu Chemistry
 
Introduction to Maximum Likelihood Estimator
Introduction to Maximum Likelihood EstimatorIntroduction to Maximum Likelihood Estimator
Introduction to Maximum Likelihood EstimatorAmir Al-Ansary
 
Ordinary least squares linear regression
Ordinary least squares linear regressionOrdinary least squares linear regression
Ordinary least squares linear regressionElkana Rorio
 
Logistic Ordinal Regression
Logistic Ordinal RegressionLogistic Ordinal Regression
Logistic Ordinal RegressionSri Ambati
 
Fundamentals Probability 08072009
Fundamentals Probability 08072009Fundamentals Probability 08072009
Fundamentals Probability 08072009Sri Harsha gadiraju
 

What's hot (20)

Maximum likelihood estimation
Maximum likelihood estimationMaximum likelihood estimation
Maximum likelihood estimation
 
Estimation in statistics
Estimation in statisticsEstimation in statistics
Estimation in statistics
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Normal as Approximation to Binomial
Normal as Approximation to Binomial  Normal as Approximation to Binomial
Normal as Approximation to Binomial
 
Covariance and correlation
Covariance and correlationCovariance and correlation
Covariance and correlation
 
Probability
ProbabilityProbability
Probability
 
Logistic regression (blyth 2006) (simplified)
Logistic regression (blyth 2006) (simplified)Logistic regression (blyth 2006) (simplified)
Logistic regression (blyth 2006) (simplified)
 
Logistic Regression Analysis
Logistic Regression AnalysisLogistic Regression Analysis
Logistic Regression Analysis
 
probability
probabilityprobability
probability
 
F-Distribution
F-DistributionF-Distribution
F-Distribution
 
Binomial distribution
Binomial distributionBinomial distribution
Binomial distribution
 
Binomial probability distribution
Binomial probability distributionBinomial probability distribution
Binomial probability distribution
 
Joint probability
Joint probabilityJoint probability
Joint probability
 
Normal distribution
Normal distributionNormal distribution
Normal distribution
 
Ppt for 1.1 introduction to statistical inference
Ppt for 1.1 introduction to statistical inferencePpt for 1.1 introduction to statistical inference
Ppt for 1.1 introduction to statistical inference
 
Chapter11.2
Chapter11.2Chapter11.2
Chapter11.2
 
Introduction to Maximum Likelihood Estimator
Introduction to Maximum Likelihood EstimatorIntroduction to Maximum Likelihood Estimator
Introduction to Maximum Likelihood Estimator
 
Ordinary least squares linear regression
Ordinary least squares linear regressionOrdinary least squares linear regression
Ordinary least squares linear regression
 
Logistic Ordinal Regression
Logistic Ordinal RegressionLogistic Ordinal Regression
Logistic Ordinal Regression
 
Fundamentals Probability 08072009
Fundamentals Probability 08072009Fundamentals Probability 08072009
Fundamentals Probability 08072009
 

Similar to Confidence Intervals––Exact Intervals, Jackknife, and Bootstrap

Problem_Session_Notes
Problem_Session_NotesProblem_Session_Notes
Problem_Session_NotesLu Mao
 
Fisher_info_ppt and mathematical process to find time domain and frequency do...
Fisher_info_ppt and mathematical process to find time domain and frequency do...Fisher_info_ppt and mathematical process to find time domain and frequency do...
Fisher_info_ppt and mathematical process to find time domain and frequency do...praveenyadav2020
 
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapStatistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapChristian Robert
 
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier Analysis
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier AnalysisDSP_FOEHU - MATLAB 02 - The Discrete-time Fourier Analysis
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier AnalysisAmr E. Mohamed
 
ISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptxISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptxssuser1eba67
 
Applications to Central Limit Theorem and Law of Large Numbers
Applications to Central Limit Theorem and Law of Large NumbersApplications to Central Limit Theorem and Law of Large Numbers
Applications to Central Limit Theorem and Law of Large NumbersUniversity of Salerno
 
Communication Theory - Random Process.pdf
Communication Theory - Random Process.pdfCommunication Theory - Random Process.pdf
Communication Theory - Random Process.pdfRajaSekaran923497
 
Point Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsPoint Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsUniversity of Salerno
 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptxfuad80
 
Chapter_09_ParameterEstimation.pptx
Chapter_09_ParameterEstimation.pptxChapter_09_ParameterEstimation.pptx
Chapter_09_ParameterEstimation.pptxVimalMehta19
 
Stratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationStratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationUmberto Picchini
 

Similar to Confidence Intervals––Exact Intervals, Jackknife, and Bootstrap (20)

Talk 3
Talk 3Talk 3
Talk 3
 
Problem_Session_Notes
Problem_Session_NotesProblem_Session_Notes
Problem_Session_Notes
 
Fisher_info_ppt and mathematical process to find time domain and frequency do...
Fisher_info_ppt and mathematical process to find time domain and frequency do...Fisher_info_ppt and mathematical process to find time domain and frequency do...
Fisher_info_ppt and mathematical process to find time domain and frequency do...
 
Statistics Homework Help
Statistics Homework HelpStatistics Homework Help
Statistics Homework Help
 
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapStatistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
 
Talk 2
Talk 2Talk 2
Talk 2
 
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier Analysis
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier AnalysisDSP_FOEHU - MATLAB 02 - The Discrete-time Fourier Analysis
DSP_FOEHU - MATLAB 02 - The Discrete-time Fourier Analysis
 
ISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptxISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptx
 
U unit7 ssb
U unit7 ssbU unit7 ssb
U unit7 ssb
 
Data Analysis Assignment Help
Data Analysis Assignment HelpData Analysis Assignment Help
Data Analysis Assignment Help
 
Applications to Central Limit Theorem and Law of Large Numbers
Applications to Central Limit Theorem and Law of Large NumbersApplications to Central Limit Theorem and Law of Large Numbers
Applications to Central Limit Theorem and Law of Large Numbers
 
Communication Theory - Random Process.pdf
Communication Theory - Random Process.pdfCommunication Theory - Random Process.pdf
Communication Theory - Random Process.pdf
 
Point Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsPoint Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis tests
 
Excel Homework Help
Excel Homework HelpExcel Homework Help
Excel Homework Help
 
Propensity albert
Propensity albertPropensity albert
Propensity albert
 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptx
 
Chapter_09_ParameterEstimation.pptx
Chapter_09_ParameterEstimation.pptxChapter_09_ParameterEstimation.pptx
Chapter_09_ParameterEstimation.pptx
 
Montecarlophd
MontecarlophdMontecarlophd
Montecarlophd
 
the ABC of ABC
the ABC of ABCthe ABC of ABC
the ABC of ABC
 
Stratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationStratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computation
 

More from Francesco Casalegno

DVC - Git-like Data Version Control for Machine Learning projects
DVC - Git-like Data Version Control for Machine Learning projectsDVC - Git-like Data Version Control for Machine Learning projects
DVC - Git-like Data Version Control for Machine Learning projectsFrancesco Casalegno
 
Ordinal Regression and Machine Learning: Applications, Methods, Metrics
Ordinal Regression and Machine Learning: Applications, Methods, MetricsOrdinal Regression and Machine Learning: Applications, Methods, Metrics
Ordinal Regression and Machine Learning: Applications, Methods, MetricsFrancesco Casalegno
 
Markov Chain Monte Carlo Methods
Markov Chain Monte Carlo MethodsMarkov Chain Monte Carlo Methods
Markov Chain Monte Carlo MethodsFrancesco Casalegno
 
Hyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine LearningHyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine LearningFrancesco Casalegno
 
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...Francesco Casalegno
 
C++11: Rvalue References, Move Semantics, Perfect Forwarding
C++11: Rvalue References, Move Semantics, Perfect ForwardingC++11: Rvalue References, Move Semantics, Perfect Forwarding
C++11: Rvalue References, Move Semantics, Perfect ForwardingFrancesco Casalegno
 

More from Francesco Casalegno (8)

DVC - Git-like Data Version Control for Machine Learning projects
DVC - Git-like Data Version Control for Machine Learning projectsDVC - Git-like Data Version Control for Machine Learning projects
DVC - Git-like Data Version Control for Machine Learning projects
 
Ordinal Regression and Machine Learning: Applications, Methods, Metrics
Ordinal Regression and Machine Learning: Applications, Methods, MetricsOrdinal Regression and Machine Learning: Applications, Methods, Metrics
Ordinal Regression and Machine Learning: Applications, Methods, Metrics
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Markov Chain Monte Carlo Methods
Markov Chain Monte Carlo MethodsMarkov Chain Monte Carlo Methods
Markov Chain Monte Carlo Methods
 
Hyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine LearningHyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine Learning
 
Smart Pointers in C++
Smart Pointers in C++Smart Pointers in C++
Smart Pointers in C++
 
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
 
C++11: Rvalue References, Move Semantics, Perfect Forwarding
C++11: Rvalue References, Move Semantics, Perfect ForwardingC++11: Rvalue References, Move Semantics, Perfect Forwarding
C++11: Rvalue References, Move Semantics, Perfect Forwarding
 

Recently uploaded

(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...Scintica Instrumentation
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptxryanrooker
 
Use of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxUse of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxRenuJangid3
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Silpa
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsSérgio Sacani
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Silpa
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryAlex Henderson
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Silpa
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxseri bangash
 
Velocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.pptVelocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.pptRakeshMohan42
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flyPRADYUMMAURYA1
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...Monika Rani
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Serviceshivanisharma5244
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and ClassificationsAreesha Ahmad
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Exploring Criminology and Criminal Behaviour.pdf
Exploring Criminology and Criminal Behaviour.pdfExploring Criminology and Criminal Behaviour.pdf
Exploring Criminology and Criminal Behaviour.pdfrohankumarsinghrore1
 

Recently uploaded (20)

(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
Use of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxUse of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptx
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 

Confidence Intervals: Exact Intervals, Jackknife, and Bootstrap

  • 1. Confidence Intervals: Exact Intervals, Jackknife, and Bootstrap. Francesco Casalegno. 1/20
  • 2. Why do we need Confidence Intervals?
    Very common use case: we have a few samples x1, ..., xn from an unknown distribution F, and we need to estimate some parameter θ of the underlying distribution.
    A single value or an interval of values? We use x1, ..., xn to compute our best guess for the parameter θ, but this single value does not take into account the intrinsic uncertainty due to our limited information on F (finite number of samples n). So we also use x1, ..., xn to compute an interval that likely contains the true θ.
    The frequentist solution: there are several ways to compute an interval estimate for θ; we follow the frequentist approach and compute a confidence interval. 2/20
  • 3. Point Estimates and Interval Estimates
    Given x1, ..., xn from a distribution F, we want to estimate an unknown parameter θ of F. For example: given x1, ..., xn drawn from N(µ, σ2), estimate the variance σ2; given x1, ..., xn and y1, ..., yn, estimate the correlation ρ(X, Y).
    A point estimate is a statistic ˆθ = T(X1, ..., Xn) estimating the unknown θ. A classical example is the maximum likelihood estimator (MLE). Important properties of an estimator are its bias and variance: Bias(ˆθ) = E[ˆθ] − θ and Var(ˆθ) = E[(ˆθ − E[ˆθ])2]. For x1, ..., xn drawn from N(µ, σ2), the MLE for σ2 is ˆσ2 = (1/n) Σi (xi − x̄)2; this estimator has bias −σ2/n and variance ≈ 2σ4/n.
    An interval estimate is an interval statistic I(X1, ..., Xn) = [L(X1, ..., Xn), U(X1, ..., Xn)] containing possible values for the unknown θ. Two classical examples are the confidence interval and the credible interval. 3/20
  • 4. Confidence Intervals
    I(X1, ..., Xn) is a confidence interval for θ with confidence level α if, for any fixed value of the unknown parameter θ, P(θ ∈ I(X1, ..., Xn)) = α.
    If α = 0.95, this means that for any fixed θ, if we repeat the sampling of n values X1, ..., Xn ∼ Fθ 100 times and compute a confidence interval each time, on average 95 of these intervals contain the true value of θ.
    This does not mean that, given the samples x1, ..., xn, the probability that θ ∈ I(x1, ..., xn) is α! This is a common misunderstanding: the probability in our definition is over the samples X1, ..., Xn, not over θ. Indeed, in the frequentist approach θ is a fixed (albeit unknown) value, not a random variable with an associated probability. In a Bayesian approach we instead fix x1, ..., xn and assign a posterior distribution to θ; this yields a credible interval, which contains θ with probability α. 4/20
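The coverage interpretation above can be checked with a small simulation (a minimal sketch, assuming NumPy is available; `mean_ci` is our helper name): draw n samples from N(θ, σ2) many times, build the exact 95% interval for the mean with known σ each time, and count how often the interval contains the true θ.

```python
import numpy as np

def mean_ci(x, sigma):
    """Exact 95% CI for the mean of a normal sample, sigma known."""
    z = 1.96  # ~97.5% quantile of the standard normal
    half = z * sigma / np.sqrt(len(x))
    m = x.mean()
    return m - half, m + half

rng = np.random.default_rng(0)
theta, sigma, n, reps = 3.0, 2.0, 50, 2000
covered = 0
for _ in range(reps):
    x = rng.normal(theta, sigma, size=n)
    lo, hi = mean_ci(x, sigma)
    covered += (lo <= theta <= hi)  # does this interval contain the true theta?
print(covered / reps)  # empirical coverage, typically close to 0.95
```

The fraction printed is a Monte Carlo estimate of the coverage probability, so it fluctuates slightly around 0.95 from seed to seed.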
  • 5. Confidence Intervals: Example
    90% Confidence Interval (frequentist approach): θ is fixed, but unknown. X1, ..., Xn are drawn from Fθ 10 times, and an interval is built for each sample (X1, ..., Xn); about 9/10 of the intervals contain the true θ.
    90% Credible Interval (Bayesian approach): we associate to θ a probability measuring our belief. The observations x1, ..., xn are fixed and used to update the posterior belief on θ; we then build an interval containing θ with probability 90%. 5/20
  • 6. How do we compute Confidence Intervals?
    Depending on the situation, we have to use a different approach:
    Exact method: based on a known distribution of ˆθ.
    Asymptotic method: based on asymptotic normality of the MLE.
    Jackknife method: simple resampling technique.
    Bootstrap method: more elaborate resampling technique. 6/20
  • 7. Exact Method
    The value of θ is fixed, but ˆθ = T(X1, ..., Xn) is a random variable. If we know the exact distribution of ˆθ we can compute an exact confidence interval.
    Example: Normal distribution N(µ, σ2). The MLE for the mean µ is the sample mean ˆµ = x̄, and we have (ˆµ − µ)/(σ/√n) ∼ N(0, 1), so if σ2 is known we can compute an exact confidence interval for µ.
    The MLE for the variance is the sample variance ˆσ2 = (1/n) Σi (xi − x̄)2. This estimator is biased, so we consider instead the Bessel-corrected version s2 = (1/(n−1)) Σi (xi − x̄)2, for which (n − 1) s2/σ2 ∼ χ2(n−1). If σ2 is unknown, we can use s2 to compute an exact confidence interval for µ by using (ˆµ − µ)/(s/√n) ∼ t(n−1). 7/20
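As a sketch of the exact method (assuming NumPy and SciPy; `t_interval` is our helper name), the t(n−1) pivot above translates directly into code:

```python
import numpy as np
from scipy import stats

def t_interval(x, alpha=0.95):
    """Exact CI for the mean mu of a normal sample, sigma unknown."""
    n = len(x)
    m = np.mean(x)
    s = np.std(x, ddof=1)                    # Bessel-corrected sample std
    tq = stats.t.ppf((1 + alpha) / 2, df=n - 1)  # t(n-1) quantile
    half = tq * s / np.sqrt(n)
    return m - half, m + half

x = np.array([4.1, 5.3, 4.8, 5.9, 4.4, 5.1, 4.7, 5.6])
lo, hi = t_interval(x)
```

The same interval can also be obtained with `scipy.stats.t.interval`, which wraps exactly this computation.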
  • 8. Exact Method: Pros and Cons
    Pros: the confidence level is exactly α; a closed-form expression allows fast computation; it works for any sample size n.
    Cons: if we do not know the distribution F or the family of distributions it belongs to (non-parametric statistics), we cannot compute the exact distribution of ˆθ; even if F is known, the exact distribution of ˆθ is often impossible to compute, e.g. for θ = ρ(X, Y) or θ = Median(X). 8/20
  • 9. Asymptotic Method
    In many cases we choose ˆθ as the MLE for θ. Under reasonable assumptions, this estimator has the key property of asymptotic normality: √n(ˆθ − θ) →d N(0, I(θ)−1), where I(θ) = −EX[ℓ''(θ; x)] is the Fisher information and ℓ(θ; x) = log p(x; θ) is the log-likelihood.
    Example: Exponential distribution Exp(λ). The p.d.f. is p(x; λ) = λ e^(−λx), so ℓ(λ; x) = log λ − λx and ℓ(λ; x1, ..., xn) = n log λ − λ Σi xi, giving the MLE ˆλ = 1/x̄. The Fisher information is −EX[ℓ''(λ; x)] = −EX[−1/λ2] = 1/λ2, so we can use the asymptotic approximation √n(ˆλ − λ) ≈ N(0, λ2), i.e. ˆλ/λ ≈ N(1, 1/n).
    Example: Bernoulli distribution Bernoulli(p). The p.m.f. is p(x; p) = p^x (1 − p)^(1−x), so ℓ(p; x) = x log p + (1 − x) log(1 − p), giving the MLE ˆp = x̄. The Fisher information is −EX[ℓ''(p; x)] = −EX[−x/p2 − (1 − x)/(1 − p)2] = 1/p + 1/(1 − p) = 1/(p(1 − p)), so we can use the asymptotic approximation √n(ˆp − p) ≈ N(0, p(1 − p)), i.e. ˆp − p ≈ N(0, ˆp(1 − ˆp)/n). 9/20
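The Bernoulli result above is the familiar Wald interval, ˆp ± z√(ˆp(1 − ˆp)/n). A minimal sketch (assuming NumPy; `wald_interval` is a hypothetical helper name):

```python
import numpy as np

def wald_interval(x):
    """95% asymptotic (Wald) CI for Bernoulli p: p_hat +/- z*sqrt(p_hat(1-p_hat)/n)."""
    z = 1.96  # ~97.5% quantile of the standard normal
    n = len(x)
    p_hat = np.mean(x)
    half = z * np.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

rng = np.random.default_rng(1)
x = rng.binomial(1, 0.3, size=200)  # 200 Bernoulli(0.3) draws
lo, hi = wald_interval(x)
```

Note that, as the cons on the next slide suggest, this interval behaves poorly when n is small or ˆp is near 0 or 1.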
  • 10. Asymptotic Method: Example
    Consider Exp(λ) and the estimator ˆk = Σi Xi/n for k = 1/λ = 1/3. It can be shown that the exact distribution satisfies 2nˆk/k ∼ χ2(2n), i.e. ˆk ∼ (k/2n) χ2(2n). We have seen that the asymptotic distribution is ˆk ≈ N(k, k2/n). 10/20
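The comparison can be reproduced numerically (a sketch assuming NumPy and SciPy): the exact density of ˆk follows by change of variables from 2nˆk/k ∼ χ2(2n); both densities should integrate to one, with the exact one having mean k.

```python
import numpy as np
from scipy import stats

# k_hat = (1/n) sum(X_i), X_i ~ Exp(lambda), k = 1/lambda.
# Exactly, 2*n*k_hat/k ~ chi2(2n), so k_hat has density (2n/k)*chi2_pdf(2n*t/k, 2n).
k, n = 1 / 3, 25
grid = np.linspace(1e-6, 2.0, 4000)
dx = grid[1] - grid[0]
exact_pdf = (2 * n / k) * stats.chi2.pdf(2 * n * grid / k, df=2 * n)
asym_pdf = stats.norm.pdf(grid, loc=k, scale=k / np.sqrt(n))  # N(k, k^2/n)
exact_mean = np.sum(grid * exact_pdf) * dx  # numerical mean, close to k
```

The exact density is visibly right-skewed for small n, which is exactly the feature the normal approximation misses.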
  • 11. Asymptotic Method: Pros and Cons
    Pros: easier computation if the sampling distribution F is known; the expected information I(θ) may be replaced by the observed information I(ˆθ).
    Cons: works well only for n sufficiently large (typically at least n > 50); neglects the skewness of the distribution of ˆθ; requires knowing F or the family of distributions it belongs to; can be applied only if ˆθ is asymptotically normal (typically the MLE). 11/20
  • 12. Jackknife Method
    Given any estimator ˆθ, the jackknife is based on the n leave-one-out estimators ˆθ(i) = T(X1, ..., Xi−1, Xi+1, ..., Xn), with mean ˆθ(·) = (1/n) Σi ˆθ(i). We also consider the n pseudo-values ˜θi = nˆθ − (n − 1)ˆθ(i).
    A first use of the jackknife is bias correction. Indeed, biasjack = (n − 1)(ˆθ(·) − ˆθ) estimates Bias(ˆθ) with error O(1/n2). A bias-corrected estimator is then given by the mean of the pseudo-values: ˆθjack = nˆθ − (n − 1)ˆθ(·) = (1/n) Σi ˜θi = ˜θ.
    Similarly, the jackknife is used to estimate other properties of ˆθ. E.g. the jackknife variance estimator for Var(ˆθ), varjack = ˜s2/n = (1/n) · (1/(n−1)) Σi (˜θi − ˜θ)2 = ((n−1)/n) Σi (ˆθ(i) − ˆθ(·))2, is a good estimator provided that ˆθ = T(X1, ..., Xn) is smooth.
    We may use varjack and ˆθjack to compute an approximate jackknife confidence interval from the asymptotic approximation (˜θ − θ)/√(˜s2/n) ≈ t(n−1), but in practice this approximation is often too crude. 12/20
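The leave-one-out recipe above is easy to code (a sketch assuming NumPy; `jackknife` is our helper name). Applied to the biased MLE of the variance, the jackknife bias correction recovers exactly the Bessel-corrected estimator s2:

```python
import numpy as np

def jackknife(x, stat):
    """Jackknife bias and variance estimates for stat, via leave-one-out."""
    n = len(x)
    theta_hat = stat(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])  # theta_(i)
    theta_dot = loo.mean()                                     # theta_(.)
    bias = (n - 1) * (theta_dot - theta_hat)
    var = (n - 1) / n * np.sum((loo - theta_dot) ** 2)
    return bias, var

rng = np.random.default_rng(2)
x = rng.normal(0.0, 2.0, size=30)
bias, var = jackknife(x, np.var)  # np.var is the biased MLE (ddof=0)
corrected = np.var(x) - bias      # theta_jack = n*theta_hat - (n-1)*theta_dot
```

For this quadratic statistic the correction happens to be exact: `corrected` coincides with `np.var(x, ddof=1)`.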
  • 13. Jackknife Method: Limitations
    If ˆθ is non-smooth, the jackknife variance estimator may be inconsistent. For instance, if ˆθ is the sample median, it can be proved that varjack/Var(ˆθ) →d (χ2(2)/2)2.
    To fix this we introduce an extension of the jackknife. This time we consider the C(n, d) leave-d-out estimators obtained by computing the statistic T on every possible subset of X1, ..., Xn obtained by removing d elements. For the sample median, choosing √n < d < n − 1 yields a consistent variance estimator varjack = (n − d)/(d · C(n, d)) · Σi (ˆθ(i) − ˆθ(·))2. 13/20
  • 14. Jackknife Method: Example
    Consider (X, Y) ∼ F for some F and the Pearson correlation coefficient ρ. It can be shown that the estimator ˆρ = Σi (xi − x̄)(yi − ȳ) / (√(Σi (xi − x̄)2) · √(Σi (yi − ȳ)2)) is biased.
    F ⟹ (x1, y1), (x2, y2), (x3, y3), ..., (x10, y10) → ˆρ
    Leaving out one pair at a time: (x2, y2), (x3, y3), ..., (x10, y10) → ˆρ(1); (x1, y1), (x3, y3), ..., (x10, y10) → ˆρ(2); ...; (x1, y1), (x2, y2), ..., (x9, y9) → ˆρ(10).
    The jackknife estimator ˆρjack = 10ˆρ − 9ˆρ(·) has its bias corrected up to O(1/n2). 14/20
  • 15. Jackknife Method: Pros and Cons
    Pros: can be used for non-parametric statistics; fast computation; bias correction up to O(1/n2); leave-d-out provides a consistent variance estimator.
    Cons: leave-one-out may be inconsistent; leave-d-out is more expensive; confidence intervals are based on crude approximations (the bootstrap is better). 15/20
  • 16. Bootstrap Method
    The bootstrap consists in B resamplings with replacement from x1, ..., xn. This is equivalent to sampling from the empirical CDF ˆF.
    For each of the B resamples (x(b)1, ..., x(b)n) we compute the estimator ˆθ*(b). We use the values ˆθ*(b) to estimate the distribution of ˆθ* = ˆθ*(ˆF), which in turn is an approximation of the distribution of interest ˆθ = ˆθ(F).
    To compute point estimates for the properties of ˆθ we use the pair (ˆθ*, ˆθ) to approximate the pair (ˆθ, θ). The bootstrap bias estimator for Bias(ˆθ) = E[ˆθ] − θ is biasboot = E[ˆθ*] − ˆθ = (1/B) Σb ˆθ*b − ˆθ, so that the bias-corrected bootstrap estimator reads ˆθboot = ˆθ − biasboot = 2ˆθ − (1/B) Σb ˆθ*b. Similarly, the bootstrap variance estimator for Var(ˆθ) = E[(ˆθ − E[ˆθ])2] is varboot = (1/(B−1)) Σb (ˆθ*b − (1/B) Σb' ˆθ*b')2.
    Notice that the bootstrap is more general than the jackknife, since it estimates the whole distribution of ˆθ and not only its bias and variance. In fact, one can prove that the jackknife is a first-order approximation of the bootstrap. 16/20
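The resampling scheme above, sketched in Python (assuming NumPy; `bootstrap` is our helper name), here applied to the sample median, a statistic for which exact methods are impractical:

```python
import numpy as np

def bootstrap(x, stat, B=2000, seed=0):
    """Bootstrap bias and variance estimates for stat, via B resamples of x."""
    rng = np.random.default_rng(seed)
    n = len(x)
    # Resampling with replacement == sampling from the empirical CDF F_hat.
    reps = np.array([stat(x[rng.integers(0, n, size=n)]) for _ in range(B)])
    theta_hat = stat(x)
    bias = reps.mean() - theta_hat  # bias_boot = E[theta*] - theta_hat
    var = reps.var(ddof=1)          # var_boot
    return bias, var, reps

rng = np.random.default_rng(3)
x = rng.exponential(2.0, size=40)
bias, var, reps = bootstrap(x, np.median)
```

The array `reps` approximates the whole sampling distribution of the estimator, which is what the interval constructions on the next slides build on.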
  • 17. Bootstrap Method: Example
    Real World:      F  ⟹  x1, ..., xn   →  ˆθ
    Bootstrap World: ˆF  ⟹  x*1, ..., x*n  →  ˆθ*
    17/20
  • 18. Bootstrap Method: Confidence Intervals
    Different techniques are available to compute bootstrap interval estimates. Here p[γ] denotes the γ-quantile of distribution p, with z[γ] for the standard normal.
    The pivotal interval comes from P(l < ˆθ − ˆθ* < u) ≈ P(l < θ − ˆθ < u), giving CI = (2ˆθ − ˆθ*[(1+α)/2], 2ˆθ − ˆθ*[(1−α)/2]).
    The studentized interval takes an approach similar to the jackknife's: CI = (ˆθjack − t(n−1)[(1+α)/2] √varjack, ˆθjack + t(n−1)[(1+α)/2] √varjack).
    The BCa interval (bias-corrected and accelerated) is CI = (ˆθ*[g((1−α)/2)], ˆθ*[g((1+α)/2)]), with g(γ) = Φ(z0 + (z0 + z[γ])/(1 − a(z0 + z[γ]))), where z0 = Φ−1(#{ˆθ*b < ˆθ}/B) is the bias correction and the acceleration a = (1/6) Σi (ˆθ(·) − ˆθ(i))3 / [Σi (ˆθ(·) − ˆθ(i))2]3/2 is approximated using the jackknife. The BCa interval has an excellent O(1/n) coverage error, so it is preferred to the other bootstrap intervals. 18/20
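As a sketch of the pivotal interval (assuming NumPy; `pivotal_ci` is our helper name): the bootstrap quantiles of ˆθ* stand in for the unknown distribution of ˆθ.

```python
import numpy as np

def pivotal_ci(x, stat, alpha=0.95, B=2000, seed=0):
    """Pivotal bootstrap CI: (2*theta_hat - q_hi, 2*theta_hat - q_lo)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_hat = stat(x)
    reps = np.array([stat(x[rng.integers(0, n, size=n)]) for _ in range(B)])
    q_lo, q_hi = np.quantile(reps, [(1 - alpha) / 2, (1 + alpha) / 2])
    return 2 * theta_hat - q_hi, 2 * theta_hat - q_lo

rng = np.random.default_rng(4)
x = rng.normal(5.0, 1.0, size=60)
lo, hi = pivotal_ci(x, np.mean)
```

For the BCa interval, recent SciPy versions provide `scipy.stats.bootstrap` (BCa is its default method), so in practice one rarely codes the z0 and a corrections by hand.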
  • 19. Bootstrap Method: Pros and Cons
    Pros: can be used for non-parametric statistics; more powerful than the jackknife, as it approximates the whole distribution of ˆθ; more accurate than the jackknife for computing the variance of ˆθ; the BCa interval has O(1/n) coverage error.
    Cons: more expensive than the jackknife (B must be large enough); if n is very small the bootstrap may fail; if the family of F is known, exact methods give much better results. 19/20
  • 20. Conclusions
    We want to estimate a parameter θ using samples X1, ..., Xn ∼ Fθ. Confidence intervals are needed to express the uncertainty of the estimator ˆθ.
    If the distribution F is known, preferably use exact or asymptotic methods. If F is unknown or the distribution of ˆθ is complex, use the jackknife or the bootstrap.
    Use the jackknife to estimate properties of ˆθ; it is not so good for confidence intervals. Use the bootstrap to estimate the distribution of ˆθ; it is good for confidence intervals (BCa). 20/20