Chapter9

1
Chapter 9
Estimation
Using a Single Sample

2
Point Estimation
A point estimate of a population
characteristic is a single number that is
based on sample data and represents a
plausible value of the characteristic.

3
Example
A sample of 200 students at a large
university is selected to estimate the
proportion of students who wear contact lenses.
In this sample, 47 wore contact lenses.
Let π = the true proportion of all students at
this university who wear contact lenses.
Consider a “success” to be a student who
wears contact lenses.
The statistic
$p = \frac{\text{number of successes in the sample}}{n}$
is a reasonable choice for a formula to obtain a point
estimate for π.
Such a point estimate is
$p = \frac{47}{200} = 0.235$

4
Example
A sample of weights of 34 male freshman
students was obtained.
185 161 174 175 202 178
202 139 177 170 151 176
197 214 283 184 189 168
188 170 207 180 167 177
166 231 176 184 179 155
148 180 194 176
To estimate the true mean weight of all
male freshman students, you might use the
sample mean as a point estimate for the true
mean.
sample mean = $\bar{x}$ = 182.44

5
Example
After looking at a histogram and boxplot of the
data (below), you might notice that the data
seem reasonably symmetric with an outlier, so
you might use either the sample median or a
sample trimmed mean as a point estimate.
[Figure: histogram and boxplot of the weights, horizontal axis from 140 to 260]
5% trimmed mean = 180.07 (calculated using Minitab)
sample median = $\frac{177 + 178}{2} = 177.5$
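The three point estimates for the 34 weights can be checked with a short Python sketch. The helper name `trimmed_mean` is hypothetical, and the trimming rule (drop the nearest whole number of observations to 5% of n from each end) is an assumption chosen because it reproduces the Minitab figure quoted on the slide; other software may trim differently.

```python
from statistics import mean, median

# Weights of the 34 male freshman students from the example
weights = [
    185, 161, 174, 175, 202, 178,
    202, 139, 177, 170, 151, 176,
    197, 214, 283, 184, 189, 168,
    188, 170, 207, 180, 167, 177,
    166, 231, 176, 184, 179, 155,
    148, 180, 194, 176,
]

def trimmed_mean(values, proportion):
    """Mean after dropping round(proportion * n) values from each end.

    Assumption: the number trimmed per side is rounded to the nearest
    integer (5% of 34 is 1.7, so 2 values are dropped from each end).
    """
    ordered = sorted(values)
    k = round(proportion * len(ordered))
    return mean(ordered[k:len(ordered) - k])

sample_mean = mean(weights)
sample_median = median(weights)
sample_trimmed = trimmed_mean(weights, 0.05)

print(f"sample mean     = {sample_mean:.2f}")
print(f"sample median   = {sample_median:.1f}")
print(f"5% trimmed mean = {sample_trimmed:.2f}")
```

The median and trimmed mean sit below the mean because the single large value (283) pulls the mean upward.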

6
Bias
A statistic with mean value equal to the
value of the population characteristic being
estimated is said to be an unbiased
statistic. A statistic that is not unbiased is
said to be biased.
[Figure: the original distribution with its true value marked, the sampling distribution of an unbiased statistic, and the sampling distribution of a biased statistic]

7
Criteria
Given a choice between several unbiased
statistics that could be used for estimating a
population characteristic, the best statistic to
use is the one with the smallest standard
deviation.
[Figure: unbiased sampling distributions around the true value; the one with the smallest standard deviation is the best choice]

8
Large-sample Confidence Interval
for a Population Proportion
A confidence interval for a population
characteristic is an interval of plausible
values for the characteristic. It is
constructed so that, with a chosen degree
of confidence, the value of the
characteristic will be captured inside the
interval.

9
Confidence Level
The confidence level associated with a
confidence interval estimate is the success
rate of the method used to construct the
interval.

10
Recall
For the sampling distribution of p,
$\mu_p = \pi$ and $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$.
When n is large*, the sampling distribution of p is
approximately normal with mean π and
standard deviation $\sqrt{\frac{\pi(1-\pi)}{n}}$.
* n is large when nπ ≥ 10 and n(1-π) ≥ 10

11
Some considerations
Approximately 95% of all large samples will
result in a value of p that is within
$1.96\,\sigma_p = 1.96\sqrt{\frac{\pi(1-\pi)}{n}}$
of the true population proportion π.

12
Some considerations
Equivalently, this means that for 95% of
all possible samples, π will be in the
interval
$p - 1.96\sqrt{\frac{\pi(1-\pi)}{n}}$ to $p + 1.96\sqrt{\frac{\pi(1-\pi)}{n}}$
Since π is unknown and n is large, we estimate
$\sqrt{\frac{\pi(1-\pi)}{n}}$ with $\sqrt{\frac{p(1-p)}{n}}$
This interval can be used as long as
np ≥ 10 and n(1-p) ≥ 10

13
The 95% Confidence Interval
When n is large, a 95% confidence
interval for π is
$\left( p - 1.96\sqrt{\frac{p(1-p)}{n}},\; p + 1.96\sqrt{\frac{p(1-p)}{n}} \right)$
The endpoints of the interval are often
abbreviated by
$p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$
where - gives the lower endpoint and + the
upper endpoint.
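The "95%" describes how often the method succeeds over many samples, which a small simulation can illustrate. This is only a sketch: the true proportion 0.3, the sample size 200, the number of trials, and the function names are all assumptions made for illustration, not values from the slides.

```python
import random
from math import sqrt

def proportion_ci_95(successes, n):
    """Large-sample 95% interval: p +/- 1.96 * sqrt(p(1-p)/n)."""
    p = successes / n
    margin = 1.96 * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

def coverage(true_pi, n, trials, seed=0):
    """Fraction of simulated samples whose interval captures true_pi."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw n independent success/failure outcomes
        successes = sum(rng.random() < true_pi for _ in range(n))
        low, high = proportion_ci_95(successes, n)
        if low <= true_pi <= high:
            hits += 1
    return hits / trials

# Hypothetical setting: true proportion 0.3, samples of size 200
rate = coverage(true_pi=0.3, n=200, trials=2000)
print(f"fraction of intervals capturing the true proportion: {rate:.3f}")
```

The printed fraction should land close to 0.95, though not exactly on it, since the normal approximation and the finite number of trials both introduce some error.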

14
Example
For a project, a student randomly
sampled 182 other students at a large
university to determine if the majority of
students were in favor of a proposal to
build a field house. He found that 75 were
in favor of the proposal.
Let π = the true proportion of students
that favor the proposal.

15
Example - continued
$p = \frac{75}{182} = 0.4121$
Since np = 182(0.4121) = 75 ≥ 10 and
n(1-p) = 182(0.5879) = 107 ≥ 10, we can use
the formulas given earlier to
find a 95% confidence interval for π.
$p \pm 1.96\sqrt{\frac{p(1-p)}{n}} = 0.4121 \pm 1.96\sqrt{\frac{0.4121(0.5879)}{182}} = 0.4121 \pm 0.07151$
The 95% confidence interval for π is
(0.341, 0.484).
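The field house calculation can be reproduced with a few lines of Python. This is a minimal sketch; the function name `proportion_ci` and its error handling are illustrative choices, not part of the original slides.

```python
from math import sqrt

def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion.

    z is the z critical value (1.96 corresponds to 95% confidence).
    """
    failures = n - successes
    # Large-sample check: np >= 10 and n(1-p) >= 10
    if successes < 10 or failures < 10:
        raise ValueError("sample too small for the large-sample interval")
    p = successes / n
    standard_error = sqrt(p * (1 - p) / n)
    margin = z * standard_error
    return p - margin, p + margin

# 75 of 182 sampled students favored the proposal
low, high = proportion_ci(75, 182)
print(f"95% CI for pi: ({low:.3f}, {high:.3f})")
```

Because the whole interval lies below 0.5, these data do not support the claim that a majority of students favor the proposal.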

16
The General Confidence Interval
The general formula for a confidence
interval for a population proportion π
when
1. p is the sample proportion from a
random sample, and
2. The sample size n is large
(np ≥ 10 and n(1-p) ≥ 10)
is given by
$p \pm (\text{z critical value})\sqrt{\frac{p(1-p)}{n}}$

17
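Finding a z Critical Value
Finding a z critical value for a 98%
confidence interval:
A central area of 0.98 leaves 0.01 in each tail, so the
cumulative area to the left of the critical value is 0.9900.
Looking up the cumulative area of 0.9900 in the
body of the table, we find z = 2.33.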
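The table lookup can also be done in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z critical value capturing a central area equal to `confidence`."""
    tail = (1 - confidence) / 2           # area left in each tail
    return NormalDist().inv_cdf(1 - tail) # cumulative area to the left

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

For 98% confidence the cumulative area passed to `inv_cdf` is 0.99, the same value looked up in the body of the table.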

18
Some Common Critical Values
Confidence level    z critical value
80%                 1.28
90%                 1.645
95%                 1.96
98%                 2.33
99%                 2.58
99.8%               3.09
99.9%               3.29

19
Terminology
The standard error of a statistic is the
estimated standard deviation of the statistic.
For sample proportions, the standard deviation is
$\sqrt{\frac{\pi(1-\pi)}{n}}$
This means that the standard error of the sample
proportion is
$\sqrt{\frac{p(1-p)}{n}}$

20
Terminology
The bound on error of estimation, B, associated
with a confidence interval is
(z critical value)·(standard error of the statistic).
For a 95% confidence interval, this is
(1.96)·(standard error of the statistic).

21
Sample Size
The sample size required to estimate a
population proportion π to within an amount
B with 95% confidence is
$n = \pi(1-\pi)\left(\frac{1.96}{B}\right)^2$
The value of π may be estimated from prior
information. If no prior information is available,
use π = 0.5 in the formula to obtain a
conservatively large value for n.
Generally one rounds the result up to the nearest integer.
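The sample size formula is easy to wrap in a small function. This is a sketch only; the name `sample_size_for_proportion` and its parameter names are hypothetical, and the default of 0.5 for the guess at π follows the conservative rule stated above.

```python
from math import ceil

def sample_size_for_proportion(bound, pi_guess=0.5, z=1.96):
    """Smallest n that estimates a proportion to within `bound`.

    pi_guess: prior estimate of pi (0.5 gives the conservative answer)
    z:        z critical value for the desired confidence level
    """
    n = pi_guess * (1 - pi_guess) * (z / bound) ** 2
    return ceil(n)  # always round up to the next whole observation

# Within 0.03 at 95% confidence, prior estimate 0.15
print(sample_size_for_proportion(0.03, pi_guess=0.15))
# Same bound with no prior estimate
print(sample_size_for_proportion(0.03))
# Within 0.02 at 99% confidence (z = 2.58), prior estimate 0.412
print(sample_size_for_proportion(0.02, pi_guess=0.412, z=2.58))
```

Rounding up rather than to the nearest integer guarantees the bound B is not exceeded.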
 

22
Sample Size Calculation
Example
A TV executive would like to find a 95%
confidence interval estimate within 0.03
for the proportion of all households that
watch NYPD Blue regularly. How large a
sample is needed if a prior estimate for π
was 0.15?
We have B = 0.03 and the prior estimate of π = 0.15
$n = \pi(1-\pi)\left(\frac{1.96}{B}\right)^2 = (0.15)(0.85)\left(\frac{1.96}{0.03}\right)^2 = 544.2$
A sample of 545 or more would be needed.

23
Sample Size Calculation Example revisited
Suppose a TV executive would like to find a
95% confidence interval estimate within 0.03
for the proportion of all households that
watch NYPD Blue regularly. How large a
sample is needed if we have no reasonable
prior estimate for π?
We have B = 0.03 and should use π = 0.5 in
the formula.
$n = \pi(1-\pi)\left(\frac{1.96}{B}\right)^2 = (0.5)(0.5)\left(\frac{1.96}{0.03}\right)^2 = 1067.1$
The required sample size is now 1068.
Notice that a reasonable ballpark estimate for π
can lower the needed sample size.

24
Another Example
A college professor wants to estimate the
proportion of students at a large university
who favor building a field house with a 99%
confidence interval accurate to within 0.02. If one
of his students performed a preliminary
study and estimated π to be 0.412, how
large a sample should he take?
We have B = 0.02, a prior estimate π = 0.412, and we
should use the z critical value 2.58 (for a 99%
confidence interval)
$n = \pi(1-\pi)\left(\frac{2.58}{B}\right)^2 = (0.412)(0.588)\left(\frac{2.58}{0.02}\right)^2 = 4031.4$
The required sample size is 4032.

25
One-Sample z Confidence
Interval for µ
If
1. $\bar{x}$ is the sample mean from a random
sample,
2. The sample size n is large (generally
n ≥ 30), and
3. σ, the population standard deviation, is
known
then the general formula for a
confidence interval for a population mean µ
is given by
$\bar{x} \pm (\text{z critical value})\frac{\sigma}{\sqrt{n}}$

26
One-Sample z Confidence
Interval for µ
Notice that this formula works when σ is known and
either
1. n is large (generally n ≥ 30) or
2. The population distribution is normal (any
sample size).
If n is small (generally n < 30) but it is
reasonable to believe that the distribution of
values in the population is normal, a
confidence interval for µ (when σ is known)
is
$\bar{x} \pm (\text{z critical value})\frac{\sigma}{\sqrt{n}}$

27
Example
A certain filling machine has a true
population standard deviation σ = 0.228
ounces when used to fill catsup bottles. A
random sample of 36 “6 ounce” bottles of
catsup was selected from the output of
this machine and the sample mean was
6.018 ounces.
Find a 90% confidence interval estimate for the
true mean fill of catsup from this machine.

28
Example (continued)
$\bar{x} = 6.018,\ \sigma = 0.228,\ n = 36$
The z critical value is 1.645
$\bar{x} \pm (\text{z critical value})\frac{\sigma}{\sqrt{n}} = 6.018 \pm 1.645\,\frac{0.228}{\sqrt{36}} = 6.018 \pm 0.063$
90% Confidence Interval
(5.955, 6.081)
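The catsup interval can be checked numerically. A minimal sketch, assuming σ is known as stated in the example; the function name `z_interval_for_mean` is hypothetical.

```python
from math import sqrt

def z_interval_for_mean(xbar, sigma, n, z):
    """Confidence interval for a mean when sigma is known.

    Computes xbar +/- z * sigma / sqrt(n).
    """
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# 36 bottles, sample mean 6.018 oz, sigma = 0.228 oz, 90% confidence
low, high = z_interval_for_mean(xbar=6.018, sigma=0.228, n=36, z=1.645)
print(f"90% CI for the mean fill: ({low:.3f}, {high:.3f})")
```

The interval contains 6, so the data are consistent with the machine filling to the labeled 6 ounces on average.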

29
Unknown σ - Small Size Samples
[All Size Samples]
W. S. Gosset, an English statistician who worked for the
Guinness brewery in Dublin, developed the techniques and
derived the Student’s t distributions that describe the
behavior of
$\frac{\bar{x} - \mu_0}{s/\sqrt{n}}$

30
t Distributions
If X is a normally distributed random variable, the
statistic
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
follows a t distribution with df = n-1 (degrees of
freedom).

31
t Distributions
The statistic
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
is fairly robust: the results are reasonable for moderate
sample sizes (15 and up) if x is just reasonably
centrally weighted. It is also quite reasonable
for large sample sizes for distributional
patterns (of x) that are not extremely skewed.

32
t Distributions
[Figure: Comparison of normal and t distributions, showing t curves for df = 2, 5, 10, and 25 and the normal curve on a horizontal axis from -4 to 4]

33
t Distributions
Notice: As df increases, t distributions
approach the standard normal
distribution.
Since each t distribution would require a
table similar to the standard normal table,
we usually only create a table of critical
values for the t distributions.

34
Central area captured:  0.80    0.90    0.95    0.98    0.99    0.998   0.999
Confidence level:       80%     90%     95%     98%     99%     99.8%   99.9%
Degrees of freedom
1                       3.08    6.31    12.71   31.82   63.66   318.29  636.58
2                       1.89    2.92    4.30    6.96    9.92    22.33   31.60
3                       1.64    2.35    3.18    4.54    5.84    10.21   12.92
4                       1.53    2.13    2.78    3.75    4.60    7.17    8.61
5                       1.48    2.02    2.57    3.36    4.03    5.89    6.87
6                       1.44    1.94    2.45    3.14    3.71    5.21    5.96
7                       1.41    1.89    2.36    3.00    3.50    4.79    5.41
8                       1.40    1.86    2.31    2.90    3.36    4.50    5.04
9                       1.38    1.83    2.26    2.82    3.25    4.30    4.78
10                      1.37    1.81    2.23    2.76    3.17    4.14    4.59
11                      1.36    1.80    2.20    2.72    3.11    4.02    4.44
12                      1.36    1.78    2.18    2.68    3.05    3.93    4.32
13                      1.35    1.77    2.16    2.65    3.01    3.85    4.22
14                      1.35    1.76    2.14    2.62    2.98    3.79    4.14
15                      1.34    1.75    2.13    2.60    2.95    3.73    4.07
16                      1.34    1.75    2.12    2.58    2.92    3.69    4.01
17                      1.33    1.74    2.11    2.57    2.90    3.65    3.97
18                      1.33    1.73    2.10    2.55    2.88    3.61    3.92
19                      1.33    1.73    2.09    2.54    2.86    3.58    3.88
20                      1.33    1.72    2.09    2.53    2.85    3.55    3.85
21                      1.32    1.72    2.08    2.52    2.83    3.53    3.82
22                      1.32    1.72    2.07    2.51    2.82    3.50    3.79
23                      1.32    1.71    2.07    2.50    2.81    3.48    3.77
24                      1.32    1.71    2.06    2.49    2.80    3.47    3.75
25                      1.32    1.71    2.06    2.49    2.79    3.45    3.73
26                      1.31    1.71    2.06    2.48    2.78    3.43    3.71
27                      1.31    1.70    2.05    2.47    2.77    3.42    3.69
28                      1.31    1.70    2.05    2.47    2.76    3.41    3.67
29                      1.31    1.70    2.05    2.46    2.76    3.40    3.66
30                      1.31    1.70    2.04    2.46    2.75    3.39    3.65
40                      1.30    1.68    2.02    2.42    2.70    3.31    3.55
60                      1.30    1.67    2.00    2.39    2.66    3.23    3.46
120                     1.29    1.66    1.98    2.36    2.62    3.16    3.37
z critical values       1.28    1.645   1.96    2.33    2.58    3.09    3.29

35
One-Sample t Procedures
Suppose that an SRS of size n is drawn from a
population having unknown mean µ. The general
confidence limits are
$\bar{x} \pm (\text{t critical value})\frac{s}{\sqrt{n}}$
and the general confidence interval for µ is
$\left( \bar{x} - (\text{t critical value})\frac{s}{\sqrt{n}},\; \bar{x} + (\text{t critical value})\frac{s}{\sqrt{n}} \right)$

36
Confidence Interval Example
Ten randomly selected shut-ins were each
asked to list how many hours of television
they watched per week. The results are
82 66 90 84 75
88 80 94 110 91
Find a 90% confidence interval estimate for
the true mean number of hours of
television watched per week by shut-ins.

37
Calculating the sample mean and standard
deviation, we have n = 10, $\bar{x}$ = 86, s = 11.842.
We find the t critical value of 1.833 by looking in the
t table in the row corresponding to df = 9, in the
column for a 90% confidence level. The
confidence interval for µ is
$\bar{x} \pm t^{*}\frac{s}{\sqrt{n}} = 86 \pm (1.833)\frac{11.842}{\sqrt{10}} = 86 \pm 6.86$
(79.14, 92.86)
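The television example can be verified from the raw data. This sketch takes the t critical value 1.833 (df = 9, 90% confidence) from the table as an input rather than computing it, since the Python standard library has no t distribution; the function name `t_interval_for_mean` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Weekly hours of television reported by the ten shut-ins
hours = [82, 66, 90, 84, 75, 88, 80, 94, 110, 91]

def t_interval_for_mean(data, t_critical):
    """One-sample t interval: xbar +/- t* * s / sqrt(n).

    t_critical must be looked up for df = len(data) - 1.
    """
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_critical * s / sqrt(n)
    return xbar - margin, xbar + margin

low, high = t_interval_for_mean(hours, t_critical=1.833)
print(f"n = {len(hours)}, mean = {mean(hours)}, s = {stdev(hours):.3f}")
print(f"90% CI for the mean: ({low:.2f}, {high:.2f})")
```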

38
To calculate the confidence interval, we had
to assume that weekly viewing times are
normally distributed. Consider the normal
probability plot of the 10 data points,
produced with Minitab, that is
given on the next slide.

39
Notice that the normal plot looks reasonably
linear so it is reasonable to assume that the
number of hours of television watched per week
by shut-ins is normally distributed.
[Figure: Normal Probability Plot (horizontal axis: Hours, 70 to 110; vertical axis: Probability, .001 to .999)]
Average: 86
StDev: 11.8415
N: 10
Anderson-Darling Normality Test
A-Squared: 0.226
P-Value: 0.753
Typically, if the
p-value is more than
0.05, we assume that the
distribution is normal.
