2. 2
Basic Terms
Any quantity computed from values in a sample is called a statistic.
The observed value of a statistic depends on the particular sample selected
from the population; typically, it varies from sample to sample. This
variability is called sampling variability.
4. 4
Example
Consider a population that consists of the
numbers 1, 2, 3, 4 and 5, generated in such a
way that the probability of each of those
values is 0.2 no matter what the previous
selections were. This population could be
described as the outcome of a spinner with
five equally likely sectors. Its probability
distribution is:
x p(x)
1 0.2
2 0.2
3 0.2
4 0.2
5 0.2
5. 5
Example
If the sampling distribution for the means of
samples of size two is analyzed, it looks like
Sample   Mean     Sample   Mean
1, 1     1        3, 4     3.5
1, 2     1.5      3, 5     4
1, 3     2        4, 1     2.5
1, 4     2.5      4, 2     3
1, 5     3        4, 3     3.5
2, 1     1.5      4, 4     4
2, 2     2        4, 5     4.5
2, 3     2.5      5, 1     3
2, 4     3        5, 2     3.5
2, 5     3.5      5, 3     4
3, 1     2        5, 4     4.5
3, 2     2.5      5, 5     5
3, 3     3

Mean     Frequency   Probability
1        1           0.04
1.5      2           0.08
2        3           0.12
2.5      4           0.16
3        5           0.20
3.5      4           0.16
4        3           0.12
4.5      2           0.08
5        1           0.04
Total    25          1.00
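The enumeration above can be reproduced mechanically. The following is a minimal sketch, assuming independent draws with every value equally likely; the function name sampling_distribution is illustrative and not part of the original slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sampling_distribution(values, n):
    """Exact distribution of the sample mean for n independent,
    equally likely draws from `values` (hypothetical helper)."""
    # Enumerate every ordered sample of size n and tally its mean.
    counts = Counter(
        Fraction(sum(sample), n) for sample in product(values, repeat=n)
    )
    total = len(values) ** n  # number of equally likely ordered samples
    return {mean: Fraction(count, total) for mean, count in sorted(counts.items())}


dist = sampling_distribution([1, 2, 3, 4, 5], 2)
for mean, prob in dist.items():
    print(float(mean), float(prob))
```

Passing n = 3 or n = 4 gives the exact sampling distributions for larger samples in the same way.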
6. 6
Example
The original distribution and the sampling
distribution of means of samples with n=2
are given below.
[Figure: probability histograms of the original distribution and of the sampling distribution of the mean for n = 2, each on a horizontal axis from 1 to 5.]
7. 7
Example
Sampling distributions for n=3 and n=4 were
calculated and are illustrated below.
[Figure: probability histograms of the sampling distribution of the mean for n = 3 and for n = 4, each on a horizontal axis from 1 to 5.]
8. 8
Simulations
[Figure: probability histograms of the simulated sample means for n = 30, n = 60 and n = 120, each on a horizontal axis from 2 to 4.]
To illustrate the general
behavior of samples of
fixed size n, 10000
samples each of size 30,
60 and 120 were
generated from this
uniform distribution and
the means calculated.
Probability histograms
were created for each of
these (simulated)
sampling distributions.
Notice all three of these
look to be essentially
normally distributed.
Further, note that the
variability decreases as
the sample size increases.
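A simulation along these lines can be sketched as follows. This is an illustration only: the seed, the helper name simulate_means, and the use of the population standard deviation of the simulated means as the measure of spread are assumptions, not details taken from the slides.

```python
import random
import statistics


def simulate_means(n, num_samples=10_000, seed=0):
    """Means of `num_samples` random samples of size n drawn from the
    uniform population {1, 2, 3, 4, 5} (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    values = [1, 2, 3, 4, 5]
    return [statistics.fmean(rng.choices(values, k=n)) for _ in range(num_samples)]


# Spread of the simulated sample means for each sample size.
spread = {n: statistics.pstdev(simulate_means(n)) for n in (30, 60, 120)}
for n, sd in spread.items():
    print(n, round(sd, 4))
```

The printed spreads should shrink as n grows, which matches the narrowing histograms described above.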
9. 9
Simulations
Skewed distribution
To further illustrate the general behavior of
samples of fixed size n, 10000 samples each of
size 4, 16 and 32 were generated from the
positively skewed distribution pictured below.
Notice that these sampling distributions are all skewed,
but as n increased the sampling distributions became
more symmetric and eventually appeared to be almost
normally distributed.
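The same behavior can be checked numerically. The slides do not say which skewed population was used, so the sketch below assumes an exponential distribution with rate 1 purely as a stand-in; the helper names and the seed are also illustrative.

```python
import random
import statistics


def skewness(data):
    """Sample skewness: the mean of the cubed standardized values."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return statistics.fmean([((x - m) / s) ** 3 for x in data])


def simulate_skewed_means(n, num_samples=10_000, seed=1):
    """Means of samples of size n from an exponential(1) population
    (an assumed stand-in for the positively skewed distribution)."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


skew_by_n = {n: skewness(simulate_skewed_means(n)) for n in (4, 16, 32)}
for n, skew in skew_by_n.items():
    print(n, round(skew, 3))
```

A skewness near 0 indicates a symmetric distribution, so the values should fall toward 0 as n increases while remaining positive.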
10. 10
Terminology
Let $\bar{x}$ denote the mean of the observations
in a random sample of size n from a
population having mean µ and standard
deviation σ. Denote the mean value of the
$\bar{x}$ distribution by $\mu_{\bar{x}}$ and the standard deviation
of the $\bar{x}$ distribution by $\sigma_{\bar{x}}$ (called the standard
error of the mean). Then the rules on the
next two slides hold.
11. 11
Properties of the Sampling
Distribution of the Sample Mean.
Rule 1: $\mu_{\bar{x}} = \mu$
Rule 2: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
This rule is approximately correct as
long as no more than 5% of the
population is included in the sample.
Rule 3: When the population distribution is
normal, the sampling distribution of
$\bar{x}$ is also normal for any sample size n.
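Rules 1 and 2 can be verified exactly for the spinner population 1, 2, 3, 4, 5 used earlier. The sketch below assumes independent, equally likely draws (so Rule 2 holds exactly) and checks it in its squared form, variance of the sample mean = σ²/n; the helper name mean_and_variance is illustrative.

```python
from fractions import Fraction
from itertools import product

values = [1, 2, 3, 4, 5]


def mean_and_variance(outcomes):
    """Mean and variance of a list of equally likely outcomes."""
    outcomes = [Fraction(x) for x in outcomes]
    mu = sum(outcomes) / len(outcomes)
    var = sum((x - mu) ** 2 for x in outcomes) / len(outcomes)
    return mu, var


pop_mu, pop_var = mean_and_variance(values)  # population mean and variance

results = {}
for n in (2, 3, 4):
    # Every ordered sample of size n is equally likely.
    means = [Fraction(sum(sample), n) for sample in product(values, repeat=n)]
    results[n] = mean_and_variance(means)
    print(n, results[n][0], results[n][1], pop_var / n)
```

For each n, the mean of the sample means equals the population mean, and their variance equals the population variance divided by n.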
12. 12
Central Limit Theorem.
Rule 4: When n is sufficiently large, the
sampling distribution of $\bar{x}$ is
approximately normally
distributed, even when the
population distribution is not
itself normal.
15. 15
More about the Central Limit
Theorem.
As a rule of thumb, the Central Limit Theorem can safely
be applied when n exceeds 30.
If n is large or the population distribution
is normal, the standardized variable
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
has (approximately) a standard normal (z)
distribution.
16. 16
Example
A food company sells “18 ounce” boxes
of cereal. Let x denote the actual amount
of cereal in a box of cereal. Suppose that
x is normally distributed with µ = 18.03
ounces and σ = 0.05.
a) What proportion of the boxes will
contain less than 18 ounces?
$$P(x < 18) = P\left(z < \frac{18 - 18.03}{0.05}\right) = P(z < -0.60) = 0.2743$$
17. 17
Example - continued
b) A case consists of 24 boxes of cereal.
What is the probability that the mean
amount of cereal (per box in a case)
is less than 18 ounces?
Because x is normally distributed, the sampling
distribution of $\bar{x}$ is also normal (Rule 3), so
$$P(\bar{x} < 18) = P\left(z < \frac{18 - 18.03}{0.05 / \sqrt{24}}\right) = P(z < -2.94) = 0.0016$$
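Both parts of the cereal example can be checked with Python's standard library. This is a minimal sketch using statistics.NormalDist; the variable names are illustrative. Because it does not round z to two decimals, the last digit may differ slightly from table-based answers.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 18.03, 0.05, 24

# a) A single box: x is normal with mean mu and standard deviation sigma.
p_box = NormalDist(mu, sigma).cdf(18)

# b) Mean of the 24 boxes in a case: the standard error is sigma / sqrt(n).
p_case_mean = NormalDist(mu, sigma / sqrt(n)).cdf(18)

print(round(p_box, 4), round(p_case_mean, 4))
```

Averaging over 24 boxes makes a case mean below 18 ounces far less likely than a single underweight box.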
18. 18
Some proportion distributions
where π = 0.2
[Figure: sampling distributions of p for n = 10, n = 20, n = 50 and n = 100, each centered at 0.2.]
Let p be the proportion of successes in a
random sample of size n from a population
whose proportion of S’s (successes) is π.
19. 19
Properties of the Sampling
Distribution of p
Let p be the proportion of successes
in a random sample of size n from a
population whose proportion of S’s
(successes) is π.
Denote the mean of p by $\mu_p$ and the
standard deviation by $\sigma_p$. Then the
following rules hold.
20. 20
Properties of the Sampling
Distribution of p
Rule 1: $\mu_p = \pi$
Rule 2: $\sigma_p = \sqrt{\dfrac{\pi(1 - \pi)}{n}}$
Rule 3: When n is large and π is not too near
0 or 1, the sampling distribution of p is
approximately normal.
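Rules 1 and 2 can be confirmed numerically by treating the number of successes as a binomial count. The sketch below assumes independent trials with success probability π = 0.2 and the sample sizes shown on the earlier slide; the helper name proportion_mean_sd is illustrative.

```python
from math import comb, sqrt


def proportion_mean_sd(n, pi):
    """Mean and standard deviation of p = X / n, where X is the
    number of successes in n independent trials (binomial)."""
    pmf = [comb(n, k) * pi**k * (1 - pi) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * prob for k, prob in enumerate(pmf))
    var = sum((k / n - mean) ** 2 * prob for k, prob in enumerate(pmf))
    return mean, sqrt(var)


for n in (10, 20, 50, 100):
    mean, sd = proportion_mean_sd(n, 0.2)
    # Last column is the Rule 2 formula sqrt(pi * (1 - pi) / n).
    print(n, round(mean, 6), round(sd, 6), round(sqrt(0.2 * 0.8 / n), 6))
```

The computed mean stays at π for every n, and the computed standard deviation agrees with the Rule 2 formula.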
21. 21
Rule of Thumb
If both nπ ≥ 10 and n(1 − π) ≥ 10, then it is
safe to use a normal approximation.
Condition for Use
The further the value of π is from 0.5, the
larger n must be for the normal
approximation to the sampling distribution
of p to be accurate.
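The rule of thumb is easy to encode as a check. A minimal sketch follows; the function name and the threshold parameter are illustrative, and the test cases reuse values of n and π that appear elsewhere in these slides.

```python
def normal_approximation_ok(n, pi, threshold=10):
    """True when both expected counts, n*pi and n*(1 - pi),
    reach the threshold (10 in the rule of thumb)."""
    return n * pi >= threshold and n * (1 - pi) >= threshold


for n, pi in [(10, 0.2), (400, 0.08), (2000, 0.03)]:
    print(n, pi, normal_approximation_ok(n, pi))
```

With n = 10 and π = 0.2 the expected number of successes is only 2, so the check fails; the two larger samples pass.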
22. 22
Example
If the true proportion of defectives
produced by a certain manufacturing
process is 0.08 and a sample of 400 is
chosen, what is the probability that the
proportion of defectives in the sample is
greater than 0.10?
Since nπ = 400(0.08) = 32 > 10 and
n(1-π) = 400(0.92) = 368 > 10,
it’s reasonable to use the normal
approximation.
23. 23
Example (continued)
$$\mu_p = \pi = 0.08$$
$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{0.08(1 - 0.08)}{400}} = 0.013565$$
$$z = \frac{p - \mu_p}{\sigma_p} = \frac{0.10 - 0.08}{0.013565} = 1.47$$
$$P(p > 0.1) = P(z > 1.47) = 1 - 0.9292 = 0.0708$$
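The steps of this calculation can be traced in code. This is a sketch using statistics.NormalDist with illustrative variable names; since z is not rounded to 1.47 here, the resulting probability differs slightly in the third decimal place from the table-based 0.0708.

```python
from math import sqrt
from statistics import NormalDist

pi, n, cutoff = 0.08, 400, 0.10

mu_p = pi                          # Rule 1
sigma_p = sqrt(pi * (1 - pi) / n)  # Rule 2
z = (cutoff - mu_p) / sigma_p      # standardize the cutoff
prob = 1 - NormalDist().cdf(z)     # upper-tail area under the standard normal

print(round(sigma_p, 6), round(z, 2), round(prob, 4))
```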
24. 24
Example
Suppose 3% of the people contacted by phone
are receptive to a certain sales pitch and buy
your product. If your sales staff contacts 2000
people, what is the probability that more than
100 of the people contacted will purchase your
product?
Clearly π = 0.03 and p = 100/2000 = 0.05, so
$$P(p > 0.05) = P\left(z > \frac{0.05 - 0.03}{\sqrt{\dfrac{(0.03)(0.97)}{2000}}}\right) = P\left(z > \frac{0.05 - 0.03}{0.0038145}\right) = P(z > 5.24) \approx 0$$
25. 25
Example - continued
If your sales staff contacts 2000 people, what
is the probability that fewer than 50 of the
people contacted will purchase your product?
Now π = 0.03 and p = 50/2000 = 0.025, so
$$P(p < 0.025) = P\left(z < \frac{0.025 - 0.03}{\sqrt{\dfrac{(0.03)(0.97)}{2000}}}\right) = P\left(z < \frac{0.025 - 0.03}{0.0038145}\right) = P(z < -1.31) = 0.0951$$
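Both sales-call questions use the same approximating normal distribution for p, so they can be answered from one object. The sketch below is illustrative: the helper name proportion_dist is an assumption, and no continuity correction is applied, matching the calculations above.

```python
from math import sqrt
from statistics import NormalDist


def proportion_dist(n, pi):
    """Normal approximation to the sampling distribution of p
    (mean pi, standard deviation sqrt(pi * (1 - pi) / n))."""
    return NormalDist(pi, sqrt(pi * (1 - pi) / n))


dist = proportion_dist(2000, 0.03)

p_more_than_100 = 1 - dist.cdf(100 / 2000)  # P(p > 0.05)
p_fewer_than_50 = dist.cdf(50 / 2000)       # P(p < 0.025)

print(p_more_than_100, round(p_fewer_than_50, 4))
```

The first probability is effectively zero because 0.05 lies more than five standard deviations above π, while the second is close to the 0.0951 obtained from the table.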