3. 3
FORMAL STRUCTURE
Hypothesis tests are based on a reductio
ad absurdum form of argument.
Specifically, we make an assumption and
then attempt to show that the assumption leads
to an absurdity or contradiction; hence the
assumption is wrong.
4. 4
FORMAL STRUCTURE
The null hypothesis, denoted by H0, is a
statement or claim about a population
characteristic that is initially assumed to be
true.
The null hypothesis is so named because it
is the “starting point” for the investigation.
The phrase “there is no difference” is often
used in its interpretation.
5. 5
FORMAL STRUCTURE
The alternate hypothesis, denoted by Ha, is
the competing claim.
The alternate hypothesis is a statement
about the same population characteristic
that is used in the null hypothesis.
Generally, the alternate hypothesis is a
statement that specifies that the population
has a value different, in some way, from the
value given in the null hypothesis.
6. 6
FORMAL STRUCTURE
Rejection of the null hypothesis will imply
the acceptance of this alternative
hypothesis.
Assume H0 is true and attempt to show
this leads to an absurdity, hence H0 is
false and Ha is true.
7. 7
FORMAL STRUCTURE
Typically one assumes the null hypothesis to
be true and then one of the following
conclusions is drawn.
1. Reject H0
Equivalent to saying that the data give
strong support for Ha
2. Fail to reject H0
Equivalent to saying that we have
failed to show a statistically significant
deviation from the claim of the null
hypothesis
This is not the same as saying that
the null hypothesis is true.
8. 8
AN ANALOGY
The Statistical Hypothesis Testing process
can be compared very closely with a judicial
trial.
1. Assume a defendant is innocent (H0)
2. Present evidence to show guilt
3. Try to prove guilt beyond a reasonable
doubt (Ha)
10. 10
Examples of Hypotheses
You would like to
determine if the
diameters of the
ball bearings you
produce have a
mean of 6.5 cm.
H0: µ = 6.5
Ha: µ ≠ 6.5
(Two-sided
alternative)
11. 11
The students
entering into the
math program used
to have a mean
SAT quantitative
score of 525. Are
the current students
poorer (as
measured by the
SAT quantitative
score)?
H0: µ = 525
(Really: µ ≥ 525)
Ha: µ < 525
(One-sided alternative)
Examples of Hypotheses
12. 12
Do the “16 ounce” cans
of peaches canned and
sold by DelMonte meet
the claim on the label
(on the average)?
H0: µ = 16
(Really: µ ≥16)
Ha: µ < 16
Examples of Hypotheses
Notice, the real concern
would be selling the
consumer less than 16
ounces of peaches.
13. 13
Is the proportion of
defective parts
produced by a
manufacturing
process more than
5%?
H0: π = 0.05
(Really, π ≤ 0.05)
Ha: π > 0.05
Examples of Hypotheses
14. 14
Do two brands of
light bulb have
the same mean
lifetime?
H0: µBrand A = µBrand B
Ha: µBrand A ≠ µBrand B
Examples of Hypotheses
15. 15
Do parts produced
by two different
milling machines
have the same
variability in
diameters?
H0: σ1 = σ2
Ha: σ1 ≠ σ2
or equivalently
H0: σ1² = σ2²
Ha: σ1² ≠ σ2²
Examples of Hypotheses
16. 16
Comments on Hypothesis Form
The null hypothesis must contain the equal
sign.
This is absolutely necessary because
the test requires the null hypothesis to
be assumed to be true and the value
attached to the equal sign is then the
value assumed to be true and used in
subsequent calculations.
The alternate hypothesis should be what
you are really attempting to show to be true.
This is not always possible.
17. 17
Hypothesis Form
The form of the null hypothesis is
H0: population characteristic = hypothesized value
where the hypothesized value is a specific
number determined by the problem context.
The alternative (or alternate) hypothesis will have
one of the following three forms:
Ha: population characteristic > hypothesized value
Ha: population characteristic < hypothesized value
Ha: population characteristic ≠ hypothesized value
18. 18
Caution
When you set up a hypothesis test, the
result is either
1. Strong support for the alternate
hypothesis (if the null hypothesis is
rejected)
2. Insufficient evidence to
refute the claim of the null hypothesis
(you are stuck with it, because there is
a lack of strong evidence against the
null hypothesis).
20. 20
Error Analogy
Consider a medical test where the hypotheses
are equivalent to
H0: the patient has a specific disease
Ha: the patient doesn’t have the disease
Then,
Type I error is equivalent to a false
negative
(i.e., Saying the patient does not have the
disease when in fact, he does.)
Type II error is equivalent to a false
positive
(i.e., Saying the patient has the disease
when, in fact, he does not.)
21. 21
More on Error
The probability of a type I error is denoted
by α and is called the level of significance
of the test.
Thus, a test with α = 0.01 is said to
have a level of significance of 0.01 or
to be a level 0.01 test.
The probability of a type II error is denoted
by β.
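The meaning of α can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the function name `simulate_type_i_rate`, the normal population, the known σ, and the two-sided z test are all choices made for this example, not part of the slides. It repeatedly samples from a population in which H0 is true and counts how often the test rejects anyway.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_type_i_rate(alpha=0.05, n=30, trials=4000,
                         mu0=0.0, sigma=1.0, seed=1):
    """Estimate how often a two-sided z test rejects a TRUE null hypothesis.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # Sample from a population where H0 (mu = mu0) really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # type I error: H0 is true but was rejected
    return rejections / trials


rate = simulate_type_i_rate()
print(f"Estimated type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen α, which is what "level of significance" means in practice.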
22. 22
Relationships Between α and β
Generally, with everything else held constant,
decreasing one type of error causes the other
to increase.
The only way to decrease both types of error
simultaneously is to increase the sample size.
No matter what decision is reached, there is
always the risk of one of these errors.
23. 23
Comment on Process
Look at the consequences of type I and type
II errors and then identify the largest α that
is tolerable for the problem.
Employ a test procedure that uses this
maximum acceptable value of α (rather than
anything smaller) as the level of significance
(because using a smaller α increases β).
24. 24
Test Statistic
A test statistic is the function of sample
data on which a conclusion to reject or fail
to reject H0 is based.
25. 25
P-value
The P-value (also called the observed
significance level) is a measure of
inconsistency between the hypothesized
value for a population characteristic and
the observed sample.
The P-value is the probability, assuming
that H0 is true, of obtaining a test statistic
value at least as inconsistent with H0 as
what actually resulted.
26. 26
Decision Criteria
A decision as to whether H0 should be
rejected results from comparing the P-value
to the chosen α:
H0 should be rejected if P-value ≤ α.
H0 should not be rejected if P-value > α.
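The decision rule above can be written as a one-line comparison. This is a minimal sketch; the function name `decide` and the example numbers are hypothetical.

```python
def decide(p_value: float, alpha: float) -> str:
    """Apply the rule: reject H0 when P-value <= alpha, otherwise fail to reject."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Hypothetical P-value compared against two significance levels.
print(decide(0.03, 0.05))  # reject H0
print(decide(0.03, 0.01))  # fail to reject H0
```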
27. 27
Large Sample Hypothesis Test
for a Single Proportion
To test the hypothesis
H0: π = hypothesized proportion,
compute the z statistic
z = (p − hypothesized value) / √[hypothesized value(1 − hypothesized value)/n]
In terms of a standard normal random variable z, the
approximate P-value for this test depends on the
alternate hypothesis and is given for each of the
possible alternate hypotheses on the next 3 slides.
28. 28
Hypothesis Test
Large Sample Test of Population Proportion
Ha: π < hypothesized value
P-value = P(z < (p − hypothesized value) / √[hypothesized value(1 − hypothesized value)/n])
29. 29
Ha: π > hypothesized value
P-value = P(z > (p − hypothesized value) / √[hypothesized value(1 − hypothesized value)/n])
Hypothesis Test
Large Sample Test of Population Proportion
30. 30
Ha: π ≠ hypothesized value
P-value = 2P(z > |(p − hypothesized value) / √[hypothesized value(1 − hypothesized value)/n]|)
Hypothesis Test
Large Sample Test of Population Proportion
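The three P-value formulas for the large-sample proportion test can be sketched in a few lines of Python using only the standard library. The function name `one_proportion_z_test`, the `alternative` labels, and the sample counts in the usage lines are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative):
    """Large-sample z test for a single proportion.

    alternative is one of 'less', 'greater', 'two-sided' (hypothetical labels).
    Returns (z, p_value).
    """
    p_hat = successes / n                       # sample proportion p
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # hypothesized value in the SE
    std_normal = NormalDist()
    if alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value


# Hypothetical data: 45 successes in 100 trials, H0: pi = 0.5.
z, p = one_proportion_z_test(45, 100, 0.5, "less")
print(f"z = {z:.2f}, P-value = {p:.4f}")
```

Note that the standard error uses the hypothesized value rather than the sample proportion, because the calculation is carried out under the assumption that H0 is true.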
31. 31
An insurance company states that the
proportion of its claims that are settled
within 30 days is 0.9. A consumer group
thinks that the company drags its feet and
takes longer to settle claims. To check
these hypotheses, a simple random
sample of 200 of the company’s claims
was obtained and it was found that 160 of
the claims were settled within 30 days.
Hypothesis Test Example
Large-Sample Test for a Population Proportion
32. 32
Hypothesis Test Example 2
Single Proportion continued
π = proportion of the company’s claims that are
settled within 30 days
H0: π = 0.9
HA: π < 0.9
The sample proportion is p = 160/200 = 0.8
z = (0.8 − 0.9) / √[(0.9)(1 − 0.9)/200] = (0.8 − 0.9) / √[0.9(0.1)/200] = −4.71
P-value = P(z < −4.71) ≈ 0
33. 33
Hypothesis Test Example 2
Single Proportion continued
The probability of getting a result as strongly or
more strongly in favor of the consumer group's
claim (the alternate hypothesis Ha) if the
company’s claim (H0) was true is essentially 0.
Clearly, this gives strong evidence in support of
the alternate hypothesis (against the null
hypothesis).
34. 34
Hypothesis Test Example 2
Single Proportion continued
We would say that we have strong support for
the claim that the proportion of the insurance
company’s claims that are settled within 30 days
is less than 0.9.
Some people would state that we have shown
that the true proportion of the insurance
company’s claims that are settled within 30 days
is statistically significantly less than 0.9.
35. 35
A county judge has agreed that he will
give up his county judgeship and run for
a state judgeship unless there is
evidence at the 0.10 level that more than
25% of his party is in opposition. An SRS
of 800 party members included 217 who
opposed him. Please advise this judge.
Hypothesis Test Example
Single Proportion
36. 36
Hypothesis Test Example
Single Proportion continued
π = proportion of his party that is in opposition
H0: π = 0.25
HA: π > 0.25
α = 0.10
Note: hypothesized value = 0.25
n = 800, p = 217/800 = 0.27125
z = (0.27125 − 0.25) / √[0.25(0.75)/800] = 1.39
37. 37
Hypothesis Test Example
Single Proportion continued
P-value = P(z > 1.39) = 1 − 0.9177 = 0.0823
Because P-value = 0.0823 < 0.10 = α, we reject H0.
At a level of significance of 0.10, there is
sufficient evidence to support the claim that
the true percentage of the party members
that oppose him is more than 25%.
Under these circumstances, I would advise
him not to run.
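The arithmetic for the judge example can be checked with a short script. This is a sketch using Python's standard library; the variable names are illustrative. Because it uses the unrounded z value rather than a table lookup at z = 1.39, the P-value may differ slightly in the third or fourth decimal place from the slide's 0.0823.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the example: 217 of 800 opposed, H0: pi = 0.25, alpha = 0.10.
n, opposed, p0, alpha = 800, 217, 0.25, 0.10

p_hat = opposed / n                         # sample proportion
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # large-sample z statistic
p_value = 1 - NormalDist().cdf(z)           # upper-tail area, since Ha: pi > 0.25
reject_h0 = p_value <= alpha

print(f"p = {p_hat:.5f}, z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```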
38. 38
1. Describe (determine) the population
characteristic about which hypotheses
are to be tested.
2. State the null hypothesis H0.
3. State the alternate hypothesis Ha.
4. Select the significance level α for the
test.
5. Display the test statistic to be used, with
substitution of the hypothesized value
identified in step 2 but without any
computation at this point.
Steps in a Hypothesis-Testing
Analysis
39. 39
Steps in a Hypothesis-Testing
Analysis
6. Check to make sure that any
assumptions required for the test are
reasonable.
7. Compute all quantities appearing in the
test statistic and then the value of the
test statistic itself.
8. Determine the P-value associated with
the observed value of the test statistic
9. State the conclusion in the context of
the problem, including the level of
significance.
40. 40
Hypothesis Test (Large samples)
Single Sample Test of Population Mean
To test the hypothesis
H0: µ = hypothesized mean,
compute the z statistic
z = (x̄ − hypothesized mean) / (σ/√n)
In terms of a standard normal random variable z, the
approximate P-value for this test depends on the
alternate hypothesis and is given for each of the
possible alternate hypotheses on the next 3 slides.
41. 41
Hypothesis Test
Single Sample Test of Population Mean
H0: µ = hypothesized mean
HA: µ < hypothesized mean
P-value = P(Z < (x̄ − hypothesized mean) / (σ/√n))
42. 42
Hypothesis Test
Single Sample Test of Population Mean
H0: µ = hypothesized mean
HA: µ > hypothesized mean
P-value = P(Z > (x̄ − hypothesized mean) / (σ/√n))
43. 43
Hypothesis Test
Single Sample Test of Population Mean
H0: µ = hypothesized mean
HA: µ ≠ hypothesized mean
P-value = 2P(Z > |(x̄ − hypothesized mean) / (σ/√n)|)
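The z test for a mean with σ known follows the same pattern for all three alternatives. The following is a minimal sketch; the function name `z_test_mean`, the `alternative` labels, and the numbers in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, mu0, sigma, n, alternative):
    """z test for a population mean when sigma is known.

    alternative is one of 'less', 'greater', 'two-sided' (hypothetical labels).
    Returns (z, p_value).
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value


# Hypothetical numbers: sample mean 52 from n = 64, sigma = 8, H0: mu = 50.
z, p = z_test_mean(52, 50, 8, 64, "greater")
print(f"z = {z:.2f}, P-value = {p:.4f}")
```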
44. 44
Reality Check
It is not likely that one would know σ but not
know µ, so calculating a z value using the
formula
z = (x̄ − hypothesized mean) / (σ/√n)
would not be very realistic.
For large values of n (n > 30) it is generally acceptable to
use s to estimate σ; however, it is much more common
to apply the t distribution.
45. 45
Hypothesis Test (σ unknown)
Single Sample Test of Population Mean
To test the null hypothesis µ = hypothesized
mean, when we may assume that the
underlying distribution is normal or
approximately normal, compute the t statistic
t = (x̄ − hypothesized mean) / (s/√n)
The approximate P-value for this test is found
using a t random variable with degrees of freedom
df = n − 1. The procedure is described in the next
group of slides.
46. 46
Hypothesis Test
Single Sample Test of Population Mean
H0: µ = hypothesized mean
HA: µ < hypothesized mean
P-value = P(t < (x̄ − hypothesized mean) / (s/√n))
47. 47
Hypothesis Test
Single Sample Test of Population Mean
H0: µ = hypothesized mean
HA: µ > hypothesized mean
P-value = P(t > (x̄ − hypothesized mean) / (s/√n))
48. 48
Hypothesis Test
Single Sample Test of Population Mean
H0: µ = hypothesized mean
HA: µ ≠ hypothesized mean
P-value = 2P(t > |(x̄ − hypothesized mean) / (s/√n)|)
49. 49
The t statistic can be used for all sample
sizes; however, the smaller the sample,
the more important the assumption that
the underlying distribution is normal.
Typically, when n > 15 the underlying
distribution need only be centrally
weighted and may be somewhat skewed.
Hypothesis Test (σ unknown)
Single Sample Test of Population Mean
52. 52
A manufacturer of a special bolt requires
that this type of bolt have a mean
shearing strength in excess of 110 lb. To
determine if the manufacturer’s bolts
meet the required standards, a sample of
25 bolts was obtained and tested. The
sample mean was 112.7 lb and the
sample standard deviation was 9.62 lb.
Use this information to perform an
appropriate hypothesis test with a
significance level of 0.05.
Example of Hypothesis Test
Single Sample Test of Population Mean - continued
53. 53
Example of Hypothesis Test
Single Sample Test of Population Mean - continued
µ = the mean shearing strength of this
specific type of bolt
The hypotheses to be tested are
H0: µ = 110 lb
Ha: µ > 110 lb
The significance level to be used for the
test is α = 0.05.
The test statistic is
t = (x̄ − 110) / (s/√n)
54. 54
Example of Hypothesis Test
Single Sample Test of Population Mean - continued
x̄ = 112.7, s = 9.62, n = 25, df = 24
P-value = P(t > (112.7 − 110) / (9.62/√25))
= P(t > 1.4) = 0.087
55. 55
Because P-value = 0.087 > 0.05 = α, we
fail to reject H0.
At a level of significance of 0.05, there is
insufficient evidence to conclude that the
mean shearing strength of this brand of bolt
exceeds 110 lbs.
Example of Hypothesis Test
Single Sample Test of Population Mean - conclusion
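The bolt example can be reproduced without a t table. Python's standard library has no t distribution, so the sketch below approximates the upper-tail area by integrating the t density numerically with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are hypothetical, and the numerical integration is a substitute chosen for this illustration, not the table-based method the slides use.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_c) / sqrt(df * pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    upper = 0.5 - area_0_to_a           # area to the right of |t|
    return upper if t >= 0 else 1 - upper


# Values from the example: x-bar = 112.7, s = 9.62, n = 25, H0: mu = 110.
x_bar, mu0, s, n = 112.7, 110.0, 9.62, 25
t_stat = (x_bar - mu0) / (s / sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)  # Ha: mu > 110, so upper tail
print(f"t = {t_stat:.2f}, P-value = {p_value:.3f}")
```

With these inputs the P-value falls between 0.05 and 0.10, which is why the conclusion depends on which of those two significance levels is chosen.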
57. 57
Revisit the problem with α = 0.10
What would happen if the significance level
of the test was 0.10 instead of 0.05?
Now P-value = 0.087 < 0.10 = α, and we
reject H0 at the 0.10 level of significance and
conclude:
At the 0.10 level of significance there is
sufficient evidence to conclude that the
mean shearing strength of this brand of
bolt exceeds 110 lbs.
58. 58
Comments continued
Many people are bothered by the fact that
different choices of α lead to different
conclusions.
This is the nature of a process where you
control the probability of being wrong when
you select the level of significance. This
reflects your willingness to accept a certain
level of type I error.
59. 59
Another Example
A jeweler is planning on manufacturing gold
charms. His design calls for a particular piece to
contain 0.08 ounces of gold. The jeweler would
like to know if the pieces that he makes contain
(on the average) 0.08 ounces of gold. To test to
see if the pieces contain 0.08 ounces of gold, he
made a sample of 16 of these particular pieces
and obtained the following data.
0.0773 0.0779 0.0756 0.0792 0.0777
0.0713 0.0818 0.0802 0.0802 0.0785
0.0764 0.0806 0.0786 0.0776 0.0793 0.0755
Use a level of significance of 0.01 to perform an
appropriate hypothesis test.
60. 60
Another Example
1. The population characteristic being
studied is µ = true mean gold content for
this particular type of charm.
2. Null hypothesis: H0:µ = 0.08 oz
3. Alternate hypothesis: Ha:µ ≠ 0.08 oz
4. Significance level: α = 0.01
5. Test statistic:
t = (x̄ − hypothesized mean) / (s/√n) = (x̄ − 0.08) / (s/√n)
61. 61
Another Example
6. Minitab was used to create a normal plot
along with a graphical display of the
descriptive statistics for the sample data. The
display suggests that it is reasonable to
assume that the population of gold contents
of this type of charm is normally distributed.
[Figure: Normal Probability Plot of Gold (x-axis: Gold, 0.072 to 0.082; y-axis: Probability, .001 to .999)]
Anderson-Darling Normality Test
A-Squared: 0.363
P-Value: 0.396
Average: 0.0779813
StDev: 0.0025143
N: 16
62. 62
Another Example
We can see that, with the exception of one
outlier, the data are reasonably symmetric and
mound shaped, indicating that the
population of amounts of
gold for this particular charm can reasonably
be assumed to be normally distributed.
[Figure: Minitab graphical summary for Gold, with 95% confidence interval plots for Mu and Median]
Descriptive Statistics
Variable: Gold
Anderson-Darling Normality Test
A-Squared: 0.363
P-Value: 0.396
Mean: 7.80E-02
StDev: 2.51E-03
Variance: 6.32E-06
Skewness: -1.10922
Kurtosis: 2.23191
N: 16
Minimum: 7.13E-02
1st Quartile: 7.66E-02
Median: 7.82E-02
3rd Quartile: 8.00E-02
Maximum: 8.18E-02
95% Confidence Interval for Mu: 7.66E-02 to 7.93E-02
95% Confidence Interval for Sigma: 1.86E-03 to 3.89E-03
95% Confidence Interval for Median: 7.71E-02 to 7.95E-02
63. 63
Another Example
7. Computations:
n = 16, x̄ = 0.077981, s = 0.0025143
t = (0.077981 − 0.08) / (0.0025143/√16) = −3.2
8. P-value: This is a two-tailed test. Looking up
t = 3.2 with df = 15 in the table of tail areas for
t curves, we see the table entry is 0.003, so
P-value = 2(0.003) = 0.006
64. 64
Another Example
9. Conclusion:
Since P-value = 0.006 ≤ 0.01 = α, we
reject H0 at the 0.01 level of significance.
At the 0.01 level of significance there is
convincing evidence that the true mean
gold content of this type of charm is not
0.08 ounces.
Actually, when rejecting a null hypothesis for
the ≠ alternative, a one-tailed claim is
supported. In this case the sample mean falls
below 0.08, so at the 0.01 level of
significance there is convincing evidence
that the true mean gold content of this type
of charm is less than 0.08 ounces.
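The summary statistics and the t statistic for the gold-charm data can be recomputed directly from the 16 observations. This is a short check using Python's standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# The 16 gold-content measurements (ounces) listed in the example.
gold = [
    0.0773, 0.0779, 0.0756, 0.0792, 0.0777,
    0.0713, 0.0818, 0.0802, 0.0802, 0.0785,
    0.0764, 0.0806, 0.0786, 0.0776, 0.0793, 0.0755,
]

mu0 = 0.08                 # hypothesized mean from H0
n = len(gold)
x_bar = mean(gold)         # sample mean
s = stdev(gold)            # sample standard deviation (n - 1 in the denominator)
t_stat = (x_bar - mu0) / (s / sqrt(n))

print(f"n = {n}, x-bar = {x_bar:.6f}, s = {s:.7f}, t = {t_stat:.2f}")
```

The negative sign of the t statistic reflects a sample mean below the hypothesized 0.08 ounces.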
65. 65
Power and Probability of Type
II Error
The power of a test is the probability of
rejecting the null hypothesis.
When H0 is false, the power is the
probability that the null hypothesis is
rejected. Specifically, power = 1 – β.
66. 66
Effects of Various Factors on Power
1. The larger the size of the discrepancy
between the hypothesized value and
the true value of the population
characteristic, the higher the power.
2. The larger the significance level, α,
the higher the power of the test.
3. The larger the sample size, the higher
the power of the test.
67. 67
Some Comments
Calculating β (hence power) depends on
knowing the true value of the population
characteristic being tested. Since the true
value is not known, generally, one
calculates β for a number of possible
“true” values of the characteristic under
study and then sketches a power curve.
68. 68
Example (based on z-curve)
Consider the earlier example where we
tested H0: µ = 110 vs. Ha: µ > 110, and
suppose further that the true standard
deviation of the bolts is actually 10 lbs.
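For an upper-tailed z test with known σ, the power at a given true mean can be computed directly: the test rejects when z exceeds the critical value z_α, and under the true mean the z statistic is shifted by (µ_true − µ0)/(σ/√n). The sketch below assumes this setup. The function name `power_upper_tailed_z` is hypothetical, and the sample size n = 25 is an assumption borrowed from the bolt example, since this slide does not state n.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tailed_z(mu_true, mu0, sigma, n, alpha):
    """Power of the test H0: mu = mu0 vs. Ha: mu > mu0 when sigma is known."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)         # reject H0 when z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))    # mean of z under mu_true
    return 1 - std_normal.cdf(z_crit - shift)      # P(reject H0 | mu_true)


# H0: mu = 110, sigma = 10 as in the example; n = 25 is an assumed sample size.
for mu_true in (110, 112, 114, 116):
    pw = power_upper_tailed_z(mu_true, mu0=110, sigma=10, n=25, alpha=0.05)
    print(f"true mu = {mu_true}: power = {pw:.3f}, beta = {1 - pw:.3f}")
```

Evaluating the function over a grid of true means gives the points of a power curve. When the true mean equals the hypothesized value, the rejection probability equals α.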
69. 69
Example (based on z-curve)
[Figure: Power Curves for Different α's (α = 0.10, α = 0.05, α = 0.01); x-axis: True Value of µ, 118 to 128; y-axis: Power (1 − β), 0.00 to 1.00; H0: µ = 120, Ha: µ > 120, σ = 10]
70. 70
Example (based on z-curve)
[Figure: Power Curves for Different n's (n = 45, n = 90, n = 180, n = 360); x-axis: True Value of µ, 118 to 128; y-axis: Power (1 − β), 0.00 to 1.00; H0: µ = 120, Ha: µ > 120, σ = 10]