Malimu statistical significance testing.

STATISTICAL SIGNIFICANCE TESTING
BY MALIMU
Dept of Epidemiology/Biostatistics,
School of Public Health and Social Sciences,
MUHAS/KIU

OUTLINE
• Introduction
– why significance tests (with examples)
– how significance tests work
– significance levels
– critical region and critical values
– concept of the P-value
– hypotheses
• Significance test for 1 mean (z-test and t-test)
• Significance test for 1 proportion (z-test)
• Significance test for 2 means (z-test and t-test)
• Significance test for 2 proportions (χ²-test)
derived from a 2 x 2 contingency table

WHY A SIGNIFICANCE TEST
Example 1: mean age at first sexual intercourse; do males
start earlier than females?
            Male        Female      Overall
mean (n)    16.9 (93)   17.2 (99)   17.0 (192)
std. dev.   2.1         2.0         2.0

INTRODUCTION: why significance tests
Example 2: Prevalence of HIV infection: comparing
in-school youth and the general youth population
• Large studies have indicated that the proportion
of youth infected with HIV is 10%. A study done
involving 400 school youth showed a prevalence
of 7%. Do these results provide evidence that
the prevalence of HIV among school youth is
lower than that in the general youth population?

Example 3: Utilization of VCT services by
marital status
• In a study on VCT utilization, we find that
60% of married people in the sample
utilize VCT services compared to 30% of
unmarried people. How should we
interpret this result?

• We note that in all examples, there are
differences
• However, the observed difference in each
example might:
– reflect a TRUE DIFFERENCE (i.e. the difference
also exists in the total population from which the
sample was drawn)
– be due to CHANCE (i.e. in reality there is no
difference, but the observed difference is due to
sampling variation)
– be due to BIAS (e.g. due to defects in the study
methodology)

• With an appropriate study design, we can feel
confident that an observed difference between
two groups cannot be explained by BIAS
• We would like to find out whether this difference
can be considered as a TRUE difference
• We can only conclude that this is the case if we
can rule out the CHANCE explanation
• We accomplish this by applying a significance
test

• A SIGNIFICANCE TEST estimates how likely it is
that a study result at least as large as the one
observed (e.g. a difference between two groups)
would arise by chance alone
• A significance test is used to assess whether a
study result, which is observed in a sample can
be considered as a result which indeed exists in
the study population from which the sample was
drawn

INTRODUCTION: How significance tests work
• Suppose we observed a difference
between two groups in a study
sample.
• We want to know whether this
observed difference between the two
groups represents a real difference
in the total study population from
which the sample was drawn, or
whether it just occurred by chance
(due to sampling variation).

• To find this out, we determine how likely it
is that this difference could have occurred
by chance.
• We can never be 100% sure that an
observed difference is true, but in general,
we are happy if we can be 99% or 95%
sure (confident).
• Being 95% sure corresponds to a significance
level of 5%: a difference this large would arise
by chance less than 5% of the time if no real
difference existed.

Important Terminologies
• Statistical hypothesis
– This is a statement about the
parameter(s) of the population(s) from
which the sample(s) were taken.
– Null hypothesis, H0 : hypothesis of
“no difference”. This is the one to be
tested.
– Alternative hypothesis, H1:
hypothesis that disagrees with the null
hypothesis.

• Test statistic
– This is a mathematical function
(expression) of sample values which
provides a basis for testing a statistical
hypothesis.
– It has a known sampling distribution with
tabulated percentage points (e.g.
standard normal deviate (SND), z;
chi-squared, χ²; t)

• Significance level (α):
– The probability of rejecting H0 when it is true
– Often expressed in percentage form (i.e.
probability, α, is multiplied by 100)
– In social sciences, the commonly accepted
level is 0.05 (5%), i.e. we accept a 5% risk of
concluding that a difference exists when it
arose by chance. In clinical trials involving new
drugs, a stricter significance level (e.g.
0.01 = 1%) would be chosen
– Generally, 0.01 and 0.05 values are most
commonly used in scientific studies

• Critical (Rejection) Region
– This is the region that encompasses
values of the test statistic leading to
rejection of the null hypothesis
– Location of the critical region is
dependent on the test statistic and the
specified significance level

• Critical Value
– This is the value of the test statistic
corresponding to a given significance level
– The critical value changes as the confidence
level alters: e.g. for the z-test, the two-sided
critical values for confidence levels of 90%,
95%, 99% are 1.64, 1.96 (≈2), 2.58, respectively
– If the absolute value of the test statistic
computed from the data is greater than the
critical value, H0 is rejected
– It is the boundary value of the critical region
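
The critical values quoted above can be checked numerically. The following is a minimal sketch, assuming a two-sided test based on the standard normal distribution; the function name `two_sided_critical_z` is an illustrative choice and not part of the original slides.

```python
from statistics import NormalDist


def two_sided_critical_z(confidence: float) -> float:
    """Return the z value that leaves (1 - confidence)/2 in each tail."""
    alpha = 1.0 - confidence
    # inv_cdf gives the quantile of the standard normal distribution
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    z_crit = two_sided_critical_z(level)
    print(f"{level:.0%} confidence: critical z = {z_crit:.2f}")
```

The boundary moves outward as the confidence level rises, which is why a result can be significant at the 5% level but not at the 1% level.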

CRITICAL REGION & VALUE

Concept of P-value
• In any study looking for differences between
groups or associations between variables, the
likelihood or PROBABILITY of observing a
certain result by chance has to be calculated by
a statistical test
• This PROBABILITY of observing a result by
chance is usually expressed as a P-VALUE
• If it is unlikely (<0.05) that the difference
occurred by chance, we reject the chance
explanation and accept that there is a real
difference. We then say that the difference is
statistically significant

• If it is likely (≥ 0.05) that the
difference occurred by chance, we
cannot conclude that a real difference
exists. We then say that the difference
is not statistically significant
• Therefore a difference is
considered significant if P <
0.05

Hypotheses
• In statistical terms the assumption that in the
total study population no real difference exists
between groups (or that no real association
exists between variables) is called the NULL
HYPOTHESIS (H0)
• The ALTERNATIVE HYPOTHESIS (H1) is
that there exists a difference between groups or
that a real association exists between
variables
• If the result is statistically significant, we reject
the NULL HYPOTHESIS (H0) and accept the
ALTERNATIVE HYPOTHESIS (H1) that
there is a real difference between two groups,
or a real association between two variables

One sample significance test for a mean (σ known): the
Standard Normal Deviate or z-test
• Problem: Is it reasonable to conclude
that a sample of n observations, with
mean x̄, could have been from a
population with mean µ and standard
deviation σ?

• H0: The difference between µ and x̄ is merely
due to sampling error
• Calculate SND, z = (x̄ − µ) / SE(x̄)
                    = (x̄ − µ) / (σ/√n)
INTERPRETATION OF P-VALUE
• If |z| < 1.96 then P > 0.05:
– we have no strong evidence against H0
– the difference could plausibly be due to
chance
– hence, difference is not statistically
significant
• If |z| > 1.96 then P < 0.05:
– we have evidence against H0
– it is unlikely that the difference between µ
and x̄ is due only to sampling error
– hence, difference is statistically
significant
• If |z| > 2.58 then P < 0.01:
– we have strong evidence against H0

Example
• Results of a study investigating medical risks
associated with a certain occupation show that
in a random sample of 20 men aged 30-39 years
the mean systolic blood pressure is 141.4
mmHg.
• Suppose that the mean systolic blood pressure
in the general population of men aged 30-39
years is known to be 133.2 mmHg with a
standard deviation of 15.1 mmHg.
• Do the results of the study provide evidence of
an increased blood pressure associated with this
occupation?

Solution
• Null hypothesis, H0: there is no increase in blood
pressure in this occupation, and the sample of 20 men
can be regarded as a random sample from the general
population of men aged 30-39 years.
• Calculate SND, z = (x̄ − µ) / SE(x̄)
                    = (x̄ − µ) / (σ/√n)

Solution
• n = 20
• x̄ = 141.4 mmHg
• µ = 133.2 mmHg
• σ = 15.1 mmHg
• SE(x̄) = σ/√n = 3.38 mmHg
• SND, z = 2.43; P = 0.015 < 0.05
• Conclusion:
– The difference is statistically significant!
– That is, there is enough evidence of an increase in
systolic blood pressure among men in this occupation
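
The arithmetic in this solution can be reproduced with a few lines of Python. This is a minimal sketch, assuming a two-sided P-value from the standard normal distribution; the function name `one_sample_z_test` is hypothetical and chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (standard error, z, two-sided P) for a one-sample z-test."""
    se = pop_sd / sqrt(n)                  # SE of the mean = sigma / sqrt(n)
    z = (sample_mean - pop_mean) / se      # standard normal deviate
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return se, z, p_two_sided


# Values from the blood pressure example
se, z, p = one_sample_z_test(sample_mean=141.4, pop_mean=133.2, pop_sd=15.1, n=20)
print(f"SE = {se:.2f}, z = {z:.2f}, P = {p:.3f}")
```

With the example's figures this gives the same standard error, z and P-value as stated in the solution.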

PRACTICAL
The mean level of prothrombin in the normal
population is known to be 20.0 mg/100 ml of
plasma and the standard deviation is 4 mg/100
ml. A sample of 40 patients showing vitamin K
deficiency has a mean prothrombin level of
18.5 mg/100 ml.
• (a) How reasonable is it to conclude that the true
mean for patients with vitamin K deficiency is
the same as that for a normal population?
• (b) Within what limits would the mean
prothrombin level be expected to lie for all
patients with vitamin K deficiency? (Give the
95% confidence limits)

ONE SAMPLE SIGNIFICANCE TEST FOR A PROPORTION
Problem: Is it reasonable to conclude that a sample of n
observations, in which a proportion p have a
characteristic, could have been taken from a population
in which the proportion with the characteristic is π?
• H0: the difference between p and π is merely due to
sampling error (i.e. by chance)
• If n is reasonably large (say n > 40), then calculate
z = (p − π) / SE(p)
With p and π expressed as proportions:
z = (p − π) / √(π(1 − π)/n)
OR, with p and π expressed as percentages:
z = (p − π) / √(π(100 − π)/n)
• Conclusions follow as before
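
As an illustration, the formula can be applied to Example 2 from the introduction (general youth prevalence 10%, observed 7% among 400 school youth). This is a sketch only, assuming proportions expressed as fractions and a two-sided P-value; the function name `one_proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(p_sample, pi_null, n):
    """Return (standard error, z, two-sided P) for a one-sample proportion test."""
    se = sqrt(pi_null * (1.0 - pi_null) / n)   # SE(p) under H0
    z = (p_sample - pi_null) / se
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return se, z, p_two_sided


# Example 2: pi = 0.10 in the general youth population, p = 0.07 in n = 400 school youth
se, z, p = one_proportion_z_test(p_sample=0.07, pi_null=0.10, n=400)
print(f"SE(p) = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

Here |z| is just above 1.96, so under these assumptions the difference would be borderline significant at the 5% level.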

PRACTICAL: one sample proportion
• In a clinical trial to compare two systems
of TB treatment: A (hospital based DOTS)
and B (home based DOTS), 100 patients
each tried the two systems on
different occasions. Of the 100 patients,
65 say they prefer A, 35 prefer B. Is this
reasonably good evidence that more
patients prefer A than B?

One sample significance test for a mean (σ unknown): the
t-test
• Use of SND, z, applies when the population
standard deviation, σ, is known
• If σ is unknown, it can be estimated from the
sample by the standard deviation s
• With small samples, replacing σ by s in the
formula for SND leads to a new quantity t,
given by
• t = (x̄ − µ) / SE(x̄)
    = (x̄ − µ) / (s/√n)
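
The t statistic can be computed directly from raw observations. The sketch below uses a small hypothetical sample (invented purely for illustration, not data from these slides) and compares the result with the tabulated two-sided 5% critical value of t on n − 1 = 7 degrees of freedom (2.365). The function name `one_sample_t` is also an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, pop_mean):
    """Return (t, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    s = stdev(sample)              # sample standard deviation (divisor n - 1)
    se = s / sqrt(n)               # estimated standard error of the mean
    t = (mean(sample) - pop_mean) / se
    return t, n - 1


# Hypothetical data: 8 observations tested against an assumed population mean of 11
hypothetical_sample = [12, 15, 11, 14, 13, 16, 10, 13]
t, df = one_sample_t(hypothetical_sample, pop_mean=11)

T_CRIT_5PCT_7DF = 2.365            # two-sided 5% point of t on 7 df, from t tables
print(f"t = {t:.2f} on {df} df; significant at 5%: {abs(t) > T_CRIT_5PCT_7DF}")
```

Unlike the z-test, the critical value depends on the degrees of freedom, so it must be looked up for each sample size.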

END OF ONE SAMPLE SIGNIFICANCE TEST
• s = 5.18
• t = 2.65 on 11 df; P<0.05;
• probably G. secundum
DETERMINING SIGNIFICANT DIFFERENCES BETWEEN
GROUPS IN CATEGORICAL DATA
• For NOMINAL data the significance test to be
used depends on whether the sample is small or
large
• Generally, the chi-square (χ²) test will be
used
• However, for small samples (n < 40) in a 2 x 2
table, or if any cell of the cross-table has an
expected value of less than 5, it is better to use
Yates' corrected χ² or Fisher's exact test

COMPARING TWO PROPORTIONS: the chi-square (χ²) test
• The chi-square test is used for
CATEGORICAL data to test for independence
between two or more variables, basically
comparing two or more proportions
• With categorical data the chi-square test is used
to find out whether observed differences
between proportions of events in two or more
groups may be considered statistically
significant

THE CHI-SQUARE (χ²) TEST: Example
• Suppose that in a cross-sectional study of the
factors affecting the utilization of antenatal
clinics we find that 64% of the women who lived
within 10 km of the clinic came for antenatal
care, compared to only 47% of those who lived
more than 10 km away
• This suggests that antenatal care (ANC) is used
more often by women who live close to the
clinics. The complete results are presented in
the following table

UTILIZATION OF ANTENATAL CLINICS BY DISTANCE FROM A
CLINIC

• We now want to examine if this observed
difference is statistically significant or not.
• The chi-square test can be used to give us
the answer.
• To perform a χ² test we need to complete the
following 3 steps:
– calculate the χ²,
– use a χ² table to obtain the P-value and
– interpret the χ²

TABLE OF χ² VALUES
df    P = 0.05    P = 0.01
1     3.84        6.63
2     5.99        9.21
3     7.81        11.34
4     9.49        13.28
5     11.07       15.09
6     12.59       16.81
7     14.07       18.48
8     15.51       20.09
9     16.92       21.67
10    18.31       23.21
11    19.68       24.72

CALCULATING χ² VALUE: the 2 X 2 table
• Consider the following general cross-table:

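• The quick formula for calculating χ² in a 2
x 2 table is as follows:
• χ² = N(ad − bc)² / (EFGH), where a, b, c and d
are the four cell counts, E, F, G and H are the
row and column totals, and N is the grand total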
• General formula for larger contingency
tables is time consuming to perform, but
computers are very useful for this
• For larger contingency tables, we shall
therefore learn how to use Epi-Info for
statistical significance testing
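
The quick formula can be sketched in Python as follows. The cell layout (a, b in the first row; c, d in the second) and the counts used are assumptions: the counts are illustrative values chosen so that the row percentages (51/80 ≈ 64%, 35/75 ≈ 47%) and the resulting χ² agree with the figures quoted for the antenatal care example. They are not taken from a table in this transcript.

```python
def chi_square_2x2(a, b, c, d):
    """Quick chi-square formula for a 2 x 2 table (no continuity correction).

    Assumed layout:   a  b  | E
                      c  d  | F
                      ------+--
                      G  H  | N
    """
    E, F = a + b, c + d            # row totals
    G, H = a + c, b + d            # column totals
    N = a + b + c + d              # grand total
    return N * (a * d - b * c) ** 2 / (E * F * G * H)


# Illustrative counts (assumed): rows = distance group, columns = used / did not use ANC
chi2 = chi_square_2x2(a=51, b=29, c=35, d=40)
print(f"chi-square = {chi2:.2f}")
```

The formula fails with a division by zero if any row or column total is zero, so a practical version would check the margins first.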

USING A χ² TABLE
(1) Decide on the significance level you want to use (e.g.
0.05)
(2) Calculate the degrees of freedom (df) as:
• df = (r-1) x (c-1), where r is the number of rows and c is
the number of columns
• for a simple two-by-two table the number of degrees of
freedom is 1 (df = (2-1) x (2-1) = 1)
(3) If the calculated χ² > the tabulated χ² for P = 0.05, then P < 0.05
• In this case, we reject the null hypothesis and conclude
that there is a statistically significant difference
between the groups
(4) If the calculated χ² < the tabulated χ² for P = 0.05, then P > 0.05
• In this case, we do not reject the null hypothesis and
conclude that the observed difference is not statistically
significant
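
The four steps above can be sketched as a simple table lookup. This is an illustrative sketch only: the dictionary holds the first four rows of the χ² table given earlier, and the names `CHI2_CRITICAL`, `degrees_of_freedom` and `is_significant` are hypothetical.

```python
# Tabulated chi-square critical values: df -> (value at P = 0.05, value at P = 0.01)
CHI2_CRITICAL = {
    1: (3.84, 6.63),
    2: (5.99, 9.21),
    3: (7.81, 11.34),
    4: (9.49, 13.28),
}


def degrees_of_freedom(rows, cols):
    """df = (r - 1) x (c - 1) for an r x c contingency table."""
    return (rows - 1) * (cols - 1)


def is_significant(chi2, rows, cols, alpha=0.05):
    """Compare a calculated chi-square with the tabulated critical value."""
    if alpha not in (0.05, 0.01):
        raise ValueError("only the 0.05 and 0.01 columns are tabulated here")
    crit_05, crit_01 = CHI2_CRITICAL[degrees_of_freedom(rows, cols)]
    critical = crit_05 if alpha == 0.05 else crit_01
    return chi2 > critical          # True means: reject H0 at this level


# A calculated chi-square of 4.57 from a 2 x 2 table
print(is_significant(4.57, rows=2, cols=2))              # compared with 3.84
print(is_significant(4.57, rows=2, cols=2, alpha=0.01))  # compared with 6.63
```

A value of 4.57 on 1 df exceeds 3.84 but not 6.63, so it is significant at the 5% level and not at the 1% level.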

APPLYING THE χ² TEST
Example: use of χ² test with the data on
utilization of antenatal care:
• χ² = 4.57; P = 0.03 < 0.05
• Conclusion:
– women living within a distance of 10 km from
the clinic utilize antenatal care more often
(64%) than the women living more than 10 km
away (47%);
– this difference is statistically significant
(χ² = 4.57; P = 0.03)

PRACTICAL: two sample proportions
• From each of 509 vaginal swabs taken at an STI
clinic, isolation of Candida albicans and culture
of Trichomonas vaginalis were attempted.
• There were 347 swabs negative for both
Candida and Trichomonas. Candida was
isolated from 7 of the 44 swabs positive for
Trichomonas:
– Present this information in a 2 x 2 table.
– Is there evidence of an association between
Candidiasis and Trichomoniasis?
