Bio statistics 2 /certified fixed orthodontic courses by Indian dental academy

Bio Statistics
INDIAN DENTAL ACADEMY
Leader in continuing dental education
www.indiandentalacademy.com


Contents
• Tests of significance
• Stages in performing test of significance
• Types of error
• Test of significance for large samples
• Test of significance for small samples
• Chi square test
• ANOVA
• Bibliography

• Whatever the sampling procedure or
the care taken while selecting a sample, the
sample statistics will differ from the
population parameters
• Variations between 2 samples drawn
from the same population may also occur
• i.e. differences in the results between two
research workers for the same
investigation may be observed

• Thus it becomes important to find out the
significance of this observed variation
• i.e. whether it is due to
– chance or biological variation (statistically not
significant) OR
– the influence of some external factor
(statistically significant)


• To test whether the observed variation is
significant, various tests of
significance are done. Tests of
significance can be broadly classified as
1. Parametric tests
2. Non parametric tests


Parametric tests
• Parametric tests are those tests in which
certain assumptions are made about the
population
– The population from which the sample is drawn has
a normal distribution
– The variances of the samples do not differ
significantly
– The observations are truly numerical,
so arithmetic procedures such as addition,
division, and multiplication can be used

Parametric tests
• Since these tests make assumptions about
the population parameters, they are
called parametric tests.
• They are usually used to test the
difference between means
• They are:
– Student t test (paired or unpaired)
– ANOVA
– Test of significance between two means

Non parametric tests
• In many biological investigations the
research worker may not know the nature
of the distribution or other required values of
the population.
• Also, some biological measurements may
not be true numerical values, so
arithmetic procedures are not possible in
such cases.

Non parametric tests
• In such cases distribution free or non
parametric tests are used, in which no
assumptions are made about the population
parameters, e.g.
– Mann Whitney test
– Chi square test
– Phi coefficient test
– Fisher's Exact test
– Sign test
– Friedman's test

One tailed and two tailed test
• Test of significance can also be divided
into one tailed or 2 tailed test


Two tailed test
• This test determines if there is a difference
between the two groups without specifying
whether the difference is higher or lower
• It includes both ends or tails of the normal
distribution
• Such a test is called a two tailed test
• E.g. when one wants to know if mean IQ in
malnourished children is different from that in well
nourished children but does not specify if it
is more or less

One tailed test
• In a test of significance one may want to know
specifically whether the difference between the two
groups is higher or lower,
• i.e. the direction (plus or minus side) is specified.
• Then one end or tail of the distribution is excluded
• E.g. if one wants to know whether malnourished children
have a lower mean IQ than well nourished children, the higher
side of the distribution will be excluded
• Such a test of significance is called a one tailed test


Stages in performing test of
significance


Stages in performing test of significance
• State the null hypothesis
• State the alternative hypothesis
• Accept or reject the null hypothesis
• Finally determine the p value


Stages in performing test of significance
• Null hypothesis
• It is a hypothesis of no difference between
the statistic of a sample and the parameter of the
population, or between the statistics of two
samples
• It nullifies the claim that the experimental
result is different from or better than the
one observed already

Stages in performing test of significance
State the alternative hypothesis
• Alternative hypothesis
• It is the hypothesis stating that the sample
result is different, i.e. larger or smaller than
the value of the population, or that the statistic of one
sample is different from that of the other


Stages in performing test of significance
Accept or reject the null hypothesis
• The null hypothesis is accepted or rejected
depending on whether the result falls in the
zone of acceptance or the zone of rejection
• If the result of a sample falls within the area of
mean ± 2 SE, the null hypothesis is
accepted.
• This area of the normal curve is called the zone of
acceptance for the null hypothesis

Stages in performing test of significance
• If the result of a sample falls beyond the
area of mean ± 2 SE, the
null hypothesis of no difference is rejected
and the alternative hypothesis is accepted
• This area of the normal curve is called the zone of
rejection for the null hypothesis


Stages in performing test of significance
Finally determine the p value
• The p value is determined using any of the
previously mentioned methods
• If p > 0.05 the difference is attributed to chance
and is not statistically significant, but if
• p < 0.05 the difference is unlikely to be due to chance alone, is attributed to some
external factor and is statistically significant

Types of error


Types of error
• While drawing conclusions in a study
we are likely to commit two types of
error.
– Type I error
– Type II error


Types of error
• Type I error
• This type of error occurs when we conclude that the difference
is significant when in fact there is no
real difference in the population, i.e. we
reject the null hypothesis when it is
true
• Its probability is denoted by α

Types of error
• Type II error
• This type of error occurs when we say that the difference is not
significant when in fact there is a real
difference between the populations, i.e.
the null hypothesis is not rejected
when it is actually false
• Its probability is denoted by β

Tests of significance for
large samples


Test of significance for large
samples
• These tests are used for sample sizes
greater than 30
• The test used is the Z test
• Z is the standard normal deviate and has
been discussed under normal distribution
$Z = \dfrac{\text{observation} - \text{mean}}{SD}$

Test of significance for large samples
• However, in the Z test the standard deviation is
replaced by the standard error
In the Z test, $Z = \dfrac{\text{observed difference}}{\text{standard error}}$
• The standard deviation measures
the variation within a sample
• The standard error measures the difference
in values occurring
– between a sample and the population
– between two samples of the same population

Test of significance for large samples
• Standard error used in Z test can be
– Standard error of mean
– Standard error of proportion
– Standard error of difference between 2
means
– Standard error of difference between 2
proportions


Test of significance for large samples
• If in the Z test Z > 2, i.e. if the observed
difference between the 2 means or
proportions is greater than 2 times the
standard error of difference,
• then p < 0.05 according to the given table

Test of significance for large samples
Z    1.6    2.0     2.3     2.6
p    0.1    0.05    0.02    0.01

Thus when Z > 2 the difference is unlikely to be due to chance
and may be due to the influence of some
external factor, i.e. the difference is
statistically significant
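
The Z to p table can be checked with a short Python sketch. It assumes a two tailed test and a standard normal distribution; the helper name two_tailed_p is illustrative and not part of the slides. The computed values are close to, but not exactly, the rounded figures in the table.

```python
import math

def two_tailed_p(z):
    """Two tailed p value for a standard normal deviate z."""
    # P(|Z| > z) equals erfc(|z| / sqrt(2)) for a standard normal variable
    return math.erfc(abs(z) / math.sqrt(2))

# Z values taken from the table in the slides
for z in (1.6, 2.0, 2.3, 2.6):
    print(f"Z = {z:.1f}   p = {two_tailed_p(z):.3f}")
```

The exact 5% cut-off is Z = 1.96; the slides round this to 2 for convenience.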

Standard error of mean
• Used for quantitative data
• The standard error of mean measures how far a
sample mean is likely to lie from the population
mean, and is given by
$SE_{\bar{x}} = \dfrac{SD \text{ of sample}}{\sqrt{n}}$
• The population mean is expected to lie within sample
mean ± 2 standard errors of mean

• This enables us to know whether the
sample mean is within the limits of the
population mean
Here $Z = \dfrac{\text{sample mean} - \text{population mean}}{SE \text{ of mean}}$


• In a random sample of 100 the mean blood
sugar is 80 mg% with SD 6 mg%. Within what
limits will the population mean lie? What can be
said about another sample whose mean is 82 mg%?
$SE = \dfrac{6}{\sqrt{100}} = \dfrac{6}{10} = 0.6$
• Thus the population mean will lie within 80 ± 2 × 0.6 =
78.8 to 81.2
• A sample with a mean of 82 mg% is not within the limits of the
population mean, so it does not seem to be
drawn from the same population
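
The blood sugar example can be reproduced with a minimal Python sketch. The function name standard_error_of_mean is an illustrative choice; the figures are those given in the example.

```python
import math

def standard_error_of_mean(sd, n):
    """Standard error of mean: sample SD divided by the square root of n."""
    return sd / math.sqrt(n)

sample_mean, sd, n = 80, 6, 100            # values from the example (mg%)
se = standard_error_of_mean(sd, n)

# Limits of the population mean: sample mean +/- 2 standard errors
lower = sample_mean - 2 * se
upper = sample_mean + 2 * se

# Z for the second sample, whose mean is 82 mg%
z = (82 - sample_mean) / se

print(f"SE = {se:.1f}, limits = {lower:.1f} to {upper:.1f}, Z = {z:.2f}")
```

Because Z for the second sample exceeds 2, the sketch agrees with the conclusion that it does not seem to come from the same population.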

Standard error of difference
between 2 means
• Used for quantitative data
• It measures the variation in the difference between the means of two
samples drawn from the same population
• It helps to assess the significance of the
difference obtained by 2 research workers
for the same investigation
$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{SD_1^2}{n_1} + \dfrac{SD_2^2}{n_2}}$

Standard error of difference between 2 means
• E.g. find the significance of the difference in
mean heights of 50 girls and 50 boys with the
following values
         Mean     SD
Girls    147.4    6.6
Boys     151.6    6.3

Standard error of difference between 2 means
$SE = \sqrt{\dfrac{(6.6)^2}{50} + \dfrac{(6.3)^2}{50}} = 1.29$

$Z = \dfrac{\text{observed difference}}{SE} = \dfrac{151.6 - 147.4}{1.29} = 3.26$

Standard error of difference between 2 means
• Since the Z value is more than 2, p will be less
than 0.05
• Thus the difference is statistically significant
and it can be concluded that boys are, on average,
taller than girls
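
The height example can be sketched in Python as follows. The function name se_diff_means is illustrative; the inputs are the means, SDs and sample sizes from the example.

```python
import math

def se_diff_means(sd1, n1, sd2, n2):
    """Standard error of the difference between two sample means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Girls: mean 147.4, SD 6.6, n 50; boys: mean 151.6, SD 6.3, n 50
se = se_diff_means(6.6, 50, 6.3, 50)
z = (151.6 - 147.4) / se

print(f"SE = {se:.2f}, Z = {z:.2f}")
```

Without rounding the standard error first, Z comes out marginally below the 3.26 quoted in the example; the conclusion is unchanged.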


Standard error of proportion
• In the case of qualitative data, where the character
remains the same but its frequency varies, we
express it as a proportion instead of a mean
• p is the proportion of individuals having the
character
• q is the proportion of individuals not having the
character
• p + q = 1, or 100 if expressed as a percentage

• Standard error of proportion is the unit which measures
variation in the proportion of a character from sample to
population
$SE \text{ of proportion} = \sqrt{\dfrac{p \times q}{n}}$
p = proportion of positive character
q = proportion of negative character
n = sample size
• Also, proportion of population = proportion of sample ± 2 SEP
• Thus one can determine whether the proportion of the
sample is within the limits of the population proportion

• The proportion of blood group B among Indians
is 30%. If in a sample of 100 individuals it is
25%, what is your conclusion about the group?
$SEP = \sqrt{\dfrac{p \times q}{n}} = \sqrt{\dfrac{25 \times 75}{100}} = 4.33$
$Z = \dfrac{\text{observed difference}}{SE} = \dfrac{30 - 25}{4.33} = 1.15$
• Since Z is < 2, p will be more than 0.05, thus the
difference is not significant.
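
A minimal Python sketch of the blood group example, working in percentages as the slides do. The function name se_proportion is illustrative.

```python
import math

def se_proportion(p, n):
    """Standard error of a proportion, with p given as a percentage."""
    q = 100 - p                      # percentage without the character
    return math.sqrt(p * q / n)

se = se_proportion(25, 100)          # sample: 25% blood group B, n = 100
z = (30 - 25) / se                   # population proportion is 30%

print(f"SEP = {se:.2f}, Z = {z:.2f}")
```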

Standard error of difference between 2 proportions
• Measures the difference in proportion of a
character from sample to sample
$SE(p_1 - p_2) = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$

Standard error of difference between 2 proportions
• If typhoid mortality in one sample of 100 is 20%
and in another sample of 100 is 30%, is this
difference in mortality rate significant?
• p1 = 20 : q1 = 80 : n1 = 100
• p2 = 30 : q2 = 70 : n2 = 100
• $SE(p_1 - p_2) = \sqrt{\dfrac{20 \times 80}{100} + \dfrac{30 \times 70}{100}} = 6.08$
• $Z = \dfrac{30 - 20}{6.08} = 1.64$
• Z < 2, so p > 0.05; thus the difference observed is not
significant
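
The typhoid mortality example can be sketched in Python, again in percentages. The two tailed p value is added here as an assumption about how the table lookup could be replaced by a calculation; the function names are illustrative.

```python
import math

def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of the difference between two percentages."""
    return math.sqrt(p1 * (100 - p1) / n1 + p2 * (100 - p2) / n2)

def two_tailed_p(z):
    """Two tailed p value for a standard normal deviate (assumed test)."""
    return math.erfc(abs(z) / math.sqrt(2))

se = se_diff_proportions(20, 100, 30, 100)
z = (30 - 20) / se
p = two_tailed_p(z)

print(f"SE = {se:.2f}, Z = {z:.2f}, p = {p:.2f}")
```

Since Z is below 2, the computed p value lies above 0.05, in line with the conclusion that the difference is not significant.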

Test of significance for
small samples


Test of significance for small
samples
• For samples smaller than 30 the Z
value will not follow the normal distribution
• Hence the Z test will not give the correct level
of significance.
• In such cases Student's t test is used
• It was given by W. S. Gosset, whose pen
name was Student

• There are two types of Student's t test
1. Unpaired t test
2. Paired t test


Unpaired t test
• Applied to unpaired data of observation
made on individuals of 2 separate groups
to find the significance of difference
between 2 means
• Sample size is less than 30
• e.g. difference in accuracy in an
impression using two different impression
materials

Unpaired t test
Steps in the unpaired t test are
• Calculate the means of the two samples
• Calculate the combined standard deviation
• Calculate the standard error of the difference between the means, which is given
by
$SE = SD \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$
• Calculate the observed difference between the means, $\bar{x}_1 - \bar{x}_2$
• Calculate the t value: $t = \dfrac{\text{observed difference}}{SE}$

Unpaired t test
• Determine the degrees of freedom, which for one sample is
one less than the number of observations
(n – 1)
• Here the combined degrees of freedom will be
(n1 – 1) + (n2 – 1)
• Refer to the t table and find the probability of the t
value corresponding to the degrees of freedom
• p < 0.05 indicates the difference is significant
• p > 0.05 indicates the difference is not significant

Unpaired t test
• In a nutritional study 13 children in group
A are given usual diet along with vitamin A
and vitamin D while 12 children in group B
take the usual diet.
• The gain in weight in pounds for both
groups after 12 months is shown in the
table
• Are vitamins A and D responsible for the gain in
weight?

Group A    Group B
5          1
3          3
4          2
3          4
2          2
6          1
3          3
2          4
3          3
6          2
7          2
5          3
3

Unpaired t test
• Mean of group A = 4
• Mean of group B = 2.5
• Combined SD = 1.37
• SE = 0.548
• $t = \dfrac{\text{observed difference}}{SE} = \dfrac{4 - 2.5}{0.548} = 2.74$

Unpaired t test
• Combined degrees of freedom = n1 + n2 – 2
= 13 + 12 – 2 = 23
• The p value is checked corresponding to the t
value at 23 d.f. from the t table
• It is < 0.02
• Thus the difference is statistically significant
• and is attributed to the role of vitamins A and D
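
The steps of the unpaired t test can be sketched in Python for the nutrition example. The assignment of values to Group A and Group B is an assumption read from the table (alternating columns), chosen because it reproduces the stated means of 4 and 2.5; the function name unpaired_t is illustrative. The p value still has to be read from a t table.

```python
import math
from statistics import mean

# Weight gain in pounds (column assignment reconstructed from the table)
group_a = [5, 3, 4, 3, 2, 6, 3, 2, 3, 6, 7, 5, 3]    # usual diet + vitamins A and D
group_b = [1, 3, 2, 4, 2, 1, 3, 4, 3, 2, 2, 3]       # usual diet only

def unpaired_t(a, b):
    """Return t, degrees of freedom, combined SD and SE for two groups."""
    n1, n2 = len(a), len(b)
    m1, m2 = mean(a), mean(b)
    # Sum of squared deviations of each group about its own mean
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    df = (n1 - 1) + (n2 - 1)
    combined_sd = math.sqrt((ss1 + ss2) / df)
    se = combined_sd * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / se, df, combined_sd, se

t, df, combined_sd, se = unpaired_t(group_a, group_b)
print(f"SD = {combined_sd:.2f}, SE = {se:.3f}, t = {t:.2f}, d.f. = {df}")
```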

Paired t test
• It is applied to paired observations
from one sample only.
• Used for samples smaller than 30
• Each individual gives a pair of observations,
e.g. observations before and after taking a
drug
• The steps involved are

Paired t test
• Calculate the difference in each pair of
observations, i.e. before and after: $y = x_1 - x_2$
• Calculate the mean of these differences, $\bar{y}$
• Calculate the SD of the differences
• Calculate $SE = \dfrac{SD}{\sqrt{n}}$
• Determine $t = \dfrac{\bar{y}}{SE}$

Paired t test
• Determine the degree of freedom
• Since there is one sample df = n-1
• Refer to table and find the probability of
the t value corresponding to degree of
freedom
• P< 0.05 states difference is significant
• P> 0.05 states difference is not significant

Paired t test
• E.g. systolic BP of 9 normal individuals
before and after injection of a hypotensive
drug is given in the table.
• Does the drug lower the BP?

BP before giving drug (x1)    BP after giving drug (x2)    Difference x1 – x2 = y
122                           120                          2
121                           118                          3
120                           115                          5
115                           110                          5
126                           122                          4
130                           130                          0
120                           116                          4
125                           124                          1
128                           125                          3

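Paired t test
• Mean of differences $\bar{y} = \dfrac{\sum y}{n} = \dfrac{27}{9} = 3$
• $SD = \sqrt{\dfrac{\sum (y - \bar{y})^2}{n - 1}} = 1.73$
• $SE = \dfrac{SD}{\sqrt{n}} = \dfrac{1.73}{\sqrt{9}} = 0.58$
• $t = \dfrac{\bar{y}}{SE} = \dfrac{3}{0.58} = 5.17$
• Degrees of freedom = n – 1 = 9 – 1 = 8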

Paired t test
• The p value corresponding to t = 5.17 and d.f.
8 is < 0.001
• Thus the difference is highly significant
• Thus the decrease in BP is attributed to the drug
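
The paired t test for the BP example can be sketched in Python using only the standard library. The variable names are illustrative. Working without intermediate rounding gives a t value of about 5.20 rather than 5.17, because the slides round the SE to 0.58 before dividing; the conclusion is the same.

```python
import math
from statistics import mean, stdev

# Systolic BP before (x1) and after (x2) the drug, from the table
before = [122, 121, 120, 115, 126, 130, 120, 125, 128]
after = [120, 118, 115, 110, 122, 130, 116, 124, 125]

# Difference for each individual, y = x1 - x2
diffs = [x1 - x2 for x1, x2 in zip(before, after)]

n = len(diffs)
mean_diff = mean(diffs)
sd = stdev(diffs)                 # sample SD, divisor n - 1
se = sd / math.sqrt(n)
t = mean_diff / se
df = n - 1

print(f"mean = {mean_diff}, SD = {sd:.2f}, SE = {se:.2f}, t = {t:.2f}, d.f. = {df}")
```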


Chi square test


Chi square test
• The chi square test, unlike the Z and t tests, is a non
parametric test
• The test involves calculation of a quantity
called chi square.
• Chi square is denoted by $\chi^2$
• It was developed by Karl Pearson


Chi square test
• The most important application of chi
square test in medical statistics are
– Test of proportion
– Test of association
– Test of goodness of fit


Chi square test
• Test of proportion
– Used as an alternate test to find the significance
of difference in 2 or more than 2 proportions

• Test of association
– To measure the probability of association
between 2 discrete attributes, e.g. smoking and
cancer

• Test of goodness of fit
– Tests whether the observed values of a character
differ from the expected value by chance or due to
play of some external factor

Chi square test
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
• $\chi^2$ denotes chi square
• O = observed value
• E = expected value

Chi square test
Steps in the chi square test
• Determine the chi square value
• Find the degrees of freedom
• Refer to the chi square table to find the
probability value corresponding to the
degrees of freedom

Chi square test
• Let us consider the following example
• We are making a field trial of 2 vaccines
• The results of the field trial are
Vaccine    Attacked    Not attacked    Total    Attack rate
A          22          68              90       24.4%
B          14          72              86       16.3%
Total      36          140             176

Chi square test
• Vaccine B seems to be superior to
Vaccine A
• We perform Chi Square test to verify if the
vaccine B is superior to vaccine A or is it
merely due to chance
• State the null hypothesis
• It states that the vaccines have equal
efficacy

Chi square test
Determining the chi square value
• Find the total attack and non attack rates
• Total attack rate = $\dfrac{36}{176} = 0.204$
• Total non attack rate = $\dfrac{140}{176} = 0.795$

Chi square test
Vaccine       Attacked                     Not attacked
A (n=90)      O = 22                       O = 68
              E = 0.204 × 90 = 18.36       E = 0.795 × 90 = 71.55
              O – E = +3.64                O – E = –3.55
B (n=86)      O = 14                       O = 72
              E = 0.204 × 86 = 17.54       E = 0.795 × 86 = 68.37
              O – E = –3.54                O – E = +3.63

Chi square test
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
$= \dfrac{(3.64)^2}{18.36} + \dfrac{(3.55)^2}{71.55} + \dfrac{(3.54)^2}{17.54} + \dfrac{(3.63)^2}{68.37}$
= 0.72 + 0.17 + 0.71 + 0.19
= 1.79

Chi square test
• Find the Degree of Freedom = (c-1) (r-1)
• c = number of Columns
• r = number of Rows
• d.f. = (2-1)(2-1) = 1


Chi square test
• Find the p value
• On referring to the chi square table with one
degree of freedom, the calculated value is below the tabulated
value of 3.84 for p = 0.05, so the p value is more
than 0.05.
• Hence the difference is not statistically
significant and the null hypothesis of no
difference between vaccines is accepted.
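
The vaccine example can be sketched in Python. The function name chi_square is illustrative. Expected values are computed from the row and column totals without rounding, so chi square comes out at about 1.80 rather than the 1.79 obtained from the rounded rates in the slides. The p value line uses a shortcut that is valid only for 1 degree of freedom and is included as an assumption in place of the table lookup.

```python
import math

# Observed counts: rows = vaccine A, vaccine B; columns = attacked, not attacked
observed = [[22, 68],
            [14, 72]]

def chi_square(table):
    """Return the chi square value and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            # Expected count under the null hypothesis of equal efficacy
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

chi2, df = chi_square(observed)

# For 1 degree of freedom only: P(chi square > x) = erfc(sqrt(x / 2))
p = math.erfc(math.sqrt(chi2 / 2))

print(f"chi square = {chi2:.2f}, d.f. = {df}, p = {p:.2f}")
```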


ANOVA


ANOVA
Analysis of variance
• Investigations may not always be confined
to comparison of 2 samples only
• e.g. we might like to compare the
difference in vertical dimension obtained
using 3 or more methods such as phonetics,
swallowing and Niswonger's method
• In such cases, where more than 2 samples
are compared, ANOVA can be used

ANOVA
• Also, when measurements are influenced by
several factors playing their role, e.g. factors
affecting retention of a denture, ANOVA can
be used.
• ANOVA helps to decide which factors are
more important
• Requirements
– Data for each group are assumed to be
independent and normally distributed
– Sampling should be at random

ANOVA
• One way ANOVA
– Where only one factor affects the result
between the groups

• Two way ANOVA
– Where we have 2 factors that affect the
result or outcome

• Multi way ANOVA
– Three or more factors affect the result or
outcomes between groups

ANOVA
F test
$F = \dfrac{\text{Mean square between samples}}{\text{Mean square within samples}}$
• F = variance ratio
• Each mean square is entered in
the analysis of variance table and is obtained by dividing
the sum of squares by its degrees
of freedom (both of which are calculated)

ANOVA
• Mean square between samples
– It denotes the variation of the sample
means of all groups involved in the study (A, B,
C etc.) about the overall mean

• Mean square within samples
– It denotes the variation of the individual observations
within each sample about that sample's own mean

• The greater the mean square between samples relative to the
mean square within samples, the greater the
difference between the samples

ANOVA
• The F value observed from the study is
compared to the theoretical F values obtained
from the tables at the 1% and 5% significance levels.
• The results are then interpreted.
• If the observed value is more than the theoretical
value at 1%, the difference is highly significant.
• If the observed value is less than the
theoretical value at 5%, it is not significant.
• If the observed value lies between the theoretical values at 5% and
1%, it is statistically significant at the 5% level.

Bibliography
• Mahajan BK: Methods in Biostatistics. 6th edition
• Park: Textbook of Social and Preventive
Medicine. 18th edition
• Smith F Gao, Smith J.E.: Clinical Research
• Sundar Rao, Richard: An Introduction to
Biostatistics. 3rd edition
• Baride, Kulkarni, Muzumdar: Manual of
Biostatistics. 1st edition
• Soben Peter: Textbook of Preventive and
Community Dentistry. 2nd edition



ANOVA
Steps in ANOVA
• Calculate sum of squares for individuals
• Calculate correction term
• Calculate total sum of squares
• Calculate sum of squares between groups
• Calculate the sum of squares within groups
• Calculate the degrees of freedom
• Obtain the mean squares for the ANOVA table
• Calculate F value
• Compare with the theoretical value

ANOVA
• In a study conducted children were divided
into 3 groups and fed with different diets
• The hemoglobin concentration was
measured after a month and recorded
• By applying ANOVA we can find whether
the groups differ
significantly.


Group 1          Group 2          Group 3
11.6             11.2             9.8
10.3             8.9              9.7
10.0             9.2              11.5
11.5             8.8              11.6
11.8             8.4              10.8
11.8             9.1              9.1
12.1             6.3              10.5
10.8             9.3              10.0
11.9             7.8              12.4
10.7             8.8              10.7
11.5             10.0
                 9.7
n = 11           n = 12           n = 10
Total = 124.0    Total = 107.5    Total = 106.1
Mean = 11.27     Mean = 8.96      Mean = 10.61

ANOVA
• State the null hypothesis
– The three groups are from the same
population and do not show any difference

• Calculate sum of squares for individuals
– $\sum x^2 = 11.6^2 + 10.3^2 + \dots + 10.7^2 = 3516.32$

• Calculate correction term
– $\dfrac{(\sum x)^2}{n} = \dfrac{(337.6)^2}{33} = 3453.75$


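ANOVA
• Calculate total sum of squares =
Sum of squares for individuals – Correction term
= 3516.32 – 3453.75 = 62.57

• Calculate sum of squares between groups
$= \sum \dfrac{t_i^2}{n_i} - \text{Correction term}$
$= \dfrac{t_1^2}{n_1} + \dfrac{t_2^2}{n_2} + \dfrac{t_3^2}{n_3} - \text{Correction term}$
where t1, t2, t3 are the group totals and n1, n2, n3 the group sizes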

ANOVA
$= \dfrac{124.0^2}{11} + \dfrac{107.5^2}{12} + \dfrac{106.1^2}{10} - 3453.75 = 32.81$

• Calculate the sum of squares within
groups
= Total sum of squares – sum of squares
between groups
= 62.57 – 32.81 = 29.76

• Calculate the degrees of freedom
– It is one less than the number of items

ANOVA
– Degrees of freedom for total sum of squares =
33 – 1 = 32
– Degrees of freedom for sum of squares
between groups = 3 – 1 = 2
– Degrees of freedom for sum of squares within
groups = 32 – 2 = 30

• Obtain the mean squares for the ANOVA
table by dividing each sum of squares by its degrees
of freedom

ANOVA
From the table we get
– Mean square between groups = 32.81 / 2 = 16.405
– Mean square within groups = 29.76 / 30 = 0.992

• Calculate F value, i.e. variance ratio
$F = \dfrac{\text{Mean square between samples}}{\text{Mean square within samples}} = \dfrac{16.405}{0.992} = 16.54$

ANOVA

• Comparison with theoretical value
– The theoretical value of F at 1% and 5% for the given
degrees of freedom is obtained from the F table and
compared with the F value obtained
– In the example discussed the degrees of freedom are n1 =
2 and n2 = 30
– The F value at 5% is 3.32 and at 1% is 5.39
– The observed value 16.54 is greater than the 1%
value. Therefore we can say that the difference
between groups is significant at the 1% level
– Thus the probability of observing such a variance ratio by chance is less
than 0.01.
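
The ANOVA steps can be sketched in Python for the haemoglobin example. The assignment of values to the three groups is an assumption reconstructed from the table; it reproduces the stated totals of 124.0, 107.5 and 106.1. The function name one_way_anova is illustrative, and the theoretical F values are still taken from the F table.

```python
# Haemoglobin values by diet group (column assignment reconstructed from the table)
group_1 = [11.6, 10.3, 10.0, 11.5, 11.8, 11.8, 12.1, 10.8, 11.9, 10.7, 11.5]
group_2 = [11.2, 8.9, 9.2, 8.8, 8.4, 9.1, 6.3, 9.3, 7.8, 8.8, 10.0, 9.7]
group_3 = [9.8, 9.7, 11.5, 11.6, 10.8, 9.1, 10.5, 10.0, 12.4, 10.7]

def one_way_anova(groups):
    """Follow the sum-of-squares steps and return the pieces of the ANOVA table."""
    values = [x for g in groups for x in g]
    n = len(values)
    correction = sum(values) ** 2 / n                       # correction term
    ss_total = sum(x * x for x in values) - correction      # total sum of squares
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_between
    df_between = len(groups) - 1
    df_within = (n - 1) - df_between
    ms_between = ss_between / df_between                    # mean square = SS / d.f.
    ms_within = ss_within / df_within
    return {
        "ss_total": ss_total, "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

result = one_way_anova([group_1, group_2, group_3])
for name, value in result.items():
    print(f"{name} = {value:.3f}")
```

The within-groups sum of squares is obtained by subtraction, as in the slides, rather than by summing squared deviations inside each group; both routes give the same figure.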

Thank you