2. 1. Statistics - Introduction
   2. Scope of statistics
   3. Normal distribution
   4. Central tendency
      1. Arithmetic mean
      2. Median
      3. Mode
   5. Dispersion
      1. Standard deviation (SD)
   6. Standard error of mean (SEM)
   7. Probability
   8. Test for significance
      1. Student ‘t’ test
      2. Chi-square test
3. “Statistics is a science which deals with the
collection, classification and tabulation of
numerical facts as the basis for explanation,
description and comparison of phenomena.”
Here, the data are numbers which contain
information.
4. Industries
Medical Science
Agricultural biology
Social Science
Planning and economics
Space research
5. When many independent random factors
act in an additive manner to create
variability, the data set follows a bell-shaped
distribution called the normal
distribution.
Mathematicians De Moivre and Laplace
used this distribution in the 1700s.
In the early 1800s, the German
mathematician and physicist Carl Friedrich Gauss
used it to analyze astronomical data, and
it became known as the Gaussian distribution.
7. When the maximum frequency of a
distribution occurs at the centre of the
curve and the remaining values are evenly
distributed around it, the data follow a normal
distribution.
The normal distribution is described by its
mean (µ) and standard deviation (σ).
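As a rough illustration of how µ and σ describe the curve, the sketch below evaluates the standard formula for the normal density. The function name `normal_pdf` and the example values (µ = 170, σ = 10) are assumptions chosen for illustration; they do not come from the slides.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical example: the curve peaks at the mean and is symmetric around it.
peak = normal_pdf(170, mu=170, sigma=10)
left = normal_pdf(160, mu=170, sigma=10)
right = normal_pdf(180, mu=170, sigma=10)
print(peak, left, right)
```

The density is highest at the mean and takes equal values at equal distances on either side of it, which is the bell shape described above.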
9. It is defined as the sum of all the variates
of a variable divided by the total number
of items in a sample.
It is expressed by the symbol $\bar{X}$:
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
where $\bar{X}$ = arithmetic mean
n = total number of items in the sample
$X_i$ = the individual variates of the variable
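The definition can be sketched in a few lines of Python. The helper name `arithmetic_mean` is an assumption, and the data are the frog weights that appear in the median example of this deck, reused here only for illustration.

```python
def arithmetic_mean(values):
    """Sum of all variates divided by the number of items."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)

# Frog weights in grams (borrowed from the median example).
frog_weights_g = [75, 66, 55, 68, 71, 78, 72]
mean_weight = arithmetic_mean(frog_weights_g)
print(round(mean_weight, 2))  # prints 69.29
```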
11. It is defined as the nth root of the product
of the n items in ungrouped data.
When a percentage increase or decrease is
expressed over a period of time, the mean
percentage is found by using the geometric
mean.
If X1, X2, X3, …, Xn are the n variates of the
variable X, then
Geometric Mean = $\sqrt[n]{X_1 \times X_2 \times X_3 \times \cdots \times X_n}$
12. Example:
Following administration of a drug in a
laboratory mammal, the blood glucose
level increased by 5% in the first hour, by
8% in the second hour and by 77% in the
third hour. What is the mean percentage
increase during the observation period?
Here, we assume the glucose level at
the beginning of every hour to be 100 mg%.
Then the level of blood sugar is:
13. At the end of hour 1 = 100 + 5 = 105 mg%
At the end of hour 2 = 100 + 8 = 108 mg%
At the end of hour 3 = 100 + 77 = 177 mg%
So, geometric mean = $\sqrt[3]{105 \times 108 \times 177}$
= 126.14
So the mean percentage increase
= 126.14 - 100 = 26.14%
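The worked example can be checked with a short sketch. The function name `geometric_mean` is an assumption for illustration; it takes the nth root of the product, as in the definition.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and positive")
    return math.prod(values) ** (1.0 / len(values))

# Glucose level at the end of each hour, taking 100 mg% as the hourly baseline.
levels = [100 + 5, 100 + 8, 100 + 77]
gm = geometric_mean(levels)
mean_increase = gm - 100
print(round(gm, 2), round(mean_increase, 2))  # prints 126.14 26.14
```

Python 3.8 or later also provides `statistics.geometric_mean`, which could be used instead of the hand-written helper.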
14. It is the central value of all observations
arranged from the lowest to the highest.
Example:
(1) For Odd number of variates
Weight of frog in gram. n = 7
75, 66, 55, 68, 71, 78, 72.
Data in ascending order of value:
55, 66, 68, 71, 72, 75, 78.
Here, Median is 71.
15. Example:
(2) For Even number of variates
Height of students in cm, n = 8
165, 175, 161, 155, 169, 171, 152, 166.
Data in ascending order of value:
152, 155, 161, 165, 166, 169, 171, 175.
Here, the median is the mean of the two middle values: (165 + 166)/2 = 165.5
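Both median examples can be reproduced with Python's standard `statistics` module, which sorts the data and averages the two middle values when n is even. The variable names are illustrative only.

```python
import statistics

frog_weights_g = [75, 66, 55, 68, 71, 78, 72]                   # n = 7 (odd)
student_heights_cm = [165, 175, 161, 155, 169, 171, 152, 166]   # n = 8 (even)

median_odd = statistics.median(frog_weights_g)
median_even = statistics.median(student_heights_cm)
print(median_odd, median_even)  # prints 71 165.5
```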
16. It is defined as the value which occurs
most frequently in the sample.
Example
Weight of tablet in mg:
52, 48, 50, 51, 50, 51, 50, 49.
In the above data, 50 occurs 3 times
So mode of above data = 50 mg
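A minimal sketch of the same count, using `collections.Counter` from the standard library. The variable names are assumptions for illustration.

```python
from collections import Counter

tablet_weights_mg = [52, 48, 50, 51, 50, 51, 50, 49]

# Count how often each value occurs and take the most frequent one.
counts = Counter(tablet_weights_mg)
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)  # prints 50 3
```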
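18. It is defined as the square root of the
arithmetic mean of the squared
deviations of the various items from the
arithmetic mean.
It is denoted by SD.
It is calculated by the following formula:
SD = $\sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}}$
where $\bar{X}$ is the arithmetic mean and n is the number of items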
21. Text Book : Basic Concepts and Methodology for
the Health Sciences
Variance:
It measures dispersion relative to the scatter of the values
about their mean.
a) Sample variance ($S^2$):
$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $\bar{x}$ is the sample mean
22. b) Population variance ($\sigma^2$):
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$, where $\mu$ is the population mean
Example (slide no. 20):
Variance = 1160
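The difference between the two variance formulas is only the denominator (n - 1 for a sample, N for a population). The sketch below shows both. The function names are assumptions, and the data are the tablet weights from the mode example, reused here for illustration because the data behind the slide's own variance example are not reproduced in this transcript.

```python
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    if n < 1:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

tablet_weights_mg = [52, 48, 50, 51, 50, 51, 50, 49]
s2 = sample_variance(tablet_weights_mg)
sigma2 = population_variance(tablet_weights_mg)
print(round(s2, 4), round(sigma2, 4))
```

The standard deviation in each case is the square root of the corresponding variance. The standard library functions `statistics.variance` and `statistics.pvariance` compute the same two quantities.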
23. With a small sample size, the arithmetic
mean is only an approximation of the
true mean of the whole population, and is
therefore subject to error.
In such cases the error of the observed
mean is calculated.
The SE allows us to find the range in
which the true mean would lie.
It gives an estimate of the extent to which
the mean will vary if the experiment is
repeated.
24. SE = $\frac{SD}{\sqrt{n}}$
SE of the previous example:
SE = 13.05
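A minimal sketch of the SE calculation, SD divided by the square root of n. It assumes the sample SD (n - 1 denominator), since the transcript does not show which form the slide used. The tablet weights from the mode example are reused for illustration, and the function name is hypothetical.

```python
import math
import statistics

def standard_error_of_mean(values):
    """SE = SD / sqrt(n), using the sample SD (n - 1 denominator) as an assumption."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    return statistics.stdev(values) / math.sqrt(n)

tablet_weights_mg = [52, 48, 50, 51, 50, 51, 50, 49]
mean_weight = statistics.mean(tablet_weights_mg)
se = standard_error_of_mean(tablet_weights_mg)
print(round(mean_weight, 3), round(se, 3))
```

A larger sample shrinks the SE for the same SD, which is why the observed mean of a small sample is more uncertain.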
25. The term probability means the “chance” or
“likelihood” of the occurrence of an
event.
It is denoted by the symbol ‘P’.
$P = \frac{m}{N}$
where m = number of favorable events
N = total number of events
26. Test of Significance
In scientific research, a sample investigation
produces results which are helpful in
making decisions about a population.
We are interested in comparing the
characteristics of two or more groups.
Two samples drawn from the same
population will show some difference.
Whether such a difference is real or due to chance is assessed by a “test of
significance”.
27. Procedure for Test
of Significance
1. Laying down the hypothesis:
a) Null hypothesis: the hypothesis which is actually tested for
acceptance.
b) Alternative hypothesis: the hypothesis which is complementary to
the null hypothesis.
E.g. the average gene length is 170 kbp
H0: µ = 170
H1: µ ≠ 170
i.e., µ > 170 or µ < 170
2. Two types of error in testing of hypothesis
a) Type I error: rejection of a null hypothesis which is true
b) Type II error: acceptance of a null hypothesis which is false
28. 3. Level of significance
Minimize Type I & II error
Level of significance is denoted by α
α is conventionally chosen as 0.05 (moderate precision) or 0.01
(high precision)
In most biostatistical tests α is fixed at 5%, which means the probability of
accepting a true hypothesis is 95%
4. One & two tailed tests of hypothesis
In a test the area under probability curve is divided into
Acceptance region
Critical/ rejection region
29. Types of test of
Significance
Two types of test are used in the interpretation of
results.
(1) Parametric test:-
It assumes that the data follow a normal distribution.
It includes: Student’s t-test
Analysis of variance(ANOVA)
Regression
Correlation
Z- test
30. Test of Significance
(2) Non-Parametric test:-
It is used when the sample data do
not follow a normal distribution.
It includes: Chi-squared test
Wilcoxon Signed-rank test
Kruskal-Wallis test
31. Student ‘t’ test:
This test is applied to assess the statistical significance
of the difference between two independently drawn sample
means obtained from two series of data, with the
assumption that the two means are from normally
distributed populations with no significant difference in variation.
t = (difference of means of two samples)/(standard error of
difference)
Standard error of difference: $S_d = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$
$t = \frac{|\bar{X}_1 - \bar{X}_2|}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$
Degrees of freedom = $n_1 + n_2 - 2$
32. Ex. The following data relate to the disintegration time (DT) of
chloroquine tablets made using the diluents lactose monohydrate (LM) and
dibasic calcium phosphate (DCP). Determine whether the two
means are significantly different.
            Lactose monohydrate   DCP
n           30                    35
mean        32                    38
variance    9.62                  14.23
Null hypothesis: H0: There is no significant difference
between the mean DT of chloroquine tablets made with
LM and DCP
$S_d = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} = \sqrt{\frac{9.62}{30} + \frac{14.23}{35}} = \sqrt{0.73} = 0.85$
33. Difference between means = 38 - 32 = 6
$t = \frac{|\bar{X}_1 - \bar{X}_2|}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$
= |32 - 38| / √{(9.62/30) + (14.23/35)}
= 6/0.85 = 7.06
Degrees of freedom = (n1 + n2 - 2) = (30 + 35 - 2) = 63
Conclusion:
Calculated value of t (7.06) > tabulated value of t for 63 d.f. (at
1% = 2.66)
So the two means are significantly different.
So the null hypothesis is rejected at p = 0.01.
The difference between the two sample means is a real
difference because it is significant even at the 1% level.
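The calculation can be sketched from the summary statistics alone. The function name `t_from_summary` is an assumption, and the critical value 2.66 is the tabulated figure quoted on the slide rather than one computed here. Without intermediate rounding the statistic comes out at about 7.04, slightly below the 7.06 obtained by rounding the standard error to 0.85. The conclusion is unchanged.

```python
import math

def t_from_summary(mean1, var1, n1, mean2, var2, n2):
    """Return (standard error of difference, t, degrees of freedom)."""
    se_diff = math.sqrt(var1 / n1 + var2 / n2)
    t = abs(mean1 - mean2) / se_diff
    df = n1 + n2 - 2
    return se_diff, t, df

# Lactose monohydrate: mean 32, variance 9.62, n 30; DCP: mean 38, variance 14.23, n 35.
se_diff, t, df = t_from_summary(32, 9.62, 30, 38, 14.23, 35)

T_CRITICAL_1PCT = 2.66  # tabulated value quoted on the slide for 63 d.f.
reject_null = t > T_CRITICAL_1PCT
print(round(se_diff, 2), round(t, 2), df, reject_null)  # prints 0.85 7.04 63 True
```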
34. Chi-square test:-
In biological research, apart from quantitative characters, one has
to deal with qualitative data like flower color or seed color.
Results of breeding experiments and genetic analysis are
evaluated with the chi-square test.
The quantity χ² describes the magnitude of the difference between
the observed and the expected frequency.
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
fo – observed frequency
fe – expected frequency
35. Determination of the value of χ²
1. Calculate the expected frequency (fe)
2. Find the difference between the observed
frequency (fo) and the expected frequency (fe)
3. Square the value of (fo - fe), i.e. (fo - fe)²
4. Divide each squared value by its fe and sum the results to obtain
∑(fo - fe)²/fe
5. Compare the calculated value of χ² with the table
value for the given degrees of freedom (d.f.)
d.f. = (r - 1)(c - 1)
where r = no. of rows in the table
c = no. of columns in the table
36. Example of the χ² test
In the F2 generation, Mendel obtained 621 tall plants and 187
dwarf plants out of a total of 808. Test whether these
two types of plants are in accordance with the
Mendelian monohybrid ratio of 3:1 or whether they deviate
from the ratio.
Solution:
                           Tall plants   Dwarf plants   Total
Observed frequency (fo)    621           187            808
Expected frequency (fe)    606           202            808
Deviation (fo - fe)        15            -15
37. Formula applied
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
= (15)²/606 + (-15)²/202
= 225/606 + 225/202
= 0.3713 + 1.1139
= 1.4852
Tabulated value is 3.84 at the 5% level of probability
for d.f. = 2 - 1 = 1
Since 1.4852 < 3.84, the difference between the observed and expected
frequencies is not significant.
Hence the null hypothesis is not rejected.
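The goodness-of-fit calculation for the 3:1 ratio can be sketched as follows. The function name `chi_square` is an assumption, and the critical value 3.84 is the tabulated figure quoted above rather than one computed here.

```python
def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

observed = [621, 187]                    # tall, dwarf
total = sum(observed)                    # 808
ratio = [3, 1]                           # Mendelian monohybrid ratio
expected = [total * r / sum(ratio) for r in ratio]  # 606.0, 202.0

chi2 = chi_square(observed, expected)
CHI2_CRITICAL_5PCT = 3.84  # tabulated value for 1 d.f., as quoted above
significant = chi2 > CHI2_CRITICAL_5PCT
print(expected, round(chi2, 3), significant)
```

Summing the unrounded terms gives about 1.485, in line with the 1.4852 obtained above from the rounded terms, so the deviation from 3:1 is not significant at the 5% level.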
38. Applications of the χ² test
1. To test the goodness of fit
2. To test the independence of attributes
3. To test the homogeneity of independent estimates of
the population variance
4. To detect linkage