1. Biostatistics
Dr. Arshad Sabir A.P
2009
Session-I
12/10/12 1
2. Biostatistics
Statistics: The science of figures. It is concerned
with the techniques or methods of collection,
classification, summarization and interpretation of
data, drawing inferences, testing hypotheses
and making recommendations.
Biostatistics: When the tools of statistics are applied
to data derived from biological sciences such as
medicine, the discipline is known as biostatistics.
Health statistics--Medical Statistics--Vital statistics
3. Types of Biostatistics
1. Descriptive statistics (deductive statistics):
Merely describe, organize or summarize data.
They refer only to the actual data available.
Examples: blood pressure pattern of class
students, disease prevalence in the
community, a case report.
2. Inferential statistics: Involve deriving
inferences beyond the actual data, e.g. the
correlation of B.P with the weights of the
students, if any. They involve inductive reasoning,
like estimating the B.P of the whole class by assessing
the B.P of a sample.
4. CHARACTERISTICS OF STATISTICS
1. Statistics are aggregates of facts.
2. Statistics are affected by multiple factors/variables.
3. Statistics deal with facts which are numerically
measurable and expressible.
4. Statistics are measurable with a degree of accuracy.
5. Statistics are comparable and are capable of further
mathematical manipulation.
6. Statistics are collected with some objective.
7. Statistical results are true only on average or in the long
run, not in a strict sense (sample-based estimates).
8. Statistics provide only a tool for analysis and cannot
change actuality or true values.
5. Why do we need statistics?
• Any science demands precision for its
development, and so does medical science.
• Statistics supports clarity of judgment, right assessment and correct
decision making.
• For precision, facts, observations or
measurements have to be expressed in figures.
• When you can measure what you are speaking
about and express it in numbers, you know something
about it; otherwise your knowledge is of a meagre and
unsatisfactory kind. (Lord Kelvin)
6. Why Biostatistics?
• Everything in medicine, be it research,
diagnosis, treatment or public health,
depends upon counting and measurement.
Examples: testing a hypothesis, spleen enlargement, high
and low B.P, efficacy of a treatment or the mortality
pattern of a population.
• In nature, the heights and weights of people, the action
of drugs etc. vary. Whether the extent of this variability in
an attribute is natural/normal or
not (due to the play of an external factor) is learnt
by studying statistics as a science.
7. Variability
Biological data (quantitative or qualitative) are highly
variable: height, weight, Hb, IQ, behavior and the effect of the
same drug in different patients etc. Variability is
a normal characteristic.
Types of variability:
1. Biological variability
2. Real variability
3. Experimental Variability
8. How to measure?
VARIABLE: A measurable quantity which varies from
one individual or object to another is called a variable.
• It is the characteristic or property of a person, object
or phenomenon which can take more than one value.
• It is a characteristic that takes on different values in
different persons, places or things.
• A quantity that varies within limits.
CONSTANT: A quantity that does not vary, like "g = 9.8 m/s²" or
"π = 3.14". Constants do not require any statistical
study.
For a given distribution, its summary values (mean,
median, mode, range, MD, SD, SE and correlation
coefficient) are also constants.
9. Uses of Biostatistics in medical sciences
1. To define limits of normality, e.g. weight, B.P,
gender, pulse rate.
2. To compare certain attributes of two
different populations: is the difference normal /
natural (by chance), or is it due to the play of some external
factor?
3. To find whether the difference between the efficacy of two drugs or
vaccines is by chance or otherwise.
4. To study cause-and-effect relationships in disease
causation (e.g. obesity and CHD).
5. To establish the signs and symptoms of diseases
(e.g. fever, not cough, is significantly associated with typhoid fever).
10. Biostatistics as science of figure (Public health)
1. What are the leading causes of death?
2. What are the common health problems?
3. Is a particular disease decreasing or
increasing?
4. How severe is a disease?
5. How does a disease affect other persons?
6. Which are the high-risk groups, conditions and locations?
7. What is the productive power of a certain population?
8. What could be the health needs of a certain community?
9. What is the health-seeking attitude of a population?
10. How successful is a health program?
11. Basic Biostatistical concepts & terms
DATA:
• A Collection of facts and figures
• A set of values recorded on one or
more observational units
• Any information as a fact or figure.
• Numerical facts relating to any field
of study.
• Data is a medium for expression of a
variable
12. Data types
• Raw data: First-hand data as collected,
without any treatment. A haphazard mass
of accumulated facts.
• Processed data: Data after some
mathematical or statistical treatment has been
applied to it.
Other types (NOIR: Nominal, Ordinal, Interval, Ratio):
Qualitative / Categorical
Quantitative / Numeric
13. Important concepts
Observation: An event and its measurement, like height
(event) and its measurement (5.6 feet), or gender (M/F).
Observational unit: The source that gives the observations,
such as persons, hospitals, patients.
Population: The entire group of people or study
elements (persons, things or measurements) in which
we have an interest at a particular time, like all women of
reproductive age, serum cholesterol levels, Hb% etc.
A summary value of a population is called a parameter.
Sample: A subset of the population which comes under
study. A summary value of a sample is called a statistic.
14. How to describe a Distribution?
• Measures of Central Tendency
Mean
Median
Mode
• Measures of Dispersion
Range
Variance (mean of squared deviations)
Standard Deviation
Coefficient of Variation (CV)
15. Mean
Sum of all values (Σ) divided by the total number of
observations. It is denoted by x̄ (sample) or µ (population).
• Advantages
– Easy to calculate
– Uses every observation, so it contains more information
– Amenable to most statistical treatments
• Disadvantages
– Influenced by extreme values
– May not convey a proper sense, e.g. the mean number of
children may turn out to be 5.77
16. Calculation of Mean
Average income of college office staff
1. 10,000
2. 20,000
3. 15,000
4. 11,000
5. 16,000
6. 17,000
7. 23,000
8. 24,000
9. 13,000
10. 9,000
SUM = 158,000
Mean: $\bar{x} = \sum_{i=1}^{N} x_i / N = 158{,}000 / 10 = 15{,}800$
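The calculation above can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and not part of the original slides.

```python
from statistics import mean

# Incomes of the ten office staff listed on the slide
incomes = [10_000, 20_000, 15_000, 11_000, 16_000,
           17_000, 23_000, 24_000, 13_000, 9_000]

total = sum(incomes)      # sum of all values
n = len(incomes)          # total number of observations
average = total / n       # arithmetic mean = sum / n

print(total, n, average)
print(mean(incomes))      # the standard library gives the same mean
```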
18. Median (positional average)
When the data are arranged in ascending or
descending order, the median is the value that
divides the data into two equal parts.
• Advantages
– It is not influenced by extreme values
• Disadvantages
– Not a very precise measure
– Not amenable to further statistical evaluation
19. Calculation of Median
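1. Arrange all values in ascending
or descending order.
2. Add 1 to the number of
observations: (n + 1).
3. Divide by 2: (n + 1) / 2.
4. The answer is the position
(serial number) of the observation
which constitutes the median. When n is even,
the position falls between two observations and
the median is the average of those two values.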
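The positional rule can be sketched in Python as follows. The function name `median_by_position` is hypothetical, and the handling of an even number of observations (averaging the two middle values) is the usual convention rather than something spelled out in the steps above. The data are the office staff incomes from the mean example.

```python
def median_by_position(values):
    """Median found via the (n + 1) / 2 position rule."""
    ordered = sorted(values)             # step 1: arrange in ascending order
    n = len(ordered)
    if n % 2 == 1:
        position = (n + 1) // 2          # steps 2-3: serial number of the median
        return ordered[position - 1]     # step 4 (lists are 0-indexed)
    # Even n: (n + 1) / 2 falls between two observations,
    # so take the average of the two middle values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

incomes = [10_000, 20_000, 15_000, 11_000, 16_000,
           17_000, 23_000, 24_000, 13_000, 9_000]
print(median_by_position(incomes))
```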
20. Effect of Extreme Values on Median
No.   Series A   Series B
1     10,000     10,000
2     10,000     10,000
3     10,000     10,000
4     10,000     10,000
5     10,000     10,000
6     15,000     15,000
7     15,000     15,000
8     16,000     16,000
9     16,000     16,000
10    20,000     500,000
11    20,000     600,000
Median position = (n + 1) / 2 = (11 + 1) / 2 = 6th value, in both series
6th value = 15,000 in both series, despite the extreme values in Series B
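To make the contrast concrete, the sketch below computes both the mean and the median for the two series on this slide. The names `series_a` and `series_b` are illustrative labels only.

```python
from statistics import mean, median

# Two series from the slide: identical except for the last two values
series_a = [10_000] * 5 + [15_000, 15_000, 16_000, 16_000, 20_000, 20_000]
series_b = [10_000] * 5 + [15_000, 15_000, 16_000, 16_000, 500_000, 600_000]

for name, data in (("A", series_a), ("B", series_b)):
    # The median stays at the 6th value; the mean is pulled up by extremes
    print(name, "median =", median(data), "mean =", round(mean(data)))
```

The median is the same in both series, while the mean of the second series is many times larger.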
21. MODE
• It is the most frequently occurring value in the
distribution. Example: number of T.B patients seen in one
month at private clinics in Rawalpindi (RWP).
• Data: 5 to 45 patients in various clinics
Patients    f (clinics)
5-15        50 (recorded 50 times)
16-25       35
26-35       28
36-45       10
The modal class is 5-15, with a frequency of 50 (most frequently,
5 to 15 T.B patients are seen at a private clinic in Rawalpindi).
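For grouped data like this, the mode is read off as the class with the highest frequency. A minimal sketch, with the frequency table from the slide held in a dictionary (the variable names are illustrative):

```python
# Frequency table from the slide: patients per clinic -> number of clinics
clinic_counts = {"5-15": 50, "16-25": 35, "26-35": 28, "36-45": 10}

# The modal class is the class with the highest frequency
modal_class = max(clinic_counts, key=clinic_counts.get)

print(modal_class, clinic_counts[modal_class])
```

Note that the mode is the class (5-15 patients), while 50 is its frequency.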
22. Mean is not sufficient
Same mean for 2 different populations
Age group        Group 1   Group 2
30 – 34 Years    0         40
35 – 39 Years    10        10
40 – 44 Years    20        0
45 – 49 Years    40        0
50 – 54 Years    20        0
55 – 59 Years    10        10
≥ 60 Years       0         40
Mean Age         45        45
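The sketch below illustrates why the mean alone is not sufficient: both groups have the same mean, but their spread differs widely. Two assumptions are made here that are not stated on the slide: each age class is represented by its lower bound (this reproduces the mean age of 45), and the table entries are treated as counts of people. The helper `grouped_mean_sd` is hypothetical.

```python
from math import sqrt

# Assumption: each class is represented by its lower bound, which
# reproduces the mean age of 45 shown on the slide.
ages = [30, 35, 40, 45, 50, 55, 60]
group_1 = [0, 10, 20, 40, 20, 10, 0]    # assumed to be counts of people
group_2 = [40, 10, 0, 0, 0, 10, 40]

def grouped_mean_sd(values, counts):
    """Mean and sample SD (n - 1 divisor) of grouped data."""
    n = sum(counts)
    mean = sum(v * c for v, c in zip(values, counts)) / n
    sum_sq = sum(c * (v - mean) ** 2 for v, c in zip(values, counts))
    return mean, sqrt(sum_sq / (n - 1))

mean_1, sd_1 = grouped_mean_sd(ages, group_1)
mean_2, sd_2 = grouped_mean_sd(ages, group_2)
print(mean_1, round(sd_1, 1))
print(mean_2, round(sd_2, 1))
```

Under these assumptions the second group has a much larger standard deviation than the first, even though the means are identical.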
23. Measures of Dispersions
1. Range: Difference between the highest and lowest figures
in the given distribution. It considers only the
extreme values and not the values in
between.
2. Mean squared deviation (Variance, S²): Average of all
squared deviations from the arithmetic mean:
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$$
(for a sample, n − 1 is used in place of n).
It is expressed in squared units, so it is of
little practical use on its own.
3. Standard Deviation.
4. Coefficient of Variation (CV) = (SD / Mean) × 100
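As a rough illustration, the measures listed above can be computed with Python's standard library. The data are the office staff incomes from the mean example; `statistics.variance` and `statistics.stdev` use the sample (n − 1) divisor, which is an assumption about which form is wanted here.

```python
from statistics import mean, stdev, variance

incomes = [10_000, 20_000, 15_000, 11_000, 16_000,
           17_000, 23_000, 24_000, 13_000, 9_000]

value_range = max(incomes) - min(incomes)       # highest minus lowest
sample_variance = variance(incomes)             # squared deviations / (n - 1)
sample_sd = stdev(incomes)                      # square root of the variance
cv_percent = sample_sd / mean(incomes) * 100    # CV = SD / Mean x 100

print(value_range, round(sample_sd), round(cv_percent, 1))
```

The CV is unit-free, which is what makes it useful for comparing the variability of attributes measured on different scales.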
24. Standard Deviation
• It is a measure, which describes how much
individual measurements differ on average,
from the mean.
• It expresses in quantitative terms the scatter of
data around the mean.
• It is the most important measure of dispersion
around the mean and forms the basis of most
statistical analysis.
• It is denoted by σ (population) or SD (sample)
25. Calculation of Standard Deviation
1. Calculate the mean of the given distribution ($\bar{x}$).
2. Find the difference of each individual observation from
the mean (deviation score): $(x_i - \bar{x})$
3. Square all the differences (deviations): $(x_i - \bar{x})^2$
4. Take the sum of all squared differences: $\sum (x_i - \bar{x})^2$
5. Divide the sum by the number of observations minus one (n − 1)
to find the average squared deviation: $\sum (x_i - \bar{x})^2 / (n - 1)$
6. Take the square root of the whole:
$$SD = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$
26. Mean of Systolic Blood Pressure in 5 individuals
Observed value x   Mean      Deviation from mean   Square of deviation
(mm Hg)            (mm Hg)   (d)                   (d²)
110                124       -14                   196
116                124       -8                    64
120                124       -4                    16
130                124       +6                    36
144                124       +20                   400
Sum                          0 (52 ignoring signs) 712
S.D. = $\sqrt{\sum d^2 / (n - 1)} = \sqrt{712 / (5 - 1)} = \sqrt{178} \approx 13.3$ mm Hg
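The worked example can be reproduced step by step in Python. This is a minimal sketch that follows the table column by column, using the n − 1 divisor as in the calculation above; the variable names are illustrative.

```python
from math import sqrt

sbp = [110, 116, 120, 130, 144]             # observed values (mm Hg)

n = len(sbp)
mean_sbp = sum(sbp) / n                      # mean of the distribution
deviations = [x - mean_sbp for x in sbp]     # d = x - mean
squared = [d ** 2 for d in deviations]       # d squared
sum_sq = sum(squared)                        # sum of squared deviations
sd = sqrt(sum_sq / (n - 1))                  # square root of sum / (n - 1)

print(mean_sbp, sum(deviations), sum_sq, round(sd, 1))
```

The signed deviations always sum to zero, which is why they are squared before averaging.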
27. Standard Deviation
1. Most important tool in statistical analysis
2. SD helps to describe “Normal Curve”
3. It gives an idea whether the observed difference of
an individual value from the mean is due to
chance (normal) or is significant.
4. Helps in calculation of “Standard error”.
5. Helps in calculating “Sample size”.
28. Normal Distribution/Gaussian Distribution
A theoretical, mathematical model that best describes many
biological characteristics like height, weight, B.P, Hb and
cholesterol.
Main features
• Devised by Gauss (Germany) and Laplace (France)
• Graphic presentation of the frequency distribution table of a
quantitative continuous variable, based on a large random sample.
• Symmetrical about its mean
• Smooth, bell-shaped curve
• This distribution underlies the "Central limit
theorem", upon which most statistical calculations are
based.
• It can be arithmetically expressed in terms of its mean
and standard deviation.
29. Normal Distribution
1. Mean ± 1 SD includes 68.27% (about 2/3rd) of the observations.
The remaining 32% (about 1/3rd) lie outside the range
mean ± 1 SD.
2. Mean ± 2 SD includes 95.45% of the observations, while
4.55% lie outside this limit.
Mean ± 1.96 SD limits include 95% of the
observations.
3. Mean ± 3 SD limits include 99.73% of the
observations. Mean ± 2.58 SD limits include
99% of the observations.
4. It means values higher or lower than mean ± 3 SD are
very rare (only 0.27%), and their chance of being
normal is 0.27 in 100.
Such extreme values are unusual and may
even be pathological.
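These percentages can be verified numerically. For a normal distribution, the fraction of observations within mean ± k SD equals erf(k / √2), where erf is the error function from Python's `math` module. The helper name `coverage` is hypothetical.

```python
from math import erf, sqrt

def coverage(k):
    """Fraction of a normal distribution lying within mean +/- k SD."""
    return erf(k / sqrt(2))

for k in (1, 1.96, 2, 2.58, 3):
    # Print the percentage of observations inside mean +/- k SD
    print(f"mean +/- {k} SD: {coverage(k) * 100:.2f}%")
```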
30. STANDARD NORMAL CURVE
• Mathematically designed curve
• Perfectly bell-shaped, symmetrical curve
• Mean, mode and median coincide
• Mean is zero
• SD is 1
Mean ± 1 SD = 68.3% (about 68% of observations)
Mean ± 2 SD = 95.4% (about 95% of observations)
Mean ± 3 SD = 99.7% (more than 99% of observations)
31. Standard Normal Curve
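Mathematical formula of the Standard Normal
Curve:
$$Y = \frac{n}{\sigma\sqrt{2\pi}} \, e^{-x^2 / (2\sigma^2)}$$
where x is the deviation of a value from the mean, σ is the
standard deviation and n is the total number of observations.
For the standard normal curve, the mean is 0 and σ = 1.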
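A short sketch of the curve's formula in Python follows. It assumes the form Y = n · e^(−x² / 2σ²) / (σ√2π), with x measured as a deviation from the mean; the defaults σ = 1 and n = 1 give the standard normal curve with unit area. The function name `normal_curve_height` is hypothetical.

```python
from math import exp, pi, sqrt

def normal_curve_height(x, sigma=1.0, n=1.0):
    """Height Y of the normal curve at deviation x from the mean."""
    return n * exp(-x ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

# The curve peaks at the mean (x = 0) and is symmetrical about it
for x in (-2, -1, 0, 1, 2):
    print(x, round(normal_curve_height(x), 4))
```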