Biostatistics ug

www.indiandentalacademy.com

I. INTRODUCTION
1. Why we need to know Bio-statistics
2. Common Statistical Terms
3. Application & Uses Of Bio-statistics
- As a Science
- As a figure
II. METHODS OF COLLECTION OF DATA

III. SAMPLING
: Basic Definitions
: What is Sampling?
: Why Sampling is necessary?
: Sampling Techniques
IV. METHODS OF PRESENTATION OF DATA
: Tabulation
: Diagrammatic representation
: Graphical representation

V. METHODS OF SUMMARIZING THE DATA
: Measures of Central Tendency
:Mean
:Median
:Mode
: Measures of Dispersion
:Range
:Mean deviation
:Standard deviation

VI. NORMAL DISTRIBUTION AND NORMAL CURVE
VII. METHODS OF ANALYZING THE DATA
: Basic Concepts of statistical inference
: Tests of significance
: t test
: Chi-square test

COMMON STATISTICAL TERMS

STATISTICS
Is the science of Collecting, Presenting, Summarizing, Analyzing & Interpreting sets of data.
BIO-STATISTICS
Is the application of statistics to health problems.

DATA:
DATA can be defined as a set of values recorded on one or more observational units.
- OR -
Are the set of values of one or more variables recorded on one or more individuals.

BASIC PRINCIPLES OF BIOSTATISTICS
1. Collection of Data
2. Presentation of Data
3. Summarization of Data
4. Analysis of Data
5. Interpretation of Data

APPLICATIONS & USES

AS A SCIENCE
1. To define what is NORMAL or healthy in a population & to find out the LIMITS OF NORMALITY in variables.
2. To find the DIFFERENCE BETWEEN MEANS & PROPORTIONS of normal at two places or in different periods.
3. To find CORRELATION between two variables, X and Y, such as height & weight.
4. To find the action of a drug.

5. To compare the action of two different drugs or two successive dosages of the same drug.
6. To find the relative potency of a new drug with respect to a standard drug.
7. To find an association between two variables such as oral leukoplakia and pan chewing OR dental caries & sugar intake.
8. To test usefulness of sera and vaccines used in the field.
9. In epidemiological studies, the role of causative factors is statistically tested.

TYPES OF DATA
1. Quantitative
2. Qualitative
3. Primary
4. Secondary

Quantitative Data
Are those which can be quantified, that is, the character which we take into consideration can be quantified. E.g.: Height, Weight etc.
Qualitative Data
Are those which cannot be quantified, that is, the character which we take into consideration cannot be quantified. E.g.: Beauty, Sex etc.

Primary Data
Are those which are collected afresh and for the first time, and thus happen to be original in character.
Secondary Data
Are those which have already been collected by someone else & which have already been passed through the statistical process.

METHODS OF DATA COLLECTION

COLLECTION OF PRIMARY DATA
• Observation Method
• Clinical Examination &
• Through Schedules
• Through Questionnaires
• Interview Method

Sources of health information
1. Census
2. Registration of vital events - 1873
3. Sample Registration system
4. Notification of diseases
5. Hospital Records
6. Disease registers
7. Record Linkage
8. Environmental health data

9. Health manpower statistics
10. Population surveys

BASIC DEFINITIONS
• POPULATION / TARGET POPULATION
Is a defined group of individuals to whom an investigator wants the conclusions of his study to apply.
• SAMPLING UNIT
Each member of the population is called a sampling unit.

• SAMPLE
A group of sampling units that forms part of a population, generally selected so as to be representative of the population whose variables are under study.
The smallest part of the population is called a UNIT/INDIVIDUAL. Any FINITE part of the population is a SAMPLE. A sample is a subset of the target population.

• SAMPLING
Sampling may be defined as the selection of some part of an aggregate or totality on the basis of which a judgment or inference about the aggregate or totality is made.
- OR -
It is the process of obtaining information about an entire population by examining only a part of it.

• SAMPLING FRAME
A list containing all sampling units is known as the sampling frame.
- OR -
Sampling frame consists of a list of items from which the sample is to be drawn.

WHY SAMPLING IS NECESSARY?

NECESSITY:
• Sampling can save time & money
• Sampling remains the only way when the population contains infinitely many members
• It is physically impossible to check every item in a population
• Adequacy of sampling results

• A sample has to be taken from a population
• A sample should be representative of the population

SAMPLING TECHNIQUES
There are two types of sampling techniques.
1) Probability sampling
2) Non probability sampling

The probability sampling methods are
1) SIMPLE RANDOM SAMPLING
2) SYSTEMATIC RANDOM SAMPLING
3) STRATIFIED RANDOM SAMPLING
4) CLUSTER SAMPLING
5) MULTISTAGE SAMPLING
6) MULTIPHASE SAMPLING

The various Non probability sampling methods are:
1) QUOTA SAMPLING
2) JUDGMENT SAMPLING
3) CONVENIENCE SAMPLING

I. SIMPLE RANDOM SAMPLING
Applicable when the population is small, homogeneous and readily available.
o Principle is that every member or every unit of the population has an equal chance of being selected.
o Unrestricted Random sampling
o Two Methods
1) Lottery Method
2) Table of Random Number Method
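The equal-chance principle can be sketched in Python as follows. This is only an illustration: the sampling frame of 50 numbered patients and the helper name `simple_random_sample` are assumptions made for the example, and the standard library's pseudo-random generator stands in for the lottery or random number table.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw units without replacement; every unit has an equal chance."""
    rng = random.Random(seed)  # seed is optional, used only for repeatability
    return rng.sample(sampling_frame, sample_size)

# Hypothetical sampling frame: 50 patients numbered 1 to 50.
frame = list(range(1, 51))
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Fixing the seed makes the draw repeatable, which is convenient when a sample has to be documented.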

II. SYSTEMATIC RANDOM SAMPLING
• Used in those cases where a complete list of the population from which the sample is to be drawn is available.
• More often applied to field studies where the population is large, scattered & homogeneous.
• Choose every Kth house, where 'K' refers to the SAMPLING INTERVAL:
$$K = \frac{\text{Total population}}{\text{Sample size desired}}$$
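A minimal sketch of choosing every Kth unit, assuming a hypothetical list of 100 numbered houses and a desired sample of 10. The random starting point within the first interval is a common convention and is an assumption here, since the slide does not say how the first house is picked.

```python
import random

def systematic_sample(units, sample_size, seed=None):
    """Select every Kth unit after a random start within the first interval."""
    k = len(units) // sample_size  # sampling interval K
    if k < 1:
        raise ValueError("sample size exceeds population size")
    start = random.Random(seed).randrange(k)  # position 0 .. K-1
    return [units[start + i * k] for i in range(sample_size)]

# Hypothetical population: houses numbered 1 to 100, so K = 100 / 10 = 10.
houses = list(range(1, 101))
chosen = systematic_sample(houses, 10, seed=0)
print(chosen)
```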

MERITS (Sys. R.S.)
• Systematic design is simple and convenient to adopt.
• Gives accurate results, if the population is sufficiently large, homogeneous and each unit is numbered.
• Time and labor involved in the collection of the sample is relatively small.

III. STRATIFIED RANDOM SAMPLING
• Used when the population is not homogeneous.
• The population under study is first divided into homogeneous groups or classes called 'strata' & a sample is drawn from each stratum at random in proportion to its size.

• Gives representation to all 'strata' of the society or population.
• Mainly adopted if the condition under study is known to be related to various factors such as age, sex & area of residence or occupation etc.
Merits:
1) A proportionate, representative sample from each stratum is secured
2) It gives greater accuracy
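Drawing from each stratum in proportion to its size can be sketched as below. The three age strata and their sizes are hypothetical, and the function names are illustrative only. Because each allocation is rounded, the allocations may in general add up to one more or one fewer than the requested total.

```python
import random

def proportional_allocation(strata_sizes, total_sample):
    """Sample size per stratum, proportional to stratum size (rounded)."""
    population = sum(strata_sizes.values())
    return {name: round(total_sample * size / population)
            for name, size in strata_sizes.items()}

def stratified_sample(strata, total_sample, seed=None):
    """Draw a simple random sample inside every stratum."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, total_sample)
    return {name: rng.sample(units, allocation[name])
            for name, units in strata.items()}

# Hypothetical strata by age group (unit identifiers are just numbers).
strata = {
    "children": list(range(500)),
    "adults": list(range(300)),
    "elderly": list(range(200)),
}
allocation = proportional_allocation({k: len(v) for k, v in strata.items()}, 100)
drawn = stratified_sample(strata, 100, seed=2)
print(allocation)
```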

IV. CLUSTER RANDOM SAMPLING
Used when the population is heterogeneous, vast and scattered over a wide area.

V. MULTISTAGE SAMPLING
• Here, the sampling procedures are carried out in several stages using simple random sampling techniques.
Ist stage: States
IInd stage: Districts
IIIrd stage: Taluks
Villages etc.

VI. MULTIPHASE SAMPLING
• Part of the information is collected from the whole sample & part from the sub-sample.
• The number in the sub-samples decreases in the 2nd, 3rd & 4th phases & becomes smaller & smaller.
ADVANTAGES
1) Less costly & more purposeful
2) Less laborious

NON PROBABILITY SAMPLING METHODS
I. QUOTA SAMPLING
II. SAMPLE OF CONVENIENCE
III. JUDGMENT SAMPLING

METHODS OF PRESENTATION OF DATA
1. TABLES
2. DIAGRAMS
3. GRAPHS

TABULATION
Types of Tables
1. General Purpose Table
2. Simple Table

DIAGRAMMATIC REPRESENTATION
(Qualitative Data)
1. One Dimensional Diagrams
i. Simple Bar Diagram
ii. Multiple Bar Diagram
iii. Component Bar Diagram
2. Two Dimensional Diagram
i. Pie-Diagram
3. Pictograms

SIMPLE TABLE
DISTRIBUTION OF STUDY BDS SUBJECTS, YEAR WISE

B. D. S. YEAR    TOTAL NO. OF SUBJECTS
FIRST            93
SECOND           84
THIRD            85
FOURTH           51
TOTAL            313

1. The table should be numbered.
2. A self-explanatory title must be given to each table.
3. Headings of rows/columns should be clear & concise.
4. The data must be presented according to size or importance, chronologically, alphabetically or geographically.
5. If percentages or averages are to be compared, they should be placed as close as possible.
6. No table should be too large.

• Vertical arrangement is better than the horizontal one.
• Footnotes may be given when necessary.

[Figure: SIMPLE BAR CHART showing distribution of study subjects, year wise. X-axis: year (FIRST, SECOND, THIRD, FOURTH); y-axis: no. of subjects; bar values 93, 84, 85, 51.]

[Figure: MULTIPLE BAR CHART. Distribution of students by sex. Y-axis: no. of students.]

[Figure: COMPONENT BAR DIAGRAM. Distribution of students by sex. X-axis: class (First, Second, Third, Fourth); y-axis: no. of students; components: Male, Female.]

PIE-DIAGRAM
DISTRIBUTION OF STUDY POPULATION, YEAR-WISE

Class          No.    Angle (degrees)
FIRST year     93     107
SECOND year    84     97
THIRD year     85     98
FOURTH year    51     59
Total          313
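The angles in the table follow from giving each class a share of 360 degrees proportional to its count. A short sketch using the counts from the table (the variable names are illustrative):

```python
# Year-wise counts taken from the table above.
counts = {"FIRST": 93, "SECOND": 84, "THIRD": 85, "FOURTH": 51}
total = sum(counts.values())

# Angle of each sector = (class count / total) * 360, rounded to a whole degree.
angles = {year: round(n / total * 360) for year, n in counts.items()}
print(total, angles)
```

Because each angle is rounded separately, the rounded angles need not add up to exactly 360.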

GRAPHICAL REPRESENTATION
(QUANTITATIVE DATA)
Frequency distribution table can be presented by any one of the following graphs:
1. Histogram
2. Frequency Polygon

[Figure: HISTOGRAM. Distribution of patients according to their age. X-axis: age in years (30 to 65, in 5-year classes); y-axis: no. of persons (0 to 50).]

[Figure: FREQUENCY POLYGON. Distribution of patients according to their age. X-axis: age in years (30 to 65, in 5-year classes); y-axis: no. of persons (0 to 50).]

PICTOGRAM
POPULATION PER PHYSICIAN
USSR - 270
USA - 500
INDIA - 3700

METHODS OF SUMMARIZATION
(a) Measures of Central Tendency
1. Mean
2. Median
3. Mode
(b) Measures of Dispersion
1. Range
2. Mean Deviation
3. Standard Deviation

MEAN
Mean is obtained by summing up all the observations and dividing the total by the number of observations. For n observations $X_1, X_2, X_3, \dots, X_n$ it is given by
$$\bar{x} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n} = \frac{\sum X}{n}$$
Where, X = variable, n = sample size

For Grouped Data, Mean can be calculated by using the formula:
$$\bar{x} = \frac{\sum f x}{N}$$
Where,
x = Mid value of the class
f = Frequency of the class
N = $\sum f$ = Total number of observations
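As an illustration of the grouped-data formula, the sketch below uses a small frequency table whose mid values and frequencies are invented for the example; they do not come from the slides.

```python
# Hypothetical frequency table: class mid values (x) and class frequencies (f).
mid_values = [32.5, 37.5, 42.5]
frequencies = [4, 10, 6]

n_total = sum(frequencies)                                   # N = sum of f
sum_fx = sum(f * x for f, x in zip(frequencies, mid_values)) # sum of f * x
grouped_mean = sum_fx / n_total
print(n_total, sum_fx, grouped_mean)
```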

Merits: (Mean)
1. It is rigidly defined.
2. It is easy to understand and easy to calculate.
3. It is based upon all the observations.
4. It is amenable to algebraic treatment.
Demerits:
1. It is affected by extreme values.
2. If one obs. is missed, mean can't be calculated.
3. It can't be calculated by inspection.

Ex: The following data gives the plaque scores in 5 students. Calculate Mean plaque score.
1.1267, 0.9834, 1.5634, 1.0267, 2.3245
Solution:
Mean = [1.1267 + 0.9834 + 1.5634 + 1.0267 + 2.3245] / 5
= 7.0247 / 5
= 1.4049
(Mean plaque score of 5 students)
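The same arithmetic can be checked in a few lines of Python using the five plaque scores listed in the example (variable names are illustrative):

```python
plaque_scores = [1.1267, 0.9834, 1.5634, 1.0267, 2.3245]

total_score = sum(plaque_scores)               # sum of all five observations
mean_score = total_score / len(plaque_scores)  # divide by n = 5
print(round(total_score, 4), round(mean_score, 4))
```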

MEDIAN
Median is the middle value of the observations after arranging them either in ascending or in descending order.
$$\text{Median} = \text{size of the } \left(\frac{n+1}{2}\right)^{\text{th}} \text{ item}$$
If n is an odd number, the median is the middle term, which divides the observations EXACTLY into half.
If n is an even number, the median is the MEAN of the middle two terms.

Merits: (Median)
1. It is easy to understand and easy to calculate.
2. It is not affected by extreme values.
3. It can be located graphically also.
Demerits:
1. It is not based upon all the observations.
2. It is not amenable to algebraic treatment.
3. It is affected much by fluctuations of sampling.

Ex: The following data gives the plaque scores in 5 students. Calculate Median plaque score.
1.1267, 0.9834, 1.5634, 1.0267, 2.3245
Ascending order:
0.9834, 1.0267, 1.1267, 1.5634, 2.3245
Median = size of the [(5+1)/2]th item = 3rd item = 1.1267
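A small sketch of the median rule, covering both the odd and the even case. The function name `median` is illustrative; Python's standard library also offers `statistics.median`, which gives the same result.

```python
def median(values):
    """Middle value of the ordered data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

plaque_scores = [1.1267, 0.9834, 1.5634, 1.0267, 2.3245]
median_score = median(plaque_scores)
print(median_score)
```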

MODE
Mode is the one which is the most repeated in the particular series of observations.
OR
It is the value of the variable which occurs most frequently in a series of observations.

Ex: Calculate Mode from the following plaque scores
1.25, 3.10, 0.95, 0.75, 1.81, 1.81, 2.72, 2.50
1.81 is repeated 2 times; every other value occurs once.
Therefore, the frequency of 1.81 is 2
The value of mode is 1.81
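Counting how often each value occurs gives the mode directly. A brief sketch with the plaque scores from the example, using `collections.Counter` from the standard library:

```python
from collections import Counter

plaque_scores = [1.25, 3.10, 0.95, 0.75, 1.81, 1.81, 2.72, 2.50]

# Tally each value and pick the one with the highest frequency.
counts = Counter(plaque_scores)
mode_value, frequency = counts.most_common(1)[0]
print(mode_value, frequency)
```

If several values shared the highest frequency, this sketch would report only one of them, which is one way the mode can be ill defined.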

Merits: (Mode)
1. It is easy to understand and calculate.
2. It can be located graphically also.
3. It is not affected by extreme values.
4. It can be calculated both from qualitative and quantitative data.
Demerits:
1. It is not based on all the observations.
2. In some cases mode is ill defined.

Measures of Dispersion
1. Range
2. Mean Deviation
3. Standard Deviation

RANGE
It is the difference between the highest and lowest values in the series.
RANGE = Maximum value - Minimum value = H - L
• It is the simplest measure of dispersion
• It is not based upon all the observations
• It is based on extreme values
• It is affected by sampling fluctuations

Ex: Find the range of the incubation period of smallpox, which in 9 patients was found to be
14, 13, 11, 15, 10, 7, 9, 12, 10 (in days)
Here,
Maximum Value = 15
Minimum Value = 7
Range = 15 - 7 = 8

MEAN DEVIATION
It is the arithmetic mean of the deviations of the values from a measure of central tendency, without taking plus and minus signs into consideration.
If x is the variable, n the number of observations and $\bar{x}$ the mean, it is given by
$$\text{M.D.} = \frac{\sum |x - \bar{x}|}{n}$$
Where | | indicates ignoring the negative signs

Steps involved in calculation of mean deviation:
• Calculate mean, $\bar{x}$.
• Find deviations, i.e. $(x - \bar{x})$
• Find absolute deviations, i.e. $|x - \bar{x}|$
• Find the sum of absolute deviations, $\sum |x - \bar{x}|$
• Divide sum by n, i.e. $\sum |x - \bar{x}| / n$

Merits (Mean Deviation)
1. Simple and easy to calculate.
2. Easy to understand.
3. It is based upon all the observations.
Demerits
1. Mathematically illogical, since the signs of the deviations are ignored.
2. It is not amenable to algebraic treatment.

Ex. The following data gives the respiration rate per minute. Find the Mean Deviation.
23, 22, 24, 16, 17, 18, 19, 21, 20

Respiration rate per min (x)    (x - x̄)    |x - x̄|
23                              3          3
22                              2          2
24                              4          4
16                              -4         4
17                              -3         3
18                              -2         2
19                              -1         1
21                              1          1
20                              0          0
Total (x̄ = 20)                  0          20

Mean Deviation = $\sum |x - \bar{x}| / n$
= 20/9
= 2.2222
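The steps for the mean deviation can be followed in code with the nine respiration rates from the example. Summing the absolute deviations 3, 2, 4, 4, 3, 2, 1, 1, 0 gives 20, so the mean deviation works out to 20/9. Variable names are illustrative.

```python
rates = [23, 22, 24, 16, 17, 18, 19, 21, 20]

mean_rate = sum(rates) / len(rates)                    # step 1: mean
deviations = [x - mean_rate for x in rates]            # step 2: x - mean
absolute_deviations = [abs(d) for d in deviations]     # step 3: ignore signs
sum_absolute = sum(absolute_deviations)                # step 4: sum
mean_deviation = sum_absolute / len(rates)             # step 5: divide by n
print(mean_rate, sum_absolute, round(mean_deviation, 4))
```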

STANDARD DEVIATION
Standard Deviation is the square root of the variance.
OR
It is defined as "square-root of the mean of the squares of all the deviations being measured from the mean of the observations".
$$\text{S.D.} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \quad \text{for large samples}$$
$$\text{S.D.} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \quad \text{for small samples}$$
$\bar{x}$ = mean of the observations

STEPS INVOLVED IN CALCULATING STANDARD DEVIATION
1. Calculate Mean
2. Find Deviations, i.e. $(x - \bar{x})$
3. Find Deviations square, i.e. $(x - \bar{x})^2$
4. Find sum of the squares, $\sum (x - \bar{x})^2$
5. Divide sum by n-1 or n
n-1 for small samples
n for large samples
6. Take square root of the result of step no. 5

Ex. The following data gives the respiration rate per minute. Find the Standard Deviation.
23, 22, 24, 16, 17, 18, 19, 21, 20

Respiration rate per min (x)    (x - x̄)    (x - x̄)²
23                              3          9
22                              2          4
24                              4          16
16                              -4         16
17                              -3         9
18                              -2         4
19                              -1         1
21                              1          1
20                              0          0
Total (x̄ = 20)                  0          60

Standard Deviation = $\sqrt{\sum (x - \bar{x})^2 / (n-1)}$
= $\sqrt{60 / (9-1)}$
= $\sqrt{7.5}$
= 2.7386
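A sketch of the same six steps in Python, using the respiration rates from the example. The small-sample form (divisor n-1) is what `statistics.stdev` in the standard library computes; the large-sample form (divisor n) is shown alongside for comparison.

```python
import math
import statistics

rates = [23, 22, 24, 16, 17, 18, 19, 21, 20]

mean_rate = sum(rates) / len(rates)                      # step 1
sum_squares = sum((x - mean_rate) ** 2 for x in rates)   # steps 2 to 4
sd_small = math.sqrt(sum_squares / (len(rates) - 1))     # steps 5 and 6, divisor n-1
sd_large = math.sqrt(sum_squares / len(rates))           # same with divisor n
print(sum_squares, round(sd_small, 4), round(sd_large, 4))
```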

NORMAL DISTRIBUTION AND NORMAL CURVE
After collection of large samples, prepare a frequency distribution with small class intervals and note the following points:

1) Some observations are above the mean and others are below the mean.
2) If they are arranged in order, deviating towards the extremes from the mean on the plus or minus side, the maximum number of frequencies will be seen in the middle around the mean and fewer at the extremes, decreasing smoothly.

3) Normally, almost half the observations lie above and half lie below the mean.
4) All the observations are symmetrically distributed on each side of the mean.
A distribution of this nature is called a NORMAL DISTRIBUTION or GAUSSIAN DISTRIBUTION.

The histogram of such a frequency distribution gives a frequency curve which is symmetrical in nature, called the NORMAL CURVE or GAUSSIAN CURVE.

ARITHMETIC INTERPRETATION OF N.D.:
1. Mean ± 1 SD limits include 68.27% obs., or roughly 2/3rd of all the observations.
2. Mean ± 2 SD limits include 95.45% obs. Similarly, Mean ± 1.96 SD limits include 95% obs.
3. Mean ± 3 SD limits include 99.73% obs. Similarly, Mean ± 2.58 SD limits include 99% obs.
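These percentages can be reproduced from the normal distribution itself. For a normal variable, the proportion lying within k standard deviations of the mean equals erf(k / sqrt(2)); the sketch below evaluates it with `math.erf` from the standard library. The function name is illustrative.

```python
import math

def area_within(k):
    """Proportion of a normal distribution lying within mean +/- k SD."""
    return math.erf(k / math.sqrt(2))

for k in (1, 1.96, 2, 2.58, 3):
    print(k, round(area_within(k) * 100, 2))
```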

[Figure: NORMAL CURVE. X-axis: age in years (18 to 23); y-axis: frequency (0 to 18).]

CHARACTERISTICS OF A NORMAL CURVE
1. It is a bell-shaped curve, symmetrical about the mean, which lies at the maximum ordinate.
2. The first and third quartiles are equidistant from the median.
3. Mean, median and mode coincide.
4. The curve approaches the horizontal axis but never reaches it.
5. Area property:
i. Mean ± 1SD includes 68.27% area
ii. Mean ± 2SD includes 95.45% area
iii. Mean ± 3SD includes 99.73% area

BASIC CONCEPTS OF STATISTICAL INFERENCE

TESTS OF SIGNIFICANCE
Are the mathematical methods by which the probability (P) or relative frequency of an observed difference occurring by chance is found.

t-TEST:
This test is applied to small samples.
W. S. Gosset: Student's t-test
Types of t-test:
• Unpaired t-test
• Paired t-test

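UNPAIRED t-TEST
Applied to independent samples, checking for a significant difference between two means.
$$t = \frac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)}$$
$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Where $\sigma_1$ and $\sigma_2$ are the S.D.s of the two samples, and $n_1$ and $n_2$ are their sizes.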
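A minimal sketch of the unpaired t statistic as the ratio of the difference in means to its standard error. The two groups of values are invented for illustration, and using the n-1 (small-sample) variance from `statistics.variance` is an assumption, since the slide does not say which divisor is meant.

```python
import math
import statistics

def unpaired_t(sample1, sample2):
    """t = (mean1 - mean2) / SE, with SE = sqrt(var1/n1 + var2/n2)."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    var1 = statistics.variance(sample1)  # assumption: divisor n-1
    var2 = statistics.variance(sample2)
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error

# Hypothetical scores for two independent groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]
t_value = unpaired_t(group_a, group_b)
print(t_value)
```

The resulting t is then compared with the tabulated value for the chosen probability level.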

PAIRED t-TEST
Is applied to paired data of independent obs. from one sample only, when each individual gives a pair of obs.
$$t = \frac{\bar{d}}{SD / \sqrt{n}}$$
Where,
d = difference between x1 and x2 ($\bar{d}$ is the mean of these differences)
SD = Std. deviation for the difference
n = sample size
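The paired statistic can be sketched as below. The before and after readings are hypothetical values chosen only to show the arithmetic, and the SD of the differences is taken with the n-1 divisor, which is an assumption.

```python
import math
import statistics

def paired_t(first, second):
    """t = mean difference / (SD of differences / sqrt(n))."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_difference = statistics.mean(differences)
    sd_difference = statistics.stdev(differences)  # assumption: divisor n-1
    return mean_difference / (sd_difference / math.sqrt(n))

# Hypothetical paired readings on the same five individuals.
before = [10, 12, 14, 16, 18]
after = [8, 11, 11, 14, 14]
t_paired = paired_t(before, after)
print(round(t_paired, 2))
```

With n pairs, the value is referred to the t table with n - 1 degrees of freedom.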

CHI-SQUARE TEST: Is an alternative method of testing the significance of the difference between two or more proportions.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where, O = Observed frequency
E = Expected frequency & it is given by
$$E = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$$

TRIAL OF 2 WHOOPING COUGH VACCINES

VACCINE    ATTACKED    NOT ATTACKED    TOTAL
A          22          68              90
B          14          72              86
TOTAL      36          140             176

1. TEST OF NULL HYPOTHESIS
• It is a hypothesis which reflects no change or no difference, usually denoted by H0.
• Then proceed to test the hypothesis in quantitative terms.
Proportion of people attacked will be 36/176 = 0.204
Proportion of people not attacked will be 140/176 = 0.795

• Expected number of attacks by vaccine A will be 90 x 0.204 = 18.36
• Expected number of not attacked by vaccine A will be 90 x 0.795 = 71.55
• Expected number of attacks by vaccine B will be 86 x 0.204 = 17.544
• Expected number of not attacked by vaccine B will be 86 x 0.795 = 68.37

VACCINE    ATTACKED                            NOT ATTACKED
A          O = 22, E = 18.36, O - E = +3.64    O = 68, E = 71.55, O - E = -3.55
B          O = 14, E = 17.54, O - E = -3.54    O = 72, E = 68.37, O - E = +3.63
O = observed values, E = expected values

2. APPLYING CHI-SQUARE TEST
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
$$\chi^2 = \frac{(3.64)^2}{18.36} + \frac{(3.55)^2}{71.55} + \frac{(3.54)^2}{17.54} + \frac{(3.63)^2}{68.37}$$
= 0.72 + 0.18 + 0.71 + 0.19
= 1.80

Finding the degree of freedom (df)
• Depends on the no. of columns and rows in the original table.
• df = (c-1)(r-1)
• c = No. of columns
• r = No. of rows
• For the given example df is
df = (c-1)(r-1)
= (2-1)(2-1)
= 1

PROBABILITY TABLES
• Then we turn to the published probability tables. On referring to the chi-square table with one degree of freedom, the value of chi-square for a probability of 0.05 is 3.84. The calculated value is smaller than 3.84, so the difference is not significant.
• So the null hypothesis is not rejected: the trial does not show vaccine B to be superior to vaccine A.
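The whole vaccine example can be retraced in a short Python sketch. It computes the expected frequencies as row total x column total / grand total without intermediate rounding, so the result (about 1.80) can differ in the second decimal from a hand calculation that rounds the proportions first. The critical value 3.84 is the tabulated figure quoted above; variable names are illustrative.

```python
# Observed 2 x 2 table: rows are vaccines A and B,
# columns are attacked and not attacked.
observed = [[22, 68],
            [14, 72]]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell under the null hypothesis.
expected = [[r * c / grand_total for c in column_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(observed, expected)
    for o, e in zip(observed_row, expected_row)
)
degrees_of_freedom = (len(observed[0]) - 1) * (len(observed) - 1)
critical_value = 3.84  # chi-square table, df = 1, probability 0.05
significant = chi_square > critical_value
print(round(chi_square, 2), degrees_of_freedom, significant)
```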
