ATOM
BOMB
PATHOLOGY
National Institute Of Unani
Medicine
DR. Md.Khurshid Alam
Biostatistics and
Research Methodology
SL.NO  CONTENTS
1  BIO-STATISTICS: Introduction
2  Scale
3  Data, Questionnaire
4  Measurement of central tendencies
5  Weighted average, Measure of position
6  Measures of dispersion
7  Standard deviation
8  Graphic presentation of frequency distribution
9  Probability
10  Estimation
11  Research Methodology
12  Research problem
13  Concept of variable
14  Research design
15  Types of control
16  Blinding
17  Clinical research
18  Inductive and deductive approaches
19  Sampling
20  Hypothesis
21  Z-test
22  T-test
23  Chi-square (χ²) Test
24  ANOVA
25  Mann–Whitney U test
26  Kruskal-Wallis Test
27  Mood's median test
28  Correlation
29  Regression analysis
30  Minitab
31  SPSS
32  Book, Journal, Compendium
33  Research Report: protocols and Report Format
34  Trend and possibilities of research in Unani
35  ICMR STATEMENT
36  WHO-Researchers’ responsibilities
37  WMA DECLARATION OF HELSINKI
BIO- STATISTICS: Introduction
❖ The word STATISTICS is derived from the Latin word STATUS, meaning
STATE (a political state).
❖ BIO-STATISTICS is the application of statistical tools and methods in
biology and medicine.
❖ John Graunt is regarded as the father of health statistics. In 1662 he published
“Natural and Political Observations Made upon the Bills of Mortality”.
Definition
❖ Statistics is the science of data that enables us to become
proficient data producers and efficient data users.
❖ In the plural form, it stands for numerical facts (facts expressed in
numbers) pertaining to a collection of objects.
❖ In the singular form, it stands for the science of collection,
organization, analysis and interpretation of numerical facts.
Prof. Horace Secrist defines
❖ By statistics we mean aggregates of facts affected to a marked
extent by multiplicity of causes, numerically expressed,
enumerated or estimated according to reasonable standards of
accuracy, collected in a systematic manner for a pre-determined
purpose and placed in relation to each other.
Branches of statistics
❖ Two branches –
1) Statistical methods
2) Applied statistics
Main branches of applied statistics are Biometry, Demography,
Econometrics, statistical quality control, Psychometry etc.
Scope and applications of statistics
Statistics is considered to be a distinct branch of study applicable to
investigation in many branches of science. Statistical methods are
applied to specific problems in biology, medicine, agriculture,
commerce, business, economics, industry, sociology etc.
Function of statistics
❖ It simplifies complexity of the data.
❖ It reduces the bulk of the data.
❖ It adds precision to thinking.
❖ It helps in comparing different sets of figures.
❖ It guides in formulation of policies and helps in planning.
❖ It indicates trends and tendencies.
❖ It helps in studying the relationship between different factors.
Limitations of statistics
❖ Statistics does not deal with qualitative data.
❖ Statistics does not deal with individual fact.
❖ Statistical inferences (conclusions) are not exact.
❖ Statistics can be misused.
❖ A layperson without training cannot handle statistics properly.
Basic Notions
❖ Units or Individuals – the objects whose characteristics are
studied.
❖ Population or Universe – the totality (collection) of units under
consideration.
❖ Finite population – population contains finite number of units.
❖ Infinite population – a population containing an infinite number of
units. E.g. heights of plants.
❖ Census – if each and every unit is studied, the study is
called complete enumeration or census.
❖ Quantitative characteristic – a characteristic which is
numerically measurable.
❖ Qualitative characteristic – a characteristic which is not
numerically measurable.
❖ Variable – a quantitative characteristic which varies from unit
to unit. E.g. height.
❖ Attribute – a qualitative characteristic which varies from unit to
unit. E.g. sex.
❖ Discrete variable – a variable which assumes only some specified values in a given range. E.g.
number of children per family.
❖ Continuous variable – a variable which can assume all the values in
the range. E.g. height of persons.
❖ Statistical survey or investigation – study of variables which
show statistical (stochastic, non-mathematical) variation.
❖ Investigator – the person who conducts the statistical survey.
❖ Informants (respondents) – the persons who supply information.
❖ Enumerators – agents who collect information and hand it over
to the investigator.
❖ Sample – a representative portion of the population.
❖ Census enumeration – a survey in which the whole population
is made use of.
Collection and classification of data
A statistician is concerned with the study of variables which
show statistical (stochastic, non-mathematical) variation. Such a
study is called a statistical investigation (statistical survey).
Investigator: The person who conducts the statistical survey is
called the investigator. The investigator plans the survey, collects the
required data, analyses them and finally draws conclusions.
Stages of statistical investigation
Mainly two stages –
I. Planning and preparation.
II. Execution of survey.
Execution has four steps, namely
1) Collection of data.
2) Scrutiny, editing and presentation of data.
3) Analysis of data.
4) Interpretation of analyzed data.
Quality of data
❖ “GIGO”: garbage in, garbage out. This means the researcher
must ensure high quality of data at every step.
Scale of measurement: A scale is a means of measuring the
magnitude of a characteristic. Basically, it is of two types –
(1) Crude – it provides a rough idea of magnitude, e.g. tall.
(2) Precise – it provides the exact value of magnitude, e.g. 2 cm.
• Nominal scale: Also called the classificatory scale. It divides data into
sub-groups on the basis of a common (shared) property or character.
The sub-groups are labels only, and their order has no importance.
For example, if we divide 10 people by income into high income,
average income and low income, on a nominal scale it does not matter
whether low income is written at the top or at the bottom.
• Ordinal scale: It has all the properties of the nominal scale. In addition,
it arranges the sub-groups of the parameter under study in order,
ascending or descending. So, first divide the objects on the
nominal scale and then arrange them on the ordinal scale.
• Interval scale: It has all the properties of the nominal and ordinal
scales. In addition, it places the ranked sub-groups at a
definite interval from one another.
The space between the starting and terminating points is called the
interval. Because the zero of this scale is arbitrary, it supports only
limited mathematical calculation: differences between values are
meaningful, but ratios are not.
• Ratio scale: It has all the properties of the nominal, ordinal and
interval scales. In addition, it always starts from a fixed zero.
Measurements on this scale are subject to mathematical calculation.
Every division is a definite measurement. It is an absolute scale
and the most precise scale.
Primary data are specially collected for a particular purpose. They are
reliable, complete and fresh.
Methods of collection of primary data:
1. Direct personal interview – the investigator personally comes
in contact with the units.
2. Indirect personal interview – the investigator does not
contact the units directly but contacts persons who
are in close association with the units. These persons
(informants) supply information to the investigator.
3. Information through correspondents – the investigator
appoints agents, called correspondents, at different places.
These correspondents collect the required data in their areas and
hand them over to the investigator.
4. Method of questionnaire (mail inquiry) – a questionnaire is
a list of questions, the answers to which are filled in by the
informants. These answers are the information required for
the investigation. It is cheap and consumes less time and labour.
5. Method of schedule (collection through enumerators) –
a schedule is a list of items on which the enumerators have
to collect and record information. It is filled in by the
enumerators. These data are reliable and accurate, but in
this method there is scope for bias.
General principles in drafting a questionnaire (schedule)
1. The number of questions should be as small as possible.
2. Questions should be short and simple.
3. If a lengthy question is unavoidable, it should be divided into two
or more parts.
4. Questions should be such that the answers to them are short, e.g. Are
you married?
5. As far as possible, questions regarding personal matters should be
avoided.
6. Questions should be so framed that they do not hurt the feelings of
the informants.
7. Questions should not be ambiguous.
8. Questions should be logically arranged.
9. Any clarification, if necessary, regarding any of the questions
should be provided.
10. Questions should be so framed that the validity of the information
supplied by informants can be cross-checked.
A covering letter introducing the investigator and indicating the
purpose of the survey should be attached to the questionnaire. It should
supply the necessary instructions to the informants regarding its return.
SECONDARY DATA
❖ Primarily collected for some other purpose.
❖ It may not contain all required information.
❖ Sources of secondary data are:
1) Published sources, e.g. Govt. reports.
2) Unpublished sources, e.g. records of Govt. offices.
Classification
❖ Units having common characteristics are grouped together.
❖ Each of these groups is called class.
❖ Simple or one-way classification – classification of units on the
basis of a single characteristic.
❖ Manifold classification – simultaneous classification of units on
the basis of two or more characteristics.
❖ Dichotomy – classification of units on the basis of a characteristic
into two classes, e.g. married and unmarried.
Function of classification
❖ Reduce the bulk of data.
❖ Simplifies the data.
❖ Facilitates comparison of characteristics.
❖ Renders the data ready for statistical analysis.
TABULATION is a systematic arrangement of classified data in the rows and
columns of a table.
CONTINGENCY table – a table showing manifold classified data.
Types of classification – four types
1) Quantitative classification – classification with regard to a variable.
2) Qualitative classification – classification with regard to an attribute.
3) Spatial classification (geographical classification) –
classification with regard to place.
4) Temporal classification (chronological classification) –
classification with regard to time.
Frequency table
❖ A systematic presentation of the values taken by a variable and the
corresponding frequencies is called the frequency distribution of
that variable.
❖ A tabular presentation of a frequency distribution is called a
frequency table.
A frequency distribution in which class intervals are considered is
a continuous (grouped) frequency distribution. If class intervals are
not considered, it is a discrete (ungrouped) frequency
distribution.
Terms-
❖ Class interval: - it is range between upper- and lower-class limit.
It is width of the class.
❖ Lower class limit: - smallest value of class.
❖ Upper class limit: - highest value of class.
❖ Class mark or class mid value: -the central value (middle most
value) of a class interval.
❖ Inclusive class interval: - if class interval is such that the lower as
well as the upper-class limit are included in the same class
interval. Usually inclusive type of class interval is adopted when
the variable is discrete.
❖ Exclusive class interval: - If class interval is such that the lower-
class limit is included in the same class interval, whereas, the
upper-class limit is included in the succeeding class interval.
❖ Open end class interval: Sometimes in a frequency distribution,
the class intervals at the extremities may lack one of the
limits. Such a class interval is called an open end class interval. E.g.
more than 100.
❖ Frequency: - the number of observations in any class.
Bivariate and multivariate frequency distribution
Frequency distribution of a single variable is called univariate
frequency distribution. Frequency distribution of more than one
variable is called multivariate frequency distribution.
Bivariate = on two variables
Measurement of central tendencies
Central tendency
❖ Generally, in frequency distribution, the values cluster around a
central value.
❖ The property of concentration of the values around a central
value is called central tendency.
❖ The central values around which there is concentration is called
measure of central tendency (measure of location, average).
Five important measures of central tendencies-
1. Arithmetic Mean (A.M)
2. Median
3. Mode
4. Geometric Mean (G.M)
5. Harmonic mean (H.M)
Desired Qualities of an ideal measure of central tendencies –
1) It should be easy to understand.
2) Its computation procedure should be simple.
3) It should be rigidly defined.
4) It should be based on all the values.
5) It should not be affected too much by abnormal extreme values.
6) It should be capable of further algebraic treatment so that it
could be used in further analysis of the data.
7) It should be stable. That is, the measure should be such that
sampling variation in the value of the measure should be least.
Arithmetic Mean (Mean)
❖ Arithmetic mean of a set of values is obtained by dividing the
sum of the values by the number of values in the set
❖ The arithmetic mean of the values x1, x2, …, xn is
$$\bar{X} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum x}{n}$$
❖ If the observations x1, x2, …, xn have frequencies f1, f2, …, fn,
the arithmetic mean is
$$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_n x_n}{f_1 + f_2 + \dots + f_n} = \frac{\sum fx}{N}$$
(for a discrete frequency distribution), where $N = \sum f$ is the total frequency.
For raw data, the arithmetic mean is
$$\bar{X} = \frac{\sum x}{n}$$
For tabulated data (discrete or continuous), it is
$$\bar{X} = \frac{\sum fx}{N}$$
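The two formulas above can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function names and the sample figures are hypothetical, and for a continuous distribution x would be taken as the class mid value.

```python
def mean_raw(values):
    """Arithmetic mean of raw data: sum(x) / n."""
    return sum(values) / len(values)


def mean_tabulated(xs, fs):
    """Arithmetic mean of tabulated data: sum(f * x) / N, with N = sum(f)."""
    total_frequency = sum(fs)
    return sum(f * x for x, f in zip(xs, fs)) / total_frequency


# Hypothetical raw data: number of children in six families.
children = [2, 4, 4, 5, 7, 8]
print(mean_raw(children))

# Hypothetical discrete frequency distribution: value x occurs f times.
values = [1, 2, 3, 4]
frequencies = [2, 3, 4, 1]
print(mean_tabulated(values, frequencies))
```

For the raw data the sum is 30 over 6 values, and for the table the sum of fx is 24 over a total frequency of 10.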
Change of origin and scale
❖ Let x1, x2, …, xn be n values.
Let ‘a’ be a constant.
Then x1 − a, x2 − a, …, xn − a are the values of x1, x2, …, xn with the
origin shifted to ‘a’.
If ‘c’ is a positive constant,
$$\frac{x_1 - a}{c},\ \frac{x_2 - a}{c},\ \dots,\ \frac{x_n - a}{c}$$
are the values x1, x2, …, xn with the origin shifted to a and the scale
changed by c.
Thus, $u = \frac{x - a}{c}$, so that $x - a = cu$ and $x = a + cu$.
And so,
$$\bar{X} = a + c\bar{u} = a + \frac{c \sum fu}{N}$$
However, if c = 1,
$$\bar{X} = a + \bar{u} = a + \frac{\sum u}{n}$$
Properties of arithmetic Mean
1) The algebraic sum of the deviations of a set of values from their
arithmetic mean is zero.
That is, $\sum (x - \bar{x}) = 0$
2) The sum of the squared deviations of a set of values is a minimum
when the deviations are taken around the arithmetic mean.
Let $\bar{x}_1$ be the arithmetic mean of a set of n1 values, and let $\bar{x}_2$
be the arithmetic mean of another set of n2 values. Then the
arithmetic mean of the two sets of values put together is
$$\bar{X} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$
(combined arithmetic mean)
Merits of arithmetic mean
❖ It is rigidly defined.
❖ It can be easily computed.
❖ Logic behind its computation can be easily understood.
❖ It can be easily adopted for further statistical analysis.
❖ It is based on all the values.
❖ It is more stable than any other average.
❖ It can be calculated even when some of the values are equal to
zero or negative.
Demerits of arithmetic mean
❖ It is highly affected by abnormal extreme values.
❖ Since it is based on all the values, even if one of the values is
missing, it cannot be calculated.
❖ Sometimes, the arithmetic mean may be a value which is not
assumed by the variable.
Median
❖ Median of a set of values is the middle most value when they are
arranged in the ascending order of magnitude. (such an
arrangement is called an array).
❖ It is a value that is greater than half of the values and lesser than
the remaining half.
❖ The median is denoted by M.
❖ In the case of raw data and also of a discrete frequency distribution,
the median is
$$M = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value in the arrayed series.}$$
In the case of a continuous frequency distribution, the median is
$$M = l + \frac{\left(\frac{N}{2} - m\right) \times c}{f}$$
where l: lower limit of the median class.
c: width of the median class.
f: frequency of the median class.
m: less-than cumulative frequency up to l.
N: total frequency.
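As a rough sketch of the continuous-distribution formula, the function below finds the median class by accumulating frequencies and then interpolates within it. The function name and the class limits and frequencies are hypothetical, and all classes are assumed to have the same width.

```python
def grouped_median(lower_limits, freqs, width):
    """Median of a continuous frequency distribution: l + (N/2 - m) * c / f."""
    total = sum(freqs)                 # N: total frequency
    cumulative = 0                     # m: less-than cumulative frequency up to l
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= total / 2:
            # This is the median class: the N/2-th observation falls in it.
            return lower + (total / 2 - cumulative) * width / f
        cumulative += f
    raise ValueError("frequency table is empty")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50 (exclusive type).
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]
print(grouped_median(lower_limits, freqs, width=10))
```

Here N = 40, so N/2 = 20 falls in the class 20-30, with l = 20, m = 13, f = 12 and c = 10, giving 20 + (7 × 10)/12.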
Merits of median
❖ The logic behind its computation is easily understood.
❖ It can be easily computed.
❖ Even if some extreme value is missing, it can be computed.
❖ It is not affected by abnormal extreme value.
❖ It can be used for the study of qualitative data also.
Demerits of median
❖ It is not based on all the values.
❖ It cannot be used in deep statistical analysis.
Mode
❖ Mode is the value which has highest frequency.
❖ It is most frequently occurring value.
❖ It is denoted by Z.
❖ In case of raw data, and also in case of a discrete frequency
distribution, mode Is the value with highest frequency.
❖ In the case of a continuous frequency distribution, the mode is
$$Z = l + \frac{(f - f_1) \times c}{2f - f_1 - f_2}$$
where
l: lower limit of the modal class.
f: frequency of the modal class.
c: width of the modal class.
f1: frequency of the class preceding the modal class.
f2: frequency of the class succeeding the modal class.
❖ The modal class is the class which contains the mode.
❖ Generally, the modal class is the class with the highest frequency.
But sometimes it may be a class other than the class with the
highest frequency. In such a situation, the mode is obtained by using
the formula
$$Z = l + \frac{c f_2}{f_1 + f_2}$$
❖ Unimodal – most frequency distributions have only one
value with the highest frequency. Such a frequency distribution is
unimodal; it has only one mode.
❖ Multimodal – if there is more than one value with the highest
frequency in a frequency distribution, it has more than one
mode.
❖ Bimodal – if there are two modes, the distribution is bimodal.
❖ For a distribution which has more than one mode, the mode is said to be ill
defined.
Merits and demerits of mode
❖ Merits and demerits of mode are the same as merits and demerits
of median
❖ One additional demerit is –
For some frequency distribution, mode is ill defined.
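The first mode formula for a continuous frequency distribution, Z = l + (f − f1) × c / (2f − f1 − f2), can be sketched as below. The function name and data are hypothetical. The sketch assumes equal class widths, takes the modal class to be the class with the highest frequency, and treats a missing neighbouring class as having frequency zero.

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode of a continuous frequency distribution with equal class widths."""
    # Index of the modal class (first class with the highest frequency).
    i = max(range(len(freqs)), key=lambda k: freqs[k])
    f = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0                # preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # succeeding class
    denominator = 2 * f - f1 - f2
    if denominator == 0:
        raise ValueError("mode is ill defined: neighbouring classes tie with the modal class")
    return lower_limits[i] + (f - f1) * width / denominator


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
print(grouped_mode([0, 10, 20, 30, 40], [5, 8, 12, 10, 5], width=10))
```

With f = 12, f1 = 8 and f2 = 10, the modal class is 20-30 and the mode is 20 + (4 × 10)/6.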
Geometric mean (G.M): The geometric mean of n values is the nth
root of the product of the values. It is denoted by G.
The geometric mean of n values x1, x2, x3, …, xn is
$$G = \sqrt[n]{x_1 \times x_2 \times \dots \times x_n}$$
If logarithms are used,
$$G = \operatorname{antilog}\left[\frac{\sum \log x}{n}\right] \quad \text{for raw data}$$
and
$$G = \operatorname{antilog}\left[\frac{\sum f \log x}{N}\right] \quad \text{for tabulated data}$$
The geometric mean is the appropriate measure for averaging rates
of growth. This is the reason why the geometric mean index number
is considered the best.
When any of the values is equal to zero, the geometric mean is not
defined. It is also not defined when some of the values are
negative. In practice it is used only when all the values are positive,
since the logarithmic formulas above require x > 0.
Harmonic mean (H.M): The harmonic mean of n values is the
reciprocal of the arithmetic mean of the reciprocals of the given
values. It is denoted by H.
Thus, the harmonic mean of the values x1, x2, x3, …, xn is
$$H = \frac{n}{\sum \left(\frac{1}{x}\right)}$$
In the case of tabulated data, the H.M is
$$H = \frac{N}{\sum \left(\frac{f}{x}\right)}$$
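A short sketch of the raw-data formulas for G and H is given below. The function names and the two sample values are hypothetical, and natural logarithms are used in place of common logarithms, which gives the same result because the antilog is taken in the same base.

```python
import math


def geometric_mean(values):
    """G = antilog(sum(log x) / n); requires every value to be positive."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(x) for x in values) / len(values))


def harmonic_mean(values):
    """H = n / sum(1 / x); undefined if any value is zero."""
    return len(values) / sum(1 / x for x in values)


data = [2, 8]  # hypothetical values
print(geometric_mean(data))  # square root of 2 * 8
print(harmonic_mean(data))   # 2 / (1/2 + 1/8)
```

For these values the arithmetic mean is 5, the geometric mean is 4 and the harmonic mean is 3.2, which illustrates the usual ordering A.M ≥ G.M ≥ H.M for positive values.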
Uses of different averages: the appropriate situations in which the
various averages can be used.
Arithmetic mean-
1. The average is required for deep statistical analysis.
2. The variable is continuous.
3. The variable is additive in nature.
Median-
1. The variable is discrete.
2. Some of the extreme values are missing.
3. There are abnormal extreme values.
4. Mode is ill defined.
5. The characteristic under study is qualitative.
Mode-
1. The modal value has a very high frequency compared to the other
frequencies.
2. Some of the extreme values are missing.
3. The variable is discrete.
4. There are abnormal extreme values.
5. The characteristic under study is qualitative.
Geometric mean- The variable is multiplicative in nature. Average
rates and ratios have to be found.
Harmonic mean- The reciprocal of the variable is additive in nature.
Empirical relation between mean, median and mode:
In a slightly skew distribution, the mean, median and mode show a
rough relation among themselves. It is
Mean – Mode = 3 (Mean – Median)
Mode = 3 Median – 2 Mean
That is, $Z = 3M - 2\bar{X}$
Weighted average: Sometimes some of the items in the data are
more important than the other items. For example, in the
postgraduate course in Unani pathology, the marks in ilmul alamat and ilmul
asbab carry greater importance than the marks in histology, cytology,
etc. In such a situation an appropriate average, with varying
weights assigned to the values, is necessary. Such an average is
called a weighted average. The weighted averages in common use are the
weighted arithmetic mean and the weighted geometric mean.
Let the values x1, x2, x3, …, xn be assigned the weights w1, w2, w3, …, wn
respectively. Then
the weighted arithmetic mean is
$$\bar{X} = \frac{\sum wx}{\sum w}$$
and the weighted geometric mean is
$$G_W = \operatorname{antilog}\left[\frac{\sum w \log x}{\sum w}\right]$$
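The two weighted averages can be sketched as follows. The function names, marks and weights are hypothetical and only illustrate the formulas; natural logarithms stand in for the log/antilog pair.

```python
import math


def weighted_arithmetic_mean(values, weights):
    """sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def weighted_geometric_mean(values, weights):
    """antilog(sum(w * log x) / sum(w)); values must be positive."""
    log_sum = sum(w * math.log(x) for x, w in zip(values, weights))
    return math.exp(log_sum / sum(weights))


# Hypothetical marks: the first subject is given three times the weight of the second.
marks = [60, 80]
weights = [3, 1]
print(weighted_arithmetic_mean(marks, weights))  # (3*60 + 1*80) / 4
print(weighted_geometric_mean(marks, weights))   # fourth root of 60^3 * 80
```

Because the first subject carries more weight, both averages lie closer to 60 than the unweighted mean of 70 does.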
Measure of position (partition values): The values which divide
the frequency distribution in definite ratios are called partition values.
E.g. median, quartiles, deciles, percentiles.
Quartiles: Quartiles divide the distribution into four quarters. For a
frequency distribution there are three quartiles.
Q1 – the first quartile, also called the lower quartile. It is the value
which is greater than one quarter of the observations and less than the
remaining three quarters.
Q2 – the second quartile. It divides the values into two equal halves. It is the same as the median.
Q3 – the third quartile or upper quartile. It is the value which
is greater than three quarters of the observations and less than the remaining one quarter.
For raw data and for a discrete distribution, the rth quartile is
$$Q_r = \left(\frac{r(n+1)}{4}\right)^{\text{th}} \text{ value in the arrayed series.}$$
For a continuous frequency distribution, the rth quartile is
$$Q_r = l + \frac{\left(\frac{rN}{4} - m\right) \times c}{f}$$
Here l, c, f and m are defined as for the median, but refer to the class
containing the required partition value.
Deciles: There are nine deciles for a frequency distribution, which are
denoted by D1, D2, D3, …, D9. They divide the frequency distribution into
ten equal parts.
For a continuous distribution,
$$D_r = l + \frac{\left(\frac{rN}{10} - m\right) \times c}{f}$$
Percentiles: Percentiles divide the frequency distribution into a
hundred equal parts. One percent of the values lie between two
consecutive percentiles. They are denoted by P1 to P99. There are
ninety-nine percentiles for a frequency distribution.
For a continuous distribution,
$$P_r = l + \frac{\left(\frac{rN}{100} - m\right) \times c}{f}$$
Median: The median divides the frequency distribution into two
halves.
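Since the quartile, decile and percentile formulas for a continuous distribution differ only in the divisor (4, 10 or 100), one function can cover all of them. The sketch below is an illustration under assumptions: the name `partition_value`, its parameters and the sample table are hypothetical, and all classes are assumed to have the same width.

```python
def partition_value(lower_limits, freqs, width, r, parts):
    """r-th partition value: l + (r*N/parts - m) * c / f.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    total = sum(freqs)                 # N
    target = r * total / parts         # position of the required value
    cumulative = 0                     # m: cumulative frequency below the class
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) * width / f
        cumulative += f
    raise ValueError("no class contains the requested position")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]
print(partition_value(limits, freqs, 10, r=1, parts=4))     # Q1
print(partition_value(limits, freqs, 10, r=3, parts=4))     # Q3
print(partition_value(limits, freqs, 10, r=9, parts=10))    # D9
print(partition_value(limits, freqs, 10, r=50, parts=100))  # P50, the median
```

For this table Q1 = 10 + (5 × 10)/8 = 16.25, Q3 = 30 + (5 × 10)/10 = 35 and D9 = 40 + (1 × 10)/5 = 42, while P50 coincides with Q2 and the median.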
Measures of dispersion
(Range, Quartile deviation, Mean deviation and standard deviation)
Dispersion (variation)
In a frequency distribution, though the values cluster around an
average, most of them differ from it. In some distribution, the
difference may be less, whereas in some other, it may be more. This
property of deviation of values from the average is called variation or
dispersion
Measures of dispersion
1) Range
2) Quartile deviation (semi- interquartile range, QD)
3) Mean deviation (M.D)
4) Standard deviation (S.D)
Among these four measures, standard deviation is the most
commonly used measure.
Essentials of good measure of variation
❖ It should be easy to understand.
❖ Its computation procedure should be simple.
❖ It should be rigidly defined.
❖ It should be based on all the values.
❖ It should not be affected too much by abnormal extreme values.
❖ It should be capable of further algebraic treatment.
❖ It should be stable.
Range
❖ Range is the difference between the highest and lowest values in the
data.
❖ If H is the highest value and L is the lowest value in the data, the
range of variation is
$$R = H - L$$
Coefficient of range
A relative measure of variation which is used for comparison of
frequency distributions is the coefficient of range. It is
$$\text{Coefft. of } R = \frac{H - L}{H + L}$$
Range is easy in computation and very simple to understand.
Demerits of range
➢ Since range is based only on the extreme values, it shows
too much fluctuation.
➢ It is highly affected by abnormal extreme values.
➢ If data has abnormal extreme values, range should not be
adopted for study.
Quartile deviation
(semi- interquartile range)
➢ The quartile deviation is obtained by dividing the range
between the lower and upper quartiles by 2.
➢ If Q1 and Q3 are the lower and upper quartiles, the quartile
deviation is
$$Q.D = \frac{Q_3 - Q_1}{2}$$
Coefft. of Q.D
A relative measure of variation based on the quartiles is the coefficient of
quartile deviation. It is
$$\text{Coefft. of } Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
Feature of Q.D
➢ It is based only on the lower and upper quartiles.
➢ It can be easily computed.
➢ It is not affected much by extreme values.
➢ It is not based on all the values.
➢ It is not convenient for mathematical treatment.
Standard deviation
Standard deviation of a set of values is the positive square- root of
mean of the squared deviations of the values from their arithmetic
mean. It is denoted by sigma ( 𝜎 )
The range is based only on the lowest and highest values. Quartile
deviation is based only on quartiles. But these measures are not based
on all the values. And so, we consider standard deviation which is
based on all the values.
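The standard deviation of the values x1, x2, x3, …, xn is
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
In the case of tabulated data (both continuous and discrete), with total
frequency $N = \sum f$,
$$\sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{N}}$$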
Variance:
The square of the standard deviation is called the variance.
The variance of x1, x2, x3, …, xn is
$$\operatorname{Var}(x) = \sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$$
It is the mean of the squared deviations of the values from their
arithmetic mean.
Computation of standard deviation for raw data:
$$\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$$
Computation of standard deviation for tabulated data:
$$\sigma = \sqrt{\frac{\sum fx^2}{N} - \left(\frac{\sum fx}{N}\right)^2}$$
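The definition and the two computational formulas can be checked against each other with a short sketch. The function names and data are hypothetical. Note that these formulas divide by n (or N), so they give the standard deviation of the data set itself rather than the sample estimate that divides by n − 1.

```python
import math


def sd_definition(values):
    """Square root of the mean of squared deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_raw(values):
    """Computational form for raw data: sqrt(sum(x^2)/n - (sum(x)/n)^2)."""
    n = len(values)
    return math.sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)


def sd_tabulated(xs, fs):
    """Computational form for tabulated data: sqrt(sum(f x^2)/N - (sum(f x)/N)^2)."""
    total = sum(fs)
    mean_of_squares = sum(f * x * x for x, f in zip(xs, fs)) / total
    mean = sum(f * x for x, f in zip(xs, fs)) / total
    return math.sqrt(mean_of_squares - mean ** 2)


raw = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical values, mean 5
print(sd_definition(raw), sd_raw(raw))
# The same data as a frequency table.
print(sd_tabulated([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))
```

All three calls agree: the squared deviations sum to 32, so the variance is 32/8 = 4 and the standard deviation is 2.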
If the origin of the values is shifted to a and the scale is changed by c, that
is, if $u = \frac{x - a}{c}$, then it can be shown that
S.D.(x) = c × S.D.(u)
Also, Var(x) = c² × Var(u)
Properties of standard deviation:
S.D. is independent of the origin of measurement, but not of the scale.
S.D. is the least of all root-mean-square deviations.
The combined standard deviation of two sets of n1 and n2 values is
$$\sigma = \sqrt{\frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}}$$
where $d_1 = \bar{x}_1 - \bar{x}$, $d_2 = \bar{x}_2 - \bar{x}$ and
$\bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$.
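The combined standard deviation formula can be sketched as below. The function name and the two hypothetical groups (8 values with mean 5 and S.D. 2, and 2 values with mean 2 and S.D. 1) are illustrative assumptions only.

```python
import math


def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined S.D. of two sets from their sizes, means and S.D.s."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean1 - combined_mean
    d2 = mean2 - combined_mean
    variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / (n1 + n2)
    return math.sqrt(variance)


# Hypothetical groups: [2, 4, 4, 4, 5, 5, 7, 9] and [1, 3].
print(combined_sd(8, 5.0, 2.0, 2, 2.0, 1.0))
```

Here the combined mean is 4.4, so d1 = 0.6 and d2 = −2.4, and the combined variance is (8 × 4.36 + 2 × 6.76)/10 = 4.84. Pooling the ten values and computing the variance directly gives the same 4.84, that is, a standard deviation of 2.2.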
Coefficient of variation: -
Coefficient of variation is relative measure of variation. It is used for
comparing the variation in frequency distribution. It is the standard
deviation expressed as a percentage of the mean.
Thus, the coefficient of variation is
$$C.V = \frac{\text{standard deviation}}{\text{arithmetic mean}} \times 100 = \frac{\sigma}{\bar{x}} \times 100$$
➢ A high value of the coefficient of variation indicates a high degree of
variation, and
➢ a low value indicates a low degree of variation.
➢ The coefficient of variation is independent of the unit of measurement
of the values.
Mean deviation
Mean deviation is the mean of absolute deviation of the values from
the central value.
Thus, the mean deviation of the set of values x1, x2, x3, …, xn from their
arithmetic mean is
$$M.D.(\bar{X}) = \frac{\sum |x - \bar{x}|}{n}$$
In the case of tabulated data, the M.D. from the A.M. is
$$M.D.(\bar{X}) = \frac{\sum f|x - \bar{x}|}{N}$$
The mean deviation of the values from the median M is
$$M.D.(M) = \frac{\sum |x - M|}{n}$$
In the case of tabulated data, the M.D. from the median M is
$$M.D.(M) = \frac{\sum f|x - M|}{N}$$
❖ Here, $|x_1 - \bar{x}|, |x_2 - \bar{x}|, |x_3 - \bar{x}|, \dots, |x_n - \bar{x}|$ are the
deviations with the signs ignored. The signs are ignored because if
the deviations are algebraically added, the sum reduces to zero
(property 1 of A.M.).
❖ Mean deviation may be calculated around any average: mean,
median, mode, etc.
Minimal property of median – the mean deviation is least when it is
measured from the median. This is called the minimal property of the median.
The coefficient of mean deviation from the arithmetic mean is
$$\text{Coefft. of } M.D.(\bar{X}) = \frac{M.D.(\bar{X})}{\bar{X}}$$
The coefficient of mean deviation from the median is
$$\text{Coefft. of } M.D.(M) = \frac{M.D.(M)}{M}$$
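The mean deviation about the mean and about the median, and the minimal property of the median, can be illustrated with a small sketch. The function name and data are hypothetical; `statistics.median` from the Python standard library is used to find the median.

```python
import statistics


def mean_deviation(values, centre):
    """Mean of the absolute deviations of the values from a chosen central value."""
    return sum(abs(x - centre) for x in values) / len(values)


data = [1, 2, 3, 4, 10]               # hypothetical values with one extreme value
mean = sum(data) / len(data)          # 4.0
median = statistics.median(data)      # 3

md_mean = mean_deviation(data, mean)      # (3 + 2 + 1 + 0 + 6) / 5
md_median = mean_deviation(data, median)  # (2 + 1 + 0 + 1 + 7) / 5
print(md_mean, md_median)
print("coefficient of M.D. from median:", md_median / median)
```

The mean deviation about the median (11/5) is smaller than the one about the mean (12/5), in line with the minimal property.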
Graphic presentation of frequency distribution
A) Tabulation (simple and frequency distribution table)
B) Chart and diagram
a) Histogram
b) Frequency polygon
c) Frequency curve
d) Ogive (cumulative frequency)
e) Bar diagram
f) Pie diagram
g) Map diagram
h) Pictogram
Histogram – Drawing procedure
❖ A histogram is simplest form of graphical presentation.
❖ Histogram for equal class intervals:
Horizontal axis – the X-axis, which need not start from zero, is divided
by dots into equal parts numbering two or three more than
the number of class intervals. Starting from the left, each dot is
labelled with the lower class limit of the successive classes, leaving a
space of the size of one class interval at each end of the X-axis.
Sometimes the horizontal axis instead measures the mid-points of the
successive class intervals.
❖ The vertical axis, which always begins with zero at the meeting point
of the two axes, is appropriately scaled to measure class frequencies along it.
❖ Rectangular bars are then constructed for successive class
intervals with their bases on the X-axis, such that the bases are equal in
width and the height (on the Y-axis) is equal to the corresponding class frequency.
❖ The area of the bar drawn for each class interval is
given by its class frequency f multiplied by the width of the class interval c.
Histogram - example
❖ Weekly income of 80 salesmen, as shown in the table:
Income     No. of salesmen (f)
50-59      6
60-69      9
70-79      15
80-89      25
90-99      13
100-109    7
110-119    5
Histogram for unequal class interval
❖ Procedure of drawing histogram for unequal class interval is
slightly different.
❖ A minor adjustment is required in the spacing of the dots
marked on the X-axis. For example, if one class interval has a width
of 15 points and the rest have a width of 5 points, the space on the X-axis for the
class interval of 15 points should be three times longer than that
for an interval of 5 points.
❖ The vertical axis for such class intervals measures the frequency
density and not the original class frequency.
❖ The frequency density for a class interval wider than
the others is the actual frequency of this class
divided by the number of times its width exceeds the common width.
For example, if the frequency of the class of 15 points
width is 69, the frequency density of this class is 69
divided by 3, that is 23.
❖ To draw a histogram for an open-ended distribution, follow the
usual procedure after leaving out the open-ended classes.
Frequency polygon
❖ Mark a dot at the mid-point of the top horizontal line of each bar of the
histogram and then join these dots by straight lines.
❖ Close the polygon at each end by drawing straight lines from
the mid-points of the tops of the first and the last rectangles to
the mid-points, on the horizontal axis, of the next outlying
intervals with zero frequency.
Drawing a frequency polygon does not necessarily require
constructing a histogram first.
OGIVE- Cumulative frequency curve
❖ Cumulative frequency curve is popularly known as ogive.
❖ The first step in drawing an ogive is to add another column of
cumulative frequencies, denoted as fc. This may be done by
finding cumulative frequencies on either a less-than or a more-than
basis.
❖ Less than cumulative frequency is obtained by adding successive
class frequencies from top to bottom.
❖ More than type cumulative frequency is obtained by adding up
successive class frequencies from bottom to top.
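The two kinds of cumulative frequency can be computed with a short sketch. The frequencies are those of the weekly income table in the histogram example; the function names are hypothetical.

```python
from itertools import accumulate


def less_than_cumulative(freqs):
    """Add successive class frequencies from top to bottom."""
    return list(accumulate(freqs))


def more_than_cumulative(freqs):
    """Add successive class frequencies from bottom to top."""
    return list(accumulate(reversed(freqs)))[::-1]


# Class frequencies for incomes 50-59, 60-69, ..., 110-119.
freqs = [6, 9, 15, 25, 13, 7, 5]
print(less_than_cumulative(freqs))  # ends at the total frequency, 80
print(more_than_cumulative(freqs))  # starts at the total frequency, 80
```

The less-than column is 6, 15, 30, 55, 68, 75, 80 and the more-than column is 80, 74, 65, 50, 25, 12, 5. These are the values plotted on the Y-axis of the two ogives.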
Ogive drawing procedure
❖ Once the cumulative frequencies are obtained, the procedure is as usual.
The only difference is that the Y-axis is now scaled so that
it accommodates the total frequency. The X-axis is labelled with
the upper class limits in the case of a less-than ogive, and the
lower class limits in the case of a more-than ogive.
❖ An advantage of ogives is that these curves are quick
to interpret.
Pie diagram
The total angle of a circle (pie) is 360 degrees, and the total area of the circle
represents 100%.
Hence, each percent occupies 360/100 degrees, that is,
1% = 3.6 degrees.
Hence the angle of the pie sector for a class frequency is
$$\text{angle} = \frac{\text{class frequency}}{\text{total observations}} \times 360$$
Example: According to NCAER, New Delhi, the forms of tobacco
consumption, as estimated by weight, are bidis 55%, cigarettes 16% and
others 29%. This is shown in a pie diagram.
[Pie diagram: Tobacco consumption. Bidis 55%, cigarettes 16%, others 29%.]
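The sector angles for the tobacco example follow directly from the rule 1% = 3.6 degrees. The sketch below uses the percentages quoted in the example; the function name is hypothetical.

```python
def pie_angles(frequencies):
    """Angle in degrees of each sector: class frequency / total * 360."""
    total = sum(frequencies.values())
    return {label: f * 360 / total for label, f in frequencies.items()}


tobacco = {"Bidis": 55, "Cigarettes": 16, "Others": 29}
for label, angle in pie_angles(tobacco).items():
    print(f"{label}: {angle:.1f} degrees")
```

This gives 198 degrees for bidis, 57.6 degrees for cigarettes and 104.4 degrees for others, which together make up the full 360 degrees.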
Probability
Probability is the chance that something will happen. In many instances,
we will have some knowledge about the possible outcomes of a
decision. In research we are unable to forecast the future with
complete certainty. Therefore, the need to cope with uncertainty
leads us to the study and use of probability theory. Probability is part of
our everyday lives.
Probability is expressed as fractions (1/5, 1/6, 1/15)
or as decimals (0.454, 0.475, 0.5669) between zero and one (0 - 1).
A probability of zero means that something will never happen.
A probability of 1 (one) indicates that something will definitely happen.
Any number between 0 (zero) and 1 (one) is a probability; that is, probability
covers the region between impossibility and certainty.
The value of a probability cannot be less than 0 or greater than 1.
Event: in probability theory, an event is one or more of
the possible outcomes of doing something. E.g. in a coin toss
experiment, getting a tail would be an event, and getting a head would
be another event.
Experiment: the activity that produces an event. E.g. a coin toss.
Sample space: the set of all possible outcomes of an
experiment is called the sample space. It is written as
S = {head, tail} in the coin toss experiment.
Mutually exclusive events: two or more events that cannot occur at the
same time; that is, one and only one of the events can take place at a time.
E.g. in a coin toss experiment either head or tail may turn up, but not
both.
Collectively exhaustive list of events: a list of events which includes
every possible outcome.
Dependent event: the probability of occurrence of an event is
dependent on, or affected in some way by, the occurrence of another
event.
Independent event: the occurrence of one event has no
effect on the probability of occurrence of another event.
Types (approaches) of probability: there are three basic
approaches:
1. Classical approach
2. Relative frequency approach
3. Subjective approach.
Classical approach of probability: classical probability is also called
a priori probability. It is defined as
$P(\text{event}) = \dfrac{\text{number of outcomes where the event occurs}}{\text{total number of possible outcomes}}$
In the coin toss experiment,
$P(\text{Head}) = \dfrac{1\ (\text{Head})}{2\ (\text{Head} + \text{Tail})} = 0.5$
$P(\text{Tail}) = \dfrac{1\ (\text{Tail})}{2\ (\text{Head} + \text{Tail})} = 0.5$
Relative frequency approach of probability: this method uses the
relative frequencies of past occurrences as probabilities. We
determine how often something has happened in the past and use
that figure to predict the probability that it will happen again in the
future. Life insurance companies use this approach.
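The difference between the classical and the relative frequency approaches can be sketched in Python. The function names and the record of past tosses are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """Classical (a priori) probability: favourable outcomes / all possible outcomes."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))


def relative_frequency(outcome, past_observations):
    """Relative frequency: how often the outcome occurred in past observations."""
    return past_observations.count(outcome) / len(past_observations)


coin = ["head", "tail"]
p_head = classical_probability({"head"}, coin)

# Hypothetical record of eight past tosses.
past_tosses = ["head", "tail", "tail", "head", "head", "tail", "head", "head"]
p_head_observed = relative_frequency("head", past_tosses)

print(p_head, p_head_observed)
```

The classical value is fixed by the sample space alone, whereas the relative frequency depends on the record of past occurrences and changes as more observations are collected.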
Subjective approach: it is based on the belief of the person making
the probability assessment. Frank Ramsey introduced the concept of
subjective probability in his 1926 essay "Truth and Probability", published in
The Foundations of Mathematics and Other Logical Essays.
Laws of probability
i. Addition rule for mutually exclusive events: the probability of
either event A or event B happening is written as
P(A or B) = P(A) + P(B)
ii. Addition rule for events that are NOT mutually exclusive:
if A and B are not mutually exclusive events, then
P(A or B) = P(A) + P(B) - P(AB)
where P(AB) is the probability that both events A and B occur together
at the same time.
iii. Multiplication law of probability: this is applied when two or
more events occur together but are independent of each
other.
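The two addition rules above can be checked on a single throw of a fair die. This is a minimal sketch; the die and the events chosen are assumptions made for illustration.

```python
from fractions import Fraction

# Sample space for one throw of a fair die (illustrative assumption).
SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Classical probability of an event given as a set of outcomes."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


def prob_a_or_b(a, b):
    """General addition rule: P(A or B) = P(A) + P(B) - P(AB).

    For mutually exclusive events P(AB) is zero, so this reduces to P(A) + P(B).
    """
    return prob(a) + prob(b) - prob(a & b)


# Mutually exclusive: throwing a 1 and throwing a 2.
print(prob_a_or_b({1}, {2}))

# Not mutually exclusive: "even number" and "greater than 4" share the outcome 6.
print(prob_a_or_b({2, 4, 6}, {5, 6}))
```

Subtracting P(AB) prevents the shared outcome from being counted twice.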
Probability under conditions of statistical independence: when two
events happen, the occurrence of one event has no effect on the
probability of the occurrence of the other event.
There are three types of probability under statistical independence:
1. Marginal probability: the probability of a single event, P(A)
2. Joint probability: P(AB) = P(A) x P(B)
3. Conditional probability: P(B|A) = P(B) and P(A|B) = P(A)
Probability under statistical dependence is likewise of three
types: conditional, joint and marginal.
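The relations for statistically independent events can be verified by listing the sample space of two tosses of a fair coin. The sketch below assumes a fair coin and uses the general definition P(B|A) = P(AB) / P(A), which is not stated in these notes.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event given as a true/false test on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def first_head(o):
    return o[0] == "H"


def second_head(o):
    return o[1] == "H"


def both_heads(o):
    return first_head(o) and second_head(o)


p_a = prob(first_head)    # marginal probability P(A)
p_b = prob(second_head)   # marginal probability P(B)
p_ab = prob(both_heads)   # joint probability P(AB)
p_b_given_a = p_ab / p_a  # conditional probability P(B|A)

print(p_a, p_b, p_ab, p_b_given_a)
```

Because the tosses are independent, the joint probability equals the product of the marginals, and the conditional probability equals the marginal.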
Probability distribution: a probability distribution is classified as either
continuous or discrete.
Continuous probability distribution: the variable under
consideration is allowed to take any value within a given range, so we
cannot list all the possible values. E.g. height of children.
Discrete probability distribution: the variable can take only
a limited number of values in a given range, and these can be listed. E.g.
number of children in a family.
Binomial (Bernoulli) distribution: it is named after the process described by Jacob Bernoulli, a Swiss
mathematician. The binomial distribution describes discrete, not
continuous, data resulting from an experiment known as a Bernoulli
process.
Conditions of a Bernoulli process-
1. Each trial has only two possible outcomes. In the coin toss
experiment the outcome is either head or tail. Other examples are
success or failure, and yes or no.
2. The probability of the outcome of any trial remains fixed over time.
E.g. in a fair coin toss experiment, the probability of a head remains
0.5 for each toss regardless of the number of times the coin is
tossed.
3. The outcome of one trial does not affect the
outcome of any other trial. The trials are
statistically independent.
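The probability of r successes in n trials is given as
$P(r) = {}^{n}C_{r}\, p^{r} q^{n-r} = \dfrac{n!}{r!\,(n-r)!}\, p^{r} q^{n-r}$
where p is the probability of success in a single trial and q = 1 - p.
The mean of the binomial distribution is $\mu = np$.
The standard deviation of the binomial distribution is $\sigma = \sqrt{npq}$.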
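The binomial formula, mean and standard deviation can be evaluated directly with Python's standard library. The function names below are assumptions made for this sketch.

```python
from math import comb, sqrt


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    q = 1 - p
    return comb(n, r) * p**r * q ** (n - r)


def binomial_mean_sd(n, p):
    """Mean (np) and standard deviation (sqrt(npq)) of the binomial distribution."""
    return n * p, sqrt(n * p * (1 - p))


# Probability of exactly 2 heads in 3 tosses of a fair coin.
print(binomial_pmf(2, 3, 0.5))

# Mean and standard deviation for 100 tosses of a fair coin.
print(binomial_mean_sd(100, 0.5))
```

Summing `binomial_pmf(r, n, p)` over r = 0 to n should give 1, since the list of possible numbers of successes is collectively exhaustive.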
➢ When p = 0.5, the binomial distribution is symmetrical.
➢ When p > 0.5, the binomial distribution is skewed to the left.
➢ When p is small (0.1), the binomial distribution is skewed to the
right.
➢ As p increases towards 0.5 (e.g. 0.3), the skewness is less noticeable.
➢ The probabilities for p = 0.3 are the same as those for p = 0.7, except that the
values of p and q are reversed.
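The Poisson distribution: it is a discrete probability distribution whose
mean rate of occurrence (λ) is estimated from previous data.
The Poisson probability formula is
$P(x) = \dfrac{\lambda^{x} e^{-\lambda}}{x!}$
where λ is the mean number of occurrences per interval and x is the number of occurrences.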
The Poisson distribution can be used instead of the binomial distribution
to avoid tedious binomial calculations if n is large and p is
small, that is, when the number of trials is large and the binomial
probability of success is small. It gives a good approximation of the binomial
when n is greater than or equal to 20 and p is less than or equal to 0.05.
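To see how close the approximation is, the two formulas can be compared side by side. The values n = 100 and p = 0.02 are hypothetical, chosen to satisfy the rule of thumb above, and λ is taken as np.

```python
from math import comb, exp, factorial


def binomial_pmf(r, n, p):
    """Exact binomial probability of r successes in n trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def poisson_pmf(x, lam):
    """Poisson probability of x occurrences when the mean is lam."""
    return lam**x * exp(-lam) / factorial(x)


n, p = 100, 0.02  # hypothetical: many trials, small probability of success
lam = n * p       # Poisson mean set equal to the binomial mean np

for x in range(5):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
```

The two columns should agree closely, and the Poisson column needs only one exponential and one factorial per value.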
The normal probability distribution, a continuous probability
distribution (Gaussian distribution): it is named after Carl Friedrich (Karl) Gauss,
who developed it in the early 19th century.
[Figure: normal curve; horizontal axis marked -2, -1, 0, 1, 2]
➢ The curve is bell-shaped and has a single peak; it is unimodal.
➢ The mean of a normally distributed population lies at the center of its
normal curve.
➢ Because of the symmetry of the normal probability distribution, the
mean, median and mode have the same value, at the center of the
curve.
➢ The two tails of the normal probability distribution extend
indefinitely and never touch the horizontal axis.
➢ The total area under the curve is 1.00 (total probability).
➢ In a normally distributed population:
❖ Approximately 68% of all the values lie within ±1
standard deviation of the mean.
❖ Approximately 95.5% of all the values lie within ±2
standard deviations of the mean.
❖ Approximately 99.7% of all the values lie within ±3
standard deviations of the mean.
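The areas quoted above can be reproduced with the `NormalDist` class from Python's standard `statistics` module. The helper name `area_within` is an assumption made for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Area under the standard normal curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(area_within(k) * 100, 2), "%")
```

The computed areas should come out close to the approximate percentages listed above. Because the curve never touches the horizontal axis, even ±3 standard deviations does not cover the whole area.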
Estimation
Statistical inference is based on estimation and hypothesis testing. In
both estimation and hypothesis testing, we make inferences about
characteristics of populations from information contained in a sample.
There are two types of estimate of a population parameter:
1) A point estimate: it is a single number that is used to estimate an
unknown population parameter. A point estimate is more useful
if it is accompanied by an estimate of the error that might be
involved.
$\bar{x} = \dfrac{\sum x}{n}$
Thus, by using the sample mean $\bar{x}$ as the estimator, we have a point
estimate of the population mean $\mu$.
Similarly, we can use the sample variance $s^2$ to estimate
the population variance $\sigma^2$, where the sample variance $s^2$ is given
by the formula
$s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
2) An interval estimate: it is a range of values used to estimate a
population parameter. It indicates the error in two ways: by the
extent of its range, and by the probability of the true population
parameter lying within that range.
In other words, an interval estimate is a range of values within which a
population parameter is likely to lie.
❖ Interval estimate and confidence level: the probability that
we associate with an interval estimate is called the
confidence level. A higher probability means more
confidence. In estimation, the most commonly used
confidence levels are 90%, 95% and 99%, but we are free
to apply any confidence level. The confidence interval is the
range of the estimate we are making.
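A point estimate and an interval estimate can be computed together, as in the sketch below. It rests on assumptions not stated in these notes: the interval is built as the sample mean plus or minus z times the standard error (a normal approximation, which suits large samples; small samples would normally use the t distribution), and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def point_and_interval_estimate(sample, confidence=0.95):
    """Return the sample mean, the sample variance and a confidence interval for the mean."""
    n = len(sample)
    x_bar = mean(sample)   # point estimate of the population mean
    s2 = variance(sample)  # sample variance, with n - 1 in the denominator
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * sqrt(s2 / n)  # z times the standard error of the mean
    return x_bar, s2, (x_bar - margin, x_bar + margin)


sample = [12.0, 15.0, 11.0, 14.0, 13.0]  # hypothetical observations
x_bar, s2, interval_95 = point_and_interval_estimate(sample)
_, _, interval_99 = point_and_interval_estimate(sample, confidence=0.99)

print(x_bar, s2, interval_95, interval_99)
```

A higher confidence level gives a wider interval: more confidence that the range contains the parameter is bought at the cost of a less precise range.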
Criteria for a good estimator
❖ Unbiasedness, efficiency, consistency and sufficiency.
Research Methodology
Research definition-
Research simply means a search for facts, an answer to a question and
a solution to a problem. It is a purposive investigation. It is an organised
enquiry. It seeks to find explanations of unexplained phenomena, to
clarify doubtful facts and to correct misconceived facts.
There are two types of method to search for facts:
1) Arbitrary method (unscientific method): it is based on
opinion, imagination, belief, impression etc. Its findings vary
from person to person.
2) Scientific method: it is a systematic and rational approach to
seeking facts.
Aim of research-
➢ Discover the new facts
➢ Verify or test the old facts.
➢ Develop new scientific tools, concept and theories.
Research is a scientific endeavour. It involves the scientific method.
The scientific method is based on certain articles of faith. These are:
• Reliance on empirical evidence: truth is established on the basis
of evidence.
• Use of relevant concepts: use concepts with specific meanings.
• Commitment to objectivity: objectivity is the hallmark of the
scientific method.
• Ethical neutrality: -
• Generalization: -
• Verifiability: -
• Logical and reasoning process:
Characteristics of research
➢ It is a systematic and critical investigation into a phenomenon.
➢ It is a purposive investigation aiming at describing, interpreting
and explaining a phenomenon.
➢ It adopts scientific methods.
➢ It is objective and logical.
➢ It is based upon observable experience or empirical evidence.
➢ It is directed towards finding answers.
➢ It emphasizes the development of generalisations, principles and
theories.
➢ It also stands up to test and criticism.
Purpose of research
✓ Research extends the knowledge of human beings.
✓ It explains undiscovered phenomena.
✓ It verifies and tests existing facts and theories.
✓ It develops general laws.
✓ It analyses inter-relationships between variables.
✓ It derives causal explanations.
✓ Applied research aims at finding solutions to real-life problems.
✓ It also develops new tools, concepts and theories.
✓ It contributes to human development.
Types of research: -
A)According to intent—
1) Pure research/ basic/ fundamental research
2) Applied research
3) Exploratory research
4) Descriptive research
5) Diagnostic studies
6) Evaluation studies
7) Action research- it is a type of evaluation study.
B)According to method of studies-
1) Experimental research
2) Clinical research
3) Analytical studies
4) Historical research – it studies the past record and data
5) Survey
Research approaches (inquiry mode/scale of measurement): two
types-
❖ Quantitative approach
❖ Qualitative approach
On the basis of application: two types-
❖ Pure/ fundamental/ basic research
❖ Applied research
On the basis of objective: four types-
❖ Descriptive research
❖ Exploratory research
❖ Explanatory research
❖ Correlational research
1. Pure/ basic/ fundamental research: it aims at the extension of
knowledge. It need not be problem-oriented. It may lead to
either the discovery of a new theory or the refinement of an existing theory. It
lays the foundation for applied research. E.g. the humoral theory of
Hippocrates, Einstein's theory of relativity, etc.
2. Applied research: it is real-life, problem-oriented and action-directed
research. It seeks an immediate and practical result.
3. Descriptive research: it is the simplest type of research. It is a fact-
finding investigation with adequate interpretation. It systematically
describes a situation, phenomenon, problem, service or programme, or
describes attitudes towards an issue.
4. Exploratory research: it is also known as formulative research. This
type of study is undertaken with the objective either of exploring an
area where little is known or of investigating the possibility of
undertaking a particular research study.
5. Explanatory research: it clarifies the relationship between two
aspects of a situation or phenomenon.
6. Correlational research: it discovers or establishes the existence of a
relationship/ association/ interdependence between two or more
aspects of a situation.
❖ Quantitative approach: it is a structured/ rigid/ predetermined
methodology to quantify the extent of variation in a phenomenon,
situation, issue, etc. It has reliability and objectivity.
❖ Qualitative approach: it is an unstructured/ flexible/ open
methodology to describe variation in a phenomenon, situation,
issue, etc. The emphasis is on the description of variables.
Hence, research is a scientific undertaking which, by means of logical
and systematic techniques, aims to discover new facts or verify and test
old facts, analyse their sequence, interrelationships and causal
explanations, and develop new scientific tools, concepts and theories which
would facilitate reliable and valid study of human behaviour. It
also stands up to the test of criticism.
Steps of research
➢ Formulating a research problem
➢ Research design conceptualisation
➢ Instrument construction for data collection
➢ Sampling
➢ Research proposal writing
➢ Data collection
➢ Data processing
➢ Research report writing
Research problem
It is a difficulty or a problem demanding a solution within the subject
area of the researcher's discipline. It is the first step in a scientific enquiry.
There are five components of a problem:
➢ Research consumer
➢ Research consumer's objectives
➢ Alternative means to meet the objectives
➢ Doubt in regard to the selection of alternatives
➢ There must be one or more environments to which the difficulty
or problem pertains.
Selection of a problem: it is the first step in research. Selection is itself a
problem. One who has a critical, curious and imaginative mind and is
sensitive to practical problems can easily identify a problem for study.
Sources for Selection of a problem: -
• Review of literature
• Academic experience
• Daily experience
• Exposure to field situations
• Consultations, brainstorming
• Research
• Institution
Formulating the problem: it needs to meet the following criteria-
I. Internal criteria: these consist of-
a) Researcher's interest
b) Researcher's competence
c) Researcher's own resources.
II. External criteria –
a) Researchability of the problem
b) Importance and urgency
c) Novelty of the problem
d) Feasibility
e) Facilities
f) Usefulness and social relevance
g) Research personnel
Objective of formulating a problem: a problem well put is half
solved. Formulation serves this purpose. The clear and accurate
statement of the problem, the development of a conceptual model, the
definition of the objectives of the study, the setting of investigative
questions, the formulation of hypotheses to be tested, the
operational definition of concepts and the delimitation of the study
together determine the exact data needs of the study. Formulation prevents wastage of
time and energy. It provides direction to the study. It determines the methods
to be adopted.
Techniques involved in formulating a problem: these include-
I. Developing the title: it indicates the core of the study and reflects the real
intention of the researcher. The title should be long enough to cover the
subject and short enough to retain interest.
II. Building a conceptual model: a conceptual model gives an exact
idea of the research problem and shows its various properties
and the variables to be studied.
III. Defining the objectives of the study: these indicate what we are trying to
achieve through the study.
Criteria of a good research problem-
1. Verifiable evidence: other observers can see or check it.
2. Accuracy: it means the truth or correctness of a statement.
3. Precision: that is, making it as exact as necessary.
4. Systematisation: data should be collected in a systematic and
organised way.
5. Objectivity: that is, being free from all biases and vested
interests.
6. Recording: that is, writing down complete details as
quickly as possible.
7. Controlling conditions: controlling all variables except
one.
8. Training investigators: that is, imparting the necessary
knowledge to investigators.
Types of research question: a research study can
ask three types of question-
▪ Descriptive question: describes phenomena or characteristics of
a particular group of subjects being studied.
▪ Relationship question: investigates the degree to which two or
more variables are associated with each other.
▪ Difference question: makes comparisons between or within
groups of interest.
A research question must identify –
▪ Variable under study
▪ Population being studied
▪ Testability of question
Concept of variable
A variable is a characteristic, trait or attribute of a subject.
❖ Variable: a quantitative characteristic which varies from unit
to unit. E.g. height
❖ Attribute: a qualitative characteristic which varies from unit to
unit. E.g. sex
❖ Discrete variable: a variable which takes only some specified values in a given range. E.g.
number of children per family.
❖ Continuous variable: a variable which can assume all the values in
the range. E.g. height of persons
Types of variable –
▪ Independent: any variable which is used to bring about a change
(effect) is called an independent variable.
▪ Dependent: the variable that changes under the effect of another
variable is called the dependent variable.
▪ Extraneous: an independent variable which is unwanted for the
purpose of the study but may affect the dependent variable is called an
extraneous variable/ factor.
▪ Chance variable: it is also an independent and unwanted variable,
which may affect the dependent variable by chance.
Research design
It is a logical and systematic plan prepared for directing a research study. It
specifies the objectives of the study and the methodology and techniques to be
adopted for achieving the objectives. It constitutes the blueprint for
the collection, measurement and analysis of data. A research design is a
programme that guides the investigator in the process of collecting,
analysing and interpreting observations.
According to Cook: a research design is the arrangement of
conditions for the collection and analysis of data in a manner that aims
to combine relevance to the research purpose with economy in
procedure.
Components of research design
1. Dependent and independent variables:
Phenomena that can assume different quantitative values, even
to decimal points, are known as continuous variables, and
those whose values can be expressed only as integers are called
non-continuous variables.
2. Extraneous factors: independent variables which are not directly
related to the purpose of the study but affect the dependent variable.
3. Control: the term control is used in experimental research to reflect
the restraint of experimental conditions. It is used to minimise the
effect of extraneous independent variables.
4. Confounded relationship: the relationship between the dependent and
independent variables is said to be confounded by an extraneous
variable when the dependent variable is not free from its effect.
❖ Research hypothesis: it is a predictive statement which relates a
dependent variable to an independent variable.
❖ Control group: in experimental research, the group which is exposed
to usual conditions is known as the control group.
❖ Experimental group: the group which either receives or is exposed to
the intervention is called the experimental group.
❖ Treatment: it refers to the different conditions to which the
experimental and control groups are subjected.
Functions of research design
❖ It relates to the identification and development of the procedures
and logistical arrangements required for the study.
❖ It emphasises the quality of the adopted procedures to ensure validity,
objectivity and accuracy.
A research design should give the following information-
❖ Who will constitute the study population?
❖ The method of identifying the study population.
❖ Whether the whole population will be studied or not. If not, the selection
of the sample and the method of sampling.
❖ The method of data collection, with justification.
❖ How will ethical issues be addressed?
Different research designs (studies are of different types; hence a single
research design is not suitable for all studies).
A) On the basis of the number of contacts with the study population,
research designs are of three types-
1. Cross-sectional study or prevalence study (one contact only): this
study is cross-sectional with respect to both the study population and the time of
investigation. It is an extremely simple design used to study the
prevalence of a phenomenon, situation or issue. It is easy and cheap,
but change cannot be measured by this study.
2. Before-and-after study (two contacts): it is also known as the pre-
test/post-test design. It can be described as two sets of cross-sectional
data collection points on the same population, to find out the change in
the phenomenon or variable between two points in time. It can
measure change in a situation, phenomenon or issue. It is an
appropriate design for measuring the impact or effectiveness of a
programme. It may be either experimental or non-experimental. It is
more difficult and more expensive, and requires a longer time to
complete. It measures total change (change produced by both
the independent variable and extraneous variables). The results of this study
may be contaminated by the maturation effect, the reactive effect and
the regression effect.
3. Longitudinal study (more than two contacts): in this study the population
is visited a number of times at regular intervals to collect the
required information. The number of visits varies from study to
study. The interval may be days, weeks, months or years, depending
upon the study. The pattern of change can be studied by this method,
but the maturation effect, reactive effect, regression effect and
conditioning effect can produce errors in the data.
B) On the basis of the reference period, study designs are of three types-
1. Retrospective study design: this study focuses on a
problem or phenomenon which has happened in the past. Data
can be collected on the basis of recall of the situation. It is always
non-experimental.
2. Prospective study design: in this design the study looks forward into the
future. It may be experimental, non-experimental or semi-
experimental.
3. Retro-prospective design: this study focuses on past trends in
a phenomenon and follows it into the future. It is a combination of
retrospective and prospective studies.
C) On the basis of the nature of the investigation, study designs are of three
types-
1. Non-experimental study
2. Experimental study
3. Semi-experimental study
1. Non-experimental study: this is a cause-tracing study; that is, it starts
from the effect to trace the cause. The environment is not controlled in
this study.
2. Experimental study: it is a study of a cause-and-effect relationship. It starts
from the cause to establish the effect. An experimental study can be carried
out in either a controlled or a natural environment.
Prof. Fisher has enumerated three principles of experimental design-
a) The principle of replication: the experiment should be
repeated more than once. Thus, each treatment is applied to many
experimental units instead of one. By doing so, the statistical
accuracy of the experiment is increased.
b) The principle of randomisation: when
we conduct an experiment, randomisation provides protection
against the effect of extraneous factors.
c) The principle of local control: this means that we should plan the
experiment in such a manner that we can perform a two-way
analysis of variance, in which the total variability of the data is
divided into three components, attributed to the treatments, the
extraneous factor and experimental error.
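Replication and randomisation can be illustrated with a short allocation routine. This is a sketch only: the function name, the subject numbers and the use of simple shuffling followed by alternate assignment are assumptions, not a procedure prescribed in these notes.

```python
import random


def randomise(subjects, treatments, seed=None):
    """Randomly allocate subjects to treatments in groups of (nearly) equal size."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)      # randomisation: chance decides the order
    allocation = {treatment: [] for treatment in treatments}
    for index, subject in enumerate(shuffled):
        # Replication: each treatment is given to many experimental units.
        allocation[treatments[index % len(treatments)]].append(subject)
    return allocation


# Twelve hypothetical subjects, numbered 1 to 12.
allocation = randomise(range(1, 13), ["treatment", "control"], seed=1)
print(allocation)
```

Because chance alone decides which group a subject joins, extraneous factors tend to be spread evenly across the groups rather than favouring one treatment.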
Types of experimental study design: there are many types of
experimental study design. Some of those used in medical
science and public health are-
❖ The after-only design
❖ The before-and-after design
❖ The control group design
❖ The double control design
❖ The comparative design
❖ The matched control experimental design
❖ The placebo design
The after-only design: in this design the baseline data (pre-test or
before observation) are constructed on the basis of the respondents' recall
of the situation before the intervention, or from information available
in existing records. Only one set of data, after the intervention, is collected.
The change in the dependent variable is measured by the difference
between the baseline and after-intervention data. This study
measures total change, including change attributable to extraneous
variables; hence, the net effect of the intervention cannot be identified by this
design. Because the baseline data are not properly collected, the
two sets of data are not strictly comparable. So it is a faulty design for
measuring the impact of an intervention.
The before-and-after design: in this design baseline data are collected
before the intervention and another set of data is collected after the
intervention from the same population. Hence, the data are comparable. This design
also measures the total change.
Effect (change in dependent variable) = (intervention effect
+ extraneous effect + chance effect) – baseline data.
The control group design: in experimental research, the term control
is used to refer to the restraint of experimental conditions. This study is
designed to minimise the effect of extraneous independent variables. In
this study a test article is compared with a treatment that has a known
effect. The control group may receive no treatment, a standard treatment
or a placebo. The chief objective of the control group is to quantify the
impact of extraneous variables. By this design the net effect of the
intervention is measured.
Effect in dependent variable = (intervention effect + extraneous
effect + chance effect) – (effect in control group + baseline data).
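One way to read the two formulas above is that the change in the control group estimates the extraneous and chance effects, so subtracting it from the change in the experimental group leaves the net effect of the intervention. The sketch below follows that reading; the function names and the scores are hypothetical.

```python
def total_change(before, after):
    """Before-and-after design: total change in the dependent variable."""
    return after - before


def net_intervention_effect(exp_before, exp_after, ctrl_before, ctrl_after):
    """Control group design: change in the experimental group minus change in the control group."""
    return total_change(exp_before, exp_after) - total_change(ctrl_before, ctrl_after)


# Hypothetical mean scores before and after the study period.
experimental_change = total_change(40, 60)  # intervention + extraneous + chance
control_change = total_change(40, 45)       # extraneous + chance only
net_effect = net_intervention_effect(40, 60, 40, 45)

print(experimental_change, control_change, net_effect)
```

With these figures, a before-and-after design alone would credit the intervention with the whole change in the experimental group, while the control group design attributes part of it to extraneous and chance factors.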
Purpose of control study
➢ It helps in differentiating the outcome caused by the test
treatment from outcomes caused by other (extraneous) factors.
➢ It helps the investigator to know what would have happened to the
patients if they had not received the treatment.
➢ It provides sufficient evidence to establish the effectiveness of the
use of a Unani medicine/ procedure in prevention, diagnosis or
treatment.
A control may consist of-
❖ No treatment (plain control)
❖ Placebo treatment
❖ Well established treatment
❖ Different dose of same treatment
❖ Full scale treatment
❖ Minimal treatment
❖ Alternative treatment
In experimental hypothesis-testing research, when a group is
exposed to usual conditions, it is termed the control group, and the group
which is exposed to some novel/special condition is termed the
experimental group.
Types of control
A) Plain control (no treatment in the control group): it is always
open; that is, neither the patient nor the investigator is blinded.
E.g. effect of Hijamah in hair fall.
B) Placebo control:
A placebo is a pharmacologically inert substance used in a clinical
trial. Placebo treatment is also called dummy treatment.
• Single-blind control
• Double-blind control (ideal)
• Triple-blind control
C) Standard control
D) Dose-response control
E) External control: subjects receiving treatment are compared
with a group of patients external to the study.
F) Multiple control: more than one type of control group.
Disadvantages-
➢ Ethical concerns.
➢ In certain conditions (e.g. hypertension) it may not be wise to withdraw the patient
from medication, as this may pose a serious
threat to the patient's wellbeing.
Matching: it is a method of forming comparable control and
experimental groups to minimise the effect of extraneous variables.
Matching on a minimum number of parameters is better. Matching may be –
A) Topographic
B) Same habit
C) Same socioeconomic status
D) Dietary habit matching
E) Professional matching
F) Age matching
G) Sex matching
Blinding
Blinding is a method of controlled experimentation in which the subject
or the researcher or both are not informed about the treatment given.
According to the level of blinding, trials can be divided into the following four types-
1. Unblinded study/ open study: in this type of study both the patients
and the investigators (everybody involved in the trial) are aware of the
identity of the treatment given.
2. Single-blind trial: in this type of study the patients are not aware
of the trial treatment being given to them, but their physician does
know about it.
3. Double-blind trial: in this type of study neither the patients nor
the investigators know which treatment an individual patient is given.
4. Triple-blind trial: a double-blind study design in which the
response is also monitored by a committee that is blind is called a triple-
blind study. In this type, the patients, the investigators and the data
analysts do not know which treatment is being given, since the
treatments are coded. When the trial reaches a predefined point, the
code is broken and the trial is unblinded. This design has an advantage
over the double-blind study because the monitoring committee can
evaluate the response in a more objective fashion.
Trial with Zelen's design: in this design eligible individuals are
randomised before they give consent to participate in the trial, to
receive either a standard treatment or an experimental intervention.
Those who are allocated to the standard treatment are given the standard
treatment and are not told that they are part of a trial, whereas those
who are allocated to the experimental intervention are offered the
experimental intervention and told that they are part of a trial. If they
refuse to participate in the trial, they are given the standard
treatment but are analysed as if they had received the experimental
intervention.
Clinical research/ trial
A clinical research study or trial is a systematic study of a pharmaceutical product or
procedure in human subjects. A clinical trial is a prospective study
comparing the outcome of a certain intervention against a control in
human subjects. It may proceed from cause to effect or from effect to cause.
Objectives-
➢ To evaluate the safety and efficacy of Unani drugs whose effects are already
claimed by Unani physicians.
➢ To develop new Unani drugs.
➢ To develop economical, easily available Unani drugs.
➢ To develop new indications for Unani drugs or to change the dosage
form or route of administration.
➢ Objectives may be oriented towards diseases, drugs, procedures or
fundamentals of science.
➢ Disease-oriented objectives are-
❖ To study the aetiology.
❖ To study the pathogenesis.
❖ To study the clinical method.
❖ To study the principles of methods of treatment.
❖ To study the prognosis of disease.
❖ To study the complications of disease.
➢ Drug-oriented objectives are-
❖ To study the safety and efficacy of Unani drugs.
❖ Clinical studies: therapeutic trials of single and compound
drugs in different diseases.
➢ Clinico-pharmacological studies-
❖ To validate various regimens (e.g. cupping).
❖ To validate the fundamentals of the Unani system of medicine.
❖ To develop parameters for mizaj assessment.
❖ To develop diagnostic tools based on Unani fundamentals.
❖ To validate asbabe sitta zarooria and gair zarooria.
➢ Clinical trials may be concerned with ilaz bil tadbeer, ilaz bil
gheza, surgical procedures, radiotherapy or other alternative
approaches.
There are three elements of experimental design-
A) Control
B) Randomisation
C)Blinding
Phases of clinical trials: - there are four phases of clinical trials, each
proceeding from the one before.
❖ Phase I (Pharmacological phase)
• Always preceded by preclinical data on the safety and efficacy
of the test drug from studies on animal subjects.
• Human subjects (healthy volunteers) are exposed to the test
drug for the first time.
• The purpose of the trial is to find out toxicity, to determine the
maximum tolerated dose in humans, and to study the
pharmacokinetics of the drug, such as its metabolism and
distribution.
❖ Phase II (Exploratory phase)
• It proceeds after successful phase I trials.
• It is conducted on patients.
• The purpose of the trial is to assess safety and efficacy in
patients.
• It also studies the pharmacodynamics and pharmacokinetics of the test drug.
• Informed consent from the subject is mandatory.
❖ Phase III (Confirmatory phase)
• It proceeds after successful phase II trials.
• The purpose of the trial is to confirm the safety and efficacy of
the test drug. Long-term tolerance, different doses and regimens,
and drug interactions are studied.
• Subjects are patients, hence informed consent from the subject is
mandatory.
❖ Phase IV (Post-marketing surveillance)
• It is not a clinical trial; rather, it is feedback, including adverse
drug reaction (ADR) reports, from patients after the drug is marketed.
• Delayed and rare effects may be reported from the field.
• Effects in special populations or conditions
may be reported.
• For this phase, voluntary reporting and cohort study
methods are adopted.
Case studies methods
Herbert Spencer was the first social philosopher to use the case study
in comparative studies of different cultures. Several case studies were
described by Zakaria Razi in his writings.
The case study is a method of exploring and analysing the life of a social
unit or entity, be it a person, a family, an institution or a community.
The aim of the case study method is to locate or identify the factors that
account for the behaviour patterns of a given unit, and its relationship
with the environment. A case study is conducted for understanding,
exploring and interpreting an issue under study about which little is
known.
Advantage of case study method- it provides an opportunity for the
intensive analysis of many specific details often overlooked by other
methods.
Disadvantage of case study method- the case documents hardly fulfil
the criteria of reliability, adequacy and representativeness.
The case chosen may be extremely typical or atypical.
Case control study: - It is a retrospective study. It is the first approach
to testing a causal hypothesis.
a) Both exposure and outcome (disease) have occurred
before the start of the study.
b) The study proceeds from effect to cause (backward direction).
c) It uses a control or comparison group to support or refute an
inference.
There are four basic steps in conducting a case control study-
1. Selection of cases and controls
2. Matching
3. Measurement of exposure
4. Analysis and interpretation
Trend studies- this is the most appropriate method of investigation for
mapping change over a period.
Cohort study
A cohort is defined as a group of people (units) who share a common
characteristic or experience within a defined time period (age,
occupation, pregnancy etc.).
A cohort study is a type of analytical (observational) study which is
usually undertaken to obtain additional evidence to refute or support
the existence of an association between a suspected cause and a disease.
It is a longitudinal, incidence study.
Action research:- it is carried out to identify areas of concern, develop
and test alternatives, and experiment with new approaches.
There are two traditions of action research-
(1) The British tradition (2) The American tradition
Inductive and deductive approaches to research
(Inductive = Zuj se qul, from part to whole; Deductive = qul se zuj, from whole to part)
The main difference between inductive and deductive approaches to
research is that whilst a deductive approach is aimed at testing theory,
an inductive approach is concerned with the generation of new theory
emerging from the data.
A deductive approach usually begins with a hypothesis, whilst an inductive
approach will usually use research questions to narrow the scope of the study.
For deductive approaches the emphasis is generally on causality, whilst
for inductive approaches the aim is usually focused on exploring new
phenomena or looking at previously researched phenomena from a
different perspective.
Inductive approaches are generally associated with qualitative
research, whilst deductive approaches are more commonly associated
with quantitative research. However, there are no set rules and some
qualitative studies may have a deductive orientation.
One specific inductive approach that is frequently referred to in
research literature is grounded theory, pioneered by Glaser and Strauss.
This approach necessitates the researcher beginning with a completely
open mind without any preconceived ideas of what will be found. The
aim is to generate a new theory based on the data.
Once the data analysis has been completed the researcher must examine existing
theories in order to position their new theory within the discipline.
Grounded theory is not an approach to be used lightly. It requires
extensive and repeated sifting through the data, and analysing and
re-analysing it multiple times, in order to identify new theory. It is an
approach best suited to research projects where the phenomenon to
be investigated has not been previously explored.
The most important point to bear in mind when considering whether to
use an inductive or deductive approach is firstly the purpose of your
research; and secondly the methods that are best suited to either test a
hypothesis, explore a new or emerging area within the discipline, or to
answer specific research questions.
Sampling
A part of the population is known as a sample. The method of
selecting a portion of the universe (population) for study, with
a view to drawing conclusions about the universe (population), is known
as sampling.
Sampling helps in saving time and cost.
A statistic is a characteristic of a sample; it is denoted by a lower-case
Roman letter. E.g. sample size is denoted by n, sample standard
deviation is denoted by s. A parameter, in contrast, is a characteristic of
the population and is denoted by a Greek or capital letter.
E.g. population size is denoted by N, population standard deviation is
denoted by σ.
Advantage of sampling
➢ Limit the number of units for study. (Unit– the object whose
characteristics are studied)
➢ It makes study feasible in respect of budget, time and logistics.
Characteristics of a good sample
➢ Representativeness: a sample must be representative of the
population.
➢ Accuracy: an accurate sample is one which exactly represents the
population.
➢ Precision: Precision is measured by standard error.
➢ Size: a good sample must be adequate in size in order to be
reliable.
Types of sampling: - there are two generic types-
A) Random or probability sampling
B) Non-random or non-probability sampling
A) Random or probability sampling: - It is based on the theory of
probability. It provides a known non-zero chance of selection for
each population element. There are four methods of random
sampling-
1. Simple random sampling
2. Systematic random sampling
3. Stratified random sampling
4. Cluster sampling
1.Simple random sampling: - This sampling technique gives each
element an equal and independent chance of being selected.
➢ Sample numbers are drawn by using (a) the lottery method, (b) a table
of random numbers or (c) a computer.
➢ This type of sampling is suited for a small homogeneous
population.
➢ This is one of the easiest methods.
2. Systematic random sampling: - In this sampling, elements are
selected from the population at a uniform interval that is measured in
time, order, or space, starting from a randomly chosen element. It is
simpler than simple random sampling.
It ignores all elements that fall between the selected positions.
3. Stratified random sampling: - In this method we divide the
population into relatively homogeneous groups, called strata. Then
we use one of two approaches: either we select at random from
each stratum a specified number of elements corresponding to the
proportion of that stratum in the population as a whole (proportional
allocation), or we draw an equal number of elements from each stratum
and weight the results according to the stratum's proportion of the
total population (equal allocation).
4. Cluster sampling: - In this method we divide the population into
groups, or clusters, and then select a random sample of these clusters.
We assume that these individual clusters are representative of the
population as a whole. A well-designed cluster sampling procedure
can produce a more precise sample at considerably less cost than that
of simple random sampling.
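The first three methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 patient IDs, the two strata and the function names are hypothetical, and the stratified version uses proportional allocation with simple rounding, which may not add up exactly to n for other stratum sizes.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Give every element an equal chance: draw n elements without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random start, then take every k-th element (k = N // n)."""
    rng = random.Random(seed)
    k = len(population) // n          # sampling interval
    start = rng.randrange(k)          # random start within the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n, seed=None):
    """Proportional allocation: each stratum contributes in proportion to its size."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        share = round(n * len(units) / total)   # units to draw from this stratum
        sample[name] = rng.sample(units, share)
    return sample

# Hypothetical population of 100 patient IDs, split into two strata of 60 and 40.
patients = list(range(1, 101))
strata = {"male": patients[:60], "female": patients[60:]}

srs = simple_random_sample(patients, 10, seed=1)
sys_s = systematic_sample(patients, 10, seed=1)
strat = stratified_sample(strata, 10, seed=1)
```

With these inputs the systematic sample takes every 10th patient after the random start, and the stratified sample draws 6 units from the larger stratum and 4 from the smaller one.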
➢ Need for randomisation- The process of assigning the study
subjects randomly to either the treatment or the control group is
called randomisation.
❖ It is essential to control various known and even unknown
biases at the beginning of the trial and during its course, and
randomisation is very helpful in achieving this objective.
❖ Randomisation removes allocation bias that could otherwise
influence the result.
❖ Randomisation allows for valid statistical interpretation of
raw data.
❖ It eliminates selection bias.
❖ It avoids systematic differences between groups.
❖ It produces comparable groups.
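A minimal sketch of simple randomisation into two arms is given below. The subject IDs, the arm names and the function name are assumptions made for illustration; real trials often use block or stratified randomisation instead.

```python
import random

def randomise(subjects, seed=None):
    """Randomly allocate subjects to two equal-sized arms by shuffling the list."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)             # every ordering is equally likely
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical subject IDs 1 to 20.
groups = randomise(range(1, 21), seed=42)
```

Because allocation depends only on the shuffle, neither the investigator nor the subject can influence which arm a subject enters.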
B) Non-random or non-probability sampling: - It is not based on the
theory of probability. This sampling does not provide a known chance of
selection for each population element. This method of sampling is
simple, convenient and low in cost. It may be classified into-
1. Convenience or accidental sampling: - It means selecting
sample units in a "hit and miss" fashion. It is the cheapest and
simplest method and does not require any statistical expertise. But
it is highly biased because of the researcher's subjectivity.
2. Purposive or judgemental sampling: - this method means
deliberate selection of sample units that conform to some
pre-determined criteria. The sample may not be truly representative
of its parent population.
3. Quota sampling: - This is a form of convenience sampling
involving selection of quota groups of accessible sampling
units by traits such as sex, age, social class, etc.
4. Snow-ball sampling: -This is a method of building up a list or
a sample of a special population by using an initial set of its
members as informants. It is useful for smaller populations for
which no frame is readily available.
Sampling distribution
➢ If we take several samples from a population, the statistic we
would compute for each sample need not be the same and most
probably would vary from sample to sample.
➢ A probability distribution of all the possible means of the samples
is a distribution of the sample means. Statisticians call this a
sampling distribution of the mean.
➢ The standard deviation of the distribution of sample means is
called the standard error of the mean.
➢ The standard deviation of the distribution of sample proportions
is called the standard error of the proportion.
➢ The standard deviation of the distribution of a sample statistic is
known as the standard error of the statistic.
➢ The sampling distribution has a mean equal to the population
mean: $\mu_{\bar{x}} = \mu$.
➢ The sampling distribution has a standard deviation (a standard
error) equal to the population standard deviation divided by the
square root of the sample size. Therefore, the standard error of the
mean for an infinite population is given by:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
where $\sigma$ is the population standard deviation and n is the sample size.
If the sample mean is standardised and is taken from a
normally distributed population, then the standardised sample mean is
given by: -
$$Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$
The central limit theorem: - First, the mean of the sampling
distribution will equal the population mean regardless of the sample
size, even if the population is not normal. Second, as the sample size
increases, the sampling distribution of the mean will approach
normality, regardless of the shape of the population distribution. This
relationship between the shape of the population distribution and the
shape of the sampling distribution of the mean is called the central
limit theorem.
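The two claims of the theorem can be checked with a small simulation. The sketch below assumes a strongly skewed exponential population with mean 1 and standard deviation 1; the sample size of 40, the 2000 repetitions and the seed are arbitrary choices made for illustration.

```python
import random
import statistics

rng = random.Random(0)
n = 40            # size of each sample
repeats = 2000    # number of samples drawn

# Exponential population with rate 1: mean 1, standard deviation 1, strongly skewed.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(repeats)
]

mean_of_means = statistics.fmean(sample_means)   # expected to be close to the population mean, 1
sd_of_means = statistics.stdev(sample_means)     # expected to be close to 1 / sqrt(40), about 0.158
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is far from normal.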
The relationship between sample size and standard error: -
The use of the finite population multiplier in calculating the standard
error.
If the population size N is known and the sampling fraction $n/N > 0.05$,
then we use the following formula to calculate the standard error of
the mean-
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \times \sqrt{\frac{N-n}{N-1}}$$
Here N = size of population and n = size of sample.
The term $\sqrt{\frac{N-n}{N-1}}$ in the above equation is the finite population multiplier.
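As a rough sketch, the standard error with and without the finite population multiplier can be computed as follows. The function name and the figures (σ = 12, n = 36, N = 100) are hypothetical.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean; applies the finite population multiplier when n/N > 0.05."""
    se = sigma / math.sqrt(n)
    if N is not None and n / N > 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

se_infinite = standard_error(sigma=12, n=36)          # 12 / 6 = 2.0
se_finite = standard_error(sigma=12, n=36, N=100)     # 2.0 * sqrt(64 / 99)
```

The multiplier is always below 1, so sampling a sizeable fraction of a finite population reduces the standard error.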
Hypothesis= Hypo+thesis
(Hypo means under; thesis means research theory) A hypothesis is an
assumption about the relation between variables. It is a tentative
explanation of the research problem or a guess about the research
outcome.
Importance of hypothesis-
▪ It gives direction to research.
▪ It suggests new experiments and observations.
▪ It enables collecting relevant data and organising them
effectively.
▪ It prevents indiscriminate gathering of data.
Sources of hypothesis –
▪ Existing theories
▪ Findings of previous studies.
▪ Personal experience.
▪ Analogy.
Criteria for hypothesis construction – A hypothesis is never formulated
in the form of a question. The following criteria should be followed for
hypothesis construction—
➢ It should be empirically testable, whether it turns out to be right or wrong.
➢ It should be specific and precise.
➢ The statement of the hypothesis should not be contradictory.
➢ It should specify variables.
➢ It should describe only one issue.
➢ It must consider the experience of other researchers.
Need of hypothesis –
✓ It provides a definite point for the investigation
✓ It guides the direction of the study
✓ It specifies the source of data and determines the data needs
✓ It determines the most appropriate technique for analysis
✓ It contributes to the development of theory.
Characteristics of a good hypothesis
1. Conceptual clarity
2. Specificity
3. Empirically testable
4. Availability of techniques
5. Theoretical relevance
6. Consistency
7. Objectivity
8. Simplicity
Types of hypothesis
➢ Null hypothesis (H0) and alternative hypothesis (Ha)
➢ One-tailed or two-tailed hypotheses (directional vs non-directional)
The alternative hypothesis is usually the one the researcher wishes to
prove, and the null hypothesis is the one the researcher wishes to
disprove. The null hypothesis represents the hypothesis we are trying to
reject; the alternative hypothesis represents all other possibilities. The
null hypothesis should always be a specific hypothesis, i.e. it should not
state "about" or "approximately" a certain value.
Concept of hypothesis testing: -
A) The level of significance- it is a very important concept in the
context of hypothesis testing. It is always some percentage
(usually 5%), which should be chosen with great care, thought and
reason. A 5% level of significance means the researcher is willing to
take as much as a 5% risk of rejecting the null hypothesis when the null
hypothesis happens to be true.
Type I and type II error –
Type I error- we reject H0 when H0 is true. A Type I error means
rejection of a hypothesis which should have been accepted. Its
probability is denoted by α (alpha) and is also called the level of
significance of the test.
Type II error – we accept H0 when it is not true. It means accepting a
hypothesis which should have been rejected. Its probability is denoted
by β (beta).
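The link between the level of significance and the Type I error can be illustrated by simulation. In the sketch below the null hypothesis is true by construction, so every rejection is a Type I error; the null mean of 100, the standard deviation of 12, the sample size of 30 and the number of trials are assumed values chosen only for illustration.

```python
import random
import statistics

rng = random.Random(1)
mu0, sigma, n = 100.0, 12.0, 30     # hypothetical null mean, known SD, sample size
critical_z = 1.96                   # two-tailed critical value at the 5% level
trials = 2000

rejections = 0
for _ in range(trials):
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]    # H0 is true by construction
    z = (statistics.fmean(sample) - mu0) / (sigma / n ** 0.5)
    if abs(z) > critical_z:
        rejections += 1             # a Type I error: rejecting a true H0

type_i_rate = rejections / trials   # expected to be close to alpha = 0.05
```

Choosing a smaller α lowers this rate, but for a fixed sample size it raises β, the chance of a Type II error.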
Two tailed or one tailed test
A one tail test should be used when we are to test, say, whether the
population mean is either lower than or higher than some
hypothesised value.
A two-tailed test rejects the null hypothesis if, say, the sample mean is
significantly higher or lower than the hypothesised value of the mean
of the population.
(When we accept a null hypothesis on the basis of sample
information, we are really saying that there is no statistical evidence
to reject it. We are not saying that the null hypothesis is true. The only
way to prove a null hypothesis is to know the population parameter,
and that is not possible with sampling. Thus, we accept the null
hypothesis and behave as if it is true simply because we can find no
evidence to reject it).
The steps in hypothesis testing using a standardised scale: -
1. Decide whether it is a one-tailed or a two-tailed test.
2. State the hypothesis.
3. Select a level of significance appropriate for the decision.
4. Decide which distribution (t or z) is appropriate and find the
critical values for the chosen level of significance from the
appropriate table.
5. Calculate the standard error of the sample statistic.
6. Use the standard error to convert the observed value of the sample
to the standardised value.
7. Sketch the distribution and mark the position of the standardised
sample value and the critical values of the test.
8. Compare the value of the standardised sample statistic with the
critical values for this test and interpret the result.
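The numerical steps above might be sketched as follows for a two-tailed z test with a known population standard deviation. The function name and the figures (sample mean 52, hypothesised mean 50, σ = 6, n = 36) are hypothetical, and `statistics.NormalDist` stands in for the printed z table.

```python
from statistics import NormalDist

def z_test_two_tailed(sample_mean, mu0, sigma, n, alpha=0.05):
    """Follow the steps above for a two-tailed z test with known population SD."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)     # step 4: critical value
    se = sigma / n ** 0.5                              # step 5: standard error
    z = (sample_mean - mu0) / se                       # step 6: standardised value
    reject = abs(z) > critical                         # step 8: compare and interpret
    return z, critical, reject

# Hypothetical figures: H0 says the population mean is 50.
z, critical, reject = z_test_two_tailed(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)
```

Here the standard error is 1, so the standardised value is 2.0, which lies just beyond the 5% critical value of about 1.96, and H0 is rejected.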
STATISTIC
Parametric: - t-test, z-test, chi square test (F-test)
Non-parametric: - chi square test, Fisher test, Mann–Whitney U test
Z-test
A Z-test is any statistical test for which the distribution of the test
statistic under the null hypothesis can be approximated by a normal
distribution. Because of the central limit theorem, many test statistics
are approximately normally distributed for large samples. For each
significance level, the Z-test has a single critical value (for example,
1.96 for 5% two tailed) which makes it more convenient than the
Student's t-test which has separate critical values for each sample size.
Therefore, many statistical tests can be conveniently performed as
approximate Z-tests if the sample size is large or the population
variance is known. If the population variance is unknown (and
therefore has to be estimated from the sample itself) and the sample
size is not large (n < 30), the Student's t-test may be more appropriate.
If T is a statistic that is approximately normally distributed under the
null hypothesis, the next step in performing a Z-test is to estimate the
expected value θ of T under the null hypothesis, and then obtain an
estimate s of the standard deviation of T. After that the standard score
Z = (T − θ) / s is calculated, from which one-tailed and two-tailed p-
values can be calculated as Φ(−Z) (for upper-tailed tests), Φ(Z) (for
lower-tailed tests) and 2Φ(−|Z|) (for two-tailed tests) where Φ is the
standard normal cumulative distribution function.
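A short sketch of these three p-value formulas is given below, using Python's `statistics.NormalDist` for Φ. The function name and the example values (T = 1.96, θ = 0, s = 1) are assumptions for illustration.

```python
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal cumulative distribution function

def z_p_values(T, theta, s):
    """Return Z = (T - theta) / s with its upper-, lower- and two-tailed p-values."""
    z = (T - theta) / s
    return {
        "z": z,
        "upper": PHI(-z),
        "lower": PHI(z),
        "two_tailed": 2 * PHI(-abs(z)),
    }

# Hypothetical statistic: T = 1.96 with theta = 0 and s = 1.
p = z_p_values(T=1.96, theta=0.0, s=1.0)
```

For Z = 1.96 the two-tailed p-value comes out very close to 0.05, which matches the critical value quoted above for the 5% two-tailed test.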
Use in location testing
The term "Z-test" is often used to refer specifically to the one-sample
location test comparing the mean of a set of measurements to a given
constant when the population variance is known. If the observed data
$X_1, \ldots, X_n$ are (i) independent, (ii) have a common mean μ, and
(iii) have a common variance $\sigma^2$, then the sample average
$\bar{X}$ has mean μ and variance $\sigma^2 / n$.
The null hypothesis is that the mean value of X is a given number $\mu_0$.
We can use $\bar{X}$ as a test statistic, rejecting the null hypothesis if
$\bar{X} - \mu_0$ is large.
To calculate the standardized statistic $Z = (\bar{X} - \mu_0) / s$, we need to
either know or have an approximate value for $\sigma^2$, from which we can
calculate $s^2 = \sigma^2 / n$. In some applications, $\sigma^2$ is known, but this is
uncommon.
If the sample size is moderate or large, we can substitute the sample
variance for $\sigma^2$, giving a plug-in test. The resulting test will not be an
exact Z-test since the uncertainty in the sample variance is not
accounted for. However, it will be a good approximation unless the
sample size is small.
A t-test can be used to account for the uncertainty in the sample
variance when the data are exactly normal.
There is no universal constant at which the sample size is generally
considered large enough to justify use of the plug-in test. Typical rules
of thumb: the sample size should be 50 observations or more.
For large sample sizes, the t-test procedure gives almost identical p-
values as the Z-test procedure.
Other location tests that can be performed as Z-tests are the two-sample
location test and the paired difference test.
Conditions
For the Z-test to be applicable, certain conditions must be met.
• Nuisance parameters should be known, or estimated with high
accuracy (an example of a nuisance parameter would be the
standard deviation in a one-sample location test). Z-tests focus on
a single parameter, and treat all other unknown parameters as
being fixed at their true values. In practice, due to Slutsky's
theorem, "plugging in" consistent estimates of nuisance
parameters can be justified. However, if the sample size is not
large enough for these estimates to be reasonably accurate, the Z-
test may not perform well.
• The test statistic should follow a normal distribution. Generally,
one appeals to the central limit theorem to justify assuming that a
test statistic varies normally. There is a great deal of statistical
research on the question of when a test statistic varies
approximately normally. If the variation of the test statistic is
strongly non-normal, a Z-test should not be used.
If estimates of nuisance parameters are plugged in as discussed above,
it is important to use estimates appropriate for the way the data were
sampled. In the special case of Z-tests for the one or two sample
location problem, the usual sample standard deviation is only
appropriate if the data were collected as an independent sample.
In some situations, it is possible to devise a test that properly accounts
for the variation in plug-in estimates of nuisance parameters. In the case
of one and two sample location problems, a t-test does this.
Example
Suppose that in a particular geographic region, the mean and standard
deviation of scores on a reading test are 100 points, and 12 points,
respectively. Our interest is in the scores of 55 students in a particular
school who received a mean score of 96. We can ask whether this mean
score is significantly lower than the regional mean—that is, are the
students in this school comparable to a simple random sample of 55
students from the region as a whole, or are their scores surprisingly
low?
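First calculate the standard error of the mean:
$$\mathrm{SE} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{55}} = \frac{12}{7.42} = 1.62$$
where $\sigma$ is the population standard deviation.
Next calculate the z-score, which is the distance from the sample mean
to the population mean in units of the standard error:
$$z = \frac{M - \mu}{\mathrm{SE}} = \frac{96 - 100}{1.62} = -2.47$$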
In this example, we treat the population mean and variance as
known, which would be appropriate if all students in the region
were tested. When population parameters are unknown, a t test
should be conducted instead.
The classroom mean score is 96, which is −2.47 standard error units
from the population mean of 100. Looking up the z-score in a table of
the standard normal distribution, we find that the probability of
observing a standard normal value below −2.47 is approximately 0.5 −
0.4932 = 0.0068. This is the one-sided p-value for the null hypothesis
that the 55 students are comparable to a simple random sample from
the population of all test-takers. The two-sided p-value is
approximately 0.014 (twice the one-sided p-value).
Another way of stating things is that with probability
1 − 0.014 = 0.986, a simple random sample of 55 students would have
a mean test score within 4 units of the population mean. We could also
say that with 98.6% confidence we reject the null hypothesis that the
55 test takers are comparable to a simple random sample from the
population of test-takers.
The Z-test tells us that the 55 students of interest have an unusually low
mean test score compared to most simple random samples of similar
size from the population of test-takers. A deficiency of this analysis is
that it does not consider whether the effect size of 4 points is
meaningful. If instead of a classroom, we considered a subregion
containing 900 students whose mean score was 99, nearly the same z-
score and p-value would be observed. This shows that if the sample
size is large enough, very small differences from the null value can be
highly statistically significant. See statistical hypothesis testing for
further discussion of this issue.
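The arithmetic of this example can be reproduced with a few lines of Python. This is a sketch only: the function name is hypothetical, and `statistics.NormalDist` is used instead of a printed normal table, so the p-value may differ from the tabulated one in the last digit.

```python
from statistics import NormalDist

def one_sample_z(sample_mean, mu, sigma, n):
    """Standard error, z-score and one-sided (lower-tail) p-value for a sample mean."""
    se = sigma / n ** 0.5
    z = (sample_mean - mu) / se
    return se, z, NormalDist().cdf(z)

se, z, p_one = one_sample_z(96, 100, 12, 55)       # the school: z is about -2.47
p_two = 2 * p_one                                  # two-sided p-value
_, z_sub, p_sub = one_sample_z(99, 100, 12, 900)   # the subregion of 900 students
```

The subregion's difference of only 1 point gives nearly the same z-score as the school's 4 points, because its standard error is so much smaller.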
Z-tests other than location tests
Location tests are the most familiar Z-tests. Another class of Z-tests
arises in maximum likelihood estimation of the parameters in a
parametric statistical model. Maximum likelihood estimates are
approximately normal under certain conditions, and their asymptotic
variance can be calculated in terms of the Fisher information. The
maximum likelihood estimate divided by its standard error can be used
as a test statistic for the null hypothesis that the population value of the
parameter equals zero. More generally, if $\hat{\theta}$ is the maximum
likelihood estimate of a parameter θ, and $\theta_0$ is the value of θ under the
null hypothesis,
$$\frac{\hat{\theta} - \theta_0}{\mathrm{SE}(\hat{\theta})}$$
can be used as a Z-test statistic.
When using a Z-test for maximum likelihood estimates, it is important
to be aware that the normal approximation may be poor if the sample
size is not sufficiently large. Although there is no simple, universal rule
stating how large the sample size must be to use a Z-test, simulation
can give a good idea as to whether a Z-test is appropriate in a given
situation.
Z-tests are employed whenever it can be argued that a test statistic
follows a normal distribution under the null hypothesis of interest.
Many non-parametric test statistics, such as U statistics, are
approximately normal for large enough sample sizes, and hence are
often performed as Z-tests.
t-test
t-test:- In 1908, theoretical work on the t distribution was published by
W. S. Gosset. Gosset was an employee of the Guinness Brewery in Dublin,
Ireland. The Guinness Brewery did not permit employees to publish
research findings under their own names, so Gosset adopted the pen
name Student and published under that name. The t distribution is
commonly called Student's t distribution or simply Student's
distribution.
Conditions for use of t-test
➢ Sample size ≤ 30.
➢ Population standard deviation must be unknown.
➢ Population distribution should be normal or approximately normal.
➢ Random sample.
➢ Quantitative data.
Degrees of freedom- there is a different t distribution for each of the
possible degrees of freedom. We use the degrees of freedom when we
select a t distribution to estimate a population mean, and we use
n-1 degrees of freedom, where n is the sample size.
For example, if we use a sample of 22 to estimate a population mean,
we will use (n-1) = 21 degrees of freedom in order to select the
appropriate t distribution.
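t-test: calculation of the standard error of the difference between
means
Small samples and uncorrelated data
1st step- calculate the combined variance ($SD^2$)-
$$SD^2 = \frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{N_1 + N_2 - 2}$$
Writing the deviations from the group means as $x_1 = X_1 - \bar{X}_1$ and
$x_2 = X_2 - \bar{X}_2$, so that $(X_1 - \bar{X}_1)^2 = x_1^2$ and
$(X_2 - \bar{X}_2)^2 = x_2^2$, this becomes
$$SD^2 = \frac{\sum x_1^2 + \sum x_2^2}{N_1 + N_2 - 2}, \qquad SD = \sqrt{\frac{\sum x_1^2 + \sum x_2^2}{N_1 + N_2 - 2}}$$
2nd step- calculate the standard error of the difference, using the combined SD-
$$SED = \sqrt{\frac{SD^2}{N_1} + \frac{SD^2}{N_2}} = SD \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}$$
3rd step- calculate t –
$$t = \frac{\text{observed difference}}{SED} = \frac{\bar{X}_1 - \bar{X}_2}{SED}$$
4th step- D.F. = $N_1 + N_2 - 2$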
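The four steps can be followed in code as a rough sketch. The function name and the two small groups of measurements are hypothetical; the resulting t would then be compared with the t table at the computed degrees of freedom.

```python
import math
from statistics import fmean

def pooled_t(group1, group2):
    """Unpaired t statistic using the combined (pooled) variance of two small samples."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = fmean(group1), fmean(group2)
    ss1 = sum((x - m1) ** 2 for x in group1)    # sum of squared deviations, group 1
    ss2 = sum((x - m2) ** 2 for x in group2)    # sum of squared deviations, group 2
    df = n1 + n2 - 2                            # step 4: degrees of freedom
    sd = math.sqrt((ss1 + ss2) / df)            # step 1: combined SD
    sed = sd * math.sqrt(1 / n1 + 1 / n2)       # step 2: standard error of difference
    t = (m1 - m2) / sed                         # step 3: t
    return t, df

# Hypothetical measurements for two small independent groups.
t, df = pooled_t([5, 7, 9], [1, 3, 5])
```

For these figures the means are 7 and 3, the combined SD is 2, and t works out to about 2.45 with 4 degrees of freedom.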
Using t – distribution table
Comparison between t and z tables
The table of t distribution values differs in construction from the z
tables. The t table is more compact and shows areas and t values for
only a few percentages (10, 5, 2, and 1 percent). Because there is a
different t distribution for each number of degrees of freedom, a more
complete table would be quite lengthy, although we can conceive of
the need for one.
A second difference in the t table is that it does not focus on the
chance that the population parameter being estimated will fall within
our confidence interval. Instead, it measures the chance that the
population parameter we are estimating will not be within our
confidence interval (that is, that it will lie outside it).
If we are making an estimate at the 90% confidence level, we would
look in the t table under the 0.10 column
(100 percent − 90 percent = 10 percent).
This 0.10 chance of error is symbolised by α (the Greek letter
alpha). We would find the appropriate t values for confidence
intervals of 95%, 98%, and 99% under the columns headed 0.05, 0.02,
and 0.01 respectively.
A third difference in using the t table is that we must specify the
degrees of freedom with which we are dealing. Suppose we make an
estimate at the 90% confidence level with a sample size of 14, which
is 13 degrees of freedom. Look under the 0.10 column until you
encounter the row labelled 13. Like a z value, the t value there of 1.771
shows that if we mark off plus and minus $1.771\,\hat{\sigma}_{\bar{x}}$
(where $\hat{\sigma}_{\bar{x}}$ is the estimated standard error of $\bar{x}$) on either side of the mean,
the area under the curve between these two limits will be 90%, and
the area outside these limits (the chance of error) will be 10 percent.
Remember that in any estimation in which the sample size is 30 or
less, the standard deviation of the population is unknown, and the
underlying population can be assumed to be normal or approximately
normal, we use the t distribution.
Determining the sample size (n) in estimation: - in all the above examples
the sample size was known. Now we are trying to estimate the sample
size n. If it is too small, we may fail to achieve the objective; if it is too
large, we will be wasting resources. Let us examine a method that is
useful in determining what sample size is necessary for any specified
level of precision.
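Comparison of two ways of expressing the same confidence limits
     Lower confidence limit          Upper confidence limit
A.   $\bar{x} - 500$                 $\bar{x} + 500$
B.   $\bar{x} - z\,\sigma_{\bar{x}}$     $\bar{x} + z\,\sigma_{\bar{x}}$
C.   $\bar{x} - t\,\sigma_{\bar{x}}$     $\bar{x} + t\,\sigma_{\bar{x}}$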
Example: - The Department of TST, NIUM Bangalore, wants to conduct a
survey of the annual earnings of its M.D. graduates. Calculate the
appropriate sample size for this study in order to estimate the mean annual
earnings of last year's class within 500 at the 95% level of confidence.
Solution: - The problem allows a variation of 500 on either side
of the population mean. That means $z\,\sigma_{\bar{x}} = 500$.
At the 95% level of confidence we know from the z table that z = 1.96.
Therefore, $1.96\,\sigma_{\bar{x}} = 500$, and that means $\sigma_{\bar{x}} = 500/1.96 = 255$.
Now if the standard error of the mean is 255, that leads us to
$\sigma_{\bar{x}} = \sigma/\sqrt{n} = 255$. Taking $\sigma = 1500$, we can find n:
$$\frac{1500}{\sqrt{n}} = 255, \quad \text{therefore} \quad n = \left(\frac{1500}{255}\right)^2 = 34.6$$
It means n should be greater than 34.6, i.e. at least 35, if NIUM wants to
conduct the survey with the desired precision.
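The same calculation can be written as a small function, sketched here under the assumption that the population standard deviation is known. The function and parameter names are hypothetical, and `statistics.NormalDist` replaces the z table lookup.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    return math.ceil((z * sigma / margin) ** 2)

n_required = sample_size_for_mean(sigma=1500, margin=500)
```

With σ = 1500 and a margin of 500 this gives 35, in agreement with the worked solution. Halving the margin roughly quadruples the required sample size.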
Chi-square ($\chi^2$) Test
The chi-square ($\chi^2$) test enables us to test whether more than two
populations can be considered equal. (The t-test and z-test are applicable
to only one or two samples.)
The chi-square ($\chi^2$) test allows us to do a lot more than just test for the
equality of several populations. Suppose we classify a population into
several categories with respect to two attributes (such as age and job
performance); we can then use a chi-square ($\chi^2$) test to determine
whether the two attributes are independent of each other.
Characteristics of the chi-square ($\chi^2$) test: -
➢ The chi-square ($\chi^2$) test is based on frequencies and not on
parameters.
➢ It is a non-parametric test, in which no rigid assumptions about
the population or populations are required.
➢ The additive property also holds for the chi-square ($\chi^2$) test.
➢ The chi-square ($\chi^2$) test is useful for testing hypotheses about the
independence of attributes.
➢ The chi-square ($\chi^2$) test can be used with complex contingency tables.
➢ The chi-square ($\chi^2$) test has very wide use.
Degrees of freedom: - The number of degrees of freedom for n
observations is n-k and is usually denoted by $\nu$, where k is the number
of independent linear constraints imposed upon them.
Suppose someone tells me to write any four numbers; then all four
numbers are of my choice. But if a restriction is imposed on
the choice, that the sum of these numbers should be 50, then the
freedom of choice is reduced to three numbers only, and so the degrees
of freedom would now be 3. If a chi-square ($\chi^2$) is defined as the sum
of the squares of n independent standardised normal variates and the
condition of the satisfaction of k linear relations is imposed upon them
(such as estimation of some population parametric value etc.), then
the effect of these k constraints of
  • 1. ATOM BOMB PATHOLOGY National Institute Of Unani Medicine DR. Md.Khurshid Alam Biostatistic and Research Methodology
  • 2. DR. Md.Khurshid Alam 1 SL.NO CONTENTS PAGE SL.NO CONTENTS PAGE 1 BIO- STATISTICS: Introduction 2 23 Chi-square (X2 ) Test 76 2 Scale 5 24 ANOVA 80 3 Data, Questionnaire 7 25 Mann–Whitney U test 82 4 Measurement of central tendencies 11 26 Kruskal-Wallis Test 85 5 Weighted average, Measure of position 20 27 Moods median test 89 6 Measures of dispersion 22 28 Correlation 90 7 Standard deviation 24 29 Regression analysis 93 8 Graphic presentation of frequency distribution 27 30 Minitab 95 9 Probability 32 31 SPSS 96 10 Estimation 38 32 Book, Journal, Compendium 99 11 Research Methodology 39 33 Research Report: protocols and Report Format 100 12 Research problem 43 34 Trend and possibilities of research in Unani 103 13 Concept of variable 46 35 ICMR STATEMENT 105 14 Research design 47 36 WHO-Researchers’ responsibilities 108 15 Types of control 52 37 WMA DECLARATION OF HELSINKI 110 16 Blinding 53 17 Clinical research 54 18 Inductive and deductive approaches 58 19 Sampling 59 20 Hypothesis 64 21 Z-test 67 22 T-test 72
  • 3. DR. Md.Khurshid Alam 2 BIO- STATISTICS: Introduction ❖ Word STATISTICS is derived from Latin word STATUS means STATE – a political state. ❖ BIO- STATISTICS is application of statistical tools and method in biology and Medicine. ❖ John Graunt – is father of health statistics. In 1662 he published “Natural and political observation made upon Bills of mortality” Definition ❖ Statistics is science of data that will enable us to become proficient data producer and efficient data users. ❖ In plural form, it stands for numerical facts (facts expressed in numbers) pertaining to a collection of objects. ❖ In singular form, it stands for the science of collection, organization, analysis and interpretation of numerical facts. Prof. Horace Secrist defines ❖ By statistics we mean, aggregate of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systemic manner for a pre-determined purpose and place in relation to each other. Branch of statistics ❖ Two branches – 1) Statistical methods 2) Applied statistics Main branches of applied statistics are Biometry, Demography, Econometrics, statistical quality control, Psychometry etc.
  • 4. DR. Md.Khurshid Alam 3 Scope and applications of statistics Statistics is considered to be a distinct branch of study applicable to investigation in many branches of science. Statistical methods are applied to specific problems in biology, medicine, agriculture, commerce, business, economics, industry, sociology etc. Function of statistics ❖ It simplifies complexity of the data. ❖ It reduces the bulk of the data. ❖ It adds precision to thinking. ❖ It helps in comparing different sets of figures. ❖ It guides in formulation of policies and helps in planning. ❖ It indicates trends and tendencies. ❖ It helps in studying relationship between different factors Limitations of statistics ❖ Statistics does not deal with qualitative data. ❖ Statistics does not deal with individual fact. ❖ Statistical inferences (conclusions) are not exact. ❖ Statistics can be misused. ❖Common men cannot handle statistics properly. Basic Notions ❖ Units or Individuals – the object whose characteristics are studied. ❖ Population or Universe – the totality (collection) of units under consideration. ❖ Finite population – population contains finite number of units.
  • 5. DR. Md.Khurshid Alam 4 ❖ Infinite population – Population contains infinite number of units. E.g.- heights of plants. ❖ Census -If each and every units are studied; this type of study is called complete enumeration or census. ❖ Quantitative characteristic: - A characteristic which is numerically measurable. ❖ Qualitative characteristic: - A characteristic which is not numerically measurable. ❖ Variable – A quantitative characteristics which varies from unit to unit. E.g.- height ❖ Attribute – A qualitative characteristic which varies from unit in unit. E.g.- sex ❖ Discrete variable – Some specified value in a given range. E.g. – number of children per family. ❖ Continuous variable – A variable which assume all the value in the range. E.g.- Hight of persons. ❖ Statistical survey or investigation – Study of variable which show statistical (stochastic non-mathematical) variation. ❖ Investigator – who conduct statistical survey. ❖ Informants Respondents – A persons who supply information. ❖ Enumerators – An agent who collect and handover information to the investigator. ❖ Sample – It is representative portion of the population ❖ Census enumeration – A survey in which the whole population is made use.
  • 6. DR. Md.Khurshid Alam 5 Collection and classification of data A statistician is concerned with the study of variables which show statistical (stochastic non mathematical) variation. Such a study is called statistical investigation (statistical survey). Investigator: -The person who conduct the statistical survey is called investigator. The investigator plans the survey collect the required data analyses them and finally draw conclusion. Stages of statistical investigation Mainly two stage- I. Planning and preparation. II. Executive of survey. Execution has four steps, namely 1)collection of data 2)scrutiny, editing and presentation of data. 3)Analysis of data. 4) Interpretation of analyzed data. Quality of data ❖“GIGO” Garbage in-garbage out. This means, researcher must ensure high quality of data at every step. Scale of measurement: - It is able to measure anything. Measurement of magnitude. Basically, it is two type- (1) Crude- it provides rough idea of magnitude.eg tall (2) precise- it provides exact value of magnitude.eg 2cm
  • 7. DR. Md.Khurshid Alam 6 • Nominal scale: - Also called classificatory scale. On the basis of common property/share property/character it divides data into sub groups. For example, if we divide 10people on their income in high income, average income, low income, there is no importance of order, either low income is written on top or bottom. • Ordinal scale: - It has all the property of nominal scale. It also divides the under-study parameter into sub groups, into order, Ascending or descending order. So, first divide the object on nominal scale and then arrange in ordinal scale. • Interval scale: -It has all the property of ordinal and nominal scale in addition it places the sub group (rank sub-group) with a definite interval. The space between starting and terminating point is called interval. This scale cannot used in mathematical calculations. • Ratio scale: -It has all the property of nominal, ordinal and interval scale in addition it is always start with zero. Measurement of this is subject of mathematical calculation. Every division is definite measurement. It is absolute scale. In this zero is fixed. It most precise scale.
  • 8. DR. Md.Khurshid Alam 7 Primary data are specially collected for a particular purpose. It is reliable complete and fresh. Method of collection of primary data: - 1. Direct personal interview – investigator personally comes in contact with the unit. 2. Indirect personal interview – the investigator does not contact the units directly but, he/she contacts person who are in close association with units. These persons (informants) supply information to the investigator. 3. Information through correspondents – the investigators appoints his agents called correspondent at different place. These correspondents collect required data in their area and hand over to the investigator. 4. Method of questionnaire (mail inquiry) – questionnaire is the list of question, answer for which are filled in by the informants and these answers are required information for the investigation.it is cheap, consumes less time and labour. 5. Method of schedule (collection through enumerators) – schedule is the list of items on which the enumerators have to collect and record information. It is filled by the enumerators. These data are reliable and accurate. But in this method, there is scope for bias. General principle in drafting questionnaire(schedule) 1. The number of questions should be as less as possible. 2. Question should be short and simple. 3. If a lengthy question is unavoidable, it should be divided into two or more parts. 4. Question should be such that answer to them are short.eg. Are you married? 5. As far as possible, question regarding personal matter should be avoided. 6. The question should be so framed that do not hurt the feeling of the informants. 7. Question should not be ambiguous.
  • 9. DR. Md.Khurshid Alam 8 8. Question should be logically arranged. 9. Any clarification, if necessary, regarding any of the question, should be provided. 10.Question should be so framed that validity of information supplied by informants can be cross checked. A covering letter introducing the investigator and indicating the purpose of survey should be attached to the questionnaire. It should supply necessary instruction to the informants regarding return SECONDRY DATA ❖ Primarily collected for some other purpose. ❖ It may not contain all required information. ❖ Sources of secondary data are— 1) published sources e.g. Gov. Reports 2) unpublished sources e.g. Records of Govt.office
  • 10. DR. Md.Khurshid Alam 9 Classification ❖ Units having common characteristics are grouped together. ❖ Each of these groups is called class. ❖ Simple or one-way classification – classification of units on the basis of a single characteristic. ❖ Mani-fold classification – simultaneous classification of units on the basis of two or more characteristics. ❖ Dichotomy- classification of units on the basis of a characteristic into two classes.eg married and unmarried. Function of classification ❖ Reduce the bulk of data. ❖ Simplifies the data. ❖Facilitates comparison of characteristics. ❖ Renders the data ready for statistical analysis. TABULATION is a systemic arrangement of classified data in row and columns of a table. CONTIGENCY Table – A table showing many-fold classified data. Types of classification- four types 1) Quantitative classification- classification with regard to variable. 2) Qualitative classification – classification with regard to attribute. 3) Spatial classification (Geographical classification). 4) Temporal classification (chronological classification) classification with regard to time.
  • 11. Frequency table
    ❖ A systematic presentation of the values taken by a variable and the corresponding frequencies is called the frequency distribution of that variable.
    ❖ A tabular presentation of a frequency distribution is called a frequency table.
    A frequency distribution in which class intervals are considered is a continuous (grouped) frequency distribution. If class intervals are not considered, it is a discrete (ungrouped) frequency distribution.
    Terms
    ❖ Class interval: the range between the upper and lower class limits. It is the width of the class.
    ❖ Lower class limit: the smallest value of the class.
    ❖ Upper class limit: the highest value of the class.
    ❖ Class mark or class mid-value: the central (middlemost) value of a class interval.
    ❖ Inclusive class interval: a class interval in which the lower as well as the upper class limit is included in the same class interval. The inclusive type of class interval is usually adopted when the variable is discrete.
    ❖ Exclusive class interval: a class interval in which the lower class limit is included in the same class interval, whereas the upper class limit is included in the succeeding class interval.
    ❖ Open-end class interval: sometimes in a frequency distribution the class intervals at the extremities may not have one of the limits. Such a class interval is called an open-end class interval, e.g. "more than 100".
  • 12. ❖ Frequency: the number of observations in any class.
    Bivariate and multivariate frequency distribution
    The frequency distribution of a single variable is called a univariate frequency distribution. The frequency distribution of more than one variable is called a multivariate frequency distribution. A bivariate frequency distribution is one on two variables.
    Measurement of central tendencies
    Central tendency
    ❖ Generally, in a frequency distribution, the values cluster around a central value.
    ❖ This property of concentration of the values around a central value is called central tendency.
    ❖ The central value around which there is concentration is called a measure of central tendency (measure of location, average).
    Five important measures of central tendency:
    1. Arithmetic mean (A.M.) 2. Median 3. Mode 4. Geometric mean (G.M.) 5. Harmonic mean (H.M.)
    Desired qualities of an ideal measure of central tendency:
    1) It should be easy to understand.
    2) Its computation procedure should be simple.
    3) It should be rigidly defined.
  • 13. 4) It should be based on all the values.
    5) It should not be affected too much by abnormal extreme values.
    6) It should be capable of further algebraic treatment so that it can be used in further analysis of the data.
    7) It should be stable. That is, the sampling variation in the value of the measure should be the least.
    Arithmetic mean (mean)
    ❖ The arithmetic mean of a set of values is obtained by dividing the sum of the values by the number of values in the set.
    ❖ The arithmetic mean of the values $x_1, x_2, \dots, x_n$ is
      $\bar{X} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum x}{n}$
    ❖ If the observations $x_1, x_2, \dots, x_n$ have frequencies $f_1, f_2, \dots, f_n$, the arithmetic mean is
      $\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_n x_n}{f_1 + f_2 + \dots + f_n} = \frac{\sum f x}{N}$ (for a discrete frequency distribution),
      where $N = \sum f$ is the total frequency.
    For raw data, the arithmetic mean is $\bar{X} = \frac{\sum x}{n}$.
    For tabulated data (discrete or continuous), it is $\bar{X} = \frac{\sum f x}{N}$; for a continuous distribution, $x$ is the class mid-value.
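The two arithmetic-mean formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample figures are hypothetical and do not come from the slides.

```python
def mean_raw(values):
    """Arithmetic mean of raw data: sum(x) / n."""
    return sum(values) / len(values)


def mean_tabulated(xs, fs):
    """Arithmetic mean of a frequency table: sum(f*x) / N, where N = sum(f)."""
    total_frequency = sum(fs)
    return sum(f * x for x, f in zip(xs, fs)) / total_frequency


# Hypothetical data: values 1..4 occurring 2, 3, 4 and 1 times.
xs = [1, 2, 3, 4]
fs = [2, 3, 4, 1]
print(mean_raw([2, 4, 6, 8]))   # (2 + 4 + 6 + 8) / 4
print(mean_tabulated(xs, fs))   # (2 + 6 + 12 + 4) / 10
```

For a grouped (continuous) table, `xs` would hold the class mid-values.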
  • 14. Change of origin and scale
    ❖ Let $x_1, x_2, \dots, x_n$ be n values and let $a$ be a constant. Then $x_1 - a, x_2 - a, \dots, x_n - a$ are the values of $x_1, x_2, \dots, x_n$ with the origin shifted to $a$.
    If $c$ is a positive constant, $\frac{x_1 - a}{c}, \frac{x_2 - a}{c}, \dots, \frac{x_n - a}{c}$ are the values $x_1, x_2, \dots, x_n$ with the origin shifted to $a$ and the scale changed by $c$.
    Thus $u = \frac{x - a}{c}$, and therefore $x - a = cu$, that is, $x = a + cu$.
    And so $\bar{X} = a + c\bar{u} = a + c\,\frac{\sum f u}{N}$.
    However, if $c = 1$, $\bar{X} = a + \bar{u} = a + \frac{\sum u}{n}$.
    Properties of the arithmetic mean
    1) The algebraic sum of the deviations of a set of values from their arithmetic mean is zero. That is, $\sum (x - \bar{x}) = 0$.
    2) The sum of the squared deviations of a set of values is a minimum when the deviations are taken around the arithmetic mean.
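A small Python sketch may help to see that the change of origin and scale gives the same mean as the direct formula, and that the deviations from the mean add up to zero (property 1). The data, the function names and the choices of `a` and `c` below are hypothetical.

```python
def mean_direct(xs, fs):
    """Direct formula: sum(f*x) / N."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


def mean_step_deviation(xs, fs, a, c):
    """Mean via u = (x - a) / c, then X-bar = a + c * u-bar."""
    us = [(x - a) / c for x in xs]
    u_bar = sum(f * u for u, f in zip(us, fs)) / sum(fs)
    return a + c * u_bar


# Hypothetical class mid-values and frequencies.
xs = [10, 20, 30, 40, 50]
fs = [2, 3, 5, 3, 2]

x_bar = mean_direct(xs, fs)
print(x_bar, mean_step_deviation(xs, fs, a=30, c=10))

# Property 1: the (frequency-weighted) deviations from the mean sum to zero.
print(sum(f * (x - x_bar) for x, f in zip(xs, fs)))
```

Any origin `a` and any positive scale `c` give the same mean; they are chosen only to make hand calculation easier.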
  • 15. Let $\bar{x}_1$ be the arithmetic mean of a set of $n_1$ values, and let $\bar{x}_2$ be the arithmetic mean of another set of $n_2$ values. Then the arithmetic mean of the two sets of values put together is
      $\bar{X} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$ (combined arithmetic mean)
    Merits of the arithmetic mean
    ❖ It is rigidly defined.
    ❖ It can be easily computed.
    ❖ The logic behind its computation can be easily understood.
    ❖ It can be easily adopted for further statistical analysis.
    ❖ It is based on all the values.
    ❖ It is more stable than any other average.
    ❖ It can be calculated even when some of the values are equal to zero or negative.
    Demerits of the arithmetic mean
    ❖ It is highly affected by abnormal extreme values.
    ❖ Since it is based on all the values, it cannot be calculated if even one of the values is missing.
    ❖ Sometimes the arithmetic mean may be a value which is not assumed by the variable.
  • 16. Median
    ❖ The median of a set of values is the middlemost value when they are arranged in ascending order of magnitude (such an arrangement is called an array).
    ❖ It is the value that is greater than half of the values and less than the remaining half.
    ❖ The median is denoted by M.
    ❖ In the case of raw data, and also of a discrete frequency distribution, the median is
      $M = \left(\frac{n + 1}{2}\right)$th value in the arrayed series.
    In the case of a continuous frequency distribution, the median is
      $M = l + \frac{\left(\frac{N}{2} - m\right) \times c}{f}$
    where l: lower limit of the median class; c: width of the median class; f: frequency of the median class; m: less-than cumulative frequency up to l; N: total frequency.
    Merits of the median
    ❖ The logic behind its computation is easily understood.
    ❖ It can be easily computed.
    ❖ It can be computed even if some extreme values are missing.
    ❖ It is not affected by abnormal extreme values.
    ❖ It can also be used for the study of qualitative data.
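The two median formulas above can be sketched in Python as follows. The data are hypothetical, the function names are illustrative, and two conventions are assumed that the slide does not state: for an even number of raw values the two middle values are averaged, and the grouped table uses exclusive classes of equal width.

```python
def raw_median(values):
    """Median of raw data: the ((n + 1) / 2)th value of the arrayed series."""
    arr = sorted(values)
    n = len(arr)
    mid = n // 2
    # Assumption: when n is even the position is fractional, so the two
    # middle values are averaged.
    return arr[mid] if n % 2 else (arr[mid - 1] + arr[mid]) / 2


def grouped_median(lower_limits, width, freqs):
    """Median of a grouped table: M = l + (N/2 - m) * c / f."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("empty frequency table")
    half = total / 2
    cumulative = 0  # less-than cumulative frequency up to l (the m of the formula)
    for l, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= half:
            return l + (half - cumulative) * width / f
        cumulative += f
    raise ValueError("median class not found")


# Hypothetical classes 0-10, 10-20, ..., 40-50 with these frequencies.
print(raw_median([7, 3, 5]))
print(grouped_median([0, 10, 20, 30, 40], 10, [5, 8, 12, 10, 5]))
```

Here N = 40, so the median class is 20-30 (m = 13, f = 12) and M = 20 + (20 - 13) × 10 / 12.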
  • 17. Demerits of the median
    ❖ It is not based on all the values.
    ❖ It cannot be used in deeper statistical analysis.
    Mode
    ❖ The mode is the value which has the highest frequency.
    ❖ It is the most frequently occurring value.
    ❖ It is denoted by Z.
    ❖ In the case of raw data, and also of a discrete frequency distribution, the mode is the value with the highest frequency.
    ❖ In the case of a continuous frequency distribution, the mode is
      $Z = l + \frac{(f - f_1) \times c}{2f - f_1 - f_2}$
    where l: lower limit of the modal class; f: frequency of the modal class; c: width of the modal class; $f_1$: frequency of the class preceding the modal class; $f_2$: frequency of the class succeeding the modal class.
    ❖ The modal class is the class which contains the mode.
    ❖ Generally, the modal class will be the class with the highest frequency. But sometimes it may be a class other than the class with the highest frequency.
  • 18. ❖ In such a situation, the mode is obtained by using the formula
      $Z = l + \frac{c f_2}{f_1 + f_2}$
    ❖ Unimodal – most frequency distributions have only one value with the highest frequency. Such a frequency distribution is unimodal; it has only one mode.
    ❖ Multimodal – if there is more than one value with the highest frequency in a frequency distribution, it will have more than one mode.
    ❖ Bimodal – if there are two modes.
    ❖ For a distribution which has more than one mode, the mode is said to be ill defined.
    Merits and demerits of the mode
    ❖ The merits and demerits of the mode are the same as the merits and demerits of the median.
    ❖ One additional demerit is that, for some frequency distributions, the mode is ill defined.
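The main grouped-mode formula, Z = l + (f − f1)c / (2f − f1 − f2), can be sketched in Python as below. This sketch assumes equal class widths and takes the modal class to be the first class with the highest frequency; the data and the function name are hypothetical.

```python
def grouped_mode(lower_limits, width, freqs):
    """Mode of a grouped table: Z = l + (f - f1) * c / (2f - f1 - f2)."""
    # Assumption: the modal class is the (first) class with the highest frequency.
    i = max(range(len(freqs)), key=lambda k: freqs[k])
    f = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0               # class preceding the modal class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class succeeding the modal class
    denominator = 2 * f - f1 - f2
    if denominator == 0:
        raise ValueError("mode is ill defined for this table")
    return lower_limits[i] + (f - f1) * width / denominator


# Hypothetical classes 0-10, 10-20, ..., 40-50.
print(grouped_mode([0, 10, 20, 30, 40], 10, [5, 8, 12, 10, 5]))
```

With these figures the modal class is 20-30, so Z = 20 + (12 − 8) × 10 / (24 − 8 − 10).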
  • 19. Geometric mean (G.M.): the geometric mean of n values is the nth root of the product of the values. It is denoted by G. The geometric mean of n values $x_1, x_2, x_3, \dots, x_n$ is
      $G = \sqrt[n]{x_1 \times x_2 \times \dots \times x_n}$
    If logarithms are used,
      $G = \operatorname{antilog}\left[\frac{\sum \log x}{n}\right]$ for raw data, and
      $G = \operatorname{antilog}\left[\frac{\sum f \log x}{N}\right]$ for tabulated data.
    The geometric mean is the appropriate measure for averaging rates of growth. This is the reason why the geometric mean index number is considered the best. When any of the values is equal to zero, the geometric mean is not defined. It is also not defined when some of the values are negative. In practice it is used only when all the values are positive, which is also what the logarithmic formulas require.
    Harmonic mean (H.M.): the harmonic mean of n values is the reciprocal of the arithmetic mean of the reciprocals of the given values. It is denoted by H. Thus, the harmonic mean of the values $x_1, x_2, x_3, \dots, x_n$ is
      $H = \frac{n}{\sum \left(\frac{1}{x}\right)}$
    In the case of tabulated data, the H.M. is
      $H = \frac{N}{\sum \left(\frac{f}{x}\right)}$
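The geometric and harmonic means for raw data can be sketched in Python as follows. Natural logarithms are used here instead of base-10 log tables; the result is the same because the antilog is taken in the same base. The function names and example figures are hypothetical.

```python
import math


def geometric_mean(values):
    """G = antilog(sum(log x) / n); requires all values to be positive."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    """H = n / sum(1 / x)."""
    return len(values) / sum(1 / v for v in values)


# Hypothetical growth factors (a doubling, then an eightfold increase).
print(geometric_mean([2, 8]))

# Hypothetical speeds over two equal distances, 40 and 60 km/h.
print(harmonic_mean([40, 60]))
```

The first call averages growth factors multiplicatively; the second gives the average speed for the whole journey, which is lower than the arithmetic mean of 50.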
  • 20. Uses of different averages: the situations in which each average is appropriate.
    Arithmetic mean: 1. The average is required for deeper statistical analysis. 2. The variable is continuous. 3. The variable is additive in nature.
    Median: 1. The variable is discrete. 2. Some of the extreme values are missing. 3. There are abnormal extreme values. 4. The mode is ill defined. 5. The characteristic under study is qualitative.
    Mode: 1. The modal value has a very high frequency compared to the other frequencies. 2. Some of the extreme values are missing. 3. The variable is discrete. 4. There are abnormal extreme values. 5. The characteristic under study is qualitative.
    Geometric mean: the variable is multiplicative in nature; average rates and ratios have to be found.
    Harmonic mean: the reciprocal of the variable is additive in nature.
    Empirical relation between mean, median and mode: in a slightly skewed distribution, the mean, median and mode show a rough relation among themselves. It is
      Mean – Mode = 3 (Mean – Median)
      Mode = 3 Median – 2 Mean
    That is, $Z = 3M - 2\bar{X}$.
  • 21. Weighted average: sometimes some of the items in the data are more important than the others. For example, in the postgraduate course in Unani pathology, the marks in ilmul alamat and ilmul asbab carry greater importance than the marks in histology, cytology, etc. In such a situation, an average with varying weights assigned to the values is necessary. Such an average is called a weighted average. The weighted averages in common use are the weighted arithmetic mean and the weighted geometric mean.
    Let the values $x_1, x_2, x_3, \dots, x_n$ be assigned the weights $w_1, w_2, w_3, \dots, w_n$ respectively. Then the weighted arithmetic mean is
      $\bar{X}_W = \frac{\sum W X}{\sum W}$
    and the weighted geometric mean is
      $G_W = \operatorname{antilog}\left[\frac{\sum W \log X}{\sum W}\right]$
    Measures of position (partition values): the values which divide the frequency distribution in definite ratios are called partition values, e.g. median, quartiles, deciles, percentiles.
    Quartiles: the quartiles divide the distribution into four quarters. For a frequency distribution there are three quartiles.
    Q1 – the first quartile, also called the lower quartile. It is the value which is greater than one quarter of the observations and less than the remaining three quarters.
    Q2 – it divides the values into two equal halves. It is the same as the median.
    Q3 – the third quartile or upper quartile. It is the value which is greater than three quarters of the observations and less than the remaining one quarter.
  • 22. For raw data, and for a discrete distribution, the rth quartile is
      $Q_r = \left\{\frac{r(n + 1)}{4}\right\}$th value in the arrayed series.
    For a continuous frequency distribution, the rth quartile is
      $Q_r = l + \frac{\left(\frac{rN}{4} - m\right) \times c}{f}$
    Here l, c, f and m have the same meaning as in the median formula, but refer to the class containing the required partition value.
    Deciles: there are nine deciles for a frequency distribution, denoted by D1, D2, D3, …, D9. They divide the frequency distribution into ten equal parts. For a continuous distribution,
      $D_r = l + \frac{\left(\frac{rN}{10} - m\right) \times c}{f}$
    Percentiles: the percentiles divide the frequency distribution into a hundred equal parts. One percent of the values lie between two consecutive percentiles. There are ninety-nine percentiles for a frequency distribution, denoted by P1 to P99. For a continuous distribution,
      $P_r = l + \frac{\left(\frac{rN}{100} - m\right) \times c}{f}$
    Median: the median divides the frequency distribution into two halves.
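Since the quartile, decile and percentile formulas for a continuous distribution differ only in the divisor (4, 10 or 100), a single Python function can illustrate all of them. This is a sketch under the assumption of equal-width exclusive classes; the data, the function name and the parameter `k` are hypothetical.

```python
def partition_value(lower_limits, width, freqs, r, k):
    """rth of the k-fold partition values: l + (r*N/k - m) * c / f.

    k = 4 gives quartiles, k = 10 deciles, k = 100 percentiles.
    """
    total = sum(freqs)
    target = r * total / k
    cumulative = 0  # less-than cumulative frequency up to l (the m of the formula)
    for l, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= target:
            return l + (target - cumulative) * width / f
        cumulative += f
    raise ValueError("r must lie between 0 and k for a non-empty table")


# Hypothetical classes 0-10, 10-20, ..., 40-50.
lower = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]
print(partition_value(lower, 10, freqs, r=1, k=4))    # Q1
print(partition_value(lower, 10, freqs, r=3, k=4))    # Q3
print(partition_value(lower, 10, freqs, r=5, k=10))   # D5, same as the median
```

Q2, D5 and P50 all return the same value, which is the median of the table.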
  • 23. Measures of dispersion (range, quartile deviation, mean deviation and standard deviation)
    Dispersion (variation)
    In a frequency distribution, though the values cluster around an average, most of them differ from it. In some distributions the differences may be small, whereas in others they may be large. This property of deviation of the values from the average is called variation or dispersion.
    Measures of dispersion
    1) Range
    2) Quartile deviation (semi-interquartile range, Q.D.)
    3) Mean deviation (M.D.)
    4) Standard deviation (S.D.)
    Among these four measures, the standard deviation is the most commonly used.
    Essentials of a good measure of variation
    o It should be easy to understand.
    o Its computation procedure should be simple.
    o It should be rigidly defined.
    o It should be based on all the values.
    o It should not be affected too much by abnormal extreme values.
    o It should be capable of further algebraic treatment.
    o It should be stable.
  • 24. Range
    o The range is the difference between the highest and lowest values in the data.
    o If H is the highest value and L is the lowest value in the data, the range of variation is R = H – L.
    Coefficient of range
    A relative measure of variation which is used for comparison of frequency distributions is the coefficient of range. It is
      Coefft. of R = $\frac{H - L}{H + L}$
    The range is easy to compute and very simple to understand.
    Demerits of range
    ➢ Since the range is based only on the extreme values, it shows too much fluctuation.
    ➢ It is highly affected by abnormal extreme values.
    ➢ If the data have abnormal extreme values, the range should not be adopted for study.
    Quartile deviation (semi-interquartile range)
    ➢ The quartile deviation is obtained by dividing the range between the lower and upper quartiles by 2.
    ➢ If Q1 and Q3 are the lower and upper quartiles, the quartile deviation is
      Q.D. = $\frac{Q_3 - Q_1}{2}$
  • 25. Coefficient of Q.D.
    A relative measure of variation based on the quartiles is the coefficient of quartile deviation. It is
      Coefft. of Q.D. = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
    Features of Q.D.
    ➢ It is based only on the lower and upper quartiles.
    ➢ It can be easily computed.
    ➢ It is not affected much by extreme values.
    ➢ It is not based on all the values.
    ➢ It is not convenient for mathematical treatment.
    Standard deviation
    The standard deviation of a set of values is the positive square root of the mean of the squared deviations of the values from their arithmetic mean. It is denoted by sigma ($\sigma$).
    The range is based only on the lowest and highest values, and the quartile deviation only on the quartiles. Neither measure is based on all the values, and so we consider the standard deviation, which is based on all the values.
    The standard deviation of the values $x_1, x_2, x_3, \dots, x_n$ is
      $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
    In the case of tabulated data (both continuous and discrete),
      $\sigma = \sqrt{\frac{\sum f (x - \bar{x})^2}{N}}$
  • 26. Variance: the square of the standard deviation is called the variance. The variance of $x_1, x_2, x_3, \dots, x_n$ is
      $\operatorname{Var}(x) = \sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$
    It is the mean of the squared deviations of the values from their arithmetic mean.
    Computation of the standard deviation for raw data:
      $\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$
    Computation of the standard deviation for tabulated data:
      $\sigma = \sqrt{\frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2}$
    If the origin of the values is shifted to a and the scale is changed by c, that is, if $u = \frac{x - a}{c}$, then it can be shown that
      S.D.(x) = c × S.D.(u), and also Var(x) = c² × Var(u).
    Properties of the standard deviation:
    The S.D. is independent of the origin of measurement, but not of the scale.
    The S.D. is the least of all root-mean-square deviations.
    The combined standard deviation of two sets of $n_1$ and $n_2$ values is
      $\sigma = \sqrt{\frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}}$
    where $d_1 = \bar{x}_1 - \bar{x}$, $d_2 = \bar{x}_2 - \bar{x}$ and $\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$.
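The definition of the standard deviation, the computational shortcut and the combined formula can be checked against each other with a short Python sketch. The figures and function names below are hypothetical; the divisor is n (not n − 1), as in the formulas above.

```python
import math


def std_dev(values):
    """Definition: sqrt(sum((x - mean)^2) / n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def std_dev_shortcut(values):
    """Computational form: sqrt(sum(x^2)/n - (sum(x)/n)^2)."""
    n = len(values)
    return math.sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)


def combined_std_dev(n1, mean1, sd1, n2, mean2, sd2):
    """Combined S.D. of two sets from their sizes, means and S.D.s."""
    mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1, d2 = mean1 - mean, mean2 - mean
    return math.sqrt((n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / (n1 + n2))


# Hypothetical data, split into two sets to test the combined formula.
set_a, set_b = [2, 4, 4, 4], [5, 5, 7, 9]
both = set_a + set_b
print(std_dev(both), std_dev_shortcut(both))
print(combined_std_dev(len(set_a), sum(set_a) / 4, std_dev(set_a),
                       len(set_b), sum(set_b) / 4, std_dev(set_b)))

# Change of origin and scale: S.D.(x) = c * S.D.(u) with u = (x - a) / c.
a, c = 5, 2
print(c * std_dev([(x - a) / c for x in both]))
```

All four printed values agree (up to rounding), which illustrates that the S.D. is unaffected by the origin `a` but scales with `c`.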
  • 27. Coefficient of variation: the coefficient of variation is a relative measure of variation. It is used for comparing the variation in frequency distributions. It is the standard deviation expressed as a percentage of the mean. Thus, the coefficient of variation is
      C.V. = $\frac{\text{standard deviation}}{\text{arithmetic mean}} \times 100 = \frac{\sigma}{\bar{x}} \times 100$
    ➢ A high value of the coefficient of variation indicates a high degree of variation, and
    ➢ a low value indicates a low degree of variation.
    ➢ The coefficient of variation is independent of the unit of measurement of the values.
    Mean deviation
    The mean deviation is the mean of the absolute deviations of the values from a central value. Thus, the mean deviation of the set of values $x_1, x_2, x_3, \dots, x_n$ from their arithmetic mean is
      M.D.($\bar{X}$) = $\frac{\sum |x - \bar{x}|}{n}$
    In the case of tabulated data, the M.D. from the A.M. is
      M.D.($\bar{X}$) = $\frac{\sum f |x - \bar{x}|}{N}$
    The mean deviation of the values from the median M is
      M.D.(M) = $\frac{\sum |x - M|}{n}$
    In the case of tabulated data, the M.D. from the median M is
      M.D.(M) = $\frac{\sum f |x - M|}{N}$
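A brief Python sketch of the coefficient of variation and of the mean deviation about the mean and about the median, using the standard-library `statistics` module. The data and function names are hypothetical, and `pstdev` is used because the slides divide by n.

```python
from statistics import mean, median, pstdev


def coefficient_of_variation(values):
    """C.V. = (standard deviation / arithmetic mean) * 100."""
    return pstdev(values) / mean(values) * 100


def mean_deviation(values, center):
    """Mean of the absolute deviations of the values from `center`."""
    return sum(abs(x - center) for x in values) / len(values)


# Hypothetical data sets.
print(coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9]))

skewed = [1, 2, 3, 4, 100]  # one abnormal extreme value
print(mean_deviation(skewed, mean(skewed)))    # M.D. about the arithmetic mean
print(mean_deviation(skewed, median(skewed)))  # M.D. about the median
```

For the skewed set the mean deviation about the median is the smaller of the two, which is always the case (or they are equal).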
  • 28. ❖ Here, $|x_1 - \bar{x}|, |x_2 - \bar{x}|, |x_3 - \bar{x}|, \dots, |x_n - \bar{x}|$ are the deviations with the signs ignored. The signs are ignored because, if the deviations are added algebraically, the sum reduces to zero (property 1 of the A.M.).
    ❖ The mean deviation may be calculated around any average: mean, median, mode, etc.
    Minimal property of the median – the mean deviation is least when it is measured from the median.
    The coefficient of mean deviation from the arithmetic mean is
      Coefft. of M.D.($\bar{X}$) = $\frac{\text{M.D.}(\bar{X})}{\bar{X}}$
    The coefficient of mean deviation from the median is
      Coefft. of M.D.(M) = $\frac{\text{M.D.}(M)}{M}$
    Graphic presentation of frequency distribution
    A) Tabulation (simple and frequency distribution table)
    B) Charts and diagrams
      a) Histogram b) Frequency polygon c) Frequency curve d) Ogive (cumulative frequency curve) e) Bar diagram f) Pie diagram g) Map diagram h) Pictogram
  • 29. Histogram – drawing procedure
    ❖ A histogram is the simplest form of graphical presentation.
    ❖ Histogram for equal class intervals: the horizontal axis, which need not start from zero, is divided by dots into equal parts, two or three more in number than the class intervals. Starting from the left, and leaving a space the size of one class interval at each end of the X-axis, each dot is labelled with the lower class limit of the successive classes. Sometimes the horizontal axis instead shows the mid-points of the successive class intervals.
    ❖ The vertical axis, which always begins with zero at the meeting point of the two axes, is scaled appropriately to measure the class frequencies.
    ❖ Rectangular bars are then constructed for the successive class intervals with their bases on the X-axis, such that the bases are equal in width and each height (on the Y-axis) equals the corresponding class frequency.
    ❖ The area of the bar drawn for each class interval is its class frequency f multiplied by the width of the class interval c.
    Histogram – example
    ❖ Weekly income of 80 salesmen, as constructed in the table:
      Income      Frequency f (no. of salesmen)
      50-59        6
      60-69        9
      70-79       15
      80-89       25
      90-99       13
      100-109      7
      110-119      5
  • 30. Histogram for unequal class intervals
    ❖ The procedure for drawing a histogram for unequal class intervals is slightly different.
    ❖ A minor adjustment is required in the spacing of the dots marked on the X-axis. For example, if one class interval has a width of 15 points and the rest a width of 5 points, the space on the X-axis for the class interval of 15 points should be three times as long as that for an interval of 5 points.
    ❖ For such class intervals the vertical axis measures the frequency density and not the original class frequency.
    ❖ The frequency density for a class interval wider than the others is the actual frequency of this class divided by the number of times its width exceeds the common width. For example, if the frequency of the class of 15 points width is 69, the frequency density of this class is 69 divided by 3, that is, 23.
    ❖ To draw a histogram for an open-ended distribution, follow the usual procedure after leaving out the open-ended classes.
    Frequency polygon
    ❖ Mark a dot at the mid-point of the top horizontal line of each bar of the histogram and then join these dots by straight lines.
    ❖ Close the polygon at each end by drawing straight lines from the mid-point of the top of the first and the last rectangle to the mid-point, on the horizontal axis, of the next outlying interval with zero frequency.
    Drawing a frequency polygon does not necessarily require constructing a histogram first.
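The frequency-density adjustment for unequal class intervals amounts to one division per class, as the following Python sketch shows. Only the wide class (width 15, frequency 69) comes from the example above; the other frequencies and the function name are hypothetical, and the narrowest width is assumed to be the common one.

```python
def frequency_densities(widths, freqs):
    """Height of each bar: frequency divided by (class width / common width)."""
    # Assumption: the narrowest class width is the common (base) width.
    base = min(widths)
    return [f / (w / base) for w, f in zip(widths, freqs)]


# Two classes of width 5 (hypothetical frequencies) and one of width 15.
print(frequency_densities([5, 5, 15], [10, 20, 69]))
```

The wide class is drawn three times as wide and 69 / 3 = 23 units high, so its area still represents its frequency.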
  • 31. OGIVE – cumulative frequency curve
    ❖ The cumulative frequency curve is popularly known as the ogive.
    ❖ The first step in drawing an ogive is to add another column of cumulative frequencies, denoted by fc. This may be done by finding the cumulative frequencies on either a "less than" or a "more than" basis.
    ❖ The "less than" cumulative frequency is obtained by adding successive class frequencies from top to bottom.
    ❖ The "more than" cumulative frequency is obtained by adding successive class frequencies from bottom to top.
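As an illustration, the two cumulative-frequency columns can be computed in Python for the weekly-income table of the 80 salesmen used in the histogram example. The variable names are illustrative only.

```python
from itertools import accumulate

# Class frequencies for incomes 50-59, 60-69, ..., 110-119 (histogram example).
freqs = [6, 9, 15, 25, 13, 7, 5]

# "Less than" type: add the frequencies from top to bottom.
less_than = list(accumulate(freqs))

# "More than" type: add the frequencies from bottom to top.
more_than = list(accumulate(reversed(freqs)))[::-1]

print(less_than)
print(more_than)
```

The last "less than" value and the first "more than" value both equal the total frequency, 80.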
  • 32. Ogive drawing procedure
    ❖ Once the cumulative frequencies are obtained, the procedure is as usual. The only difference is that the Y-axis must now be scaled so that it accommodates the total frequency. The X-axis is labelled with the upper class limits in the case of a "less than" ogive, and with the lower class limits in the case of a "more than" ogive.
    ❖ The advantage of ogives is that these curves are quickly and easily interpreted.
    Pie diagram
    The total angle of a circle (pie) is 360 degrees, and the total area of the circle represents 100%. Hence each percent takes up $\frac{360}{100}$ degrees, that is, 1% = 3.6 degrees.
    Hence the angle of the sector of the pie for a class is
      $\frac{\text{class frequency}}{\text{total observations}} \times 360$
    Example: according to NCAER, New Delhi, the forms of tobacco consumption, as estimated by weight, are bidis 55%, cigarettes 16% and others 29%. This is shown in the pie diagram.
    [Pie diagram: Tobacco consumption – bidis 55%, cigarettes 16%, others 29%]
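The sector angles for the tobacco example can be worked out with the formula above. The following Python sketch uses the three percentages quoted from NCAER; the function and variable names are illustrative.

```python
def pie_angles(frequencies):
    """Angle of each sector: class frequency / total * 360 degrees."""
    total = sum(frequencies.values())
    return {label: f * 360 / total for label, f in frequencies.items()}


tobacco = {"bidis": 55, "cigarettes": 16, "others": 29}
for label, angle in pie_angles(tobacco).items():
    print(f"{label}: {angle:.1f} degrees")
```

Because the shares already add up to 100, each angle is simply the percentage multiplied by 3.6.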
  • 33. Probability
    Probability is the chance that something will happen. In many instances we will have some knowledge about the possible outcomes of a decision. In research we are unable to forecast the future with complete certainty. The need to cope with uncertainty therefore leads us to the study and use of probability theory. Probability is part of our everyday lives.
    Probability is expressed as fractions $\left\{\frac{1}{5}, \frac{1}{6}, \frac{1}{15}\right\}$ or as decimals (0.454, 0.475, 0.5669) between zero and one (0 – 1).
    A probability of zero means that something will never happen. A probability of 1 (one) indicates that something will definitely happen. A probability is a number between 0 and 1; that is, it covers the region between impossibility and certainty. The value of a probability cannot be less than 0 or greater than 1.
    Event: in probability theory, an event is one or more of the possible outcomes of doing something. E.g. in a coin-toss experiment, getting a tail would be an event, and getting a head would be another event.
    Experiment: the activity that produces an event, e.g. a coin toss.
    Sample space: the set of all possible outcomes of an experiment is called the sample space. In the coin-toss experiment it is written as S = {head, tail}.
  • 34. Mutually exclusive events: two or more events that cannot occur at the same time; that is, one and only one of the events can take place at a time. E.g. in the coin-toss experiment either a head or a tail may turn up, but not both.
    Collectively exhaustive list of events: a list of events which includes every possible outcome.
    Dependent events: the probability of occurrence of an event depends on, or is affected in some way by, the occurrence of another event.
    Independent events: the occurrence of one event has no effect on the probability of occurrence of another event.
    Types (approaches) of probability: there are three basic approaches:
    1. Classical approach 2. Relative frequency approach 3. Subjective approach.
    Classical approach of probability: classical probability is also called a priori probability. It is
      Probability of an event = $\frac{\text{number of outcomes where the event occurs}}{\text{total number of possible outcomes}}$
    In the coin-toss experiment,
      P(Head) = $\frac{1 \text{ (head)}}{2 \text{ (head + tail)}}$ = 0.5 and P(Tail) = $\frac{1 \text{ (tail)}}{2 \text{ (head + tail)}}$ = 0.5
    Relative frequency approach of probability: this method uses the relative frequencies of past occurrences as probabilities. We determine how often something has happened in the past and use that figure to predict the probability that it will happen again in the future. Life insurance companies use this approach.
  • 35. Subjective approach: it is based on the belief of the person making the probability assessment. In 1926, Frank Ramsey introduced the concept of the subjective approach to probability in his essay "Truth and Probability", published in the book The Foundations of Mathematics and Other Logical Essays.
    Laws of probability
    i. Addition rule for mutually exclusive events: the probability of either event A or event B happening is written as
      P(A or B) = P(A) + P(B)
    ii. Addition rule for events that are NOT mutually exclusive: if A and B are not mutually exclusive events, then
      P(A or B) = P(A) + P(B) – P(AB)
      where P(AB) is the probability that events A and B both occur together at the same time.
    iii. Multiplication law of probability: this is applied when two or more events occur together but are independent of each other.
    Probability under the condition of statistical independence: when two events happen, the occurrence of one event has no effect on the probability of occurrence of the other event. There are three types of probability under statistical independence:
    1. Marginal probability
    2. Joint probability: P(AB) = P(A) × P(B)
    3. Conditional probability:
      A) Conditional probability under statistical independence: P(B/A) = P(B) or P(A/B) = P(A)
      B) Conditional probability under statistical dependence.
    Probability under statistical dependence is likewise of three types: conditional, joint and marginal.
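The addition and multiplication rules above can be verified by counting outcomes. The Python sketch below uses the roll of a fair six-sided die, which is a hypothetical example not taken from the slides, together with the classical definition of probability; exact fractions avoid rounding.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (hypothetical example)


def prob(event):
    """Classical probability: favourable outcomes / total possible outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


even, odd, above_four = {2, 4, 6}, {1, 3, 5}, {5, 6}

# i. Mutually exclusive events: P(A or B) = P(A) + P(B)
print(prob(even | odd), prob(even) + prob(odd))

# ii. Not mutually exclusive: P(A or B) = P(A) + P(B) - P(AB)
print(prob(even | above_four),
      prob(even) + prob(above_four) - prob(even & above_four))

# iii. Independent events: P(AB) = P(A) * P(B), and P(B/A) = P(AB) / P(A) = P(B)
print(prob(even & above_four), prob(even) * prob(above_four))
print(prob(even & above_four) / prob(even), prob(above_four))
```

Each line prints two equal fractions, one obtained by direct counting and one from the corresponding rule.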
  • 36. Probability distribution: a probability distribution is classified as either continuous or discrete.
    Continuous probability distribution: the variable under consideration is allowed to take any value within a given range, so we cannot list all the possible values, e.g. the height of children.
    Discrete probability distribution: the variable can take only a limited number of values in a given range, and these can be listed, e.g. the number of children in a family.
    Bernoulli (binomial) distribution: it was given by Jacob Bernoulli, a Swiss mathematician. It describes discrete, not continuous, data resulting from an experiment known as a Bernoulli process.
    Conditions of a Bernoulli process:
    1. Each experiment (trial) has a fixed number of possible outcomes, only two, e.g. head or tail in the coin-toss experiment; success or failure; yes or no.
    2. The probability of the outcome of any trial remains fixed over time. E.g. in a fair coin-toss experiment, the probability of a head remains 0.5 for each toss regardless of the number of times the coin is tossed.
    3. The outcome of one trial does not affect the outcome of any other trial. The trials are statistically independent.
    The probability of r successes in n trials is given as
      $P(r) = {}^{n}C_{r}\, p^{r} q^{n-r} = \frac{n!}{r!\,(n - r)!}\, p^{r} q^{n-r}$
    where p is the probability of success in a single trial and q = 1 – p.
    The mean of the binomial distribution is $\mu = np$.
    The standard deviation of the binomial distribution is $\sigma = \sqrt{npq}$.
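A minimal Python sketch of the binomial formula, using `math.comb` (available from Python 3.8) for nCr. The example of five tosses of a fair coin and the function name are hypothetical.

```python
import math


def binomial_pmf(r, n, p):
    """Probability of r successes in n trials: nCr * p^r * q^(n-r)."""
    q = 1 - p
    return math.comb(n, r) * p ** r * q ** (n - r)


n, p = 5, 0.5  # hypothetical: five tosses of a fair coin
for r in range(n + 1):
    print(r, binomial_pmf(r, n, p))

print("mean =", n * p)                               # mu = np
print("s.d. =", math.sqrt(n * p * (1 - p)))          # sigma = sqrt(npq)
```

The probabilities for r = 0, 1, …, n add up to 1, and with p = 0.5 they are symmetrical about the mean.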
  • 37. ➢ When p = 0.5, the binomial distribution is symmetrical.
    ➢ When p > 0.5, the binomial distribution is skewed to the left.
    ➢ When p is small (0.1), the binomial distribution is skewed to the right.
    ➢ As p increases towards 0.5 (e.g. 0.3), the skewness is less noticeable.
    ➢ The probabilities for p = 0.3 are the same as those for p = 0.7, except that the values of p and q are reversed.
    The Poisson distribution: it is based on previous data and is a discrete probability distribution. The Poisson probability formula is
      $P(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$
    where $\lambda$ is the mean number of occurrences and x is the number of occurrences whose probability is required.
    The Poisson distribution can be used instead of the binomial distribution to avoid the tedious calculation of binomial probabilities when n is large and p is small, that is, when the number of trials is large and the binomial probability of success is small. It gives a good approximation to the binomial when n is greater than or equal to 20 and p is less than or equal to 0.05.
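To see how closely the Poisson formula approximates the binomial when n is large and p is small, the two can be compared in Python. The figures n = 100 and p = 0.02 are hypothetical, and λ is set to np, the usual choice for this approximation.

```python
import math


def binomial_pmf(r, n, p):
    """Binomial probability: nCr * p^r * q^(n-r)."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)


def poisson_pmf(x, lam):
    """Poisson probability: lambda^x * e^(-lambda) / x!"""
    return lam ** x * math.exp(-lam) / math.factorial(x)


n, p = 100, 0.02   # hypothetical: many trials, small probability of success
lam = n * p        # assumption: lambda is taken as np

for x in range(6):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
```

The two columns differ only in the third decimal place, while the Poisson column avoids the large factorials of the binomial calculation.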
  • 38. The normal probability distribution – a continuous probability distribution (Gaussian distribution), named after Carl Friedrich Gauss, who developed it in the early 19th century.
    [Figure: bell-shaped normal curve, horizontal axis marked from –2 to 2]
    ➢ The curve is bell shaped; it has a single peak. It is unimodal.
    ➢ The mean of a normally distributed population lies at the centre of its normal curve.
    ➢ Because of the symmetry of the normal probability distribution, the mean, median and mode have the same value, at the centre of the curve.
    ➢ The two tails of the normal probability distribution extend indefinitely and never touch the horizontal axis.
    ➢ The total area under the curve is 1.00 (probability).
    ➢ In a normally distributed population:
      ❖ Approximately 68% of all the values lie within ±1 standard deviation from the mean.
      ❖ Approximately 95.5% of all the values lie within ±2 standard deviations from the mean.
      ❖ Approximately 99.7% of all the values lie within ±3 standard deviations from the mean.
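The three percentages above can be reproduced from the normal curve itself. In the Python sketch below, the area within ±k standard deviations of the mean is computed with the error function, using the standard identity P(|Z| ≤ k) = erf(k / √2); the function name is illustrative.

```python
import math


def area_within(k):
    """Area under the normal curve within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within ±{k} s.d.: {area_within(k) * 100:.1f}%")
```

The printed values round to 68.3%, 95.4% and 99.7%, in line with the approximate figures quoted on the slide.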
  • 39. Estimation
    Statistical inference is based on estimation and hypothesis testing. In both, we infer something about the characteristics of a population from the information contained in a sample.
    There are two types of estimate of a population parameter:
    1) A point estimate – it is a single number that is used to estimate an unknown population parameter. A point estimate is more useful if it is accompanied by an estimate of the error that might be involved.
      $\bar{x} = \frac{\sum x}{n}$
    Thus, by using the sample mean $\bar{x}$ as the estimator, we have a point estimate of the population mean $\mu$. Similarly, we can use the sample variance $s^2$ to estimate the population variance $\sigma^2$, where the sample variance is given by the formula
      $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
    2) An interval estimate – it is a range of values used to estimate a population parameter. It indicates the error in two ways: by the extent of its range, and by the probability of the true population parameter lying within that range. In other words, an interval estimate is a range of values within which a population parameter is likely to lie.
    ❖ Interval estimate and confidence level: the probability that we associate with an interval estimate is called the confidence level. A higher probability means more confidence. In estimation, the most commonly used confidence levels are 90%, 95% and 99%, but we are free to apply any confidence level. The confidence interval is the range of the estimate we are making.
    Criteria for a good estimator
    o Unbiasedness, efficiency, consistency and sufficiency.
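The point estimates above, and one possible interval estimate, can be sketched in Python with the standard-library `statistics` module. The sample is hypothetical. The interval formula used here, x̄ ± z·s/√n with z taken from the normal distribution, is an assumption of this sketch and is not given on the slide; it is a large-sample approximation, and for small samples a t-based interval is normally preferred.

```python
import math
from statistics import NormalDist, mean, variance


def interval_estimate(sample, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Assumption (not from the slide): normal approximation,
    x-bar +/- z * s / sqrt(n).
    """
    n = len(sample)
    x_bar = mean(sample)
    s = math.sqrt(variance(sample))                  # variance() divides by n - 1
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # e.g. about 1.96 for 95%
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


sample = [4, 8, 6, 5, 3, 7]  # hypothetical observations
print("point estimate of the mean:", mean(sample))
print("sample variance s^2:", variance(sample))
for level in (0.90, 0.95, 0.99):
    print(level, interval_estimate(sample, level))
```

A higher confidence level gives a wider interval, which is the trade-off between confidence and precision described above.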
  • 40. Research Methodology
    Research definition: research simply means a search for facts, an answer to a question and a solution to a problem. It is a purposive investigation and an organised enquiry. It seeks to find explanations of unexplained phenomena, to clarify doubtful facts and to correct misconceived facts.
    There are two types of method of searching for facts:
    1) Arbitrary method (unscientific method) – it is based on opinion, imagination, belief, impression, etc. Its findings vary from person to person.
    2) Scientific method – it is a systematic and rational approach to seeking facts.
    Aims of research
    ➢ Discover new facts.
    ➢ Verify or test old facts.
    ➢ Develop new scientific tools, concepts and theories.
    Research is a scientific endeavour. It involves the scientific method. The scientific method is based on certain articles of faith. These are:
    • Reliance on empirical evidence: truth is established on the basis of evidence.
    • Use of relevant concepts: use concepts with specific meanings.
    • Commitment to objectivity: objectivity is the hallmark of the scientific method.
    • Ethical neutrality
    • Generalization
    • Verifiability
    • Logical reasoning process
  • 41. Characteristics of research
    ➢ It is a systematic and critical investigation into a phenomenon.
    ➢ It is a purposive investigation aiming at describing, interpreting and explaining a phenomenon.
    ➢ It adopts scientific methods.
    ➢ It is objective and logical.
    ➢ It is based upon observable experience or empirical evidence.
    ➢ It is directed towards finding answers.
    ➢ It emphasizes the development of generalisations, principles and theories.
    ➢ It also stands up to test and criticism.
    Purpose of research
    ✓ Research extends the knowledge of human beings.
    ✓ It explains undiscovered phenomena.
    ✓ It verifies and tests existing facts and theories.
    ✓ It develops general laws.
    ✓ It analyses the inter-relationships between variables.
    ✓ It derives causal explanations.
    ✓ Applied research aims at finding solutions to real-life problems.
    ✓ It also develops new tools, concepts and theories.
    ✓ It contributes to human development.
    Types of research
    A) According to intent:
    1) Pure/basic/fundamental research 2) Applied research 3) Exploratory research 4) Descriptive research 5) Diagnostic studies 6) Evaluation studies 7) Action research – it is a type of evaluation study.
  • 42. DR. Md.Khurshid Alam 41 B) According to method of study- 1) Experimental research 2) Clinical research 3) Analytical studies 4) Historical research – it studies past records and data 5) Survey Research approaches (inquiry mode/scale of measurement)- two types- ❖ Quantitative approach ❖ Qualitative approach On the basis of application- two types ❖ Pure/ fundamental/ basic research ❖ Applied research On the basis of objective – four types- ❖ Descriptive research ❖ Exploratory research ❖ Explanatory research ❖ Correlational research 1. Pure/ basic/ fundamental research- it aims at the extension of knowledge. It need not be problem-oriented. It may lead either to the discovery of a new theory or to the refinement of an existing theory. It lays the foundation for applied research. E.g. the humoral theory of Hippocrates, Einstein’s theory of relativity etc. 2. Applied research- it is real-life problem-oriented and action-directed research. It seeks an immediate and practical result. 3. Descriptive research- it is the simplest type of research. It is a fact-finding investigation with adequate interpretation. It systematically describes a situation, phenomenon, problem, service or programme, or describes attitudes towards an issue.
  • 43. DR. Md.Khurshid Alam 42 4. Exploratory research- it is also known as formulative research. This type of study is undertaken either to explore an area where little is known or to investigate the possibility of undertaking a particular research study. 5. Explanatory research- it clarifies the relationship between two aspects of a situation or phenomenon. 6. Correlational research – it discovers or establishes the existence of a relationship/ association/ interdependence between two or more aspects of a situation. ❖ Quantitative approach- it is a structured/ rigid/ predetermined methodology to quantify the extent of variation in a phenomenon, situation, issue, etc. It has reliability and objectivity. ❖ Qualitative approach – it is an unstructured/ flexible/ open methodology to describe variation in a phenomenon, situation, issue, etc. The emphasis is on description of variables. Hence, research is a scientific undertaking which, by means of logical and systematic techniques, aims to discover new facts or verify and test old facts, to analyse their sequence, interrelationships and causal explanations, and to develop new scientific tools, concepts and theories which would facilitate reliable and valid study of human behaviour. It also stands up to the test of criticism. Steps of research ➢ Formulating a research problem ➢ Research design conceptualisation ➢ Instrument construction for data collection ➢ Sampling ➢ Research proposal writing ➢ Data collection ➢ Data processing ➢ Research report writing
  • 44. DR. Md.Khurshid Alam 43 Research problem It is a difficulty or a problem demanding a solution within the subject area of the researcher's discipline. It is the first step in a scientific enquiry. There are five components of a problem: ➢ Research consumer ➢ Research consumer's objectives ➢ Alternative means to meet the objectives ➢ Doubt in regard to the selection of alternatives ➢ There must be one or more environments to which the difficulty or problem pertains. Selection of a problem: - it is the first step in research. Selection is itself a problem. One who has a critical, curious and imaginative mind and is sensitive to practical problems can easily identify a problem for study. Sources for selection of a problem: - • Review of literature • Academic experience • Daily experience • Exposure to field situations • Consultations and brainstorming • Research • Institution Formulating the problem – it needs the following criteria- I. Internal criteria – these consist of: a) Researcher's interest b) Researcher's competence c) Researcher's own resources.
  • 45. DR. Md.Khurshid Alam 44 II. External criteria – a) Researchability of the problem b) Importance and urgency c) Novelty of the problem d) Feasibility e) Facilities f) Usefulness and social relevance g) Research personnel Objective of formulating a problem – A problem well put is half solved. Formulation serves several purposes. The clear and accurate statement of the problem, the development of the conceptual model, the definition of the objectives of the study, the setting of investigative questions, the formulation of hypotheses to be tested, the operational definition of concepts and the delimitation of the study together determine the exact data needs of the study. It prevents wastage of time and energy. It provides direction to the study. It determines the methods to be adopted. Techniques involved in formulating a problem- these include- I. Developing the title- It indicates the core of the study and reflects the real intention of the researcher. The title should be long enough to cover the subject and short enough to retain interest. II. Building a conceptual model- A conceptual model gives an exact idea of the research problem and shows its various properties and the variables to be studied. III. Defining the objectives of the study – They indicate what we are trying to achieve through the study.
  • 46. DR. Md.Khurshid Alam 45 Criteria of a good research problem- 1. Verifiable evidence: - other observers can see or check it. 2. Accuracy: - the truth or correctness of a statement. 3. Precision: - making it as exact as necessary. 4. Systematisation: - data should be collected in a systematic and organised way. 5. Objectivity: - being free from all biases and vested interests. 6. Recording: - writing down complete details as quickly as possible. 7. Controlling conditions: - controlling all variables except one. 8. Training investigators: - imparting the necessary knowledge to investigators. Types of research question: - a research study can ask three types of question: ▪ Descriptive question – describes phenomena or characteristics of a particular group of subjects being studied. ▪ Relationship question – investigates the degree to which two or more variables are associated with each other. ▪ Difference question – makes comparisons between or within groups of interest. A research question must identify – ▪ Variables under study ▪ Population being studied ▪ Testability of the question
  • 47. DR. Md.Khurshid Alam 46 Concept of variable A variable is a characteristic, trait or attribute of a subject. ❖ Variable – a quantitative characteristic which varies from unit to unit. E.g.- height ❖ Attribute – a qualitative characteristic which varies from unit to unit. E.g.- sex ❖ Discrete variable – a variable which takes only some specified values in a given range. E.g. – number of children per family. ❖ Continuous variable – a variable which can assume all the values in a range. E.g.- height of persons Types of variable – ▪ Independent – any variable which is introduced to bring about change (effect) is called an independent variable. ▪ Dependent – the variable that changes under the effect of another variable is called the dependent variable. ▪ Extraneous – an independent variable which is unwanted for the purpose of the study but may affect the dependent variable is called an extraneous variable/ factor. ▪ Chance variable – it is also an independent and unwanted variable, which may affect the dependent variable by chance.
  • 48. DR. Md.Khurshid Alam 47 Research design It is a logical and systematic plan prepared for directing a research study. It specifies the objectives of the study and the methodology and techniques to be adopted for achieving them, and it constitutes the blueprint for the collection, measurement and analysis of data. A research design is a programme that guides the investigator in the process of collecting, analysing and interpreting observations. According to Cook – A research design is the arrangement of conditions for collection and analysis of data in a manner that aims to combine relevance to the research purpose with economy in procedure. Components of research design 1. Dependent and independent variables: - Phenomena that can assume different quantitative values, even in decimal points, are known as continuous variables, and those that can be expressed only in integer values are called non-continuous variables. 2. Extraneous factors- independent variables which are not directly related to the purpose of the study but affect the dependent variable. 3. Control – the term control is used in experimental research to reflect the restraint of experimental conditions. It is used to minimise the effect of extraneous independent variables. 4. Confounded relationship – the relationship between the dependent and independent variables is said to be confounded by an extraneous variable when the dependent variable is not free from its effect. ❖ Research hypothesis- it is a predictive statement which relates a dependent variable to an independent variable. ❖ Control group- in experimental research, the group which is exposed to the usual conditions is known as the control group. ❖ Experimental group- the group which receives or is exposed to the intervention is called the experimental group. ❖ Treatment- it refers to the different conditions to which the experimental and control groups are subjected.
  • 49. DR. Md.Khurshid Alam 48 Functions of research design ❖ It relates to the identification and development of the procedures, and the logistical arrangement of those procedures, required for the study. ❖ It emphasises the quality of the adopted procedures to ensure validity, objectivity and accuracy. A research design should contain the following information- ❖ Who will constitute the research population? ❖ Method of identification of the research population. ❖ Whether the whole population will be studied or not. If not, the selection of the sample and the method of sampling. ❖ Method of data collection, with justification. ❖ How will ethical issues be addressed? Different research designs- (studies are of different types; hence a single research design is not suitable for all studies.) A) On the basis of the number of contacts with the study population, research design is of three types- 1. Cross-sectional study or prevalence study (one contact only): This study is cross-sectional with respect to both the study population and the time of investigation. It is an extremely simple design used to study the prevalence of a phenomenon, situation or issue. It is easy and cheap, but change cannot be measured by this study. 2. Before and after study (two contacts): - It is also known as the pre-test/post-test design. It can be described as two sets of cross-sectional data collection points on the same population, used to find out the change in a phenomenon or variable between two points in time. It can measure change in a situation, phenomenon or issue. It is an appropriate design for measuring the impact or effectiveness of a programme. It may be either experimental or non-experimental. It is more difficult, more expensive and requires a longer time to complete. It measures total change (change produced by both the independent variable and extraneous variables). The result of this study may be contaminated by the maturation effect, reactive effect and regression effect.
  • 50. DR. Md.Khurshid Alam 49 3. Longitudinal study (more than two contacts): - In this study the population is visited a number of times at regular intervals to collect the required information. The number of intervals varies from study to study. The interval may be days, weeks, months or years, depending upon the study. The pattern of change can be studied by this method, but the maturation effect, reactive effect, regression effect and conditioning effect can produce errors in the data. B) On the basis of reference period, study design is of three types- 1. Retrospective study design: - This study focuses on a problem or phenomenon which has happened in the past. Data can be collected on the basis of recall of the situation. It is always non-experimental. 2. Prospective study design: - In this design the study is done in the future. It may be experimental, non-experimental or semi-experimental. 3. Retro-prospective design: - This study focuses on past trends in a phenomenon and studies it into the future. It is a combination of retrospective and prospective studies. C) On the basis of nature of investigation, study design is of three types- 1. Non-experimental study 2. Experimental study 3. Semi-experimental study 1. Non-experimental study- This is a cause-tracing study, that is, it starts from the effect to trace the cause. The environment is not controlled in this study. 2. Experimental study – It is a cause-and-effect relationship study. It starts from the cause to establish the effect. An experimental study can be carried out in either a controlled or a natural environment.
  • 51. DR. Md.Khurshid Alam 50 Prof. Fisher has enumerated three principles of experimental design- a) The principle of replication: - the experiment should be repeated more than once. Thus, each treatment is applied to many experimental units instead of one. By doing so, the statistical accuracy of the experiment is increased. b) The principle of randomisation: - randomisation provides protection, when we conduct an experiment, against the effect of extraneous factors. c) The principle of local control: - this means that we should plan the experiment in such a manner that we can perform a two-way analysis of variance, in which the total variability of the data is divided into three components attributed to the treatment, the extraneous factor and experimental error. Types of experimental study design- there are many types of experimental study design; some of those used in medical science and public health are- ❖ The after-only design ❖ The before and after design ❖ The control group design ❖ The double control design ❖ The comparative design ❖ The matched control experimental design ❖ The placebo design The after-only design- In this design the baseline data (pre-test or before observation) are constructed on the basis of respondents' recall of the situation before the intervention, or from information available in existing records. Only one set of data, after the intervention, is collected. The change in the dependent variable is measured by the difference between the baseline and the after-intervention data. This study measures total change, including change attributable to extraneous variables; hence, the net effect of the intervention cannot be identified by this design. Because the baseline data are not properly collected, the two sets of data are not strictly comparable. So, it is a faulty design for measuring the impact of an intervention.
  • 52. DR. Md.Khurshid Alam 51 The before and after design- In this design baseline data are collected before the intervention and another set of data is collected after the intervention from the same population. Hence, the data are comparable. This design also measures the total change. Effect (change in dependent variable) = after-intervention data – baseline data = intervention effect + extraneous effect + chance effect. The control group design- In experimental research, the term control is used to refer to the restraint of experimental conditions. This design aims to minimise the effect of extraneous independent variables. In this study a test article is compared with a treatment that has a known effect. The control group may receive no treatment, a standard treatment or a placebo. The chief objective of the control group is to quantify the impact of extraneous variables. By this design the net effect of the intervention is measured. Net effect of intervention = change in experimental group (intervention effect + extraneous effect + chance effect) – change in control group (extraneous effect + chance effect). Purpose of control study ➢ It helps in differentiating the outcome caused by the test treatment from outcomes caused by other (extraneous) factors. ➢ It helps the investigator to know what would have happened to the patients if they had not received the treatment. ➢ It provides sufficient evidence to establish the effectiveness of a Unani medicine/ procedure in prevention, diagnosis or treatment. Control may consist of- ❖ No treatment (plain control) ❖ Placebo treatment ❖ Well-established treatment ❖ Different dose of the same treatment ❖ Full-scale treatment ❖ Minimal treatment ❖ Alternative treatment
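The two effect formulas on this slide can be illustrated with a small calculation. The sketch below is only one reading of them: it takes the before and after design as "after minus baseline" (total change) and the control group design as "change in the experimental group minus change in the control group" (net effect). The function names and the blood pressure figures are invented for illustration and do not come from the slides.

```python
def total_change(baseline: float, after: float) -> float:
    """Before and after design: total change in the dependent variable.

    This figure still contains the extraneous and chance effects.
    """
    return after - baseline


def net_effect(exp_before: float, exp_after: float,
               ctrl_before: float, ctrl_after: float) -> float:
    """Control group design: experimental change minus control change.

    The control group's change stands in for the extraneous and
    chance effects, so subtracting it leaves the intervention effect.
    """
    return total_change(exp_before, exp_after) - total_change(ctrl_before, ctrl_after)


# Hypothetical mean systolic blood pressure (mmHg) in each group
experimental_change = total_change(150.0, 132.0)   # change seen with intervention
control_change = total_change(149.0, 143.0)        # change seen without it
intervention_effect = net_effect(150.0, 132.0, 149.0, 143.0)

print("Total change, experimental group:", experimental_change)
print("Total change, control group:", control_change)
print("Net effect of intervention:", intervention_effect)
```

With these made-up numbers, a before and after design alone would credit the whole fall in the experimental group to the intervention, while the control group shows that part of that fall happened without it.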
  • 53. DR. Md.Khurshid Alam 52 In experimental hypothesis-testing research, a group that is exposed to the usual conditions is termed the control group, and a group that is exposed to some novel/special condition is termed the experimental group. Types of control A) Plain control (no treatment in control group)- It is always open, that is, neither the patient nor the investigator is blind. E.g. effect of Hijamah in hair fall. B) Placebo control- A placebo is a pharmacologically inert substance used in a clinical trial. Placebo treatment is also called dummy treatment. • Single blind control • Double blind control (ideal) • Triple blind control C) Standard control- D) Dose-response control E) External control – Subjects receiving treatment are compared with a group of patients external to the study. F) Multiple control- more than one type of control group. Disadvantages- ➢ Ethical concerns. ➢ In certain conditions (e.g. hypertension) it may not be wise to withdraw the patient from medication, as this may pose a serious threat to the patient's wellbeing. Matching: - it is a method of forming comparable control and experimental groups to minimise the effect of extraneous variables. Matching on a minimum number of parameters is better. Matching may be – A) Topographic B) Same habit C) Same socioeconomic status D) Dietary habit matching E) Professional matching F) Age matching G) Sex matching
  • 54. DR. Md.Khurshid Alam 53 Blinding Blinding is a method of controlled experimentation in which the subject, the researcher or both are not informed about the treatment given. According to the level of blinding, trials can be divided into the following four types- 1. Unblinded study/open study- in this type of study both the patients and the investigator (everybody involved in the trial) are aware of the identity of the treatment given. 2. Single blind trials – in this type of study the patients are not aware of the trial treatment being given to them, but their physician knows it. 3. Double blind trials - in this type of study neither the patients nor the investigator know which treatment an individual patient is given. 4. Triple blind trials – a double-blind study design in which the response is monitored by a committee that is also blind is called a triple blind study. In this type, the patients, the investigators and the data analyst do not know which treatment is being given, since the treatment may be coded. When the trial reaches a predefined point, the code is broken and the trial is unblinded. This design gives an advantage over the double blind study because the monitoring committee can evaluate the response in a more objective fashion. Trial with Zelen’s design- in this design eligible individuals are randomised, before they give consent to participate in the trial, to receive either a standard treatment or an experimental intervention. Those who are allocated to the standard treatment are given the standard treatment and not told that they are part of a trial, whereas those who are allocated to the experimental intervention are offered the experimental intervention and told that they are part of a trial. If they refuse to participate in the trial, they are given the standard intervention but are analysed as if they had received the experimental intervention.
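The slide notes that in a blinded trial the treatment may be coded and the code is broken only at a predefined point. A minimal sketch of that idea follows, assuming a simple two-arm trial with equal-sized arms. The function name `allocate_blinded`, the arm labels "A" and "B" and the subject numbers are hypothetical and are not taken from the slides.

```python
import random


def allocate_blinded(subject_ids, seed=None):
    """Randomly split subjects into two coded arms of (near) equal size.

    Returns (allocation, key):
      allocation maps each subject to a code, "A" or "B";
      key maps each code to the real treatment and is held back
      by a third party until the trial is unblinded.
    """
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)                      # random order of subjects

    codes = ["A", "B"]
    rng.shuffle(codes)                    # which code means which arm is random too
    key = {codes[0]: "test treatment", codes[1]: "control"}

    half = len(ids) // 2
    allocation = {sid: ("A" if i < half else "B") for i, sid in enumerate(ids)}
    return allocation, key


# Twenty hypothetical subjects; patients and investigators see only the codes
allocation, key = allocate_blinded(range(1, 21), seed=1)
print(allocation)

# Unblinding: the key is consulted only when the predefined point is reached
unblinded = {sid: key[code] for sid, code in allocation.items()}
```

Keeping the key separate from the allocation list is what lets patients, investigators and analysts work with "A" and "B" without knowing which is the test treatment.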
  • 55. DR. Md.Khurshid Alam 54 Clinical research/ trial Clinical research/ trial is the systematic study of a pharmaceutical product or procedure in human subjects. A clinical trial is a prospective study comparing the outcome of a certain intervention against a control in human subjects. It may proceed from cause to effect or effect to cause. Objectives- ➢ To evaluate the safety and efficacy of Unani drugs already claimed by Unani physicians. ➢ To develop new Unani drugs. ➢ To develop economical, easily available Unani drugs. ➢ To develop new indications for Unani drugs or to change the dosage form or route of administration. ➢ Objectives may be oriented towards diseases, drugs, procedures or the fundamentals of the science. ➢ Disease-oriented objectives are- ❖ To study the aetiology. ❖ To study the pathogenesis. ❖ To study the clinical methods. ❖ To study the principles and methods of treatment. ❖ To study the prognosis of the disease. ❖ To study the complications of the disease. ➢ Drug-oriented objectives are- ❖ To study the safety and efficacy of Unani drugs. ❖ Clinical studies: therapeutic trials of single and compound drugs in different diseases. ➢ Clinico-pharmacological studies- ❖ To validate various regimens (e.g. cupping). ❖ To validate the fundamentals of the Unani system of medicine. ❖ To develop parameters for mizaj assessment. ❖ To develop diagnostic tools based on Unani fundamentals. ❖ To validate asbabe sitta zarooria and gair zarooria. ➢ Clinical trials may be concerned with ilaz bil tadbeer, ilaz bil gheza, surgical procedures, radiotherapy, or other alternative approaches.
  • 56. DR. Md.Khurshid Alam 55 There are three elements of experimental design- A) Control B) Randomisation C) Blinding Phases of clinical trials: - there are four phases of clinical trials, each following from the previous one. ❖ Phase I (Pharmacological phase) • It is always preceded by preclinical data on the safety and efficacy of the test drug from studies on animal subjects. • Human subjects (healthy volunteers) are exposed to the test drug for the first time. • The purpose of the trial is to find out toxicity, to calculate the maximum tolerated dose in humans, and to study the pharmacokinetics of the drug, such as its metabolism and distribution. ❖ Phase II (Exploratory phase) • It proceeds after successful phase I trials. • It is conducted on patients. • The purpose of the trial is to assess safety and efficacy in patients, • and to study the pharmacodynamics and pharmacokinetics of the test drug. • Informed consent from the subject is mandatory. ❖ Phase III (Confirmatory phase) • It proceeds after successful phase II trials. • The purpose of the trial is to confirm the safety and efficacy of the test drug. Long-term tolerance, different doses and regimens, and drug interactions are studied. • Subjects are patients; hence informed consent from the subject is mandatory.
  • 57. DR. Md.Khurshid Alam 56 ❖ Phase IV (Post-marketing surveillance) • It is not a clinical trial; rather, it is feedback or adverse drug reaction (ADR) reporting on patients from the market. • Delayed and rare effects may be reported from the field. • Effects in special populations or conditions may be reported. • For this phase, voluntary reporting and cohort study methods are adopted. Case study method Herbert Spencer was the first social philosopher to use case studies, in comparative studies of different cultures. Several case studies were recorded by Zakaria Razi in his writings. The case study is a method of exploring and analysing the life of a social unit or entity, be it a person, a family, an institution or a community. The aim of the case study method is to locate or identify the factors that account for the behaviour patterns of a given unit and its relationship with the environment. A case study is conducted to understand, explore and interpret an issue about which little is known. Advantage of case study method- it provides an opportunity for the intensive analysis of many specific details often overlooked by other methods. Disadvantage of case study method- the case documents hardly fulfil the criteria of reliability, adequacy and representativeness. A case may be extremely typical or atypical.
  • 58. DR. Md.Khurshid Alam 57 Case control study: - It is a retrospective study. It is the first approach to testing a causal hypothesis. a) Both exposure and outcome (disease) have occurred before the start of the study. b) The study proceeds from effect to cause (backward direction). c) It uses a control or comparison group to support or refute an inference. There are four basic steps in conducting a case control study- 1. Selection of cases and controls 2. Matching 3. Measurement of exposure 4. Analysis and interpretation Trend studies- these are the most appropriate method of investigation to map change over a period. Cohort study A cohort is defined as a group of people (units) who share a common characteristic or experience within a defined time period (age, occupation, pregnancy etc.). A cohort study is a type of analytical (observational) study which is usually undertaken to obtain additional evidence to refute or support the existence of an association between a suspected cause and a disease. It is a longitudinal and incidence study. Action research: - it is carried out to identify areas of concern, develop and test alternatives, and experiment with new approaches. There are two traditions of action research- (1) The British tradition (2) The American tradition
  • 59. DR. Md.Khurshid Alam 58 Inductive and deductive approaches to research (Inductive = Zuj se qul, from part to whole; Deductive = qul se zuj, from whole to part) The main difference between inductive and deductive approaches to research is that whilst a deductive approach is aimed at testing theory, an inductive approach is concerned with the generation of new theory emerging from the data. A deductive approach usually begins with a hypothesis, whilst an inductive approach will usually use research questions to narrow the scope of the study. For deductive approaches the emphasis is generally on causality, whilst for inductive approaches the aim is usually focused on exploring new phenomena or looking at previously researched phenomena from a different perspective. Inductive approaches are generally associated with qualitative research, whilst deductive approaches are more commonly associated with quantitative research. However, there are no set rules and some qualitative studies may have a deductive orientation. One specific inductive approach that is frequently referred to in research literature is grounded theory, pioneered by Glaser and Strauss. This approach requires the researcher to begin with a completely open mind, without any preconceived ideas of what will be found. The aim is to generate a new theory based on the data. Once the data analysis has been completed the researcher must examine existing theories in order to position their new theory within the discipline. Grounded theory is not an approach to be used lightly. It requires extensive and repeated sifting through the data, analysing and re-analysing it multiple times in order to identify new theory. It is an approach best suited to research projects where the phenomenon to be investigated has not been previously explored.
The most important point to bear in mind when considering whether to use an inductive or deductive approach is firstly the purpose of your research; and secondly the methods that are best suited to either test a hypothesis, explore a new or emerging area within the discipline, or to answer specific research questions.
  • 60. DR. Md.Khurshid Alam 59 Sampling A part of the population is known as a sample. The method of selecting a portion of the universe (population) for study, with a view to drawing conclusions about the universe (population), is known as sampling. Sampling helps in saving time and cost. A statistic is a characteristic of a sample; it is denoted by a lower-case Roman letter. E.g. sample size is denoted by n, sample standard deviation is denoted by s. A parameter, by contrast, is a characteristic of a population and is denoted by a Greek or capital letter. E.g. population size is denoted by N, population standard deviation is denoted by σ. Advantages of sampling ➢ It limits the number of units for study. (Unit – the object whose characteristics are studied) ➢ It makes the study feasible in respect of budget, time and logistics. Characteristics of a good sample ➢ Representativeness: a sample must be representative of the population. ➢ Accuracy: an accurate sample is one which exactly represents the population. ➢ Precision: precision is measured by the standard error. ➢ Size: a good sample must be adequate in size in order to be reliable. Types of sampling: - there are two generic types- A) Random or probability sampling B) Non-random or non-probability sampling
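The distinction between a parameter (N, σ) and a statistic (n, s) on this slide can be made concrete with a short sketch. The population of heights below is invented for illustration, the sample is drawn at random without replacement using Python's standard `random.sample`, and the variable names simply mirror the slide's notation.

```python
import random
import statistics

# Hypothetical population: heights in cm (values invented for illustration)
population = [150, 152, 155, 158, 160, 161, 163, 165, 168, 170,
              171, 172, 174, 175, 177, 178, 180, 182, 185, 190]

# Parameters describe the whole population (capital or Greek letters)
N = len(population)
mu = statistics.mean(population)
sigma = statistics.pstdev(population)   # population SD: divides by N

# Draw a random sample of 5 units without replacement
rng = random.Random(42)                 # fixed seed so the run is repeatable
sample = rng.sample(population, 5)

# Statistics describe the sample (lower-case Roman letters)
n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)            # sample SD: divides by n - 1

print(f"Parameters: N={N}, mu={mu:.2f}, sigma={sigma:.2f}")
print(f"Statistics: n={n}, x_bar={x_bar:.2f}, s={s:.2f}")
```

The sample mean and standard deviation will usually differ somewhat from μ and σ, and they change from one sample to the next. That variation is why a sample has to be adequate in size and representative before conclusions are drawn about the population.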
  • 61. DR. Md.Khurshid Alam 60 A) Random or probability sampling: - It is based on the theory of probability. It provides a known non-zero chance of selection for each population element. There are four methods of random sampling- 1. Simple random sampling 2. Systematic random sampling 3. Stratified random sampling 4. Cluster sampling 1. Simple random sampling: - This sampling technique gives each element an equal and independent chance of being selected. ➢ Sample numbers are drawn by using (a) the lottery method, (b) a table of random numbers or (c) a computer. ➢ This type of sampling is suited to a small homogeneous population. ➢ This is one of the easiest methods. 2. Systematic random sampling: - In this sampling, elements are selected from the population at a uniform interval that is measured in time, order or space. It is simpler than simple random sampling. It ignores all elements between the selected ones. 3. Stratified random sampling: - In this method we divide the population into relatively homogeneous groups, called strata. Then we use one of two approaches- either we select at random from each stratum a specified number of elements corresponding to the proportion of that stratum in the population as a whole, or we draw an equal number of elements from each stratum and give weight to the results according to the stratum’s proportion of the total population. (Hence there are two methods of allocation – 1. Equal allocation and 2. Proportional allocation.) 4. Cluster sampling: - In this method we divide the population into groups or clusters, and then select a random sample of these clusters. We assume that these individual clusters are representative of the population as a whole. A well-designed cluster sampling procedure
  • 62. DR. Md.Khurshid Alam 61 can produce a more precise sample at considerably less cost than simple random sampling. ➢ Need for randomisation- The process of assigning the study subjects randomly to either the treatment or the control group is called randomisation. ❖ It is essential to control various known or even unknown biases at the beginning of the trial and during the course of the trial. Randomisation is very helpful in achieving this objective. ❖ Randomisation guards against bias influencing the result. ❖ Randomisation allows for valid statistical interpretation of raw data. ❖ It eliminates selection bias. ❖ It avoids systematic differences between groups. ❖ It produces comparable groups. B) Non-random or non-probability sampling: - It is not based on the theory of probability. This sampling does not give each population element a known chance of selection. This method of sampling is simple, convenient and low-cost. It may be classified into- 1. Convenience or accidental sampling: - It means selecting sample units in a “hit and miss” fashion. It is the cheapest and simplest method and does not require any statistical expertise. But it is highly biased because of the researcher’s subjectivity. 2. Purposive or judgemental sampling: - this method means the deliberate selection of sample units that conform to some predetermined criteria. The sample may not be truly representative of its parent population. 3. Quota sampling: - This is a form of convenience sampling involving the selection of quota groups of accessible sampling units by traits such as sex, age, social class, etc. 4. Snow-ball sampling: - This is a method of building up a list or a sample of a special population by using an initial set of its
  • 63. DR. Md.Khurshid Alam 62 members as informants. It is useful for smaller populations for which no frame is readily available. Sampling distribution ➢ If we take several samples from a population, the statistic we compute for each sample need not be the same and will most probably vary from sample to sample. ➢ A probability distribution of all the possible means of the samples is a distribution of the sample means. Statisticians call this a sampling distribution of the mean. ➢ The standard deviation of the distribution of sample means is called the standard error of the mean. ➢ The standard deviation of the distribution of sample proportions is called the standard error of the proportion. ➢ The standard deviation of the distribution of a sample statistic is known as the standard error of the statistic. ➢ The sampling distribution has a mean equal to the population mean: $\mu_{\bar{x}} = \mu$. ➢ The sampling distribution has a standard deviation (a standard error) equal to the population standard deviation divided by the square root of the sample size: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. Therefore, the standard error of the mean for an infinite population is given by $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the population standard deviation and $n$ is the sample size. If the sample mean is standardised and the sample is taken from a normal population, then the standardised sample mean is given by: $Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$
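The two properties of the sampling distribution stated on this slide can be checked by listing every possible sample from a tiny population. The sketch below uses an invented population of four values and samples of size 2 drawn with replacement, which stands in for the infinite-population case. The helper `z_score` is a hypothetical name and standardises a sample mean by dividing by the standard error, $\sigma/\sqrt{n}$.

```python
import itertools
import math
import statistics

population = [2, 4, 6, 8]   # hypothetical values, chosen to keep the arithmetic easy
n = 2                       # sample size

mu = statistics.mean(population)
sigma = statistics.pstdev(population)

# Every possible ordered sample of size n, drawn with replacement (16 samples)
sample_means = [statistics.mean(s)
                for s in itertools.product(population, repeat=n)]

# Mean and standard deviation of the sampling distribution of the mean
mean_of_means = statistics.mean(sample_means)
standard_error = statistics.pstdev(sample_means)

print("Population mean:", mu, "| mean of sample means:", mean_of_means)
print("sigma / sqrt(n):", sigma / math.sqrt(n), "| SD of sample means:", standard_error)


def z_score(x_bar, mu, sigma, n):
    """Standardise a sample mean using the standard error sigma / sqrt(n)."""
    return (x_bar - mu) / (sigma / math.sqrt(n))
```

Because all 16 possible samples are enumerated rather than simulated, the agreement between the two printed pairs is exact apart from floating-point rounding.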
  • 64. The central limit theorem
First, the mean of the sampling distribution will equal the population mean regardless of the sample size, even if the population is not normal. Second, as the sample size increases, the sampling distribution of the mean will approach normality, regardless of the shape of the population distribution. This relationship between the shape of the population distribution and the shape of the sampling distribution of the mean is called the central limit theorem.
The relationship between sample size and standard error: use of the finite population multiplier
If the population size N is known and the sampling fraction satisfies $\dfrac{n}{N} > 0.05$, the standard error of the mean is calculated as
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \times \sqrt{\dfrac{N - n}{N - 1}}$
Here N = size of the population and n = size of the sample. The term $\sqrt{\dfrac{N - n}{N - 1}}$ in the above equation is the finite population multiplier.
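The rule above can be sketched in Python as follows. This is a minimal illustration that applies the multiplier only when n/N exceeds 0.05, as the notes state; the function name and the example values (σ = 10, n = 25, N = 100 or 10,000) are assumptions, not figures from the notes.

```python
import math


def standard_error_finite(sigma: float, n: int, N: int) -> float:
    """Standard error of the mean, with the finite population multiplier
    applied when the sampling fraction n/N is greater than 0.05."""
    se = sigma / math.sqrt(n)
    if n / N > 0.05:
        se *= math.sqrt((N - n) / (N - 1))  # finite population multiplier
    return se


# Hypothetical values, for illustration only.
se_small_pop = standard_error_finite(sigma=10.0, n=25, N=100)     # n/N = 0.25, multiplier used
se_large_pop = standard_error_finite(sigma=10.0, n=25, N=10_000)  # n/N = 0.0025, multiplier skipped
print(round(se_small_pop, 3), se_large_pop)
```

When the sample is a sizeable share of the population, the multiplier is less than 1 and the standard error shrinks accordingly.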
  • 65. Hypothesis = Hypo + thesis (hypo means under; thesis means research theory)
A hypothesis is an assumption about the relation between variables. It is a tentative explanation of the research problem or a guess about the research outcome.
Importance of hypothesis:
▪ It gives direction to research.
▪ It suggests new experiments and observations.
▪ It enables relevant data to be collected and organised effectively.
▪ It prevents indiscriminate gathering of data.
Sources of hypothesis:
▪ Existing theories.
▪ Findings of previous studies.
▪ Personal experience.
▪ Analogy.
Criteria for hypothesis construction: A hypothesis is never formulated in the form of a question. The following criteria should be followed:
➢ It should be empirically testable, whether it turns out to be right or wrong.
➢ It should be specific and precise.
➢ The statement of the hypothesis should not be contradictory.
➢ It should specify the variables.
➢ It should describe only one issue.
➢ It must consider the experience of other researchers.
Need for hypothesis:
✓ It provides a definite point to the investigation.
✓ It guides the direction of the study.
✓ It specifies the sources of data.
✓ It determines the data needs.
✓ It determines the most appropriate technique for analysis.
✓ It contributes to the development of theory.
  • 66. Characteristics of a good hypothesis
1. Conceptual clarity
2. Specificity
3. Empirically testable
4. Availability of techniques
5. Theoretical relevance
6. Consistency
7. Objectivity
8. Simplicity
Types of hypothesis
➢ Null hypothesis (H0) and alternative hypothesis (Ha)
➢ One-tailed or two-tailed hypotheses (directional vs non-directional)
The alternative hypothesis is usually the one that the researcher wishes to prove, and the null hypothesis is the one that the researcher wishes to disprove. The null hypothesis represents the hypothesis we are trying to reject; the alternative hypothesis represents all other possibilities. The null hypothesis should always be a specific hypothesis, i.e. it should not state "about" or "approximately" a certain value.
Concept of hypothesis testing:
A) The level of significance: This is a very important concept in hypothesis testing. It is always some percentage (usually 5%), which should be chosen with great care, thought and reason. A 5% level of significance means the researcher is willing to take as much as a 5% risk of rejecting the null hypothesis when it happens to be true.
Type I and Type II errors
Type I error: we reject H0 when H0 is true, i.e. we reject a hypothesis that should have been accepted. The probability of a Type I error is the level of significance of the test and is denoted by $\alpha$ (alpha).
Type II error: we accept H0 when it is not true, i.e. we accept a hypothesis that should have been rejected. Its probability is denoted by $\beta$ (beta).
  • 67. Two-tailed or one-tailed test
A one-tailed test should be used when we want to test in a single direction, say, whether the population mean is lower than some hypothesised value (or, alternatively, whether it is higher). A two-tailed test rejects the null hypothesis if, say, the sample mean is significantly higher or lower than the hypothesised value of the population mean.
(When we accept a null hypothesis on the basis of sample information, we are really saying that there is no statistical evidence to reject it. We are not saying that the null hypothesis is true. The only way to prove a null hypothesis is to know the population parameter, and that is not possible with sampling. Thus, we accept the null hypothesis and behave as if it is true simply because we can find no evidence to reject it.)
Steps for hypothesis testing using a standardised scale:
1. Decide whether it is a one-tailed or a two-tailed test.
2. State the hypotheses.
3. Select a level of significance appropriate for the decision.
4. Decide which distribution (t or z) is appropriate and find the critical values for the chosen level of significance from the appropriate table.
5. Calculate the standard error of the sample statistic.
6. Use the standard error to convert the observed value of the sample statistic to a standardised value.
7. Sketch the distribution and mark the position of the standardised sample value and the critical values of the test.
8. Compare the value of the standardised sample statistic with the critical values for this test and interpret the result.
  • 68. Statistical tests
Parametric: t-test, z-test, F-test
Non-parametric: chi-square test, Fisher test, Mann–Whitney U test
Z-test
A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. Because of the central limit theorem, many test statistics are approximately normally distributed for large samples. For each significance level, the Z-test has a single critical value (for example, 1.96 for 5% two-tailed), which makes it more convenient than the Student's t-test, which has separate critical values for each sample size. Therefore, many statistical tests can be conveniently performed as approximate Z-tests if the sample size is large or the population variance is known. If the population variance is unknown (and therefore has to be estimated from the sample itself) and the sample size is not large (n < 30), the Student's t-test may be more appropriate.
If T is a statistic that is approximately normally distributed under the null hypothesis, the next step in performing a Z-test is to estimate the expected value θ of T under the null hypothesis, and then obtain an estimate s of the standard deviation of T. After that, the standard score $Z = (T - \theta)/s$ is calculated, from which one-tailed and two-tailed p-values can be calculated as $\Phi(-Z)$ (for upper-tailed tests), $\Phi(Z)$ (for lower-tailed tests) and $2\Phi(-|Z|)$ (for two-tailed tests), where $\Phi$ is the standard normal cumulative distribution function.
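The three p-value formulas above can be illustrated with a small Python sketch. It builds Φ from the standard library's `math.erf`; the helper names `phi` and `z_test_p_values` and the example inputs are assumptions made for illustration only.

```python
import math


def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_test_p_values(T: float, theta: float, s: float) -> dict:
    """Standard score Z = (T - theta) / s and the three p-values."""
    z = (T - theta) / s
    return {
        "z": z,
        "upper_tailed": phi(-z),
        "lower_tailed": phi(z),
        "two_tailed": 2.0 * phi(-abs(z)),
    }


# Hypothetical input: a statistic 1.96 standard deviations above its
# expected value, so the two-tailed p-value is close to 0.05.
result = z_test_p_values(T=1.96, theta=0.0, s=1.0)
print(result)
```

This reproduces the familiar link between the critical value 1.96 and the 5% two-tailed significance level.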
  • 69. Use in location testing
The term "Z-test" is often used to refer specifically to the one-sample location test, which compares the mean of a set of measurements to a given constant when the population variance is known. If the observed data $X_1, \ldots, X_n$ are (i) independent, (ii) have a common mean $\mu$, and (iii) have a common variance $\sigma^2$, then the sample average $\bar{X}$ has mean $\mu$ and variance $\sigma^2/n$.
The null hypothesis is that the mean value of X is a given number $\mu_0$. We can use $\bar{X}$ as a test statistic, rejecting the null hypothesis if $\bar{X} - \mu_0$ is large.
To calculate the standardised statistic $Z = (\bar{X} - \mu_0)/s$, we need to either know or have an approximate value for $\sigma^2$, from which we can calculate $s^2 = \sigma^2/n$. In some applications $\sigma^2$ is known, but this is uncommon.
If the sample size is moderate or large, we can substitute the sample variance for $\sigma^2$, giving a plug-in test. The resulting test will not be an exact Z-test, since the uncertainty in the sample variance is not accounted for; however, it will be a good approximation unless the sample size is small. A t-test can be used to account for the uncertainty in the sample variance when the data are exactly normal.
There is no universal constant at which the sample size is generally considered large enough to justify use of the plug-in test. A typical rule of thumb is that the sample size should be 50 observations or more. For large sample sizes, the t-test procedure gives almost identical p-values to the Z-test procedure.
Other location tests that can be performed as Z-tests are the two-sample location test and the paired difference test.
  • 70. Conditions
For the Z-test to be applicable, certain conditions must be met.
➢ Nuisance parameters should be known, or estimated with high accuracy (an example of a nuisance parameter would be the standard deviation in a one-sample location test). Z-tests focus on a single parameter and treat all other unknown parameters as being fixed at their true values. In practice, due to Slutsky's theorem, "plugging in" consistent estimates of nuisance parameters can be justified. However, if the sample size is not large enough for these estimates to be reasonably accurate, the Z-test may not perform well.
➢ The test statistic should follow a normal distribution. Generally, one appeals to the central limit theorem to justify assuming that a test statistic varies normally. There is a great deal of statistical research on the question of when a test statistic varies approximately normally. If the variation of the test statistic is strongly non-normal, a Z-test should not be used.
If estimates of nuisance parameters are plugged in as discussed above, it is important to use estimates appropriate for the way the data were sampled. In the special case of Z-tests for the one- or two-sample location problem, the usual sample standard deviation is only appropriate if the data were collected as an independent sample.
In some situations, it is possible to devise a test that properly accounts for the variation in plug-in estimates of nuisance parameters. In the case of one- and two-sample location problems, a t-test does this.
Example
Suppose that in a particular geographic region, the mean and standard deviation of scores on a reading test are 100 points and 12 points, respectively. Our interest is in the scores of 55 students in a particular school who received a mean score of 96. We can ask whether this mean score is significantly lower than the regional mean; that is, are the students in this school comparable to a simple random sample of 55 students from the region as a whole, or are their scores surprisingly low?
  • 71. First calculate the standard error of the mean:
$\mathrm{SE} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{12}{\sqrt{55}} = \dfrac{12}{7.42} = 1.62$
where $\sigma$ is the population standard deviation.
Next calculate the z-score, which is the distance from the sample mean to the population mean in units of the standard error:
$z = \dfrac{M - \mu}{\mathrm{SE}} = \dfrac{96 - 100}{1.62} = -2.47$
In this example, we treat the population mean and variance as known, which would be appropriate if all students in the region were tested. When population parameters are unknown, a t-test should be conducted instead.
The classroom mean score is 96, which is −2.47 standard error units from the population mean of 100. Looking up the z-score in a table of the standard normal distribution, we find that the probability of observing a standard normal value below −2.47 is approximately 0.5 − 0.4932 = 0.0068. This is the one-sided p-value for the null hypothesis that the 55 students are comparable to a simple random sample from the population of all test-takers. The two-sided p-value is approximately 0.014 (twice the one-sided p-value).
Another way of stating things is that with probability 1 − 0.014 = 0.986, a simple random sample of 55 students would have a mean test score within 4 units of the population mean. We could also say that with 98.6% confidence we reject the null hypothesis that the 55 test-takers are comparable to a simple random sample from the population of test-takers.
The Z-test tells us that the 55 students of interest have an unusually low mean test score compared to most simple random samples of similar size from the population of test-takers. A deficiency of this analysis is that it does not consider whether the effect size of 4 points is meaningful. If instead of a classroom we considered a subregion containing 900 students whose mean score was 99, nearly the same z-score and p-value would be observed.
  • 72. This shows that if the sample size is large enough, very small differences from the null value can be highly statistically significant.
Z-tests other than location tests
Location tests are the most familiar Z-tests. Another class of Z-tests arises in maximum likelihood estimation of the parameters in a parametric statistical model. Maximum likelihood estimates are approximately normal under certain conditions, and their asymptotic variance can be calculated in terms of the Fisher information. The maximum likelihood estimate divided by its standard error can be used as a test statistic for the null hypothesis that the population value of the parameter equals zero. More generally, if $\hat{\theta}$ is the maximum likelihood estimate of a parameter $\theta$, and $\theta_0$ is the value of $\theta$ under the null hypothesis, then
$\dfrac{\hat{\theta} - \theta_0}{\mathrm{SE}(\hat{\theta})}$
can be used as a Z-test statistic.
When using a Z-test for maximum likelihood estimates, it is important to be aware that the normal approximation may be poor if the sample size is not sufficiently large. Although there is no simple, universal rule stating how large the sample size must be to use a Z-test, simulation can give a good idea as to whether a Z-test is appropriate in a given situation.
Z-tests are employed whenever it can be argued that a test statistic follows a normal distribution under the null hypothesis of interest. Many non-parametric test statistics, such as U statistics, are approximately normal for large enough sample sizes, and hence are often performed as Z-tests.
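The reading-test example from the Z-test section (population mean 100, standard deviation 12, 55 students with mean 96) can be recomputed with a short Python sketch, together with the 900-student subregion whose mean is 99. The function names are assumptions made for illustration; the numbers are those given in the example.

```python
import math


def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def one_sample_z(sample_mean: float, mu0: float, sigma: float, n: int):
    """Return (standard error, z-score, lower-tailed p, two-tailed p)."""
    se = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / se
    return se, z, phi(z), 2.0 * phi(-abs(z))


# Classroom of 55 students with mean score 96.
se, z, p_lower, p_two = one_sample_z(96, 100, 12, 55)
print(round(se, 2), round(z, 2), round(p_lower, 4), round(p_two, 3))

# Subregion of 900 students with mean score 99: a difference of only
# 1 point, yet the z-score is nearly the same as for the classroom.
se_big, z_big, p_lower_big, p_two_big = one_sample_z(99, 100, 12, 900)
print(round(se_big, 2), round(z_big, 2))
```

Comparing the two calls shows the point made in the text: with a large enough sample, a very small difference from the null value can give the same level of statistical significance as a much larger difference in a small sample.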
  • 73. t-test
In 1908, W. S. Gosset published theoretical work on the t distribution. Gosset was an employee of the Guinness Brewery in Dublin, Ireland. The brewery did not permit employees to publish research findings under their own names, so Gosset adopted the pen name "Student" and published under that name. The t distribution is therefore commonly called Student's t distribution or simply Student's distribution.
Conditions for use of the t-test
➢ Sample size ≤ 30.
➢ Population standard deviation must be unknown.
➢ Population distribution should be normal or approximately normal.
➢ Random sample.
➢ Quantitative data.
Degrees of freedom: There is a different t distribution for each possible number of degrees of freedom. We use the degrees of freedom to select the t distribution when estimating a population mean, and we use n − 1 degrees of freedom, where n is the sample size. For example, if we use a sample of 22 to estimate a population mean, we use (n − 1) = 21 degrees of freedom to select the appropriate t distribution.
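  • 74. t-test: calculation of the standard error of the difference between means (small samples and uncorrelated data)
1st step: calculate the combined variance ($SD^2$):
$SD^2 = \dfrac{\sum (X_1 - \bar{X}_1)^2 \text{ of gr. 1} + \sum (X_2 - \bar{X}_2)^2 \text{ of gr. 2}}{N_1 + N_2 - 2}$
Writing the deviations from the group means as $x_1 = X_1 - \bar{X}_1$ and $x_2 = X_2 - \bar{X}_2$, we have $(X_1 - \bar{X}_1)^2 = x_1^2$ and $(X_2 - \bar{X}_2)^2 = x_2^2$, so that
$SD^2 = \dfrac{\sum x_1^2 + \sum x_2^2}{N_1 + N_2 - 2}$ and $SD = \sqrt{\dfrac{\sum x_1^2 + \sum x_2^2}{N_1 + N_2 - 2}}$
2nd step: calculate the standard error of the difference (SED):
$SED = \sqrt{\dfrac{SD^2}{N_1} + \dfrac{SD^2}{N_2}}$ or, equivalently, $SED = SD \sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}$
3rd step: calculate t:
$t = \dfrac{\text{observed difference}}{SED} = \dfrac{\bar{X}_1 - \bar{X}_2}{SED}$
4th step: degrees of freedom: $D.F. = N_1 + N_2 - 2$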
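The four steps above can be followed in a short Python sketch. The function name `pooled_t` and the two small groups of values are hypothetical and chosen only so that the arithmetic is easy to check by hand; the sketch assumes a single combined (pooled) SD is used in the second step.

```python
import math


def pooled_t(group1, group2):
    """Two-sample t statistic for small, uncorrelated samples
    using the combined (pooled) variance."""
    n1, n2 = len(group1), len(group2)
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2

    # Step 1: combined variance from the squared deviations of each group.
    ss1 = sum((x - mean1) ** 2 for x in group1)
    ss2 = sum((x - mean2) ** 2 for x in group2)
    df = n1 + n2 - 2                      # Step 4: degrees of freedom
    sd = math.sqrt((ss1 + ss2) / df)

    # Step 2: standard error of the difference between means.
    sed = sd * math.sqrt(1 / n1 + 1 / n2)

    # Step 3: t = observed difference / SED.
    t = (mean1 - mean2) / sed
    return t, df, sed


# Hypothetical data: means 14 and 12, each group with sum of squares 40.
t, df, sed = pooled_t([10, 12, 14, 16, 18], [8, 10, 12, 14, 16])
print(t, df, sed)
```

The calculated t is then compared with the tabulated t value for the same degrees of freedom at the chosen level of significance.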
  • 75. Using the t distribution table
Comparison between t and z tables
The table of t distribution values differs in construction from the z table. The t table is more compact and shows areas and t values for only a few percentages (10, 5, 2 and 1 percent). Because there is a different t distribution for each number of degrees of freedom, a more complete table would be quite lengthy, although the need for one is conceivable.
A second difference is that the t table does not focus on the chance that the population parameter being estimated will fall within our confidence interval. Instead, it measures the chance that the population parameter we are estimating will not be within our confidence interval (that is, that it will lie outside it). If we are making an estimate at the 90% confidence level, we look in the t table under the 0.10 column (100 percent − 90 percent = 10 percent). This 0.10 chance of error is symbolised by the Greek letter $\alpha$ (alpha). The appropriate t values for confidence levels of 95%, 98% and 99% are found under the columns headed 0.05, 0.02 and 0.01 respectively.
A third difference in using the t table is that we must specify the degrees of freedom with which we are dealing. Suppose we make an estimate at the 90% confidence level with a sample size of 14, which gives 13 degrees of freedom. Look under the 0.10 column until you reach the row labelled 13. Like a z value, the t value found there, 1.771, shows that if we mark off plus and minus $1.771\,\hat{\sigma}_{\bar{x}}$ (estimated standard errors of $\bar{x}$) on either side of the mean, the area under the curve between these two limits will be 90%, and the area outside these limits (the chance of error) will be 10 percent.
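As an illustration of the table lookup described above, the following Python sketch builds a confidence interval for a sample of 14 (13 degrees of freedom). Only the value 1.771 comes from the text; the other entries in the small lookup dictionary are standard two-tailed table values for 13 degrees of freedom, and the sample mean (50) and sample standard deviation (7) are hypothetical.

```python
import math

# Two-tailed t values for 13 degrees of freedom, keyed by alpha.
# 1.771 is quoted in the text; the rest are standard table values.
T_TABLE_DF13 = {0.10: 1.771, 0.05: 2.160, 0.02: 2.650, 0.01: 3.012}


def t_confidence_interval(x_bar: float, s: float, n: int, alpha: float):
    """Confidence interval x_bar +/- t * (s / sqrt(n)) for n = 14 only,
    because the lookup table above covers 13 degrees of freedom."""
    if n - 1 != 13:
        raise ValueError("this sketch only tabulates 13 degrees of freedom")
    t_value = T_TABLE_DF13[alpha]
    estimated_se = s / math.sqrt(n)      # estimated standard error of the mean
    margin = t_value * estimated_se
    return x_bar - margin, x_bar + margin, margin


# Hypothetical sample: mean 50, standard deviation 7, n = 14, 90% confidence.
lower, upper, margin = t_confidence_interval(x_bar=50.0, s=7.0, n=14, alpha=0.10)
print(round(lower, 2), round(upper, 2))
```

Choosing a smaller alpha (for example 0.01 for 99% confidence) selects a larger t value and therefore a wider interval.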
  • 76. Remember that in any estimation in which the sample size is 30 or less, the standard deviation of the population is unknown, and the underlying population can be assumed to be normal or approximately normal, we use the t distribution.
Determining the sample size (n) in estimation: In all the examples above, the sample size was known. Now we try to determine the sample size n. If it is too small, we may fail to achieve the objective; if it is too large, we will be wasting resources. Let us examine a method for determining what sample size is necessary for a specified level of precision.
Comparison of ways of expressing the same confidence limits
    Lower confidence limit              Upper confidence limit
A.  $\bar{X} - 500$                     $\bar{X} + 500$
B.  $\bar{X} - z\sigma_{\bar{x}}$       $\bar{X} + z\sigma_{\bar{x}}$
C.  $\bar{X} - t\sigma_{\bar{x}}$       $\bar{X} + t\sigma_{\bar{x}}$
Example: The Department of TST, NIUM Bangalore, wants to conduct a survey of the annual earnings of its passed M.D. graduates. Calculate the appropriate sample size for this study in order to estimate the mean annual earnings of last year's class within 500 at the 95% level of confidence.
Solution: The problem allows a variation of 500 on either side of the population mean. That means $z\sigma_{\bar{x}} = 500$.
At the 95% level of confidence, we know from the z table that z = 1.96.
Therefore $1.96\,\sigma_{\bar{x}} = 500$, and that means $\sigma_{\bar{x}} = 500/1.96 = 255$.
If the standard error of the mean is 255, then $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = 255$.
Taking the population standard deviation as $\sigma = 1500$, we can find n: $\dfrac{1500}{\sqrt{n}} = 255$; therefore $n = \left(\dfrac{1500}{255}\right)^2 = 34.6$.
This means n should be greater than 34.6, i.e. at least 35, if NIUM wants to achieve the stated precision in its survey.
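The sample size calculation in the example can be reproduced with a few lines of Python. The function name is an assumption made for illustration; the inputs (σ = 1500, margin 500, z = 1.96) are those used in the example. The sketch does not round the standard error to 255 part-way through, so the unrounded result is about 34.57 rather than 34.6, and the conclusion is the same.

```python
import math


def required_sample_size(sigma: float, margin: float, z: float) -> float:
    """Solve z * sigma / sqrt(n) = margin for n, giving n = (z * sigma / margin) ** 2."""
    return (z * sigma / margin) ** 2


n_exact = required_sample_size(sigma=1500, margin=500, z=1.96)
n_needed = math.ceil(n_exact)   # round up to the next whole subject
print(round(n_exact, 2), n_needed)
```

Rounding up rather than to the nearest integer keeps the standard error at or below the level required for the stated precision.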
  • 77. Chi-square ($\chi^2$) Test
The chi-square ($\chi^2$) test enables us to test whether more than two populations can be considered equal (the t-test and z-test are applicable to only one or two samples). The chi-square test allows us to do a lot more than just test for the equality of several populations. Suppose we classify a population into several categories with respect to two attributes (such as age and job performance); we can then use a chi-square test to determine whether the two attributes are independent of each other.
Characteristics of the chi-square ($\chi^2$) test:
➢ The chi-square test is based on frequencies and not on parameters.
➢ It is a non-parametric test: no rigid assumptions about the parameters of the population or populations are required.
➢ The chi-square test has an additive property.
➢ The chi-square test is useful for testing hypotheses about the independence of attributes.
➢ The chi-square test can be used in complex contingency tables.
➢ The chi-square test has very wide use.
Degrees of freedom: The number of degrees of freedom for n observations is n − k, usually denoted by v, where k is the number of independent linear constraints imposed upon them. Suppose someone asks me to write any four numbers; then I am free to choose all four. But if a restriction is imposed that the sum of these numbers should be 50, then the freedom of choice is reduced to three numbers only, and so the degrees of freedom are now 3. If a chi-square ($\chi^2$) is defined as the sum of the squares of n independent standardised normal variates, and the condition of satisfying k linear relations is imposed upon them (such as estimation of some population parameter), then the effect of these k constraints of