RESEARCH METHODOLOGY & STATISTICAL TOOLS
MASTER OF BUSINESS ADMINISTRATION
(JNTU)
A MATERIAL FOR
RESEARCH
METHODOLOGY
AND
STATISTICAL TOOLS
(According to JNTU Syllabus)
Prepared by,
S. Venkata Siva Kumar;
MBA (HR/MRKTG), MSc (Statistics).
UNIT-1
RESEARCH METHODOLOGY:
An Introduction
Meaning of Research:
Research in common parlance refers to a search for knowledge. One can also
define research as a scientific and systematic search for pertinent information on a
specific topic. In fact, research is an art of scientific investigation. The Advanced
Learner’s Dictionary of Current English lays down the meaning of research as “a
careful investigation or inquiry especially through search for new facts in any branch
of knowledge.” Redman and Mory define research as a “systematized effort to gain
new knowledge.” Some people consider research as a movement, a movement from
the known to unknown. It is actually a voyage of discovery.
Research is an academic activity and as such the term should be used in a
technical sense. According to Clifford Woody, research comprises defining and
redefining problems, formulating hypotheses or suggested solutions; collecting,
organizing and evaluating data; making deductions and reaching conclusions; and at
last carefully testing the conclusions to determine whether they fit the formulated
hypotheses. D. Slesinger and M. Stephenson, in the Encyclopaedia of Social Sciences,
define research as “the manipulation of things, concepts or symbols for the purpose of
generalizing to extend, correct or verify knowledge, whether that knowledge aids in
construction of theory or in the practice of an art.”
Objectives of Research:
The purpose of Research is to discover answers to questions through the
application of scientific procedures. The main aim of research is to find out the truth
which is hidden and which has not been discovered as yet. Though each research study
has its own specific purpose, we may think of research objectives as falling into a
number of the following broad groupings:
1. To gain familiarity with a phenomenon or to achieve new insights into it
(studies with this object in view are termed as exploratory or formulative
research studies);
2. To portray accurately the characteristics of a particular individual, situation or
a group (studies with this object in view are known as descriptive research
studies);
3. To determine the frequency with which something occurs or with which it is
associated with something else (studies with this object in view are known as
diagnostic research studies);
4. To test a hypothesis of a causal relationship between variables (such studies are
known as hypothesis-testing research studies).
Motivation in Research:
What makes people undertake research? This is a question of fundamental
importance. The possible motives for doing research may be either one or more of the
following:
1. Desire to get a research degree along with its consequential benefits;
2. Desire to face the challenge in solving the unsolved problems, i.e., concern
over practical problems initiates research;
3. Desire to get intellectual joy of doing some creative work;
4. Desire to be of service to society;
5. Desire to get respectability.
However, this is not an exhaustive list of factors motivating people to
undertake research studies. Many more factors such as directives of
government, employment conditions, curiosity about new things, desire to
understand causal relationships, social thinking and awakening and the like
may as well motivate (or at times compel) people to perform research
operations.
Types of Research:
The basic types of research are as follows:
1. Descriptive Vs. Analytical Research: Descriptive research includes surveys
and fact-finding enquiries of different kinds. The major purpose of descriptive
research is description of the state of affairs as it exists at present. In social
science and business research we quite often use the term Ex post facto
research for descriptive research studies. The main characteristic of this
method is that the researcher has no control over the variables; he can only
report what has happened or what is happening. Most ex post facto research
projects are used for descriptive studies in which the researcher seeks to measure
such items as, for example, frequency of shopping, preferences of people, or
similar data. Ex post facto studies also include attempts by researchers to
discover causes even when they cannot control the variables. The methods of
research utilized in descriptive research are survey methods of all kinds,
including comparative and correlational methods. In analytical research, on
the other hand, the researcher has to use facts or information already available,
and analyze these to make a critical evaluation of the material.
2. Applied Vs Fundamental Research: Research can either be applied (or action)
research or fundamental (or basic or pure) research. Applied research aims at
finding a solution for an immediate problem facing a society or an
industrial/business organization, whereas fundamental research is mainly
concerned with generalizations and with the formulation of a theory.
“Gathering knowledge for knowledge’s sake is termed as ‘pure’ or ‘basic’
research.” Research concerning some natural phenomenon or relating to pure
mathematics is an example of fundamental research. Similarly, research studies
concerning human behavior, carried on with a view to making generalizations
about human behavior, are also examples of fundamental research, but research
aimed at certain conclusions (say, a solution) about a concrete social or
business problem is an example of applied research. Research to identify
social, economic or political trends that may affect a particular institution,
copy research, marketing research and evaluation research are all
examples of applied research. Thus, the central aim of applied research is to
discover a solution for some pressing practical problem, whereas basic research
is directed towards finding information that has a broad base of applications
and thus, adds to the already existing organized body of scientific knowledge.
3. Quantitative Vs Qualitative Research: Quantitative research is based on the
measurement of quantity or amount. It is applicable to phenomena that can be
expressed in terms of quantity. Qualitative research, on the other hand, is
concerned with qualitative phenomena, i.e., phenomena relating to or
involving quality or kind. For instance, when we are interested in investigating
the reasons for human behavior, we quite often talk of ‘Motivation Research’,
an important type of qualitative research. This type of research aims at
discovering the underlying motives and desires, using in-depth interviews for
the purpose. Other techniques of such research are word association tests,
sentence completion tests, story completion tests and similar other projective
techniques. Attitude or opinion research i.e., research designed to find out how
people feel or what they think about a particular subject or institution is also
qualitative research. Qualitative research is especially important in the
behavioral sciences where the aim is to discover the underlying motives of
human behavior. Through such research we can analyze the various factors
which motivate people to behave in a particular manner or which make people
like or dislike a particular thing. It may be stated that applying qualitative
research in practice is a relatively difficult job and therefore, while doing such
research, one should seek guidance from experimental psychologists.
4. Conceptual Vs Empirical Research: Conceptual research is that related to
some abstract idea(s) or theory. It is generally used by philosophers and
thinkers to develop new concepts or to reinterpret existing ones. On the other
hand, empirical research relies on experience or observation alone, often
without due regard for system and theory. It is data-based research, coming up
with conclusions which are capable of being verified by observation or
experiment. We can also call it the experimental type of research. In such
research it is necessary to get at facts first hand, at their source, and actively to
go about doing certain things to stimulate the production of desired
information. In such research, the researcher must first provide himself with a
working hypothesis or guess as to the probable results. He then works to get
enough facts (data) to prove or disprove his hypothesis. He then sets up
experimental designs which he thinks will manipulate the persons or the
materials concerned so as to bring forth the desired information. Such research
is thus characterized by the experimenter’s control over the variables under
study and his deliberate manipulation of one of them to study its effects.
Empirical research is appropriate when proof is sought that certain variables
affect other variables in some way. Evidence gathered through experiments or
empirical studies is today considered to be the most powerful support possible
for a given hypothesis.
Nature and Importance of Research:
“All progress is born of inquiry. Doubt is often better than over-confidence, for it leads
to inquiry, and inquiry leads to invention” is a famous saying of Hudson Maxim, in the
context of which the significance of research can well be understood. Increased amounts of
research make progress possible. Research inculcates scientific and inductive thinking
and it promotes the development of logical habits of thinking and organization.
The role of research in several fields of applied economics, whether related to
business or to the economy as a whole, has greatly increased in modern times. The
increasingly complex nature of business and government has focused attention on the use
of research in solving operational problems. Research, as an aid to economic policy,
has gained added importance, both for government and business.
Research provides the basis for nearly all government policies in our
economic system. For instance, a government’s budget rests in part on an analysis of
the needs and desires of the people and on the availability of revenues to meet these
needs. The cost of needs has to be equated to probable revenues and this is a field
where research is most needed. Through research we can devise alternative policies
and can as well examine the consequences of each of these alternatives.
Decision-making may not be a part of research, but research certainly facilitates the
decisions of the policy maker. The government also has to chalk out programmes for
dealing with all facets of the country’s existence and most of these will be related
directly or indirectly to economic conditions. The plight of cultivators, the problems
of big and small business and industry, working conditions, trade union activities, the
problems of distribution, even the size and nature of defense services are matters
requiring research. Thus, research is considered necessary with regard to the allocation
of the nation’s resources.
Research has its special significance in solving various operational and
planning problems of business and industry. Operations research and market research,
along with motivational research, are considered crucial and their results assist, in
more than one way, in taking business decisions. Market research is the investigation
of the structure and development of a market for the purpose of formulating efficient
policies for purchasing, production and sales. Operations research refers to the
application of mathematical, logical and analytical techniques to the solution of
business problems of cost minimization or of profit maximization or what can be
termed as optimization problems. Motivational research, which seeks to determine why
people behave as they do, is mainly concerned with market characteristics.
In addition to what has been stated above, the significance of research can also be
understood keeping in view the following points:
1. To those students who are to write a master’s or Ph.D. thesis, research may
mean careerism or a way to attain a high position in the social structure;
2. To professionals in research methodology, research may mean a source of
livelihood;
3. To philosophers and thinkers, research may mean the outlet for new ideas and
insights;
4. To analysts and intellectuals, research may mean the generalization of new
theories.
Thus, research is the fountain of knowledge for the sake of knowledge and an
important source of providing guidelines for solving different business, governmental
and social problems. It is a sort of formal training which enables one to understand the
new developments in one’s field in a better way.
RESEARCH PROCESS:
The Research Process consists of a series of actions or steps necessary to
effectively carry out research and the desired sequencing of these steps. The following
order concerning various steps provides a useful procedural guideline regarding the
research process:
1. Formulating the Research problem
2. Extensive Literature survey
3. Development of working hypothesis
4. Preparing the Research design
5. Determining the Sample design
6. Collection of data
7. Execution of the project
8. Analysis of data
9. Hypothesis-testing
10. Generalizations and interpretation
11. Preparation of the report or the thesis
1) Formulating the research problem: There are two types of research problems,
viz., those which relate to states of nature and those which relate to relationships
between variables. At the very outset the researcher must single out the problem he
wants to study i.e., he must decide the general area of interest or aspect of a subject
matter that he would like to inquire into. Initially the problem may be stated in a broad
general way and then the ambiguities, if any, relating to the problem may be resolved.
Then, the feasibility of a particular solution has to be considered before a working
formulation of the problem can be set up. The formulation of a general topic into a
specific research problem, thus, constitutes the first step in a scientific enquiry.
Essentially two steps are involved in formulating the research problem, viz.,
understanding the problem thoroughly, and rephrasing the same into meaningful terms
from an analytical point of view.
The best way of understanding the problem is to discuss it with one’s
own colleagues or with those having some expertise in the matter. In an academic
institution the researcher can seek help from a guide who is usually an
experienced man and has several research problems in mind. Often, the guide puts
forth the problem in general terms and it is up to the researcher to narrow it down and
phrase the problem in operational terms. In private business units or in governmental
organizations, the problem is usually earmarked by the administrative agencies with
which the researcher can discuss as to how the problem originally came about and
what considerations are involved in its possible solutions.
Professor W.A. Neiswanger correctly states that the statement of the
objective is of basic importance because it determines the data which are to be
collected, the characteristics of the data which are relevant, relations which are to be
explored, the choice of techniques to be used in these explorations and the form of the
final report. If there are certain pertinent terms, the same should be clearly defined
along with the task of formulating the problem. In fact, formulation of the problem
often follows a sequential pattern where a number of formulations are set up, each
formulation more specific than the preceding one, each one phrased in more analytical
terms, and each more realistic in terms of the available data and resources.
2) Extensive literature survey: Once the problem is formulated, a brief summary
of it should be written down. It is compulsory for a research worker writing a thesis for
a Ph.D. degree to write a synopsis of the topic and submit it to the necessary
Committee or the Research Board for approval. At this juncture the researcher should
undertake an extensive literature survey connected with the problem. For this purpose, the
abstracting and indexing journals and published or unpublished bibliographies are the
first place to go to. Academic journals, conference proceedings, government reports,
books etc., must be tapped depending on the nature of the problem. In this process, it
should be remembered that one source will lead to another. The earlier studies, if any,
which are similar to the study in hand, should be carefully studied. A good library will
be a great help to the researcher at this stage.
3) Development of working hypothesis: After the extensive literature survey, the
researcher should state in clear terms the working hypothesis or hypotheses. A working
hypothesis is a tentative assumption made in order to draw out and test its logical or
empirical consequences. As such the manner in which research hypotheses are
developed is particularly important since they provide the focal point for research.
They also affect the manner in which tests must be conducted in the analysis of data
and indirectly the quality of data which is required for the analysis. In most types of
research, the development of working hypothesis plays an important role. Hypothesis
should be very specific and limited to the piece of research in hand because it has to be
tested. The role of the hypothesis is to guide the researcher by delimiting the area of
research and to keep him on the right track. It sharpens his thinking and focuses
attention on the more important facets of the problem. It also indicates the type of data
required and the type of methods of data analysis to be used.
How does one go about developing a working hypothesis? The answer is by
using the following approach:
a) Discussions with colleagues and experts about the problem, its origin and the
objectives in seeking a solution;
b) Examination of data and records, if available, concerning the problem for
possible trends, peculiarities and other clues;
c) Review of similar studies in the area or of the studies on similar problems; and
d) Exploratory personal investigation which involves original field interviews on
a limited scale with interested parties and individuals with a view to secure
greater insight into the practical aspects of the problem.
Thus, working hypotheses arise as a result of a priori thinking about the subject,
examination of the available data and material including related studies and the
counsel of experts and interested parties. Working hypotheses are more useful when
stated in precise and clearly defined terms. It may as well be remembered that
occasionally we may encounter a problem where we do not need a working hypothesis,
especially in the case of exploratory or formulative research which does not aim at
testing a hypothesis. But as a general rule, specification of a working hypothesis is
another basic step of the research process in most research problems.
4) Preparing the research design: The research problem having been formulated
in clear cut terms, the researcher will be required to prepare a research design, i.e., he
will have to state the conceptual structure within which research would be conducted.
The preparation of such a design helps research to be as efficient as possible,
yielding maximal information. In other words, the function of research design is to
provide for the collection of relevant evidence with minimal expenditure of effort,
time and money. But how all these can be achieved depends mainly on the research
purpose. Research purposes may be grouped into four categories, viz.,
a. Exploration
b. Description
c. Diagnosis
d. Experimentation
A flexible research design which provides opportunity for considering many
different aspects of a problem is considered appropriate if the purpose of the research
study is that of exploration. But when the purpose happens to be an accurate
description of a situation or of an association between variables, the suitable design
will be one that minimizes bias and maximizes the reliability of the data collected and
analyzed. There are several research designs, such as experimental and
non-experimental hypothesis testing. Experimental designs can be either informal
designs or formal designs (such as completely randomized design, randomized block
design, Latin square design, simple and complex factorial designs), out of which the
researcher must select one for his own project.
The preparation of the research design, appropriate for a particular research
problem, involves usually the consideration of the following:
I. The means of obtaining the information;
II. The availability and skills of the researcher and his staff (if any);
III. Explanation of the way in which selected means of obtaining information will
be organized and the reasoning leading to the selection;
IV. The time available for research; and
V. The cost factor relating to research, i.e., the finance available for the purpose.
5) Determining sample design: All the items under consideration in any field of
inquiry constitute a ‘universe’ or ‘population’. A complete enumeration of all items in
the ‘population’ is known as a census enquiry. It can be presumed that in such an
enquiry when all the items are covered no element of chance is left and highest
accuracy is obtained. But in practice this may not be true. Even the slightest element of
bias in such an enquiry will get larger and larger as the number of observations
increases. Moreover, there is no way of checking the element of bias or its extent
except through a resurvey or use of sample checks. Besides, this type of inquiry
involves a great deal of time, money and energy. Not only this, a census enquiry is not
possible in practice under many circumstances. For instance, blood testing is done
only on a sample basis. Hence, quite often we select only a few items from the universe
for our study purposes. The items so selected constitute what is technically called a
sample.
The researcher must decide the way of selecting a sample or what is popularly
known as the sample design. In other words a sample design is a definite plan
determined before any data are actually collected for obtaining a sample from a given
population. Thus, the plan to select 12 of a city’s 200 drugstores in a certain way
constitutes a sample design. Samples can be either probability samples or non-
probability samples. With probability samples each element has a known probability
of being included in the sample but the non-probability samples do not allow the
researcher to determine this probability. Probability samples are those based on simple
random sampling, systematic sampling, stratified sampling, cluster/area sampling
whereas non-probability samples are those based on convenience sampling, judgment
sampling and quota sampling techniques. A brief mention of the important sample
designs is as follows.
1. Deliberate sampling: Deliberate sampling is also known as purposive or non-
probability sampling. This sampling method involves purposive or deliberate
selection of particular units of the universe for constituting a sample which
represents the universe. When population elements are selected for inclusion in
the sample based on the ease of access, it can be called convenience sampling.
2. Simple random sampling: This type of sampling is also known as chance
sampling or probability sampling where each and every item in the population
has an equal chance of inclusion in the sample and each one of the possible
samples, in case of finite universe, has the same probability of being selected.
For example, if we have to select a sample of 300 items from a universe of
15,000 items, then we can put the names or numbers of all the 15,000 items on
slips of paper and conduct a lottery.
3. Systematic sampling: In some instances the most practical way of sampling is
to select every 15th name on a list, every 10th house on one side of a street and
so on. Sampling of this type is known as systematic sampling.
4. Stratified sampling: If the population from which a sample is to be drawn does
not constitute a homogeneous group, then the stratified sampling technique is
applied so as to obtain a representative sample. In this technique, the
population is stratified into a number of non-overlapping subpopulations or
strata and sample items are selected from each stratum. If the selection of items
from each stratum is based on simple random sampling, the entire procedure,
first stratification and then simple random sampling, is known as stratified
random sampling.
5. Quota sampling: In stratified sampling the cost of taking random samples
from individual strata is often so high that interviewers are simply given
quotas to be filled from different strata, the actual selection of items for the sample
being left to the interviewer’s judgment. This is called quota sampling.
6. Cluster sampling and Area sampling: Cluster sampling involves grouping the
population and then selecting the groups or the clusters rather than individual
elements for inclusion in the sample. Suppose some departmental store wishes
to sample its credit card holders. It has issued its cards to 15,000 customers.
The sample size is to be kept at, say, 450. For a cluster sample this list of 15,000
card holders could be formed into 100 clusters of 150 card holders each. Three
clusters might then be selected for the sample randomly.
7. Multi-stage sampling: This is a further development of the idea of cluster
sampling. This technique is meant for big enquiries extending to a considerably
large geographical area like an entire country. Under multi-stage sampling the
first stage may be to select large primary sampling units such as states, then
districts, then towns and finally certain families within towns. If the technique
of random sampling is applied at all stages, the sampling procedure is
described as multi-stage random sampling.
8. Sequential sampling: This is a somewhat complex sample design where the
ultimate size of the sample is not fixed in advance but is determined according
to mathematical decisions on the basis of information yielded as the survey
progresses. This design is usually adopted under an acceptance sampling plan in
the context of statistical quality control.
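Several of the sample designs listed above can be sketched in a few lines of Python. The block below is only an illustration under stated assumptions: the universe is a list of 15,000 hypothetical item numbers, the two strata ("north" and "south") are invented for the example, and the function names are not taken from any library. It draws a simple random sample of 300 items, a systematic sample of every 15th item, a stratified random sample, and a cluster sample of 3 clusters of 150 items each (450 items in all, as in the credit card example).

```python
import random


def simple_random_sample(population, n, rng):
    """Lottery method: every item has an equal chance of inclusion."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Select every k-th item, starting from a random position among the first k."""
    start = rng.randrange(k)
    return population[start::k]


def stratified_random_sample(strata, fraction, rng):
    """Draw a simple random sample of the same fraction from every stratum."""
    sample = []
    for items in strata.values():
        n = round(len(items) * fraction)
        sample.extend(rng.sample(items, n))
    return sample


def cluster_sample(population, cluster_size, n_clusters, rng):
    """Form equal-sized clusters, then select whole clusters at random."""
    clusters = [population[i:i + cluster_size]
                for i in range(0, len(population), cluster_size)]
    chosen = rng.sample(clusters, n_clusters)
    return [item for cluster in chosen for item in cluster]


rng = random.Random(42)            # fixed seed so the sketch is repeatable
universe = list(range(1, 15001))   # 15,000 hypothetical item numbers

srs = simple_random_sample(universe, 300, rng)
sys_sample = systematic_sample(universe, 15, rng)

# Hypothetical strata: the first 9,000 items and the remaining 6,000 items.
strata = {"north": universe[:9000], "south": universe[9000:]}
strat = stratified_random_sample(strata, 0.02, rng)

clus = cluster_sample(universe, 150, 3, rng)

print(len(srs), len(sys_sample), len(strat), len(clus))  # 300 1000 300 450
```

Only the sample sizes are fixed by the design; the particular items selected depend on the random seed.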
6) Collecting the data: In dealing with any real-life problem it is often found that
the data at hand are inadequate, and hence, it becomes necessary to collect data that are
appropriate. There are several ways of collecting the appropriate data which differ
considerably in terms of money costs, time and other resources at the disposal of the
researcher.
Primary data can be collected either through experiment or through survey. If
the researcher conducts an experiment, he observes some quantitative measurements,
or the data, with the help of which he examines the truth contained in his hypothesis.
But in the case of a survey, data can be collected by any one or more of the following
ways.
1. By observation
2. Through personal interview
3. Through telephone interviews
4. By mailing of questionnaires
5. Through schedules.
7) Execution of the project: Execution of the project is a very important step in the
research process. If the execution of the project proceeds on correct lines, the data to
be collected would be adequate and dependable. The researcher should see that the
project is executed in a systematic manner and in time. If the survey is to be conducted
by means of structured questionnaires, data can be readily machine-processed. In such
a situation, questions as well as the possible answers may be coded. If the data are to
be collected through interviewers, arrangements should be made for proper selection and
training of the interviewers. The training may be given with the help of instruction
manuals which explain clearly the job of the interviewer at each step. Occasional field
checks should be made to ensure that the interviewers are doing their assigned job
sincerely and efficiently. A careful watch should be kept for unanticipated factors in
order to keep the survey as realistic as possible. This, in other words, means that
steps should be taken to ensure that the survey is under statistical control so that the
collected information is in accordance with the pre-defined standard of accuracy. If
some of the respondents do not cooperate, some suitable methods should be designed
to tackle this problem. One method of dealing with the non-response problem is to
make a list of the non-respondents and take a small sub-sample of them, and then, with
the help of experts, vigorous efforts can be made to secure a response.
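The non-response procedure described above can be sketched as follows. This is a minimal illustration, assuming hypothetical respondent numbers and an arbitrarily chosen sub-sample fraction of 20 per cent; the function name is invented for the example.

```python
import random


def nonrespondent_subsample(sampled_ids, responded_ids, fraction, rng):
    """List the non-respondents and draw a small random sub-sample of them."""
    responded = set(responded_ids)
    nonrespondents = [i for i in sampled_ids if i not in responded]
    n = round(len(nonrespondents) * fraction)
    return nonrespondents, rng.sample(nonrespondents, n)


rng = random.Random(7)                           # fixed seed for repeatability
sampled = list(range(1, 201))                    # 200 hypothetical sampled units
responded = [i for i in sampled if i % 4 != 0]   # assume every 4th unit did not reply

nonrespondents, follow_up = nonrespondent_subsample(sampled, responded, 0.2, rng)
print(len(nonrespondents), len(follow_up))       # 50 10
```

The units in `follow_up` are the ones on which the intensive follow-up effort would be concentrated.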
8) Analysis of data: After the data have been collected, the researcher turns to the
task of analyzing them. The analysis of data requires a number of closely related
operations such as establishment of categories, the application of these categories to
raw data through coding, tabulation and then drawing statistical inferences. The
unwieldy data should necessarily be condensed into a few manageable groups and tables
for further analysis. Thus, the researcher should classify the raw data into some purposeful
and usable categories. Coding is usually done at this stage, through which the
categories of data are transformed into symbols that may be tabulated and counted.
Editing is the procedure that improves the quality of the data for coding. With coding
the stage is ready for tabulation. Tabulation is a part of the technical procedure
wherein the classified data are put in the form of tables. Mechanical devices can
be made use of at this juncture. A great deal of data, especially in large inquiries, is
tabulated by computers. Computers not only save time but also make it possible to
study a large number of variables affecting a problem simultaneously.
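The sequence of editing, coding and tabulation described above can be illustrated with a small sketch. The code book, the answers and the function names below are all hypothetical; the example assumes a single survey question on frequency of shopping.

```python
from collections import Counter

# Hypothetical code book: each category of answer is given a numeric symbol.
CODE_BOOK = {"daily": 1, "weekly": 2, "monthly": 3, "rarely": 4}

# Hypothetical raw answers as they might come back from the field.
raw_answers = ["Weekly", "daily", "weekly ", "Monthly",
               "rarely", "weekly", "DAILY", ""]


def edit(answer):
    """Editing: tidy the raw answer so that it can be coded."""
    return answer.strip().lower()


def code(answer):
    """Coding: turn an edited answer into its symbol (None if unusable)."""
    return CODE_BOOK.get(edit(answer))


coded = [code(a) for a in raw_answers]

# Tabulation: count how often each symbol occurs.
table = Counter(c for c in coded if c is not None)

for label, symbol in CODE_BOOK.items():
    print(f"{symbol}  {label:<8} {table[symbol]}")
```

The blank answer cannot be coded and is therefore left out of the table; in practice the number of such unusable answers would also be reported.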
9) Hypothesis-testing: After analyzing the data as stated above, the researcher is in a
position to test the hypothesis, if any, he had formulated earlier. Do the facts support
the hypothesis or do they happen to be contrary? This is the usual question which should
be answered while testing a hypothesis. Various tests, such as the Chi-square test, t-test
and F-test, have been developed by statisticians for the purpose. The hypothesis may be tested
through the use of one or more of such tests, depending upon the nature and object of
research inquiry. Hypothesis-testing will result in either accepting the hypothesis or in
rejecting it. If the researcher had no hypothesis to start with, generalizations
established on the basis of data may be stated as hypothesis to be tested by subsequent
researches in times to come.
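As an illustration of one of the tests named above, the following sketch computes the Chi-square statistic for a 2 x 2 table and compares it with the tabulated critical value of 3.841 (1 degree of freedom, 5 per cent level of significance). The observed frequencies are hypothetical, the function name is invented for the example, and no continuity correction is applied.

```python
def chi_square_statistic(observed):
    """Chi-square statistic for a table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected frequency if the two attributes were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic


# Hypothetical data: rows are two groups of customers,
# columns are "prefers the brand" and "does not prefer the brand".
observed = [[30, 20],
            [20, 30]]

CRITICAL_VALUE_5PCT_DF1 = 3.841   # tabulated value for 1 d.f. at the 5% level

statistic = chi_square_statistic(observed)
reject = statistic > CRITICAL_VALUE_5PCT_DF1

print(f"chi-square = {statistic:.3f}")          # chi-square = 4.000
print("reject the hypothesis of independence" if reject
      else "do not reject the hypothesis of independence")
```

Here every expected frequency is 25, so the statistic is 4.0; since this exceeds 3.841, the hypothesis that preference is independent of group would be rejected at the 5 per cent level for these invented figures.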
10) Generalizations and interpretation: If a hypothesis is tested and upheld
several times, it may be possible for the researcher to arrive at a generalization, i.e., to
build a theory. As a matter of fact, the real value of research lies in its ability to arrive
at certain generalizations. If the researcher had no hypothesis to start with, he might
seek to explain his findings on the basis of some theory. This is known as interpretation.
The process of interpretation may quite often trigger off new questions which in turn
lead to further researches.
11) Preparation of the report or the thesis: Finally, the researcher has to prepare the
report of what has been done by him. Writing of report must be done with great care
keeping in view the following:
1. The layout of report should be as follows:
(i) The preliminary pages;
(ii) The main text; and
(iii) The end matter.
In its preliminary pages the report should carry the title and date, followed by
acknowledgements and foreword. Then there should be a table of contents followed by
a list of tables and a list of graphs and charts, if any, given in the report.
The main text of the report should have the following parts:
(a) Introduction: It should contain a clear statement of the objective of the
research and explanation of the methodology adopted in accomplishing the
research. The scope of the study along with various limitations should as well
be stated in this part.
(b) Summary of findings: after introduction there would appear a statement of
findings and recommendations in non-technical language. If the findings are
extensive, they should be summarized.
(c) Main report: the main body of the report should be presented in logical
sequence and broken-down into readily identifiable sections.
(d) Conclusion: towards the end of the main text, researcher should again put
down the results of his research clearly and precisely. In fact, it is the final
summing up.
At the end of the report, appendices should be listed in respect of all
technical data. Bibliography, i.e., list of books, journals, reports, etc.,
consulted, should also be given in the end. Index should also be given specially
in a published research report.
2. Report should be written in a concise and objective style in simple language
avoiding vague expressions such as ‘it seems’, ‘there may be’, and the like.
3. Charts and illustrations in the main report should be used only if they present
the information more clearly and forcibly.
4. Calculated ‘confidence limits’ must be mentioned and the various constraints
experienced in conducting research operations may as well be stated.
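For reference, confidence limits for a sample mean can be computed as in the sketch below. It assumes the normal approximation (reasonable for large samples; for small samples the t-distribution would normally be used instead), and the sample figures and function name are purely illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_limits(sample, confidence=0.95):
    """Lower and upper confidence limits for the mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    centre = mean(sample)
    margin = z * stdev(sample) / sqrt(len(sample))   # z times the standard error
    return centre - margin, centre + margin

# Illustrative observations only.
sample = [52, 48, 50, 51, 49, 50, 53, 47]
lower, upper = confidence_limits(sample)
print(round(lower, 2), round(upper, 2))
```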
COLLECTION OF DATA
Statistical investigation: An investigation (or) inquiry means a “search for
knowledge”. Statistical investigation means “search for knowledge with the help of
statistical methods”.
Stages of Investigation: A statistical investigation is a comprehensive process which passes
through the following steps:
1. Planning the inquiry
2. Collection of data
3. Editing the data
4. Presentation of data
5. Analysis of data
6. Presentation of final report
Collection of data: The first step in the conduct of a statistical investigation (or) inquiry
is “collection of data”. The sources of data can be represented as follows:
DATA
    INTERNAL DATA
    EXTERNAL DATA
        PRIMARY DATA
        SECONDARY DATA
Internal source: Internal data come from government and business organizations,
which generate them in the form of records of production, purchases, expenses, etc.
External data: When data are collected from outside the organization, they are
said to come from an external source. External data can be divided into two types:
(i) Primary (ii) Secondary
(i) Primary data: It refers to the statistical material which the investigator originates
himself for the purpose of the inquiry in hand. In other words, it is data which are
collected by the investigator for the first time.
(ii) Secondary data: It refers to the statistical material which is not originated by the
investigator himself but obtained from someone else's records. This type of data is
generally taken from newspapers, magazines, bulletins, reports, etc.
Methods of collection of primary data: The following methods may be used to collect
primary data:
1. Direct personal investigation
2. Indirect personal investigation
3. Information through correspondents
4. Questionnaire method
(a) Questionnaires sent by post
(b) Questionnaires sent through investigators
(1) Direct personal investigation: According to this method, the investigator obtains
the data by personal interview or observation.
Therefore, he contacts the source of information directly and personally. He
will contact each and every possible source of information.
(2) Indirect personal investigation: According to this method the investigator contacts
third parties or witnesses who have access to the information, directly or indirectly, and
are capable of supplying the necessary information. This method is generally adopted
by government committees to get the views of the people relating to the inquiry.
(3) Information through correspondents: Under this method, the investigator does not
collect the information from the persons directly. He appoints local agents in different
parts of the area under investigation. These local agents are called “correspondents”.
These correspondents collect the information and pass it on to the investigator from
time to time.
(4) Questionnaire method: In this method, the necessary information is collected from
the respondents through a questionnaire. A questionnaire is a set of questions relating
to the inquiry. The information can be collected through questionnaires in two ways.
(i) Questionnaires sent by post: In this case, the questionnaire is sent to a person, and
that person fills in the answers to the various questions asked in it.
(ii) Questionnaires sent through investigators: Under this method, investigators are
appointed who contact the persons, obtain their replies to the questionnaire and record
them in their own handwriting on the questionnaire form.
Sources of secondary data: Sometimes it is not possible to collect primary information for
want of resources in terms of money, time, etc. In that situation secondary data are used.
This type of data is generally available in magazines, journals, etc. Secondary data can
be classified into two categories:
(i) Published data
(ii) Unpublished data
Organization of data: The raw data, in the form of unarranged figures, are collected
through primary or secondary sources. Raw data give practically no information,
and hence there is a need for organization of data. Organization of data involves the
following three stages:
(1) Editing of data
(2) Classification of data
(3) Tabulation of data
(1) Editing of data:
• Editing of data refers to detecting possible errors and irregularities committed
during the collection of data.
• If the data are not edited, they may lead to wrong conclusions. Therefore
editing is essential to arrange the data in order.
(2) Classification of data:
• The process of arranging the data in groups or classes according to their
common characteristics is technically called classification.
• Classification is the grouping of related facts into classes.
Types of classification: Broadly, data can be classified in the following ways:
1. Geographical classification
2. Chronological classification
3. Conditional classification
4. Qualitative classification
5. Quantitative classification
1. Geographical classification: Here data are classified on the basis of
geographical area like village, city, state and region.
2. Chronological classification: Here, classification is done on the
basis of time, like hourly, daily, weekly, monthly, etc.
3. Conditional classification: This classification is done on the basis of
some conditions such as literacy, intelligence, honesty, beauty, ugliness,
etc.
4. Qualitative classification: Here, data are classified on the basis of
some attribute (or) quality like literacy, honesty, beauty, intelligence,
etc. In this case the basis of classification is either the presence or absence
of a quality.
5. Quantitative classification: When the data are classified on the basis of
characteristics which can be measured, such as age, income, marks,
height, weight or production, it is called “quantitative classification”.
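A quantitative classification usually groups the measured values into class intervals. The sketch below shows one way of doing this; the ages, the interval width of 10 years and the function name are assumptions chosen for illustration.

```python
def classify_by_interval(values, lower, width, classes):
    """Count the values falling in each class interval [start, start + width)."""
    table = {}
    for k in range(classes):
        start = lower + k * width
        table[(start, start + width)] = 0
    for value in values:
        k = (value - lower) // width          # index of the class the value falls in
        if 0 <= k < classes:                  # values outside all classes are ignored
            table[(lower + k * width, lower + (k + 1) * width)] += 1
    return table

# Illustrative ages (in completed years) of ten respondents.
ages = [21, 34, 45, 29, 38, 52, 41, 25, 33, 47]
frequency = classify_by_interval(ages, lower=20, width=10, classes=4)

for (start, end), count in frequency.items():
    print(f"{start}-{end}: {count}")
```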
(3) Tabulation of data: After the collection and classification of data, the process of
tabulation begins. Tabulation is dependent upon classification. Tabulation is necessary
in order to make the data understandable and organized. By tabulation we make a
systematic arrangement of statistical data in rows and columns. Rows are the
horizontal arrangements of data, whereas columns are the vertical arrangements of
data.
Tabulation tries to give the maximum information contained in the data in the
minimum possible space. It is a midway process between the collection of data and
statistical analysis.
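The arrangement in rows and columns can be sketched as a simple two-way table. The records, the two characteristics (area and literacy) and the function name below are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

def cross_tabulate(records, row_key, col_key):
    """Arrange counts in rows (one characteristic) and columns (another)."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    rows = sorted({r[row_key] for r in records})
    cols = sorted({r[col_key] for r in records})
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

# Hypothetical classified records.
records = [
    {"area": "Rural", "literate": "Yes"},
    {"area": "Rural", "literate": "No"},
    {"area": "Urban", "literate": "Yes"},
    {"area": "Urban", "literate": "Yes"},
    {"area": "Rural", "literate": "No"},
]
rows, cols, table = cross_tabulate(records, "area", "literate")

# Print the table: a header line, then one line per row category.
print("Area".ljust(8) + "".join(c.rjust(5) for c in cols))
for name, counts in zip(rows, table):
    print(name.ljust(8) + "".join(str(n).rjust(5) for n in counts))
```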
QUESTIONNAIRE AS A TOOL OF COLLECTING DATA
This method consists in preparing a questionnaire (a list of questions relating to
the field of enquiry and providing space for the answers to be filled by the
respondents) which is mailed to the respondents with a request for quick response
within the specified time. The questionnaire is the only medium of communication
between the investigator and the respondents and as such the questionnaire should be
designed or drafted with utmost care and caution so that all the relevant and essential
information for the enquiry may be collected without any difficulty, ambiguity and
vagueness.
Drafting or Framing the Questionnaire:
Drafting of a good questionnaire is a highly specialized job and requires great
care, skill, wisdom, efficiency and experience. No hard and fast rules can be laid down
for designing or framing a questionnaire. However, in this connection, the following
general points may be borne in mind:
1. The size of the questionnaire should be as small as possible. The
number of questions should be restricted to the minimum, keeping in view the
nature, objectives and scope of the enquiry. In other words, the questionnaire
should be concise and should contain only those questions which would furnish
all the necessary information relevant for the purpose. Respondents’ time should
not be wasted by asking irrelevant and unimportant questions. A large number of
questions would involve more work for the investigator and thus result in delay
on his part in collecting and submitting the information. These may, in addition,
also unnecessarily annoy or tire the respondents. A reasonable questionnaire should
contain about 15 to 25 questions. If a still larger number of questions is a must
in any enquiry, then the questionnaire should be divided into various sections or
parts.
2. The questions should be clear, brief, unambiguous, non-offending, and
courteous in tone, corroborative in nature and to the point so that not much scope
of guessing is left on the part of the respondents.
3. The questions should be arranged in a natural logical sequence. For
example, to find if a person owns a refrigerator the logical order of questions
would be: “Do you own a refrigerator? When did you buy it? What is its make?
How much did it cost you? Is its performance satisfactory? Have you ever got it
serviced?” The logical arrangement of questions in addition to facilitating
tabulation work would leave no chance for omissions or duplication.
4. The usage of vague and ‘multiple meaning’ words should be avoided.
Vague words like good, bad, efficient, sufficient, prosperity, rarely,
frequently, reasonable, poor, rich, etc., should not be used since these may
be interpreted differently by different persons and as such might give unreliable and
misleading information. Similarly, words with multiple meanings like
price, assets, capital, income, household, democracy, socialism, etc., should not
be used unless a clarification of these terms is given in the questionnaire.
5. Questions should be so designed that they are readily comprehensible
and easy to answer for the respondents. They should not be tedious nor should
they tax the respondents’ memory. Further, questions involving mathematical
calculations like percentages, ratios, etc., should not be asked.
6. Questions of a sensitive and personal nature should be avoided.
Questions like “How much money do you owe to private parties?” or “Do you clean
your utensils yourself?” which might hurt the sentiments, pride or prestige of an
individual should not be asked, as far as possible. It is also advisable to avoid
questions on which the respondent may be reluctant or unwilling to furnish
information. For example, the questions pertaining to income, savings, habits,
addiction to social evils, age (particularly in case of ladies), etc., should be asked
very tactfully.
7. Types of Questions: Under this head, the questions in the questionnaire
may be broadly classified as follows:
a) Shut Questions: In such questions possible answers are suggested by
the framers of the questionnaire and the respondent is required to tick one of
them. Shut questions can further be sub-divided into the following forms.
(i) Simple Alternative Questions: In such questions, the
respondent has to choose between two clear cut alternatives like ‘Yes’ or
‘No’; ‘Right’ or ‘Wrong’; ‘Either’ or ‘Or’ and so on. For instance, do you
own a refrigerator? – Yes or No. Such questions are also called
dichotomous questions. This technique can be applied with elegance to
situations where two clear cut alternatives exist.
(ii) Multiple Choice Questions: Quite often, it is not possible to define a clear
cut alternative and accordingly in such a situation either the first method
(Alternative Questions) is not used or additional answers between ‘Yes’
and ‘No’ like ‘Do not know’, ‘No opinion’, ‘Occasionally’, ‘Casually’,
‘Seldom’, etc., are added. For instance, to find whether a person smokes or drinks,
the following multiple choice answers may be used:
• Do you smoke?
Yes (Regularly) [ ] No (Never) [ ]
Occasionally [ ] Seldom [ ]
• Which of the following modes of cooking do you use?
Gas [ ] Coal (Coke) [ ] Wood [ ]
Power (Electricity) [ ] Stove (Kerosene) [ ]
• How do you go to your place of duty?
By bus [ ] By three wheeler scooter [ ]
By your own vehicle [ ] By taxi [ ]
By your own scooter [ ] On foot [ ]
By your own car [ ] Any other [ ]
Multiple choice questions are very easy and convenient for the respondents
to answer. Such questions save time and also facilitate tabulation. This
method should be used if only a selected few alternative answers exist to a
particular question. Sometimes, a last alternative under the category
‘Others’ or ‘Any other’ may be added. However, multiple answer questions
of relatively equal importance to a given question.
b) Open Questions: Open questions are those in which no alternative
answers are suggested and the respondents are at liberty to express their frank
and independent opinions on the problem in their own words. For instance,
‘What are the drawbacks in our examination system?’; ‘What solution do you
suggest to the housing problem in Delhi?’; ‘Which program in the Delhi TV
do you like best?’ are some of the open questions. Since the views of the
respondents in the open questions might differ widely, it is very difficult to
tabulate the diverse opinions and responses.
8) Leading questions should be avoided: For example, the question ‘Why do you use a
particular brand of blades, say, Erasmic blades?’ should preferably be framed as two
questions:
(i) Which blade do you use?
(ii) Why do you prefer it?
Gives a smooth shave [ ] Readily available in the market [ ]
Gives more shaves [ ] Any other [ ]
Price is less (cheaper) [ ]
9) Cross checks: The questionnaire should be so designed as to provide internal
checks on the accuracy of the information supplied by the respondents by including
some connected questions at least with respect to matters which are fundamental to the
enquiry. For example, in a social survey for finding the age of the mother, the question
‘What is your age?’ can be supplemented by additional questions ‘What is your date
of birth?’ or ‘What is the age of your eldest child?’ Similarly, the question ‘Age at
marriage’ can be supplemented by the question ‘The age of the first child’.
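Such a cross check can later be applied mechanically to the returned questionnaires. The sketch below compares the stated age with the age implied by the date of birth; the function names, the tolerance of one year and the dates used are assumptions made for this example.

```python
from datetime import date

def age_on(date_of_birth, survey_date):
    """Age in completed years on the survey date."""
    years = survey_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the survey year.
    if (survey_date.month, survey_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

def is_consistent(stated_age, date_of_birth, survey_date, tolerance=1):
    """True if the stated age agrees with the date of birth within the tolerance."""
    return abs(stated_age - age_on(date_of_birth, survey_date)) <= tolerance

# Hypothetical respondent: born 15 June 1940, surveyed on 1 April 1971.
survey_date = date(1971, 4, 1)
print(is_consistent(30, date(1940, 6, 15), survey_date))   # answers agree
print(is_consistent(25, date(1940, 6, 15), survey_date))   # answers conflict
```

Questionnaires that fail such a check can be set aside for editing or for a follow-up with the respondent.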
10) Pre-testing the questionnaire: From a practical point of view it is desirable to try out the
questionnaire on a small scale (i.e., on a small cross-section of the population for
which the enquiry is intended) before using it for the given enquiry on a large scale.
This testing on a small scale (called pre-test) has been found to be extremely useful in
practice. The given questionnaire can be improved or modified in the light of the
drawbacks, shortcomings and problems faced by the investigator in the pre-test. Pre-
testing also helps to decide upon the effective methods of asking questions for
soliciting the requisite information.
11) A covering letter: A covering letter from the organizers of the enquiry should be
enclosed along with the questionnaire for the following purposes:
i. It should clearly explain in brief the objectives and scope of the
survey to evoke the interest of the respondents and impress upon them to
render their full co-operation by returning their schedule/questionnaire duly
filled in within the specified period.
ii. It should contain a note regarding the operational definitions of
the various terms and concepts used in the questionnaire, the units of
measurement to be used and the degree of accuracy aimed at.
iii. It should take the respondents into confidence and assure them
that the information furnished by them will be kept completely secret and
they will not be harassed in any way later.
iv. In the case of mailed questionnaire method a self-addressed
stamped envelope should be enclosed for enabling the respondents to return
the questionnaire after completing it.
v. To ensure quick and better response the respondents may be
offered awards/incentives in the form of free gifts, coupons, etc.
vi. A copy of the survey report may be promised to the interested
respondents.
12) Mode of tabulation and analysis viz., hand operated, machine tabulation or
computerization should also be kept in mind while designing the questionnaire.
13) Lastly, the questionnaire should be made attractive by proper layout and appealing
get up. We give below two specimen questionnaires for illustration.
A MODEL OF QUESTIONNAIRE IN REGARDS TO CENSUS SURVEY:
We give below the 1971 Census – Individual Slip which was used for a general
purpose survey to collect:
(i) Social and Cultural data like nationality, religion, literacy, mother tongue, etc.;
(ii) Exhaustive economic data like occupation, industry, class of worker and activity, if
not working;
(iii) Demographic data like relation to the head of the house,
sex, age, marital status, birth place, births and deaths and the fertility of women to
assess in particular the performance of the family planning programme.
1971 CENSUS – INDIVIDUAL SLIP
1. Name…………………………………………………..
2. Relationship to the head of the family………………………………………
3. Sex………………………..
4. Age…………………………………..
5. Marital status………………………..
6. For currently married women only:
a) Age at marriage……………
b) Any child born in the last one year……………..
7. Birth place:
a) Place of birth……………
b) Rural or urban…………….
c) District…………………………….
d) State/Country…………………………..
8. Last Residence:
a) Place of last residence…………………………………………
b) Rural/Urban……………………………………….
c) District………………………………….
d) State/Country………………………………………
9. Duration of present residence……………………………………..
10. Religion………………………………………….
11. Scheduled Caste/Tribe………………………………………
12. Literacy………………………………………….
13. Educational level………………………………………..
14. Mother Tongue…………………………………………..
15. Other Languages, if any……………………………………………………….
16. Main Activity:
a) Broad Category:
(i) Worker
(ii) Non – Worker
b) Place of work (Name of village/town)…………………………..
c) Name of establishment………………………
d) Name of Industry, Trade, Profession or Service…………………
e) Description of work…………………………………..
f) Class of worker………………………………..
17. Secondary work:
a) Broad Category………………………
b) Place of work…………………………….
c) Name of establishment……………………….
d) Nature of Industry, Trade, Profession or Service………………………….
e) Description of work…………………………………..
f) Class of worker……………………………………………..
SCHEDULES AS A TOOL FOR COLLECTING DATA
Before discussing this method it is desirable to make a distinction between a
questionnaire and a schedule. As already explained, a questionnaire is a list of questions
which are answered by the respondent himself in his own handwriting, while a schedule
is the device of obtaining answers to the questions in a form which is filled by the
interviewers or enumerators (the field agents who put these questions) in a face to face
situation with the respondents. The most widely used method of collection of primary
data is the ‘schedules sent through enumerators’. This is so because this method is free
from certain shortcomings inherent in the earlier methods discussed so far. In this the
enumerators go to the respondents personally with the schedule (list of questions), ask
them the questions there in and record their replies. This method is generally used by
big business houses, large public enterprises and research institutions like the National
Council of Applied Economic Research (NCAER), the Federation of Indian Chambers of
Commerce and Industry (FICCI) and so on and even by the governments – state or
central – for certain projects and investigations where a high degree of response is
desired. Population censuses all over the world are conducted by this technique.
Merits:
1. The enumerators can explain in detail the objectives and aims of the enquiry to
the informants and impress upon them the need and utility of furnishing the
correct information.
2. This technique is very useful in extensive enquiries and generally yields fairly
dependable and reliable results due to the fact that the information is recorded
by highly trained and educated enumerators.
3. Unlike the ‘Questionnaire method’, this technique can be used with advantage
even if the respondents are illiterate.
4. As already pointed out in the ‘direct personal investigation’, due to personal
likes and dislikes, different people react differently to different questions and
as such some people might react very sharply to certain sensitive and personal
questions.
Demerits:
1. It is a fairly expensive method since the team of enumerators is to be paid for
different services and as such can be used by only those bodies or institutions
which are financially sound.
2. It is also more time consuming as compared with the ‘Questionnaire method’.
3. The success of the method largely depends upon the efficiency and skill of the
enumerators who collect the information. The enumerators have to be trained
properly in the art of collecting correct information by their intelligence,
insight, patience and perseverance, diplomacy and courage. They should
clearly understand the aims and objectives of the enquiry and also the
implications of the various terms, definitions and concepts used in the
questionnaire.
4. Due to inherent variation in the individual personalities of the enumerators
there is bound to be variation, though not so obvious, in the information
recorded by different enumerators. An attempt should be made to minimize
this variation.
5. The success of this method also lies to a great extent on the efficiency and
wisdom with which the schedule is prepared or drafted. If the schedule is
framed haphazardly and incompetently, the enumerators will find it very
difficult to get the complete and correct desired information from the
respondents.
SAMPLE DESIGN AND SAMPLING PROCEDURES
SAMPLE DESIGN:
A sample design is a definite plan for obtaining a sample from a given
population. It refers to the technique or the procedure the researcher would adopt in
selecting items for the sample. Sample design may as well lay down the number of
items to be included in the sample, i.e., the size of the sample. Sample design is
determined before data are collected. There are many sample designs from which a
researcher can choose. Some designs are relatively more precise and easier to apply
than others. Researcher must select/prepare a sample design which should be reliable
and appropriate for his research study.
STEPS IN SAMPLE DESIGN:
While developing a sample design, the researcher must pay attention to the following
points:
1. Type of universe: The first step in developing sample design is to clearly
define the set of objects, technically called the Universe, to be studied. The
universe can be finite or infinite. In a finite universe the number of items is
certain, but in case of an infinite universe the number of items is infinite, i.e.,
we cannot have any idea about the total number of items. The population of a
city, the number of workers in a factory and the like are examples of finite
universes, whereas the number of stars in the sky, listeners of a specific radio
programme, throwing of a die, etc., are examples of infinite universes.
2. Sampling Unit: A decision has to be taken concerning a sampling unit before
selecting sample. Sampling unit may be a geographical one such as state,
district, village, etc., or a construction unit such as house, flat, etc., or it may be
a social unit such as family, club, school, etc., or it may be an individual. The
researcher will have to decide one or more of such units that he has to select
for his study.
3. Source List: It is also known as ‘Sampling frame’ from which sample is to be
drawn. It contains the names of all items of a universe (in case of finite
universe only). If source list is not available, researcher has to prepare it. Such
a list should be comprehensive, correct, reliable and appropriate. It is
extremely important for the source list to be as representative of the population
as possible.
4. Size of sample: This refers to the number of items to be selected from the
universe to constitute a sample. This is a major problem before a researcher. The
size of sample should neither be excessively large, nor too small. It should be
optimum. An optimum sample is one which fulfills the requirements of
efficiency, representativeness, reliability and flexibility. While deciding the
size of sample, researcher must determine the desired precision as also an
acceptable confidence level for the estimate.
5. Parameters of interest: In determining the sample design, one must consider
the question of the specific population parameters which are of interest. For
instance, we may be interested in estimating the proportion of persons with
some characteristic in the population, or we may be interested in knowing
some average or the other measure concerning the population. There may also
be important sub-groups in the population about whom we would like to make
estimates. All this has a strong impact upon the sample design we would
accept.
6. Budgetary Constraint: Cost considerations, from practical point of view, have
a major impact upon decisions relating to not only the size of the sample but
also to the type of sample. This fact can even lead to the use of a non-
probability sample.
7. Sampling Procedure: Finally, the researcher must decide the type of sample he
will use i.e., he must decide about the technique to be used in selecting the
items for the sample. In fact, this technique or procedure stands for the sample
design itself. There are several sample designs out of which the researcher
must choose one for his study. Obviously, he must select that design which, for
a given sample size and for a given cost, has a smaller sampling error.
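The link between the desired precision, the confidence level and the size of sample (point 4 above) can be illustrated with a commonly used textbook formula for estimating a proportion under simple random sampling from a large universe, n = z²p(1 − p)/e². This formula is not given in the text above and is used here only as an assumed illustration; the function name and the figures are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error, confidence=0.95, expected_proportion=0.5):
    """Sample size needed to estimate a proportion within the given margin of error.

    Assumes simple random sampling from a large universe; p = 0.5 is the most
    cautious choice when the true proportion is unknown.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = expected_proportion
    return ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

n = sample_size_for_proportion(margin_of_error=0.05)   # precision of +/- 5%, 95% confidence
print(n)
```

Asking for higher precision (a smaller margin of error) or a higher confidence level raises the required size of sample, which is where the budgetary constraint in point 6 begins to matter.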
CHARACTERISTICS OF GOOD SAMPLE DESIGN:
From what has been stated above, we can list down the characteristics of a good
sample design as under:
a) Sample design must result in a truly representative sample.
b) Sample design must be such that it results in a small sampling error.
c) Sample design must be viable in the context of funds available for the research
study.
d) Sample design must be such that systematic bias can be controlled in a
better way.
e) Sample should be such that the results of the sample study can be applied, in
general, to the universe with a reasonable level of confidence.
CRITERIA OF SELECTING A SAMPLING PROCEDURE:
In this context one must remember that two costs are involved in a sampling
analysis viz., the cost of collecting the data and the cost of an incorrect inference
resulting from the data. Researcher must keep in view the two causes of incorrect
inferences viz., systematic bias and sampling error. Systematic bias results from errors
in the sampling procedures, and it cannot be reduced or eliminated by increasing the
sample size. At best the causes responsible for these errors can be detected and
corrected. Usually a systematic bias is the result of one or more of the following
factors.
1) Inappropriate frame: If the sampling frame is inappropriate i.e., a biased
representation of the universe, it will result in a systematic bias.
2) Defective measuring device: If the measuring device is constantly in error, it will result in
systematic bias. In survey work, systematic bias can result if the questionnaire or the
interviewer is biased. Similarly, if the physical measuring device is defective there will be
systematic bias in the data collected through such a measuring device.
3) Non-respondents: If we are unable to sample all the individuals initially included in the
sample, there may arise a systematic bias. The reason is that in such a situation the likelihood
of establishing contact or receiving a response from an individual is often correlated with the
measure of what is to be estimated.
4) Indeterminacy principle: Sometimes we find that individuals act differently when kept
under observation than they do when kept in non-observed situations. For instance, if
workers are aware that somebody is observing them in the course of a work study on the basis of
which the average length of time to complete a task will be determined and accordingly the
quota will be set for piece work, they generally tend to work slowly in comparison to the
speed with which they work if kept unobserved. Thus, the indeterminacy principle may also be
a cause of a systematic bias.
5) Natural bias in the reporting of data: Natural bias of respondents in the reporting of data
is often the cause of a systematic bias in many inquiries. There is usually a downward bias in
the income data collected by the government taxation department, whereas we find an
upward bias in the income data collected by some social organizations. People in general
understate their incomes if asked about them for tax purposes, but they overstate the same if
asked for social status or their affluence. Generally in psychological surveys, people tend to
give what they think is the ‘correct’ answer rather than revealing their true feelings.
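The statement that systematic bias cannot be reduced by increasing the sample size can be checked with a small simulation. The sketch below assumes a hypothetical universe of incomes and assumes that every respondent understates income by 10%; all figures and names are invented for illustration.

```python
import random
from statistics import mean

def survey_mean(true_incomes, sample_size, understatement, rng):
    """Mean reported income when every respondent understates by a fixed fraction."""
    sample = rng.sample(true_incomes, sample_size)
    reported = [income * (1 - understatement) for income in sample]
    return mean(reported)

rng = random.Random(0)                      # fixed seed so the run is repeatable
population = [rng.uniform(1000, 5000) for _ in range(100_000)]
true_mean = mean(population)

errors = {}
for n in (100, 10_000):
    estimate = survey_mean(population, n, understatement=0.10, rng=rng)
    errors[n] = estimate - true_mean        # sampling error plus systematic bias
    print(n, round(errors[n], 1))
```

With the larger sample the sampling error shrinks, but the estimate still falls short of the true mean by roughly 10%, because the understatement affects every observation alike.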
DIFFERENT TYPES OF SAMPLE DESIGNS:
There are different types of sample designs based on two factors viz., the
representation basis and the element selection technique. On the representation basis,
the sample may be
probability sampling or it may be non-probability sampling. Probability sampling is
based on the concept of random selection, whereas non-probability sampling is
‘non-random’ sampling. On the element selection basis, the sample may be either unrestricted or
restricted. When each sample element is drawn individually from the population at
large, then the sample so drawn is known as ‘unrestricted sample’, whereas all other
forms of sampling are covered under the term ‘restricted sampling’. The following
chart exhibits the sample designs as explained above.
Non-probability sampling: Non-probability sampling is that sampling procedure
which does not afford any basis for estimating the probability that each item in the
population has of being included in the sample. Non-probability sampling is also
known by different names such as deliberate sampling, purposive sampling and
judgment sampling. In this type of sampling, items for the sample are selected
deliberately by the researcher; his choice concerning the items remains supreme. In
other words, under non-probability sampling the organizers of the inquiry purposively
choose the particular units of the universe for constituting a sample on the basis that the
small mass that they so select out of a huge one will be typical or representative of the
whole. For instance, if economic conditions of people living in a state are to be
studied, a few towns and villages may be purposively selected for intensive study on
the principle that they can be representative of the entire state. Thus, the judgment of
the organizers of the study plays an important part in this sampling design.
Quota sampling: It is also an example of non-probability sampling. Under quota
sampling the interviewers are simply given quotas to be filled from the different strata,
with some restrictions on how they are to be filled. In other words, the actual selection
of the items for the sample is left to the interviewer’s discretion. This type of sampling
is very convenient and is relatively inexpensive. But the samples so selected certainly
do not possess the characteristic of random samples. Quota samples are essentially
judgment samples and inferences drawn on their basis are not amenable to statistical
treatment in a formal way.
Probability sampling: Probability sampling is also known as ‘random sampling’ or
‘chance sampling’. Under this sampling design, every item of the universe has an
equal chance of inclusion in the sample. It is, so to say, a lottery method in which
individual units are picked up from the whole group not deliberately but by some
mechanical process. Here it is blind chance alone that determines whether one item or
the other is selected. The results obtained from probability or random sampling can be
assured in terms of probability i.e., we can measure the errors of estimation or the
significance of results obtained from a random sample, and this fact brings out the
superiority of random sampling design over the deliberate sampling design. Random
sampling ensures the Law of Statistical Regularity which states that if on an average
the sample chosen is a random one, the sample will have the same composition and
characteristics as the universe. This is the reason why random sampling is considered
as the best technique of selecting a representative sample.
Random sampling from a finite population refers to that method of sample selection
which gives each possible sample combination an equal probability of being picked up
and each item in the entire population an equal chance of being included in the
sample. This applies to sampling without replacement, i.e., once an item is selected for the
sample, it cannot appear in the sample again. (Sampling with replacement is used less
frequently; in that procedure the element selected for the sample is returned to the
population before the next element is selected. In such a situation the same element
could appear twice in the same sample.) In brief, the
implications of random sampling (or simple random sampling) are:
(a) It gives each element in the population an equal probability of getting into the
sample; and all choices are independent of one another.
(b) It gives each possible sample combination an equal probability of being chosen.
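The two implications above can be illustrated with a minimal Python sketch. It assumes a hypothetical universe of 100 numbered items and a helper name (`simple_random_sample`) that does not come from the text; the standard library's `random.sample` draws without replacement, so every combination of the requested size is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct items; every subset of size n is equally likely."""
    rng = random.Random(seed)          # seed only makes the illustration repeatable
    return rng.sample(list(population), n)

# Hypothetical universe: items numbered 1 to 100.
universe = list(range(1, 101))
sample = simple_random_sample(universe, 10, seed=1)
print(sample)
```

Because the draw is without replacement, no item can appear twice in the sample.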
COMPLEX RANDOM SAMPLING DESIGNS:
Probability sampling under restricted sampling techniques, as stated above,
may result in complex random sampling designs. Such designs may as well be called
‘mixed sampling designs’ for many of such designs may represent a combination of
probability and non-probability sampling procedures in selecting a sample. Some of
the popular complex random sampling designs are as follows:
(i) Systematic Sampling: In some instances, the most practical way of sampling is to
select every i-th item on a list. Sampling of this type is known as systematic sampling.
An element of randomness is introduced into this kind of sampling by using random
numbers to pick up the unit with which to start. For instance, if a 4 percent sample is
desired, the first item would be selected randomly from the first twenty-five and
thereafter every 25th item would automatically be included in the sample. Thus, in
systematic sampling only the first unit is selected randomly and the remaining units of
the sample are selected at fixed intervals. Although a systematic sample is not a
random sample in the strict sense of the term, it is often considered reasonable to
treat a systematic sample as if it were a random sample.
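The 4 percent example can be sketched in Python as follows. The list of 1000 numbered items and the function name are assumptions made for illustration only: a random start is chosen within the first 25 items and every 25th item thereafter is taken.

```python
import random

def systematic_sample(items, interval, seed=None):
    """Pick a random start within the first `interval` items, then every interval-th item."""
    rng = random.Random(seed)
    start = rng.randrange(interval)    # position 0 .. interval-1 in the list
    return items[start::interval]

# Hypothetical sampling frame of 1000 numbered items; interval 25 gives a 4 percent sample.
frame = list(range(1, 1001))
sample = systematic_sample(frame, 25, seed=7)
print(len(sample), sample[:4])
```

Only the starting position is random; once it is fixed, the rest of the sample is fully determined.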
(ii) Stratified Sampling: If a population from which a sample is to be drawn does not
constitute a homogeneous group, stratified sampling technique is generally applied in
order to obtain a representative sample. Under stratified sampling the population is
divided into several sub-populations that are individually more homogeneous than the
total population (the different sub-populations are called ‘strata’), and then we select
items from each stratum to constitute a sample. Since each stratum is more
homogeneous than the total population, we are able to get more precise estimates for each
stratum, and by estimating each of the component parts more accurately, we get a
better estimate of the whole. In brief, stratified sampling results in more reliable and
detailed information.
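A minimal sketch of stratified sampling in Python is given below. It assumes proportional allocation (the same sampling fraction in every stratum), which is only one possible allocation rule and is not prescribed by the text; the strata names and sizes are invented for illustration.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))   # at least one item per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population split into two strata.
strata = {
    "urban": [f"U{i}" for i in range(600)],
    "rural": [f"R{i}" for i in range(400)],
}
sample = stratified_sample(strata, 0.05, seed=3)
print({name: len(items) for name, items in sample.items()})
```

Each stratum is represented in the sample in the same proportion as in the population, so no sub-population can be missed by chance.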
(iii) Cluster Sampling: If the total area of interest happens to be a big one, a
convenient way in which a sample can be taken is to divide the area into a number of
smaller non-overlapping areas and then to randomly select a number of these smaller
areas (usually called clusters), with the ultimate sample consisting of all (or samples of)
units in these small areas or clusters.
Thus in cluster sampling the total population is divided into a number of
relatively small subdivisions which are themselves clusters of still smaller units and
then some of these clusters are randomly selected for inclusion in the overall sample.
Suppose we want to estimate the proportion of machine parts in an inventory which
are defective. Also assume that there are 20000 machine parts in the inventory at a
given point of time, stored in 400 cases of 50 each. Now using a cluster sampling, we
would consider the 400 cases as clusters and randomly select ‘n’ cases and examine all
the machine parts in each randomly selected case.
Cluster sampling, no doubt, reduces cost by concentrating surveys in selected
clusters. But certainly it is less precise than random sampling. There is also not as
much information in ‘n’ observations within a cluster as there happens to be in ‘n’
randomly drawn observations. Cluster sampling is used only because of the economic
advantage it possesses; estimates based on cluster samples are usually more reliable
per unit cost.
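The machine-parts illustration can be sketched in Python as below. The inventory is synthetic: the 5 percent defect rate, the choice of n = 20 cases and the function name are assumptions made only so that the example runs. Whole cases are selected at random and every part in a selected case is examined.

```python
import random

def cluster_sample_proportion(cases, n_cases, seed=None):
    """Select n_cases whole cases at random and inspect every part inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(range(len(cases)), n_cases)
    inspected = [part for idx in chosen for part in cases[idx]]
    return chosen, sum(inspected) / len(inspected)

# Synthetic inventory: 400 cases of 50 parts each; 1 marks a defective part.
data_rng = random.Random(0)
cases = [[1 if data_rng.random() < 0.05 else 0 for _ in range(50)]
         for _ in range(400)]

chosen, p_hat = cluster_sample_proportion(cases, 20, seed=1)
print(len(chosen), round(p_hat, 3))
```

Here 20 cases give 1000 inspected parts, but they come from only 20 locations, which is the cost advantage (and the precision drawback) described above.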
(iv) Area Sampling: If clusters happen to be some geographic subdivisions, in that
case cluster sampling is better known as area sampling. In other words, cluster
designs, where the primary sampling unit represents a cluster of units based on
geographic area, are distinguished as area sampling. The plus and minus points of
cluster sampling are also applicable to area sampling.
(v) Multi-stage Sampling: Multi-stage sampling is a further development of the
principle of cluster sampling. Suppose we want to investigate the working efficiency
of nationalized banks in India and we want to take a sample of a few banks for this
purpose. The first stage is to select large primary sampling units such as states in a
country. Then we may select certain districts and interview all banks in the chosen
districts. This would represent a two-stage sampling design with the ultimate sampling
units being clusters of districts.
If, instead of taking a census of all banks within the selected districts, we select
certain towns and interview all banks in the chosen towns, this would represent a
three-stage sampling design. If, instead of taking a census of all banks within the
selected towns, we randomly sample banks from each selected town, then it is a case
of using a four-stage sampling plan. If we select randomly at all stages, we will have
what is known as a ‘multi-stage random sampling design’.
Ordinarily multi-stage sampling is applied in inquiries extending to a
considerably large geographical area, say, the entire country. There are two advantages
of this sampling design, viz., (a) It is easier to administer than most single-stage designs,
mainly because the sampling frame under multi-stage sampling is
developed in partial units. (b) A large number of units can be sampled for a given cost
under multi-stage sampling because of sequential clustering, whereas this is not possible in most
of the simple designs.
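A simplified Python sketch of multi-stage random sampling is shown below. For brevity it uses three stages (states, then districts, then banks) and omits the town stage of the banking illustration; the nested frame, its names and the sample sizes at each stage are all hypothetical.

```python
import random

def multistage_sample(frame, n_states, n_districts, n_banks, seed=None):
    """Randomly select states, then districts within them, then banks within those."""
    rng = random.Random(seed)
    selected = []
    for state in rng.sample(sorted(frame), n_states):
        districts = frame[state]
        for district in rng.sample(sorted(districts), n_districts):
            banks = districts[district]
            for bank in rng.sample(banks, min(n_banks, len(banks))):
                selected.append((state, district, bank))
    return selected

# Hypothetical frame: 6 states, 5 districts per state, 8 banks per district.
frame = {
    f"State{s}": {
        f"District{s}.{d}": [f"Bank{s}.{d}.{b}" for b in range(8)]
        for d in range(5)
    }
    for s in range(6)
}
sample = multistage_sample(frame, n_states=2, n_districts=2, n_banks=3, seed=11)
print(len(sample))
```

Note that a list of banks is needed only for the districts actually selected, which is why the sampling frame can be developed in partial units.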
(vi) Sampling with probability proportional to size: In case the cluster sampling
units do not have the same number or approximately the same number of elements, it
is considered appropriate to use a random selection process where the probability of
each cluster being included in the sample is proportional to the size of the cluster. For
this purpose, we have to list the number of the elements in each cluster irrespective of
the method of ordering the cluster. Then we must sample systematically the
appropriate number of elements from the cumulative totals.
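The procedure of sampling systematically from the cumulative totals can be sketched as follows. The cluster sizes are invented and the function name is an assumption; the sketch uses integer arithmetic, takes a random start between 1 and the sampling interval, and selects the cluster whose cumulative range contains each sampling point.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_systematic(sizes, n, seed=None):
    """Select n clusters with probability proportional to size (systematic method)."""
    rng = random.Random(seed)
    cumulative = list(accumulate(sizes))      # running totals of cluster sizes
    interval = cumulative[-1] // n            # sampling interval on the cumulative scale
    start = rng.randint(1, interval)          # random start in 1 .. interval
    points = [start + k * interval for k in range(n)]
    # A point p selects the first cluster whose cumulative total is >= p.
    return [bisect_left(cumulative, p) for p in points]

# Hypothetical cluster sizes (number of elements in each of 10 clusters).
sizes = [35, 17, 10, 32, 70, 28, 26, 19, 26, 66]
picked = pps_systematic(sizes, 4, seed=2)
print(picked)   # indices of the selected clusters
```

A cluster larger than the sampling interval can be hit by more than one sampling point, in which case its index appears more than once in the result.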
(vii) Sequential Sampling: This is a somewhat complex sample
design. The ultimate size of the sample under this technique is not fixed in advance,
but is determined according to mathematical decision rules on the basis of
information yielded as the survey progresses. This is usually adopted in the case of an
acceptance sampling plan in the context of statistical quality control. When a particular lot is to be
accepted or rejected on the basis of a single sample, it is known as single sampling;
when the decision is to be taken on the basis of two samples, it is known as double
sampling; and in case the decision rests on the basis of more than two samples but the
number of samples is certain and decided in advance, the sampling is known as
multiple sampling. But when the number of samples is more than two but it is neither
certain nor decided in advance, this type of system is often referred to as sequential
sampling.
DIAGRAMMATIC PRESENTATION OF DATA
General rules for Constructing Diagrams:
(1) Neatness: Diagrams are visual aids for presentation of statistical
data and are more appealing and fascinating to the eye and leave a lasting
impression on the mind. It is, therefore, imperative that they are made very neat,
clean and attractive by proper size and lettering; and the use of appropriate devices
like different colours, different shades (light and dark), dots, dashes, dotted lines,
broken lines, dots and dash lines, etc., for filling the in between space of the bars,
rectangles, circles, etc., and their components.
(2) Title and Footnotes: As in the case of a good statistical table, each
diagram should be given a suitable title to indicate the subject-matter and the
various facts depicted in the diagram. The title should be brief, clear and
self-explanatory. If necessary, footnotes may be given at the bottom left
of the diagram to explain certain points or facts not otherwise covered in the title.
(3) Selection of Scale: One of the most important factors in the
construction of diagrams is the choice of an appropriate scale. The same set of
numerical data if plotted on different scales may give the diagrams differing
widely in size and at times might lead to wrong and misleading interpretations.
Hence, the scale should be selected with great caution.
(4) Proportion between Width and Height: A proper proportion
between the dimensions (height and width) of the diagram should be maintained,
consistent with the space available.
(5) Choice of a Diagram: A large number of diagrams are used to
present statistical data. The choice of a particular diagram to present a given set of
numerical data is not an easy one. It primarily depends on the nature of the data,
magnitude of the observations and the type of the people for whom the diagrams
are meant, and requires a great amount of expertise, skill, and intelligence. An
inappropriate choice of the diagram for the given set of data might give a distorted
picture of the phenomenon under study and might lead to wrong and fallacious
interpretations and conclusions.
(6) Source Note and Number: As in the case of tables, a source note,
wherever possible, should be appended at the bottom of the diagram. This is
necessary because, to the learned audience of statistics, the reliability of the information
varies from source to source. Each diagram should also be given a number for
ready reference and comparative study.
(7) Index: A brief index explaining various types of shades, colors,
lines, and designs used in the construction of the diagram should be given for clear
understanding of the diagram.
(8) Simplicity: Lastly, diagrams should be as simple as possible so that
they are easily understood even by a layman who does not have any mathematical
or statistical background. If too much information is presented in a single complex
diagram it will be difficult to grasp and might even become confusing to the mind.
Hence, it is advisable to draw more simple diagrams than one or two complex
diagrams.
TYPES OF DIAGRAMS:
A large variety of diagrammatic devices are used in practice to present statistical data.
However, we shall discuss here only some of the most commonly used diagrams
which may be broadly classified as follows:
(1) One-dimensional diagrams
(2) Two-dimensional diagrams
(3) Three-dimensional diagrams
(4) Pictograms
(5) Cartograms
1) One-Dimensional Diagrams: One-dimensional diagrams are classified into
two types. They are:
a) Line Diagrams
b) Bar Diagrams
a) Line Diagram: This is the simplest of all the diagrams. It consists in drawing
vertical lines, each vertical line being equal to the frequency. The variate (x) values
are presented on a suitable scale along the X-axis and the corresponding
frequencies are presented on a suitable scale along Y-axis. Line diagrams facilitate
comparisons though they are not attractive or appealing to the eye.
[Figure: sample chart showing the series East, West and North for the 1st to 4th quarters; vertical scale 0 to 100.]
b) Bar Diagram: Bar diagrams are one of the easiest and the most commonly used
devices of presenting most of the business and economic data. These are especially
satisfactory for categorical data or series. They consist of a group of equidistant
rectangles, one for each group or category of the data in which the values or the
magnitudes are represented by the length or height of the rectangles, the width of
the rectangles being arbitrary and immaterial. These diagrams are called one-
dimensional because in such diagrams only one dimension viz., height or length of
the rectangles is taken into account to present the given values. There are various
types of Bar Diagrams. They are listed as follows:
(i) Simple bar diagram
(ii) Sub-divided or component bar diagram
(iii) Percentage bar diagram
(iv) Multiple bar diagram
(v) Deviation or Bilateral bar diagram
[Figure: sample bar diagram showing the series East, West and North for the 1st to 4th quarters; vertical scale 0 to 90.]
2) Two-Dimensional Diagrams: Line or Bar diagrams discussed so far are one-
dimensional diagrams since the magnitudes of the observations are represented by
only one of the dimensions viz., height (length) of the bars while the width of the bars
is arbitrary and uniform. However, in two-dimensional diagrams, the magnitudes of
the given observations are represented by the area of the diagram. Thus, in the case of
two-dimensional bar diagrams, the length as well as width of the bars will have to be
considered. Two-dimensional diagrams are also known as “area diagrams” or “surface
diagrams”. Some of the commonly used two-dimensional diagrams are:
• Rectangles
• Squares
• Circles
• Angular or pie diagrams
3) Three-Dimensional Diagrams: Three-dimensional diagrams, also termed as
‘volume diagrams’ are those in which three dimensions, viz., length, breadth, and
height are taken into account. They are constructed so that the given magnitudes
are represented by the volumes of the corresponding diagrams. The common forms
of such diagrams are “cubes, spheres, cylinders, blocks etc”. These diagrams are
specially useful if there are very wide variations between the smallest and the
largest magnitudes to be represented. Of the various three-dimensional diagrams,
‘cubes’ are the simplest and most commonly used devices of diagrammatic
presentation of data.
4) Pictograms: A pictogram is a technique of presenting statistical data through
appropriate pictures and is one of the very popular devices, particularly when the
statistical facts are to be presented to a layman without any mathematical
background. In this, the magnitudes of the particular phenomenon under study are
presented through appropriate pictures, the number of pictures drawn or the size of
the pictures being proportional to the values of the different magnitudes to be
presented. Pictures are more attractive and appealing to the eye and have a lasting
impression on the mind. Accordingly they are extensively used by government and
private institutions for diagrammatic presentation of the data relating to a variety
of social, business or economic phenomena primarily for display to the general
public or common masses in fairs and exhibitions.
5) Cartograms: In cartograms, statistical facts are presented through maps
accompanied by various types of diagrammatic representation. They are specially
used to depict the quantitative facts on a regional or geographical basis, e.g., the
population density of different states in a country or different countries in the
world, or the distribution of the rainfall in different regions of a country can be
shown with the help of maps or cartograms. The different regions or geographical
zones are depicted on a map and the quantities or magnitudes in the regions may
be shown by dots, different shades or colors etc., or by placing bars or pictograms
in each region or by writing the magnitudes to be represented in the respective
regions. Cartograms are simple and elementary forms of visual presentation and
are easy to understand. They are generally used when the regional or geographic
comparisons are to be highlighted.
GRAPHIC REPRESENTATION OF DATA
Diagrams are primarily used for comparative studies and can’t be used to study the
relationship between the variables under study. This is done through graphs.
Diagrams furnish only approximate information and they are not of much utility to a
statistician from analysis point of view. On the other hand, graphs are more obvious,
precise and accurate than diagrams and can be effectively used for further statistical
analysis, viz., to study slopes, rates of change and for forecasting wherever possible.
Graphs are drawn on a special type of paper, known as “graph paper”.
Before discussing these graphs we shall briefly describe the technique of
constructing graphs and the general rules for drawing graphs.
TECHNIQUE OF CONSTRUCTION OF GRAPHS:
                          Y
   QUADRANT II            |            QUADRANT I
   X negative             |            X positive
   Y positive             |            Y positive
   (-X, +Y)               |            (+X, +Y)
X' -----------------------O----------------------- X
   QUADRANT III           |            QUADRANT IV
   X negative             |            X positive
   Y negative             |            Y negative
   (-X, -Y)               |            (+X, -Y)
                          Y'
(Both axes are scaled from -5 to +5, with the origin O at zero.)
Graphs are drawn on a special type of paper known as “Graph Paper”, which
has a fine network of horizontal and vertical lines; the thick lines for each division of a
centimeter or an inch measure and thin lines for small parts of the same. In a graph of
any size, two simple lines are drawn at right angles to each other, intersecting at point
‘O’ which is known as origin or zero of reference. The two lines are known as co-
ordinate axes. The horizontal line is called X – axis and is denoted by X’OX. The
vertical line is called the Y – axis and is usually denoted by YOY’. Thus the graph is
divided into four sections, known as four quadrants.
General Rules for Graphing: The following guidelines may be kept in mind for
drawing effective and accurate graphs.
1. Neatness
2. Title and Footnote
3. Structural Framework
4. Scale
5. False Base Line
6. Ratio or Logarithmic Scale
7. Line designs
8. Source Note and Number
9. Index
10. Simplicity
TYPES OF GRAPHS: A large number of graphs are used in practice. But they can
be broadly classified under the following two heads:
(i) Graphs of frequency distributions.
(ii) Graphs of time series.
1) Graphs of Frequency Distributions: The reasons and the guiding principles
for the graphic representation of the frequency distributions are precisely the same
as for the diagrammatic and graphic representation of other types of data. The so-
called frequency graphs are designed to reveal clearly the characteristic features of
the frequency data. Such graphs are more appealing to the eye than the tabulated data
and are readily perceptible to the mind. They facilitate comparative study of two or
more frequency distributions regarding their shape and pattern. The most
commonly used graphs for charting a frequency distribution for the general
understanding of the details of the data are:
A) Histogram B) Frequency Polygon
C) Frequency Curve D) “Ogive” or Cumulative Frequency Curve
The choice of a particular graph for a given frequency distribution largely depends on
the nature of the frequency distribution, viz., discrete or continuous.
A) HISTOGRAM: It is one of the most popular and commonly used devices for
charting continuous frequency distribution. It consists in erecting a series of
adjacent vertical rectangles on the sections of the horizontal axis (X-axis), with
bases (sections) equal to the width of the corresponding class intervals and heights
are so taken that the areas of the rectangles are equal to the frequencies of the
corresponding classes.
The Histogram can be constructed in two cases. They are:
Case (i): Histogram with equal classes.
Case (ii): Histogram with un-equal classes.
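For the case of unequal classes, the rule that areas (not heights) represent frequencies means each rectangle's height must be the frequency divided by the class width, often called the frequency density. The following Python sketch illustrates this with invented class limits and frequencies; the function name is an assumption.

```python
def histogram_heights(class_limits, frequencies):
    """Height of each rectangle = frequency / class width, so that area = frequency."""
    heights = []
    for (lower, upper), f in zip(class_limits, frequencies):
        heights.append(f / (upper - lower))
    return heights

# Hypothetical continuous distribution with unequal class widths.
classes = [(0, 10), (10, 20), (20, 40), (40, 50), (50, 80)]
freqs = [8, 12, 20, 10, 15]

heights = histogram_heights(classes, freqs)
for (lower, upper), h in zip(classes, heights):
    print(f"{lower}-{upper}: height {h:.2f}")
```

The class 20-40 has the largest frequency (20) but, being twice as wide as its neighbours, its rectangle is no taller than that of the class 40-50. With equal classes the widths are all the same, so heights may simply be taken proportional to the frequencies.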
B) FREQUENCY POLYGON: A frequency polygon is another device for the graphic
presentation of a frequency distribution (continuous, grouped or discrete). In case
of discrete frequency distribution, frequency polygon is obtained on plotting the
frequencies on the vertical axis (Y-axis) against the corresponding values of the
variable on the horizontal axis (X-axis) and joining the points so obtained by
straight lines.
C) FREQUENCY CURVE: A frequency curve is a smooth free hand curve
drawn through the vertices of a frequency polygon. The object of smoothing of the
frequency polygon is to eliminate, as far as possible, the random or erratic
fluctuations that might be present in the data. The area enclosed by the frequency
curve is the same as that of the histogram or frequency polygon, but its shape is smooth
and without sharp edges. A frequency curve may be regarded as the limiting form
of the frequency polygon as the number of observations (total frequency) becomes
very large and class intervals are made smaller and smaller.
Types of frequency curves:
Though different types of data may give rise to a variety of frequency curves, we
shall discuss below only some of the important curves which, in general, describe
most of the data observed in practice, viz., the data relating to natural, social,
economic and business phenomena.
i) Curves of Symmetrical Distribution
ii) Moderately Asymmetrical (skewed) frequency distribution
curves
iii) Extremely asymmetrical or J – shaped curves
iv) U – curve
v) Mixed curves
D) “OGIVE” OR CUMULATIVE FREQUENCY CURVE: Ogive, pronounced
as “Ojive”, is a graphic presentation of the cumulative frequency (C.F)
distribution of continuous variable. It consists in plotting the cumulative frequency
(along the Y – axis) against the class boundaries (along the X – axis). Since there
are two types of cumulative frequency distributions, viz., “LESS THAN C.F” and
“MORE THAN C.F”, we accordingly have two types of ogives, viz., (i) Less than
ogive (ii) More than ogive.
(i) Less than Ogive: This consists in plotting the ‘less than’ cumulative
frequencies against the upper class boundaries of the respective classes. The points
so obtained are joined by a smooth free hand curve to give “Less than Ogive”.
Obviously, “less than ogive” is an increasing curve, sloping upwards from left to
right and has the shape of an elongated S.
(ii) More than Ogive: Similarly, in “more than ogive”, the “more than”
cumulative frequencies are plotted against the lower class boundaries of the
respective classes. The points so obtained are joined by a smooth ‘free hand’ curve
to give “more than ogive”. “More than Ogive” is a decreasing curve and slopes
downwards from left to right and has the shape of an elongated S, upside down.
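The points to be plotted for the two ogives can be computed as in the Python sketch below. The class boundaries and frequencies are invented for illustration and the function name is an assumption; the sketch only computes the coordinates and leaves the drawing of the smooth curve to the reader.

```python
from itertools import accumulate

def ogive_points(boundaries, frequencies):
    """Return (x, cumulative frequency) points for the 'less than' and 'more than' ogives."""
    less_than = list(accumulate(frequencies))              # running totals
    total = less_than[-1]
    more_than = [total - c for c in [0] + less_than[:-1]]  # total minus what lies below
    less_points = list(zip(boundaries[1:], less_than))     # against upper class boundaries
    more_points = list(zip(boundaries[:-1], more_than))    # against lower class boundaries
    return less_points, more_points

# Hypothetical distribution: classes 0-10, 10-20, 20-30, 30-40, 40-50.
boundaries = [0, 10, 20, 30, 40, 50]
freqs = [5, 12, 20, 9, 4]

less_points, more_points = ogive_points(boundaries, freqs)
print(less_points)  # [(10, 5), (20, 17), (30, 37), (40, 46), (50, 50)]
print(more_points)  # [(0, 50), (10, 45), (20, 33), (30, 13), (40, 4)]
```

The 'less than' values rise from left to right and the 'more than' values fall, matching the increasing and decreasing shapes described above.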
2) Graphs of Time Series: Time series data are represented geometrically by
means of a time series graph, which is also known as a “Historigram”. The various types
of Time Series graphs are:
i) Horizontal Line Graphs or Historigrams
ii) Silhouette or Net Balance Graphs
iii) Range or Variation Graphs
iv) Components or Band Graphs
TABULATION OF DATA
Meaning and Importance of Tabulation: By Tabulation we mean the systematic
presentation of the information contained in the data, in rows and columns in
accordance with some salient features or characteristics. Rows are horizontal
arrangements and columns are vertical arrangements. In the words of A.M. Tuttle:
“A Statistical table is the logical listing of related quantitative data in vertical
columns and horizontal rows of numbers with sufficient explanatory and qualifying
words, phrases and statements in the form of titles, headings and notes to make clear
the full meaning of data and their origin”.
Professor Bowley, in his manual of statistics, refers to Tabulation as “the
intermediate process between the accumulation of data in whatever form they are
obtained, and the final reasoned account of the result shown by the statistics”.
Tabulation is one of the most important and ingenious devices for presenting
the data in a condensed and readily comprehensible form and attempts to furnish the
maximum information contained in the data in the minimum possible space, without
sacrificing the quality and usefulness of the data. It is an intermediate process between
the collection of the data on one hand and statistical analysis on the other hand. In
fact, Tabulation is the final stage in collection and compilation of the data and forms
the gateway for further statistical analysis and interpretations. Tabulation makes the
data comprehensible and facilitates comparisons (by classifying data into suitable
groups), and the work of further statistical analysis, averaging, correlation, etc. It
makes the data suitable for further Diagrammatic and Graphic representation.
GENERAL RULES FOR CONSTRUCTING A TABLE
The various parts of a table vary from problem to problem depending upon the nature
of the data and the purpose of the investigation. However, the following are a must in
a good statistical table:
1. Table Number
2. Title
3. Head Notes (or) Prefatory Notes
4. Captions and Stubs
5. Body of the Table
6. Foot-Note
7. Source Note
FORMAT OF A BLANK TABLE
Table No: #     TITLE
[Head Note or Prefatory Note (if any)]
Stub Heading | Caption | Total
             | Sub Head | Sub Head
             | Column Head | Column Head | Column Head | Column Head | Column Head
Body
Total
Foot Note:
Source Note:
TYPES OF TABULATION: Tables may be classified on the following bases:
1. Objectives and Scope of the Enquiry.
(i) General Purpose or Reference Table
(ii) Special Purpose or Summary Table
2. Nature of the Enquiry.
(i) Original or Primary Table
(ii) Derived or Derivative Table
3. Extent of Coverage given in the Enquiry.
(i) Simple Table
(ii) Complex Table
SPSS (STATISTICAL PACKAGE FOR THE SOCIAL SCIENCES)
SPSS (Statistical Package for the Social Sciences) has now been in development for
more than thirty years. Originally developed as a programming language for
conducting statistical analysis, it has grown into a complex and powerful application
which now uses both a graphical and a syntactical interface and provides dozens of
functions for managing, analyzing, and presenting data. Its statistical capabilities alone
range from simple percentages to complex analyses of variance, multiple regressions,
and general linear models. You can use data ranging from simple integers/binary
variables to multiple response or logarithmic variables. SPSS also provides extensive
data management functions, along with a complex and powerful programming
language.
STATISTICS PROGRAM
SPSS (originally, Statistical Package for the Social Sciences) was released in its first
version in 1968 after being developed by Norman H. Nie and C. Hadlai Hull. Norman
Nie was then a political science postgraduate at Stanford University, and is now
Research Professor in the Department of Political Science at Stanford and Professor
Emeritus of Political Science at the University of Chicago. SPSS is among the most
widely used programs for statistical analysis in social science. It is used by market
researchers, health researchers, survey companies, government, education researchers,
marketing organizations and others. The original SPSS manual (Nie, Bent & Hull,
1970) has been described as 'Sociology's most influential book'. In addition to
statistical analysis, data management (case selection, file reshaping, creating derived
data) and data documentation (a metadata dictionary is stored in the data file) are
features of the base software.
Statistics included in the base software:
• Descriptive statistics: Cross tabulation, Frequencies, Descriptive, Explore,
Descriptive Ratio Statistics
• Bi-variate statistics: Means, t-test, ANOVA, Correlation (bi-variate, partial,
distances), Nonparametric tests
• Prediction for numerical outcomes: Linear regression
• Prediction for identifying groups: Factor analysis, cluster analysis (two-step,
K-means, hierarchical), Discriminant
The many features of SPSS are accessible via pull-down menus or can be programmed
with a proprietary 4GL command syntax language. Command syntax programming
has the benefits of reproducibility; simplifying repetitive tasks; and handling complex
data manipulations and analyses. Additionally, some complex applications can only be
programmed in syntax and are not accessible through the menu structure. The pull-down
menu interface also generates command syntax, which can be displayed in the
output (though the default settings have to be changed to make the syntax visible to the
user) or pasted into a syntax file using the "paste" button present in each menu.
Programs can be run interactively or unattended using the supplied Production Job
Facility. Additionally a "macro" language can be used to write command language
subroutines and a Python programmability extension can access the information in the
data dictionary and data and dynamically build command syntax programs. The
Python programmability extension, introduced in SPSS 14, replaced the less functional
SAX Basic "scripts" for most purposes, although Sax Basic remains available. In
addition, the Python extension allows SPSS to run any of the statistics in the free
software package R. From version 14 onwards SPSS can be driven externally by a
Python or a VB.NET program using supplied "plug-ins".
SPSS places constraints on internal file structure, data types, data processing and
matching files, which together considerably simplify programming. SPSS datasets
have a 2-dimensional table structure where the rows typically represent cases (such as
individuals or households) and the columns represent measurements (such as age, sex
or household income). Only 2 data types are defined: numeric and text (or "string").
All data processing occurs sequentially case-by-case through the file. Files can be
matched one-to-one and one-to-many, but not many-to-many.
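The data model described above can be sketched in a few lines. The example below is plain Python, not SPSS command syntax; the two small files and the helper name match_one_to_many are hypothetical, introduced only to illustrate a one-to-many match carried out in a single sequential pass through the cases.

```python
# Illustrative only: plain Python, not SPSS syntax. All data are invented.
# Each case is a row; each variable holds either a number or a string.
persons = [
    {"person_id": 1, "household_id": 10, "age": 34, "sex": "F"},
    {"person_id": 2, "household_id": 10, "age": 36, "sex": "M"},
    {"person_id": 3, "household_id": 20, "age": 58, "sex": "F"},
]
households = [
    {"household_id": 10, "income": 52000},
    {"household_id": 20, "income": 31000},
]


def match_one_to_many(cases, table, key):
    """Attach the variables of each table row to every case sharing its key."""
    lookup = {}
    for row in table:
        # The table side must have one row per key; otherwise the match
        # would be many-to-many, which is not allowed.
        if row[key] in lookup:
            raise ValueError(f"duplicate key {row[key]!r} in table file")
        lookup[row[key]] = row
    merged = []
    for case in cases:  # sequential, case-by-case pass through the file
        merged.append({**case, **lookup.get(case[key], {})})
    return merged


merged = match_one_to_many(persons, households, "household_id")
for case in merged:
    print(case)
```

Both persons in household 10 receive the same household income, which is what makes the match one-to-many. Rejecting duplicate keys on the table side mirrors the restriction that many-to-many matching is not supported.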
The graphical user interface has two views which can be toggled by clicking on one of
the two tabs in the bottom left of the SPSS window. The 'Data View' shows a
spreadsheet view of the cases (rows) and variables (columns). Unlike spreadsheets, the
data cells can only contain numbers or text and formulas cannot be stored in these
cells. The 'Variable View' displays the metadata dictionary where each row represents
a variable and shows the variable name, variable label, value label(s), print width,
measurement type and a variety of other characteristics. Cells in both views can be
manually edited, defining the file structure and allowing data entry without using
command syntax. This may be sufficient for small datasets. Larger datasets such as
statistical surveys are more often created in data entry software, or entered during
computer-assisted personal interviewing, by scanning and using optical character
recognition and optical mark recognition software, or by direct capture from online
questionnaires. These datasets are then read into SPSS.
SPSS can read and write data from ASCII text files (including hierarchical files), other
statistics packages, spreadsheets and databases. SPSS can read and write to external
relational database tables via ODBC and SQL.
Statistical output is to a proprietary file format (*.spv file, supporting pivot tables) for
which, in addition to the in-package viewer, a stand-alone reader can be downloaded.
The proprietary output can be exported to text or Microsoft Word. Alternatively,
output can be captured as data (using the OMS command), as text, tab-delimited text,
PDF, XLS, HTML, XML, SPSS dataset or a variety of graphic image formats (JPEG,
PNG, BMP and EMF).
Add-on modules provide additional capabilities. The available modules are:
  • 1. RESEARCH METHODOLOGY & STATISTICAL TOOLS MASTER OF BUSINESS ADMINISTRATION (JNTU) A MATERIAL FOR RESEARCH METHODOLOGY AND STATISTICAL TOOLS (According to JNTU Syllabus) Prepared by, S. Venkata Siva Kumar; MBA (HR/MRKTG), MSc (Statistics). 1
  • 2. RESEARCH METHODOLOGY & STATISTICAL TOOLS UNIT-1 RESEARCH METHODOLOGY: An Introduction Meaning of Research: Research in common parlance refers to a search for knowledge. Once can also define research as a scientific and systematic search for pertinent information on a specific topic. In fact, research is an art of scientific investigation. The advanced Learner’s Dictionary of current English lays down the meaning of research as “a careful investigation or inquiry especially through search for new facts in any branch of knowledge.” Redman and Mory define research as a “systematized effort to gain new knowledge.” Some people consider research as a movement, a movement from the known to unknown. It is actually a voyage of discovery. Research is an academic activity and as such the term should be used in a technical sense. According to “Clifford Woody, Research comprises defining and redefining problems, formulating hypothesis or suggested solutions; collecting, organizing and evaluating data; making deductions and reaching conclusions; and at last carefully testing the conclusions to determine whether they fit the formulating hypothesis. D. Slesinger and M. Stephenson in the encyclopedia of Social Sciences define Research as “the manipulation of things, concepts or symbols for the purpose of generalizing to extend, correct or verify knowledge, whether that knowledge aids in construction of theory or in the practice of an art.” Objectives of Research: The purpose of Research is to discover answers to questions through the application of scientific procedures. The main aim of research is to find out the truth which is hidden and which has not been discovered as yet. Though each research study has its own specific purpose, we may think of research objectives as falling into a number of following broad groupings: 1. 
To gain familiarity with a phenomenon or to achieve new insights into it (studies with this object in view are termed as exploratory or formulative research studies); 2
  • 3. RESEARCH METHODOLOGY & STATISTICAL TOOLS 2. To portray accurately the characteristics of a particular individual, situation or a group (studies with this object in view are known as descriptive research studies); 3. To determine the frequency with which something occurs or with which it is associated with something else (studies with this object in view are known as diagnostic research studies); 4. To test a hypothesis of a casual relationship between variables (such studies are known as hypothesis-testing research studies). Motivation in Research: What makes people to undertake research? This is a question of fundamental importance. The possible motives for doing research may be either one or more of the following: 1. Desire to get a research degree along with its consequential benefits; 2. Desire to face the challenge in solving the unsolved problems, i.e., concern over practical problems initiates research; 3. Desire to get intellectual joy of doing some creative work; 4. Desire to be of service to society. 5. Desire to get Respectability. However, this is not an exhaustive list of factors motivating people to undertake research studies. Many more factors such as directives of government, employment conditions, curiosity about new things, desire to understand casual relationships, social thinking and awakening and the like may as well motivate (or at times compel) people to perform research operations. Types of Research: The basic types of research are as follows: 1. Descriptive Vs. Analytical Research: Descriptive research includes surveys and fact-finding enquiries of different kinds. The major purpose of descriptive research is description of the state of affairs as it exists at present. In social science and business research we quite often use the term Ex post facto research for descriptive research studies. The main characteristic of this 3
  • 4. RESEARCH METHODOLOGY & STATISTICAL TOOLS method is that the researcher has no control over the variables; he can only report what has happened or what is happening. Most ex post facto research projects used for descriptive studies in which the researcher seeks to measure such items as, for example, frequency of shopping, preferences of people, or similar data. Ex post facto studies also include attempts by researchers to discover causes even when they cannot control the variables. The methods of research utilized in descriptive research are survey methods of all kinds, including comparative and co-relational methods. In analytical research, on the other hand, the researcher has to use facts or information already available, and analyze these to make a critical evaluation of the material. 2. Applied Vs Fundamental Research: Research can either be applied (or action) research or fundamental (to basic or pure) research. Applied research aims at finding a solution for an immediate problem facing a society or an industrial/business organization, whereas fundamental research is mainly concerned with generalizations and with the formulation of a theory. “Gathering knowledge for knowledge’s sake is termed as ‘pure’ or ‘basic’ research.” Research concerning some natural phenomenon or relating to pure mathematics are examples of fundamental research. Similarly, research studies, concerning human behavior carried on with a view to make generalizations about human behavior, are also examples of fundamental research, but research aimed at certain conclusion (say, a solution) facing a concrete social or business problem is an example of applied research. Research to identify social, economic or political trends that may affect a particular institution or the copy research or the marketing research or evaluation research are examples of applied research. 
Thus, the central aim of applied research is to discover a solution for some pressing practical problem, whereas basic research is directed towards finding information that has a broad base of applications and thus, adds to the already existing organized body of scientific knowledge. 3. Quantitative Vs Qualitative Research: Quantitative research is based on the measurement of quantity or amount. It is applicable to phenomena that can be expressed in terms of quantity. Qualitative research, on the other hand, is concerned with qualitative phenomenon i.e., phenomena relating to or involving quality or kind. For instance, when we are interested in investigating the reasons for human behavior, we quite often talk of ‘Motivation Research’, 4
  • 5. RESEARCH METHODOLOGY & STATISTICAL TOOLS an important type of qualitative research. This type of research aims at discovering the underlying motives and desires, using in depth interviews for the purpose. Other techniques of such research are word association tests, sentence completion tests, story completion tests and similar other projective techniques. Attitude or opinion research i.e., research designed to find out how people feel or what they think about a particular subject or institution is also qualitative research. Qualitative research is especially important in the behavioral sciences where the aim is to discover the underlying motives of human behavior. Through such research we can analyze the various factors which motivate people to behave in a particular manner or which make people like or dislike a particular thing. It may be stated, that to apply qualitative research in practice is relatively a difficult job and therefore, while doing such research, one should seek guidance from experimental psychologists. 4. Conceptual Vs Empirical Research: Conceptual research is that related to some abstract idea(s) or theory. It is generally used by philosophers and thinkers to develop new concepts or to reinterpret existing ones. On the other hand, empirical research relies on experience or observation alone, often without due regard for system and theory. It is data-based research, coming up with conclusions which are capable of being verified by observation or experiment. We can also call it as experimental type of research. In such a research it is necessary to get at facts first hand, at their source, and actively to go about doing certain things to stimulate the production of desired information. In such a research, the researcher must first provide himself with a working hypothesis or guess as to the probable results. He then works to get enough facts (data) to prove or disprove his hypothesis. 
He then sets up experimental designs which he thinks will manipulate the persons or the materials concerned so far to bring forth the desired information. Such research is thus characterized by the experimenter’s control over the variables under study and his deliberate manipulation of one of them to study its effects. Empirical research is appropriate when proof is sought that certain variables affect other variables in some way. Evidence gathered through experiments or empirical studies is today considered studies are today considered to be the most powerful support possible for a given hypothesis. Nature and Importance of Research: 5
  • 6. RESEARCH METHODOLOGY & STATISTICAL TOOLS “All progress is born of inquiry. Doubt is often better than over-confidence, for it leads to inquiry, and inquiry leads to invention” is famous Hudson Maxim in context of which the significance of research can well be understood. Increased amounts of research make progress possible. Research inculcates scientific and inductive thinking and it promotes the development of logical habits of thinking and organization. The role of research in several fields of applied economics, whether related to business or to the economy as a whole, has greatly increased in modern times. The increasingly complex nature business and government has focused attention on the use of research in solving operational problems. Research, as an aid to economic policy, has gained added importance, both for government ad business. Research provides the basis for nearly all government policies in our economic system. For instance, government’s budgets rests in part on an analysis of the needs and desires of the people and on the availability of revenues to meet these needs. The cost of needs has to be equated to probable revenues and this is a field where research is most needed. Through research we van devise alternative policies and can as well examine the consequences of each of these alternatives. Decision- making may not be a part of research, but research certainly facilitates the decisions of the policy maker. Government has also to chalk out programmes for dealing with all facets of the country’s existence and most of these will be related directly or indirectly to economic conditions. The plight of cultivators, the problems of big and small business and industry, working conditions, trade union activities, the problems of distribution, even the size and nature of defense services are matters requiring research. Thus, research is considered necessary with regard to the allocation of nation’s resources. 
Research has its special significance in solving various operational and planning problems of business and industry. Operations research and market research, along with motivational research, are considered crucial and their results assist, in more than one way, in taking business decisions. Market research is the investigation of the structure and development of a market of the purpose of formulating efficient policies for purchasing, production and sales. Operations research refers to the application of mathematical, logical and analytical techniques to the solution of business problems of cost minimization or of profit maximization or what can be termed as optimization problems. Motivational research of determining why people behave as they do is mainly concerned with market characteristics. 6
  • 7. RESEARCH METHODOLOGY & STATISTICAL TOOLS In addition to what has been stated above, the significance of research can also be understood keeping in view the following points: 1. To those students who are to write a master’s or Ph.D.thesis, research may mean a careerism or a way to attain a high position in the social structure; 2. To professionals in research methodology, research may mean a source of livelihood. 3. To philosophers and thinkers, research may mean the outlet for new ideas and insights; 4. To analysts and intellectuals, research may mean the generalizations of new theories. Thus, research is the fountain of knowledge for the sake of knowledge and an important source of providing guidelines for solving different business, governmental and social problems. It is a sort of formal training which enables one to understand the new developments in one’s field in a battery way. RESEARCH PROCESS: The Research Process consists of series of actions or steps necessary to effectively carry out research and the desired sequencing of these steps. The following order concerning various steps provides a useful procedural guideline regarding the research process: 1. Formulating the Research problem 2. Extensive Literature survey 3. Development of working hypothesis 4. Preparing the Research design 5. Determining the Sample design 6. Collection of data 7. Execution of the project 8. Analysis of data 9. Hypothesis-testing 10. Generalizations and interpretation 11. Preparation of the report or the thesis 1) Formulating the research problem: There are two types of research problems, viz., those which relates to states of nature and those which relate to relationships 7
  • 8. RESEARCH METHODOLOGY & STATISTICAL TOOLS between variables. At the very outset the researcher must single out the problem he wants to study i.e., he must decide the general area of interest or aspect of a subject matter that he would like to inquire into. Initially the problem may be stated in a broad general way and then the ambiguities, if any, relating to the problem be resolved. Then, the feasibility of a particular solution has to be considered before a working formulation of the problem can be set up. The formulation of a general topic into a specific research problem, thus, constitutes the first step in a scientific enquiry. Essentially two steps are involved in formulating the research problem, viz., understanding the problem thoroughly, and rephrasing the same into meaningful terms from an analytical point of view. The best way of understanding the problem is to discuss it with one’s own colleagues or with those having some expertise in the matter. In an academic institution the researcher can seek the help from a guide who is usually an experimented man and has several research problems in mind. Often, the guide puts forth the problem in general terms and it is up to the researcher to narrow it down and phrase the problem in operational terms. In private business units or in governmental organizations, the problem is usually earmarked by the administrative agencies with which the researcher can discuss as to how the problem originally came about and what considerations are involved in its possible solutions. Professor W.A. Neiswanger correctly states that the statement of the objective is of basic importance because it determines the data which are to be collected, the characteristics of the data which are relevant, relations which are to be explored, the choice of techniques to be used in these explorations and the form of the final report. 
If there are certain pertinent terms, the same should be clearly defined along with the task of formulating the problem. In fact, formulation of the problem often follows a sequential pattern where a number of formulations are set up, each formulation more specific than the preceding one, each one phrased in more analytical terms, and each more realistic in terms of the available data and resources. 2) Extensive literature survey: Once the problem is formulated, a brief summary of it should be written down. It is compulsory for a research worker writing a thesis for a Ph.D. degree to write a synopsis of the topic and submit it to the necessary Committee or the Research Board for approval. At this juncture the researcher should 8
  • 9. RESEARCH METHODOLOGY & STATISTICAL TOOLS undertake extensive literature survey connected with the problem. For this purpose, the abstracting and indexing journals and published or unpublished bibliographies are the first place to go to. Academic journals, conference proceedings, government reports, books etc., must be tapped depending on the nature of the problem. In this process, it should be remembered that one source will lead to another. The earlier studies, if any, which are similar to the study in hand, should be carefully studied. A good library will be a great help to the researcher at this stage. 3) Development of working hypothesis: After extensive literature survey, researcher state in clear terms the working hypothesis or hypotheses. Working hypothesis is tentative assumption made in order to draw out and test its logical or empirical consequences. As such the manner in which research hypotheses are developed is particularly important since they provide the focal point for research. They also affect the manner in which tests must be conducted in the analysis of data and indirectly the quality of data which is required for the analysis. In most types of research, the development of working hypothesis plays an important role. Hypothesis should be very specific and limited to the piece of research in hand because it has to be tested. The role of the hypothesis is to guide the researcher by delimiting the area of research and to keep him on the right track. It sharpens his thinking and focuses attention on the more important facets of the problem. It also indicates the type of data required and the type of methods of data analysis to be used. How does one go about developing working hypothesis? 
The answer is by using the following approach: a) Discussions with colleagues and experts about the problem, its origin and the objectives in seeking a solution; b) Examination of data and records, if available, concerning the problem for possible trends, peculiarities and other clues; c) Review of similar studies in the area or of the studies on similar problems; and d) Exploratory personal investigation which involves original field interviews on a limited scale with interested parties and individuals with a view to secure greater insight into the practical aspects of the problem. Thus, working hypothesis arise as a result of a priori thinking about the subject, examination of the available data and material including related studies and the counsel of experts and interested parties. Working hypothesis is more useful when stated in precise and clearly defined terms. It may as well be remembered that 9
occasionally we may encounter a problem where we do not need a working hypothesis, especially in the case of exploratory or formulative researches which do not aim at testing a hypothesis. But as a general rule, specification of a working hypothesis is another basic step of the research process in most research problems.
4) Preparing the research design:
The research problem having been formulated in clear-cut terms, the researcher will be required to prepare a research design, i.e., he will have to state the conceptual structure within which the research would be conducted. The preparation of such a design helps the research to be as efficient as possible, yielding maximal information. In other words, the function of research design is to provide for the collection of relevant evidence with minimal expenditure of effort, time and money. But how all these can be achieved depends mainly on the research purpose. Research purposes may be grouped into four categories, viz.,
a. Exploration
b. Description
c. Diagnosis
d. Experimentation
A flexible research design which provides opportunity for considering many different aspects of a problem is considered appropriate if the purpose of the research study is that of exploration. But when the purpose happens to be an accurate description of a situation or of an association between variables, the suitable design will be one that minimizes bias and maximizes the reliability of the data collected and analyzed.
There are several research designs, such as experimental and non-experimental hypothesis-testing designs. Experimental designs can be either informal designs (such as before-and-after without control, after-only with control, before-and-after with control) or formal designs (such as completely randomized design, randomized block design, Latin square design, simple and complex factorial designs), out of which the researcher must select one for his own project.
The preparation of the research design appropriate for a particular research problem usually involves the consideration of the following:
I. The means of obtaining the information;
II. The availability and skills of the researcher and his staff (if any);
III. Explanation of the way in which the selected means of obtaining information will be organized and the reasoning leading to the selection;
IV. The time available for research; and
V. The cost factor relating to research, i.e., the finance available for the purpose.
5) Determining sample design:
All the items under consideration in any field of inquiry constitute a ‘universe’ or ‘population’. A complete enumeration of all the items in the ‘population’ is known as a census enquiry. It can be presumed that in such an enquiry, when all the items are covered, no element of chance is left and the highest accuracy is obtained. But in practice this may not be true. Even the slightest element of bias in such an enquiry will get larger and larger as the number of observations increases. Moreover, there is no way of checking the element of bias or its extent except through a resurvey or the use of sample checks. Besides, this type of inquiry involves a great deal of time, money and energy. Not only this, a census enquiry is not possible in practice under many circumstances. For instance, blood testing is done only on a sample basis. Hence, quite often we select only a few items from the universe for our study purposes. The items so selected constitute what is technically called a sample.
The researcher must decide the way of selecting a sample, or what is popularly known as the sample design. In other words, a sample design is a definite plan, determined before any data are actually collected, for obtaining a sample from a given population. Thus, the plan to select 12 of a city’s 200 drugstores in a certain way constitutes a sample design. Samples can be either probability samples or non-probability samples. With probability samples each element has a known probability of being included in the sample, but non-probability samples do not allow the researcher to determine this probability. Probability samples are those based on simple random sampling, systematic sampling, stratified sampling and cluster/area sampling, whereas non-probability samples are those based on convenience sampling, judgment sampling and quota sampling techniques. A brief mention of the important sample designs is as follows.
1.
Deliberate sampling: Deliberate sampling is also known as purposive or non-probability sampling. This sampling method involves purposive or deliberate selection of particular units of the universe for constituting a sample which represents the universe. When population elements are selected for inclusion in the sample on the basis of ease of access, it can be called convenience sampling.
2. Simple random sampling: This type of sampling is also known as chance sampling or probability sampling, where each and every item in the population has an equal chance of inclusion in the sample and each one of the possible samples, in the case of a finite universe, has the same probability of being selected.
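The contrast between convenience selection and simple random sampling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the universe of 15,000 serially numbered items, the sample size of 300 and the fixed seed are illustrative choices, and "the first 300 items on the list" stands in for whatever happens to be easiest to reach.

```python
import random

# Assumed, illustrative universe: 15,000 items identified by serial numbers.
universe = list(range(1, 15_001))
sample_size = 300

# A fixed seed is used only so that the draw can be repeated exactly.
rng = random.Random(42)

# Convenience (deliberate) selection: take whatever is easiest to reach,
# represented here by the first 300 items on the list.
convenience_sample = universe[:sample_size]

# Simple random sampling without replacement: every item, and every possible
# sample of 300 items, has the same chance of being drawn.
random_sample = rng.sample(universe, sample_size)

# Chance of inclusion of any single item under simple random sampling: n / N.
inclusion_probability = sample_size / len(universe)
print(inclusion_probability)
```

The convenience sample covers only serial numbers 1 to 300, while the random draw is spread over the whole list; that spread is what allows the probability of inclusion to be known in advance.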
For example, if we have to select a sample of 300 items from a universe of 15,000 items, then we can put the names or numbers of all the 15,000 items on slips of paper and conduct a lottery.
3. Systematic sampling: In some instances the most practical way of sampling is to select every 15th name on a list, every 10th house on one side of a street and so on. Sampling of this type is known as systematic sampling.
4. Stratified sampling: If the population from which a sample is to be drawn does not constitute a homogeneous group, then the stratified sampling technique is applied so as to obtain a representative sample. In this technique, the population is stratified into a number of non-overlapping subpopulations or strata, and sample items are selected from each stratum. If the selection of items from each stratum is based on simple random sampling, the entire procedure, first stratification and then simple random sampling, is known as stratified random sampling.
5. Quota sampling: In stratified sampling the cost of taking random samples from individual strata is often so high that interviewers are simply given quotas to be filled from different strata, the actual selection of items for the sample being left to the interviewer’s judgment. This is called quota sampling.
6. Cluster sampling and area sampling: Cluster sampling involves grouping the population and then selecting the groups or clusters, rather than individual elements, for inclusion in the sample. Suppose some departmental store wishes to sample its credit card holders. It has issued its cards to 15,000 customers. The sample size is to be kept at, say, 450. For a cluster sample this list of 15,000 card holders could be formed into 100 clusters of 150 card holders each. Three clusters might then be selected randomly for the sample.
7. Multi-stage sampling: This is a further development of the idea of cluster sampling.
This technique is meant for big enquiries extending to a considerably large geographical area like an entire country. Under multi-stage sampling the first stage may be to select large primary sampling units such as states, then districts, then towns and finally certain families within towns. If the technique of random sampling is applied at all stages, the sampling procedure is described as multi-stage random sampling.
8. Sequential sampling: This is a somewhat complex sample design where the ultimate size of the sample is not fixed in advance but is determined according
to mathematical decisions on the basis of information yielded as the survey progresses. This design is usually adopted under an acceptance sampling plan in the context of statistical quality control.
6) Collecting the data:
In dealing with any real-life problem it is often found that the data at hand are inadequate, and hence it becomes necessary to collect data that are appropriate. There are several ways of collecting the appropriate data, which differ considerably in terms of money costs, time and other resources at the disposal of the researcher.
Primary data can be collected either through experiment or through survey. If the researcher conducts an experiment, he observes some quantitative measurements, or the data, with the help of which he examines the truth contained in his hypothesis. But in the case of a survey, data can be collected in any one or more of the following ways.
1. By observation
2. Through personal interview
3. Through telephone interviews
4. By mailing of questionnaires
5. Through schedules.
7) Execution of the project:
Execution of the project is a very important step in the research process. If the execution of the project proceeds on correct lines, the data to be collected would be adequate and dependable. The researcher should see that the project is executed in a systematic manner and in time. If the survey is to be conducted by means of structured questionnaires, data can be readily machine-processed. In such a situation, questions as well as the possible answers may be coded. If the data are to be collected through interviewers, arrangements should be made for proper selection and training of the interviewers. The training may be given with the help of instruction manuals which explain clearly the job of the interviewer at each step. Occasional field checks should be made to ensure that the interviewers are doing their assigned job sincerely and efficiently.
A careful watch should be kept for unanticipated factors in order to keep the survey as realistic as possible. This, in other words, means that steps should be taken to ensure that the survey is under statistical control, so that the collected information is in accordance with the pre-defined standard of accuracy. If some of the respondents do not cooperate, some suitable methods should be designed
to tackle this problem. One method of dealing with the non-response problem is to make a list of the non-respondents and take a small sub-sample of them, and then with the help of experts vigorous efforts can be made for securing a response.
8) Analysis of data:
After the data have been collected, the researcher turns to the task of analyzing them. The analysis of data requires a number of closely related operations, such as establishment of categories, the application of these categories to raw data through coding, tabulation and then drawing statistical inferences. The unwieldy data should necessarily be condensed into a few manageable groups and tables for further analysis. Thus the researcher should classify the raw data into some purposeful and usable categories. The coding operation is usually done at this stage, through which the categories of data are transformed into symbols that may be tabulated and counted. Editing is the procedure that improves the quality of the data for coding. With coding, the stage is ready for tabulation. Tabulation is a part of the technical procedure wherein the classified data are put in the form of tables. Mechanical devices can be made use of at this juncture. A great deal of data, especially in large inquiries, is tabulated by computers. Computers not only save time but also make it possible to study a large number of variables affecting a problem simultaneously.
9) Hypothesis-testing:
After analyzing the data as stated above, the researcher is in a position to test the hypothesis, if any, he had formulated earlier. Do the facts support the hypothesis or do they happen to be contrary? This is the usual question which should be answered while testing a hypothesis. Various tests, such as the Chi-square test, t-test and F-test, have been developed by statisticians for the purpose.
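The chain of operations described in steps 8 and 9 (coding, tabulation, then a test) can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the answers are invented, the code books `GROUP_CODES` and `ANSWER_CODES` are hypothetical, and the Chi-square statistic is computed by hand for a test of independence in a 2 x 2 table without any continuity correction.

```python
from collections import Counter

# Invented raw answers: (respondent's area, answer to one yes/no question).
raw = (
    [("urban", "yes")] * 30 + [("urban", "no")] * 20
    + [("rural", "yes")] * 15 + [("rural", "no")] * 35
)

# Coding: categories are transformed into symbols that can be counted.
GROUP_CODES = {"urban": 0, "rural": 1}   # hypothetical code book
ANSWER_CODES = {"yes": 1, "no": 0}       # hypothetical code book
coded = [(GROUP_CODES[g], ANSWER_CODES[a]) for g, a in raw]

# Tabulation: rows are areas (urban, rural), columns are answers (yes, no).
counts = Counter(coded)
table = [[counts[(g, a)] for a in (1, 0)] for g in (0, 1)]


def chi_square(observed_table):
    """Chi-square statistic for independence of rows and columns."""
    row_totals = [sum(row) for row in observed_table]
    col_totals = [sum(col) for col in zip(*observed_table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed_table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


statistic = chi_square(table)
# 3.841 is the 5% critical value of Chi-square with 1 degree of freedom.
reject_independence = statistic > 3.841
print(table, round(statistic, 3), reject_independence)
```

With these invented figures the statistic exceeds the critical value, so the hypothesis that the answer is independent of the area would be rejected at the 5% level.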
The hypothesis may be tested through the use of one or more of such tests, depending upon the nature and object of the research inquiry. Hypothesis-testing will result in either accepting the hypothesis or rejecting it. If the researcher had no hypothesis to start with, generalizations established on the basis of data may be stated as hypotheses to be tested by subsequent researches in times to come.
10) Generalizations and interpretation:
If a hypothesis is tested and upheld several times, it may be possible for the researcher to arrive at a generalization, i.e., to build a theory. As a matter of fact, the real value of research lies in its ability to arrive at certain generalizations. If the researcher had no hypothesis to start with, he might seek to explain his findings on the basis of some theory. This is known as interpretation.
The process of interpretation may quite often trigger off new questions which in turn lead to further researches.
11) Preparation of the report or the thesis:
Finally, the researcher has to prepare the report of what has been done by him. The writing of the report must be done with great care, keeping in view the following:
1. The layout of the report should be as follows:
(i) The preliminary pages;
(ii) The main text; and
(iii) The end matter.
In its preliminary pages the report should carry the title and date, followed by acknowledgements and foreword. Then there should be a table of contents, followed by a list of tables and a list of graphs and charts, if any, given in the report.
The main text of the report should have the following parts:
(a) Introduction: It should contain a clear statement of the objective of the research and an explanation of the methodology adopted in accomplishing the research. The scope of the study, along with its various limitations, should as well be stated in this part.
(b) Summary of findings: After the introduction there would appear a statement of findings and recommendations in non-technical language. If the findings are extensive, they should be summarized.
(c) Main report: The main body of the report should be presented in logical sequence and broken down into readily identifiable sections.
(d) Conclusion: Towards the end of the main text, the researcher should again put down the results of his research clearly and precisely. In fact, it is the final summing up.
At the end of the report, appendices should be listed in respect of all technical data. A bibliography, i.e., a list of books, journals, reports, etc., consulted, should also be given at the end. An index should also be given, especially in a published research report.
2. The report should be written in a concise and objective style in simple language, avoiding vague expressions such as ‘it seems’, ‘there may be’, and the like.
3.
Charts and illustrations in the main report should be used only if they present the information more clearly and forcibly.
4. Calculated ‘confidence limits’ must be mentioned and the various constraints experienced in conducting research operations may as well be stated.
COLLECTION OF DATA
Statistical investigation:
An investigation (or inquiry) means a “search for knowledge”. Statistical investigation means a “search for knowledge with the help of statistical methods”.
Stages of Investigation:
A statistical investigation is a comprehensive process which passes through the following steps:
1. Planning the inquiry
2. Collection of data
3. Editing the data
4. Presentation of data
5. Analysis of data
6. Presentation of final report
Collection of data:
The first step in the conduct of a statistical investigation (or inquiry) is “collection of data”. The sources of data can be represented as follows:
DATA
    INTERNAL DATA
    EXTERNAL DATA
        PRIMARY DATA
        SECONDARY DATA
Internal source: Internal data come from within government and business organizations, which generate them in the form of records of production, purchases, expenses, etc.
External data: When data are collected from outside the organization, they are said to come from an external source. External data can be divided into two types: (i) primary and (ii) secondary.
(i) Primary data: It refers to the statistical material which the investigator originates himself for the purpose of the inquiry in hand. In other words, it is data collected by the investigator for the first time.
(ii) Secondary data: It refers to the statistical material which is not originated by the investigator himself but obtained from someone else’s records. This type of data is generally taken from newspapers, magazines, bulletins, reports, etc.
Methods of collection of primary data: The following methods may be used to collect primary data:
1. Direct personal investigation
2. Indirect personal investigation
3. Information through correspondents
4. Questionnaire method
(a) Questionnaires sent by post
(b) Questionnaires sent through investigators
(1) Direct personal investigation: According to this method, the investigator obtains the data by personal interview or observation. He therefore contacts the source of information directly and personally. He will contact each and every possible source of information.
(2) Indirect personal investigation: According to this method, the investigator contacts third parties or witnesses who are in touch with the information, directly or indirectly, and are capable of supplying the necessary information. This method is generally adopted by government committees to get the views of the people relating to the inquiry.
(3) Information through correspondents: Under this method, the investigator does not collect the information from the persons directly. He appoints local agents in different parts of the area under investigation. These local agents are called “correspondents”. The correspondents collect the information and pass it on to the investigator from time to time.
(4) Questionnaire method: In this method, the necessary information is collected from the respondents through a questionnaire. A questionnaire is a set of questions relating to the inquiry. The information can be collected through questionnaires in two ways.
(i) Questionnaires sent by post: In this case, the questionnaire is sent to a person, who fills in the answers to the various questions asked in it.
(ii) Questionnaires sent through investigators: Under this method, investigators are appointed who contact the persons, get replies to the questions and record them in their own handwriting in the questionnaire form.
Sources of secondary data: Sometimes it is not possible to collect primary information because of limited resources in terms of money, time, etc.; in that situation secondary data are used. This type of data is generally available in magazines, journals, etc. Secondary data can be classified into two categories:
(i) Published data
(ii) Unpublished data
Organization of data: The raw data, in the form of unarranged figures, are collected through primary or secondary sources. The raw data practically give no information, and hence there is a need for organization of data. Organization of data involves the following three stages:
(1) Editing of data
(2) Classification of data
(3) Tabulation of data
(1) Editing of data:
• Editing of data refers to detecting possible errors and irregularities committed during the collection of data.
• If the data are not edited, they may lead to wrong conclusions. Therefore editing is essential to put the data in order.
(2) Classification of data:
• The process of arranging the data in groups or classes according to their common characteristics is technically called classification.
• Classification is the grouping of related facts into classes.
Types of classification: Broadly, data can be classified on the following bases:
1. geographical classification
2. chronological classification
3. conditional classification
4. qualitative classification
5. quantitative classification
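Classification as defined above, the grouping of related facts into classes by a common characteristic, can be illustrated with a short Python sketch. The six records, the area labels and the class width of 10 years are all invented for the illustration; the helper `age_class` is a hypothetical name.

```python
from collections import Counter

# Invented raw data: one record per respondent.
records = [
    {"area": "City A", "age": 23, "literate": True},
    {"area": "City A", "age": 37, "literate": True},
    {"area": "City B", "age": 45, "literate": False},
    {"area": "City B", "age": 29, "literate": True},
    {"area": "City C", "age": 31, "literate": False},
    {"area": "City A", "age": 52, "literate": True},
]


def age_class(age, width=10):
    """Return the class interval (e.g. '20-29') into which an age falls."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Classification by area (a geographical basis).
by_area = Counter(r["area"] for r in records)

# Classification by presence or absence of an attribute (a qualitative basis).
by_literacy = Counter(
    "literate" if r["literate"] else "illiterate" for r in records
)

# Classification by a measurable characteristic (a quantitative basis).
by_age = Counter(age_class(r["age"]) for r in records)

# A simple table: one row per class, one column for the frequency.
for label, count in sorted(by_age.items()):
    print(f"{label:>7} | {count}")
```

The same records give a different table for each basis of classification, which is why the basis has to be fixed before tabulation begins.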
1. Geographical classification: Here data are classified on the basis of geographical area, like village, city, state or region.
2. Chronological classification: Here the classification is done on the basis of time, like hourly, daily, weekly, monthly, etc.
3. Conditional classification: This classification is done on the basis of some condition, such as literacy, intelligence, honesty, beauty, etc.
4. Qualitative classification: Here the data are classified on the basis of some attribute or quality, like literacy, honesty, beauty, intelligence, etc. In this case the basis of classification is either the presence or the absence of a quality.
5. Quantitative classification: When the data are classified on the basis of characteristics which can be measured, such as age, income, marks, height, weight or production, it is called “quantitative classification”.
(3) Tabulation of data:
After the collection and classification of data, the process of tabulation begins. Tabulation is dependent upon classification. Tabulation is necessary in order to make the data understandable and organized. By tabulation we make a systematic arrangement of statistical data in rows and columns. Rows are the horizontal arrangements of data, whereas columns are the vertical arrangements of data. Tabulation tries to give the maximum information contained in the data in the minimum possible space. It is a midway process between the collection of data and statistical analysis.
QUESTIONNAIRE AS A TOOL OF COLLECTING DATA
This method consists in preparing a questionnaire (a list of questions relating to the field of enquiry and providing space for the answers to be filled in by the respondents) which is mailed to the respondents with a request for a quick response within the specified time.
The questionnaire is the only medium of communication between the investigator and the respondents, and as such the questionnaire should be designed or drafted with utmost care and caution so that all the relevant and essential information for the enquiry may be collected without any difficulty, ambiguity or vagueness.
Drafting or Framing the Questionnaire:
Drafting a good questionnaire is a highly specialized job and requires great care, skill, wisdom, efficiency and experience. No hard and fast rules can be laid down for designing or framing a questionnaire. However, in this connection, the following general points may be borne in mind:
1. The size of the questionnaire should be as small as possible. The number of questions should be restricted to the minimum, keeping in view the nature, objectives and scope of the enquiry. In other words, the questionnaire should be concise and should contain only those questions which would furnish all the necessary information relevant for the purpose. Respondents’ time should not be wasted by asking irrelevant and unimportant questions. A large number of questions would involve more work for the investigator and thus result in delay on his part in collecting and submitting the information. These may, in addition, also unnecessarily annoy or tire the respondents. A reasonable questionnaire should contain about 15 to 25 questions. If a still larger number of questions is a must in any enquiry, then the questionnaire should be divided into various sections or parts.
2. The questions should be clear, brief, unambiguous, non-offending, courteous in tone, corroborative in nature and to the point, so that not much scope for guessing is left on the part of the respondents.
3. The questions should be arranged in a natural logical sequence. For example, to find out if a person owns a refrigerator, the logical order of questions would be: “Do you own a refrigerator?” “When did you buy it?” “What is its make?” “How much did it cost you?” “Is its performance satisfactory?” “Have you ever got it serviced?” The logical arrangement of questions, in addition to facilitating tabulation work, would leave no chance for omissions or duplication.
4. The usage of vague and ‘multiple meaning’ words should be avoided.
Vague words like good, bad, efficient, sufficient, prosperity, rarely, frequently, reasonable, poor, rich, etc., should not be used, since these may be interpreted differently by different persons and as such might give unreliable and misleading information. Similarly, words with multiple meanings like price, assets, capital, income, household, democracy, socialism, etc., should not be used unless a clarification of these terms is given in the questionnaire.
5. Questions should be so designed that they are readily comprehensible and easy to answer for the respondents. They should not be tedious nor should
they tax the respondents’ memory. Further, questions involving mathematical calculations like percentages, ratios, etc., should not be asked.
6. Questions of a sensitive and personal nature should be avoided. Questions like “How much money do you owe to private parties?” or “Do you clean your utensils yourself?”, which might hurt the sentiments, pride or prestige of an individual, should not be asked, as far as possible. It is also advisable to avoid questions on which the respondent may be reluctant or unwilling to furnish information. For example, questions pertaining to income, savings, habits, addiction to social evils, age (particularly in the case of ladies), etc., should be asked very tactfully.
7. Types of Questions: Under this head, the questions in the questionnaire may be broadly classified as follows:
a) Shut Questions: In such questions, possible answers are suggested by the framers of the questionnaire and the respondent is required to tick one of them. Shut questions can further be sub-divided into the following forms.
(i) Simple Alternative Questions: In such questions, the respondent has to choose between two clear-cut alternatives like ‘Yes’ or ‘No’; ‘Right’ or ‘Wrong’; ‘Either’ or ‘Or’ and so on. For instance: Do you own a refrigerator? Yes or No. Such questions are also called dichotomous questions. This technique can be applied with elegance to situations where two clear-cut alternatives exist.
(ii) Multiple Choice Questions: Quite often it is not possible to define clear-cut alternatives, and accordingly in such a situation either the first method (Alternative Questions) is not used, or additional answers between ‘Yes’ and ‘No’, like ‘Do not know’, ‘No opinion’, ‘Occasionally’, ‘Casually’, ‘Seldom’, etc., are added. For instance, to find out whether a person smokes or drinks, the following multiple choice answers may be used:
• Do you smoke?
Yes (Regularly) [ ]   No (Never) [ ]   Occasionally [ ]   Seldom [ ]
• Which of the following modes of cooking do you use?
Gas [ ]   Coal (Coke) [ ]   Wood [ ]   Power (Electricity) [ ]   Stove (Kerosene) [ ]
• How do you go to your place of duty?
By bus [ ]   By three wheeler scooter [ ]   By your own vehicle [ ]   By taxi [ ]
By your own scooter [ ]   On foot [ ]   By your own car [ ]   Any other [ ]
Multiple choice questions are very easy and convenient for the respondents to answer. Such questions save time and also facilitate tabulation. This method should be used only if a selected few alternative answers exist to a particular question. Sometimes, a last alternative under the category ‘Others’ or ‘Any other’ may be added. However, multiple answer questions of relatively equal importance to a given question.
b) Open Questions: Open questions are those in which no alternative answers are suggested and the respondents are at liberty to express their frank and independent opinions on the problem in their own words. For instance, ‘What are the drawbacks in our examination system?’; ‘What solution do you suggest to the housing problem in Delhi?’; ‘Which program in the Delhi TV do you like best?’ are some open questions. Since the views of the respondents in open questions might differ widely, it is very difficult to tabulate the diverse opinions and responses.
8) Leading questions should be avoided: For example, the question ‘Why do you use a particular brand of blades, say, Erasmic blades?’ should preferably be framed as two questions:
(i) Which blade do you use?
(ii) Why do you prefer it?
Gives a smooth shave [ ]   Readily available in the market [ ]
Gives more shaves [ ]   Any other [ ]
Price is less (cheaper) [ ]
9) Cross checks: The questionnaire should be so designed as to provide internal checks on the accuracy of the information supplied by the respondents, by including some connected questions at least with respect to matters which are fundamental to the enquiry. For example, in a social survey for finding the age of the mother, the question ‘What is your age?’
can be supplemented by additional questions such as ‘What is your date of birth?’ or ‘What is the age of your eldest child?’ Similarly, the question ‘Age at marriage’ can be supplemented by the question ‘The age of the first child’.
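The cross check between a stated age and a stated date of birth can be sketched in Python as follows. The function names `age_on` and `cross_check`, the survey date and the tolerance of one year are assumptions made only for this illustration; they are not part of any standard survey procedure.

```python
from datetime import date


def age_on(born, on):
    """Completed years of age on a given date."""
    years = on.year - born.year
    # The birthday has not yet been reached in the year of the survey.
    if (on.month, on.day) < (born.month, born.day):
        years -= 1
    return years


def cross_check(stated_age, date_of_birth, survey_date, tolerance=1):
    """True if the stated age agrees with the age implied by the date of birth."""
    implied_age = age_on(date_of_birth, survey_date)
    return abs(stated_age - implied_age) <= tolerance


survey_date = date(2024, 3, 1)          # assumed date of the interview
date_of_birth = date(1990, 6, 15)       # invented answer to the second question

print(cross_check(34, date_of_birth, survey_date))  # answers roughly agree
print(cross_check(40, date_of_birth, survey_date))  # answers conflict
```

A schedule on which the two answers conflict would be set aside for editing or for a follow-up with the respondent rather than tabulated as it stands.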
10) Pre-testing the questionnaire: From a practical point of view it is desirable to try out the questionnaire on a small scale (i.e., on a small cross-section of the population for which the enquiry is intended) before using it for the given enquiry on a large scale. This testing on a small scale (called a pre-test) has been found to be extremely useful in practice. The given questionnaire can be improved or modified in the light of the drawbacks, shortcomings and problems faced by the investigator in the pre-test. Pre-testing also helps to decide upon the effective methods of asking questions for soliciting the requisite information.
11) A covering letter: A covering letter from the organizers of the enquiry should be enclosed along with the questionnaire for the following purposes:
i. It should clearly explain in brief the objectives and scope of the survey to evoke the interest of the respondents and persuade them to render their full co-operation by returning the schedule/questionnaire duly filled in within the specified period.
ii. It should contain a note regarding the operational definitions of the various terms and concepts used in the questionnaire, the units of measurement to be used and the degree of accuracy aimed at.
iii. It should take the respondents into confidence and assure them that the information furnished by them will be kept completely secret and that they will not be harassed in any way later.
iv. In the case of the mailed questionnaire method, a self-addressed stamped envelope should be enclosed to enable the respondents to return the questionnaire after completing it.
v. To ensure a quick and better response, the respondents may be offered awards/incentives in the form of free gifts, coupons, etc.
vi. A copy of the survey report may be promised to the interested respondents.
12) The mode of tabulation and analysis, viz., hand-operated, machine tabulation or computerization, should also be kept in mind while designing the questionnaire.
13) Lastly, the questionnaire should be made attractive by proper layout and an appealing get-up.
We give below two specimen questionnaires for illustration.
A MODEL OF QUESTIONNAIRE IN REGARDS TO CENSUS SURVEY:
We give below the 1971 Census – Individual Slip, which was used for a general purpose survey to collect:
(i) Social and cultural data like nationality, religion, literacy, mother tongue, etc.;
(ii) Exhaustive economic data like occupation, industry, class of worker and activity, if not working;
(iii) Demographic data like relation to the head of the house, sex, age, marital status, birth place, births and deaths and the fertility of women, to assess in particular the performance of the family planning programme.
1971 CENSUS – INDIVIDUAL SLIP
1. Name ……………
2. Relationship to the head of the family ……………
3. Sex ……………
4. Age ……………
5. Marital status ……………
6. For currently married women only:
a) Age at marriage ……………
b) Any child born in the last one year ……………
7. Birth place:
a) Place of birth ……………
b) Rural or urban ……………
c) District ……………
d) State/Country ……………
8. Last Residence:
a) Place of last residence ……………
b) Rural/Urban ……………
c) District ……………
d) State/Country ……………
9. Duration of present residence ……………
10. Religion ……………
11. Scheduled Caste/Tribe ……………
12. Literacy ……………
13. Educational level ……………
14. Mother tongue …………
15. Other languages, if any …………
16. Main activity:
a) Broad category: (i) Worker (ii) Non-worker
b) Place of work (name of village/town) …………
c) Name of establishment …………
d) Name of industry, trade, profession or service …………
e) Description of work …………
f) Class of worker …………
17. Secondary work:
a) Broad category …………
b) Place of work …………
c) Name of establishment …………
d) Nature of industry, trade, profession or service …………
e) Description of work …………
f) Class of worker …………
SCHEDULES AS A TOOL FOR COLLECTING DATA
Before discussing this method it is desirable to make a distinction between a questionnaire and a schedule. As already explained, a questionnaire is a list of questions which are answered by the respondent himself in his own handwriting, while a schedule is a device for obtaining answers to the questions in a form which is filled in by the interviewers or enumerators (the field agents who put these questions) in a face-to-face situation with the respondents. The most widely used method of collection of primary data is the ‘schedules sent through enumerators’. This is so because this method is free from certain shortcomings inherent in the methods discussed so far. In this method the enumerators go to the respondents personally with the schedule (list of questions), ask them the questions therein and record their replies. This method is generally used by big business houses, large public enterprises and research institutions like the National Council of Applied Economic Research (NCAER), the Federation of Indian Chambers of Commerce and Industry (FICCI) and so on, and even by the governments – state or
central – for certain projects and investigations where a high degree of response is desired. Population censuses all over the world are conducted by this technique.
Merits:
1. The enumerators can explain in detail the objectives and aims of the enquiry to the informants and impress upon them the need and utility of furnishing the correct information.
2. This technique is very useful in extensive enquiries and generally yields fairly dependable and reliable results because the information is recorded by highly trained and educated enumerators.
3. Unlike the ‘Questionnaire method’, this technique can be used with advantage even if the respondents are illiterate.
4. As already pointed out under ‘direct personal investigation’, due to personal likes and dislikes different people react differently to different questions, and some people might react very sharply to certain sensitive and personal questions. An enumerator who is present in person can handle such reactions with tact.
Demerits:
1. It is a fairly expensive method, since the team of enumerators has to be paid for its services, and as such it can be used only by those bodies or institutions which are financially sound.
2. It is also more time-consuming as compared with the ‘Questionnaire method’.
3. The success of the method largely depends upon the efficiency and skill of the enumerators who collect the information. The enumerators have to be trained properly in the art of collecting correct information through their intelligence, insight, patience, perseverance, diplomacy and courage. They should clearly understand the aims and objectives of the enquiry and also the implications of the various terms, definitions and concepts used in the schedule.
4. Due to inherent variation in the individual personalities of the enumerators there is bound to be variation, though not so obvious, in the information
recorded by different enumerators. An attempt should be made to minimize this variation.
5. The success of this method also depends to a great extent on the efficiency and wisdom with which the schedule is prepared or drafted. If the schedule is framed haphazardly and incompetently, the enumerators will find it very difficult to get complete and correct information from the respondents.
SAMPLE DESIGN AND SAMPLING PROCEDURES
SAMPLE DESIGN:
A sample design is a definite plan for obtaining a sample from a given population. It refers to the technique or the procedure the researcher would adopt in selecting items for the sample. A sample design may as well lay down the number of items to be included in the sample, i.e., the size of the sample. The sample design is determined before data are collected. There are many sample designs from which a researcher can choose. Some designs are relatively more precise and easier to apply than others. The researcher must select or prepare a sample design which is reliable and appropriate for his research study.
STEPS IN SAMPLE DESIGN:
While developing a sample design, the researcher must pay attention to the following points:
1. Type of universe: The first step in developing a sample design is to clearly define the set of objects, technically called the universe, to be studied. The universe can be finite or infinite. In a finite universe the number of items is certain, but in the case of an infinite universe the number of items is infinite, i.e., we cannot have any idea about the total number of items. The population of a city, the number of workers in a factory and the like are examples of finite universes, whereas the number of stars in the sky, the listeners of a specific radio programme, the throws of a die, etc., are examples of infinite universes.
2. Sampling unit: A decision has to be taken concerning the sampling unit before selecting the sample. The sampling unit may be a geographical one such as a state, district or village, or a construction unit such as a house or flat, or a social unit such as a family, club or school, or it may be an individual. The
researcher will have to decide on one or more of such units that he has to select for his study.
3. Source list: It is also known as the ‘sampling frame’, from which the sample is to be drawn. It contains the names of all items of a universe (in the case of a finite universe only). If a source list is not available, the researcher has to prepare it. Such a list should be comprehensive, correct, reliable and appropriate. It is extremely important for the source list to be as representative of the population as possible.
4. Size of sample: This refers to the number of items to be selected from the universe to constitute a sample. This is a major problem before a researcher. The size of the sample should be neither excessively large nor too small. It should be optimum. An optimum sample is one which fulfills the requirements of efficiency, representativeness, reliability and flexibility. While deciding the size of the sample, the researcher must determine the desired precision as well as an acceptable confidence level for the estimate.
5. Parameters of interest: In determining the sample design, one must consider the specific population parameters which are of interest. For instance, we may be interested in estimating the proportion of persons with some characteristic in the population, or we may be interested in knowing some average or other measure concerning the population. There may also be important sub-groups in the population about whom we would like to make estimates. All this has a strong impact upon the sample design we would accept.
6. Budgetary constraint: Cost considerations, from a practical point of view, have a major impact upon decisions relating not only to the size of the sample but also to the type of sample. This fact can even lead to the use of a non-probability sample.
7. Sampling procedure: Finally, the researcher must decide the type of sample he will use, i.e., he must decide about the technique to be used in selecting the items for the sample. In fact, this technique or procedure stands for the sample design itself. There are several sample designs out of which the researcher must choose one for his study. Obviously, he must select that design which, for a given sample size and for a given cost, has a smaller sampling error.
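The link between sample size, desired precision and confidence level (point 4 above) can be illustrated with a small calculation. The sketch below is not taken from this material: it assumes simple random sampling and the commonly used formula n = z²p(1 − p)/e² for estimating a proportion, with an optional finite population correction. The function name and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(precision, confidence, p=0.5, population=None):
    """Approximate sample size for estimating a proportion.

    Assumption (not from the text): simple random sampling and the
    standard formula n = z^2 * p * (1 - p) / e^2.
    precision  -- acceptable margin of error e, e.g. 0.05
    confidence -- confidence level, e.g. 0.95
    p          -- anticipated proportion; 0.5 gives the largest sample
    population -- size of a finite universe, if known
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n = (z ** 2) * p * (1 - p) / precision ** 2
    if population is not None:
        # Finite population correction reduces the required size.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


print(sample_size_for_proportion(precision=0.05, confidence=0.95))
print(sample_size_for_proportion(precision=0.05, confidence=0.95, population=20000))
```

A tighter precision or a higher confidence level increases the required size, which is where the budgetary constraint of point 6 begins to bite.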
CHARACTERISTICS OF GOOD SAMPLE DESIGN:
From what has been stated above, we can list the characteristics of a good sample design as under:
a) The sample design must result in a truly representative sample.
b) The sample design must be such that it results in a small sampling error.
c) The sample design must be viable in the context of the funds available for the research study.
d) The sample design must be such that systematic bias can be controlled in a better way.
e) The sample should be such that the results of the sample study can be applied, in general, to the universe with a reasonable level of confidence.
CRITERIA OF SELECTING A SAMPLING PROCEDURE:
In this context one must remember that two costs are involved in a sampling analysis, viz., the cost of collecting the data and the cost of an incorrect inference resulting from the data. The researcher must keep in view the two causes of incorrect inferences, viz., systematic bias and sampling error. Systematic bias results from errors in the sampling procedures, and it cannot be reduced or eliminated by increasing the sample size. At best the causes responsible for these errors can be detected and corrected. Usually a systematic bias is the result of one or more of the following factors.
1) Inappropriate frame: If the sampling frame is inappropriate, i.e., a biased representation of the universe, it will result in a systematic bias.
2) Defective measuring device: If the measuring device is constantly in error, it will result in systematic bias. In survey work, systematic bias can result if the questionnaire or the interviewer is biased. Similarly, if the physical measuring device is defective there will be systematic bias in the data collected through such a measuring device.
3) Non-respondents: If we are unable to sample all the individuals initially included in the sample, there may arise a systematic bias. The reason is that in such a situation the likelihood of establishing contact or receiving a response from an individual is often correlated with the measure of what is to be estimated.
4) Indeterminacy principle: Sometimes we find that individuals act differently when kept under observation than they do when kept in non-observed situations. For instance, if workers are aware that somebody is observing them in the course of a work study, on the basis of which the average length of time to complete a task will be determined and the quota for piece work set accordingly, they generally tend to work slowly in comparison to the speed with which they work if kept unobserved. Thus, the indeterminacy principle may also be a cause of a systematic bias.
5) Natural bias in the reporting of data: Natural bias of respondents in the reporting of data is often the cause of a systematic bias in many inquiries. There is usually a downward bias in the income data collected by the government taxation department, whereas we find an upward bias in the income data collected by some social organization. People in general understate their incomes if asked about them for tax purposes, but they overstate the same if asked for the sake of social status or affluence. Generally in psychological surveys, people tend to give what they think is the ‘correct’ answer rather than revealing their true feelings.
DIFFERENT TYPES OF SAMPLE DESIGNS:
There are different types of sample designs based on two factors, viz., the representation basis and the element selection technique. On the representation basis, the sample may be a probability sample or a non-probability sample. Probability sampling is based on the concept of random selection, whereas non-probability sampling is ‘non-random’ sampling. On the element selection basis, the sample may be either unrestricted or restricted. When each sample element is drawn individually from the population at large, the sample so drawn is known as an ‘unrestricted sample’, whereas all other forms of sampling are covered under the term ‘restricted sampling’. The following chart exhibits the sample designs as explained above.
Non-probability sampling: Non-probability sampling is that sampling procedure which does not afford any basis for estimating the probability that each item in the population has of being included in the sample. Non-probability sampling is also known by different names such as deliberate sampling, purposive sampling and judgment sampling. In this type of sampling, items for the sample are selected deliberately by the researcher; his choice concerning the items remains supreme. In other words, under non-probability sampling the organizers of the inquiry purposively
choose the particular units of the universe for constituting a sample on the basis that the small mass that they so select out of a huge one will be typical or representative of the whole. For instance, if the economic conditions of people living in a state are to be studied, a few towns and villages may be purposively selected for intensive study on the principle that they can be representative of the entire state. Thus, the judgment of the organizers of the study plays an important part in this sampling design.
Quota sampling: It is also an example of non-probability sampling. Under quota sampling the interviewers are simply given quotas to be filled from the different strata, with some restrictions on how they are to be filled. In other words, the actual selection of the items for the sample is left to the interviewer’s discretion. This type of sampling is very convenient and is relatively inexpensive. But the samples so selected certainly do not possess the characteristics of random samples. Quota samples are essentially judgment samples, and inferences drawn on their basis are not amenable to statistical treatment in a formal way.
Probability sampling: Probability sampling is also known as ‘random sampling’ or ‘chance sampling’. Under this sampling design, every item of the universe has an equal chance of inclusion in the sample. It is, so to say, a lottery method in which individual units are picked from the whole group not deliberately but by some mechanical process. Here it is blind chance alone that determines whether one item or the other is selected. The results obtained from probability or random sampling can be assessed in terms of probability, i.e., we can measure the errors of estimation or the significance of results obtained from a random sample, and this fact brings out the superiority of the random sampling design over the deliberate sampling design. Random sampling rests on the Law of Statistical Regularity, which states that if the sample chosen is a random one, it will on average have the same composition and characteristics as the universe. This is the reason why random sampling is considered the best technique for selecting a representative sample.
Random sampling from a finite population refers to that method of sample selection which gives each possible sample combination an equal probability of being picked and each item in the entire population an equal chance of being included in the sample. This applies to sampling without replacement, i.e., once an item is selected for the sample, it cannot appear in the sample again (sampling with replacement is used less
frequently; in that procedure the element selected for the sample is returned to the population before the next element is selected, so the same element could appear twice in the same sample). In brief, the implications of random sampling (or simple random sampling) are:
(a) It gives each element in the population an equal probability of getting into the sample, and all choices are independent of one another.
(b) It gives each possible sample combination an equal probability of being chosen.
COMPLEX RANDOM SAMPLING DESIGNS:
Probability sampling under restricted sampling techniques, as stated above, may result in complex random sampling designs. Such designs may as well be called ‘mixed sampling designs’, for many of them represent a combination of probability and non-probability sampling procedures in selecting a sample. Some of the popular complex random sampling designs are as follows:
(i) Systematic Sampling: In some instances, the most practical way of sampling is to select every i-th item on a list. Sampling of this type is known as systematic sampling. An element of randomness is introduced into this kind of sampling by using random numbers to pick the unit with which to start. For instance, if a 4 percent sample is desired, the first item would be selected randomly from the first twenty-five and thereafter every 25th item would automatically be included in the sample. Thus, in systematic sampling only the first unit is selected randomly and the remaining units of the sample are selected at fixed intervals. Although a systematic sample is not a random sample in the strict sense of the term, it is often considered reasonable to treat a systematic sample as if it were a random sample.
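The 4 percent example under systematic sampling can be sketched in a few lines. This is only an illustration under stated assumptions: the population is a hypothetical list of 1000 numbered items, and the function name systematic_sample is our own.

```python
import random


def systematic_sample(items, interval, rng=None):
    """Select a random start among the first `interval` items,
    then take every `interval`-th item after it."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    rng = rng if rng is not None else random.Random()
    start = rng.randrange(interval)  # only this first unit is chosen at random
    return items[start::interval]    # the rest follow at a fixed interval


# Hypothetical list of 1000 numbered items; every 25th item gives a 4 percent sample.
population = list(range(1, 1001))
sample = systematic_sample(population, interval=25)
print(len(sample))   # 40 items, i.e. 4 percent of 1000
print(sample[:3])    # the first three selected items, 25 apart
```

Because only the starting point is random, a list with a periodic pattern that matches the interval would distort the sample; this is the sense in which a systematic sample is not strictly a random one.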
(ii) Stratified Sampling: If a population from which a sample is to be drawn does not constitute a homogeneous group, the stratified sampling technique is generally applied in order to obtain a representative sample. Under stratified sampling the population is divided into several sub-populations that are individually more homogeneous than the total population (the different sub-populations are called ‘strata’), and then we select items from each stratum to constitute a sample. Since each stratum is more homogeneous than the total population, we are able to get more precise estimates for each stratum, and by estimating each of the component parts more accurately, we get a
better estimate of the whole. In brief, stratified sampling results in more reliable and detailed information.
(iii) Cluster Sampling: If the total area of interest happens to be a big one, a convenient way in which a sample can be taken is to divide the area into a number of smaller non-overlapping areas and then to randomly select a number of these smaller areas (usually called clusters), with the ultimate sample consisting of all (or samples of) units in these small areas or clusters. Thus in cluster sampling the total population is divided into a number of relatively small subdivisions which are themselves clusters of still smaller units, and then some of these clusters are randomly selected for inclusion in the overall sample. Suppose we want to estimate the proportion of machine parts in an inventory which are defective. Also assume that there are 20000 machine parts in the inventory at a given point of time, stored in 400 cases of 50 each. Now, using cluster sampling, we would consider the 400 cases as clusters, randomly select ‘n’ cases and examine all the machine parts in each randomly selected case. Cluster sampling, no doubt, reduces cost by concentrating surveys in selected clusters. But certainly it is less precise than random sampling. There is also not as much information in ‘n’ observations within a cluster as there happens to be in ‘n’ randomly drawn observations. Cluster sampling is used only because of the economic advantage it possesses; estimates based on cluster samples are usually more reliable per unit cost.
(iv) Area Sampling: If clusters happen to be geographic subdivisions, cluster sampling is better known as area sampling. In other words, cluster designs where the primary sampling unit represents a cluster of units based on geographic area are distinguished as area sampling. The plus and minus points of cluster sampling are also applicable to area sampling.
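The machine-parts illustration given under cluster sampling can be sketched as follows. The inventory here is simulated: the 5 percent defect rate, the choice of 20 cases and the function name are assumptions made purely for illustration and do not come from the text.

```python
import random


def cluster_sample_proportion(cases, n_cases, rng=None):
    """Randomly select n_cases whole cases (clusters), inspect every part
    in them, and return the estimated proportion defective and the
    indices of the chosen cases."""
    rng = rng if rng is not None else random.Random()
    chosen = rng.sample(range(len(cases)), n_cases)  # clusters drawn without replacement
    inspected = [part for idx in chosen for part in cases[idx]]
    return sum(inspected) / len(inspected), chosen


# Simulated inventory (assumption): 400 cases of 50 parts each,
# every part defective (True) with probability 0.05.
rng = random.Random(42)
cases = [[rng.random() < 0.05 for _ in range(50)] for _ in range(400)]

estimate, chosen = cluster_sample_proportion(cases, n_cases=20, rng=rng)
print(f"cases inspected: {len(chosen)}, parts inspected: {len(chosen) * 50}")
print(f"estimated proportion defective: {estimate:.3f}")
```

Only 20 cases need to be opened to inspect 1000 parts, which is the cost advantage described above; the price is that parts within one case may resemble each other more than parts drawn at random from the whole inventory.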
(v) Multi-stage Sampling: Multi-stage sampling is a further development of the principle of cluster sampling. Suppose we want to investigate the working efficiency of nationalized banks in India and we want to take a sample of a few banks for this purpose. The first stage is to select large primary sampling units such as the states in a country. Then we may select certain districts and interview all banks in the chosen
districts. This would represent a two-stage sampling design with the ultimate sampling units being clusters of districts. If, instead of taking a census of all banks within the selected districts, we select certain towns and interview all banks in the chosen towns, this would represent a three-stage sampling design. If, instead of taking a census of all banks within the selected towns, we randomly sample banks from each selected town, then it is a case of using a four-stage sampling plan. If we select randomly at all stages, we will have what is known as a ‘multi-stage random sampling design’. Ordinarily multi-stage sampling is applied in inquiries extending to a considerably large geographical area, say, the entire country. There are two advantages of this sampling design, viz.,
(a) It is easier to administer than most single-stage designs, mainly because the sampling frame under multi-stage sampling is developed in partial units.
(b) A large number of units can be sampled for a given cost under multi-stage sampling because of sequential clustering, whereas this is not possible in most of the simple designs.
(vi) Sampling with probability proportional to size: In case the cluster sampling units do not have the same number or approximately the same number of elements, it is considered appropriate to use a random selection process where the probability of each cluster being included in the sample is proportional to the size of the cluster. For this purpose, we have to list the number of elements in each cluster irrespective of the method of ordering the clusters. Then we must sample systematically the appropriate number of elements from the cumulative totals.
(vii) Sequential Sampling: This is a somewhat complex sample design. The ultimate size of the sample under this technique is not fixed in advance, but is determined according to mathematical decision rules on the basis of information yielded as the survey progresses. This is usually adopted in the case of an acceptance sampling plan in the context of statistical quality control. When a particular lot is to be accepted or rejected on the basis of a single sample, it is known as single sampling; when the decision is to be taken on the basis of two samples, it is known as double sampling; and in case the decision rests on the basis of more than two samples but the number of samples is certain and decided in advance, the sampling is known as the
multiple sampling. But when the number of samples is more than two and is neither certain nor decided in advance, this type of system is often referred to as sequential sampling.
DIAGRAMMATIC PRESENTATION OF DATA
General Rules for Constructing Diagrams:
(1) Neatness: Diagrams are visual aids for the presentation of statistical data; they are more appealing and fascinating to the eye and leave a lasting impression on the mind. It is, therefore, imperative that they are made very neat, clean and attractive by proper size and lettering, and by the use of appropriate devices like different colours, different shades (light and dark), dots, dashes, dotted lines, broken lines, dot-and-dash lines, etc., for filling the in-between space of the bars, rectangles, circles, etc., and their components.
(2) Title and Footnotes: As in the case of a good statistical table, each diagram should be given a suitable title to indicate the subject-matter and the various facts depicted in the diagram. The title should be brief, clear and self-explanatory. If necessary, footnotes may be given at the left-hand bottom of the diagram to explain certain points or facts not otherwise covered in the title.
(3) Selection of Scale: One of the most important factors in the construction of diagrams is the choice of an appropriate scale. The same set of numerical data, if plotted on different scales, may give diagrams differing widely in size and at times might lead to wrong and misleading interpretations. Hence, the scale should be selected with great caution.
(4) Proportion between Width and Height: A proper proportion between the dimensions (height and width) of the diagram should be maintained, consistent with the space available.
(5) Choice of a Diagram: A large number of diagrams are used to present statistical data. The choice of a particular diagram to present a given set of numerical data is not an easy one. It primarily depends on the nature of the data, the magnitude of the observations and the type of people for whom the diagrams
are meant, and requires a great amount of expertise, skill and intelligence. An inappropriate choice of diagram for the given set of data might give a distorted picture of the phenomenon under study and might lead to wrong and fallacious interpretations and conclusions.
(6) Source Note and Number: As in the case of tables, a source note should, wherever possible, be appended at the bottom of the diagram. This is necessary because, to a learned audience, the reliability of the information varies from source to source. Each diagram should also be given a number for ready reference and comparative study.
(7) Index: A brief index explaining the various types of shades, colours, lines and designs used in the construction of the diagram should be given for a clear understanding of the diagram.
(8) Simplicity: Lastly, diagrams should be as simple as possible so that they are easily understood even by a layman who does not have any mathematical or statistical background. If too much information is presented in a single complex diagram it will be difficult to grasp and might even become confusing to the mind. Hence, it is advisable to draw several simple diagrams rather than one or two complex diagrams.
TYPES OF DIAGRAMS:
A large variety of diagrammatic devices are used in practice to present statistical data. However, we shall discuss here only some of the most commonly used diagrams, which may be broadly classified as follows:
(1) One-dimensional diagrams
(2) Two-dimensional diagrams
(3) Three-dimensional diagrams
(4) Pictograms
(5) Cartograms
1) One-Dimensional Diagrams: One-dimensional diagrams are classified into two types. They are:
a) Line diagrams
b) Bar diagrams
a) Line Diagram: This is the simplest of all the diagrams. It consists in drawing vertical lines, the height of each line being equal to the frequency. The variate (x) values are presented on a suitable scale along the X-axis and the corresponding frequencies are presented on a suitable scale along the Y-axis. Line diagrams facilitate comparisons, though they are not attractive or appealing to the eye.
[Figure: line diagram; series East, West and North plotted for 1st Qtr to 4th Qtr]
b) Bar Diagram: Bar diagrams are one of the easiest and most commonly used devices for presenting business and economic data. They are especially satisfactory for categorical data or series. They consist of a group of equidistant rectangles, one for each group or category of the data, in which the values or magnitudes are represented by the length or height of the rectangles, the width of the rectangles being arbitrary and immaterial. These diagrams are called one-dimensional because only one dimension, viz., the height or length of the rectangles, is taken into account to present the given values. There are various types of bar diagrams. They are listed as follows:
(i) Simple bar diagram
(ii) Sub-divided or component bar diagram
(iii) Percentage bar diagram
(iv) Multiple bar diagram
(v) Deviation or bilateral bar diagram
[Figure: bar diagram; series East, West and North plotted for 1st Qtr to 4th Qtr]
2) Two-Dimensional Diagrams: The line and bar diagrams discussed so far are one-dimensional diagrams, since the magnitudes of the observations are represented by only one of the dimensions, viz., the height (length) of the bars, while the width of the bars is arbitrary and uniform. In two-dimensional diagrams, however, the magnitudes of the given observations are represented by the area of the diagram. Thus, in the case of two-dimensional bar diagrams, the length as well as the width of the bars will have to be considered. Two-dimensional diagrams are also known as ‘area diagrams’ or ‘surface diagrams’. Some of the commonly used two-dimensional diagrams are:
• Rectangles
• Squares
• Circles
• Angular or pie diagrams
3) Three-Dimensional Diagrams: Three-dimensional diagrams, also termed ‘volume diagrams’, are those in which three dimensions, viz., length, breadth and height, are taken into account. They are constructed so that the given magnitudes are represented by the volumes of the corresponding diagrams. The common forms
of such diagrams are cubes, spheres, cylinders, blocks, etc. These diagrams are especially useful if there are very wide variations between the smallest and the largest magnitudes to be represented. Of the various three-dimensional diagrams, cubes are the simplest and most commonly used devices for diagrammatic presentation of data.
4) Pictograms: The pictogram is a technique of presenting statistical data through appropriate pictures and is one of the very popular devices, particularly when the statistical facts are to be presented to a layman without any mathematical background. In this technique, the magnitudes of the particular phenomenon under study are presented through appropriate pictures, the number of pictures drawn or the size of the pictures being proportional to the values of the different magnitudes to be presented. Pictures are more attractive and appealing to the eye and leave a lasting impression on the mind. Accordingly they are extensively used by government and private institutions for diagrammatic presentation of data relating to a variety of social, business or economic phenomena, primarily for display to the general public in fairs and exhibitions.
5) Cartograms: In cartograms, statistical facts are presented through maps accompanied by various types of diagrammatic representation. They are specially used to depict quantitative facts on a regional or geographical basis, e.g., the population density of different states in a country or of different countries in the world, or the distribution of rainfall in different regions of a country, can be shown with the help of maps or cartograms. The different regions or geographical zones are depicted on a map, and the quantities or magnitudes in the regions may be shown by dots, different shades or colours, etc., or by placing bars or pictograms in each region, or by writing the magnitudes to be represented in the respective regions. Cartograms are simple and elementary forms of visual presentation and are easy to understand. They are generally used when regional or geographic comparisons are to be highlighted.
GRAPHIC REPRESENTATION OF DATA
Diagrams are primarily used for comparative studies and cannot be used to study the relationship between the variables under study. This is done through graphs. Diagrams furnish only approximate information and are not of much utility to a statistician from the analysis point of view. On the other hand, graphs are more obvious, precise and accurate than diagrams and can be effectively used for further statistical
  • 40. RESEARCH METHODOLOGY & STATISTICAL TOOLS analysis, viz., to study slopes, rates of change and for forecasting wherever possible. Before discussing these graphs we shall briefly describe the technique of constructing graphs and the general rules for drawing graphs.
TECHNIQUE OF CONSTRUCTION OF GRAPHS:
[Figure: the four quadrants of the coordinate plane, with both axes marked from -5 to 5. Quadrant I: X positive, Y positive (+X, +Y). Quadrant II: X negative, Y positive (-X, +Y). Quadrant III: X negative, Y negative (-X, -Y). Quadrant IV: X positive, Y negative (+X, -Y).]
Graphs are drawn on a special type of paper known as “graph paper”, which has a fine network of horizontal and vertical lines: thick lines mark each division of a centimetre or an inch, and thin lines mark the smaller parts of the same. In a graph of any size, two straight lines are drawn at right angles to each other, intersecting at a point ‘O’ which is known as the origin or zero of reference. The two lines are known as the co-ordinate axes. The horizontal line is called the X-axis and is denoted by X’OX. The vertical line is called the Y-axis and is usually denoted by YOY’. The graph is thus divided into four sections, known as the four quadrants.
General Rules for Graphing: The following guidelines may be kept in mind for drawing effective and accurate graphs.
1. Neatness
2. Title and Footnote
3. Structural Framework
4. Scale
  • 41. RESEARCH METHODOLOGY & STATISTICAL TOOLS
5. False Base Line
6. Ratio or Logarithmic Scale
7. Line designs
8. Source Note and Number
9. Index
10. Simplicity
TYPES OF GRAPHS:
A large number of graphs are used in practice, but they can be broadly classified under the following two heads:
(i) Graphs of frequency distributions.
(ii) Graphs of time series.
1) Graphs of Frequency Distributions: The reasons and the guiding principles for the graphic representation of frequency distributions are precisely the same as for the diagrammatic and graphic representation of other types of data. Frequency graphs are designed to reveal clearly the characteristic features of frequency data. Such graphs are more appealing to the eye than tabulated data and are readily perceptible to the mind. They facilitate the comparative study of two or more frequency distributions regarding their shape and pattern. The most commonly used graphs for charting a frequency distribution are:
A) Histogram
B) Frequency Polygon
C) Frequency Curve
D) “Ogive” or Cumulative Frequency Curve
The choice of a particular graph for a given frequency distribution largely depends on the nature of the distribution, viz., discrete or continuous.
A) HISTOGRAM: It is one of the most popular and commonly used devices for charting a continuous frequency distribution. It consists of a series of adjacent vertical rectangles erected on sections of the horizontal axis (X-axis), with bases equal to the widths of the corresponding class intervals and heights chosen so that the areas of the rectangles are proportional to the frequencies of the corresponding classes. The histogram can be constructed in two cases:
Case (i): Histogram with equal classes.
Case (ii): Histogram with unequal classes.
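When the classes are of unequal width, using the raw frequencies as heights would distort the areas. A minimal Python sketch of the usual adjustment is given here, in which each height is taken as the frequency divided by the class width (the frequency density), so that area equals frequency. The function name and the sample figures are illustrative assumptions and do not come from these notes.

```python
def histogram_heights(boundaries, frequencies):
    """Return rectangle heights so that height * class width equals frequency."""
    if len(boundaries) != len(frequencies) + 1:
        raise ValueError("need one more class boundary than frequencies")
    heights = []
    for lower, upper, freq in zip(boundaries, boundaries[1:], frequencies):
        width = upper - lower
        if width <= 0:
            raise ValueError("class boundaries must be strictly increasing")
        heights.append(freq / width)  # frequency density of the class
    return heights


# Hypothetical data: classes 0-10, 10-20, 20-40, 40-80 (unequal widths).
boundaries = [0, 10, 20, 40, 80]
frequencies = [5, 12, 16, 8]
heights = histogram_heights(boundaries, frequencies)
print(heights)
```

With equal classes the widths are all the same, so the heights are simply proportional to the frequencies and no adjustment is needed.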
  • 42. RESEARCH METHODOLOGY & STATISTICAL TOOLS
B) FREQUENCY POLYGON: The frequency polygon is another device for the graphic presentation of a frequency distribution (continuous, grouped or discrete). In the case of a discrete frequency distribution, the frequency polygon is obtained by plotting the frequencies on the vertical axis (Y-axis) against the corresponding values of the variable on the horizontal axis (X-axis) and joining the points so obtained by straight lines.
C) FREQUENCY CURVE: A frequency curve is a smooth freehand curve drawn through the vertices of a frequency polygon. The object of smoothing the frequency polygon is to eliminate, as far as possible, the random or erratic fluctuations that might be present in the data. The area enclosed by the frequency curve is the same as that of the histogram or frequency polygon, but its shape is smooth, without sharp edges. The frequency curve may be regarded as the limiting form of the frequency polygon as the number of observations (total frequency) becomes very large and the class intervals are made smaller and smaller.
Types of frequency curves: Though different types of data may give rise to a variety of frequency curves, we shall list below only some of the important curves which, in general, describe most of the data observed in practice, viz., data relating to natural, social, economic and business phenomena.
i) Curves of symmetrical distribution
ii) Moderately asymmetrical (skewed) frequency distribution curves
iii) Extremely asymmetrical or J-shaped curves
iv) U-curve
v) Mixed curves
D) “OGIVE” OR CUMULATIVE FREQUENCY CURVE: Ogive, pronounced as “Ojive”, is a graphic presentation of the cumulative frequency (C.F.) distribution of a continuous variable. It consists of plotting the cumulative frequency (along the Y-axis) against the class boundaries (along the X-axis). Since there are two types of cumulative frequency distributions, viz., “LESS THAN C.F.” and “MORE THAN C.F.”, we have accordingly two types of ogives, viz., (i) Less than ogive (ii) More than ogive.
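The two cumulative frequency series behind the two ogives can be sketched in a few lines of Python. This is only an illustration under assumed data: the function names and the sample classes are hypothetical. The “less than” totals are paired with the upper class boundaries and the “more than” totals with the lower class boundaries.

```python
from itertools import accumulate


def less_than_cf(frequencies):
    """Running totals from the first class onwards ('less than' C.F.)."""
    return list(accumulate(frequencies))


def more_than_cf(frequencies):
    """Running totals from the last class backwards ('more than' C.F.)."""
    return list(accumulate(reversed(frequencies)))[::-1]


# Hypothetical data: classes 0-10, 10-20, 20-30, 30-40.
boundaries = [0, 10, 20, 30, 40]
frequencies = [4, 10, 7, 3]

# Points to plot: (class boundary, cumulative frequency).
less_than_points = list(zip(boundaries[1:], less_than_cf(frequencies)))
more_than_points = list(zip(boundaries[:-1], more_than_cf(frequencies)))
print(less_than_points)
print(more_than_points)
```

Joining the first set of points gives a rising curve and the second a falling curve; both series reach the total frequency (24 in this example) at one end.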
  • 43. RESEARCH METHODOLOGY & STATISTICAL TOOLS
(i) Less than Ogive: This consists of plotting the ‘less than’ cumulative frequencies against the upper class boundaries of the respective classes. The points so obtained are joined by a smooth freehand curve to give the “less than ogive”. The “less than ogive” is an increasing curve, sloping upwards from left to right, and has the shape of an elongated S.
(ii) More than Ogive: Similarly, in the “more than ogive”, the ‘more than’ cumulative frequencies are plotted against the lower class boundaries of the respective classes. The points so obtained are joined by a smooth freehand curve to give the “more than ogive”. The “more than ogive” is a decreasing curve, sloping downwards from left to right, and has the shape of an elongated S turned upside down.
2) Graphs of Time Series: Time series data are represented geometrically by means of a time series graph, which is also known as a “Historigram”. The various types of time series graphs are:
i) Horizontal Line Graphs or Historigrams
ii) Silhouette or Net Balance Graphs
iii) Range or Variation Graphs
iv) Components or Band Graphs
TABULATION OF DATA
Meaning and Importance of Tabulation: By tabulation we mean the systematic presentation of the information contained in the data, in rows and columns, in accordance with some salient features or characteristics. Rows are horizontal arrangements and columns are vertical arrangements. In the words of A.M. Tuttle, “A statistical table is the logical listing of related quantitative data in vertical columns and horizontal rows of numbers with sufficient explanatory and qualifying words, phrases and statements in the form of titles, headings and notes to make clear the full meaning of data and their origin”. Professor Bowley, in his manual of statistics, refers to tabulation as “the intermediate process between the accumulation of data in whatever form they are obtained, and the final reasoned account of the result shown by the statistics”.
Tabulation is one of the most important and ingenious devices for presenting data in a condensed and readily comprehensible form, and attempts to furnish the
  • 44. RESEARCH METHODOLOGY & STATISTICAL TOOLS maximum information contained in the data in the minimum possible space, without sacrificing the quality and usefulness of the data. It is an intermediate process between the collection of the data on the one hand and statistical analysis on the other. In fact, tabulation is the final stage in the collection and compilation of the data and forms the gateway to further statistical analysis and interpretation. Tabulation makes the data comprehensible and facilitates comparisons (by classifying data into suitable groups) and the work of further statistical analysis, averaging, correlation, etc. It also makes the data suitable for further diagrammatic and graphic representation.
GENERAL RULES FOR CONSTRUCTING A TABLE
The various parts of a table vary from problem to problem depending upon the nature of the data and the purpose of the investigation. However, the following are a must in a good statistical table:
1. Table Number
2. Title
3. Head Notes (or) Prefatory Notes
4. Captions and Stubs
5. Body of the Table
6. Foot-Note
7. Source Note
FORMAT OF A BLANK TABLE
Table No: #
TITLE
[Head Note or Prefatory Note (if any)]
[Table layout: a Stub Heading column on the left; a Caption spanning the Sub Heads, each divided into Column Heads; a Total column on the right; the Body of the table below the column heads; a Total row at the bottom.]
Foot Note:
Source Note:
  • 45. RESEARCH METHODOLOGY & STATISTICAL TOOLS
TYPES OF TABULATION:
Tables may be classified in several ways, according to:
1. Objectives and scope of the enquiry:
(i) General Purpose or Reference Table
(ii) Special Purpose or Summary Table
2. Nature of the enquiry:
(i) Original or Primary Table
(ii) Derived or Derivative Table
3. Extent of coverage given in the enquiry:
(i) Simple Table
(ii) Complex Table
  • 46. RESEARCH METHODOLOGY & STATISTICAL TOOLS
SPSS (STATISTICAL PACKAGE FOR THE SOCIAL SCIENCES)
SPSS (Statistical Package for the Social Sciences) has now been in development for more than thirty years. Originally developed as a programming language for conducting statistical analysis, it has grown into a complex and powerful application which now uses both a graphical and a syntactical interface and provides dozens of functions for managing, analyzing, and presenting data. Its statistical capabilities alone range from simple percentages to complex analyses of variance, multiple regressions, and general linear models. You can use data ranging from simple integers/binary variables to multiple response or logarithmic variables. SPSS also provides extensive data management functions, along with a complex and powerful programming language.
STATISTICS PROGRAM
SPSS (originally, Statistical Package for the Social Sciences) was released in its first version in 1968 after being developed by Norman H. Nie and C. Hadlai Hull. Norman Nie was then a political science postgraduate at Stanford University, and is now Research Professor in the Department of Political Science at Stanford and Professor Emeritus of Political Science at the University of Chicago. SPSS is among the most widely used programs for statistical analysis in social science. It is used by market researchers, health researchers, survey companies, government, education researchers, marketing organizations and others.
  • 47. RESEARCH METHODOLOGY & STATISTICAL TOOLS The original SPSS manual (Nie, Bent & Hull, 1970) has been described as 'Sociology's most influential book'. In addition to statistical analysis, data management (case selection, file reshaping, creating derived data) and data documentation (a metadata dictionary is stored in the data file) are features of the base software.
Statistics included in the base software:
• Descriptive statistics: Cross tabulation, Frequencies, Descriptive, Explore, Descriptive Ratio Statistics
• Bi-variate statistics: Means, t-test, ANOVA, Correlation (bi-variate, partial, distances), Nonparametric tests
• Prediction for numerical outcomes: Linear regression
• Prediction for identifying groups: Factor analysis, cluster analysis (two-step, K-means, hierarchical), Discriminant
The many features of SPSS are accessible via pull-down menus or can be programmed with a proprietary 4GL command syntax language. Command syntax programming has the benefits of reproducibility, simplifying repetitive tasks, and handling complex data manipulations and analyses. Additionally, some complex applications can only be programmed in syntax and are not accessible through the menu structure. The pull-down menu interface also generates command syntax: this can be displayed in the output, though the default settings have to be changed to make the syntax visible to the user, or it can be pasted into a syntax file using the "paste" button present in each menu. Programs can be run interactively or unattended using the supplied Production Job Facility. Additionally, a "macro" language can be used to write command language subroutines, and a Python programmability extension can access the information in the data dictionary and data and dynamically build command syntax programs. The Python programmability extension, introduced in SPSS 14, replaced the less functional SAX Basic "scripts" for most purposes, although SAX Basic remains available.
In addition, the Python extension allows SPSS to run any of the statistics in the free software package R. From version 14 onwards SPSS can be driven externally by a Python or a VB.NET program using supplied "plug-ins". SPSS places constraints on internal file structure, data types, data processing and matching files, which together considerably simplify programming. SPSS datasets
  • 48. RESEARCH METHODOLOGY & STATISTICAL TOOLS have a 2-dimensional table structure where the rows typically represent cases (such as individuals or households) and the columns represent measurements (such as age, sex or household income). Only 2 data types are defined: numeric and text (or "string"). All data processing occurs sequentially case-by-case through the file. Files can be matched one-to-one and one-to-many, but not many-to-many. The graphical user interface has two views which can be toggled by clicking on one of the two tabs in the bottom left of the SPSS window. The 'Data View' shows a spreadsheet view of the cases (rows) and variables (columns). Unlike spreadsheets, the data cells can only contain numbers or text and formulas cannot be stored in these cells. The 'Variable View' displays the metadata dictionary where each row represents a variable and shows the variable name, variable label, value label(s), print width, measurement type and a variety of other characteristics. Cells in both views can be manually edited, defining the file structure and allowing data entry without using command syntax. This may be sufficient for small datasets. Larger datasets such as statistical surveys are more often created in data entry software, or entered during computer-assisted personal interviewing, by scanning and using optical character recognition and optical mark recognition software, or by direct capture from online questionnaires. These datasets are then read into SPSS. SPSS can read and write data from ASCII text files (including hierarchical files), other statistics packages, spreadsheets and databases. SPSS can read and write to external relational database tables via ODBC and SQL. Statistical output is to a proprietary file format (*.spv file, supporting pivot tables) for which, in addition to the in-package viewer, a stand-alone reader can be downloaded. The proprietary output can be exported to text or Microsoft Word. 
Alternatively, output can be captured as data (using the OMS command), as text, tab-delimited text, PDF, XLS, HTML, XML, SPSS dataset or a variety of graphic image formats (JPEG, PNG, BMP and EMF). Add-on modules provide additional capabilities. The available modules are: