2. OVERVIEW OF DATA COLLECTION
Virtually every management research project will involve some type of data
collection
Data collection must be well planned and managed
Data is the raw material of problem solving and decision making
The management researcher is interested in information rather than raw data
3. Must know not only what data is required, but also approaches and techniques
for collecting data
What are the principal types of data?
How may we classify data?
Data collection methods?
What are the principal ways?
Techniques of collecting data?
4. VARIABLES
• The concept of a variable is basic but vitally important
• A variable is anything that varies and can be measured
• The values that the variable takes will vary when measurements are made on
different objects or at different times.
• Variables differ in how well they can be measured, i.e., in how much measurable
information their measurement scale can provide
In general, a variable may fall into one of two types, viz., quantitative variables
and qualitative variables.
5. • Quantitative variables are those for which the value has numerical meaning.
The value refers to a specific amount of some quantity. They are also called
metric variables or measurement variables.
• Measurement variables tell us “how much” of something each case has. You can
do mathematical operations on the values of quantitative variables (like taking
an average)
• A good example would be a person's weight
• The quantitative variables can be broken down into two types, viz., discrete
variable and continuous variable.
6. • Discrete: A discrete variable is one which can only take certain
fixed numerical values; there are usually gaps between the
values. Discrete variables have numerical values that arise from a
counting process.
• Continuous: A continuous variable is one which, in principle at
least, can take any numerical value within a specific range.
Continuous variables produce numerical responses that arise
from a measuring process.
7. Qualitative Variables are those for which the value indicates different
groupings. They are also called Attributes or Categorical Variables
For the purpose of analysis we assign an arbitrary numerical value to such a
variable. Objects that have the same value on the variable are the same with
regard to some characteristic, but you can't say that one group has “more” or
“less” of some feature.
Thus, the categorical variables tell us only “what kind” or category a case
belongs in. It doesn't really make sense to do math on categorical variables.
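The contrast between the two types can be sketched in a few lines of Python. The cases below are hypothetical (the weights and blood groups are illustrative values, not data from any study): averaging is meaningful for the quantitative variable, while the qualitative variable only supports counting cases per category.

```python
from collections import Counter
from statistics import mean

# Hypothetical cases: weight is quantitative, blood group is qualitative.
weights_kg = [62.5, 70.0, 81.5, 58.0]
blood_groups = ["A", "O", "O", "B"]

# Arithmetic is meaningful for a quantitative variable.
average_weight = mean(weights_kg)

# For a qualitative variable we can only count the cases in each category.
group_counts = Counter(blood_groups)

print(average_weight)
print(group_counts.most_common())
```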
8. DEPENDENT VARIABLE
AND
INDEPENDENT VARIABLE
Dependent Variable: Any variable that depends upon other factors
It represents the output whose variation is being studied
Example: Exam score
It depends upon how much time you studied,
how much sleep you got last night,
your physical condition during the examination, etc.
9. Independent Variable: A variable that stands alone and isn’t changed by
other variables
It represents inputs or causes for variation
Example: Someone's age; it doesn't depend upon any other variable
10. ANTECEDENT VARIABLES
AND
INTERVENING VARIABLES
• Antecedent Variables: Any variable that explains the relationship
between two variables by its prior impact on the two variables
• Example: Social class affects the relationship between income and
political party support
11. ANTECEDENT VARIABLES
AND
INTERVENING VARIABLES
• Intervening Variables: They are used to explain the relationship between observed
variables such as independent and dependent variables; also called mediating
variables
• Example: Income and longevity are not directly related; money can't make life longer.
But people with money get better medical care than others. Here medical care is
an intervening variable
12. DICHOTOMOUS VARIABLE
AND
DUMMY VARIABLE
Dichotomous Variable
A variable that has only two possible categories such as gender
Dummy Variable
Dichotomous qualitative variable coded as ‘1’ if the characteristic is
present and ‘0’ if the characteristic is absent
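The coding rule above can be illustrated with a minimal sketch. The helper name `to_dummy` and the employment values are hypothetical, chosen only for illustration.

```python
def to_dummy(values, present):
    """Return 1 where the characteristic `present` is observed, else 0."""
    return [1 if value == present else 0 for value in values]

# Hypothetical dichotomous variable with two categories.
employment = ["employed", "unemployed", "employed"]

# 1 = characteristic present (employed), 0 = characteristic absent.
employed_dummy = to_dummy(employment, "employed")
print(employed_dummy)
```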
13. DATA AND DATA SET
• Data is a collection of facts, figures and statistics related to an
object.
• Each time that we record information about an object we observe a
case.
• We might include several different variables in the same case. For
example, we might measure the age, sex, height, weight, and hair
color of a group of people in an experiment.
• We would have one case for each entity
14. DATA AND DATA SET
• When the raw data are organized in a row-by-column
format, with each row representing one case and each
column representing one variable, it is called a data set.
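As a small illustration of the row-by-column layout, the sketch below builds a data set from the variables mentioned earlier (age, sex, height, weight, hair color). The column names and the three cases are hypothetical values.

```python
# Each column is one variable.
columns = ["age", "sex", "height_cm", "weight_kg", "hair_color"]

# Each row is one case (hypothetical people).
rows = [
    (34, "F", 162.0, 58.5, "black"),
    (41, "M", 175.5, 80.0, "brown"),
    (29, "F", 168.0, 63.0, "black"),
]

# The data set: one record per case, one key per variable.
data_set = [dict(zip(columns, row)) for row in rows]

# Reading down a column gives all values of one variable.
ages = [case["age"] for case in data_set]
print(ages)
```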
15. CLASSIFICATION OF DATA
Some common modes of classification are:
1) geographical, i.e., area wise or region-wise;
2) chronological, temporal, or historical, i.e., with respect to
occurrence of time;
3) qualitative, i.e., by character or by attributes; and
4) numerical, quantitative or by magnitude.
16. • Before you collect data for a research study, consider carefully
which of the four types of data you are collecting and how you will
use them once you have them
• The four widely used classifications of measurement scales are:
TYPES OF DATA
Nominal, Ordinal, Interval, Ratio
17. The lowest measurement level you can use. In nominal measurement the
numerical values are assigned to name the attribute uniquely.
In this scale, the numbers or letters assigned to objects serve only as labels
or tags for identification and classification of objects.
A nominal scale simply places data into categories, without any order or
structure.
The numbers do not reflect the amount of the characteristic possessed by the
objects.
These are scales in name only.
NOMINAL DATA
18. • It is the least powerful measurement
• The counting of members in each group is the only possible
arithmetic operation when a nominal scale is employed.
• No measure of dispersion can be used.
19. • An ordinal scale is next up the list in terms of power of
measurement
• Ordinal data include the characteristics of the nominal scale plus an
indicator of order
• In ordinal measurement the attributes can be rank-ordered
• Here, the distances between the attributes do not have any meaning
ORDINAL DATA
20. ORDINAL DATA
• In addition to the counting operation allowable for nominal scale
data, ordinal scales permit the use of statistics based on centiles,
e.g., percentile, quartile
• Median is the appropriate measure of central tendency
• A percentile or quartile measure reveals dispersion. Rank
correlation and a few nonparametric tests of significance can be
applied
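The centile-based statistics named above can be computed with Python's standard library. This is a sketch under assumptions: the responses are hypothetical, coded 1 (lowest rank) to 5 (highest rank), and the "inclusive" quartile method is just one of several conventions.

```python
from statistics import median, quantiles

# Hypothetical ordinal responses, coded 1 (lowest) to 5 (highest).
responses = [2, 4, 4, 5, 1, 3, 4, 2, 5]

# The median is the appropriate measure of central tendency.
middle = median(responses)

# Quartiles describe dispersion without relying on distances between ranks.
q1, q2, q3 = quantiles(responses, n=4, method="inclusive")

print(middle, q1, q3)
```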
21. • Interval data have the power of nominal and ordinal data plus one
additional strength: they incorporate the concept of equality of interval.
Thus, in interval measurement the distance between attributes does have
a meaning.
• The zero point on an interval scale is arbitrary and is not a true zero.
• It permits comparison of the differences between objects. Example: when
we measure temperature (in Fahrenheit), the distance from 30 to 40 is the same
as the distance from 70 to 80.
INTERVAL DATA
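A short worked example of why differences are meaningful on an interval scale but ratios are not. The helper `fahrenheit_to_celsius` is introduced here only for illustration: equal gaps stay equal after a change of unit, while the apparent ratio between two temperatures changes because the zero point is arbitrary.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit reading to Celsius."""
    return (f - 32) * 5 / 9

# Equal intervals in Fahrenheit ...
gap_low_f = 40 - 30
gap_high_f = 80 - 70

# ... remain equal after converting to Celsius.
gap_low_c = fahrenheit_to_celsius(40) - fahrenheit_to_celsius(30)
gap_high_c = fahrenheit_to_celsius(80) - fahrenheit_to_celsius(70)

# Ratios are not preserved, so "twice as hot" has no meaning here.
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)

print(gap_low_c, gap_high_c, ratio_f, ratio_c)
```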
22. RATIO DATA
• A ratio scale is the top level of measurement and is not often
available in social research.
• Ratio data incorporates all the powers of the interval data plus the
provision for absolute zero or origin. Ratio data represent the
actual amounts of a variable.
• Here, we can construct actual fractions (or ratios) with a ratio
variable.
• Examples: Height, weight, distance, etc.
• All statistical techniques can be applied to ratio data.
24. RAW DATA
When data are collected, the information
obtained from each member of a population or
sample is recorded in the sequence in which it
becomes available. Such data, before they are
grouped or ranked, are called raw data.
25. CROSS-SECTIONAL AND TIME SERIES DATA
• Cross-sectional data are collected at the same or approximately the
same point in time.
• Example: data detailing the number of road accidents in 28 Indian states in
June 2013
• Time series data are collected over several time periods.
• Example: data detailing the number of road accidents in each of the 28 Indian
states in the last 36 months
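The two views can be sketched as two ways of slicing the same records. The counts and the state labels below are made up for illustration (they are not real accident statistics), and the helper names `cross_section` and `time_series` are hypothetical.

```python
# Made-up accident counts keyed by (state, month); not real statistics.
accidents = {
    ("State A", "2013-05"): 120,
    ("State A", "2013-06"): 135,
    ("State B", "2013-05"): 80,
    ("State B", "2013-06"): 95,
}

def cross_section(data, month):
    """All states observed at one point in time."""
    return {state: count for (state, m), count in data.items() if m == month}

def time_series(data, state):
    """One state observed over several time periods, in date order."""
    return [(m, count) for (s, m), count in sorted(data.items()) if s == state]

june_2013 = cross_section(accidents, "2013-06")
state_a_history = time_series(accidents, "State A")
print(june_2013)
print(state_a_history)
```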
26. PRIMARY AND SECONDARY SOURCES OF DATA
• The sources from which data can be collected are divided into
primary and secondary:
• Primary data is the data collected by an individual or organization
to use specifically for the purpose of the investigation at hand. The
primary data is collected by conducting experiments, investigations,
observation, interviews, and surveys and by using questionnaires.
27. • Secondary data is facts and information gathered not for the
immediate study at hand but for some other purpose
• Secondary data has been gathered by others for their own purposes,
but the data could be useful in the analysis of a wide range of real
property. In general, secondary data exists in published sources, both
print and electronic.
28. A population consists of all elements—individuals, items, or
objects— that we are interested in studying.
When researchers gather data from the whole population for a
given measurement of interest, they call it a census.
• A sample is a finite subset of the population which, if properly taken, is
representative of the population.
• Data can be collected from a sample to answer questions about the
population.
POPULATION VERSUS SAMPLE
29. SAMPLE SIZE
• Large samples express greater expected variation
• Large samples represent population better than small samples
SAMPLE SIZE SELECTION
Statistical analysis planned
Expected variability within subsets of the sample
Tradition in our research
30. SAMPLING DESIGN
• Sampling design is a design that specifies the population frame,
sample size, sample selection and estimation method in detail
• It is a definite plan for obtaining the sample from a population
SAMPLE FRAME
It is the record of the population from which a sample can be drawn
31. RANDOM SAMPLING (CHANCE SAMPLING)
• Random sampling is the purest form of probability sampling (the best
sampling method available)
• Each member of the population has an equal chance of being selected
• The sample units are drawn without showing any regard to the character of
the population
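A minimal sketch of simple random sampling, assuming the sampling frame is available as a list. The frame of 100 labelled units and the helper name `simple_random_sample` are hypothetical; `random.sample` draws without replacement, so every unit has the same chance of selection.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each with equal chance."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame of 100 units.
frame = [f"unit-{i:03d}" for i in range(1, 101)]

sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```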
32. SYSTEMATIC SAMPLING
(QUASI RANDOM SAMPLING)
• Involves ‘Ordering of the universe’
• Ordering may be alphabetical, numerical, geographical, etc.
• Every nth member is selected from the order
• Sampling interval = Size of population / Size of the sample
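The procedure can be sketched as follows, assuming the universe is already ordered and held in a list. The random starting point within the first interval is a common convention assumed here rather than stated on the slide, and `systematic_sample` is a hypothetical helper name.

```python
import random

def systematic_sample(ordered_frame, n, seed=None):
    """Select every k-th unit, where k = population size // sample size."""
    if not 0 < n <= len(ordered_frame):
        raise ValueError("sample size must be between 1 and the population size")
    k = len(ordered_frame) // n                # sampling interval
    start = random.Random(seed).randrange(k)   # random start in the first interval
    return ordered_frame[start::k][:n]

# Hypothetical ordered universe of 100 units: interval = 100 / 10 = 10.
frame = list(range(1, 101))
picked = systematic_sample(frame, 10, seed=1)
print(picked)
```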
33. STRATIFIED SAMPLING
• All people in the sampling frame are divided into strata
• A stratum is a group or category
• Strata are developed on the basis of homogeneity
• From each stratum, random samples or systematic samples are
selected
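A sketch of stratified sampling with a random draw inside each stratum. Proportional allocation (the same sampling fraction in every stratum) is an assumption made for this example, as are the helper name `stratified_sample` and the hypothetical frame of employees grouped by department.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Group units into strata, then draw a random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Assumed rule: proportional allocation, at least one unit per stratum.
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical frame: (employee id, department) pairs.
frame = [(i, "sales") for i in range(60)] + [(i, "finance") for i in range(60, 100)]

chosen = stratified_sample(frame, stratum_of=lambda unit: unit[1], fraction=0.1, seed=7)
print(chosen)
```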
34. OTHER SAMPLING METHODS
CLUSTER SAMPLING
• Used when no satisfactory sampling frame is available
• Population is divided into clusters or large groups
RANDOM ROUTE SAMPLING
Used in market research surveys (example: sampling households)
An address is selected randomly from a register
Then every nth address is selected by taking alternate left and right turns
35. OTHER SAMPLING METHODS
ACCIDENTAL SAMPLING
• The researcher simply contacts and picks up those cases which he or she comes across and continues the
process (e.g., the first 100 willing persons)
QUOTA SAMPLING
Quotas are set up according to certain characteristics
SNOWBALL SAMPLING
We initially contact potential respondents and ask them to refer other respondents with similar
characteristics
37. FOCUS GROUPS
• A focus group is a form of qualitative research
• Respondents from the target population are typically put in a
single group and are asked about their perceptions, opinions,
beliefs, and attitudes towards a product, service, concept,
advertisement, idea, or packaging
• Questions are asked in an interactive group setting where
participants are free to talk with other group members
38. FOCUS GROUPS
• Two-way focus group - one focus group watches another focus group
and discusses the observed interactions and conclusion
• Dual moderator focus group - one moderator ensures the session
progresses smoothly, while another ensures that all the topics are
covered
39. • Mini focus groups - groups are composed of four or five members
rather than 6 to 12
• Teleconference focus groups - telephone network is used
• Online focus groups - computers connected via the internet
are used
40. INTERVIEWS
• A method for collecting primary data
• The person who is interviewing is called the interviewer
• The person who is giving the interview is called the interviewee or respondent
• The respondent is asked to provide information in the form of opinions,
facts, attitudes, etc.
41. SURVEYS
• A non-experimental, descriptive research method
• Helps in collecting data on phenomena that can't be directly observed, such as
opinions
Types
• Industrial and consumer survey - for industrial use
• Media studies - to know the effectiveness of advertising, etc.
• Multiple survey - the survey is conducted among several groups of people
42. QUESTIONNAIRES
• It is a method for collecting primary data in which a sample of respondents is
asked a list of carefully structured questions chosen after considerable
testing, with a view to eliciting reliable responses
Objectives
• It must translate the information needed into specific questions
• It must motivate the respondent to become involved in completing it
43. OPEN AND CLOSED QUESTIONS
Open questions
• An open question is likely to receive a long answer
• They offer respondents the advantage of giving their opinions as precisely as possible in their
own words
• E.g.: Do you support the hike in oil prices? Why?
44. OPEN AND CLOSED QUESTIONS
Closed questions
• A closed question can be answered with either a single word or a short phrase.
• They are convenient and easy to answer, since the range of potential answers is limited
• E.g.: Do you support the hike in oil prices?
• Tick your opinion
• 1. Yes
• 2. No
45. MULTIPLE CHOICE QUESTIONS
• The participant is asked a closed question and selects his/her answer from a list
of predetermined responses
• E.g.: Who is the Reserve Bank governor?
Tick your answer
• 1. Raghuram Rajan
• 2. D. Subbarao
• 3. Sen Gupta
• 4. L.K. Jha
• 5. Dr. C. Rangarajan
46. RATING SCALES
• These scales are used to tap preferences between two or among more objects
or items
• Rating scales are used quite frequently in research, especially in surveys
• Typically, an itemized rating scale asks subjects to choose one response
category from several arranged in hierarchical order.
E.g.: Please rate our service (out of 10)
• 4 or below - Poor
• 5 to 7 - Average
• 8 to 9 - Good
• 10 - Outstanding
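The itemized scale above can be expressed as a small lookup. This sketch assumes whole-number scores from 0 to 10 and reads the lowest band as "4 or below", since the bands must not overlap with "5 to 7"; the function name `rating_label` is hypothetical.

```python
def rating_label(score):
    """Map a score out of 10 to its ordered response category."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 4:          # assumed reading of the lowest band
        return "Poor"
    if score <= 7:
        return "Average"
    if score <= 9:
        return "Good"
    return "Outstanding"

labels = [rating_label(score) for score in (3, 6, 9, 10)]
print(labels)
```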