4. Data Analysis
3 A set of methods and techniques used to
obtain information and insights from data
3 Helps avoid erroneous judgements and
conclusions
3 Can constructively influence the research
objectives and the research design
Essentials of Marketing Research Kumar, Aaker, Day
5. Preparing the Data for Analysis
3 Data editing
3 Coding
3 Statistically adjusting the data
6. Preparing the Data for Analysis
(Contd.)
Data Editing
3 Identifies omissions, ambiguities, and errors
in responses
3 Conducted in the field by interviewer and
field supervisor and by the analyst prior to
data analysis
7. Preparing the Data for Analysis
(Contd.)
Problems Identified With Data Editing
3 Interviewer Error
3 Omissions
3 Ambiguity
3 Inconsistencies
3 Lack of Cooperation
3 Ineligible Respondent
8. Preparing the Data for Analysis
(Contd.)
Coding
3 Coding closed-ended questions involves
specifying how the responses are to be
entered
3 Open-ended questions are difficult to code
x Lengthy list of possible responses is generated
9. Preparing the Data for Analysis
(Contd.)
Statistically Adjusting the Data: Weighting
3 Each response is assigned a number according to a
pre-specified rule
3 Makes sample data more representative of target
population on specific characteristics
3 Modifies number of cases in the sample that
possess certain characteristics
3 Adjusts the sample so that greater importance is
attached to respondents with certain characteristics
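A minimal sketch of the weighting idea, assuming hypothetical age groups, sample counts, and population shares (none of these figures come from the slides). Each respondent in a group receives the weight "population share / sample share", so the weighted sample matches the target population on that characteristic.

```python
# Hypothetical data: respondents per age group in the sample, and the
# assumed share of each group in the target population.
sample_counts = {"18-34": 50, "35-54": 30, "55+": 20}
population_share = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}

n = sum(sample_counts.values())

# Weight for every respondent in a group = population share / sample share.
weights = {
    group: population_share[group] / (count / n)
    for group, count in sample_counts.items()
}

# Weighted counts now follow the population distribution.
weighted_counts = {g: sample_counts[g] * weights[g] for g in sample_counts}

print(weights)
print(weighted_counts)
```

Groups that are over-represented in the sample get a weight below 1, under-represented groups a weight above 1.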
10. Preparing the Data for Analysis
(Contd.)
Statistically Adjusting the Data: Variable
Re-specification
3 Existing data is modified to create new variables
3 Large number of variables collapsed into fewer
variables
3 Creates variables that are consistent with study
objectives
3 Dummy variables are used (also called binary, dichotomous,
instrumental, or qualitative variables)
3 Use (d-1) dummy variables to specify (d) levels of
a qualitative variable
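The (d-1) rule can be illustrated with a short sketch. The function name `dummy_code` and the income data are hypothetical, chosen only for illustration; the first level is treated as the reference category and is coded as all zeros.

```python
def dummy_code(values, levels):
    """Code a variable with d levels as d-1 dummy (0/1) variables.

    The first entry of `levels` is the reference category (all zeros).
    """
    _reference, *others = levels
    return [[1 if v == level else 0 for level in others] for v in values]


# Hypothetical variable with d = 3 levels, so d - 1 = 2 dummies are needed.
levels = ["low", "middle", "high"]
income = ["low", "high", "middle", "high"]

coded = dummy_code(income, levels)
print(coded)  # columns: is_middle, is_high
```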
11. Preparing the Data for Analysis
(Contd.)
Statistically Adjusting the Data: Scale
Transformation
3 Scale values are manipulated to ensure
comparability with other scales
3 Standardization allows the researcher to compare
variables that have been measured using different
types of scales
3 Variables are forced to have a mean of zero and a
standard deviation of one
3 Can be done only on interval or ratio scaled data
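As a rough illustration of standardization, the sketch below converts hypothetical scale scores to z-scores. It assumes the sample standard deviation (n - 1 in the denominator); the slides do not say which convention is intended.

```python
import statistics


def standardize(values):
    """Return z-scores: subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1), an assumption
    return [(v - mean) / sd for v in values]


# Hypothetical ratings measured on a 1-7 scale.
scores = [2, 4, 4, 5, 7]
z_scores = standardize(scores)

print(z_scores)
print(statistics.mean(z_scores), statistics.stdev(z_scores))
```

After the transformation the variable has mean zero and standard deviation one (up to floating-point error), which is what makes variables from different scales comparable.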
12. Simple Tabulation
3 Consists of counting the number of cases
that fall into various categories
Use of Simple Tabulation
3 Determine empirical distribution (frequency
distribution) of the variable in question
3 Calculate summary statistics, particularly
the mean or percentages
3 Aid in "data cleaning" aspects
13. Frequency Distribution
3 Reports the number of responses that each
question received
3 Organizes data into classes or groups of values
3 Shows number of observations that fall into each
class
3 Can be illustrated simply as a number or as a
percentage or histogram
3 Response categories may be combined for many
questions
3 Should result in categories with worthwhile
numbers of respondents
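Simple tabulation and the resulting frequency distribution can be sketched in a few lines. The responses below are hypothetical; the counts are reported both as numbers and as percentages.

```python
from collections import Counter

# Hypothetical answers to one closed-ended question.
responses = ["agree", "neutral", "agree", "disagree", "agree", "neutral"]

counts = Counter(responses)   # number of cases per category
n = len(responses)
percentages = {category: 100 * count / n for category, count in counts.items()}

for category, count in counts.most_common():
    print(f"{category:10s} {count:3d} {percentages[category]:5.1f}%")
```

A category with an unexpected label or an implausibly small count in such a table is often the first hint of a data-entry error, which is the "data cleaning" use mentioned above.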
14. Descriptive Statistics
3 Statistics normally associated with a
frequency distribution to help summarize
information in the frequency table
3 Measures of central tendency (mean, median,
and mode)
3 Measures of dispersion (range, standard
deviation, and coefficient of variation)
3 Measures of shape (skewness and kurtosis)
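The three families of descriptive statistics can be computed as in the sketch below. The data are the sales at the 49¢ price level from the ANOVA example later in this deck. The skewness and kurtosis formulas use population moments, which is one of several conventions and is an assumption here.

```python
import statistics

# Sales at the 49¢ price level (from the ANOVA example in this deck).
data = [4, 8, 7, 9, 7]
n = len(data)

# Measures of central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion
value_range = max(data) - min(data)
std_dev = statistics.stdev(data)      # sample standard deviation (n - 1)
coeff_variation = std_dev / mean

# Measures of shape, based on population moments (assumed convention)
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
m4 = sum((x - mean) ** 4 for x in data) / n
skewness = m3 / m2 ** 1.5
kurtosis = m4 / m2 ** 2

print(mean, median, mode, value_range)
print(round(std_dev, 3), round(coeff_variation, 3))
print(round(skewness, 3), round(kurtosis, 3))
```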
15. Analysis for Various Population
Subgroups
3 Differences between means or percentages
of two subgroup responses can provide
insights
3 Difference between means is concerned
with the association between two questions
3 Questions upon which means are based are
interval scaled
16. Cross Tabulations
3 Statistical analysis technique to study the
relationships among and between variables
3 Sample is divided to learn how the
dependent variable varies from subgroup to
subgroup
3 Frequency distribution for each subgroup is
compared to the frequency distribution for
the total sample
3 The two variables that are analyzed must be
nominally scaled
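A cross tabulation is a count of cases for every combination of two variables. The sketch below builds one from hypothetical respondent records; the variable names and values are illustrative only.

```python
from collections import Counter

# Hypothetical records: (income group, owns an expensive automobile).
records = [
    ("low", "no"), ("low", "yes"), ("middle", "no"),
    ("high", "yes"), ("high", "yes"), ("middle", "no"),
]

cell_counts = Counter(records)

row_labels = ["yes", "no"]              # ownership
col_labels = ["low", "middle", "high"]  # income

# One row per ownership answer, one column per income group.
table = [[cell_counts[(col, row)] for col in col_labels] for row in row_labels]

for label, row in zip(row_labels, table):
    print(f"{label:4s}", row)
```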
17. Factors Influencing the Choice of
Statistical Technique
Type of Data
x Classification of data involves nominal, ordinal,
interval and ratio scales of measurement
x Nominal scaling is restricted to the mode as the only
measure of central tendency
x Both median and mode can be used for ordinal scale
x Only non-parametric tests can be run on ordinal data
x Mean, median and mode can all be used to measure
central tendency for interval and ratio scaled data
18. Factors Influencing the Choice of
Statistical Technique (Contd.)
Research Design
x Dependency of observations
x Number of observations per object
x Number of groups being analyzed
x Control exercised over variable of interest
Assumptions Underlying the Test Statistic
x If assumptions on which a statistical test is based are
violated, the test will provide meaningless results
19. Overview of Statistical
Techniques
Univariate Techniques
x Appropriate when there is a single measurement of
each of the 'n' sample objects or there are several
measurements of each of the 'n' observations but
each variable is analyzed in isolation
x Nonmetric: measured on nominal or ordinal scale
x Metric: measured on interval or ratio scale
x Determine whether single or multiple samples are
involved
x For multiple samples, choice of statistical test
depends on whether the samples are independent or
dependent
20. Overview of Statistical
Techniques (Contd.)
Multivariate Techniques
3 A collection of procedures for analyzing
association between two or more sets of
measurements that have been made on each
object in one or more samples of objects
3 Dependence or interdependence techniques
21. Overview of Statistical
Techniques (Contd.)
Multivariate Techniques (Contd.)
Dependence Techniques
3 One or more variables can be identified as
dependent variables and the remaining as
independent variables
3 Choice of dependence technique depends
on the number of dependent variables
involved in analysis
22. Overview of Statistical
Techniques (Contd.)
Multivariate Techniques (Contd.)
Interdependence Techniques
3 Whole set of interdependent relationships is
examined
3 Further classified as having a focus on
variables or objects
23. Overview of Statistical
Techniques (Contd.)
Why Use Multivariate Analysis?
3 To group variables or people or objects
3 To improve the ability to predict variables
(such as usage)
3 To understand relationships between
variables (such as advertising and sales)
24. Hypothesis Testing:
Basic Concepts
3 Assumption (hypothesis) made about a
population parameter (not a sample statistic)
3 Purpose of Hypothesis Testing
x To make a judgement about the difference between
two sample statistics or the sample statistic and a
hypothesized population parameter
3 Evidence has to be evaluated statistically
before arriving at a conclusion regarding the
hypothesis.
25. Hypothesis Testing
3 The null hypothesis (Ho) is tested against
the alternative hypothesis (Ha).
3 At least the null hypothesis is stated.
3 Decide upon the criteria to be used in
making the decision whether to “reject” or
"not reject" the null hypothesis.
26. Significance Level
3 Indicates the percentage of sample means that
is outside the cut-off limits (critical value)
3 The higher the significance level (α) used for
testing a hypothesis, the higher the probability
of rejecting a null hypothesis when it is true
(Type I error)
3 Accepting a null hypothesis when it is false is
called a Type II error and its probability is
(β)
27. Hypothesis Testing
Tests in this class
                           Statistical Test
Frequency distributions    χ²
Means (one)                z (if σ is known)
                           t (if σ is unknown)
Means (two or more)        ANOVA
28. Cross-tabulation and Chi Square
In Marketing Applications, Chi-square
Statistic Is Used As
Test of Independence
3 Are there associations between two or more variables in a
study?
Test of Goodness of Fit
3 Is there a significant difference between an observed
frequency distribution and a theoretical frequency
distribution?
29. Chi-Square As a Test of
Independence
Null Hypothesis Ho
3 Two (nominally scaled) variables are
statistically independent
Alternative Hypothesis Ha
3 The two variables are not independent
Use Chi-square distribution to test.
30. Chi-square Statistic (χ²)
3 Measures the difference between the actual number
observed in cell i (O_i) and the number expected (E_i) under
independence, i.e. if the null hypothesis were true

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

summed over all cells i of the table,
with (r-1)*(c-1) degrees of freedom
r = number of rows, c = number of columns
3 Expected frequency in each cell: E_i = p_c * p_r * n
where p_c and p_r are the column and row proportions (for independent variables)
and n is the total number of observations
31. Chi-square Step-by-Step
1) Formulate Hypotheses
2) Calculate row and column totals
3) Calculate row and column proportions
4) Calculate expected frequencies (Ei)
5) Calculate χ2 statistic
6) Calculate degrees of freedom
7) Obtain Critical Value from table
8) Make decision regarding the Null-hypothesis
32. Example of Chi-square as a Test
of Independence
               Class
Grade        1      2
A           10      8
B           20     16
C           45     18
D           16      6
E            9      2
Each entry in the table (for example, 45) is a 'cell'.
33. Chi-square As a Test of
Independence - Exercise
Own Expensive             Income
Automobile         Low   Middle   High
Yes                 45     34      55
No                  52     53      27
Task: Make a decision whether the two variables are
independent!
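The step-by-step chi-square procedure can be sketched as a small function and applied to both tables above (grade by class, and automobile ownership by income). The function name `chi_square_independence` is an illustrative choice, not something from the slides. The critical values at α = 0.05 are the table values quoted in the notes at the end of this deck.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence: p_r * p_c * n
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (o - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, df


grade_by_class = [[10, 8], [20, 16], [45, 18], [16, 6], [9, 2]]
auto_by_income = [[45, 34, 55], [52, 53, 27]]

# (table, critical value at alpha = 0.05 taken from a chi-square table)
cases = {
    "grade x class": (grade_by_class, 9.49),    # df = 4
    "auto x income": (auto_by_income, 5.991),   # df = 2
}

results = {}
for name, (table, critical_value) in cases.items():
    chi_square, df = chi_square_independence(table)
    reject = chi_square > critical_value
    results[name] = (chi_square, df, reject)
    print(f"{name}: chi-square = {chi_square:.3f}, df = {df}, reject Ho: {reject}")
```

The first statistic differs slightly from a hand calculation with rounded expected frequencies, but the decision is the same: independence is not rejected for grade and class, and is rejected for automobile ownership and income.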
34. Hypothesis Testing About
a Single Mean
3 Make judgement about a single sample parameter.
3 Hypothesis testing depends on whether the population
variance is known or not known

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}}$$
if population variance is known

$$t = \frac{\bar{X} - \mu}{s_{\bar{x}}}$$
if population variance is not known, or
if sample size < 60
35. Hypothesis Testing About
a Single Mean - Step-by-Step
1) Formulate Hypotheses
2) Select appropriate formula
3) Select significance level
4) Calculate z or t statistic
5) Calculate degrees of freedom (for t-test)
6) Obtain critical value from table
7) Make decision regarding the Null-
hypothesis
36. Hypothesis Testing About
a Single Mean - Example 1
3 Ho: µ = 5000 (hypothesized value of population)
3 Ha: µ ≠ 5000 (alternative hypothesis)
3 n = 100
3 X̄ = 4960
3 σ = 250
3 α = 0.05
Rejection rule: if |z_calc| > z_{α/2} then reject Ho.
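Example 1 can be worked through in a few lines. This is a sketch using only the figures on the slide; the critical value is computed from the standard normal distribution with Python's `statistics.NormalDist` rather than read from a table.

```python
import math
from statistics import NormalDist

# Figures from Example 1
mu_0 = 5000     # hypothesized population mean
n = 100         # sample size
x_bar = 4960    # sample mean
sigma = 250     # known population standard deviation
alpha = 0.05    # significance level (two-sided test)

standard_error = sigma / math.sqrt(n)
z_calc = (x_bar - mu_0) / standard_error
z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

reject_h0 = abs(z_calc) > z_critical

print(f"standard error = {standard_error}, z = {z_calc}")
print(f"critical value = {z_critical:.2f}, reject Ho: {reject_h0}")
```

Because |z| = 1.6 is below the critical value, Ho is not rejected at the 5% level.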
37. Hypothesis Testing About
a Single Mean - Example 2
3 Ho: µ = 1000 (hypothesized value of population)
3 Ha: µ ≠ 1000 (alternative hypothesis)
3 n = 12
3 X̄ = 1087.1
3 s = 191.6
3 α = 0.01
Rejection rule: if |t_calc| > t_{df, α/2} then reject Ho.
38. Hypothesis Testing About
a Single Mean - Example 3
3 Ho: µ ≤ 1000 (hypothesized value of population)
3 Ha: µ > 1000 (alternative hypothesis)
3 n = 12
3 X̄ = 1087.1
3 s = 191.6
3 α = 0.05
Rejection rule: if t_calc > t_{df, α} then reject Ho.
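Examples 2 and 3 share the same data and differ only in the direction of the test and the significance level. The sketch below computes the t statistic once and applies both rejection rules; the critical values (3.106 and 1.796 for df = 11) are the t-table values quoted in the notes at the end of this deck.

```python
import math

# Figures shared by Examples 2 and 3
mu_0 = 1000
n = 12
x_bar = 1087.1
s = 191.6

standard_error = s / math.sqrt(n)
t_calc = (x_bar - mu_0) / standard_error
df = n - 1

# Example 2: two-sided test, alpha = 0.01, table value t(11, alpha/2)
t_critical_two_sided = 3.106
reject_two_sided = abs(t_calc) > t_critical_two_sided

# Example 3: one-sided test (Ha: mu > 1000), alpha = 0.05, table value t(11, alpha)
t_critical_one_sided = 1.796
reject_one_sided = t_calc > t_critical_one_sided

print(f"standard error = {standard_error:.2f}, t = {t_calc:.2f}, df = {df}")
print(f"Example 2 reject Ho: {reject_two_sided}")
print(f"Example 3 reject Ho: {reject_one_sided}")
```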
39. Confidence Intervals
3 Hypothesis testing and Confidence Intervals
are two sides of the same coin.
$$t = \frac{\bar{X} - \mu}{s_{\bar{x}}} \;\Rightarrow\; \bar{X} \pm t\, s_{\bar{x}} = \text{interval estimate of } \mu$$
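To illustrate the "two sides of the same coin" remark, the sketch below builds the 99% confidence interval from the data of Example 2 (n = 12, sample mean 1087.1, s = 191.6). The t value 3.106 for df = 11 is the table value used for that example. Reusing Example 2 here is an illustrative choice, not something the slide prescribes.

```python
import math

n = 12
x_bar = 1087.1
s = 191.6
t_critical = 3.106   # t table, df = 11, two-sided alpha = 0.01 (99% confidence)

standard_error = s / math.sqrt(n)
margin = t_critical * standard_error

lower = x_bar - margin
upper = x_bar + margin

hypothesized_mean = 1000
inside_interval = lower <= hypothesized_mean <= upper

print(f"99% interval estimate of mu: ({lower:.1f}, {upper:.1f})")
print(f"hypothesized mean inside interval: {inside_interval}")
```

The hypothesized value 1000 lies inside the interval, which matches the two-sided test's decision not to reject Ho at α = 0.01.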
40. Analysis of Variance (ANOVA)
3 Response variable - dependent variable (Y)
3 Factor(s) - independent variables (X)
3 Treatments - different levels of factors
(r1, r2, r3, …)
41. Example (Book p.495)
Product Sales
Price Level     1    2    3    4    5   Total   X̄p
39¢             8   12   10    9   11    50     10
44¢             7   10    6    8    9    40      8
49¢             4    8    7    9    7    35      7
Overall sample mean: X̄ = 8.333
Overall sample size: n = 15
No. of observations per price level: np = 5
42. Example (Book p.495)
Grand Mean
43. One - Factor Analysis of
Variance
3 Studies the effect of 'r' treatments on one
response variable
3 Determine whether or not there are any
statistically significant differences between
the treatment means µ1, µ2, ... µr
3 Ho: All treatments have same effect on
mean responses
3 H1 : At least 2 of µ1, µ2 ... µr are different
44. One - Factor ANOVA -
Intuitively
If the ratio
  Between Treatment Variance / Within Treatment Variance
is large, then there are differences between treatments;
if it is small, then there are no differences between treatments
3 To Test Hypothesis, Compute the Ratio Between the
"Between Treatment" Variance and "Within
Treatment" Variance
45. One - Factor ANOVA Table
Source of        Variation   Degrees of   Mean Sum             F-ratio
Variation        (SS)        Freedom      of Squares
Between          SSr         r-1          MSSr = SSr/(r-1)     MSSr/MSSu
(price levels)
Within           SSu         n-r          MSSu = SSu/(n-r)
(price levels)
Total            SSt         n-1
46. One - Factor Analysis of
Variance
3 Between-treatment variance

$$SS_r = \sum_{p=1}^{r} n_p (\bar{X}_p - \bar{X})^2 = 23.3$$

3 Within-treatment variance

$$SS_u = \sum_{p=1}^{r} \sum_{i=1}^{n_p} (X_{ip} - \bar{X}_p)^2 = 34$$

Where
SSr = treatment sums of squares
r = number of groups
np = sample size in group 'p'
X̄p = mean of group p
47. One - Factor Analysis of
Variance
3 Between variance estimate (MSSr)
MSSr = SSr/(r-1) = 23.3/2 = 11.65
3 Within variance estimate (MSSu)
MSSu = SSu/(n-r) = 34/12 = 2.8
Where
n = total sample size
r = number of groups
48. One - Factor Analysis of
Variance
3 Total variation (SSt): SSt = SSr + SSu = 23.3+34 = 57.3
3 F-statistic: F = MSSr / MSSu = 11.65/2.8 = 4.16
3 DF: (r-1), (n-r) = 2, 12
3 Critical value from table: CV(α, df) = 3.89
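The one-factor ANOVA calculation can be reproduced directly from the price-level example. This is a sketch with illustrative variable names; the critical value 3.89 is the table value given on the slide. Note that unrounded arithmetic gives F ≈ 4.12 rather than 4.16, because the slide rounds the intermediate values (23.3, 11.65, 2.8).

```python
# Product sales by price level, from the example in this deck.
sales = {
    "39c": [8, 12, 10, 9, 11],
    "44c": [7, 10, 6, 8, 9],
    "49c": [4, 8, 7, 9, 7],
}

all_observations = [x for group in sales.values() for x in group]
n = len(all_observations)   # total sample size
r = len(sales)              # number of treatments (price levels)

grand_mean = sum(all_observations) / n
group_means = {level: sum(obs) / len(obs) for level, obs in sales.items()}

# Between-treatment and within-treatment sums of squares
ss_between = sum(
    len(obs) * (group_means[level] - grand_mean) ** 2
    for level, obs in sales.items()
)
ss_within = sum(
    (x - group_means[level]) ** 2
    for level, obs in sales.items()
    for x in obs
)

ms_between = ss_between / (r - 1)
ms_within = ss_within / (n - r)
f_ratio = ms_between / ms_within

critical_value = 3.89       # F table, alpha = 0.05, df = (2, 12), as on the slide
reject_h0 = f_ratio > critical_value

print(f"SSr = {ss_between:.2f}, SSu = {ss_within:.2f}")
print(f"F = {f_ratio:.2f}, df = ({r - 1}, {n - r}), reject Ho: {reject_h0}")
```

Since F exceeds the critical value, the hypothesis that all price levels have the same mean sales is rejected at the 5% level.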
Editor's notes
Solutions for Confidence Interval Exercises (last class):
             Std. error   95%              90%
Problem 1:   4/7          (54.85, 57.14)   (55.05, 56.95)   (X̄ = 56, s = 4, n = 49)
Problem 2:   4/10         (55.2, 56.8)     (55.33, 56.66)   (X̄ = 56, s = 4, n = 100)
Look at book page 473: explain Type I/II error
We do not deal with Goodness of fit!!
Test whether grade and class are related:
Ho: Grade and Class are not related
Ha: Grade and Class are related
Observed counts, with expected counts (cells) and proportions (sums) in parentheses:
                    Class
Grade     1             2            Sum
A         10 (12)       8 (6)        18 (0.12)
B         20 (24)       16 (12)      36 (0.24)
C         45 (42)       18 (21)      63 (0.42)
D         16 (14.66)    6 (7.33)     22 (0.1466)
E         9 (7.33)      2 (3.66)     11 (0.0733)
Sum       100 (0.666)   50 (0.333)   150
χ² = (10-12)²/12 + (8-6)²/6 + (20-24)²/24 + (16-12)²/12 + (45-42)²/42 + (18-21)²/21 + (16-14.66)²/14.66 + (6-7.33)²/7.33 + (9-7.33)²/7.33 + (2-3.66)²/3.66
   = 0.333 + 0.666 + 0.666 + 1.333 + 0.214 + 0.428 + 0.121 + 0.2424 + 0.3787 + 0.752 = 5.136
df = (r-1)*(c-1) = 4*1 = 4
α = 0.05 (significance level)
Critical value (from table) = 9.49
Since 5.136 < CV: do not reject Ho
Chi-Square = 14.201
df = (r-1)*(c-1) = (2-1)*(3-1) = 2
α = 0.05
CV = 5.991
Reject Ho of independence
Talk about Z and t distribution
Population case: therefore z-test
Standard error of mean: σ_x̄ = σ/sqrt(n) = 250/10 = 25
z = (4960-5000)/25 = -1.6
z_{α/2} = 1.96
if |z_calc| > z_{α/2} then reject Ho
since |-1.6| < 1.96 do not reject Ho.
A soft drink manufacturer plans to introduce a new soft drink. 12 supermarkets are selected at random and the soft drink is offered in these supermarkets for a limited time. Average existing soft drink sales are 1000, new soft drink sales are 1087.1.
Sample < 60 therefore t-test
Standard error of mean: s_x̄ = s/sqrt(n) = 191.6/sqrt(12) = 55.31
t_calc = (1087.1-1000)/55.31 = 1.57
df = 12-1 = 11
t_{11, α/2} = 3.106
if |t_calc| > t_{11, α/2} then reject Ho
since |1.57| < 3.106 do not reject Ho.
One sided test
Sample < 30 therefore t-test
Standard error of mean: s_x̄ = s/sqrt(n) = 191.6/sqrt(12) = 55.31
t_calc = (1087.1-1000)/55.31 = 1.57
df = 12-1 = 11
t_{11, α} = 1.796
if t_calc > t_{11, α} then reject Ho
since 1.57 < 1.796 do not reject Ho.
Rejection rule for opposite directionality: if t_calc < -t_{11, α} then reject Ho