2. INTRODUCTION
WHAT IS IT?
Market Mix Modelling is used to estimate the effectiveness of investment in media.
Statistical methods are applied to measure the impact of media investments, promotional activities and price tactics on sales.
HOW DOES IT WORK?
A statistical model is estimated on historical data, with sales as the dependent variable and marketing activities, price, seasonality and macro factors as the explanatory variables.
The simplest and most widely used model is linear regression.
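As a sketch of the linear regression form referred to here, one common way to write it is shown below. The notation is an assumption and is not taken from the original slide: Sales_t is sales in period t, X_{j,t} is the j-th explanatory variable (a marketing activity, price, seasonality or macro factor) in period t, beta_j is its coefficient and epsilon_t is the error term.

```latex
\[
\text{Sales}_t = \beta_0 + \sum_{j=1}^{k} \beta_j X_{j,t} + \varepsilon_t
\]
```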
The output of the model is then used for further analysis, such as media effectiveness and ROI.
3. STEP BY STEP APPROACH
DATA PREPARATION AND STRUCTURING
Data understanding reveals that the media variables need to be transformed using different Adstock, Power and Lag (APL) combinations.
This rests on the concept that advertising has an effect extending several periods after the original exposure, which is generally referred to as advertising carry-over or ‘Adstock’.
The media effect is broken into two components:
Current effect: the change in sales caused by an advertising exposure occurring in the same time period as the exposure.
Carry-over effect: the change in sales that occurs in the time periods following the pulse of advertising.
The following APLs are tried:
ADSTOCK: .1, .3, .5, .7, .9
POWER: .1, .3, .5, .7, .9
LAG: 0, 1, 2
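The Adstock, Power and Lag combinations listed above can be sketched in Python as follows. This is an illustration only, not the transformation code used for this analysis: the geometric carry-over recursion, the order of the three steps (adstock, then power, then lag), the zero padding of lagged periods and the function names are all assumptions. It also assumes non-negative spend.

```python
from itertools import product

# Grid values as listed in the text above.
ADSTOCKS = [0.1, 0.3, 0.5, 0.7, 0.9]
POWERS = [0.1, 0.3, 0.5, 0.7, 0.9]
LAGS = [0, 1, 2]


def adstock(spend, rate):
    """Geometric carry-over (assumed form): each period keeps `rate`
    of the previous period's adstocked value plus the current spend."""
    carried = 0.0
    out = []
    for x in spend:
        carried = x + rate * carried
        out.append(carried)
    return out


def apl_transform(spend, rate, power, lag):
    """Apply adstock, then a power curve, then a lag (assumed order).

    Lagged periods at the start of the series are padded with 0.0,
    which is an assumption made for this sketch.
    """
    stocked = adstock(spend, rate)
    shaped = [v ** power for v in stocked]
    padded = [0.0] * lag + shaped
    return padded[:len(shaped)]


# Every (adstock, power, lag) combination that would be tried.
APL_GRID = list(product(ADSTOCKS, POWERS, LAGS))
```

Each tuple in `APL_GRID` would produce one candidate transformed variable per media channel, and the modelling step then chooses among the candidates.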
DATA UNDERSTANDING
The sample data provided has 2 years of monthly sales data for 2 different regions.
It also has the TV, RADIO, ONLINE, PRINT and OUTDOOR monthly spend for 2 years.
The level of the data is Region x Month.
The five steps, in order: DATA UNDERSTANDING, DATA PREPARATION AND STRUCTURING, EXPLORATORY DATA ANALYSIS, MODEL FITTING USING SAS, FINAL MARKETING MIX.
EXPLORATORY DATA ANALYSIS
After the data was structured, basic exploratory analysis was done to check trends and observe relationships.
The "data exploration" tab provides a snippet of the data exploration that was conducted.
Using Proc Corr, the bivariate relationships and correlations between sales_volume and the different media activities (TV, RADIO, ONLINE, etc.) were examined.
The test for autocorrelation was conducted using Proc Autoreg.
In our data, the first-order Durbin-Watson test is insignificant, with p > .0001 for the hypothesis of no first-order autocorrelation. Thus, no autocorrelation correction is needed.
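The bivariate correlation check described above was run with Proc Corr. As a rough illustration of what such a check computes, the Pearson correlation between sales and one media variable can be sketched in Python as below. The function name and the sample numbers in the usage note are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

For example, `pearson(tv_spend, sales_volume)` would be evaluated once per media variable and per region, where `tv_spend` and `sales_volume` are hypothetical lists of monthly values.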
MODEL FITTING USING SAS
A 2-step modelling technique was followed:
STAGE 1: The base/seasonality was estimated using monthly dummies, and the SSN_BASE variable was created. This was incorporated in the stage 2 model as an independent variable along with the other marketing/transformed marketing variables.
STAGE 2: Since we have regional data on marketing activity as well as the corresponding sales at the regional level, we fit a linear mixed model with region as the random class structure. The RANDOM statement specifies the variables whose effect on sales is assumed to be random among the different regions. Proc Mixed was used for this.
FINAL MARKETING MIX
It was observed that the contribution of SSN_BASE (seasonality) is the highest, followed by the Online and Radio contributions.
The R-square for the model was around 63%.
4. DATA EXPLORATION
Figure (Region R001): the original TV and Radio variables are plotted against the transformed TV and Radio variables, by month (axis labels from 01Nov2012 to 01Sep2014).
TV series shown (sums): TV_000100, TV_000300, TV_000500, TV_000700, TV_000900, TV_100100, TV_100300, TV_100500, TV_100700, TV_100900.
Radio series shown (sums): Radio_000100, Radio_000300, Radio_000500, Radio_000700, Radio_000900, Radio_100100, Radio_100300, Radio_100500, Radio_100700, Radio_100900, Radio_300100, Radio_300300.
Figure (Region R001): the Sales Volume is plotted against the original TV and Radio variables, by month (01Nov2012 to 01Oct2014).
Series shown: Sum of TV with Sum of Sales_Volume; Sum of Radio with Sum of Sales_Volume.
Figure (Region R220): the original TV and Radio variables are plotted against the transformed TV and Radio variables, by month (axis labels from 01Nov2012 to 01Sep2014).
TV series shown (sums): TV_000100, TV_000300, TV_000500, TV_000700, TV_000900, TV_100100, TV_100300, TV_100500, TV_100700, TV_100900.
Radio series shown (sums): Radio_000100, Radio_000300, Radio_000500, Radio_000700, Radio_000900, Radio_100100, Radio_100300.
Figure (Region R220): the Sales Volume is plotted against the original TV and Radio variables, by month (01Nov2012 to 01Oct2014).
Series shown: Sum of TV with Sum of Sales_Volume; Sum of Radio with Sum of Sales_Volume.
5. MODELLING PROCESS
Autocorrelation test
The Durbin-Watson test is a widely used method of testing
for autocorrelation.
The following statements perform the Durbin-Watson test
for autocorrelation in the OLS residuals for orders 1
through 4.
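/*-- Durbin-Watson test for autocorrelation --*/
/* BY-group processing assumes the data set is sorted by region_code */
proc autoreg data=ads_regmon_wid_ssn outest=testdw;
   by region_code;
   /* dw=4 requests Durbin-Watson statistics for orders 1 through 4; dwprob prints their p-values */
   model sales_volume = time / dw=4 dwprob;
run;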
In our case, the first-order Durbin-Watson test is insignificant, with p > .0001 for the hypothesis of no first-order autocorrelation.
Thus, no autocorrelation correction is needed.
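To make the statistic behind this test concrete, the first-order Durbin-Watson value can be computed from OLS residuals as sketched below in Python. This illustrates the standard formula only and does not reproduce the Proc Autoreg output or its p-values; the helper names are hypothetical, and the regression on a simple time index mirrors the `model sales_volume = time` statement above.

```python
def ols_residuals_on_time(y):
    """Residuals from regressing y on a time index 1..n with an intercept."""
    n = len(y)
    t = list(range(1, n + 1))
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    slope = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
             / sum((ti - mean_t) ** 2 for ti in t))
    intercept = mean_y - slope * mean_t
    return [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]


def durbin_watson(residuals):
    """First-order Durbin-Watson statistic:
    sum of squared successive differences over sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den
```

Values near 2 are consistent with no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. In this workflow the statistic would be computed separately for each region.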
Proc Mixed
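/* Stage 2: linear mixed model fitted by maximum likelihood */
PROC MIXED DATA=ads_regmon_wid_ssn METHOD=ML;
   CLASS region_code;
   /* Fixed effects; NOINT drops the intercept because SSN_BASE carries the base */
   MODEL sales_volume = SSN_BASE
                        Print
                        TV_100700
                        Radio_100701
                        Online_500301
                        Outdoor_100701
                        / NOINT SOLUTION;
   /* Effects allowed to vary by region; SSN_BASE and Online_500301 stay fixed only */
   RANDOM Print
          TV_100700
          Radio_100701
          Outdoor_100701
          / SOLUTION SUBJECT=region_code;
   /* Save the fixed-effect estimates to a data set */
   ODS OUTPUT SolutionF=est_fixed;
RUN;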
ESTIMATES
Effect            Estimate     StdErr     DF   tValue   Probt
ssn_base               0.9        0.2   36.0      5.4     0.0
Print               1560.2     1704.7    1.0      0.9     0.5
TV_100700            733.6     1105.5    1.0      0.7     0.6
Radio_100701        4332.1     3855.2    1.0      1.1     0.5
Online_500301      88466.6    64882.9   36.0      1.4     0.2
Outdoor_100701    -30867.5    12967.1    1.0     -2.4     0.3
STAGE 1: The base/seasonality was estimated using monthly dummies, and the SSN_BASE variable was created. This was incorporated in the stage 2 model as an independent variable along with the other marketing/transformed marketing variables.
STAGE 2: Since we have regional data on marketing activity as well as the corresponding sales at the regional level, we fit a linear mixed model with region as the random class structure. The RANDOM statement specifies the variables whose effect on sales is assumed to be random among the different regions. Proc Mixed was used for this.
A NOINT (no-intercept) model was chosen because SSN_BASE (seasonality) is incorporated as an independent variable to account for the base.
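The stage 1 idea can be illustrated with a short Python sketch. It relies on one fact and several assumptions. The fact: a regression of sales on a full set of monthly dummies alone has fitted values equal to the average sales of each calendar month. The assumptions: that stage 1 contained no regressors other than the monthly dummies, and that it would be run separately for each region. The function name is hypothetical and this is not the code that produced SSN_BASE.

```python
from collections import defaultdict


def seasonal_base(months, sales):
    """Fitted values of a monthly-dummy-only regression.

    `months` holds the calendar month (1-12) of each observation and
    `sales` the matching sales values for a single region. The result
    is the mean of sales for that observation's calendar month.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for m, s in zip(months, sales):
        totals[m] += s
        counts[m] += 1
    month_mean = {m: totals[m] / counts[m] for m in totals}
    return [month_mean[m] for m in months]
```

With two years of monthly data per region, each calendar month's base is the average of its two observations, and the resulting series would play the role of SSN_BASE in the stage 2 model.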
Note: since this is dummy data and the model was developed only for illustration and to assess the approach, I have not put much effort into improving the significance of the parameter estimates.
6. FINAL MIX
For Region R001: ACTUAL VS PREDICTED
At each month, the stacked bars show the predicted contribution of each marketing driver for Region R001.
The contribution of Seasonality is the highest, followed by Online and Radio.
For Region R220: ACTUAL VS PREDICTED
At each month, the stacked bars show the predicted contribution of each marketing driver for Region R220.
The contribution of Seasonality is the highest, followed by Online and Radio.
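The stacked-bar contributions can be understood through the usual decomposition of a no-intercept linear model, in which each driver's contribution in a month is its coefficient multiplied by the driver's value that month. The Python sketch below uses the fixed-effect estimates from the ESTIMATES table. The driver values are hypothetical placeholders, not data from this analysis, and the region-specific random effects are ignored, which is a simplifying assumption.

```python
# Fixed-effect estimates copied from the ESTIMATES table.
FIXED_EFFECTS = {
    "ssn_base": 0.9,
    "Print": 1560.2,
    "TV_100700": 733.6,
    "Radio_100701": 4332.1,
    "Online_500301": 88466.6,
    "Outdoor_100701": -30867.5,
}

# Hypothetical driver values for one region-month (illustration only).
EXAMPLE_DRIVERS = {
    "ssn_base": 200000.0,
    "Print": 5.0,
    "TV_100700": 20.0,
    "Radio_100701": 4.0,
    "Online_500301": 0.5,
    "Outdoor_100701": 0.2,
}


def contributions(coefs, drivers):
    """Contribution of each driver: coefficient times driver value."""
    return {name: coefs[name] * drivers[name] for name in coefs}


def shares(contrib):
    """Each contribution as a fraction of predicted sales (their sum)."""
    total = sum(contrib.values())
    return {name: value / total for name, value in contrib.items()}


example_contrib = contributions(FIXED_EFFECTS, EXAMPLE_DRIVERS)
predicted_sales = sum(example_contrib.values())
example_shares = shares(example_contrib)
```

Because the Outdoor coefficient is negative, its contribution and share are negative in this decomposition, so the shares still sum to 1 but are not all positive.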