GROUP ASSIGNMENT
BUSINESS PROJECTION TECHNIQUES (TEKNIK PROYEKSI BISNIS)
SUMMARY
"Regression with Time Series Data"
Lecturer:
Sigit Indrawijaya, SE. M.Si
Prepared by:
Rizano Ahdiat Rash'ada (C1B011047)
Muchlas Pratama
Roby Harianto (C1B011005)
MANAGEMENT STUDY PROGRAM
FACULTY OF ECONOMICS
UNIVERSITAS JAMBI
2013
Regression with Time Series Data
Many business and economic applications of forecasting involve time series data. Regression models can be fit to monthly, quarterly, or yearly data using the techniques described in previous chapters. However, because data collected over time tend to exhibit trends, seasonal patterns, and so forth, observations in different time periods are related, or autocorrelated. That is, for time series data, the sample of observations cannot be regarded as a random sample. Problems of interpretation can arise when standard regression methods are applied to observations that are related to one another over time. Fitting regression models to time series data must be done with considerable care.
Time Series Data and the Problem of Autocorrelation
With time series data, the assumption of independence rarely holds. Consider the annual base price for a particular model of a new car. Can you imagine the chaos that would exist if the new car prices from one year to the next were indeed unrelated to (independent of) one another? In such a world, prices would be determined like numbers drawn from a random number table. Knowledge of the price in one year would not tell you anything about the price in the next year. In the real world, the price in the current year is related to (correlated with) the price in the previous year, and maybe the price two years ago, and so forth. That is, the prices in different years are autocorrelated; they are not independent.
Autocorrelation exists when successive observations over time are related to one another.
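The strength of this relationship can be measured by the lag 1 autocorrelation coefficient. A minimal sketch in plain Python (the price series below is invented purely for illustration):

```python
def lag1_autocorr(y):
    """Lag 1 autocorrelation: correlation between y_t and y_{t-1},
    computed with deviations from the overall mean."""
    n = len(y)
    mean = sum(y) / n
    num = sum((y[t] - mean) * (y[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in y)
    return num / den

# A steadily rising "price" series is strongly autocorrelated:
# knowing this year's price tells you a lot about next year's.
prices = [20.0, 20.8, 21.5, 22.1, 23.0, 23.4, 24.2, 25.0]
print(round(lag1_autocorr(prices), 2))  # → 0.6
```

A value near zero would indicate that successive prices carry no information about one another.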
Autocorrelation can occur because the effect of a predictor variable on the response is distributed over time. For example, an increase in salary may affect your consumption (or saving) not only in the current period but also in several future periods. A current labor contract may affect the cost of production for some time to come. Over time, relationships tend to be dynamic (evolving), not static.
From a forecasting perspective, autocorrelation is not all bad. If values of a response Y in one time period are related to Y values in previous time periods, then previous Y's can be used to predict future Y's. In a regression framework, autocorrelation is handled by "fixing up" the standard regression model. To accommodate autocorrelation, sometimes it is necessary to change the mix of predictor variables and/or the form of the regression function. More typically, however, autocorrelation is handled by changing the nature of the error term.
A common kind of autocorrelation, often called first-order serial correlation, is one in which the error term in the current time period is directly related to the error term in the previous time period. In this case, with the subscript t representing time, the simple linear regression model takes the form

Yt = β0 + β1Xt + εt    (1)

with

εt = ρεt-1 + νt    (2)

where

εt = the error at time t
ρ = the parameter (lag 1 autocorrelation coefficient) that measures correlation between adjacent error terms
νt = normally distributed independent error term with mean 0 and variance σ²

Equation 2 says that the level of one error term (εt-1) directly affects the level of the next error term (εt). The magnitude of the autocorrelation coefficient ρ, where −1 < ρ < 1, indicates the strength of the serial correlation. If ρ is zero, then there is no serial correlation, and the error terms are independent (εt = νt).
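This first-order structure is easy to simulate. A minimal sketch in plain Python (all numbers simulated, for illustration only) that generates errors following Equation 2 and checks their lag 1 correlation:

```python
import random

def ar1_errors(rho, n, sigma=1.0, seed=1):
    """Generate errors following eps_t = rho * eps_{t-1} + nu_t,
    where nu_t is independent N(0, sigma^2) noise (Equation 2)."""
    random.seed(seed)
    eps = [random.gauss(0.0, sigma)]
    for _ in range(n - 1):
        eps.append(rho * eps[-1] + random.gauss(0.0, sigma))
    return eps

def lag1_corr(e):
    """Sample lag 1 autocorrelation of a series."""
    n = len(e)
    m = sum(e) / n
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, n))
    den = sum((v - m) ** 2 for v in e)
    return num / den

eps = ar1_errors(rho=0.8, n=2000)
print(lag1_corr(eps))  # close to the true rho of 0.8
```

With ρ = 0 the generated errors reduce to pure independent noise, matching the remark that εt = νt in that case.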
Durbin-Watson Test for Serial Correlation
One approach that is used frequently to determine whether serial correlation is present is the Durbin-Watson test. The test involves determining whether the autocorrelation parameter ρ shown in Equation 2 is zero. Consider

εt = ρεt-1 + νt
The hypotheses to be tested are

H0: ρ = 0
H1: ρ > 0

The alternative hypothesis is ρ > 0, since business and economic time series tend to show positive autocorrelation.
If a regression model does not properly account for autocorrelation, the residuals will be autocorrelated. So, the Durbin-Watson test is carried out using the residuals from the regression analysis.
The Durbin-Watson statistic is defined as

DW = Σ(et − et-1)² / Σet²

where the sum in the numerator runs over t = 2, ..., n, the sum in the denominator runs over t = 1, ..., n, and

et = Yt − Ŷt = the residual for time period t
et-1 = Yt-1 − Ŷt-1 = the residual for time period t − 1
For positive serial correlation, successive residuals tend to be alike and the
sum of squared differences in the numerator of the Durbin-Watson statistic will be
relatively small. Small values of the Durbin-Watson statistic are consistent with
positive serial correlation.
The autocorrelation coefficient ρ can be estimated by the lag 1 residual autocorrelation r1(e), and with a little mathematical maneuvering, the Durbin-Watson statistic can be related to r1(e). For moderate to large samples,

DW ≈ 2(1 − r1(e))
Since −1 < r1(e) < 1, the equation above shows that 0 < DW < 4. For r1(e) close to 0, the DW statistic will be close to 2. Positive lag 1 residual autocorrelation is associated with DW values less than 2, and negative lag 1 residual autocorrelation is associated with DW values above 2.
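The statistic itself is a one-line computation. A minimal sketch in plain Python (the two residual series are made up to illustrate the two extremes):

```python
def durbin_watson(residuals):
    """DW = sum over t=2..n of (e_t - e_{t-1})^2, divided by sum of e_t^2."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# A slowly drifting residual sequence (positive lag 1 autocorrelation)
# pushes DW toward 0; sign-alternating residuals (negative lag 1
# autocorrelation) push DW above 2, toward 4.
drifting = [1.0, 1.2, 0.9, 1.1, 0.8, 0.7, 0.9, 0.6]
alternating = [1.0, -1.1, 0.9, -1.0, 1.1, -0.9, 1.0, -1.2]
print(durbin_watson(drifting) < 2 < durbin_watson(alternating))  # → True
```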
A useful, but sometimes not definitive, test for serial correlation can be performed by comparing the calculated value of the Durbin-Watson statistic with lower (L) and upper (U) bounds. The decision rules are:
1. When the Durbin-Watson statistic is larger than the upper (U) bound, the autocorrelation coefficient ρ is equal to zero (there is no positive autocorrelation).
2. When the Durbin-Watson statistic is smaller than the lower (L) bound, the autocorrelation coefficient ρ is greater than zero (there is positive autocorrelation).
3. When the Durbin-Watson statistic lies within the lower and upper bounds, the test is inconclusive (we don't know whether there is positive autocorrelation).
The Durbin-Watson test is used to determine whether positive autocorrelation
is present.
If DW > U, conclude H0: ρ = 0. If DW < L, conclude H1: ρ > 0.
If DW lies within the lower and upper bounds (L ≤ DW ≤ U), the test is
inconclusive.
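These three decision rules translate directly into code. A minimal sketch (the bounds shown are illustrative placeholders; real L and U values come from a Durbin-Watson table for the given sample size and number of predictors):

```python
def dw_test(dw, lower, upper):
    """Three-way Durbin-Watson decision against tabled bounds L and U."""
    if dw > upper:
        return "no positive autocorrelation (conclude rho = 0)"
    if dw < lower:
        return "positive autocorrelation (conclude rho > 0)"
    return "inconclusive"

# Hypothetical bounds for illustration only.
print(dw_test(0.95, lower=1.10, upper=1.54))
print(dw_test(1.30, lower=1.10, upper=1.54))  # falls between L and U
```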
Solutions to Autocorrelation Problems
Once autocorrelation has been discovered in a regression of time series data, it is necessary to remove it, or model it, before the regression function can be evaluated for its effectiveness.
The solution to the problem of serial correlation begins with an
evaluation of the model specification. Is the functional form correct? Were any
important variables omitted? Are there effects that might have some pattern
over time that could have introduced autocorrelation into the errors?
Since a major cause of autocorrelated errors in the regression model is the omission of one or more key variables, the best approach to solving the problem is to find them. This effort is sometimes referred to as improving the model specification. Model specification not only involves finding the important predictor variables; it also involves entering these variables in the regression function in the right way. Unfortunately, it is not always possible to improve the model specification, because an important missing variable may not be quantifiable or, if it is quantifiable, the data may not be available. For example, one may suspect that business investment in future periods is related to the attitude of potential investors. However, it is difficult to quantify the variable "attitude." Nevertheless, whenever possible, the model should be specified in accordance with theoretically sound insight.
Only after the specification of the equation has been carefully reviewed should the possibility of an adjustment be considered. Several techniques for eliminating autocorrelation will be discussed.
One approach to eliminating autocorrelation is to add an omitted variable to the regression function that explains the association in the response from one period to the next.
REGRESSION WITH DIFFERENCES
For highly autocorrelated data, modeling changes rather than levels can often eliminate the serial correlation. That is, instead of formulating the regression equation in terms of Y and X1, X2, ..., Xk, the regression equation is written in terms of the differences, Y't = Yt − Yt-1, and X't1 = Xt1 − Xt-1,1, X't2 = Xt2 − Xt-1,2, and so forth. Differences should be considered when the Durbin-Watson statistic associated with the regression involving the original variables is close to 0.
One rationale for differencing comes from the following argument. Consider the model

Yt = β0 + β1Xt + εt

with

εt = ρεt-1 + νt

where

ρ = correlation between consecutive errors
νt = random error (εt = νt when ρ = 0)

The model holds for any time period, so

Yt-1 = β0 + β1Xt-1 + εt-1

Multiplying this second equation by ρ and subtracting it from the first gives

Yt − ρYt-1 = β0(1 − ρ) + β1(Xt − ρXt-1) + νt

When ρ = 1, this reduces to a regression of the simple differences, Yt − Yt-1 = β1(Xt − Xt-1) + νt, whose errors νt are independent.
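The effect can be seen in a small simulation. A sketch in plain Python (all data simulated; the coefficients are arbitrary choices): with ρ close to 1, the errors εt are highly autocorrelated, but the differenced errors εt − εt-1 are nearly independent.

```python
import random

random.seed(7)
rho, beta0, beta1, n = 0.95, 10.0, 0.5, 500

# Simulate Y_t = beta0 + beta1 * X_t + eps_t with near-unit-root AR(1) errors.
x, eps = [], [random.gauss(0.0, 1.0)]
for t in range(n):
    x.append(t + random.gauss(0.0, 1.0))
    if t > 0:
        eps.append(rho * eps[-1] + random.gauss(0.0, 1.0))
y = [beta0 + beta1 * x[t] + eps[t] for t in range(n)]  # the response series

def lag1(e):
    """Sample lag 1 autocorrelation."""
    m = sum(e) / len(e)
    return (sum((e[t] - m) * (e[t - 1] - m) for t in range(1, len(e)))
            / sum((v - m) ** 2 for v in e))

# Errors of the levels model are strongly autocorrelated ...
print(lag1(eps))
# ... but the differenced errors eps_t - eps_{t-1} are nearly independent.
deps = [eps[t] - eps[t - 1] for t in range(1, n)]
print(lag1(deps))
```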
Time Series Data and the Problem of Heteroscedasticity
Variability can increase if a variable is growing at a constant rate rather than a constant amount over time. Nonconstant variability is called heteroscedasticity. In a regression framework, heteroscedasticity occurs if the variance of the error term, εt, is not constant. If the variability for recent time periods is larger than it was for past time periods, then the standard error of the estimate underestimates the current standard deviation of the error term. If the standard error of the estimate is then used to set forecast limits for future observations, these limits can be too narrow for the stated confidence level.
Using Regression to Forecast Seasonal Data
In this approach, seasonality is handled by using dummy variables in the regression function.
A seasonal model for quarterly data with a time trend is

Yt = β0 + β1t + β2S2 + β3S3 + β4S4 + εt

where
Yt = the variable to be forecast
t = the time index
S2 = a dummy variable that is 1 for the second quarter of the year; 0 otherwise
S3 = a dummy variable that is 1 for the third quarter of the year; 0 otherwise
S4 = a dummy variable that is 1 for the fourth quarter of the year; 0 otherwise
εt = errors assumed to be independent and normally distributed with mean zero and constant variance
β0, β1, β2, β3, β4 = coefficients to be estimated
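This model can be fit by ordinary least squares once the dummies are built. A minimal sketch with NumPy (the quarterly series is simulated, and the "true" coefficients are arbitrary choices for illustration):

```python
import numpy as np

# Simulate 10 years of quarterly data from
# Y_t = 100 + 2*t + 10*S3 - 5*S4 + noise (quarter 1 is the base case).
rng = np.random.default_rng(0)
n = 40
t = np.arange(1, n + 1)
quarter = ((t - 1) % 4) + 1
S2 = (quarter == 2).astype(float)
S3 = (quarter == 3).astype(float)
S4 = (quarter == 4).astype(float)
y = 100 + 2 * t + 10 * S3 - 5 * S4 + rng.normal(0, 1, n)

# Design matrix: intercept, trend t, and the three quarterly dummies.
# Quarter 1 needs no dummy; it is absorbed by the intercept.
X = np.column_stack([np.ones(n), t, S2, S3, S4])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 1))  # estimates near [100, 2, 0, 10, -5]
```

Each dummy coefficient estimates how far its quarter sits above or below the first-quarter baseline, after removing the trend.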
Econometric Forecasting
When regression analysis is applied to economic data, the predictions developed
from such models are referred to as economic forecasts. However, since economic
theory frequently suggests that the values taken by the quantities of interest are
determined through the simultaneous interaction of different economic forces, it
may be necessary to model this interaction with a set of simultaneous equations.
This idea leads to the construction of simultaneous equation econometric models.
These models involve individual equations that look like regression equations.
However, in a simultaneous system the individual equations are related, and the
econometric model allows the joint determination of a set of dependent variables in
terms of several independent variables. This contrasts with the usual regression
situation in which a single equation determines the expected value of one
dependent variable in terms of the independent variables.
A simultaneous equation econometric model determines jointly the values of a set
of dependent variables, called endogenous variables by econometricians, in terms
of the values of independent variables, called exogenous variables. The values of the exogenous variables are assumed to influence the endogenous variables but not the other way around. A complete simultaneous equation model will involve the same number of equations as endogenous variables.
Economic theory holds that, in equilibrium, the quantity supplied is equal to the
quantity demanded at a particular price. That is, the quantity demanded, the
quantity supplied, and price are determined simultaneously. In one study of the
price elasticity of demand, the model was specified as
Qt = α0 + α1Pt + α2It + α3Tt + εt
Pt = β0 + β1Qt + β2Lt + νt

where
Qt = a measure of the demand (quantity sold)
Pt = a measure of price (deflated dollars)
It = a measure of income per capita
Tt = a measure of temperature
Lt = a measure of labor cost
εt, νt = independent error terms that are uncorrelated with each other
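The joint determination of Qt and Pt can be illustrated by solving the two structural equations as a 2x2 linear system for one period. A sketch with NumPy, where every coefficient value is a hypothetical placeholder, not from the study cited:

```python
import numpy as np

# Hypothetical structural coefficients, for illustration only.
a0, a1, a2, a3 = 50.0, -2.0, 0.3, 0.1   # demand: Q = a0 + a1*P + a2*I + a3*T
b0, b1, b2 = 5.0, 0.2, 0.5              # price:  P = b0 + b1*Q + b2*L

def solve_period(I, T, L, eps=0.0, nu=0.0):
    """The endogenous Q and P are determined jointly: move them to the
    left-hand side and solve the 2x2 system implied by the two
    structural equations, given the exogenous I, T, and L."""
    A = np.array([[1.0, -a1],
                  [-b1, 1.0]])
    b = np.array([a0 + a2 * I + a3 * T + eps,
                  b0 + b2 * L + nu])
    Q, P = np.linalg.solve(A, b)
    return Q, P

Q, P = solve_period(I=100.0, T=70.0, L=20.0)
# Both structural equations hold simultaneously at the solution:
print(np.isclose(Q, a0 + a1 * P + a2 * 100.0 + a3 * 70.0))  # → True
```

This is exactly the sense in which the endogenous variables are "determined simultaneously": neither equation can be solved for its left-hand variable without the other.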
Large-scale econometric models are being used today to model the behavior of
specific firms within an industry, selected industries within the economy, and
the total economy. Econometric models can include any number of
simultaneous multiple regression-like equations. Econometric models are
used to understand how the economy works and to generate forecasts of key
economic variables. Econometric models are important aids in policy
formulation.