Rob J Hyndman
Forecasting hierarchical time series
Outline
1 Hierarchical time series
2 Forecasting framework
3 Optimal forecasts
4 Approximately optimal forecasts
5 Application to Australian tourism
6 hts package for R
7 References
Introduction

Total
  A: AA, AB, AC
  B: BA, BB, BC
  C: CA, CB, CC

Examples
  Manufacturing product hierarchies
  Net labour turnover
  Pharmaceutical sales
  Tourism demand by region and purpose
Forecasting the PBS
ATC drug classification
  A  Alimentary tract and metabolism
  B  Blood and blood forming organs
  C  Cardiovascular system
  D  Dermatologicals
  G  Genito-urinary system and sex hormones
  H  Systemic hormonal preparations, excluding sex hormones and insulins
  J  Anti-infectives for systemic use
  L  Antineoplastic and immunomodulating agents
  M  Musculo-skeletal system
  N  Nervous system
  P  Antiparasitic products, insecticides and repellents
  R  Respiratory system
  S  Sensory organs
  V  Various
ATC drug classification
  A        Alimentary tract and metabolism (14 classes)
  A10      Drugs used in diabetes (84 classes)
  A10B     Blood glucose lowering drugs
  A10BA    Biguanides
  A10BA02  Metformin
Australian tourism
Also split by purpose of travel:
Holiday
Visits to friends and relatives
Business
Other
Hierarchical/grouped time series

A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure.
Example: pharmaceutical products are organized in a hierarchy under the Anatomical Therapeutic Chemical (ATC) Classification System.

A grouped time series is a collection of time series that are aggregated in a number of non-hierarchical ways.
Example: Australian tourism demand is grouped by region and purpose of travel.
Hierarchical/grouped time series

Forecasts should be “aggregate consistent”, unbiased, and minimum variance.
Existing methods:
  Bottom-up
  Top-down
  Middle-out
How to compute forecast intervals?
Most research is concerned with the relative performance of existing methods.
There is no research on forecasting grouped time series.
Top-down method

Advantages
  Works well in the presence of low counts.
  Single forecasting model, easy to build.
  Provides reliable forecasts for aggregate levels.

Disadvantages
  Loss of information, especially individual series dynamics.
  Distribution of forecasts to lower levels can be difficult.
  No prediction intervals.
Bottom-up method

Advantages
  No loss of information.
  Better captures the dynamics of individual series.

Disadvantages
  Large number of series to be forecast.
  Constructing forecasting models is harder because of noisy data at the bottom level.
  No prediction intervals.
A new approach

We propose a new statistical framework for forecasting hierarchical time series which:
  1 provides point forecasts that are consistent across the hierarchy;
  2 allows for correlations and interaction between series at each level;
  3 provides estimates of forecast uncertainty which are consistent across the hierarchy;
  4 allows for ad hoc adjustments and inclusion of covariates at any level.
Hierarchical data

Total
  A  B  C

\[
\mathbf{Y}_t = \begin{bmatrix} Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \end{bmatrix}
= \underbrace{\begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{S}
\underbrace{\begin{bmatrix} Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \end{bmatrix}}_{\mathbf{B}_t}
\qquad\Longrightarrow\qquad \mathbf{Y}_t = S\,\mathbf{B}_t
\]

Y_t: observed aggregate of all series at time t.
Y_{X,t}: observation on series X at time t.
B_t: vector of all series at the bottom level in time t.
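To see this identity in action, here is a small sketch in base R (the numbers are invented for illustration): it builds the summing matrix S for the hierarchy above and checks that multiplying S by a bottom-level vector reproduces the aggregate.

    # Sketch (illustrative values): summing matrix S for the hierarchy Total / A, B, C.
    S <- rbind(Total = c(1, 1, 1),   # Total = A + B + C
               diag(3))              # the bottom-level series themselves
    B_t <- c(A = 10, B = 20, C = 30) # hypothetical bottom-level observations
    Y_t <- S %*% B_t                 # Y_t = S B_t
    drop(Y_t)                        # Total = 60, then A, B, C unchanged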
Hierarchical data

Total
  A: AX, AY, AZ
  B: BX, BY, BZ
  C: CX, CY, CZ

\[
\mathbf{Y}_t = \begin{bmatrix}
Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
\end{bmatrix}
= \underbrace{\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}}_{S}
\underbrace{\begin{bmatrix}
Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
\end{bmatrix}}_{\mathbf{B}_t}
\]

\[ \mathbf{Y}_t = S\,\mathbf{B}_t \]
Grouped data

Two aggregation structures over the same bottom-level series:
  Total; A: AX, AY; B: BX, BY
  Total; X: AX, BX; Y: AY, BY

\[
\mathbf{Y}_t = \begin{bmatrix}
Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{X,t} \\ Y_{Y,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{BX,t} \\ Y_{BY,t}
\end{bmatrix}
= \underbrace{\begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}}_{S}
\underbrace{\begin{bmatrix}
Y_{AX,t} \\ Y_{AY,t} \\ Y_{BX,t} \\ Y_{BY,t}
\end{bmatrix}}_{\mathbf{B}_t}
\]

\[ \mathbf{Y}_t = S\,\mathbf{B}_t \]
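The same identity holds for grouped data. A base-R sketch (with made-up numbers) for the structure above, where the bottom level is ordered (AX, AY, BX, BY):

    # Sketch: summing matrix for the grouped structure (groups A/B crossed with X/Y).
    S <- rbind(Total = c(1, 1, 1, 1),
               A     = c(1, 1, 0, 0),
               B     = c(0, 0, 1, 1),
               X     = c(1, 0, 1, 0),
               Y     = c(0, 1, 0, 1),
               diag(4))                      # AX, AY, BX, BY themselves
    B_t <- c(AX = 3, AY = 5, BX = 2, BY = 7) # hypothetical bottom-level values
    drop(S %*% B_t)                          # every grouping is aggregate consistent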
2 Forecasting framework
Forecasting notation

Let \(\hat{\mathbf{Y}}_n(h)\) be the vector of initial h-step forecasts, made at time n, stacked in the same order as \(\mathbf{Y}_t\). (They may not add up.)

Hierarchical forecasting methods of the form:
\[ \tilde{\mathbf{Y}}_n(h) = S\,P\,\hat{\mathbf{Y}}_n(h) \]
for some matrix P.
  P extracts and combines the base forecasts \(\hat{\mathbf{Y}}_n(h)\) to get bottom-level forecasts.
  S adds them up.
  Revised reconciled forecasts: \(\tilde{\mathbf{Y}}_n(h)\).
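In code, the whole class of methods is one line. A minimal sketch (P is whatever reconciliation matrix a given method prescribes):

    # Sketch of the general framework: revise base forecasts yhat with some P,
    # then aggregate with S.  Bottom-up and top-down are special cases of this call.
    reconcile <- function(yhat, S, P) {
      drop(S %*% (P %*% yhat))   # Y~_n(h) = S P Y^_n(h)
    }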
Bottom-up forecasts

\[ \tilde{\mathbf{Y}}_n(h) = S\,P\,\hat{\mathbf{Y}}_n(h) \]

Bottom-up forecasts are obtained using
\[ P = [\,\mathbf{0} \mid I\,], \]
where 0 is a null matrix and I is an identity matrix.
  The P matrix extracts only the bottom-level forecasts from \(\hat{\mathbf{Y}}_n(h)\).
  S adds them up to give the bottom-up forecasts.
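A sketch of the bottom-up P for the small hierarchy Total / A, B, C (base forecasts are invented, and deliberately not aggregate consistent):

    S    <- rbind(c(1, 1, 1), diag(3))              # summing matrix for Total / A, B, C
    P_bu <- cbind(rep(0, 3), diag(3))               # P = [0 | I]: keep only bottom-level forecasts
    yhat <- c(Total = 100, A = 35, B = 25, C = 45)  # hypothetical base forecasts
    drop(S %*% P_bu %*% yhat)                       # bottom-up: 105, 35, 25, 45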
Top-down forecasts

\[ \tilde{\mathbf{Y}}_n(h) = S\,P\,\hat{\mathbf{Y}}_n(h) \]

Top-down forecasts are obtained using
\[ P = [\,\mathbf{p} \mid \mathbf{0}\,], \]
where \(\mathbf{p} = [p_1, p_2, \ldots, p_{m_K}]'\) is a vector of proportions that sum to one.
  P distributes forecasts of the aggregate to the lowest-level series.
  Different methods of top-down forecasting lead to different proportionality vectors p.
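And a sketch of a top-down P, using made-up proportions p:

    S    <- rbind(c(1, 1, 1), diag(3))
    p    <- c(0.3, 0.3, 0.4)                        # proportions summing to one (invented)
    P_td <- cbind(p, matrix(0, 3, 3))               # P = [p | 0]: keep only the Total forecast
    yhat <- c(Total = 100, A = 35, B = 25, C = 45)
    drop(S %*% P_td %*% yhat)                       # the Total forecast is split as 30, 30, 40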
General properties: bias

\[ \tilde{\mathbf{Y}}_n(h) = S\,P\,\hat{\mathbf{Y}}_n(h) \]

Assume: the base forecasts \(\hat{\mathbf{Y}}_n(h)\) are unbiased:
\[ \mathrm{E}[\hat{\mathbf{Y}}_n(h) \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_n] = \mathrm{E}[\mathbf{Y}_{n+h} \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_n] \]
Let \(\hat{\mathbf{B}}_n(h)\) be the bottom-level base forecasts, with \(\boldsymbol{\beta}_n(h) = \mathrm{E}[\hat{\mathbf{B}}_n(h) \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_n]\). Then \(\mathrm{E}[\hat{\mathbf{Y}}_n(h)] = S\boldsymbol{\beta}_n(h)\).

We want the revised forecasts to be unbiased:
\[ \mathrm{E}[\tilde{\mathbf{Y}}_n(h)] = SPS\boldsymbol{\beta}_n(h) = S\boldsymbol{\beta}_n(h). \]
The result will hold provided SPS = S.
True for bottom-up, but not for any top-down method or middle-out method.
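The condition SPS = S is easy to check numerically. A sketch with the same small hierarchy shows it holds for the bottom-up P but not for a top-down P:

    S    <- rbind(c(1, 1, 1), diag(3))
    P_bu <- cbind(rep(0, 3), diag(3))                  # bottom-up P
    P_td <- cbind(c(0.3, 0.3, 0.4), matrix(0, 3, 3))   # a top-down P
    all.equal(S %*% P_bu %*% S, S)                     # TRUE: bottom-up stays unbiased
    isTRUE(all.equal(S %*% P_td %*% S, S))             # FALSE: top-down violates SPS = S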
General properties: variance

\[ \tilde{\mathbf{Y}}_n(h) = S\,P\,\hat{\mathbf{Y}}_n(h) \]

Let the variance of the base forecasts \(\hat{\mathbf{Y}}_n(h)\) be given by
\[ \Sigma_h = \mathrm{V}[\hat{\mathbf{Y}}_n(h) \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_n]. \]
Then the variance of the revised forecasts is given by
\[ \mathrm{V}[\tilde{\mathbf{Y}}_n(h) \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_n] = S P \Sigma_h P' S'. \]
This is a general result for all existing methods.
3 Optimal forecasts
Forecasts

Key idea: forecast reconciliation
  Ignore structural constraints and forecast every series of interest independently.
  Adjust forecasts to impose constraints.

Let \(\hat{\mathbf{Y}}_n(h)\) be the vector of initial h-step forecasts, made at time n, stacked in the same order as \(\mathbf{Y}_t\).

\(\mathbf{Y}_t = S\mathbf{B}_t\), so \(\hat{\mathbf{Y}}_n(h) = S\boldsymbol{\beta}_n(h) + \boldsymbol{\varepsilon}_h\), where
  \(\boldsymbol{\beta}_n(h) = \mathrm{E}[\mathbf{B}_{n+h} \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_n]\);
  \(\boldsymbol{\varepsilon}_h\) has zero mean and covariance \(\Sigma_h\).

Estimate \(\boldsymbol{\beta}_n(h)\) using GLS?
Optimal combination forecasts

\[ \tilde{\mathbf{Y}}_n(h) = S\hat{\boldsymbol{\beta}}_n(h) = S(S'\Sigma_h^{\dagger}S)^{-1}S'\Sigma_h^{\dagger}\,\hat{\mathbf{Y}}_n(h) \]
(revised forecasts on the left, initial forecasts on the right)

  \(\Sigma_h^{\dagger}\) is a generalized inverse of \(\Sigma_h\).
  Optimal \(P = (S'\Sigma_h^{\dagger}S)^{-1}S'\Sigma_h^{\dagger}\).
  Revised forecasts are unbiased: SPS = S.
  Revised forecasts have minimum variance:
  \[ \mathrm{V}[\tilde{\mathbf{Y}}_n(h) \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_n] = S P \Sigma_h P' S' = S(S'\Sigma_h^{\dagger}S)^{-1}S'. \]
  Problem: \(\Sigma_h\) is hard to estimate.
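A sketch of the optimal (GLS) combination in base R, using MASS::ginv for the generalized inverse. The base forecasts and the covariance matrix are invented, since in practice estimating Σ_h is exactly the hard part:

    library(MASS)                                 # for ginv(), a Moore-Penrose inverse
    S     <- rbind(c(1, 1, 1), diag(3))           # Total / A, B, C
    yhat  <- c(100, 35, 25, 45)                   # hypothetical base forecasts
    Sigma <- diag(c(4, 1, 1, 2))                  # assumed forecast error covariance
    W     <- ginv(Sigma)                          # Sigma_h^dagger
    P     <- solve(t(S) %*% W %*% S) %*% t(S) %*% W   # optimal P
    drop(S %*% P %*% yhat)                        # revised, aggregate-consistent forecasts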
4 Approximately optimal forecasts
Optimal combination forecasts

\[ \tilde{\mathbf{Y}}_n(h) = S(S'\Sigma_h^{\dagger}S)^{-1}S'\Sigma_h^{\dagger}\,\hat{\mathbf{Y}}_n(h) \]
(revised forecasts on the left, base forecasts on the right)

Solution 1: OLS
  Assume \(\boldsymbol{\varepsilon}_h \approx S\boldsymbol{\varepsilon}_{B,h}\), where \(\boldsymbol{\varepsilon}_{B,h}\) is the forecast error at the bottom level.
  Then \(\Sigma_h \approx S\Omega_h S'\), where \(\Omega_h = \mathrm{V}(\boldsymbol{\varepsilon}_{B,h})\).
  If the Moore–Penrose generalized inverse is used, then \((S'\Sigma_h^{\dagger}S)^{-1}S'\Sigma_h^{\dagger} = (S'S)^{-1}S'\).

\[ \tilde{\mathbf{Y}}_n(h) = S(S'S)^{-1}S'\,\hat{\mathbf{Y}}_n(h) \]
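Solution 1 in code is just a projection, needing no estimate of Σ_h at all. A sketch (same toy numbers as before):

    S    <- rbind(c(1, 1, 1), diag(3))
    yhat <- c(100, 35, 25, 45)                       # base forecasts: note 100 != 35 + 25 + 45
    drop(S %*% solve(t(S) %*% S) %*% t(S) %*% yhat)  # OLS reconciliation: top = sum of bottom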
Optimal combination forecasts

\[ \tilde{\mathbf{Y}}_n(h) = S(S'S)^{-1}S'\,\hat{\mathbf{Y}}_n(h) \]

  GLS = OLS.
  Optimal weighted average of initial forecasts.
  Optimal reconciliation weights are \(S(S'S)^{-1}S'\).
  Weights are independent of the data and of the covariance structure of the hierarchy!
Optimal combination forecasts

\[ \tilde{\mathbf{Y}}_n(h) = S(S'S)^{-1}S'\,\hat{\mathbf{Y}}_n(h) \]

For the hierarchy Total; A, B, C, the weights are
\[
S(S'S)^{-1}S' =
\begin{bmatrix}
0.75 & 0.25 & 0.25 & 0.25 \\
0.25 & 0.75 & -0.25 & -0.25 \\
0.25 & -0.25 & 0.75 & -0.25 \\
0.25 & -0.25 & -0.25 & 0.75
\end{bmatrix}
\]
Optimal combination forecasts

Total
  A: AA, AB, AC
  B: BA, BB, BC
  C: CA, CB, CC

Weights:
\[
S(S'S)^{-1}S' =
\left[\begin{array}{rrrrrrrrrrrrr}
0.69 & 0.23 & 0.23 & 0.23 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 \\
0.23 & 0.58 & -0.17 & -0.17 & 0.19 & 0.19 & 0.19 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 \\
0.23 & -0.17 & 0.58 & -0.17 & -0.06 & -0.06 & -0.06 & 0.19 & 0.19 & 0.19 & -0.06 & -0.06 & -0.06 \\
0.23 & -0.17 & -0.17 & 0.58 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & 0.19 & 0.19 & 0.19 \\
0.08 & 0.19 & -0.06 & -0.06 & 0.73 & -0.27 & -0.27 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
0.08 & 0.19 & -0.06 & -0.06 & -0.27 & 0.73 & -0.27 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
0.08 & 0.19 & -0.06 & -0.06 & -0.27 & -0.27 & 0.73 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & 0.73 & -0.27 & -0.27 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & -0.27 & 0.73 & -0.27 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & -0.27 & -0.27 & 0.73 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & 0.73 & -0.27 & -0.27 \\
0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.27 & 0.73 & -0.27 \\
0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.27 & -0.27 & 0.73
\end{array}\right]
\]
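These weight matrices can be reproduced directly. A one-line check in R for the small 4 × 4 case shown earlier:

    S <- rbind(c(1, 1, 1), diag(3))               # Total / A, B, C
    round(S %*% solve(t(S) %*% S) %*% t(S), 2)    # 0.75 on the diagonal, matching the 4 x 4 matrix above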
Features

  Forget “bottom up” or “top down”. This approach combines all forecasts optimally.
  The method outperforms bottom-up and top-down, especially for middle levels.
  Covariates can be included in the initial forecasts.
  Adjustments can be made to the initial forecasts at any level.
  Very simple and flexible method. Can work with any hierarchical or grouped time series.
  Conceptually easy to implement: OLS on base forecasts.
Challenges

\[ \tilde{\mathbf{Y}}_n(h) = S(S'S)^{-1}S'\,\hat{\mathbf{Y}}_n(h) \]

  Computational difficulties in big hierarchies due to the size of the S matrix and singular behavior of (S'S).
  Need to estimate the covariance matrix to produce prediction intervals.
  The assumption \(\boldsymbol{\varepsilon}_h \approx S\boldsymbol{\varepsilon}_{B,h}\) might be unrealistic.
  Ignores the covariance matrix in computing point forecasts.
Optimal combination forecasts

\[ \tilde{\mathbf{Y}}_n(h) = S(S'S)^{-1}S'\,\hat{\mathbf{Y}}_n(h) \]

Solution 2: Rescaling
  Suppose we rescale the original forecasts by \(\Lambda\), reconcile using OLS, and back-scale:
  \[ \tilde{\mathbf{Y}}^{*}_n(h) = S(S'\Lambda^2 S)^{-1}S'\Lambda^2\,\hat{\mathbf{Y}}_n(h). \]
  If \(\Lambda = \big(\Sigma_h^{\dagger}\big)^{1/2}\), we get the GLS solution.
  Approximately optimal solution: \(\Lambda = \big[\operatorname{diagonal}\ \Sigma_1^{\dagger}\big]^{1/2}\).
  That is, \(\Lambda\) contains inverse one-step forecast standard deviations.
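A sketch of the rescaling idea: build Λ from (assumed) one-step forecast standard deviations and use it as a weight matrix in the OLS step.

    S      <- rbind(c(1, 1, 1), diag(3))
    yhat   <- c(100, 35, 25, 45)                  # hypothetical base forecasts
    sd1    <- c(4, 1, 1, 2)                       # assumed one-step forecast standard deviations
    Lambda <- diag(1 / sd1)                       # inverse standard deviations
    L2     <- Lambda %*% Lambda                   # Lambda^2
    drop(S %*% solve(t(S) %*% L2 %*% S) %*% t(S) %*% L2 %*% yhat)   # weighted reconciliation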
Optimal combination forecasts

\[ \tilde{\mathbf{Y}}^{*}_n(h) = S(S'\Lambda^2 S)^{-1}S'\Lambda^2\,\hat{\mathbf{Y}}_n(h) \]

Solution 3: Averaging
  If the bottom-level error series are approximately uncorrelated and have similar variances, then \(\Lambda^2\) is inversely proportional to the number of series making up each element of \(\mathbf{Y}\).
  So set \(\Lambda^2\) to be the inverse row sums of S.
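Solution 3 needs no variance estimates at all: Λ² comes from the structure of S. A sketch:

    S    <- rbind(c(1, 1, 1), diag(3))
    yhat <- c(100, 35, 25, 45)                    # hypothetical base forecasts
    L2   <- diag(1 / rowSums(S))                  # Lambda^2 = inverse row sums of S
    drop(S %*% solve(t(S) %*% L2 %*% S) %*% t(S) %*% L2 %*% yhat)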
5 Application to Australian tourism
Application to Australian tourism

Quarterly data on visitor nights: domestic visitor nights from 1998–2006.
Data from the National Visitor Survey, based on annual interviews of 120,000 Australians aged 15+, collected by Tourism Research Australia.

Also split by purpose of travel:
  Holiday
  Visits to friends and relatives
  Business
  Other
Exponential smoothing methods

Trend component                 Seasonal component
                                N (None)    A (Additive)    M (Multiplicative)
N  (None)                       N,N         N,A             N,M
A  (Additive)                   A,N         A,A             A,M
Ad (Additive damped)            Ad,N        Ad,A            Ad,M
M  (Multiplicative)             M,N         M,A             M,M
Md (Multiplicative damped)      Md,N        Md,A            Md,M
  N,N: Simple exponential smoothing
  A,N: Holt’s linear method
  Ad,N: Additive damped trend method
  M,N: Exponential trend method
  Md,N: Multiplicative damped trend method
  A,A: Additive Holt-Winters’ method
  A,M: Multiplicative Holt-Winters’ method

There are 15 separate exponential smoothing methods.
Each can have an additive or multiplicative error, giving 30 separate models.
Exponential smoothing methods
Seasonal Component
Trend N A M
Component (None) (Additive) (Multiplicative)
N (None) N,N N,A N,M
A (Additive) A,N A,A A,M
Ad (Additive damped) Ad,N Ad,A Ad,M
M (Multiplicative) M,N M,A M,M
Md (Multiplicative damped) Md,N Md,A Md,M
General notation E T S : ExponenTial Smooth
Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt’s linear method with additive errors
M,A,M: Multiplicative Holt-Winters’ method with multiplicative errorsForecasting hierarchical time series Application to Australian tourism 37
Exponential smoothing methods
Seasonal Component
Trend N A M
Component (None) (Additive) (Multiplicative)
N (None) N,N N,A N,M
A (Additive) A,N A,A A,M
Ad (Additive damped) Ad,N Ad,A Ad,M
M (Multiplicative) M,N M,A M,M
Md (Multiplicative damped) Md,N Md,A Md,M
General notation E T S : ExponenTial Smooth
Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt’s linear method with additive errors
M,A,M: Multiplicative Holt-Winters’ method with multiplicative errorsForecasting hierarchical time series Application to Australian tourism 37
Exponential smoothing methods
Seasonal Component
Trend N A M
Component (None) (Additive) (Multiplicative)
N (None) N,N N,A N,M
A (Additive) A,N A,A A,M
Ad (Additive damped) Ad,N Ad,A Ad,M
M (Multiplicative) M,N M,A M,M
Md (Multiplicative damped) Md,N Md,A Md,M
General notation E T S : ExponenTial Smooth
↑
Trend
Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt’s linear method with additive errors
M,A,M: Multiplicative Holt-Winters’ method with multiplicative errorsForecasting hierarchical time series Application to Australian tourism 37
Exponential smoothing methods
Seasonal Component
Trend N A M
Component (None) (Additive) (Multiplicative)
N (None) N,N N,A N,M
A (Additive) A,N A,A A,M
Ad (Additive damped) Ad,N Ad,A Ad,M
M (Multiplicative) M,N M,A M,M
Md (Multiplicative damped) Md,N Md,A Md,M
General notation E T S : ExponenTial Smooth
↑
Trend Seasonal
Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt’s linear method with additive errors
M,A,M: Multiplicative Holt-Winters’ method with multiplicative errorsForecasting hierarchical time series Application to Australian tourism 37
Exponential smoothing methods
Seasonal Component
Trend N A M
Component (None) (Additive) (Multiplicative)
N (None) N,N N,A N,M
A (Additive) A,N A,A A,M
Ad (Additive damped) Ad,N Ad,A Ad,M
M (Multiplicative) M,N M,A M,M
Md (Multiplicative damped) Md,N Md,A Md,M
General notation E T S : ExponenTial Smooth
↑
Error Trend Seasonal
Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt’s linear method with additive errors
M,A,M: Multiplicative Holt-Winters’ method with multiplicative errorsForecasting hierarchical time series Application to Australian tourism 37
Exponential smoothing methods
Seasonal Component
Trend N A M
Component (None) (Additive) (Multiplicative)
N (None) N,N N,A N,M
A (Additive) A,N A,A A,M
Ad (Additive damped) Ad,N Ad,A Ad,M
M (Multiplicative) M,N M,A M,M
Md (Multiplicative damped) Md,N Md,A Md,M
General notation E T S : ExponenTial Smooth
↑
Error Trend Seasonal
Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt’s linear method with additive errors
M,A,M: Multiplicative Holt-Winters’ method with multiplicative errorsForecasting hierarchical time series Application to Australian tourism 37
Exponential smoothing methods
Seasonal Component
Trend N A M
Component (None) (Additive) (Multiplicative)
N (None) N,N N,A N,M
A (Additive) A,N A,A A,M
Ad (Additive damped) Ad,N Ad,A Ad,M
M (Multiplicative) M,N M,A M,M
Md (Multiplicative damped) Md,N Md,A Md,M
General notation E T S : ExponenTial Smooth
↑
Error Trend Seasonal
Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt’s linear method with additive errors
M,A,M: Multiplicative Holt-Winters’ method with multiplicative errorsForecasting hierarchical time series Application to Australian tourism 37
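This naming convention maps directly onto the ets() function in the forecast package for R. A minimal sketch (not from the slides; AirPassengers is simply a convenient built-in example series):

library(forecast)
# Error, Trend, Seasonal codes are passed as a three-letter string
fit1 <- ets(AirPassengers, model = "MAM")                  # M,A,M: multiplicative Holt-Winters with multiplicative errors
fit2 <- ets(AirPassengers, model = "AAN", damped = FALSE)  # A,A,N: Holt's linear method with additive errors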
Innovations state space models
All ETS models can be written in innovations state space form (IJF, 2002).
Additive- and multiplicative-error versions give the same point forecasts but different prediction intervals.
Automatic forecasting
From Hyndman et al. (IJF, 2002):
Apply each of the 30 models that are appropriate to the data. Optimize parameters and initial values using MLE (or some other criterion).
Select the best method using the AIC:
AIC = −2 log(Likelihood) + 2p
where p = number of parameters.
Produce forecasts using the best method.
Obtain prediction intervals from the underlying state space model.
Forecasting hierarchical time series Application to Australian tourism 38
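This procedure is essentially what ets() in the forecast package implements. A minimal sketch (not the authors' code; ets() defaults to the AICc, so the AIC rule quoted above is requested explicitly here):

library(forecast)
fit <- ets(AirPassengers, ic = "aic")   # fit candidate models by MLE, select by AIC
summary(fit)                            # selected ETS model and its AIC
fc <- forecast(fit, h = 24)             # point forecasts and prediction intervals
plot(fc)                                #   from the underlying state space model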
Base forecasts
[Figure slides: base forecasts of domestic visitor nights by year (1998-2008) for Total, NSW, VIC, Nth.Coast.NSW, Metro.QLD, Sth.WA, X201.Melbourne, X402.Murraylands and X809.Daly.]
Forecasting hierarchical time series Application to Australian tourism 39
Hierarchy: states, zones, regions

MAPE                   Forecast horizon (h)
                          1       2       4       6       8   Average
Top Level: Australia
  Bottom-up            3.79    3.58    4.01    4.55    4.24      4.06
  OLS                  3.83    3.66    3.88    4.19    4.25      3.94
  Scaling              3.68    3.56    3.97    4.57    4.25      4.04
  Averaging            3.76    3.60    4.01    4.58    4.22      4.06
Level 1: States
  Bottom-up           10.70   10.52   10.85   11.46   11.27     11.03
  OLS                 11.07   10.58   11.13   11.62   12.21     11.35
  Scaling             10.44   10.17   10.47   10.97   10.98     10.67
  Averaging           10.59   10.36   10.69   11.27   11.21     10.89

Based on a rolling forecast origin with at least 12 observations in the training set.
Forecasting hierarchical time series Application to Australian tourism 40
Hierarchy: states, zones, regions

MAPE                   Forecast horizon (h)
                          1       2       4       6       8   Average
Level 2: Zones
  Bottom-up           14.99   14.97   14.98   15.69   15.65     15.32
  OLS                 15.16   15.06   15.27   15.74   16.15     15.48
  Scaling             14.63   14.62   14.68   15.17   15.25     14.94
  Averaging           14.79   14.79   14.85   15.46   15.49     15.14
Bottom Level: Regions
  Bottom-up           33.12   32.54   32.26   33.74   33.96     33.18
  OLS                 35.89   33.86   34.26   36.06   37.49     35.43
  Scaling             31.68   31.22   31.08   32.41   32.77     31.89
  Averaging           32.84   32.20   32.06   33.44   34.04     32.96

Based on a rolling forecast origin with at least 12 observations in the training set.
Forecasting hierarchical time series Application to Australian tourism 41
Groups: Purpose, states, capital

MAPE                   Forecast horizon (h)
                          1       2       4       6       8   Average
Top Level: Australia
  Bottom-up            3.48    3.30    4.04    4.56    4.58      4.03
  OLS                  3.80    3.64    3.94    4.22    4.35      3.95
  Scaling              3.65    3.45    4.00    4.52    4.57      4.04
  Averaging            3.59    3.33    3.99    4.56    4.58      4.04
Level 1: Purpose of travel
  Bottom-up            8.14    8.37    9.02    9.39    9.52      8.95
  OLS                  7.94    7.91    8.66    8.66    9.29      8.54
  Scaling              7.99    8.10    8.59    9.09    9.43      8.71
  Averaging            8.04    8.21    8.79    9.25    9.44      8.82

Based on a rolling forecast origin with at least 12 observations in the training set.
Forecasting hierarchical time series Application to Australian tourism 42
Groups: Purpose, states, capital

MAPE                   Forecast horizon (h)
                          1       2       4       6       8   Average
Level 2: States
  Bottom-up           21.34   21.75   22.39   23.26   23.31     22.58
  OLS                 22.17   21.80   23.53   23.15   23.90     22.99
  Scaling             21.49   21.62   22.20   23.13   23.25     22.51
  Averaging           21.38   21.61   22.30   23.17   23.24     22.51
Bottom Level: Capital city versus other
  Bottom-up           31.97   31.65   32.19   33.70   33.47     32.62
  OLS                 32.31   30.92   32.41   33.35   34.13     32.55
  Scaling             32.12   31.36   32.18   33.36   33.43     32.52
  Averaging           31.92   31.39   32.04   33.51   33.39     32.49

Based on a rolling forecast origin with at least 12 observations in the training set.
Forecasting hierarchical time series Application to Australian tourism 43
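For reference, the rolling-origin protocol quoted under the tables can be set up as follows for a single series; a hedged sketch (not the authors' code), where rolling_mape, min_train and the use of ets() are illustrative choices:

library(forecast)
# MAPE at horizon h using a rolling forecast origin,
# with at least min_train observations in each training set
rolling_mape <- function(y, h, min_train = 12) {
  errs <- c()
  for (origin in min_train:(length(y) - h)) {
    train  <- ts(y[1:origin], start = start(y), frequency = frequency(y))
    fc     <- forecast(ets(train), h = h)$mean[h]
    actual <- y[origin + h]
    errs   <- c(errs, 100 * abs((actual - fc) / actual))
  }
  mean(errs)
}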
Outline
1 Hierarchical time series
2 Forecasting framework
3 Optimal forecasts
4 Approximately optimal forecasts
5 Application to Australian tourism
6 hts package for R
7 References
Forecasting hierarchical time series hts package for R 44
hts package for R
Forecasting hierarchical time series hts package for R 45
hts: Hierarchical and grouped time series
Methods for analysing and forecasting hierarchical and grouped
time series
Version: 3.01
Depends: forecast
Imports: SparseM
Published: 2013-05-07
Author: Rob J Hyndman, Roman A Ahmed, and Han Lin Shang
Maintainer: Rob J Hyndman <Rob.Hyndman at monash.edu>
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Example using R
library(hts)
# bts is a matrix containing the bottom level time series
# g describes the grouping/hierarchical structure
y <- hts(bts, g=c(1,1,2,2))
Total
A
AX AY
B
BX BY
Forecasting hierarchical time series hts package for R 46
Example using R
# Forecast 10-step-ahead using the optimal (OLS) combination method
# ETS used for each series by default
fc <- forecast(y, h=10)
Forecasting hierarchical time series hts package for R 47

Example using R
# Select your own methods
ally <- allts(y)
allf <- matrix(NA, nrow=10, ncol=ncol(ally))
for(i in 1:ncol(ally))
  allf[,i] <- mymethod(ally[,i], h=10)
allf <- ts(allf, start=2004)
# Reconcile forecasts so they add up
fc2 <- combinef(allf, Smatrix(y))
Forecasting hierarchical time series hts package for R 48
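Putting the pieces together, a minimal end-to-end sketch (not from the slides): the simulated bts data below is made up for illustration, and the hts(bts, g=...) call follows the version 3.01 interface shown above.

library(hts)
set.seed(1)
# Four simulated bottom-level series: AX, AY, BX, BY (annual, 2004-2013)
bts <- ts(matrix(rnorm(40, mean = 10), ncol = 4), start = 2004)
colnames(bts) <- c("AX", "AY", "BX", "BY")
y  <- hts(bts, g = c(1, 1, 2, 2))           # two groups of two series, as above
fc <- forecast(y, h = 10, method = "comb", fmethod = "ets")
allts(fc)                                   # reconciled forecasts at every level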
hts function
Usage
hts(y, g)
gts(y, g, hierarchical=FALSE)
Arguments
y Multivariate time series containing the bottom level series.
g Group matrix indicating the group structure, with one column for each series when completely disaggregated, and one row for each grouping of the time series.
hierarchical Indicates if the grouping matrix should be treated as hierarchical.
Details
hts is simply a wrapper for gts(y, g, TRUE). Both return an object of class gts.
Forecasting hierarchical time series hts package for R 49
forecast.gts function
Usage
forecast(object, h,
  method = c("comb", "bu", "mo", "tdgsf", "tdgsa", "tdfp", "all"),
  fmethod = c("ets", "rw", "arima"), level, positive = FALSE,
  xreg = NULL, newxreg = NULL, ...)
Arguments
object Hierarchical time series object of class gts.
h Forecast horizon.
method Method for distributing forecasts within the hierarchy.
fmethod Forecasting method to use for each series.
level Level used for the "middle-out" method (when method="mo").
positive If TRUE, forecasts are forced to be strictly positive.
xreg When fmethod = "arima", a vector or matrix of external regressors, which must have the same number of rows as the original univariate time series.
newxreg When fmethod = "arima", a vector or matrix of external regressors, which must have the same number of rows as the original univariate time series.
... Other arguments passed to ets or auto.arima.
Forecasting hierarchical time series hts package for R 50
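A hypothetical usage sketch of these arguments (y as created in the earlier example; the method and fmethod values are those listed above):

fc_bu <- forecast(y, h = 8, method = "bu", fmethod = "arima")           # bottom-up, ARIMA base forecasts
fc_mo <- forecast(y, h = 8, method = "mo", fmethod = "ets", level = 1)  # middle-out at level 1
fc_td <- forecast(y, h = 8, method = "tdfp", fmethod = "rw")            # top-down, random walk base forecasts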
Utility functions
allts(y) Returns all series in the hierarchy.
Smatrix(y) Returns the summing matrix.
combinef(f) Combines initial forecasts optimally.
Forecasting hierarchical time series hts package for R 51
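Continuing the earlier sketch, a hedged illustration of manual reconciliation with these utilities (the naive base forecasts are made up; the combinef(forecasts, Smatrix(y)) call mirrors the example slide above):

S    <- Smatrix(y)                  # summing matrix of the hierarchy
base <- allts(y)                    # all series, from the top level down to the bottom level
last <- tail(base, 1)               # last observation of every series
naivef <- ts(matrix(rep(last, each = 10), nrow = 10),
             start = end(base)[1] + 1)   # naive 10-step base forecasts
fc2  <- combinef(naivef, S)         # reconciled so that aggregates add up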
More information
Vignette on CRAN
Forecasting hierarchical time series hts package for R 52
Outline
1 Hierarchical time series
2 Forecasting framework
3 Optimal forecasts
4 Approximately optimal forecasts
5 Application to Australian tourism
6 hts package for R
7 References
Forecasting hierarchical time series References 53
References
RJ Hyndman, RA Ahmed, G Athanasopoulos, and
HL Shang (2011). “Optimal combination
forecasts for hierarchical time series”.
Computational Statistics and Data Analysis
55(9), 2579–2589
RJ Hyndman, RA Ahmed, and HL Shang (2013).
hts: Hierarchical time series.
cran.r-project.org/package=hts.
RJ Hyndman and G Athanasopoulos (2013).
Forecasting: principles and practice. OTexts.
OTexts.org/fpp/.
Forecasting hierarchical time series References 54
Papers and R code: robjhyndman.com
Email: Rob.Hyndman@monash.edu

Tendances

Mixed Integer Linear Programming Formulation for the Taxi Sharing Problem
Mixed Integer Linear Programming Formulation for the Taxi Sharing ProblemMixed Integer Linear Programming Formulation for the Taxi Sharing Problem
Mixed Integer Linear Programming Formulation for the Taxi Sharing Problemjfrchicanog
 
Real-time personalized recommendations using product embeddings
Real-time personalized recommendations using product embeddingsReal-time personalized recommendations using product embeddings
Real-time personalized recommendations using product embeddingsJakub Macina
 
Module 5 - Data Science Methodology.pdf
Module 5 - Data Science Methodology.pdfModule 5 - Data Science Methodology.pdf
Module 5 - Data Science Methodology.pdffathiah5
 
Hadoop Data Modeling
Hadoop Data ModelingHadoop Data Modeling
Hadoop Data ModelingAdam Doyle
 
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...Wil van der Aalst
 
Logistic Regression: Predicting The Chances Of Coronary Heart Disease
Logistic Regression: Predicting The Chances Of Coronary Heart DiseaseLogistic Regression: Predicting The Chances Of Coronary Heart Disease
Logistic Regression: Predicting The Chances Of Coronary Heart DiseaseMichael Lieberman
 
AVATA S&OP / IBP Express
AVATA S&OP / IBP ExpressAVATA S&OP / IBP Express
AVATA S&OP / IBP ExpressAVATA
 
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache SparkBest Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache SparkDatabricks
 
Splunk and Cisco UCS Breakout Session
Splunk and Cisco UCS Breakout SessionSplunk and Cisco UCS Breakout Session
Splunk and Cisco UCS Breakout SessionSplunk
 
Keeping Identity Graphs In Sync With Apache Spark
Keeping Identity Graphs In Sync With Apache SparkKeeping Identity Graphs In Sync With Apache Spark
Keeping Identity Graphs In Sync With Apache SparkDatabricks
 
Introduction To Analytics
Introduction To AnalyticsIntroduction To Analytics
Introduction To AnalyticsAlex Meadows
 
Capturing Network Traffic into Database
Capturing Network Traffic into Database Capturing Network Traffic into Database
Capturing Network Traffic into Database Tigran Tsaturyan
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware ProvisioningMongoDB
 
Big Data Predictive Analytics for Retail businesses
Big Data Predictive Analytics for Retail businessesBig Data Predictive Analytics for Retail businesses
Big Data Predictive Analytics for Retail businessesGopalakrishna Palem
 
BigQuery best practices and recommendations to reduce costs with BI Engine, S...
BigQuery best practices and recommendations to reduce costs with BI Engine, S...BigQuery best practices and recommendations to reduce costs with BI Engine, S...
BigQuery best practices and recommendations to reduce costs with BI Engine, S...Márton Kodok
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysisVishwas N
 

Tendances (20)

Mixed Integer Linear Programming Formulation for the Taxi Sharing Problem
Mixed Integer Linear Programming Formulation for the Taxi Sharing ProblemMixed Integer Linear Programming Formulation for the Taxi Sharing Problem
Mixed Integer Linear Programming Formulation for the Taxi Sharing Problem
 
Real-time personalized recommendations using product embeddings
Real-time personalized recommendations using product embeddingsReal-time personalized recommendations using product embeddings
Real-time personalized recommendations using product embeddings
 
Module 5 - Data Science Methodology.pdf
Module 5 - Data Science Methodology.pdfModule 5 - Data Science Methodology.pdf
Module 5 - Data Science Methodology.pdf
 
Hadoop Data Modeling
Hadoop Data ModelingHadoop Data Modeling
Hadoop Data Modeling
 
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
 
Missing data handling
Missing data handlingMissing data handling
Missing data handling
 
Logistic Regression: Predicting The Chances Of Coronary Heart Disease
Logistic Regression: Predicting The Chances Of Coronary Heart DiseaseLogistic Regression: Predicting The Chances Of Coronary Heart Disease
Logistic Regression: Predicting The Chances Of Coronary Heart Disease
 
AVATA S&OP / IBP Express
AVATA S&OP / IBP ExpressAVATA S&OP / IBP Express
AVATA S&OP / IBP Express
 
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache SparkBest Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache Spark
 
Splunk and Cisco UCS Breakout Session
Splunk and Cisco UCS Breakout SessionSplunk and Cisco UCS Breakout Session
Splunk and Cisco UCS Breakout Session
 
Machine Learning in R
Machine Learning in RMachine Learning in R
Machine Learning in R
 
Keeping Identity Graphs In Sync With Apache Spark
Keeping Identity Graphs In Sync With Apache SparkKeeping Identity Graphs In Sync With Apache Spark
Keeping Identity Graphs In Sync With Apache Spark
 
Introduction To Analytics
Introduction To AnalyticsIntroduction To Analytics
Introduction To Analytics
 
Capturing Network Traffic into Database
Capturing Network Traffic into Database Capturing Network Traffic into Database
Capturing Network Traffic into Database
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
 
Big Data Predictive Analytics for Retail businesses
Big Data Predictive Analytics for Retail businessesBig Data Predictive Analytics for Retail businesses
Big Data Predictive Analytics for Retail businesses
 
BigQuery best practices and recommendations to reduce costs with BI Engine, S...
BigQuery best practices and recommendations to reduce costs with BI Engine, S...BigQuery best practices and recommendations to reduce costs with BI Engine, S...
BigQuery best practices and recommendations to reduce costs with BI Engine, S...
 
Reliable and Scalable Data Ingestion at Airbnb
Reliable and Scalable Data Ingestion at AirbnbReliable and Scalable Data Ingestion at Airbnb
Reliable and Scalable Data Ingestion at Airbnb
 
Agreggates i
Agreggates iAgreggates i
Agreggates i
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysis
 

En vedette

Exploring the feature space of large collections of time series
Exploring the feature space of large collections of time seriesExploring the feature space of large collections of time series
Exploring the feature space of large collections of time seriesRob Hyndman
 
Automatic algorithms for time series forecasting
Automatic algorithms for time series forecastingAutomatic algorithms for time series forecasting
Automatic algorithms for time series forecastingRob Hyndman
 
Visualization and forecasting of big time series data
Visualization and forecasting of big time series dataVisualization and forecasting of big time series data
Visualization and forecasting of big time series dataRob Hyndman
 
Visualization of big time series data
Visualization of big time series dataVisualization of big time series data
Visualization of big time series dataRob Hyndman
 
Probabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demandProbabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demandRob Hyndman
 
Forecasting electricity demand distributions using a semiparametric additive ...
Forecasting electricity demand distributions using a semiparametric additive ...Forecasting electricity demand distributions using a semiparametric additive ...
Forecasting electricity demand distributions using a semiparametric additive ...Rob Hyndman
 
The Catchment Based Approach - the power of Collaborative Catchment Management
The Catchment Based Approach - the power of Collaborative Catchment ManagementThe Catchment Based Approach - the power of Collaborative Catchment Management
The Catchment Based Approach - the power of Collaborative Catchment ManagementCaBASupport
 
Demolition, Deconstruction & Dismantling
Demolition, Deconstruction & Dismantling Demolition, Deconstruction & Dismantling
Demolition, Deconstruction & Dismantling Emma Attwood
 
MEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demandMEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demandRob Hyndman
 
Automatic time series forecasting
Automatic time series forecastingAutomatic time series forecasting
Automatic time series forecastingRob Hyndman
 
Demographic forecasting
Demographic forecastingDemographic forecasting
Demographic forecastingRob Hyndman
 
Academia sinica jan-2015
Academia sinica jan-2015Academia sinica jan-2015
Academia sinica jan-2015Rob Hyndman
 
Top down construction by Sai Towers P Ltd.
Top down construction by Sai Towers P Ltd.Top down construction by Sai Towers P Ltd.
Top down construction by Sai Towers P Ltd.Sai Towers P Ltd.
 
Huohua: A Distributed Time Series Analysis Framework For Spark
Huohua: A Distributed Time Series Analysis Framework For SparkHuohua: A Distributed Time Series Analysis Framework For Spark
Huohua: A Distributed Time Series Analysis Framework For SparkJen Aman
 
Advances in automatic time series forecasting
Advances in automatic time series forecastingAdvances in automatic time series forecasting
Advances in automatic time series forecastingRob Hyndman
 
Forecasting without forecasters
Forecasting without forecastersForecasting without forecasters
Forecasting without forecastersRob Hyndman
 

En vedette (20)

Exploring the feature space of large collections of time series
Exploring the feature space of large collections of time seriesExploring the feature space of large collections of time series
Exploring the feature space of large collections of time series
 
Automatic algorithms for time series forecasting
Automatic algorithms for time series forecastingAutomatic algorithms for time series forecasting
Automatic algorithms for time series forecasting
 
Visualization and forecasting of big time series data
Visualization and forecasting of big time series dataVisualization and forecasting of big time series data
Visualization and forecasting of big time series data
 
Visualization of big time series data
Visualization of big time series dataVisualization of big time series data
Visualization of big time series data
 
Probabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demandProbabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demand
 
Forecasting electricity demand distributions using a semiparametric additive ...
Forecasting electricity demand distributions using a semiparametric additive ...Forecasting electricity demand distributions using a semiparametric additive ...
Forecasting electricity demand distributions using a semiparametric additive ...
 
The Catchment Based Approach - the power of Collaborative Catchment Management
The Catchment Based Approach - the power of Collaborative Catchment ManagementThe Catchment Based Approach - the power of Collaborative Catchment Management
The Catchment Based Approach - the power of Collaborative Catchment Management
 
Demolition, Deconstruction & Dismantling
Demolition, Deconstruction & Dismantling Demolition, Deconstruction & Dismantling
Demolition, Deconstruction & Dismantling
 
MEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demandMEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demand
 
Automatic time series forecasting
Automatic time series forecastingAutomatic time series forecasting
Automatic time series forecasting
 
Demographic forecasting
Demographic forecastingDemographic forecasting
Demographic forecasting
 
Academia sinica jan-2015
Academia sinica jan-2015Academia sinica jan-2015
Academia sinica jan-2015
 
Pilework
PileworkPilework
Pilework
 
Top down construction by Sai Towers P Ltd.
Top down construction by Sai Towers P Ltd.Top down construction by Sai Towers P Ltd.
Top down construction by Sai Towers P Ltd.
 
Huohua: A Distributed Time Series Analysis Framework For Spark
Huohua: A Distributed Time Series Analysis Framework For SparkHuohua: A Distributed Time Series Analysis Framework For Spark
Huohua: A Distributed Time Series Analysis Framework For Spark
 
Deep excavations
Deep excavationsDeep excavations
Deep excavations
 
Top down construction presentation
Top down construction presentationTop down construction presentation
Top down construction presentation
 
Advances in automatic time series forecasting
Advances in automatic time series forecastingAdvances in automatic time series forecasting
Advances in automatic time series forecasting
 
Forecasting without forecasters
Forecasting without forecastersForecasting without forecasters
Forecasting without forecasters
 
Top Down Construction Presentation.
Top Down Construction Presentation.Top Down Construction Presentation.
Top Down Construction Presentation.
 

Similaire à Forecasting Hierarchical Time Series

Embase for pharmacovigilance: Search and validation March 22 2017
Embase for pharmacovigilance: Search and validation March 22 2017Embase for pharmacovigilance: Search and validation March 22 2017
Embase for pharmacovigilance: Search and validation March 22 2017Ann-Marie Roche
 
Demand Forecasting
Demand ForecastingDemand Forecasting
Demand ForecastingAnupam Basu
 
Time Series Analysis: Theory and Practice
Time Series Analysis: Theory and PracticeTime Series Analysis: Theory and Practice
Time Series Analysis: Theory and PracticeTetiana Ivanova
 
QM-032-SQC tool application
QM-032-SQC tool applicationQM-032-SQC tool application
QM-032-SQC tool applicationhandbook
 
Patients Condition Classification Using Drug Reviews.pptx
Patients Condition Classification Using Drug Reviews.pptxPatients Condition Classification Using Drug Reviews.pptx
Patients Condition Classification Using Drug Reviews.pptxAnupama Kate
 
Doing a systematic review: top tips for progressing your review
Doing a systematic review: top tips for progressing your reviewDoing a systematic review: top tips for progressing your review
Doing a systematic review: top tips for progressing your reviewUniversity of Liverpool Library
 
Forecasting Techniques - Data Science SG
Forecasting Techniques - Data Science SG Forecasting Techniques - Data Science SG
Forecasting Techniques - Data Science SG Kai Xin Thia
 
systematic reviews and what the library can do to help
systematic reviews and what the library can do to helpsystematic reviews and what the library can do to help
systematic reviews and what the library can do to helpIsla Kuhn
 
Aggregate planning using transportation method a case study in cable industry
Aggregate planning using transportation method a case study in cable industryAggregate planning using transportation method a case study in cable industry
Aggregate planning using transportation method a case study in cable industryijmvsc
 

Similaire à Forecasting Hierarchical Time Series (11)

Embase for pharmacovigilance: Search and validation March 22 2017
Embase for pharmacovigilance: Search and validation March 22 2017Embase for pharmacovigilance: Search and validation March 22 2017
Embase for pharmacovigilance: Search and validation March 22 2017
 
Demand Forecasting
Demand ForecastingDemand Forecasting
Demand Forecasting
 
Time Series Analysis: Theory and Practice
Time Series Analysis: Theory and PracticeTime Series Analysis: Theory and Practice
Time Series Analysis: Theory and Practice
 
QM-032-SQC tool application
QM-032-SQC tool applicationQM-032-SQC tool application
QM-032-SQC tool application
 
Patients Condition Classification Using Drug Reviews.pptx
Patients Condition Classification Using Drug Reviews.pptxPatients Condition Classification Using Drug Reviews.pptx
Patients Condition Classification Using Drug Reviews.pptx
 
Doing a systematic review: top tips for progressing your review
Doing a systematic review: top tips for progressing your reviewDoing a systematic review: top tips for progressing your review
Doing a systematic review: top tips for progressing your review
 
200807 Search
200807 Search200807 Search
200807 Search
 
Forecasting Techniques - Data Science SG
Forecasting Techniques - Data Science SG Forecasting Techniques - Data Science SG
Forecasting Techniques - Data Science SG
 
PM 1 to 4.pdf
PM 1 to 4.pdfPM 1 to 4.pdf
PM 1 to 4.pdf
 
systematic reviews and what the library can do to help
systematic reviews and what the library can do to helpsystematic reviews and what the library can do to help
systematic reviews and what the library can do to help
 
Aggregate planning using transportation method a case study in cable industry
Aggregate planning using transportation method a case study in cable industryAggregate planning using transportation method a case study in cable industry
Aggregate planning using transportation method a case study in cable industry
 

Plus de Rob Hyndman

Exploring the boundaries of predictability
Exploring the boundaries of predictabilityExploring the boundaries of predictability
Exploring the boundaries of predictabilityRob Hyndman
 
Automatic time series forecasting
Automatic time series forecastingAutomatic time series forecasting
Automatic time series forecastingRob Hyndman
 
Coherent mortality forecasting using functional time series models
Coherent mortality forecasting using functional time series modelsCoherent mortality forecasting using functional time series models
Coherent mortality forecasting using functional time series modelsRob Hyndman
 
Forecasting using R
Forecasting using RForecasting using R
Forecasting using RRob Hyndman
 
SimpleR: tips, tricks & tools
SimpleR: tips, tricks & toolsSimpleR: tips, tricks & tools
SimpleR: tips, tricks & toolsRob Hyndman
 
FPP 1. Getting started
FPP 1. Getting startedFPP 1. Getting started
FPP 1. Getting startedRob Hyndman
 

Plus de Rob Hyndman (7)

Exploring the boundaries of predictability
Exploring the boundaries of predictabilityExploring the boundaries of predictability
Exploring the boundaries of predictability
 
Automatic time series forecasting
Automatic time series forecastingAutomatic time series forecasting
Automatic time series forecasting
 
Coherent mortality forecasting using functional time series models
Coherent mortality forecasting using functional time series modelsCoherent mortality forecasting using functional time series models
Coherent mortality forecasting using functional time series models
 
Forecasting using R
Forecasting using RForecasting using R
Forecasting using R
 
SimpleR: tips, tricks & tools
SimpleR: tips, tricks & toolsSimpleR: tips, tricks & tools
SimpleR: tips, tricks & tools
 
Ysc2013
Ysc2013Ysc2013
Ysc2013
 
FPP 1. Getting started
FPP 1. Getting startedFPP 1. Getting started
FPP 1. Getting started
 

Dernier

PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docxPoojaSen20
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Shubhangi Sonawane
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxNikitaBankoti2
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxnegromaestrong
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIShubhangi Sonawane
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 

Dernier (20)

PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Asian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptxAsian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptx
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 

Forecasting Hierarchical Time Series

  • 1. Total A AA AB AC B BA BB BC C CA CB CC 1 Rob J Hyndman Forecasting hierarchical time series
  • 2. Outline 1 Hierarchical time series 2 Forecasting framework 3 Optimal forecasts 4 Approximately optimal forecasts 5 Application to Australian tourism 6 hts package for R 7 References Forecasting hierarchical time series Hierarchical time series 2
  • 3. Introduction Total A AA AB AC B BA BB BC C CA CB CC Examples Manufacturing product hierarchies Net labour turnover Pharmaceutical sales Tourism demand by region and purpose Forecasting hierarchical time series Hierarchical time series 3
  • 4. Introduction Total A AA AB AC B BA BB BC C CA CB CC Examples Manufacturing product hierarchies Net labour turnover Pharmaceutical sales Tourism demand by region and purpose Forecasting hierarchical time series Hierarchical time series 3
  • 5. Introduction Total A AA AB AC B BA BB BC C CA CB CC Examples Manufacturing product hierarchies Net labour turnover Pharmaceutical sales Tourism demand by region and purpose Forecasting hierarchical time series Hierarchical time series 3
  • 6. Introduction Total A AA AB AC B BA BB BC C CA CB CC Examples Manufacturing product hierarchies Net labour turnover Pharmaceutical sales Tourism demand by region and purpose Forecasting hierarchical time series Hierarchical time series 3
  • 7. Introduction Total A AA AB AC B BA BB BC C CA CB CC Examples Manufacturing product hierarchies Net labour turnover Pharmaceutical sales Tourism demand by region and purpose Forecasting hierarchical time series Hierarchical time series 3
  • 8. Forecasting the PBS Forecasting hierarchical time series Hierarchical time series 4
  • 9. ATC drug classification A Alimentary tract and metabolism B Blood and blood forming organs C Cardiovascular system D Dermatologicals G Genito-urinary system and sex hormones H Systemic hormonal preparations, excluding sex hor- mones and insulins J Anti-infectives for systemic use L Antineoplastic and immunomodulating agents M Musculo-skeletal system N Nervous system P Antiparasitic products, insecticides and repellents R Respiratory system S Sensory organs V Various Forecasting hierarchical time series Hierarchical time series 5
  • 10. ATC drug classification A Alimentary tract and metabolism14 classes A10 Drugs used in diabetes84 classes A10B Blood glucose lowering drugs A10BA Biguanides A10BA02 Metformin Forecasting hierarchical time series Hierarchical time series 6
  • 11. Australian tourism Forecasting hierarchical time series Hierarchical time series 7
  • 12. Australian tourism Forecasting hierarchical time series Hierarchical time series 7 Also split by purpose of travel: Holiday Visits to friends and relatives Business Other
  • 13. Hierarchical/grouped time series A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure. Example: Pharmaceutical products are organized in a hierarchy under the Anatomical Therapeutic Chemical (ATC) Classification System. A grouped time series is a collection of time series that are aggregated in a number of non-hierarchical ways. Example: Australian tourism demand is grouped by region and purpose of travel. Forecasting hierarchical time series Hierarchical time series 8
  • 14. Hierarchical/grouped time series A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure. Example: Pharmaceutical products are organized in a hierarchy under the Anatomical Therapeutic Chemical (ATC) Classification System. A grouped time series is a collection of time series that are aggregated in a number of non-hierarchical ways. Example: Australian tourism demand is grouped by region and purpose of travel. Forecasting hierarchical time series Hierarchical time series 8
  • 15. Hierarchical/grouped time series A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure. Example: Pharmaceutical products are organized in a hierarchy under the Anatomical Therapeutic Chemical (ATC) Classification System. A grouped time series is a collection of time series that are aggregated in a number of non-hierarchical ways. Example: Australian tourism demand is grouped by region and purpose of travel. Forecasting hierarchical time series Hierarchical time series 8
  • 16. Hierarchical/grouped time series A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure. Example: Pharmaceutical products are organized in a hierarchy under the Anatomical Therapeutic Chemical (ATC) Classification System. A grouped time series is a collection of time series that are aggregated in a number of non-hierarchical ways. Example: Australian tourism demand is grouped by region and purpose of travel. Forecasting hierarchical time series Hierarchical time series 8
  • 17. Hierarchical/grouped time series Forecasts should be “aggregate consistent”, unbiased, minimum variance. Existing methods: ¢ Bottom-up ¢ Top-down ¢ Middle-out How to compute forecast intervals? Most research is concerned about relative performance of existing methods. There is no research on how to deal with forecasting grouped time series. Forecasting hierarchical time series Hierarchical time series 9
  • 18. Hierarchical/grouped time series Forecasts should be “aggregate consistent”, unbiased, minimum variance. Existing methods: ¢ Bottom-up ¢ Top-down ¢ Middle-out How to compute forecast intervals? Most research is concerned about relative performance of existing methods. There is no research on how to deal with forecasting grouped time series. Forecasting hierarchical time series Hierarchical time series 9
  • 19. Hierarchical/grouped time series Forecasts should be “aggregate consistent”, unbiased, minimum variance. Existing methods: ¢ Bottom-up ¢ Top-down ¢ Middle-out How to compute forecast intervals? Most research is concerned about relative performance of existing methods. There is no research on how to deal with forecasting grouped time series. Forecasting hierarchical time series Hierarchical time series 9
  • 20. Hierarchical/grouped time series Forecasts should be “aggregate consistent”, unbiased, minimum variance. Existing methods: ¢ Bottom-up ¢ Top-down ¢ Middle-out How to compute forecast intervals? Most research is concerned about relative performance of existing methods. There is no research on how to deal with forecasting grouped time series. Forecasting hierarchical time series Hierarchical time series 9
  • 21. Hierarchical/grouped time series Forecasts should be “aggregate consistent”, unbiased, minimum variance. Existing methods: ¢ Bottom-up ¢ Top-down ¢ Middle-out How to compute forecast intervals? Most research is concerned about relative performance of existing methods. There is no research on how to deal with forecasting grouped time series. Forecasting hierarchical time series Hierarchical time series 9
  • 22. Hierarchical/grouped time series Forecasts should be “aggregate consistent”, unbiased, minimum variance. Existing methods: ¢ Bottom-up ¢ Top-down ¢ Middle-out How to compute forecast intervals? Most research is concerned about relative performance of existing methods. There is no research on how to deal with forecasting grouped time series. Forecasting hierarchical time series Hierarchical time series 9
  • 23. Hierarchical/grouped time series Forecasts should be “aggregate consistent”, unbiased, minimum variance. Existing methods: ¢ Bottom-up ¢ Top-down ¢ Middle-out How to compute forecast intervals? Most research is concerned about relative performance of existing methods. There is no research on how to deal with forecasting grouped time series. Forecasting hierarchical time series Hierarchical time series 9
  • 24. Hierarchical/grouped time series Forecasts should be “aggregate consistent”, unbiased, minimum variance. Existing methods: ¢ Bottom-up ¢ Top-down ¢ Middle-out How to compute forecast intervals? Most research is concerned about relative performance of existing methods. There is no research on how to deal with forecasting grouped time series. Forecasting hierarchical time series Hierarchical time series 9
  • 25. Top-down method Forecasting hierarchical time series Hierarchical time series 10 Advantages Works well in presence of low counts. Single forecasting model easy to build Provides reliable forecasts for aggregate levels. Disadvantages Loss of information, especially individual series dynamics. Distribution of forecasts to lower levels can be difficult No prediction intervals
  • 26. Top-down method Forecasting hierarchical time series Hierarchical time series 10 Advantages Works well in presence of low counts. Single forecasting model easy to build Provides reliable forecasts for aggregate levels. Disadvantages Loss of information, especially individual series dynamics. Distribution of forecasts to lower levels can be difficult No prediction intervals
  • 27. Top-down method Forecasting hierarchical time series Hierarchical time series 10 Advantages Works well in presence of low counts. Single forecasting model easy to build Provides reliable forecasts for aggregate levels. Disadvantages Loss of information, especially individual series dynamics. Distribution of forecasts to lower levels can be difficult No prediction intervals
  • 28. Top-down method Forecasting hierarchical time series Hierarchical time series 10 Advantages Works well in presence of low counts. Single forecasting model easy to build Provides reliable forecasts for aggregate levels. Disadvantages Loss of information, especially individual series dynamics. Distribution of forecasts to lower levels can be difficult No prediction intervals
  • 29. Top-down method Forecasting hierarchical time series Hierarchical time series 10 Advantages Works well in presence of low counts. Single forecasting model easy to build Provides reliable forecasts for aggregate levels. Disadvantages Loss of information, especially individual series dynamics. Distribution of forecasts to lower levels can be difficult No prediction intervals
  • 30. Top-down method Forecasting hierarchical time series Hierarchical time series 10 Advantages Works well in presence of low counts. Single forecasting model easy to build Provides reliable forecasts for aggregate levels. Disadvantages Loss of information, especially individual series dynamics. Distribution of forecasts to lower levels can be difficult No prediction intervals
  • 31. Bottom-up method Forecasting hierarchical time series Hierarchical time series 11 Advantages No loss of information. Better captures dynamics of individual series. Disadvantages Large number of series to be forecast. Constructing forecasting models is harder because of noisy data at bottom level. No prediction intervals
  • 32. Bottom-up method Forecasting hierarchical time series Hierarchical time series 11 Advantages No loss of information. Better captures dynamics of individual series. Disadvantages Large number of series to be forecast. Constructing forecasting models is harder because of noisy data at bottom level. No prediction intervals
  • 33. Bottom-up method Forecasting hierarchical time series Hierarchical time series 11 Advantages No loss of information. Better captures dynamics of individual series. Disadvantages Large number of series to be forecast. Constructing forecasting models is harder because of noisy data at bottom level. No prediction intervals
  • 34. Bottom-up method Forecasting hierarchical time series Hierarchical time series 11 Advantages No loss of information. Better captures dynamics of individual series. Disadvantages Large number of series to be forecast. Constructing forecasting models is harder because of noisy data at bottom level. No prediction intervals
  • 35. Bottom-up method Forecasting hierarchical time series Hierarchical time series 11 Advantages No loss of information. Better captures dynamics of individual series. Disadvantages Large number of series to be forecast. Constructing forecasting models is harder because of noisy data at bottom level. No prediction intervals
  • 36. A new approach We propose a new statistical framework for forecasting hierarchical time series which: 1 provides point forecasts that are consistent across the hierarchy; 2 allows for correlations and interaction between series at each level; 3 provides estimates of forecast uncertainty which are consistent across the hierarchy; 4 allows for ad hoc adjustments and inclusion of covariates at any level. Forecasting hierarchical time series Hierarchical time series 12
  • 37. A new approach We propose a new statistical framework for forecasting hierarchical time series which: 1 provides point forecasts that are consistent across the hierarchy; 2 allows for correlations and interaction between series at each level; 3 provides estimates of forecast uncertainty which are consistent across the hierarchy; 4 allows for ad hoc adjustments and inclusion of covariates at any level. Forecasting hierarchical time series Hierarchical time series 12
  • 38. A new approach We propose a new statistical framework for forecasting hierarchical time series which: 1 provides point forecasts that are consistent across the hierarchy; 2 allows for correlations and interaction between series at each level; 3 provides estimates of forecast uncertainty which are consistent across the hierarchy; 4 allows for ad hoc adjustments and inclusion of covariates at any level. Forecasting hierarchical time series Hierarchical time series 12
  • 39. A new approach We propose a new statistical framework for forecasting hierarchical time series which: 1 provides point forecasts that are consistent across the hierarchy; 2 allows for correlations and interaction between series at each level; 3 provides estimates of forecast uncertainty which are consistent across the hierarchy; 4 allows for ad hoc adjustments and inclusion of covariates at any level. Forecasting hierarchical time series Hierarchical time series 12
  • 40. Hierarchical data Total A B C Forecasting hierarchical time series Hierarchical time series 13 Yt : observed aggregate of all series at time t. YX,t : observation on series X at time t. Bt : vector of all series at bottom level in time t.
  • 41. Hierarchical data Total A B C Forecasting hierarchical time series Hierarchical time series 13 Yt : observed aggregate of all series at time t. YX,t : observation on series X at time t. Bt : vector of all series at bottom level in time t.
  • 42. Hierarchical data Total A B C Yt = [Yt, YA,t, YB,t, YC,t] =     1 1 1 1 0 0 0 1 0 0 0 1       YA,t YB,t YC,t   Forecasting hierarchical time series Hierarchical time series 13 Yt : observed aggregate of all series at time t. YX,t : observation on series X at time t. Bt : vector of all series at bottom level in time t.
  • 43. Hierarchical data Total A B C Yt = [Yt, YA,t, YB,t, YC,t] =     1 1 1 1 0 0 0 1 0 0 0 1     S   YA,t YB,t YC,t   Forecasting hierarchical time series Hierarchical time series 13 Yt : observed aggregate of all series at time t. YX,t : observation on series X at time t. Bt : vector of all series at bottom level in time t.
  • 44. Hierarchical data Total A B C Yt = [Yt, YA,t, YB,t, YC,t] =     1 1 1 1 0 0 0 1 0 0 0 1     S   YA,t YB,t YC,t   Bt Forecasting hierarchical time series Hierarchical time series 13 Yt : observed aggregate of all series at time t. YX,t : observation on series X at time t. Bt : vector of all series at bottom level in time t.
  • 45. Hierarchical data Total A B C Yt = [Yt, YA,t, YB,t, YC,t] =     1 1 1 1 0 0 0 1 0 0 0 1     S   YA,t YB,t YC,t   Bt Yt = SBt Forecasting hierarchical time series Hierarchical time series 13 Yt : observed aggregate of all series at time t. YX,t : observation on series X at time t. Bt : vector of all series at bottom level in time t.
  • 48. Hierarchical data
    Two-level hierarchy: Total; A, B, C; and bottom-level series AX, AY, AZ, BX, BY, BZ, CX, CY, CZ.
    Y_t = [Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t}, Y_{AX,t}, Y_{AY,t}, Y_{AZ,t}, Y_{BX,t}, Y_{BY,t}, Y_{BZ,t}, Y_{CX,t}, Y_{CY,t}, Y_{CZ,t}]'
    B_t = [Y_{AX,t}, Y_{AY,t}, Y_{AZ,t}, Y_{BX,t}, Y_{BY,t}, Y_{BZ,t}, Y_{CX,t}, Y_{CY,t}, Y_{CZ,t}]'
    Y_t = S B_t, with summing matrix
    S = \begin{bmatrix}
      1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
      1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
      1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
      0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
      0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
      0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
    \end{bmatrix}
  • 51. Grouped data
    Bottom-level series AX, AY, BX, BY can be aggregated in two ways:
    by first label (Total; A = AX + AY; B = BX + BY) and by second label
    (Total; X = AX + BX; Y = AY + BY).
    Y_t = [Y_t, Y_{A,t}, Y_{B,t}, Y_{X,t}, Y_{Y,t}, Y_{AX,t}, Y_{AY,t}, Y_{BX,t}, Y_{BY,t}]'
    B_t = [Y_{AX,t}, Y_{AY,t}, Y_{BX,t}, Y_{BY,t}]'
    Y_t = S B_t, with summing matrix
    S = \begin{bmatrix}
      1 & 1 & 1 & 1 \\
      1 & 1 & 0 & 0 \\
      0 & 0 & 1 & 1 \\
      1 & 0 & 1 & 0 \\
      0 & 1 & 0 & 1 \\
      1 & 0 & 0 & 0 \\
      0 & 1 & 0 & 0 \\
      0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 1
    \end{bmatrix}
  • 52. Outline
    1 Hierarchical time series
    2 Forecasting framework
    3 Optimal forecasts
    4 Approximately optimal forecasts
    5 Application to Australian tourism
    6 hts package for R
    7 References
  • 53. Forecasting notation
    Let \hat{Y}_n(h) be the vector of initial h-step forecasts, made at time n,
    stacked in the same order as Y_t. (They may not add up.)
    Hierarchical forecasting methods are of the form
        \tilde{Y}_n(h) = S P \hat{Y}_n(h)
    for some matrix P.
    P extracts and combines the base forecasts \hat{Y}_n(h) to get bottom-level forecasts.
    S adds them up.
    Revised, reconciled forecasts: \tilde{Y}_n(h).
  • 59. Bottom-up forecasts
        \tilde{Y}_n(h) = S P \hat{Y}_n(h)
    Bottom-up forecasts are obtained using
        P = [0 | I],
    where 0 is a null matrix and I is an identity matrix.
    The P matrix extracts only the bottom-level forecasts from \hat{Y}_n(h);
    S adds them up to give the bottom-up forecasts.
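    To make the P notation concrete, here is a small base-R sketch (illustrative values only)
    of bottom-up reconciliation for the same Total/A/B/C hierarchy, with base forecasts that
    deliberately do not add up.

        S    <- rbind(c(1, 1, 1), diag(3))                     # summing matrix
        P_bu <- cbind(matrix(0, nrow = 3, ncol = 1), diag(3))  # P = [0 | I]

        Y_hat   <- c(Total = 55, A = 12, B = 18, C = 28)  # base forecasts (do not add up)
        Y_tilde <- S %*% P_bu %*% Y_hat                   # bottom-up reconciled forecasts
        Y_tilde                                           # Total becomes 12 + 18 + 28 = 58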
  • 62. Top-down forecasts
        \tilde{Y}_n(h) = S P \hat{Y}_n(h)
    Top-down forecasts are obtained using
        P = [p | 0],
    where p = [p_1, p_2, \dots, p_{m_K}] is a vector of proportions that sum to one.
    P distributes forecasts of the aggregate to the lowest level series.
    Different methods of top-down forecasting lead to different proportionality vectors p.
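    A corresponding sketch for top-down reconciliation; the proportions p are chosen
    arbitrarily here, since different top-down methods set them in different ways.

        S    <- rbind(c(1, 1, 1), diag(3))
        p    <- c(0.2, 0.3, 0.5)                          # proportions summing to one
        P_td <- cbind(p, matrix(0, nrow = 3, ncol = 3))   # P = [p | 0]

        Y_hat   <- c(Total = 55, A = 12, B = 18, C = 28)
        Y_tilde <- S %*% P_td %*% Y_hat   # Total forecast split 20/30/50 across A, B, C
        Y_tilde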
  • 65. General properties: bias
        \tilde{Y}_n(h) = S P \hat{Y}_n(h)
    Assume: base forecasts \hat{Y}_n(h) are unbiased:
        E[\hat{Y}_n(h) | Y_1, \dots, Y_n] = E[Y_{n+h} | Y_1, \dots, Y_n]
    Let \hat{B}_n(h) be bottom level base forecasts with \beta_n(h) = E[\hat{B}_n(h) | Y_1, \dots, Y_n].
    Then E[\hat{Y}_n(h)] = S \beta_n(h).
    We want the revised forecasts to be unbiased:
        E[\tilde{Y}_n(h)] = S P S \beta_n(h) = S \beta_n(h).
    Result will hold provided S P S = S.
    True for bottom-up, but not for any top-down method or middle-out method.
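    The condition S P S = S is easy to check numerically for the two P matrices sketched
    earlier; this base-R check (illustrative only) shows it holds for bottom-up but not for
    top-down.

        S    <- rbind(c(1, 1, 1), diag(3))
        P_bu <- cbind(matrix(0, nrow = 3, ncol = 1), diag(3))           # bottom-up
        P_td <- cbind(c(0.2, 0.3, 0.5), matrix(0, nrow = 3, ncol = 3))  # top-down

        isTRUE(all.equal(S %*% P_bu %*% S, S))   # TRUE: bottom-up preserves unbiasedness
        isTRUE(all.equal(S %*% P_td %*% S, S))   # FALSE: top-down does not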
  • 73. General properties: variance
        \tilde{Y}_n(h) = S P \hat{Y}_n(h)
    Let the variance of the base forecasts \hat{Y}_n(h) be given by
        \Sigma_h = V[\hat{Y}_n(h) | Y_1, \dots, Y_n].
    Then the variance of the revised forecasts is given by
        V[\tilde{Y}_n(h) | Y_1, \dots, Y_n] = S P \Sigma_h P' S'.
    This is a general result for all existing methods.
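    The variance identity can be evaluated directly once P and \Sigma_h are given; the sketch
    below uses an arbitrary diagonal \Sigma_h purely for illustration.

        S       <- rbind(c(1, 1, 1), diag(3))
        P       <- cbind(matrix(0, nrow = 3, ncol = 1), diag(3))   # e.g. bottom-up
        Sigma_h <- diag(c(4, 1, 1, 2))                             # made-up base forecast covariance

        V_revised <- S %*% P %*% Sigma_h %*% t(P) %*% t(S)         # S P Sigma_h P' S'
        V_revised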
  • 76. Outline
    1 Hierarchical time series
    2 Forecasting framework
    3 Optimal forecasts
    4 Approximately optimal forecasts
    5 Application to Australian tourism
    6 hts package for R
    7 References
  • 77. Forecasts
    Key idea: forecast reconciliation
      Ignore structural constraints and forecast every series of interest independently.
      Adjust forecasts to impose constraints.
    Let \hat{Y}_n(h) be the vector of initial h-step forecasts, made at time n,
    stacked in the same order as Y_t.
    Y_t = S B_t, so
        \hat{Y}_n(h) = S \beta_n(h) + \varepsilon_h,
    where
      \beta_n(h) = E[B_{n+h} | Y_1, \dots, Y_n];
      \varepsilon_h has zero mean and covariance \Sigma_h.
    Estimate \beta_n(h) using GLS?
  • 92. Optimal combination forecasts
        \tilde{Y}_n(h) = S \hat{\beta}_n(h) = S (S' \Sigma_h^\dagger S)^{-1} S' \Sigma_h^\dagger \hat{Y}_n(h)
        (revised forecasts on the left; initial forecasts \hat{Y}_n(h) on the right)
    \Sigma_h^\dagger is a generalized inverse of \Sigma_h.
    Optimal P = (S' \Sigma_h^\dagger S)^{-1} S' \Sigma_h^\dagger.
    Revised forecasts are unbiased: S P S = S.
    Revised forecasts have minimum variance:
        V[\tilde{Y}_n(h) | Y_1, \dots, Y_n] = S P \Sigma_h P' S' = S (S' \Sigma_h^\dagger S)^{-1} S'.
    Problem: \Sigma_h is hard to estimate.
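    A minimal sketch of the GLS formula, using MASS::ginv for the generalized inverse and an
    arbitrary illustrative \Sigma_h (in practice \Sigma_h is unknown, which is exactly the
    problem noted above).

        library(MASS)   # for ginv()

        S         <- rbind(c(1, 1, 1), diag(3))
        Sigma_h   <- diag(c(4, 1, 1, 2))          # illustrative covariance of base forecast errors
        Sigma_inv <- ginv(Sigma_h)                # generalized inverse

        P_gls   <- solve(t(S) %*% Sigma_inv %*% S) %*% t(S) %*% Sigma_inv
        Y_hat   <- c(55, 12, 18, 28)              # base forecasts that do not add up
        Y_tilde <- S %*% P_gls %*% Y_hat          # revised forecasts, consistent with the hierarchy
        Y_tilde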
  • 93. Outline
    1 Hierarchical time series
    2 Forecasting framework
    3 Optimal forecasts
    4 Approximately optimal forecasts
    5 Application to Australian tourism
    6 hts package for R
    7 References
  • 94. Optimal combination forecasts
        \tilde{Y}_n(h) = S (S' \Sigma_h^\dagger S)^{-1} S' \Sigma_h^\dagger \hat{Y}_n(h)
        (revised forecasts from base forecasts)
    Solution 1: OLS
      Assume \varepsilon_h \approx S \varepsilon_{B,h}, where \varepsilon_{B,h} is the forecast error at the bottom level.
      Then \Sigma_h \approx S \Omega_h S', where \Omega_h = V(\varepsilon_{B,h}).
      If the Moore-Penrose generalized inverse is used, then
          (S' \Sigma_h^\dagger S)^{-1} S' \Sigma_h^\dagger = (S'S)^{-1} S', and so
          \tilde{Y}_n(h) = S (S'S)^{-1} S' \hat{Y}_n(h).
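    The OLS special case can be computed directly; a base-R sketch with illustrative base
    forecasts:

        S     <- rbind(c(1, 1, 1), diag(3))
        Y_hat <- c(Total = 55, A = 12, B = 18, C = 28)   # base forecasts that do not add up

        W       <- S %*% solve(t(S) %*% S) %*% t(S)      # reconciliation weights S (S'S)^{-1} S'
        Y_tilde <- W %*% Y_hat
        Y_tilde                                          # Y_tilde[1] equals sum(Y_tilde[2:4])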
  • 100. Optimal combination forecasts
        \tilde{Y}_n(h) = S (S'S)^{-1} S' \hat{Y}_n(h)
    GLS = OLS.
    Optimal weighted average of initial forecasts.
    Optimal reconciliation weights are S (S'S)^{-1} S'.
    Weights are independent of the data and of the covariance structure of the hierarchy!
  • 105. Optimal combination forecasts
        \tilde{Y}_n(h) = S (S'S)^{-1} S' \hat{Y}_n(h)
    Weights for the one-level hierarchy (Total; A, B, C):
        S (S'S)^{-1} S' =
        \begin{bmatrix}
          0.75 &  0.25 &  0.25 &  0.25 \\
          0.25 &  0.75 & -0.25 & -0.25 \\
          0.25 & -0.25 &  0.75 & -0.25 \\
          0.25 & -0.25 & -0.25 &  0.75
        \end{bmatrix}
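    The 4x4 weight matrix shown above can be reproduced directly from S:

        S <- rbind(c(1, 1, 1), diag(3))
        round(S %*% solve(t(S) %*% S) %*% t(S), 2)   # gives the 0.75 / 0.25 / -0.25 pattern above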
  • 106. Optimal combination forecasts
    Weights for the two-level hierarchy (Total; A, B, C; AA, AB, AC, BA, BB, BC, CA, CB, CC):
        S (S'S)^{-1} S' =
        \begin{bmatrix}
          0.69 & 0.23 & 0.23 & 0.23 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 \\
          0.23 & 0.58 & -0.17 & -0.17 & 0.19 & 0.19 & 0.19 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 \\
          0.23 & -0.17 & 0.58 & -0.17 & -0.06 & -0.06 & -0.06 & 0.19 & 0.19 & 0.19 & -0.06 & -0.06 & -0.06 \\
          0.23 & -0.17 & -0.17 & 0.58 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & 0.19 & 0.19 & 0.19 \\
          0.08 & 0.19 & -0.06 & -0.06 & 0.73 & -0.27 & -0.27 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
          0.08 & 0.19 & -0.06 & -0.06 & -0.27 & 0.73 & -0.27 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
          0.08 & 0.19 & -0.06 & -0.06 & -0.27 & -0.27 & 0.73 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
          0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & 0.73 & -0.27 & -0.27 & -0.02 & -0.02 & -0.02 \\
          0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & -0.27 & 0.73 & -0.27 & -0.02 & -0.02 & -0.02 \\
          0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & -0.27 & -0.27 & 0.73 & -0.02 & -0.02 & -0.02 \\
          0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & 0.73 & -0.27 & -0.27 \\
          0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.27 & 0.73 & -0.27 \\
          0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.27 & -0.27 & 0.73
        \end{bmatrix}
  • 108. Features
    Forget “bottom up” or “top down”. This approach combines all forecasts optimally.
    Method outperforms bottom-up and top-down, especially for middle levels.
    Covariates can be included in initial forecasts.
    Adjustments can be made to initial forecasts at any level.
    Very simple and flexible method. Can work with any hierarchical or grouped time series.
    Conceptually easy to implement: OLS on base forecasts.
  • 114. Challenges
        \tilde{Y}_n(h) = S (S'S)^{-1} S' \hat{Y}_n(h)
    Computational difficulties in big hierarchies due to the size of the S matrix and
    singular behavior of (S'S).
    Need to estimate the covariance matrix to produce prediction intervals.
    The assumption \varepsilon_h \approx S \varepsilon_{B,h} might be unrealistic.
    Ignores the covariance matrix in computing point forecasts.
  • 118. Optimal combination forecasts
    Solution 2: Rescaling
    Suppose we rescale the original forecasts by \Lambda, reconcile using OLS, and backscale:
        \tilde{Y}^*_n(h) = S (S' \Lambda^2 S)^{-1} S' \Lambda^2 \hat{Y}_n(h).
    If \Lambda = (\Sigma_h^\dagger)^{1/2}, we get the GLS solution.
    Approximately optimal solution: \Lambda = [\operatorname{diagonal}(\Sigma_1^\dagger)]^{1/2}.
    That is, \Lambda contains inverse one-step forecast standard deviations.
    (Compare with OLS: \tilde{Y}_n(h) = S (S'S)^{-1} S' \hat{Y}_n(h).)
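    A sketch of the rescaled (weighted) reconciliation, with made-up one-step forecast
    standard deviations standing in for the diagonal of \Sigma_1.

        S       <- rbind(c(1, 1, 1), diag(3))
        sigma1  <- c(3, 1, 1.5, 2)          # illustrative one-step forecast standard deviations
        Lambda2 <- diag(1 / sigma1^2)       # Lambda^2: inverse one-step forecast variances

        Y_hat   <- c(55, 12, 18, 28)
        P_wls   <- solve(t(S) %*% Lambda2 %*% S) %*% t(S) %*% Lambda2
        Y_tilde <- S %*% P_wls %*% Y_hat
        Y_tilde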
  • 122. Optimal combination forecasts
    Solution 3: Averaging
        \tilde{Y}^*_n(h) = S (S' \Lambda^2 S)^{-1} S' \Lambda^2 \hat{Y}_n(h)
    If the bottom level error series are approximately uncorrelated and have similar variances,
    then \Lambda^2 is inversely proportional to the number of series making up each element of Y.
    So set \Lambda^2 to be the inverse row sums of S.
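    And the averaging variant, where \Lambda^2 is taken as the inverse row sums of S
    (again a sketch with the same illustrative base forecasts).

        S       <- rbind(c(1, 1, 1), diag(3))
        Lambda2 <- diag(1 / rowSums(S))     # Total aggregates 3 series; each bottom series 1

        Y_hat   <- c(55, 12, 18, 28)
        Y_tilde <- S %*% solve(t(S) %*% Lambda2 %*% S) %*% t(S) %*% Lambda2 %*% Y_hat
        Y_tilde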
  • 124. Outline
    1 Hierarchical time series
    2 Forecasting framework
    3 Optimal forecasts
    4 Approximately optimal forecasts
    5 Application to Australian tourism
    6 hts package for R
    7 References
  • 126. Application to Australian tourism
    Quarterly data on visitor nights.
    Domestic visitor nights from 1998 – 2006.
    Data from: National Visitor Survey, based on annual interviews of 120,000 Australians
    aged 15+, collected by Tourism Research Australia.
  • 127. Application to Australian tourism
    Also split by purpose of travel: Holiday; Visits to friends and relatives; Business; Other.
  • 128–137. Exponential smoothing methods
                                       Seasonal Component
    Trend Component                    N (None)    A (Additive)    M (Multiplicative)
    N  (None)                          N,N         N,A             N,M
    A  (Additive)                      A,N         A,A             A,M
    Ad (Additive damped)               Ad,N        Ad,A            Ad,M
    M  (Multiplicative)                M,N         M,A             M,M
    Md (Multiplicative damped)         Md,N        Md,A            Md,M

    N,N:  Simple exponential smoothing
    A,N:  Holt's linear method
    Ad,N: Additive damped trend method
    M,N:  Exponential trend method
    Md,N: Multiplicative damped trend method
    A,A:  Additive Holt-Winters' method
    A,M:  Multiplicative Holt-Winters' method

    There are 15 separate exponential smoothing methods.
    Each can have an additive or multiplicative error, giving 30 separate models.
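    In R these models are available through the forecast package's ets() function, where a
    model is specified by its Error/Trend/Seasonal letters; the data and model choice below
    are illustrative only.

        library(forecast)

        y <- ts(rnorm(48, mean = 100), frequency = 4)   # made-up quarterly series
        fit_hw   <- ets(y, model = "MAM")               # multiplicative Holt-Winters with multiplicative errors
        fit_auto <- ets(y)                              # model selected automatically (default "ZZZ")
        summary(fit_auto)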
  • 144. Exponential smoothing methods
    General notation E,T,S (Error, Trend, Seasonal): ExponenTial Smoothing
    Examples:
      A,N,N: Simple exponential smoothing with additive errors
      A,A,N: Holt's linear method with additive errors
      M,A,M: Multiplicative Holt-Winters' method with multiplicative errors
    Innovations state space models
      All ETS models can be written in innovations state space form (IJF, 2002).
      Additive and multiplicative versions give the same point forecasts
      but different prediction intervals.
  • 145. Automatic forecasting
    From Hyndman et al. (IJF, 2002):
      Apply each of the 30 models that are appropriate to the data.
      Optimize parameters and initial values using MLE (or some other criterion).
      Select the best method using the AIC:
          AIC = -2 log(Likelihood) + 2p,
      where p = # parameters.
      Produce forecasts using the best method.
      Obtain prediction intervals using the underlying state space model.
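    A sketch of this automatic procedure using the forecast package: ets() fits the candidate
    models and selects by the chosen information criterion, and forecast() returns point
    forecasts with prediction intervals from the underlying state space model. The series
    below is simulated purely for illustration.

        library(forecast)

        y   <- ts(rnorm(60, mean = 50), frequency = 4)  # illustrative quarterly data
        fit <- ets(y, ic = "aic")                       # AIC-based model selection
        fc  <- forecast(fit, h = 8, level = c(80, 95))  # forecasts with 80% and 95% intervals
        fc$method                                       # the selected ETS model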
  • 149–157. Base forecasts
    [Figures: domestic tourism base forecasts (visitor nights, quarterly, 1998–2008) for a
    selection of series at different levels of the hierarchy: Total, NSW, VIC, Nth.Coast.NSW,
    Metro.QLD, Sth.WA, X201.Melbourne, X402.Murraylands, X809.Daly.]
  • 158. Hierarchy: states, zones, regions
    MAPE by forecast horizon h
                    h=1      h=2      h=4      h=6      h=8      Average
    Top Level: Australia
      Bottom-up     3.79     3.58     4.01     4.55     4.24     4.06
      OLS           3.83     3.66     3.88     4.19     4.25     3.94
      Scaling       3.68     3.56     3.97     4.57     4.25     4.04
      Averaging     3.76     3.60     4.01     4.58     4.22     4.06
    Level 1: States
      Bottom-up    10.70    10.52    10.85    11.46    11.27    11.03
      OLS          11.07    10.58    11.13    11.62    12.21    11.35
      Scaling      10.44    10.17    10.47    10.97    10.98    10.67
      Averaging    10.59    10.36    10.69    11.27    11.21    10.89
    Based on a rolling forecast origin with at least 12 observations in the training set.
  • 159. Hierarchy: states, zones, regions
    MAPE by forecast horizon h
                    h=1      h=2      h=4      h=6      h=8      Average
    Level 2: Zones
      Bottom-up    14.99    14.97    14.98    15.69    15.65    15.32
      OLS          15.16    15.06    15.27    15.74    16.15    15.48
      Scaling      14.63    14.62    14.68    15.17    15.25    14.94
      Averaging    14.79    14.79    14.85    15.46    15.49    15.14
    Bottom Level: Regions
      Bottom-up    33.12    32.54    32.26    33.74    33.96    33.18
      OLS          35.89    33.86    34.26    36.06    37.49    35.43
      Scaling      31.68    31.22    31.08    32.41    32.77    31.89
      Averaging    32.84    32.20    32.06    33.44    34.04    32.96
    Based on a rolling forecast origin with at least 12 observations in the training set.
  • 160. Groups: Purpose, states, capital
    MAPE by forecast horizon h
                    h=1      h=2      h=4      h=6      h=8      Average
    Top Level: Australia
      Bottom-up     3.48     3.30     4.04     4.56     4.58     4.03
      OLS           3.80     3.64     3.94     4.22     4.35     3.95
      Scaling       3.65     3.45     4.00     4.52     4.57     4.04
      Averaging     3.59     3.33     3.99     4.56     4.58     4.04
    Level 1: Purpose of travel
      Bottom-up     8.14     8.37     9.02     9.39     9.52     8.95
      OLS           7.94     7.91     8.66     8.66     9.29     8.54
      Scaling       7.99     8.10     8.59     9.09     9.43     8.71
      Averaging     8.04     8.21     8.79     9.25     9.44     8.82
    Based on a rolling forecast origin with at least 12 observations in the training set.
  • 161. Groups: Purpose, states, capital
    MAPE by forecast horizon h
                    h=1      h=2      h=4      h=6      h=8      Average
    Level 2: States
      Bottom-up    21.34    21.75    22.39    23.26    23.31    22.58
      OLS          22.17    21.80    23.53    23.15    23.90    22.99
      Scaling      21.49    21.62    22.20    23.13    23.25    22.51
      Averaging    21.38    21.61    22.30    23.17    23.24    22.51
    Bottom Level: Capital city versus other
      Bottom-up    31.97    31.65    32.19    33.70    33.47    32.62
      OLS          32.31    30.92    32.41    33.35    34.13    32.55
      Scaling      32.12    31.36    32.18    33.36    33.43    32.52
      Averaging    31.92    31.39    32.04    33.51    33.39    32.49
    Based on a rolling forecast origin with at least 12 observations in the training set.
  • 162. Outline
    1 Hierarchical time series
    2 Forecasting framework
    3 Optimal forecasts
    4 Approximately optimal forecasts
    5 Application to Australian tourism
    6 hts package for R
    7 References
  • 163. hts package for R
    hts: Hierarchical and grouped time series
    Methods for analysing and forecasting hierarchical and grouped time series.
    Version:    3.01
    Depends:    forecast
    Imports:    SparseM
    Published:  2013-05-07
    Author:     Rob J Hyndman, Roman A Ahmed, and Han Lin Shang
    Maintainer: Rob J Hyndman <Rob.Hyndman at monash.edu>
    License:    GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
  • 165. Example using R
    library(hts)
    # bts is a matrix containing the bottom level time series
    # g describes the grouping/hierarchical structure
    y <- hts(bts, g=c(1,1,2,2))
    Hierarchy produced: Total; A (AX, AY); B (BX, BY).
  • 167. Example using R
    library(hts)
    # bts is a matrix containing the bottom level time series
    # g describes the grouping/hierarchical structure
    y <- hts(bts, g=c(1,1,2,2))

    # Forecast 10-step-ahead using the OLS (optimal combination) method
    # ETS used for each series by default
    fc <- forecast(y, h=10)

    # Select your own methods
    ally <- allts(y)
    allf <- matrix(, nrow=10, ncol=ncol(ally))
    for(i in 1:ncol(ally))
      allf[,i] <- mymethod(ally[,i], h=10)
    allf <- ts(allf, start=2004)
    # Reconcile forecasts so they add up
    fc2 <- combinef(allf, Smatrix(y))
  • 168. hts function
    Usage:
      hts(y, g)
      gts(y, g, hierarchical=FALSE)
    Arguments:
      y             Multivariate time series containing the bottom level series
      g             Group matrix indicating the group structure, with one column for each
                    series when completely disaggregated, and one row for each grouping
                    of the time series
      hierarchical  Indicates if the grouping matrix should be treated as hierarchical
    Details:
      hts is simply a wrapper for gts(y, g, TRUE). Both return an object of class gts.
  • 169. forecast.gts function
    Usage:
      forecast(object, h,
               method = c("comb", "bu", "mo", "tdgsf", "tdgsa", "tdfp", "all"),
               fmethod = c("ets", "rw", "arima"),
               level, positive = FALSE, xreg = NULL, newxreg = NULL, ...)
    Arguments:
      object    Hierarchical time series object of class gts
      h         Forecast horizon
      method    Method for distributing forecasts within the hierarchy
      fmethod   Forecasting method to use
      level     Level used for the "middle-out" method (when method="mo")
      positive  If TRUE, forecasts are forced to be strictly positive
      xreg      When fmethod = "arima", a vector or matrix of external regressors, which
                must have the same number of rows as the original univariate time series
      newxreg   When fmethod = "arima", a vector or matrix of external regressors, which
                must have the same number of rows as the original univariate time series
      ...       Other arguments passed to ets or auto.arima
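    Putting these arguments together, an illustrative call might look as follows. This is a
    sketch that assumes the version-3.01 interface used throughout these slides
    (hts(bts, g = ...)); the bottom-level data are simulated.

        library(hts)

        # Hypothetical bottom-level data: four quarterly series named as in the earlier example
        bts <- ts(matrix(rnorm(4 * 40, mean = 100), ncol = 4), frequency = 4)
        colnames(bts) <- c("AX", "AY", "BX", "BY")
        y <- hts(bts, g = c(1, 1, 2, 2))

        fc_comb <- forecast(y, h = 8, method = "comb", fmethod = "ets")    # optimal combination
        fc_bu   <- forecast(y, h = 8, method = "bu",   fmethod = "arima")  # bottom-up, ARIMA base forecasts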
  • 170. Utility functions
    allts(y)     Returns all series in the hierarchy
    Smatrix(y)   Returns the summing matrix
    combinef(f)  Combines initial forecasts optimally
  • 171. More information
    Vignette on CRAN.
  • 172. Outline
    1 Hierarchical time series
    2 Forecasting framework
    3 Optimal forecasts
    4 Approximately optimal forecasts
    5 Application to Australian tourism
    6 hts package for R
    7 References
  • 174. References
    RJ Hyndman, RA Ahmed, G Athanasopoulos, and HL Shang (2011). "Optimal combination
    forecasts for hierarchical time series". Computational Statistics and Data Analysis
    55(9), 2579–2589.
    RJ Hyndman, RA Ahmed, and HL Shang (2013). hts: Hierarchical time series.
    cran.r-project.org/package=hts.
    RJ Hyndman and G Athanasopoulos (2013). Forecasting: principles and practice. OTexts.
    OTexts.org/fpp/.
    Papers and R code: robjhyndman.com
    Email: Rob.Hyndman@monash.edu