2. OVERVIEW
Traditional statistical methods for time-series data include the family of ARIMA models, which combine:
Autoregressive (AR) component
Moving average (MA) component
However, machine learning methods have performed competitively with ARIMA models and can also handle unusually distributed outcomes.
Both approaches gain additional accuracy when filtering or mode decomposition is applied first:
Fast Fourier Transform
Circular Convolutions
Chebyshev Polynomials
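As a point of reference for the AR component above, an AR(1) baseline can be fit by ordinary least squares in a few lines. The series below is synthetic stand-in data, not the poster's dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic random-walk series standing in for 36 observed quarters.
y = np.cumsum(rng.normal(size=36))

# AR(1) by ordinary least squares: regress y[t] on y[t-1].
X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
intercept, phi = coef

# One-step-ahead forecasts for the last 6 quarters,
# scored by sum of squared errors.
pred = intercept + phi * y[-7:-1]
sse = np.sum((y[-6:] - pred) ** 2)
```

An AR(2) model extends the same regression with a second lagged column; ARIMA adds differencing and moving-average terms on top.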
3. DATA
Small open-source dataset of quarterly sales and marketing data
Two outcomes: sales and marketing
36 observed quarters
[Figure: time-series plots of the marketing and sales outcomes]
4. FILTERING
Fast Fourier Transform: frequency conversion of a time series, representing it as a composite of sine waves (Fourier series)
Circular convolution: a related periodic decomposition with sparsity
Chebyshev polynomial: angular-frequency (pendulum-like) decomposition
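A minimal sketch of two of the filters above: an FFT low-pass filter that zeroes high-frequency components before inverting, and a Chebyshev polynomial fit as an alternative smoother. The series and the cutoff are illustrative choices, not the poster's settings:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(36)
# Noisy quarterly-style series: linear trend plus a 4-quarter seasonal cycle.
y = 0.5 * t + 5 * np.sin(2 * np.pi * t / 4) + rng.normal(scale=2, size=36)

# FFT low-pass filter: keep only the k lowest-frequency components.
k = 10  # illustrative cutoff (retains the 4-quarter cycle at index 9)
freq = np.fft.rfft(y)
freq[k:] = 0                       # zero out high frequencies
smooth = np.fft.irfft(freq, n=len(y))

# Chebyshev polynomial decomposition: fit a low-degree expansion.
coefs = np.polynomial.chebyshev.chebfit(t, y, deg=4)
cheb = np.polynomial.chebyshev.chebval(t, coefs)
```

Either smoothed series can then be passed to the forecasting algorithms in place of the raw data.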
5. ALGORITHMS TESTED
Algorithm                  Filtering?           Optimization?
Random Forest              Yes                  No
AR-1                       No                   No
AR-2                       Yes                  No
ARIMA                      No (best for model)  Yes
Linear Regression          Yes                  No
Neural Network             Yes                  Yes (number of layers)
Extreme Learning Machine   Yes                  No
Support Vector Machine     Yes                  Yes (linear, penalty=4)
First 30 observations used as training data and the last 6 as test data (except the extreme learning machine: first 6 and second 6)
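The train/test protocol above can be sketched with a simple numpy split; the linear-trend baseline below is an illustrative stand-in for the fitted algorithms:

```python
import numpy as np

rng = np.random.default_rng(2)
y = np.cumsum(rng.normal(size=36))   # stand-in for 36 observed quarters

# First 30 quarters for training, last 6 held out for testing.
train, test = y[:30], y[30:]

# Fit a simple linear-trend baseline on the training window only.
t_train = np.arange(30)
slope, intercept = np.polyfit(t_train, train, deg=1)
pred = intercept + slope * np.arange(30, 36)

# Score on the held-out quarters with sum of squared errors,
# matching the error metric reported in the results table.
sse = np.sum((test - pred) ** 2)
```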
6. RESULTS (SUM OF SQUARED ERRORS)
Algorithm                  Marketing          Sales
Random Forest              2.92               288.90
AR-1                       35.51              101.85
AR-2                       33.42              92.22
Tuned ARIMA                28.65 (2,0,3)      93.23 (3,0,4)
Linear Regression          Unable to compute  2e-28
Neural Network             0.02               2156.65
Extreme Learning Machine   3e-25              6e-25
Support Vector Machine     0.95               10.91
Extreme learning machines provide good predictions for time-series data across problems while remaining fast to compute.
Several types of neural network models also exist within ARIMA software packages and may be useful.
7. NEW MULTIVARIATE “STATE” ALGORITHM
Extensible Markov Models
Clustering across time for future state prediction
Transition probabilities as Markov chain
Nice asymptotic properties and prediction algorithms
Can handle state-level (multivariate) prediction, rather than individual-level (univariate) prediction
Example output of state averages:
State   Advertising   Sales
1       19.84         20.14
2       28.87         47.91
3       20            4.5
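The state construction above can be sketched as clustering the multivariate observations into states and counting transitions; this is an illustrative approximation of the idea, not the Extensible Markov Model's online clustering algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)
# Stand-in bivariate (advertising, sales) series over 36 quarters.
data = rng.normal(size=(36, 2))

# Assign each quarter to one of k states by nearest centroid.
k = 3
centroids = data[rng.choice(36, size=k, replace=False)]
states = np.argmin(((data[:, None, :] - centroids) ** 2).sum(axis=2), axis=1)

# Count state-to-state transitions; tiny smoothing avoids empty rows.
counts = np.full((k, k), 1e-6)
for a, b in zip(states[:-1], states[1:]):
    counts[a, b] += 1

# Normalize rows into a Markov transition-probability matrix.
P = counts / counts.sum(axis=1, keepdims=True)

# Iterate the chain to forecast the state distribution 4 steps ahead.
dist = np.eye(k)[states[-1]]     # one-hot vector for the current state
for _ in range(4):
    dist = dist @ P
```

Iterating the chain in this way is what lets the model assign probabilities to states at much later times.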
8. EXTENSIBLE MARKOV MODEL RESULTS
Performance
Very good match to actual data
Better with a filter or two (the 1st or 2nd most likely state was predicted for every test case)
Captures many features of data and gives iterative
probabilities to forecast states at much later times:
[Figure: MDS projection of states 1-7; Dimensions 1 and 2 explain 72.21% of the point variability.]
9. CONCLUSIONS
Filters greatly improve predictive performance of
forecasting methods based on time-series data.
Many machine learning algorithms perform
competitively on time-series problems compared to
state-of-the-art statistical models.
Support vector machines and extreme learning
machines were the best overall methods.
Extensible Markov Models provide a good
alternative to traditional forecasting models for
multivariate data, as is common in cross-organization
business data.
They offer a viable alternative with good prediction,
particularly when coupled with filtering methods.