International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 4, Issue 3, May-June (2013), pp. 370-376, © IAEME
COMPARISON OF THE FORECASTING TECHNIQUES – ARIMA, ANN AND SVM - A REVIEW
V. Anandhi1, Assistant Professor (Computer Science), Department of Forest Resource Management, Forest College and Research Institute, Mettupalayam 641 301, Tamil Nadu, India
Dr. R. Manicka Chezian2, Associate Professor, Department of Computer Science, NGM College, Pollachi-642001, Tamil Nadu, India
ABSTRACT
Wood pulp is the most common raw material in paper making. Forecasting is a systematic effort to anticipate future events or conditions. Forecasting is usually carried out using statistical methods such as ARIMA, and nowadays Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are also widely used in forecasting for their accuracy. In the ANN approach, a Levenberg-Marquardt Back Propagation (LMBP) algorithm has been used to develop the ANN models. In developing the ANN models, networks with different numbers of neurons in the hidden layers were evaluated, and the forecast is made using the feed-forward Back Propagation Network (BPN). Support Vector Regression (SVR), a category of support vector machine, attempts to minimize the generalization error bound so as to achieve generalized performance. Regression is the task of finding a function which approximates a mapping from an input domain to the real numbers on the basis of a training sample.
Keywords: Forecasting, ARIMA, Artificial Neural Networks (ANN) and Support Vector
Machines (SVM), Levenberg-Marquardt Back Propagation (LMBP), Back Propagation
Network (BPN)
I. INTRODUCTION
Forecasting is a systematic effort to anticipate future events or conditions. Forecasts are more accurate for larger groups of items and for longer time periods. Many forecasters depend heavily on models to help in forecasting. A model consists of mathematical expressions or equations which describe relationships among variables, and a forecaster's choice of forecasting model is of key importance. Applications of forecasting include rainfall, stock market prices, cash forecasting in banks, etc. Forecasting is usually carried out using statistical methods such as ARIMA, and nowadays Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are widely used in forecasting for their accuracy.
II. FORECASTING SYSTEM
A good forecasting system has qualities that distinguish it from other systems, and these qualities provide a useful basis for understanding why good forecasting systems outperform others over time. If users understand the rationale of a forecast, they can appraise its uncertainty, and they will know when to revise the forecast in light of changing circumstances. The more accurate a forecast is, the better are the decisions that depend upon it; inaccurate forecasts lead to too much or too little capacity and can be very costly. Forecasts cost money, time and effort, so the added expense must purchase added accuracy, flexibility, or insight. A sophisticated forecasting system requires ample staff resources and technical skills for maintenance, so the choice of a forecasting system must include a commitment to the resources necessary to maintain it and avoid systems the user will be unable or unwilling to maintain.
Earlier, statistical methods were used for forecasting, with a time series model constructed solely from the past values of the variable to be forecast. The best-known single-equation 'autoregressive' time series model is the ARIMA (Auto Regressive Integrated Moving Average) model. Auto Regressive means that a variable is a function of only its own past values, except for deviations introduced by an error term. Integrated means that the period-to-period changes in the level of the original variable, rather than the level itself, are employed in the estimation procedure. The Moving Average procedure is used to eliminate any intercorrelations of the error term with its own past or future values. ARIMA models can represent a wide range of time series; they constitute a "stochastic" modeling approach that can be used to calculate the probability of a future value lying between two specified limits.
For more than two decades, the Box-Jenkins Auto-Regressive Integrated Moving Average (ARIMA) technique has been widely used for time series forecasting. However, ARIMA is a general univariate model developed on the assumption that the time series being forecast is linear and stationary. Because of its popularity, the ARIMA model has been used as a benchmark to evaluate many new modeling approaches [1]. The method of least squares was used to estimate the parameters in ARIMA. The Artificial Neural Network (ANN), widely used in forecasting, supports multivariate analysis. Multivariate models can draw on greater information: not only the lagged time series being forecast, but also other indicators (such as technical, fundamental and inter-market indicators for financial markets) are combined to act as predictors. In addition, ANN is more effective in describing the dynamics of non-stationary time series due to its unique non-parametric, assumption-free, noise-tolerant and adaptive properties. ANNs are universal function approximators that can map any nonlinear function without a priori assumptions about the data [2].
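The least-squares estimation of ARIMA parameters mentioned above can be sketched in a few lines. The following Python sketch is ours, not from the paper: it simulates an ARIMA(1,1,0)-style series whose first difference follows an AR(1) process, applies the "I" step by differencing once, and then recovers the autoregressive coefficient by ordinary least squares; all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a series whose first difference follows an AR(1) process
# with coefficient phi = 0.6 (an ARIMA(1,1,0)-style series).
phi = 0.6
n = 500
diff = np.zeros(n)
for t in range(1, n):
    diff[t] = phi * diff[t - 1] + rng.normal()
series = np.cumsum(diff)        # "integrate" back to the observed level

# Estimation: difference once (the "I" step), then fit the AR
# coefficient by ordinary least squares on the lagged values.
d = np.diff(series)
X, y = d[:-1], d[1:]
phi_hat = float(X @ y) / float(X @ X)   # closed-form OLS slope

print("estimated phi:", round(phi_hat, 3))
```

With 500 simulated points the estimate lands close to the true coefficient of 0.6; real ARIMA software additionally selects the orders (p, d, q) and estimates the moving-average terms.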
AUTOREGRESSIVE (AR) MODEL
An Autoregressive (AR) model is a representation of a type of random process; as such, it describes certain time-varying processes in nature, economics, etc. The autoregressive model specifies that the output variable depends linearly on its own previous values; the notation AR(p) indicates an autoregressive model of order p. The AR(p) model is defined as
X_t = c + φ1 X_{t-1} + φ2 X_{t-2} + … + φp X_{t-p} + ε_t
where φ1, …, φp are the parameters of the model, c is a constant, and ε_t is white noise. This can be equivalently written using the backshift operator B as
X_t = c + Σ_{i=1..p} φi B^i X_t + ε_t
so that, moving the summation term to the left side and using polynomial notation, we have
φ(B) X_t = c + ε_t, where φ(B) = 1 - φ1 B - φ2 B^2 - … - φp B^p.
An autoregressive model can thus be viewed as the output of an all-pole infinite impulse response filter whose input is white noise. Some constraints are necessary on the values of the parameters of this model in order that the model remain wide-sense stationary. For example, an AR(1) process with |φ1| ≥ 1 is not stationary. More generally, for an AR(p) model to be wide-sense stationary, the roots of the polynomial
z^p - φ1 z^{p-1} - … - φp
must lie within the unit circle, i.e., each root z_i must satisfy |z_i| < 1.
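The stationarity condition above is easy to check numerically. A small Python sketch (ours, not from the paper) builds the polynomial z^p - φ1 z^(p-1) - … - φp, computes its roots with numpy, and tests whether they all lie within the unit circle:

```python
import numpy as np

def is_stationary(phi):
    """Check wide-sense stationarity of an AR(p) model with
    coefficients phi = [phi_1, ..., phi_p]: every root of
    z^p - phi_1 z^(p-1) - ... - phi_p must satisfy |z| < 1."""
    # numpy.roots takes coefficients from highest to lowest degree.
    poly = np.concatenate(([1.0], -np.asarray(phi, dtype=float)))
    return bool(np.all(np.abs(np.roots(poly)) < 1.0))

print(is_stationary([0.5]))        # AR(1) with |phi_1| < 1 -> True
print(is_stationary([1.2]))        # |phi_1| >= 1 -> False
print(is_stationary([0.5, -0.3]))  # a stationary AR(2) -> True
```

For the AR(1) case this reduces to the |φ1| < 1 condition stated in the text; for higher orders the root check is the practical test.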
Machine learning techniques
Machine learning, a branch of artificial intelligence, was originally employed to
develop techniques to enable computers to learn. It includes a number of advanced statistical
methods for regression and classification. In certain applications it is sufficient to directly
predict the dependent variable without focusing on the underlying relationships between
variables. In other cases, the underlying relationships can be very complex and the
mathematical form of the dependencies unknown. For such cases, machine learning
techniques emulate human cognition and learn from training examples to predict future
events.
Artificial Neural Network
Artificial Neural Networks (ANNs) are models based on the neural structure of the brain. The brain learns from experience, and artificial neural networks try to mimic its functioning. A neural network is a massively parallel distributed processor made up of simple processing units. It has a natural propensity for storing experiential knowledge and making it available for use.
Fig.1 Artificial Neural Network Architecture
The architecture of an ANN is defined by the number of layers, the number of neurons in each layer, the weights between neurons, and a transfer function that controls the generation of output in a neuron. Supervised training is accomplished by presenting a sequence of training vectors, or patterns, each with an associated target output vector [3]. Neural networks are a class of nonlinear models that can approximate any nonlinear function to an arbitrary degree of accuracy and have the potential to be used as forecasting tools in many different areas [4]. The most commonly used neural network architecture is the multilayer feedforward network. It consists of an input layer, an output layer and one or more intermediate layers called hidden layers. All the nodes at each layer are connected to each node at the layer above by interconnection strengths called weights. A training algorithm is used to attain a set of weights that minimizes the difference between the target and the actual output produced by the network; many different neural-net learning algorithms are found in the literature. The ANN model is a mathematical model inspired by the function of the human brain, and its use is mainly motivated by its capability of approximating a function to any degree of accuracy.
First, ANN is a method with few limiting hypotheses and can be easily adapted to many types of data. Secondly, ANN can be generalized. Thirdly, ANN has a general functional structure. Furthermore, ANNs can be classified as non-linear models. One critical decision is to determine the appropriate architecture, that is, the number of layers, the number of nodes in each layer, and the number of arcs which interconnect the nodes. The feedforward neural network (FNN) has been used in many studies for series forecasting [5]. The most popular supervised training algorithm is the BPN. Back propagation is a form of supervised learning for multi-layer nets, also known as the generalized delta rule: error data at the output layer is "back propagated" to earlier layers, allowing the incoming weights to those layers to be updated [6]. It is the training algorithm most often used in current neural network applications. Determining the architecture depends on the problem at hand. ANN gives good accuracy and is thus often preferred for forecasting; ANN had a significantly lower error compared with other methods [7].
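The back-propagated error signal described above can be shown concretely. The following is a minimal numpy sketch of our own (plain gradient-descent back propagation, not the authors' LMBP model) training a one-hidden-layer feedforward network on the classic XOR pattern; the layer size, learning rate and iteration count are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy supervised task: XOR, a pattern that requires a hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)   # target output vectors

# One hidden layer of 8 neurons; W1/W2 are the interconnection
# strengths (weights) between layers, b1/b2 the biases.
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)
lr = 0.5

for _ in range(20000):
    # Forward pass through the feedforward network.
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)
    # Backward pass: the output-layer error is "back propagated" to
    # the hidden layer (generalized delta rule), then weights update.
    d_out = (Y - T) * Y * (1 - Y)
    d_hid = (d_out @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hid;  b1 -= lr * d_hid.sum(axis=0)

mse = float(np.mean((Y - T) ** 2))
print("final MSE:", round(mse, 4))
```

After training, the mean squared error is close to zero and the network reproduces the XOR targets; the LMBP algorithm used in the paper replaces this plain gradient step with a Levenberg-Marquardt update for faster convergence.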
Support Vector Machines
Support Vector Machines (SVM) are learning machines implementing the structural risk minimization inductive principle to obtain good generalization on a limited number of learning patterns. The theory was originally developed by Vapnik [1] and his co-workers at the AT&T Bell Laboratories on the basis of a separable bipartition problem. SVM implements a learning algorithm useful for recognizing subtle patterns in complex data sets. Instead of minimizing the observed training error, Support Vector Regression (SVR) attempts to minimize the generalization error bound so as to achieve generalized performance. There are two main categories of support vector machines: support vector classification (SVC) and support vector regression (SVR). SVM is a learning system using a high-dimensional feature space. It yields prediction functions that are expanded on a subset of support vectors. SVM can generalize complicated gray-level structures with only very few support vectors and thus provides a new mechanism for image compression. A version of SVM for regression was proposed in 1997 by Vapnik, Steven Golowich, and Alex Smola [8]. This method is called support vector regression (SVR); the model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold ε) to the model prediction [9]. Support Vector Regression (SVR) is the most common application form of SVMs. Support vector machines project the
data into a higher dimensional space and maximize the margins between classes or minimize the error margin for regression [10].
Support Vector Machines (SVMs) are a popular machine learning method for classification, regression, and other learning tasks. The basic principle of SVM is that, given a set of points which need to be classified into two classes, one finds a separating hyperplane which maximizes the margin between the two classes. This ensures better classification of the unseen points, i.e. better generalization. In SVR, the goal is to find a function f(x) that has at most ε deviation from the actually obtained targets yi for all the training data. Support vector regression is the natural extension of large-margin kernel methods [11] used for classification to regression analysis. The problem of regression is that of finding a function which approximates a mapping from an input domain to the real numbers on the basis of a training sample. The difference between the hypothesis output and the training value [12] is called the residual of the output, an indication of the accuracy of the fit at that point. One must decide how to measure the importance of this accuracy, as small residuals may be inevitable while large ones must be avoided. The loss function determines this measure, and each choice of loss function results in a different overall strategy for performing regression.
Fig. 2. ε -insensitive Loss Function for regression
Fig. 3. ε -insensitive zone for non-linear support vector regression
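The ε-insensitive loss sketched in Fig. 2 takes only a few lines of code. The Python function below is a generic illustration of ours (not from the paper): residuals that fall inside the ε tube cost nothing, while larger residuals are penalized linearly.

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """SVR's epsilon-insensitive loss: residuals inside the epsilon
    tube cost nothing; outside it the cost grows linearly."""
    residual = np.abs(y_true - y_pred)
    return np.maximum(0.0, residual - eps)

y_true = np.array([1.0, 1.0, 1.0, 1.0])
y_pred = np.array([1.05, 0.95, 1.30, 0.60])
# The first two predictions lie inside the +/-0.1 tube and cost 0;
# the last two cost |residual| - 0.1.
print(eps_insensitive_loss(y_true, y_pred, eps=0.1))
```

In the full SVR objective this loss is summed over the training set and traded off against the model-complexity term ||w||² discussed next.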
Support vector regression performs linear regression in the feature space using the ε-insensitive loss function and, at the same time, tries to reduce model complexity by minimizing ||w||². Support vector machines (SVM) are used as they reduce the time and
expertise needed to construct and train price forecasting models. SVM also has fewer tunable parameters, and the choice of parameter values is less critical for good forecasting results. SVM can optimize its structure (tune its parameter settings) on the input training data provided. SVM training involves solving a quadratic optimization problem, which has a unique solution and does not involve random weight initialization as training a NN does. An SVM with the same parameter settings trained on identical data therefore produces identical results, which increases the repeatability of SVM forecasts while reducing the number of training runs needed to locate optimum SVM parameter settings [13]. SVMs can also be used for regression analysis when the data is not distributed regularly or when its distribution is not known [14]. The information to be transformed is evaluated before it enters the classification technique's scoring.
The advantages of SVM techniques are as follows. SVMs gain flexibility through the introduction of a kernel in the choice of the form of the threshold separating the instances, which does not need to be linear or to have the same functional form for all the data; this is because the function is non-parametric and operates locally. No assumptions are necessary, as the kernel contains a non-linear transformation whose functional form ensures that the data is linearly separable; the transformation is implicit, rests on a robust theoretical basis, and requires no human judgment. When the parameters C and r (in the case of a Gaussian kernel) are chosen correctly, SVMs provide good out-of-sample generalization, and selecting an appropriate generalization grade ensures that SVMs are robust even with a biased training sample. Finally, as the optimality problem is convex, SVMs deliver unique solutions; this is advantageous compared with Neural Networks, whose multiple solutions linked to local minima might not be robust over samples.
V. CONCLUSION
A Support Vector Regression based prediction model, appropriately tuned, can outperform other more complex models. Support vector regression is a statistical method for creating regression functions of arbitrary type from a set of training data, and testing the quality of regression on the training set shows good prediction accuracy. A neural network model with an improved learning technique is also promising for forecasting. Artificial Neural Networks, the well-known function approximators in prediction and system modelling, have recently shown great applicability in time-series analysis and forecasting.
ACKNOWLEDGMENTS
Sincere thanks to Mr. R. Sreenivasan, Assistant General Manager, Tamil Nadu
Newsprint and Papers Limited (TNPL), Karur, Dr. K. T. Parthiban, Professor and Head, Tree
Breeding, Dr. P. Durairasu, Dean, FC&RI, Dr. M. Anjugam, Professor and Head, Forest
College and Research Institute for their guidance and support.
REFERENCES
[1] H. B. Hwarng and H. T. Ang, “A simple neural network for ARMA(p,q) time series,”
OMEGA: Int. Journal of Management Science, vol. 29, pp 319-333, 2002.
[2] L. Cao and F. Tay, "Financial forecasting using support vector machines," Neural Computing & Applications, vol. 10, pp. 184-192, 2001.
[3] V. Anandhi, R. Manicka Chezian and K. T. Parthiban, "Forecast of Demand and Supply of Pulpwood using Artificial Neural Network", International Journal of Computer Science and Telecommunications, pp. 35-38, 2012.
[4] Joarder Kamruzzaman and Ruhul A. Sarker, "ANN-Based Forecasting of Foreign Currency Exchange Rates", Neural Information Processing - Letters and Reviews, vol. 3, no. 2, May 2004.
[5] Cem Kadilar, Muammer Simsek and Cagdas Hakan Aladag, "Forecasting the exchange rate series with ANN: the case of Turkey", Istanbul University Econometrics and Statistics e-Journal, pp. 17-29, 2009.
[6] V. Anandhi and R. Manicka Chezian, "Backpropagation Algorithm for Forecasting the Price of Pulpwood - Eucalyptus", International Journal of Advanced Research in Computer Science, pp. 355-357, 2012.
[7] K. Mohammadi, H. R. Eslami and Sh. Dayyani Dardashti, "Comparison of Regression, ARIMA and ANN Models for Reservoir Inflow Forecasting using Snowmelt Equivalent (a Case Study of Karaj)", J. Agric. Sci. Technol., pp. 17-30, 2005.
[8] V. Vapnik, S. Golowich and A. Smola, "Support vector method for function approximation, regression estimation, and signal processing," Neural Information Processing Systems, vol. 9, MIT Press, Cambridge, MA, 1997.
[9] D. Basak, S. Pal and D. C. Patranabis, "Support Vector Regression," Neural Information Processing - Letters and Reviews, vol. 11, no. 10, pp. 203-224, October 2007.
[10] “A Comparison of Machine Learning Techniques and Traditional Methods,” Journal of
Applied Sciences, vol. 9, pp. 521-527.
[11] K. P. Soman, R. Loganathan and V. Ajay, "Support vector machines and other kernel methods," Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham.
[12] H. Drucker, C. J. C. Burges, L. Kaufman, A. Smola and V. Vapnik, "Support vector regression machines," Advances in Neural Information Processing Systems, The MIT Press, vol. 9, pp. 155, 1997.
[13] D. C. Sansom, T. Downs and T. K. Saha, "Evaluation of support vector machine based forecasting tool in electricity price forecasting for Australian national electricity market participants," Journal of Electrical and Electronics Engineering, Australia, vol. 22, no. 3, pp. 227-234, 2003.
[14] L. Zhang, F. Lin and B. Zhang, "Support vector machine learning for image retrieval," in Proceedings of the International Conference on Image Processing, IEEE, vol. 2, pp. 721-724, 2001.
[15] M. Nirmala and S. M. Sundaram, “Modeling and Predicting the Monthly Rainfall in
Tamilnadu as a Seasonal Multivariate Arima Process”, International Journal of Computer
Engineering & Technology (IJCET), Volume 1, Issue 1, 2010, pp. 103 - 111, ISSN Print:
0976 – 6367, ISSN Online: 0976 – 6375.
[16] Vilas Naik and Raghavendra Havin, “Entropy Features Trained Support Vector Machine
Based Logo Detection Method for Replay Detection and Extraction from Sports Videos”,
International Journal of Graphics and Multimedia (IJGM), Volume 4, Issue 1, 2013,
pp. 20 - 30, ISSN Print: 0976 – 6448, ISSN Online: 0976 –6456.
[17] Ankush Gupta, Ameesh Kumar Sharma and Umesh Sharma, “To Forecast the Future
Demand of Electrical Energy in India: by Arima & Exponential Methods”, International
Journal of Advanced Research in Engineering & Technology (IJARET), Volume 4, Issue 2,
2013, pp. 197 - 205, ISSN Print: 0976-6480, ISSN Online: 0976-6499.