Implementation of Continuous Sequential
Importance Resampling Algorithm for Stochastic
Volatility Estimation
Harihara Subramanyam Sreenivasan
Master’s Degree in Data Science Student
Barcelona Graduate School of Economics
June 30, 2016
Acknowledgements
I would like to express my sincere gratitude to Prof. Christian Brownlees for his guidance and support during this trimester pertaining to the material in this thesis. In addition, a majority of the fundamentals required for this thesis were developed as part of his coursework and I am glad to have had the opportunity to work on this project.
Besides Prof. Brownlees, I would also like to thank Dr. Hrvoje Stojic
as the techniques taught by him in Advanced Computational Methods
helped me improve my efficiency while working on this thesis.
Lastly, I would like to thank all of the faculty at Barcelona Graduate
School of Economics for the knowledge I have been able to acquire over
this past year.
Abstract
The objective of this master’s thesis was to implement a CSIR algorithm for estimating predictive densities in SV models as an R package. The project began with a brief revision of econometric concepts, followed by the step-by-step development of the package. The diagnostics and test results of the functions developed are presented, and the current status and future work for the package are discussed.
Contents
1 Introduction 1
2 Econometrics 2
2.1 Time Series Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2.2 Time Series Modelling . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2.1 Modelling for E[y_t | F_{t−1}] + ε_t . . . . . . . . . . . . . 4
2.2.2 Modelling for Var[y_t | F_{t−1}] = E(ε_t² | F_{t−1}) . . . . . . . . . 5
3 Stochastic Volatility &
Smooth Particle Filters 6
4 Implementation, Diagnostics,
Package Development and Testing 10
5 Conclusions 12
6 Figures 13
7 Code 16
7.1 SV Series Simulation . . . . . . . . . . . . . . . . . . . . . . . . . 17
7.2 Sequential Importance Resampling . . . . . . . . . . . . . . . . . 18
7.3 Continuous Sequential Importance Resampling . . . . . . . . . . 19
7.4 Likelihood Computation . . . . . . . . . . . . . . . . . . . . . . . 20
7.5 Likelihood Minimization/Parameter Estimation . . . . . . . . . . 21
7.6 Plotting Volatility . . . . . . . . . . . . . . . . . . . . . . . . . . 22
7.7 MC SV Parameter Estimate . . . . . . . . . . . . . . . . . . . . . 23
1 Introduction
The structure of this thesis reflects the methodology followed for approaching the task described in the abstract.
We begin with introductory econometrics concepts, followed by the theory of stochastic volatility and particle filtering; in that section we describe the filtering algorithms used.
After developing an understanding of the fundamentals, we developed and tested the functions that would be incorporated into the final package. Numerous tests were performed on series simulated from an SV model, and the results of these tests are reported.
Once the package had been tested thoroughly, it was applied to the daily returns of SP500 and BAC stocks. The estimated parameters and the plots produced are reported as well.
Section 4 discusses the practical applicability and functionality of the package, further development plans, and the steps required to make it available on CRAN.
Additional MC tests beyond the one carried out for this thesis are suggested in the Conclusions.
A major part of the theory presented in this thesis is drawn from the lecture slides of the Financial Econometrics course at BGSE [2] and the 'Time Series Analysis and Statistical Arbitrage' course in the MS program in 'Mathematics in Finance' at NYU's Courant Institute. Those notes were made publicly available by Robert Almgren [1] and Robert Reider [6]. All other references, including these three, are listed in the bibliography.
2 Econometrics
2.1 Time Series Data
Time series data cannot be regarded as generated by an iid process, because every observed data point depends on the data points observed just before it. Therefore, in order to work with financial and economic time series, we regard the data points as realizations of a time-varying stochastic process. A univariate stochastic process can be defined as follows:
{Y_t}, t ∈ {0, 1, . . . , T}

where Y_t is a collection of random variables defined on a probability space (Ω, F, P) and indexed by t. Using this definition, our time series data can be interpreted as realizations

y_t ∼ Y_t(ω), where ω ∈ Ω and F_t = {y_{t−1}, y_{t−2}, . . . , y_1}
This underlying process could be continuous in time, nevertheless, as a result of
our measurement methodology and frequency we will only be recording events
separated by fixed equal-length time intervals. Therefore, we will be studying
these as discrete-time processes.
While working with time series it is a requirement that the series Y_t is covariance stationary; it must therefore satisfy the following properties:

E[Y_t²] is finite ∀ t
E[Y_t] = µ ∀ t
Cov(Y_t, Y_h) = γ_{h−t} ∀ t, h ∈ T

The autocovariance must depend only on the number of lags k = h − t.
Most of the work done during this project used daily returns data, more specifically SP500 and BAC daily returns. In order to convert these observations into realizations of a stationary process, we difference each data point with the value observed directly before it. Let X_t, t ∈ T, denote the price series of these stocks. Hence:

Y_t = X_t − X_{t−1} where t ∈ {1, . . . , T}

Apart from testing for stationarity, we also assess normality using the Jarque-Bera test. Our goals when working with time series data can be summarized as follows:

1) Modelling: propose a theoretical process and determine whether the data points are plausible realizations of that process under its optimized parameters.

2) Forecasting: after using the information set F_t to model the underlying data generating process, attempt to predict y_{t+1}, y_{t+2}, . . .
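As a minimal sketch of this preprocessing step in R, assuming the 'tseries' package for the tests and using the built-in DAX prices from EuStockMarkets purely as an illustration (the SP500 and BAC data used in the thesis are not bundled with it):

# Difference a price series and check stationarity and normality of the result.
library(tseries)                          # provides adf.test() and jarque.bera.test()
x <- as.numeric(EuStockMarkets[, "DAX"])  # illustrative price series
y <- diff(x)                              # Y_t = X_t - X_{t-1}
adf.test(y)                               # augmented Dickey-Fuller test for stationarity
jarque.bera.test(y)                       # Jarque-Bera normality test on the differences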
2.2 Time Series Modelling
In general, the modelling approaches with time series data can be classified as
follows:
1) Modelling for the conditional mean E[y_t | F_{t−1}] + ε_t

2) Modelling for the conditional variance Var[y_t | F_{t−1}] = E(ε_t² | F_{t−1})

where ε_t is the innovation/shock. At any given t, the information available to us is the finite-dimensional distribution of the data F(y_t, y_{t−1}, . . . ; ω) and the joint densities f(y_1, y_2, . . . , y_n; ω) where n ∈ {1, 2, . . . , t − 1}. Here ω is the vector of parameters that generated the data points, and our objective is to determine these with reasonable accuracy for our proposed theoretical process. Consider the likelihood function:

L_n(ω) = f(y_t, y_{t−1}, . . . , y_1; ω)

Since our data are not iid, each joint density is expressed as a product of conditional densities:

f(y_1, y_2, . . . , y_n; ω) = f(y_n | y_{n−1}, . . . , y_1; ω) × f(y_{n−1} | y_{n−2}, . . . , y_1; ω) × . . . × f(y_{p+1} | y_p, . . . , y_1; ω) × f(y_1, y_2, . . . , y_p; ω)
Based on the above, we can define the log likelihood as:

log L_n(ω) = Σ_{t=p+1}^{T} log f(y_t | y_{t−1}, y_{t−2}, . . . , y_1; ω) + log f(y_1, y_2, . . . , y_p; ω)

We typically use only the first p lags of data. Finally, Wold’s decomposition theorem states that any covariance stationary process Y_t can be expressed as the sum of a deterministic time series and an infinite series of shocks:

Y_t = η_t + Σ_{i=1}^{∞} β_i ε_{t−i}

Using the above concepts, we approach modelling.
2.2.1 Modelling for E[y_t | F_{t−1}] + ε_t

There are two classes of models used in modelling for the conditional mean.

MA(q)

y_t = θ_0 + Σ_{i=1}^{q} θ_i ε_{t−i} + ε_t where ε_t ∼ WN(0, σ²_ε)

With the moving average models, we attempt to describe y_t as a weighted combination of the past shocks from the innovation.

AR(p)

y_t = φ_0 + Σ_{i=1}^{p} φ_i y_{t−i} + ε_t where ε_t ∼ WN(0, σ²_ε)

With the autoregressive models, we attempt to describe y_t as a weighted combination of the previously observed data points.
Typically, we combine these two and use an Autoregressive Moving Average model.

ARMA(p,q)

y_t = φ_0 + Σ_{i=1}^{p} φ_i y_{t−i} + Σ_{i=1}^{q} θ_i ε_{t−i} + ε_t where ε_t ∼ WN(0, σ²_ε)

We can usually determine the orders p and q by looking at the autocorrelation and partial autocorrelation functions of y_t. With real world data, ARMA(1,1) models perform quite well.
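A minimal sketch of this order-selection and fitting step in base R (the simulated series and the ARMA(1,1) choice are assumptions for illustration; this snippet is not part of the package code in Section 7):

# Simulate an ARMA(1,1) process, inspect its dependence structure, and refit it.
set.seed(42)
y <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 1000)
acf(y)                               # autocorrelation function
pacf(y)                              # partial autocorrelation function
fit <- arima(y, order = c(1, 0, 1))  # ARMA(1,1), i.e. ARIMA(1,0,1) on the levels
fit                                  # estimated coefficients and standard errors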
While forecasting with ARMA models, it can be observed that after k steps
ahead, the predictions will converge to the unconditional mean of the process.
This is a result of the fact that the past information will have little influence
after a certain time period.
2.2.2 Modelling for Var[y_t | F_{t−1}] = E(ε_t² | F_{t−1})

When we assess the autocorrelation functions of y_t, |y_t|, and y_t², we notice that the autocorrelations of the absolute and squared returns are much more significant than those of the original return series. This is volatility clustering; in order to exploit this serial dependence and model the fluctuations in the scale of returns, we use volatility modelling. The volatility clustering in the SP500 series is shown in Figure 1.
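This pattern is easy to see in R. A sketch using the built-in DAX returns as a stand-in for the thesis data (the choice of series is an assumption; only the comparison of the three ACFs matters):

# Autocorrelations of returns vs. absolute and squared returns:
# the latter two decay slowly when volatility clusters.
y <- diff(log(EuStockMarkets[, "DAX"]))
acf(y,      main = "ACF of returns")
acf(abs(y), main = "ACF of absolute returns")
acf(y^2,    main = "ACF of squared returns")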
There are two approaches to modelling the σ²_t of the process:

1) σ²_t as a deterministic equation

2) σ²_t as a stochastic equation

Among the approaches that model σ²_t as a deterministic equation, a few adjust for the leverage effect. The leverage effect implies that a negative return has a higher impact on volatility than a positive return. One of the popular models that try to capture this asymmetry in volatility is the GJR-GARCH.
GJR-GARCH

y_t = σ_t z_t where z_t ∼ D(0, 1)

σ_t² = φ_0 + Σ_{i=1}^{q} α_i y_{t−i}² + γ y_{t−1}² I^−_{t−1} + β_1 σ_{t−1}²

where:

I^−_{t−1} = 1 if y_{t−1} < 0, and 0 otherwise.   (1)
However, we will be using the GARCH(1,1) model as a comparison benchmark
for volatility estimation performance. Based on earlier experience and general
agreement, it can be stated that GARCH(1,1) works well with real world data.
GARCH(p,q)

y_t = σ_t z_t where z_t ∼ D(0, 1)

σ_t² = φ_0 + Σ_{i=1}^{q} α_i y_{t−i}² + Σ_{i=1}^{p} β_i σ_{t−i}²
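Since the GARCH(1,1) benchmark is later fitted with garchFit from the 'fGarch' package (Section 4), a minimal sketch of that benchmark fit looks as follows (the DAX return series is an assumption used only for illustration):

# Fit the GARCH(1,1) benchmark used for comparison with the SV model.
library(fGarch)
y <- as.numeric(diff(log(EuStockMarkets[, "DAX"])))   # illustrative return series
garch_fit <- garchFit(~ garch(1, 1), data = y, trace = FALSE)
coef(garch_fit)                                       # mu, omega, alpha1, beta1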
3 Stochastic Volatility &
Smooth Particle Filters
The simple Stochastic Volatility model is

y_t = σ_t z_t where z_t ∼ N(0, 1)

log(σ_t²) = φ_0 + φ_1 log(σ_{t−1}²) + ν_t where ν_t ∼ N(0, τ²_ν)

For notational simplicity, log(σ_t²) will henceforth be denoted α_t. In this model, the assumption is that α_t is the latent variable of a Hidden Markov Model. Therefore, in order to work with the series y_t, we use two densities:

1) Transition density: h(α_t | α_{t−1}) ∼ N(φ_0 + φ_1 α_{t−1}, τ²_ν)

2) Measurement density: g(y_t | α_t) ∼ N(0, exp(α_t)), i.e. y_t = exp(α_t/2) z_t
As we can see from the above, we need a way to summarize the information about the latent state α_t without actually observing it, and we approach this by using another two densities:

1) Prediction density: f_1(α_t | F_{t−1}; ω)

2) Filtering density: f_2(α_t | F_t; ω)

The prediction density is the density of α_t given the information up to t − 1, and the filtering density uses the information up to t. Therefore, at time t, the prediction density is:

f_1(α_t | F_{t−1}) = ∫ h(α_t | α_{t−1}) f_2(α_{t−1} | F_{t−1}) dα_{t−1}
Using this prediction density at t, the observation y_t, and the measurement density, we obtain the filtering density at t, evaluated as:

f_2(α_t | F_t) ∝ g(y_t | α_t) f_1(α_t | F_{t−1})

Based on the above, the log likelihood of ω can be evaluated as:

log L_n(ω) = Σ_{t=1}^{n} log p(y_t | F_{t−1}; ω)

where the conditional density of y_t is:

p(y_t | F_{t−1}; ω) = ∫ g(y_t | α_t; ω) f_1(α_t | F_{t−1}; ω) dα_t
The problem that arises is that the above integrals are intractable. Hence, we need to use simulation based filters (here SIR and CSIR) in order to approximate them. At each time t, we will have P random draws:

1) Particles α^i_{t|t−1} from f_1(α_t | F_{t−1}; ω)

2) Particles α^i_{t|t} from f_2(α_t | F_t; ω)

where i ∈ [1, . . . , P]. Using these particles, we evaluate the measurement density and simulate the transition density. We do so in two steps:
Prediction

We use the P particles α^i_{t−1|t−1} from the filtering density to obtain new prediction particles through the transition density:

α^i_{t|t−1} ∼ h(α_t | α^i_{t−1|t−1}; ω)

which for our SV model is:

α^i_{t|t−1} = φ_0 + φ_1 α^i_{t−1|t−1} + τ_ν u where u ∼ N(0, 1)

These P particles are the input for the next step, which propagates the system from t to t + 1.
Filtering & Smooth Filtering
Using the obtained prediction particles with the realization yt of Yt, we compute
the importance weights of each prediction particle in the following manner:
wti
t =
g(yt|αi
t|t−1; ω)
n
i=1 g(yt|αi
t|t−1; ω)
we sample the filtering particles, αj
t|t−1 for all j ∈ [1, . . . , P] with the weights
wti
t, which for our SV is:
g(yt|αj
t|t−1; ω) = fN (yt; 0, exp(αi
t|t−1))
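A minimal sketch of one prediction-and-weighting step in R, mirroring the likelihood code in subsection 7.4 (the parameter values, the previous filtered particles and the observation below are made up for illustration; note that dnorm is parameterized by the standard deviation exp(alpha_pr/2)):

# One step of the particle filter: propagate, evaluate, normalize.
P <- 100
phi0 <- 0.1; phi1 <- 0.975; tausq <- 0.02            # illustrative SV parameters
alpha_up <- rnorm(P, 0, 1)                            # previous filtered particles
y_t <- 0.5                                            # current observation (made up)
alpha_pr <- phi0 + phi1 * alpha_up + sqrt(tausq) * rnorm(P)   # transition density draw
w <- dnorm(y_t, mean = 0, sd = exp(alpha_pr / 2))             # measurement density
w <- w / sum(w)                                               # normalized importance weights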
In order to perform this filtering, we look at the SIR and CSIR algorithms. The common steps for both are as follows:

1) Calculate the normalized weights of the particles α^i_{t|t−1} as

nwt^i_t = wt^i_t / Σ_{i=1}^{P} wt^i_t

2) Sort the particles from lowest to highest in value and reorder the normalized weights according to the order of the sorted particles.

3) Calculate the cumulative normalized weights of the particles as

cnwt^i_t = Σ_{j=1}^{i} nwt^j_t

4) Using a vector of sorted random draws u^j ∼ U(0, 1) and the vector cnwt_t, we sample from the α^i_{t|t−1} to obtain the α^j_{t|t}. In both algorithms, a draw u^j is matched to particle i as long as the following condition is satisfied:

cnwt^i_t < u^j ≤ cnwt^{i+1}_t
SIR

With the SIR algorithm, we draw the filtered particles as follows:

α^j_{t|t} = α^i_{t|t−1} with probability wt^i_t
CSIR

With the CSIR algorithm, we draw the filtered particles as follows:

α^j_{t|t} = α^i_{t|t−1} + [(α^{i+1}_{t|t−1} − α^i_{t|t−1}) / (cnwt^{i+1}_t − cnwt^i_t)] × (u^j − cnwt^i_t) with probability wt^i_t
The SIR algorithm was proposed by Gordon et al. (1993) [3] and the CSIR algorithm by Malik and Pitt (2011) [4].
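The difference between the two resampling rules can be seen on a toy example using the sir and csir functions listed in subsections 7.2 and 7.3 (the particle values and weights below are made up for illustration):

# Three prediction particles with unnormalized importance weights.
set.seed(1)
alpha_pr <- c(-0.3, 0.1, 0.4)
alpha_wt <- c(0.2, 0.5, 0.3)
u <- sort(runif(3))          # sorted uniform draws, as in the package code
sir(alpha_pr, alpha_wt, u)   # duplicates existing particle values
csir(alpha_pr, alpha_wt, u)  # interpolates between adjacent particles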
Likelihood Computation

The aforementioned conditional density

p(y_t | F_{t−1}; ω) = ∫ g(y_t | α_t; ω) f_1(α_t | F_{t−1}; ω) dα_t

can be approximated as:

p̂(y_t | F_{t−1}; ω) = (1/P) Σ_{i=1}^{P} g(y_t | α^i_{t|t−1}; ω)

Once we have this estimator, the log likelihood can be computed as:

log L_{n,P}(ω) = Σ_{t=1}^{n} log p̂(y_t | F_{t−1}; ω)
As mentioned in the paper on smooth particle filters [5], the issue that arises is that this likelihood estimator is not smooth with respect to ω. This is because the resampling is done from a discrete distribution; the solution proposed by the author is to sample from the empirical cdf as if it were continuous, thereby generating smoothed samples α^i_{t|t}. Of course, as P tends towards a very large value both cdfs become indistinguishable, but with a smaller number of particles the CSIR returns a likelihood estimator that is smooth with respect to ω.
4 Implementation, Diagnostics,
Package Development and Testing
The first step of the implementation was to develop a function for simulating a series of the desired length from an SV model with given input parameters. This function is presented in subsection 7.1 and the return series generated is shown in Figure 2. It is important to note that we store the randomly drawn α in order to test the subsequent functions.
This was followed by construction of the filtering algorithms, which take the prediction particles as input and produce filtered particles to be used with the transition density. These functions are presented in subsections 7.2 and 7.3. Figures 3 and 4 show the results of applying both filters, with 100 particles, to an SV return series of length 1000.

After the filters were tested, the next function developed was the likelihood estimator as a function of ω. It was written so that either filter can easily be used with it. The function computes and returns the normalized loglikelihood so that it can be compared with the results of the garchFit function from the package 'fGarch'.
The loglikelihood function was put inside a wrapper function that estimates the parameters of the SV model for any return series given as input. This function is presented in subsection 7.5. Since the likelihood minimization relies on existing optimizers, both 'nlminb' and 'optim' were tested, and based on a few initial results we decided to use optim. The code for loglikelihood computation and parameter estimation is in subsections 7.4 and 7.5.
Once the svfit function was developed, an MC test was set up with the true parameter values φ_0 = 0.1, φ_1 = 0.975, τ²_ν = 0.02, which are usual values for daily returns. We generated 100 series of length 500 and fit the SV model with pre-drawn initial particles, noise and uniform draws. The code used for this is reported in subsection 7.7 and the results are shown in Figure 5.
We concluded that the functions are working well, and we then used the daily returns of SP500 and BAC over the past 4 years, together with a simulated series of length 1000 with true parameters (φ_0 = 0.05, φ_1 = 0.98, τ²_ν = 0.02), as inputs to svfit. We estimated the annualized volatility using both GARCH(1,1) and svfit (Figure 6).
The package is named "metricsComplete" and includes all of the functions mentioned in this thesis. It can be downloaded and installed by running the following code in R:

if (!require('devtools')) install.packages('devtools')
library(devtools)
devtools::install_github("HariharaSubramanyam/metricsComplete")
Typically, volatility analysis is carried out for short periods of time (significantly shorter than the ones used for testing here). Therefore, with real world data, especially if a large number of particles is used, this model will be able to estimate the parameters with reasonable accuracy.
Further Work
One way to develop the work done during this thesis further is to carry out MC tests for longer series with a larger number of particles. The tests need to be set up keeping real-world applicability in mind, making sure that the number of particles increases with the length of the series. Theoretically, as P → ∞, the estimated parameters ω converge to ω* (the true values).
During the initial tests, the number of particles used was sufficiently large, and using both the SIR and CSIR algorithms with simulated series resulted in convergence to reasonable parameter estimates. Since the focus of this thesis was the CSIR algorithm, the MC testing was carried out with it.

A non-trivial issue that still persists is the time taken to fit the model (each series of length 500 with 400 particles took about 17 minutes). Since the entire implementation has been carried out in R, it carries the burden of slow computation. A solution to speed up this process is to construct the filter in C and embed that code into R; to speed it up even further, one could code it in FORTRAN. Since the likelihood function has been written to be easy to edit, once a filter has been coded in another file it can easily be embedded into this package.
The final goal for the package is to incorporate the above SV estimation framework into a complete econometrics package that automatically installs all commonly used time series packages and uses them as dependencies. There are extensive guidelines for submitting a package to CRAN, and the package will be developed adhering to them. A detailed vignette will also be written, showcasing the package's best features and shortcuts.
5 Conclusions
The MC test was done using series generated with true parameters φ_0 = 0.1, φ_1 = 0.975, τ²_ν = 0.02. Performing MLE on the joint distribution of the parameters estimated in the MC test gives the following results:

φ̂_0 = 0.12718434, φ̂_1 = 0.96840118, τ̂²_ν = 0.02173305
Table 1: Variance-covariance matrix of the MC parameter estimates (φ_0, φ_1, τ²_ν)

         φ_0         φ_1         τ²_ν
φ_0      4.081e-06   2.835e-07   5.609e-08
φ_1      .           2.127e-08   5.278e-09
τ²_ν     .           .           1.941e-08
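Under a Gaussian assumption, the MLE of the joint distribution of the MC estimates reduces to their sample mean and sample covariance. A minimal sketch, assuming the Parameters data frame produced in subsection 7.7 (3 rows, one column per simulation):

# Sample mean and covariance of the MC parameter estimates
# (rows of Parameters: constant, phi, tausq).
mc_mean <- rowMeans(Parameters)
mc_cov  <- cov(t(Parameters))
mc_mean
mc_cov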
The parameter estimates obtained with the CSIR filter on the following series of length 1000, using 200 particles, are:

           Simulated    SP500        BAC
constant   0.04785693   0.01000000   0.1053761
phi        0.98232680   0.95806860   0.8749957
tausq      0.01482527   0.09232735   0.1389295
Similarly, the parameter estimates using the SIR filter are:

           Simulated    SP500       BAC
constant   0.04759432   0.0100000   0.1176416
phi        0.98078177   0.9564788   0.8622350
tausq      0.01965572   0.1102391   0.1544638
Therefore we can conclude with reasonable certainty that the svfit function, with the CSIR algorithm in place, has been implemented well. In order to compare efficiency and visualize the difference between SIR and CSIR filtering, multiple plots were attempted, but they were not able to highlight the smoothing in a distinguishable manner. The best way to compare the performance would be to carry out MC tests with the same simulations and noise and apply MLE to the parameter estimates from both filters (using a low number of particles, as this is when the improvement from smoothing is substantial). This, along with the points mentioned in the previous section, constitutes the suggestions for further work.
6 Figures
Figure 1: y_t, y_t² and |y_t| of SP500 returns over the past 4 years.
Figure 2: Red: simulated return series. Blue: 90% confidence intervals of α. True parameters (φ_0 = 0.05, φ_1 = 0.98, τ²_ν = 0.02).
Figure 3: Red: latent variable of the above return series. Blue: mean of the α^j_{t|t} and of (y_t | α^j_{t|t}) from the SIR filter. Light blue: 90% confidence intervals of the SIR filter results.
Figure 4: Red: latent variable of the above return series. Blue: mean of the α^j_{t|t} and of (y_t | α^j_{t|t}) from the CSIR filter. Light blue: 90% confidence intervals of the CSIR filter results.
Figure 5: Parameter estimates for 100 series of length 500 using svfit (400 particles). The true parameters were φ_0 = 0.1, φ_1 = 0.975, τ²_ν = 0.02.
Figure 6: Annualized volatility plots for GARCH(1,1) and svfit (200 particles with the CSIR filter) for a simulated series and for SP500 and BAC returns over the past 4 years (length 1000).
7 Code
General Syntax

1. P: number of particles
2. N: number of data points (length of the series)
3. omega: vector of parameters (constant, phi, tausq)
4. eta: ν ∼ N(0, 1)
5. u: u ∼ U(0, 1)
6. alpha_up: filtered particles α^j_{t−1|t−1}, used to determine α^i_{t|t−1}
7. alpha_pr: prediction particles α^i_{t|t−1}, used to determine alpha_wt
8. alpha_wt: the measurement density of y_t evaluated at each particle α^i_{t|t−1}, used as importance weights
7.1 SV Series Simulation
simulatesv <- function(omega, N) {
  constant <- omega[1]
  phi      <- omega[2]
  tausq    <- omega[3]
  alpha    <- rep(0, N)
  svseries <- rep(0, N)
  eta <- rnorm(N, 0, sqrt(tausq))   # log-variance shocks
  z   <- rnorm(N, 0, 1)             # return shocks
  alpha[1]    <- constant
  svseries[1] <- z[1] * exp(alpha[1] / 2)
  for (i in 2:N) {
    alpha[i]    <- constant + (phi * alpha[i - 1]) + eta[i]
    svseries[i] <- z[i] * exp(alpha[i] / 2)
  }
  return(list(alpha = alpha, svseries = svseries))
}
7.2 Sequential Importance Resampling
sir <- function(alpha_pr, alpha_wt, u) {
  N <- length(alpha_pr)
  alpha_up <- rep(0, N)
  alpha_wt <- alpha_wt / sum(alpha_wt)        # normalize the importance weights
  sorted <- sort(alpha_pr, index.return = TRUE)
  alpha_pr <- sorted$x                        # particles in ascending order
  alpha_wt <- alpha_wt[sorted$ix]             # weights reordered to match
  alpha_cwt <- c(0, cumsum(alpha_wt))         # cumulative normalized weights
  j <- 1
  for (i in 1:N) {
    # assign particle i to every uniform draw falling in its weight interval
    while (alpha_cwt[i] < u[j] & u[j] <= alpha_cwt[i + 1]) {
      alpha_up[j] <- alpha_pr[i]
      if (j < N) {
        j <- j + 1
      } else {
        break
      }
    }
  }
  return(alpha_up)
}
7.3 Continuous Sequential Importance Resampling
csir <- function(alpha_pr, alpha_wt, u) {
  N <- length(alpha_pr)
  alpha_up <- rep(0, N)
  alpha_wt <- alpha_wt / sum(alpha_wt)        # normalize the importance weights
  sorted <- sort(alpha_pr, index.return = TRUE)
  alpha_pr <- sorted$x
  alpha_wt <- alpha_wt[sorted$ix]
  alpha_cwt <- c(0, cumsum(alpha_wt))         # cumulative normalized weights
  alpha_pr <- c(alpha_pr[1], alpha_pr)        # duplicate the first particle for interpolation
  j <- 1
  for (i in 1:N) {
    while (alpha_cwt[i] < u[j] & u[j] <= alpha_cwt[i + 1]) {
      # the smoothing of the update: interpolate between adjacent particles
      alpha_up[j] <- alpha_pr[i] +
        ((alpha_pr[i + 1] - alpha_pr[i]) / (alpha_cwt[i + 1] - alpha_cwt[i])) *
        (u[j] - alpha_cwt[i])
      if (j < N) {
        j <- j + 1
      } else {
        break
      }
    }
  }
  return(alpha_up)
}
7.4 Likelihood Computation
svll <- function(omega, y, eta_sim, u_sim, alpha_up, alpha_wt) {
  N <- length(y)
  P <- length(alpha_up)
  constant <- omega[1]
  phi      <- omega[2]
  tausq    <- omega[3]
  loglik <- 0
  for (i in 1:N) {
    # propagate the filtered particles through the transition density
    alpha_pr <- constant + (phi * alpha_up) + (sqrt(tausq) * eta_sim[, i])
    # evaluate the measurement density of y[i] at each prediction particle
    lik <- dnorm(y[i] * rep(1, P), rep(0, P), exp(alpha_pr / 2))
    if (is.finite(log(mean(lik)))) {
      loglik <- loglik - log(mean(lik))       # accumulate the negative log likelihood
    } else {
      loglik <- Inf
    }
    alpha_wt <- lik
    alpha_up <- csir(alpha_pr, alpha_wt, u_sim[, i])
  }
  loglik <- loglik / N                        # normalized (per observation)
  return(loglik)
}

# Usage:
N <- length(y)
eta_sim <- matrix(rnorm(P * N, 0, 1), P, N)
u_sim <- matrix(runif(P * N, 0, 1), P, N)
alpha_up_init <- rnorm(P, 0, 1)
alpha_wt_init <- rep(1, P) / P
for (i in 1:N) {
  u_sim[, i] <- sort(u_sim[, i])
}
omega <- # set of parameters
svll(omega, y, eta_sim, u_sim, alpha_up_init, alpha_wt_init)
7.5 Likelihood Minimization/Parameter Estimation
svfit <- function(y, omega, P) {
  N <- length(y)
  eta_sim <- matrix(rnorm(P * N, 0, 1), P, N)
  u_sim <- matrix(runif(P * N, 0, 1), P, N)
  alpha_up_init <- rnorm(P, 0, 1)
  alpha_wt_init <- rep(1, P) / P
  for (i in 1:N) {
    u_sim[, i] <- sort(u_sim[, i])            # sorted uniform draws for the resampling step
  }
  lb <- rep(0, length(omega)) + 0.01                # lower bounds for the parameters
  ub <- rep(1, length(omega)) - (2 * (10^-10))      # upper bounds for the parameters
  sv_proxy <- function(par) {
    output <- svll(par, y, eta_sim, u_sim, alpha_up_init, alpha_wt_init)
    return(output)
  }
  trial <- optim(par = omega, fn = sv_proxy, upper = ub, lower = lb,
                 control = list(trace = 1), method = "L-BFGS-B", hessian = TRUE)
  return(trial)
}
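# Usage sketch, mirroring the test described in Section 4 (a simulated series of
# length 1000 with phi0 = 0.05, phi1 = 0.98, tausq = 0.02, fitted with 200
# particles; the starting values are an assumption, taken from subsection 7.6).
sim <- simulatesv(c(0.05, 0.98, 0.02), 1000)
fit <- svfit(sim$svseries, c(0.1, 0.9, 0.1), 200)
fit$par      # estimated (constant, phi, tausq)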
7.6 Plotting Volatility
svvoltrack <- function(svmodel, series, P) {
  N <- length(series)
  constant <- svmodel$par[1]
  phi      <- svmodel$par[2]
  tausq    <- svmodel$par[3]
  alpha_up <- rnorm(P, 0, 0.1)
  alpha_pr <- rep(0, P)
  alpha_wt <- (rep(1, P)) / P
  alpha_pr_track <- rep(0, N)
  for (i in 1:N) {
    alpha_pr <- constant + (phi * alpha_up) + rnorm(P, 0, sqrt(tausq))
    alpha_wt <- dnorm((series[i] * rep(1, P)), rep(0, P), exp(alpha_pr / 2))
    alpha_up <- csir(alpha_pr, alpha_wt, sort(runif(P, 0, 1)))
    # record the mean predicted log-variance at time i
    # (assumption: this assignment is omitted in the original listing)
    alpha_pr_track[i] <- mean(alpha_pr)
  }
  return(alpha_pr_track)
}

# Usage:
omega_init <- c(0.1, 0.9, 0.1)
svmodel <- svfit(series, omega_init, P)
volatility <- sqrt(252) * exp(svvoltrack(svmodel, series, P) / 2)
7.7 MC SV Parameter Estimate
omega_returns <- c(0.1, 0.975, 0.02)
simulations <- 100
len <- 500
Series <- as.data.frame(matrix(0, len, simulations))
for (i in 1:simulations) {
  Series[, i] <- simulatesv(omega_returns, len)$svseries   # keep only the return series
}
Parameters <- as.data.frame(matrix(0, 3, simulations))
P <- 400
alpha_up_init <- rnorm(P, 0, 1)
alpha_wt_init <- rep(1, P) / P
for (i in 1:simulations) {
  y <- Series[, i]
  N <- length(y)
  eta_sim <- matrix(rnorm(P * N, 0, 1), P, N)
  u_sim <- matrix(runif(P * N, 0, 1), P, N)
  for (j in 1:N) {
    u_sim[, j] <- sort(u_sim[, j])
  }
  omega <- c(var(y) * (1 - 0.95), 0.95, 0.1)   # starting values
  lb <- rep(0, length(omega)) + 0.01
  ub <- rep(1, length(omega)) - (2 * (10^-10))
  sv_proxy <- function(par) {
    output <- svll(par, y, eta_sim, u_sim, alpha_up_init, alpha_wt_init)
    return(output)
  }
  trial <- optim(par = omega, fn = sv_proxy, upper = ub, lower = lb,
                 control = list(trace = 1), method = "L-BFGS-B", hessian = TRUE)
  Parameters[, i] <- trial$par
}
# The Parameters data frame is used to study the convergence to the true values.
References
[1] Robert Almgren. Time Series Analysis and Statistical Arbitrage. New York
University, 2009.
[2] Christian Brownlees. Financial Econometrics. BGSE, 2016.
[3] N.J. Gordon, D.J. Salmond, and A.F.M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings F Radar and Signal Processing, 140(2):107, 1993.
[4] Sheheryar Malik and Michael K. Pitt. Particle filters for continuous likeli-
hood evaluation and maximisation. Journal of Econometrics, 165(2):190–
209, 2011.
[5] Michael K. Pitt. Smooth particle filters for likelihood evaluation and maximisation. (651), 2002.
[6] Rob Reider. Time Series Analysis and Statistical Arbitrage. New York
University, 2009.
24

Contenu connexe

Tendances

1 Aminullah Assagaf_Estimation-of-domain-of-attraction-for-the-fract_2021_Non...
1 Aminullah Assagaf_Estimation-of-domain-of-attraction-for-the-fract_2021_Non...1 Aminullah Assagaf_Estimation-of-domain-of-attraction-for-the-fract_2021_Non...
1 Aminullah Assagaf_Estimation-of-domain-of-attraction-for-the-fract_2021_Non...Aminullah Assagaf
 
Parameter Estimation for Semiparametric Models with CMARS and Its Applications
Parameter Estimation for Semiparametric Models with CMARS and Its ApplicationsParameter Estimation for Semiparametric Models with CMARS and Its Applications
Parameter Estimation for Semiparametric Models with CMARS and Its ApplicationsSSA KPI
 
Exploring temporal graph data with Python: 
a study on tensor decomposition o...
Exploring temporal graph data with Python: 
a study on tensor decomposition o...Exploring temporal graph data with Python: 
a study on tensor decomposition o...
Exploring temporal graph data with Python: 
a study on tensor decomposition o...André Panisson
 
Introduction to finite element analysis
Introduction to finite element analysisIntroduction to finite element analysis
Introduction to finite element analysisTarun Gehlot
 
Principal Component Analysis for Tensor Analysis and EEG classification
Principal Component Analysis for Tensor Analysis and EEG classificationPrincipal Component Analysis for Tensor Analysis and EEG classification
Principal Component Analysis for Tensor Analysis and EEG classificationTatsuya Yokota
 
Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)
Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)
Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)tanafuyu
 
IRJET- Analytic Evaluation of the Head Injury Criterion (HIC) within the Fram...
IRJET- Analytic Evaluation of the Head Injury Criterion (HIC) within the Fram...IRJET- Analytic Evaluation of the Head Injury Criterion (HIC) within the Fram...
IRJET- Analytic Evaluation of the Head Injury Criterion (HIC) within the Fram...IRJET Journal
 
Chapter2 - Linear Time-Invariant System
Chapter2 - Linear Time-Invariant SystemChapter2 - Linear Time-Invariant System
Chapter2 - Linear Time-Invariant SystemAttaporn Ninsuwan
 
Constant strain triangular
Constant strain triangular Constant strain triangular
Constant strain triangular rahul183
 
IFAC2008art
IFAC2008artIFAC2008art
IFAC2008artYuri Kim
 
Finite element analysis of space truss by abaqus
Finite element analysis of space truss by abaqus Finite element analysis of space truss by abaqus
Finite element analysis of space truss by abaqus P Venkateswalu
 
Finite Element Analysis Lecture Notes Anna University 2013 Regulation
Finite Element Analysis Lecture Notes Anna University 2013 Regulation Finite Element Analysis Lecture Notes Anna University 2013 Regulation
Finite Element Analysis Lecture Notes Anna University 2013 Regulation NAVEEN UTHANDI
 
CS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of AlgorithmsCS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of AlgorithmsKrishnan MuthuManickam
 

Tendances (20)

W33123127
W33123127W33123127
W33123127
 
1 Aminullah Assagaf_Estimation-of-domain-of-attraction-for-the-fract_2021_Non...
1 Aminullah Assagaf_Estimation-of-domain-of-attraction-for-the-fract_2021_Non...1 Aminullah Assagaf_Estimation-of-domain-of-attraction-for-the-fract_2021_Non...
1 Aminullah Assagaf_Estimation-of-domain-of-attraction-for-the-fract_2021_Non...
 
ICLR 2018
ICLR 2018ICLR 2018
ICLR 2018
 
Parameter Estimation for Semiparametric Models with CMARS and Its Applications
Parameter Estimation for Semiparametric Models with CMARS and Its ApplicationsParameter Estimation for Semiparametric Models with CMARS and Its Applications
Parameter Estimation for Semiparametric Models with CMARS and Its Applications
 
Exploring temporal graph data with Python: 
a study on tensor decomposition o...
Exploring temporal graph data with Python: 
a study on tensor decomposition o...Exploring temporal graph data with Python: 
a study on tensor decomposition o...
Exploring temporal graph data with Python: 
a study on tensor decomposition o...
 
Introduction to finite element analysis
Introduction to finite element analysisIntroduction to finite element analysis
Introduction to finite element analysis
 
Principal Component Analysis for Tensor Analysis and EEG classification
Principal Component Analysis for Tensor Analysis and EEG classificationPrincipal Component Analysis for Tensor Analysis and EEG classification
Principal Component Analysis for Tensor Analysis and EEG classification
 
Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)
Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)
Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)
 
IRJET- Analytic Evaluation of the Head Injury Criterion (HIC) within the Fram...
IRJET- Analytic Evaluation of the Head Injury Criterion (HIC) within the Fram...IRJET- Analytic Evaluation of the Head Injury Criterion (HIC) within the Fram...
IRJET- Analytic Evaluation of the Head Injury Criterion (HIC) within the Fram...
 
Chapter2 - Linear Time-Invariant System
Chapter2 - Linear Time-Invariant SystemChapter2 - Linear Time-Invariant System
Chapter2 - Linear Time-Invariant System
 
Constant strain triangular
Constant strain triangular Constant strain triangular
Constant strain triangular
 
IFAC2008art
IFAC2008artIFAC2008art
IFAC2008art
 
ppt0320defenseday
ppt0320defensedayppt0320defenseday
ppt0320defenseday
 
ED7201 FEMMD_notes
ED7201 FEMMD_notesED7201 FEMMD_notes
ED7201 FEMMD_notes
 
Finite element analysis of space truss by abaqus
Finite element analysis of space truss by abaqus Finite element analysis of space truss by abaqus
Finite element analysis of space truss by abaqus
 
Finite Element Analysis Lecture Notes Anna University 2013 Regulation
Finite Element Analysis Lecture Notes Anna University 2013 Regulation Finite Element Analysis Lecture Notes Anna University 2013 Regulation
Finite Element Analysis Lecture Notes Anna University 2013 Regulation
 
PhD defense talk slides
PhD  defense talk slidesPhD  defense talk slides
PhD defense talk slides
 
Fem 1
Fem 1Fem 1
Fem 1
 
CS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of AlgorithmsCS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of Algorithms
 
Fem lecture
Fem lectureFem lecture
Fem lecture
 

Similaire à Implementation of Continuous Sequential Importance Resampling Algorithm for Stochastic Volatility Estimation

Computational Intelligence for Time Series Prediction
Computational Intelligence for Time Series PredictionComputational Intelligence for Time Series Prediction
Computational Intelligence for Time Series PredictionGianluca Bontempi
 
Financial Time Series Analysis Based On Normalized Mutual Information Functions
Financial Time Series Analysis Based On Normalized Mutual Information FunctionsFinancial Time Series Analysis Based On Normalized Mutual Information Functions
Financial Time Series Analysis Based On Normalized Mutual Information FunctionsIJCI JOURNAL
 
Stochastic Vol Forecasting
Stochastic Vol ForecastingStochastic Vol Forecasting
Stochastic Vol ForecastingSwati Mital
 
Paris2012 session4
Paris2012 session4Paris2012 session4
Paris2012 session4Cdiscount
 
State Space Model
State Space ModelState Space Model
State Space ModelCdiscount
 
PREDICTIVE EVALUATION OF THE STOCK PORTFOLIO PERFORMANCE USING FUZZY CMEANS A...
PREDICTIVE EVALUATION OF THE STOCK PORTFOLIO PERFORMANCE USING FUZZY CMEANS A...PREDICTIVE EVALUATION OF THE STOCK PORTFOLIO PERFORMANCE USING FUZZY CMEANS A...
PREDICTIVE EVALUATION OF THE STOCK PORTFOLIO PERFORMANCE USING FUZZY CMEANS A...ijfls
 
Multivariate high-order-fuzzy-time-series-forecasting-for-car-road-accidents
Multivariate high-order-fuzzy-time-series-forecasting-for-car-road-accidentsMultivariate high-order-fuzzy-time-series-forecasting-for-car-road-accidents
Multivariate high-order-fuzzy-time-series-forecasting-for-car-road-accidentsCemal Ardil
 
On Selection of Periodic Kernels Parameters in Time Series Prediction
On Selection of Periodic Kernels Parameters in Time Series Prediction On Selection of Periodic Kernels Parameters in Time Series Prediction
On Selection of Periodic Kernels Parameters in Time Series Prediction cscpconf
 
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTIONON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTIONcscpconf
 
A Monte Carlo strategy for structure multiple-step-head time series prediction
A Monte Carlo strategy for structure multiple-step-head time series predictionA Monte Carlo strategy for structure multiple-step-head time series prediction
A Monte Carlo strategy for structure multiple-step-head time series predictionGianluca Bontempi
 
COVARIANCE ESTIMATION AND RELATED PROBLEMS IN PORTFOLIO OPTIMI
COVARIANCE ESTIMATION AND RELATED PROBLEMS IN PORTFOLIO OPTIMICOVARIANCE ESTIMATION AND RELATED PROBLEMS IN PORTFOLIO OPTIMI
COVARIANCE ESTIMATION AND RELATED PROBLEMS IN PORTFOLIO OPTIMICruzIbarra161
 
Scalable inference for a full multivariate stochastic volatility
Scalable inference for a full multivariate stochastic volatilityScalable inference for a full multivariate stochastic volatility
Scalable inference for a full multivariate stochastic volatilitySYRTO Project
 
On selection of periodic kernels parameters in time series prediction
On selection of periodic kernels parameters in time series predictionOn selection of periodic kernels parameters in time series prediction
On selection of periodic kernels parameters in time series predictioncsandit
 
A Study on Performance Analysis of Different Prediction Techniques in Predict...
A Study on Performance Analysis of Different Prediction Techniques in Predict...A Study on Performance Analysis of Different Prediction Techniques in Predict...
A Study on Performance Analysis of Different Prediction Techniques in Predict...IJRES Journal
 

Similaire à Implementation of Continuous Sequential Importance Resampling Algorithm for Stochastic Volatility Estimation (20)

Computational Intelligence for Time Series Prediction
Computational Intelligence for Time Series PredictionComputational Intelligence for Time Series Prediction
Computational Intelligence for Time Series Prediction
 
Financial Time Series Analysis Based On Normalized Mutual Information Functions
Financial Time Series Analysis Based On Normalized Mutual Information FunctionsFinancial Time Series Analysis Based On Normalized Mutual Information Functions
Financial Time Series Analysis Based On Normalized Mutual Information Functions
 
Stochastic Vol Forecasting
Stochastic Vol ForecastingStochastic Vol Forecasting
Stochastic Vol Forecasting
 
Stephens-L
Stephens-LStephens-L
Stephens-L
 
thesis
thesisthesis
thesis
 
Forecasting Gasonline Price in Vietnam Based on Fuzzy Time Series and Automat...
Forecasting Gasonline Price in Vietnam Based on Fuzzy Time Series and Automat...Forecasting Gasonline Price in Vietnam Based on Fuzzy Time Series and Automat...
Forecasting Gasonline Price in Vietnam Based on Fuzzy Time Series and Automat...
 
Paris2012 session4
Paris2012 session4Paris2012 session4
Paris2012 session4
 
State Space Model
State Space ModelState Space Model
State Space Model
 
PREDICTIVE EVALUATION OF THE STOCK PORTFOLIO PERFORMANCE USING FUZZY CMEANS A...
PREDICTIVE EVALUATION OF THE STOCK PORTFOLIO PERFORMANCE USING FUZZY CMEANS A...PREDICTIVE EVALUATION OF THE STOCK PORTFOLIO PERFORMANCE USING FUZZY CMEANS A...
PREDICTIVE EVALUATION OF THE STOCK PORTFOLIO PERFORMANCE USING FUZZY CMEANS A...
 
Multivariate high-order-fuzzy-time-series-forecasting-for-car-road-accidents
Multivariate high-order-fuzzy-time-series-forecasting-for-car-road-accidentsMultivariate high-order-fuzzy-time-series-forecasting-for-car-road-accidents
Multivariate high-order-fuzzy-time-series-forecasting-for-car-road-accidents
 
intro
introintro
intro
 
On Selection of Periodic Kernels Parameters in Time Series Prediction
On Selection of Periodic Kernels Parameters in Time Series Prediction On Selection of Periodic Kernels Parameters in Time Series Prediction
On Selection of Periodic Kernels Parameters in Time Series Prediction
 
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTIONON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
 
trading
tradingtrading
trading
 
recko_paper
recko_paperrecko_paper
recko_paper
 
A Monte Carlo strategy for structure multiple-step-head time series prediction
A Monte Carlo strategy for structure multiple-step-head time series predictionA Monte Carlo strategy for structure multiple-step-head time series prediction
A Monte Carlo strategy for structure multiple-step-head time series prediction
 
COVARIANCE ESTIMATION AND RELATED PROBLEMS IN PORTFOLIO OPTIMI
COVARIANCE ESTIMATION AND RELATED PROBLEMS IN PORTFOLIO OPTIMICOVARIANCE ESTIMATION AND RELATED PROBLEMS IN PORTFOLIO OPTIMI
COVARIANCE ESTIMATION AND RELATED PROBLEMS IN PORTFOLIO OPTIMI
 
Scalable inference for a full multivariate stochastic volatility
Scalable inference for a full multivariate stochastic volatilityScalable inference for a full multivariate stochastic volatility
Scalable inference for a full multivariate stochastic volatility
 
On selection of periodic kernels parameters in time series prediction
On selection of periodic kernels parameters in time series predictionOn selection of periodic kernels parameters in time series prediction
On selection of periodic kernels parameters in time series prediction
 
A Study on Performance Analysis of Different Prediction Techniques in Predict...
A Study on Performance Analysis of Different Prediction Techniques in Predict...A Study on Performance Analysis of Different Prediction Techniques in Predict...
A Study on Performance Analysis of Different Prediction Techniques in Predict...
 

Implementation of Continuous Sequential Importance Resampling Algorithm for Stochastic Volatility Estimation

  • 1. Implementation of Continuous Sequential Importance Resampling Algorithm for Stochastic Volatility Estimation Harihara Subramanyam Sreenivasan Master’s Degree in Data Science Student Barcelona Graduate School of Economics June 30, 2016 1
  • 2. Acknowledgements I would like to express my sincere gratitude to Prof. Christian Brown- lees for his guidance and support during this trimester pertaining to the material in this thesis. In addition, a majority of the fundamentals re- quired for this thesis were developed as part of his coursework and I am glad to have had the opportunity to work on this project. Besides Prof. Brownlees, I would also like to thank Dr. Hrvoje Stojic as the techniques taught by him in Advanced Computational Methods helped me improve my efficiency while working on this thesis. Lastly, I would like to thank all of the faculty at Barcelona Graduate School of Economics for the knowledge I have been able to acquire over this past year. Abstract The objective of this master’s thesis was to implement a CSIR Algo- rithm for estimating predictive densities in SV Models into an R package. In the beginning of the project, a brief revision of econometric concepts was done followed by the step-by-step development of the package. The diagnostics and test results of the functions developed are presented and the current status and future work for the package are discussed.
  • 3. Contents 1 Introduction 1 2 Econometrics 2 2.1 Time Series Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 2.2 Time Series Modelling . . . . . . . . . . . . . . . . . . . . . . . . 3 2.2.1 Modelling for E [yt|Ft−1] + t . . . . . . . . . . . . . . . 4 2.2.2 Modelling for Var [yt|Ft−1] = E( 2 t |Ft−1) . . . . . . . . . 5 3 Stochastic Volatility & Smooth Particle Filters 6 4 Implementation, Diagnostics, Package Development and Testing 10 5 Conclusions 12 6 Figures 13 7 Code 16 7.1 SV Series Simulation . . . . . . . . . . . . . . . . . . . . . . . . . 17 7.2 Sequential Importance Resampling . . . . . . . . . . . . . . . . . 18 7.3 Continuous Sequential Importance Resampling . . . . . . . . . . 19 7.4 Likelihood Computation . . . . . . . . . . . . . . . . . . . . . . . 20 7.5 Likelihood Minimization/Parameter Estimation . . . . . . . . . . 21 7.6 Plotting Volatility . . . . . . . . . . . . . . . . . . . . . . . . . . 22 7.7 MC SV Parameter Estimate . . . . . . . . . . . . . . . . . . . . . 23
  • 4. 1 Introduction The structure of this thesis is representative of the methodology followed for approaching this tasks mentioned in the abstract. We begin with introductory econometrics concepts, followed by an explanation of the theory related to stochastic volatility and particle filtering. In this section we will provide descriptions of the filtering algorithms used. After developing an understanding of the fundamentals, we began to develop and test the functions that would be incorporated into a final package. There were numerous tests performed on simulated series from a sv model. The results of these tests are reported. Thorough testing was done following which the package was used with the daily returns of SP500 and BAC stocks. Parameters estimated and the plots visual- ized are reported as well. There is a discussion regarding the feasibility in applications and functionalities of the package presented in section 4. Further development of the package and the future plans in order to make it available on CRAN are also mentioned. Additional MC tests beyond the one carried out for this thesis are mentioned in Conclusions. A major component of the theory presented in thesis is drawn from the lec- ture slides for the Financial Econometrics course at BGSE [2] and ’Time Series Analysis and Statistical Arbitrage’ course in the MS Program on NYU Courant Institute’s ’Mathematics in Finance’. The notes were made publicly available by Robert Almgren [1] and Robert Reider [6]. All other references, including these 3 are mentioned in the bibliography. 1
  • 5. 2 Econometrics 2.1 Time Series Data Time series data can’t be regarded as data that is generated by an iid process. This is due to the fact that the every data point observed does depend on the previous data points observed just before it. Therefore, in order to work with financial and economic time series, we regard the data points as realizations of a time-varying stochastic process. A univariate stochastic process can be defined as follows: Yt t ∈ {0, 1, ..T} Where Yt is a collection of random variables defined on a probability space (Ω, F, P) and indexed by t. Using this definition, our time series data can be interpreted as t realizations defined as: yt ∼ Yt {ω} where ω ∈ Ω and Ft : {yt−1, yt−2, ...y1} This underlying process could be continuous in time, nevertheless, as a result of our measurement methodology and frequency we will only be recording events separated by fixed equal-length time intervals. Therefore, we will be studying these as discrete-time processes. While working with time series it is a requirement that the series Yt is co-variance stationary, therefore it must satisfy the following properties: E Y 2 t is finite ∀ t E [Yt] = µ ∀ t Cov(Yt, Yh) = γh−t ∀ t, h ∈ T The autocovariance must depend only the number of lags k = h − t. 2
  • 6. Most of the work done during this project was using daily returns data. More specifically, SP500 and BAC daily returns. In order to convert these points into the realizations of a stationary process, we difference the data points with the values observed directly before them. Let us represent the time series of the prices of these stocks with Xt where t ∈ T. Hence: Yt = Xt − Xt−1 where t ∈ [1, ...T] Apart from testing for stationarity, we also assess for normality using Jarque- Bera’s test. Our goals when working with time series data can be summarized as follows: 1) Modelling: Our first goal is to propose a theoretical process and de- termine if the data points are plausible realizations of the optimized parameters of that process. 2) Forecasting: After using the information set Ft for modelling to study the underlying data generating process. We attempt to predict yt+1, yt+2.... 2.2 Time Series Modelling In general, the modelling approaches with time series data can be classified as follows: 1) Modelling for the conditional mean E [yt|Ft−1] + t 2) Modelling for the conditional variance Var [yt|Ft−1] = E( 2 t |Ft−1) where t is the innovation/shock. At any given t, the information available to us is the finite-dimensional distribution of the data F(yt, yt−1, ...; ω) and joint densities f(y1, y2, ..yn; ω) where n ∈ [1, 2, ..., t − 1]. ω is a vector of parameters that generated the data points and our objective is determine these with reason- able accuracy for our proposed theoretical process. Considering the likelihood function: Ln(ω) = f(yt, yt−1, .....y1; ω) We have to note here that since our data are not iid, each joint density will be expressed as f(y1, y2, ...yn; ω) =f(yn|yn−1, yn−2, ....y1; ω) × f(yn|yn−1, yn−2, ....y1; ω).. ... × f(yp+1|yp, yp−1, ...y1; ω) × f(y1, y2, ...yn; ω) 3
  • 7. Based on the above, we can define the log likehood as: log Ln(ω) = T t=p+1 log f(yt|yt−1, yt−2...y1; ω) + log f(y1, y2, ...yp; ω) We typically use only the first p lags of data. Finally, Wold’s decomposition theorem states that any co-variance stationary process Yt can be expressed as a sum of a deterministic time series and an infinite series of shocks: Yt = ηt + ∞ i=1 βt t−i Using the above concepts, we approach modelling. 2.2.1 Modelling for E [yt|Ft−1] + t There are two classes of models which are used in modelling for the conditional mean. MA(q) yt = θ0 + q i=0 θi t−i + t where t ∼ WN(0, σ ) With the moving average models, we attempt to describe yt as a weighted combination of the past shocks from the innovation. AR(p) yt = φ0 + p i=0 φiyt−i + t where t ∼ WN(0, σ ) With the autoregressive models, we attempt to study yt as weighted combination of the previous observed data points. Typically, we combine these two and use an Autoregressive Moving Average Model ARMA(p,q) yt = φ0 + p i=0 φiyt−i q i=0 θi t−i + t where t ∼ WN(0, σ ) We can usually determine the orders of p and q by looking at the autocorrelation and partial autcorrelation functions of yt. With real world data, ARMA(1,1) models perform quite well. 4
  • 8. While forecasting with ARMA models, it can be observed that after k steps ahead, the predictions will converge to the unconditional mean of the process. This is a result of the fact that the past information will have little influence after a certain time period. 2.2.2 Modelling for Var [yt|Ft−1] = E( 2 t |Ft−1) When we assess the auto correlation functions of yT , |yt|, and y2 t , we notice that the autocorrelations of the absolute and squared returns is much more significant than the original return series. This is volatility clustering and in order to utilize this serial dependence of returns to model for the fluctuations in the scale of returns, we use volatility modelling. The volatility clustering in SP500 series is shown using Figure 1. There are two approaches to modelling for the σ2 of the process: 1) σ2 as a deterministic equation 2) σ2 as a stochastic equation In the approaches that model for σ2 as a deterministic equation, there are a few that adjust for the leverage effect. The leverage effect implies that a negative return has a higher impact on the volatility than a positive return. One of the popular models that try to capture this asymmetry in volatility is the GJR- GARCH. GJR-GARCH yt = σ2 t zt where zt ∈ D(0, 1) σ2 t = φ0 + q i=1 αiy2 t−1 + γy2 t−1I− t−1 + β1σ2 t−1 where: I− t−1 =    1, if yt < 0 0, otherwise (1) 5
  • 9. However, we will be using the GARCH(1,1) model as a comparison benchmark for volatility estimation performance. Based on earlier experience and general agreement, it can be stated that GARCH(1,1) works well with real world data. GARCH(p,q) yt = σ2 t zt where zt ∈ D(0, 1) σ2 t = φ0 + q i=1 αiy2 t−i + p i=1 βiσ2 t−i p i=1 3 Stochastic Volatility & Smooth Particle Filters The simple Stochastic Volatility model is yt = σ2 t zt where zt ∈ N(0, 1) log(σ2 t ) = φ0 + φ1 log(σ2 t−1) + νt where ν ∼ N(0, τ2 ν ) For notational simplicity, henceforth log(σ2 t ) will be represented with αt. In this model, the assumption is that α is the latent variable of a Hidden Markov Model. Therefore, in order to work with the series yt, we use two densities: 1) Transition density: h(αt|αt−1 ) ∼ N(φ0 + φ1αt−1, τ2 ν ) 2) Measurement density: g(yt|αt ) ∼ N 0, exp αt 2 As we can see from the above, we need to find a way to summarize the informa- tion about our latent state αt without actually observing it, and the way that we approach is is by using another 2 densities: 1) Prediction density: f1(αt|Ft−1; ω) 2) Filtering density: f2(αt|Ft; ω) The prediction density is the density of αt given the information till t − 1, and the filtering density is with information till t. Therefore, at time t, we can the prediction density is: f1(αt|Ft−1) = h(αt|αt−1)f2(αt−1|Ft−1)dαt−1 6
  • 10. Using this prediction density in t, yt, and the measurement density, we have our filtering density at t which is evaluated as: f2(αt|Ft) ∝ g(yt|αt)f1(αt|Ft−1) Based on the above, we can define the log likehood of ω can be evaluated as: log Ln(ω) = n i=1 log p(yt|Ft−1; ω) and the conditional density of yt is: p(yt|Ft−1; ω) = g(yt|αt; ω)f1(αt|Ft−1; ω) The problem that arises is the above integrals are intractable. Hence, we need to use simulation based filters (here SIR and CSIR) in order to approximate them. At each time t, we will have P random draws: 1) Particles αi t|t−1 from f(αt|Ft−1; ω) 2) Particles αi t|t from f(αt|Ft; ω) where i ∈ [1, . . . , P]. Using these particles, we will be evaluating the mea- surement density and simulating the transition densities. We will do so in two steps: Prediction We use P particles αi t−1|t−1 from the filtering density in order to obtain a new prediction particle using the transition density in the following manner: αi t|t−1 ∼ h(αt|αi t−1|t−1; ω) which for our SV is: αi t|t−1 = φ0 + φ1αi t−1|t−1 + τνu where u ∼ N(0, 1) These P particles will prove to be the input for the next step which propogates the system from t to t + 1. 7
  • 11. Filtering & Smooth Filtering Using the obtained prediction particles with the realization yt of Yt, we compute the importance weights of each prediction particle in the following manner: wti t = g(yt|αi t|t−1; ω) n i=1 g(yt|αi t|t−1; ω) we sample the filtering particles, αj t|t−1 for all j ∈ [1, . . . , P] with the weights wti t, which for our SV is: g(yt|αj t|t−1; ω) = fN (yt; 0, exp(αi t|t−1)) In order to perform this filtering, we look at SIR and CSIR algorithms. The common steps for both are as follows: 1) Calculate the normalized weights of the particles αi t|t−1 as nwti t = wti t n i=0 wti t 2) Sort the particles from lowest to highest in value and the corresponding normalized weights based on the order of the sorted particles. 3) Calculate the cumulative normalized weights of the particles as cnwti t = i j=1 nwtj t 4) Using a vector of random draws u ∼ and the vector cnwtt, we sample from αi t|t−1 to obtain αj t|t. In both algorithms, we sample αj t|t as long as the following condition is satisfied: cnwti t < uj &uj < cnwti+1 t SIR With the SIR algorithm, we draw the prediction particles as follows: αj t|t = αi t|t−1 with probability wti t 8
  • 12. CSIR With the CSIR algorithm, we draw the prediction particles as follows: αj t|t = αi t|t−1 + αi+1 t|t−1 −αi t|t−1 cnwti+1 t −cnwti t × (uj − cnwti t) with probability wti t The SIR Algorithm was proposed by Gordon et. al (1993) [3] and the CSIR Algorithm by Pitt and Malik (2011) [4]. Likelihood Computation The aforementioned likelihood p(yt|Ft−1; ω) = g(yt|αt; ω)f1(αt|Ft−1; ω) can be approximated as: ˆp(yt|Ft−1; ω) = 1 P P i=1 g(yt|αi t|t−1; ω) Once we have a suitable estimator, the likelihood can be computed as: log Ln,P (ω) = n i=1 log ˆp(yt|Ft−1; ω) As mentioned in the paper on Smooth Particle filters [5], the issue that arises is that the likelihood estimator is not smooth with respect to ω. This is because the sampling is done from the discrete pdf and a solution is proposed by the author where we sample from the discrete pdf as if it were a continuous one thereby generating smoothed samples for αi t|t. Of course, as P tends towards very large value, both cdf’s will be indistinguishable but while using a smaller number of particles, the CSIR returns a likelihood estimator that is smooth with respect to ω. 9
4 Implementation, Diagnostics, Package Development and Testing

The first step of the implementation was to develop a function for simulating data points from an SV model of the desired length, based on input parameters. This function is presented in subsection 7.1 and the return series generated is shown in Figure 2. It is important to note that we store the randomly drawn $\alpha$ in order to test the subsequent functions.

This was followed by construction of the filtering algorithms, which take the prediction particles as input and produce filtered particles to be used with the transition density. These functions are presented in subsections 7.2 and 7.3. Figures 3 and 4 are plots where an SV return series of length 1000 was used with 100 particles and both filters.

After the filters were tested, the next function developed was the likelihood estimator as a function of $\omega$. This was developed in a manner that allows both filters to be used with it. We compute and return the normalized log-likelihood values from the function so that they can be compared with the results of the garchFit function from the package 'fGarch'. The log-likelihood function was then put inside a wrapper function that estimates the parameters of the SV model for any return series given as input; this function is presented in subsection 7.5. Since the minimization of the negative log-likelihood is done using previously available optimizers, both 'nlminb' and 'optim' were tested, and based on a few initial results we decided to use optim. The code for log-likelihood computation and parameter estimation is in subsections 7.4 and 7.5.

Once the svfit function was developed, an MC test was set up with the true parameter values $\phi_0 = 0.1$, $\phi_1 = 0.975$, $\tau_\nu^2 = 0.02$, which are usual values for daily returns. We generated 100 series of length 500 and fit the SV model with pre-drawn initial particles, noise and uniform intervals. The code used for this is reported in subsection 7.7 and the results are shown in Figure 5.

We concluded that the functions are working well, and we used the daily returns of SP500 and BAC over the past 4 years, together with a simulated series of length 1000 with true parameters ($\phi_0 = 0.05$, $\phi_1 = 0.98$, $\tau_\nu^2 = 0.02$), as inputs to svfit. We estimated the annualized volatility using GARCH(1,1) and svfit (Figure 6); a sketch of the benchmark fit is given below.
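For reference, the GARCH(1,1) benchmark fit can be obtained in a few lines with fGarch. This is only a sketch of how such a comparison might be set up, not the exact code used for Figure 6; the series y here is a placeholder standing in for a numeric vector of daily returns.

# Sketch of the GARCH(1,1) benchmark fit, assuming y is a vector of daily returns.
# install.packages("fGarch")   # if not already installed
library(fGarch)

y <- rnorm(1000, 0, 0.01)                                 # placeholder return series
garch_fit <- garchFit(~ garch(1, 1), data = y, trace = FALSE)
coef(garch_fit)                                           # estimated GARCH parameters
ann_vol <- sqrt(252) * volatility(garch_fit)              # annualized conditional volatility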
The package is named "metricsComplete" and includes all of the functions mentioned in this thesis. It can be downloaded and installed by running the following code in R:

if (!require('devtools')) install.packages('devtools')
library(devtools)
devtools::install_github("HariharaSubramanyam/metricsComplete")

Typically, volatility analysis is carried out for short periods of time (significantly shorter than the ones used for testing here). Therefore, with real-world data, especially if a large number of particles is used, this model will be able to estimate the parameters with reasonable accuracy.

Further Work

One way to develop further on the work done during this thesis is to carry out MC tests for longer series with a higher number of particles. The tests need to be set up keeping real-world applicability in mind, making sure that the number of particles increases with the length of the series. Theoretically, as $P \to \infty$, the estimated parameters $\omega \to \omega^*$ (the true value). During the initial tests, the number of particles used was sufficiently large, and using both the SIR and CSIR algorithms with simulated series resulted in convergence to reasonable estimates of the parameters. Since the focus of this thesis was the CSIR algorithm, the MC testing was carried out with it.

A non-trivial issue that still persists is the time taken for fitting the model (each series took 17 minutes for series length 500 and 400 particles). Since the entire implementation has been carried out in R, it carries the burden of slow computation. A solution to speed up this process is to construct the filter in C and embed that code into R (a sketch of one possible route is given at the end of this section); to speed it up even further, one could code it in FORTRAN. Since the likelihood function has been developed for easy editing, once a filter has been coded in another file, it can easily be embedded into this package.

The final goal for the package is to incorporate the above SV estimation framework as part of a complete Econometrics package that automatically installs all commonly used time series packages and utilizes them as dependencies. There are extensive guidelines for submitting a package to CRAN and the package will be developed adhering to them. A detailed vignette will be constructed as well, displaying the package's best features and shortcuts.
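As a closing note to this section, one possible route for the C embedding mentioned above is to move the per-particle weight evaluation into C++ via Rcpp. This is only a hypothetical sketch of the mechanism and is not part of the current package; the function name weight_cpp is made up for illustration.

# Hypothetical sketch of moving a hot loop to compiled code with Rcpp;
# not part of metricsComplete.
library(Rcpp)
cppFunction('
NumericVector weight_cpp(double y_t, NumericVector alpha_pr) {
  int P = alpha_pr.size();
  NumericVector w(P);
  for (int i = 0; i < P; ++i) {
    double sd = exp(alpha_pr[i] / 2.0);   // sd implied by variance exp(alpha)
    w[i] = R::dnorm(y_t, 0.0, sd, 0);     // measurement density g(y_t | alpha^i)
  }
  return w;
}')

# Usage sketch: stands in for dnorm(y[i], 0, exp(alpha_pr/2)) inside the filter loop.
weight_cpp(0.01, rnorm(100))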
5 Conclusions

The MC test was done using series generated with true parameters $\phi_0 = 0.1$, $\phi_1 = 0.975$, $\tau_\nu^2 = 0.02$. By performing MLE on the joint distribution of the parameters estimated in the MC test (a sketch of this summary computation is given after subsection 7.7), we get the following results:

$\hat{\phi}_0 = 0.12718434$, $\hat{\phi}_1 = 0.96840118$, $\hat{\tau}_\nu^2 = 0.02173305$

Table 1: MC Testing Results (variance-covariance matrix of $\phi_0$, $\phi_1$ and $\tau_\nu^2$)

            phi0        phi1        tausq
phi0   4.081e-06   2.835e-07   5.609e-08
phi1           .   2.127e-08   5.278e-09
tausq          .           .   1.941e-08

The parameter estimates for the CSIR filter, used with the following series of length 1000 and 200 particles, are as follows:

              Simulated        SP500          BAC
constant     0.04785693   0.01000000    0.1053761
phi          0.98232680   0.95806860    0.8749957
tausq        0.01482527   0.09232735    0.1389295

Similarly, the parameter estimates using the SIR filter are:

              Simulated        SP500          BAC
constant     0.04759432    0.0100000    0.1176416
phi          0.98078177    0.9564788    0.8622350
tausq        0.01965572    0.1102391    0.1544638

Therefore, we can conclude with reasonable certainty that the svfit function has been implemented well with the CSIR algorithm in place.

In order to compare efficiency and visualize the difference between the filtering of SIR and CSIR, multiple plots were attempted, but they were not able to highlight the smoothing in a distinguishable manner. The best way to compare the performance would be to carry out MC tests with the same simulations and noise and apply MLE to the parameter estimates from both filters (using a low number of particles, as this is when the improvement from smoothing is substantial). This, along with the points mentioned in the previous section, constitutes the suggestions for further work.
6 Figures

Figure 1: $y_t$, $y_t^2$ and $|y_t|$ of SP500 returns over the past 4 years.
Figure 2: Red: simulated return series. Blue: 90% confidence intervals of $\alpha$. True parameters ($\phi_0 = 0.05$, $\phi_1 = 0.98$, $\tau_\nu^2 = 0.02$).

Figure 3: Red: latent variable of the above return series. Blue: mean of $(\alpha^j_{t|t})$ and $(y_t \mid \alpha^j_{t|t})$ from the SIR filter. Light blue: 90% confidence intervals of the SIR filter results.
Figure 4: Red: latent variable of the above return series. Blue: mean of $(\alpha^j_{t|t})$ and $(y_t \mid \alpha^j_{t|t})$ from the CSIR filter. Light blue: 90% confidence intervals of the CSIR filter results.

Figure 5: The parameter estimates of 100 series of length 500 using svfit (400 particles). The true parameters were $\phi_0 = 0.1$, $\phi_1 = 0.975$, $\tau_\nu^2 = 0.02$.
Figure 6: The annualized volatility plots for GARCH(1,1) and svfit (200 particles with the CSIR filter) on a simulated series and on SP500 and BAC returns of the past 4 years (length 1000).

7 Code

General Syntax

1. P: number of particles
2. N: number of data points (length of the series)
3. omega: vector of parameters (constant, phi, tausq)
4. eta: $\eta \sim N(0,1)$
5. u: $u \sim U(0,1)$
6. alpha_up: filtered particles $\alpha^j_{t-1|t-1}$ used to determine $\alpha^i_{t|t-1}$
7. alpha_pr: prediction particles $\alpha^i_{t|t-1}$ used to determine alpha_wt
8. alpha_wt: the measurement densities at $t$, $f_N\!\left(y_t;\, 0,\, \exp(\alpha^i_{t|t-1})\right)$
7.1 SV Series Simulation

simulatesv <- function(omega, N) {
  constant <- omega[1]
  phi      <- omega[2]
  tausq    <- omega[3]
  alpha    <- rep(0, N)
  svseries <- rep(0, N)
  eta <- rnorm(N, 0, sqrt(tausq))   # transition noise nu_t
  z   <- rnorm(N, 0, 1)             # measurement noise z_t
  alpha[1]    <- constant
  svseries[1] <- z[1] * exp(alpha[1] / 2)
  for (i in 2:N) {
    alpha[i]    <- constant + (phi * alpha[i - 1]) + eta[i]
    svseries[i] <- z[i] * exp(alpha[i] / 2)
  }
  return(list(alpha = alpha, svseries = svseries))
}
7.2 Sequential Importance Resampling

sir <- function(alpha_pr, alpha_wt, u) {
  N <- length(alpha_pr)
  alpha_up <- rep(0, N)
  alpha_wt <- alpha_wt / sum(alpha_wt)              # normalize the weights
  sorted   <- sort(alpha_pr, index.return = TRUE)   # sort the particles ...
  alpha_pr <- sorted$x
  alpha_wt <- alpha_wt[sorted$ix]                   # ... and reorder their weights
  alpha_cwt <- c(0, cumsum(alpha_wt))               # cumulative normalized weights
  j <- 1
  for (i in 1:N) {
    while (alpha_cwt[i] < u[j] & u[j] <= alpha_cwt[i + 1]) {
      alpha_up[j] <- alpha_pr[i]                    # resample particle i
      if (j < N) {
        j <- j + 1
      } else {
        break
      }
    }
  }
  return(alpha_up)
}
7.3 Continuous Sequential Importance Resampling

csir <- function(alpha_pr, alpha_wt, u) {
  N <- length(alpha_pr)
  alpha_up <- rep(0, N)
  alpha_wt <- alpha_wt / sum(alpha_wt)              # normalize the weights
  sorted   <- sort(alpha_pr, index.return = TRUE)   # sort the particles ...
  alpha_pr <- sorted$x
  alpha_wt <- alpha_wt[sorted$ix]                   # ... and reorder their weights
  alpha_cwt <- c(0, cumsum(alpha_wt))               # cumulative normalized weights
  alpha_pr <- c(alpha_pr[1], alpha_pr)              # pad so indices line up with alpha_cwt
  j <- 1
  for (i in 1:N) {
    while (alpha_cwt[i] < u[j] & u[j] <= alpha_cwt[i + 1]) {
      # The smoothing of the update: linear interpolation between adjacent particles
      alpha_up[j] <- alpha_pr[i] +
        ((alpha_pr[i + 1] - alpha_pr[i]) / (alpha_cwt[i + 1] - alpha_cwt[i])) *
        (u[j] - alpha_cwt[i])
      if (j < N) {
        j <- j + 1
      } else {
        break
      }
    }
  }
  return(alpha_up)
}
7.4 Likelihood Computation

svll <- function(omega, y, eta_sim, u_sim, alpha_up, alpha_wt) {
  N <- length(y)
  P <- length(alpha_up)
  constant <- omega[1]
  phi      <- omega[2]
  tausq    <- omega[3]
  loglik <- 0
  for (i in 1:N) {
    # Prediction step: propagate the filtered particles through the transition density
    alpha_pr <- constant + (phi * alpha_up) + (sqrt(tausq) * eta_sim[, i])
    # Measurement density of y[i] evaluated at each prediction particle
    lik <- dnorm(y[i] * rep(1, P), rep(0, P), exp(alpha_pr / 2))
    if (is.finite(log(mean(lik)))) {
      loglik <- loglik - log(mean(lik))   # accumulate the negative log-likelihood
    } else {
      loglik <- Inf
    }
    alpha_wt <- lik
    alpha_up <- csir(alpha_pr, alpha_wt, u_sim[, i])   # filtering step
  }
  loglik <- loglik / N                    # normalized negative log-likelihood
  return(loglik)
}

# Usage:
N <- length(y)
eta_sim <- matrix(rnorm(P * N, 0, 1), P, N)
u_sim   <- matrix(runif(P * N, 0, 1), P, N)
alpha_up_init <- rnorm(P, 0, 1)
alpha_wt_init <- rep(1, P) / P
for (i in 1:N) {
  u_sim[, i] <- sort(u_sim[, i])
}
omega <- # set of parameters
svll(omega, y, eta_sim, u_sim, alpha_up_init, alpha_wt_init)
7.5 Likelihood Minimization/Parameter Estimation

svfit <- function(y, omega, P) {
  N <- length(y)
  eta_sim <- matrix(rnorm(P * N, 0, 1), P, N)
  u_sim   <- matrix(runif(P * N, 0, 1), P, N)
  alpha_up_init <- rnorm(P, 0, 1)
  alpha_wt_init <- rep(1, P) / P
  for (i in 1:N) {
    u_sim[, i] <- sort(u_sim[, i])
  }
  lb <- rep(0, length(omega)) + 0.01            # lower bounds for the parameters
  ub <- rep(1, length(omega)) - (2 * (10^-10))  # upper bounds for the parameters
  sv_proxy <- function(par) {
    output <- svll(par, y, eta_sim, u_sim, alpha_up_init, alpha_wt_init)
    return(output)
  }
  trial <- optim(par = omega, fn = sv_proxy, upper = ub, lower = lb,
                 control = list(trace = 1), method = "L-BFGS-B", hessian = TRUE)
  return(trial)
}
7.6 Plotting Volatility

svvoltrack <- function(svmodel, series, P) {
  N <- length(series)
  constant <- svmodel$par[1]
  phi      <- svmodel$par[2]
  tausq    <- svmodel$par[3]
  alpha_up <- rnorm(P, 0, 0.1)
  alpha_pr <- rep(0, P)
  alpha_wt <- rep(1, P) / P
  alpha_pr_track <- rep(0, N)
  for (i in 1:N) {
    alpha_pr <- constant + (phi * alpha_up) + rnorm(P, 0, sqrt(tausq))
    alpha_wt <- dnorm(series[i] * rep(1, P), rep(0, P), exp(alpha_pr / 2))
    alpha_up <- csir(alpha_pr, alpha_wt, sort(runif(P, 0, 1)))
    # Track the mean of the prediction particles at each t; this assignment
    # appears to have been dropped from the extracted listing.
    alpha_pr_track[i] <- mean(alpha_pr)
  }
  return(alpha_pr_track)
}

# Usage:
omega_init <- c(0.1, 0.9, 0.1)
svmodel    <- svfit(series, omega_init, P)
volatility <- sqrt(252) * exp(svvoltrack(svmodel, series, P) / 2)
7.7 MC SV Parameter Estimate

omega_returns <- c(0.1, 0.975, 0.02)
simulations <- 100
len <- 500
Series <- as.data.frame(matrix(0, len, simulations))
for (i in 1:simulations) {
  Series[, i] <- simulatesv(omega_returns, len)$svseries   # keep only the return series
}
Parameters <- as.data.frame(matrix(0, 3, simulations))
P <- 400
alpha_up_init <- rnorm(P, 0, 1)
alpha_wt_init <- rep(1, P) / P
for (i in 1:simulations) {
  y <- Series[, i]
  N <- length(y)
  eta_sim <- matrix(rnorm(P * N, 0, 1), P, N)
  u_sim   <- matrix(runif(P * N, 0, 1), P, N)
  for (j in 1:N) {                              # inner index renamed so it does not clobber i
    u_sim[, j] <- sort(u_sim[, j])
  }
  omega <- c(var(y) * (1 - 0.95), 0.95, 0.1)    # starting values for the optimizer
  lb <- rep(0, length(omega)) + 0.01
  ub <- rep(1, length(omega)) - (2 * (10^-10))
  sv_proxy <- function(par) {
    output <- svll(par, y, eta_sim, u_sim, alpha_up_init, alpha_wt_init)
    return(output)
  }
  trial <- optim(par = omega, fn = sv_proxy, upper = ub, lower = lb,
                 control = list(trace = 1), method = "L-BFGS-B", hessian = TRUE)
  Parameters[, i] <- trial$par
}
# The Parameters data frame is used to study the convergence to the true values.
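The summary reported in the Conclusions (the means and the variance-covariance matrix of the MC estimates) can be obtained from the Parameters data frame along the following lines. This is only a sketch of that post-processing step, not code from the package.

# Sketch of summarizing the MC estimates; Parameters is 3 x simulations
# (rows: constant, phi, tausq), as produced by the loop above.
estimates <- t(as.matrix(Parameters))    # one row per simulated series
mc_means  <- colMeans(estimates)         # estimated mean of each parameter
mc_vcov   <- cov(estimates)              # sample variance-covariance matrix
mc_means
mc_vcov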
References

[1] Robert Almgren. Time Series Analysis and Statistical Arbitrage. New York University, 2009.

[2] Christian Brownlees. Financial Econometrics. BGSE, 2016.

[3] N.J. Gordon, D.J. Salmond, and A.F.M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings F, Radar and Signal Processing, 140(2):107-113, 1993.

[4] Sheheryar Malik and Michael K. Pitt. Particle filters for continuous likelihood evaluation and maximisation. Journal of Econometrics, 165(2):190-209, 2011.

[5] Michael K. Pitt. Smooth particle filters for likelihood evaluation and maximisation. Warwick Economic Research Papers, No. 651, 2002.

[6] Rob Reider. Time Series Analysis and Statistical Arbitrage. New York University, 2009.