1. The document presents a numerical smoothing technique to improve the efficiency of option pricing and density estimation when analytic smoothing is not possible.
2. The technique involves numerically determining discontinuities in the integrand and computing the integral only over the smooth regions. It also uses hierarchical representations and Brownian bridges to reduce the effective dimension of the problem.
3. The numerical smoothing approach outperforms Monte Carlo methods for high dimensional cases and improves the complexity of multilevel Monte Carlo from O(TOL^-2.5) to O(TOL^-2 log(TOL)^2).
1. Numerical Smoothing and Hierarchical Approximations
for Efficient Option Pricing and Density Estimation
Chiheb Ben Hammouda
Christian Bayer, Raúl Tempone
RWTH Aachen University
February 25, 2020
2. Outline
1 Introduction and Motivation
2 Details of Numerical Smoothing and Hierarchical Approximations
3 Error and Work Discussion
4 Numerical Experiments and Results
5 Conclusions and Future Work
4. Options and Pricing
Option: Financial security that gives the holder the right, but
not the obligation, to buy (Call) or sell (Put) a specified
quantity of a specified underlying instrument (asset) at a
specified price (K: strike) on (European) or before (Bermudan,
American) a specified date (T: maturity).
Why use options?
▸ Effective hedge instrument against a declining stock market to limit
downside losses.
▸ Speculative purposes such as wagering on the direction of a stock.
To value (price) the option is to compute the fair price of this
contract.
6. Martingale Representation and Notations
Theorem (Fair Value of Financial Derivatives)
The fair value of a financial derivative which can be exercised at time
$T$ is given by (Harrison and Pliska 1981)
\[ V(X,0) = e^{-rT}\, E_Q[g(X,T)], \]
where $E_Q$ is the expectation under the local martingale measure $Q$.
Asset: $\{X_t : t \ge 0\}$ is a stochastic process such that $X_t \in \mathbb{R}^d$
represents the price of the underlying at time $t$, defined on a continuous-time
probability space $(\Omega, \mathcal{F}, Q)$.
$r$ is the risk-free interest rate.
Payoff function $g: \mathbb{R}^d \to \mathbb{R}$. E.g., European call option,
$g(X_T) = \max\{X_T - K, 0\}$, where $K$ is the strike price.
$d$ is the number of assets.
7. Some of the Challenges in Option Pricing
Issue 1: X takes values in a high-dimensional space ⇒ Curse of
dimensionality (footnote 1) when using deterministic quadrature methods.
▸ Case 1: Time-discretization of a stochastic differential equation
(large N (number of time steps)).
▸ Case 2: A large number of underlying assets (large d).
▸ Case 3: Path dependence of the option price on the whole
trajectory of the underlying.
Issue 2: The payoff function g typically has low regularity ⇒
▸ Deterministic quadrature methods suffer from slow convergence.
▸ Multilevel Monte Carlo (MLMC) suffers from
☀ High variance of coupled levels and low strong rate of convergence
⇒ Badly affecting the complexity of the MLMC estimator.
☀ High kurtosis at the deep levels ⇒ Expensive cost to get reliable
and robust estimates of the sample statistics in the MLMC setting
(sample mean, sample variance) (footnote 2).
Footnote 1. Curse of dimensionality: An exponential growth of the work (number of
function evaluations) in terms of the dimension of the integration problem.
Footnote 2. The standard deviation of the sample variance for the random variable
$Y_\ell = g_\ell - g_{\ell-1}$ is
\[ \sigma_{S^2(Y_\ell)} = \frac{\operatorname{Var}[Y_\ell]}{\sqrt{M}} \sqrt{(\kappa - 1) + \frac{2}{M-1}}, \]
with $\kappa$: the kurtosis, and $M$: number of samples.
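To get a feeling for why a high kurtosis is costly, the formula for the standard deviation of the sample variance can be evaluated directly. The following is a minimal sketch; the function name `std_of_sample_variance` and the sample size are illustrative choices, not part of the talk. It compares a Gaussian-like kurtosis of 3 with the kurtosis of 709 reported later in the talk for the digital option under GBM without smoothing.

```python
import math

def std_of_sample_variance(var_y, kurtosis, m):
    """Standard deviation of the sample variance of Y, estimated from m i.i.d. samples.

    Implements Var[Y] / sqrt(m) * sqrt((kurtosis - 1) + 2 / (m - 1)).
    """
    return var_y / math.sqrt(m) * math.sqrt((kurtosis - 1.0) + 2.0 / (m - 1.0))

# Illustrative comparison with unit variance and 10^4 samples (hypothetical numbers).
m = 10_000
gaussian_like = std_of_sample_variance(1.0, 3.0, m)
heavy_tailed = std_of_sample_variance(1.0, 709.0, m)
print(gaussian_like, heavy_tailed, heavy_tailed / gaussian_like)
```

For a fixed number of samples, the uncertainty of the variance estimate grows roughly like the square root of the kurtosis, which is why the deep MLMC levels need many more samples for reliable variance estimates when the kurtosis is large.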
8. Our Methodology for Addressing
Option Pricing Challenges (footnote 3)
We design efficient hierarchical pricing and density estimation methods
1 Numerical smoothing: to uncover the available regularity
▸ Root finding for determining the discontinuity location.
▸ Pre-Integration (Conditional expectation): one-dimensional
integration with respect to a single well-chosen variable.
2 Approximating the resulting integral of the smoothed integrand
obtained from the previous step
▸ Adaptive sparse grids quadrature (ASGQ) combined with
numerical smoothing, and hierarchical representations
☀ Brownian bridges as a Wiener path generation method ⇒ reduces the
effective dimension of the problem.
☀ Richardson extrapolation ⇒ Faster convergence of the weak
error ⇒ fewer time steps to achieve a certain error tolerance
⇒ smaller total dimension for the input space.
▸ MLMC estimator combined with numerical smoothing.
Footnote 3. Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. “Numerical
smoothing and hierarchical approximations for efficient option pricing and density
estimation”. In: arXiv preprint arXiv:2003.05708 (2020)
9. Framework in
(Bayer, Ben Hammouda, and Tempone 2020)
Approximate efficiently $E[g(X(t))]$ using the previous methodology.
The payoff $g: \mathbb{R}^d \to \mathbb{R}$ has either jumps or kinks. Given $\phi: \mathbb{R}^d \to \mathbb{R}$
▸ Hockey-stick functions: $g(x) = \max(\phi(x), 0)$ (put or call payoffs).
▸ Indicator functions: $g(x) = \mathbf{1}_{(\phi(x) \ge 0)}$ (digital option, distribution functions, ...).
▸ Dirac delta functions: $g(x) = \delta_{(\phi(x) = 0)}$ (density estimation, financial Greeks).
\[ \frac{\partial \phi}{\partial x_j}(x) > 0, \quad \forall x \in \mathbb{R}^d \quad \text{(Monotonicity condition)} \]
\[ \lim_{x_j \to +\infty} \phi(x) = \lim_{x_j \to +\infty} \phi(x_j, x_{-j}) = +\infty, \quad \forall x_{-j} \in \mathbb{R}^{d-1}, \quad \text{or} \quad \frac{\partial^2 \phi}{\partial x_j^2}(x) \ge 0, \quad \forall x \in \mathbb{R}^d \quad \text{(Growth condition).} \]
Notation: $x_{-j}$ denotes the vector of length $d - 1$ containing all the variables other than $x_j$ in $x$.
The process $X$ is approximated (via a discretization scheme) by $\overline{X}$
▸ One/multi-dimensional geometric Brownian motion (GBM) process.
▸ Multi-dimensional stochastic volatility model: the Heston model
\[ dX_t = \mu X_t\, dt + \sqrt{v_t}\, X_t\, dW_t^X, \qquad dv_t = \kappa(\theta - v_t)\, dt + \xi \sqrt{v_t}\, dW_t^v, \]
$(W_t^X, W_t^v)$: correlated Wiener processes with correlation $\rho$.
10. Contributions in the Context of
Deterministic Quadrature Methods (Bayer, Ben Hammouda, and Tempone 2020)
1 We consider cases where we cannot perform an analytic
smoothing (Bayer, Siebenmorgen, and Tempone 2018; Xiao and
Wang 2018; Bayer, Hammouda, and Tempone 2018) and introduce
a novel numerical smoothing technique that is coupled with
ASGQ (combined with the hierarchical Brownian bridge and
Richardson extrapolation to overcome the high dimensionality).
2 Our novel approach substantially outperforms the MC method,
even for high-dimensional cases and for dynamics where
discretization is needed, such as the Heston model.
3 In (Bayer, Ben Hammouda, and Tempone 2020) we provide a
smoothness analysis for the smoothed integrand in the time
stepping setting.
11. Contributions in the Context of MLMC Methods
(Bayer, Ben Hammouda, and Tempone 2020)
1 Compared to works (Giles 2008b; Giles, Debrabant, and Rößler
2013), our approach can be easily applied to cases where one
cannot apply analytic smoothing.
2 Compared to the case without smoothing, we significantly reduce
the kurtosis at the deep levels of MLMC, and also improve the
strong convergence rate, and consequently the complexity of the
MLMC estimator from $O(\mathrm{TOL}^{-2.5})$ to $O(\mathrm{TOL}^{-2} \log(\mathrm{TOL})^2)$,
without the need to use higher order schemes such as the Milstein
scheme as in (Giles 2008b; Giles, Debrabant, and Rößler 2013).
3 Contrary to the smoothing strategy used in (Giles, Nagapetyan,
and Ritter 2015), our numerical smoothing approach
▸ Does not deteriorate the strong convergence behavior.
▸ Is easier to apply for any dynamics and QoI.
▸ When estimating densities: our pointwise error does not increase
exponentially with respect to the dimension of the state vector as in
(Giles, Nagapetyan, and Ritter 2015).
13. Wiener Path Generation Methods
$\{t_i\}_{i=0}^N$: Grid of time steps, $\{B_{t_i}\}_{i=0}^N$: Brownian motion increments
Random Walk
▸ Proceeds incrementally, given $B_{t_i}$,
\[ B_{t_{i+1}} = B_{t_i} + \sqrt{\Delta t}\, z_i, \quad z_i \sim N(0,1). \]
▸ All components of $z = (z_1, \dots, z_N)$ have the same scale of importance (in
terms of variance): isotropic.
Hierarchical Brownian Bridge
▸ Given a past value $B_{t_i}$ and a future value $B_{t_k}$, the value $B_{t_j}$ (with
$t_i < t_j < t_k$) can be generated according to ($\rho = \frac{j-i}{k-i}$)
\[ B_{t_j} = (1 - \rho) B_{t_i} + \rho B_{t_k} + z_j \sqrt{\rho (1 - \rho)(k - i) \Delta t}, \quad z_j \sim N(0,1). \]
▸ The most important values (capturing a large part of the total variance) are
the first components of $z = (z_1, \dots, z_N)$.
▸ Reduces the effective dimension (# important dimensions) by anisotropy
between different directions ⇒ Faster convergence of deterministic
quadrature methods.
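The hierarchical Brownian bridge construction can be sketched as follows. This is a minimal illustration, assuming a uniform grid whose number of steps N is a power of two and a bisection ordering of the midpoints; the function name `brownian_bridge_path` and this ordering are choices made here for illustration, not taken from the talk. The first normal variable fixes the terminal value, the next one the midpoint, and so on, using the bridge formula with ρ = 1/2 at every bisection.

```python
import numpy as np

def brownian_bridge_path(z, T=1.0):
    """Build B at t_i = i*T/N (i = 0..N) from N standard normals z, N a power of two.

    z[0] drives the terminal value B_T; later entries fill in midpoints level by level.
    """
    z = np.asarray(z, dtype=float)
    N = z.size
    assert N >= 1 and N & (N - 1) == 0, "N must be a power of two in this sketch"
    dt = T / N
    B = np.zeros(N + 1)
    B[N] = np.sqrt(T) * z[0]          # coarsest information: the terminal value
    used = 1                          # number of normals consumed so far
    step = N
    while step > 1:
        half = step // 2
        for i in range(0, N, step):
            k, j = i + step, i + half
            rho = (j - i) / (k - i)   # equals 1/2 for bisection
            std = np.sqrt(rho * (1.0 - rho) * (k - i) * dt)
            B[j] = (1.0 - rho) * B[i] + rho * B[k] + z[used] * std
            used += 1
        step = half
    return B

# Usage: one path on 8 time steps driven by 8 standard normals.
rng = np.random.default_rng(0)
path = brownian_bridge_path(rng.standard_normal(8))
```

With this ordering, most of the path variance is carried by the first few entries of `z`, which is the anisotropy that deterministic quadrature can exploit.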
14. Optimal Smoothing Direction in Continuous Time (I)
$X = (X^{(1)}, \dots, X^{(d)})$ is described by the following SDE
\[ dX_t^{(i)} = a_i(X_t)\, dt + \sum_{j=1}^{d} b_{ij}(X_t)\, dW_t^{(j)}, \tag{1} \]
$\{W^{(j)}\}_{j=1}^d$ are standard Brownian motions.
Hierarchical representation of $W$
\[ W^{(j)}(t) = \frac{t}{T} W^{(j)}(T) + B^{(j)}(t) = \frac{t}{\sqrt{T}} Z_j + B^{(j)}(t), \]
$Z_j \sim N(0,1)$ (iid coarse factors); $\{B^{(j)}\}_{j=1}^d$: the Brownian bridges.
Hierarchical representation of $Z = (Z_1, \dots, Z_d)$; $v$: the smoothing
direction
\[ Z = \underbrace{P_0 Z}_{\text{one-dimensional projection}} + \underbrace{P_\perp Z}_{\text{projection on the complement}} = \underbrace{(Z, v)}_{= Z_v}\, v + w \tag{2} \]
15. Optimal Smoothing Direction in Continuous Time (II)
Using (1) and (2), observe ($H_v(Z_v, w) = g(X(T))$)
\[ E[g(X(T))] = E\big[E[H_v(Z_v, w) \mid w]\big] \]
\[ \operatorname{Var}[g(X(T))] = E\big[\operatorname{Var}[H_v(Z_v, w) \mid w]\big] + \operatorname{Var}\big[E[H_v(Z_v, w) \mid w]\big]. \]
The optimal smoothing direction, $v$, solves
\[ \max_{v \in \mathbb{R}^d,\ \|v\| = 1} E\big[\operatorname{Var}[H_v(Z_v, w) \mid w]\big] \iff \min_{v \in \mathbb{R}^d,\ \|v\| = 1} \operatorname{Var}\big[E[H_v(Z_v, w) \mid w]\big]. \tag{3} \]
Solving the optimization problem (3) is a hard task.
The optimal smoothing direction $v$ is problem dependent.
In (Bayer, Ben Hammouda, and Tempone 2020), we determine $v$
heuristically, given the structure of the problem at hand.
16. Discrete Time Formulation: GBM Example
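Consider the basket option under the multi-dimensional GBM model
▸ The payoff function: $g(X(T)) = \max\big(\sum_{j=1}^{d} c_j X^{(j)}(T) - K, 0\big)$
▸ The dynamics of the stock prices: $dX_t^{(j)} = \sigma^{(j)} X_t^{(j)}\, dW_t^{(j)}$.
The numerical approximation of $\{X^{(j)}(T)\}_{j=1}^d$, with time step $\Delta t$
\[ \overline{X}^{(j)}(T) = X_0^{(j)} \prod_{n=0}^{N-1} \Big[ 1 + \frac{\sigma^{(j)}}{\sqrt{T}} Z_1^{(j)} \Delta t + \sigma^{(j)} \Delta B_n^{(j)} \Big], \quad 1 \le j \le d \]
▸ $(Z_1^{(1)}, \dots, Z_N^{(1)}, \dots, Z_1^{(d)}, \dots, Z_N^{(d)})$: $N \times d$ independent Gaussian random variables.
▸ $\{\Delta B^{(j)}\}_{j=1}^d$ are the Brownian bridge increments.
\[ E[g(X(T))] \approx E\big[g\big(\overline{X}_T^{(1)}, \dots, \overline{X}_T^{(d)}\big)\big] = E\big[g\big(\overline{X}^{\Delta t}(T)\big)\big] = \int_{\mathbb{R}^{d \times N}} G(z)\, \rho_{d \times N}(z)\, dz_1^{(1)} \dots dz_N^{(1)} \dots dz_1^{(d)} \dots dz_N^{(d)}, \]
▸ $z = (z_1^{(1)}, \dots, z_N^{(1)}, \dots, z_1^{(d)}, \dots, z_N^{(d)})$.
▸ $\rho_{d \times N}(z) = \frac{1}{(2\pi)^{dN/2}}\, e^{-\frac{1}{2} z^T z}$.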
17. Numerical Smoothing: Motivation
Idea: Assume that the integration domain Ω can be divided into
two parts Ω1 and Ω2.
▸ In Ω1 the integrand G is smooth and positive.
▸ G(x) = 0 in Ω2.
▸ Along the boundary between Ω1 and Ω2, the integrand is
non-differentiable or discontinuous.
Procedure
1 Determine Ω2 numerically by a root finding algorithm.
2 Compute
\[ \int_{\Omega} G = \int_{\Omega_1} G + \int_{\Omega_2} G = \int_{\Omega_1} G \]
18. Numerical Smoothing Step
We consider $Z_1 = (Z_1^{(1)}, \dots, Z_1^{(d)})$ the most important directions.
Design of a sub-optimal smoothing direction ($A$: a rotation matrix, which
is problem dependent (footnote 6))
\[ Y = A Z_1. \]
The smoothing direction $v$ (in the continuous time formulation) is
given by the first row of $A$.
One-dimensional root finding problem to solve for $y_1$
\[ K = \sum_{j=1}^{d} c_j X_0^{(j)} \prod_{n=0}^{N-1} F_n^{(j)}\big(y_1(K), y_{-1}\big), \tag{4} \]
\[ F_n^{(j)}(y_1, y_{-1}) = 1 + \frac{\sigma^{(j)} \Delta t}{\sqrt{T}} \Big( (A^{-1})_{j1}\, y_1 + \sum_{i=2}^{d} (A^{-1})_{ji}\, y_i \Big) + \sigma^{(j)} \Delta B_n^{(j)}. \]
Footnote 6. In this example, a sufficiently good choice of $A$ is a rotation matrix with first
row leading to $Y_1 = \sum_{i=1}^{d} Z_1^{(i)}$ up to re-scaling.
19. Pre-Integration (Conditional Expectation) Step
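\[
\begin{aligned}
E[g(X(T))] &\approx \int_{\mathbb{R}^{d \times N}} G(z)\, \rho_{d \times N}(z)\, dz_1^{(1)} \dots dz_N^{(1)} \dots dz_1^{(d)} \dots dz_N^{(d)} \\
&= \int_{\mathbb{R}^{dN-1}} I\big(y_{-1}, z_{-1}^{(1)}, \dots, z_{-1}^{(d)}\big)\, \rho_{d-1}(y_{-1})\, dy_{-1}\, \rho_{dN-d}\big(z_{-1}^{(1)}, \dots, z_{-1}^{(d)}\big)\, dz_{-1}^{(1)} \dots dz_{-1}^{(d)} \\
&= E\big[I\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big] \approx E\big[\bar{I}\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big],
\end{aligned} \tag{5}
\]
\[
\begin{aligned}
I\big(y_{-1}, z_{-1}^{(1)}, \dots, z_{-1}^{(d)}\big) &= \int_{\mathbb{R}} G\big(y_1, y_{-1}, z_{-1}^{(1)}, \dots, z_{-1}^{(d)}\big)\, \rho_{y_1}(y_1)\, dy_1 \\
&= \int_{-\infty}^{y_1^*} G\big(y_1, y_{-1}, z_{-1}^{(1)}, \dots, z_{-1}^{(d)}\big)\, \rho_{y_1}(y_1)\, dy_1 + \int_{y_1^*}^{+\infty} G\big(y_1, y_{-1}, z_{-1}^{(1)}, \dots, z_{-1}^{(d)}\big)\, \rho_{y_1}(y_1)\, dy_1 \\
&\approx \bar{I}\big(y_{-1}, z_{-1}^{(1)}, \dots, z_{-1}^{(d)}\big) = \sum_{k=0}^{N_q} \eta_k\, G\big(\zeta_k(\bar{y}_1^*), y_{-1}, z_{-1}^{(1)}, \dots, z_{-1}^{(d)}\big),
\end{aligned}
\]
$y_1^*$: the exact discontinuity location; $\bar{y}_1^*$: the approximated discontinuity location; $N_q$: the
number of Laguerre quadrature points $\zeta_k \in \mathbb{R}$, with corresponding weights $\eta_k$.
We show in (Bayer, Ben Hammouda, and Tempone 2020) that $I$ and $\bar{I}$ are highly smooth
functions.
Both approaches that we use in this work, the ASGQ and MLMC methods, aim to efficiently
approximate the resulting expectation in (5).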
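The root finding and pre-integration steps can be illustrated on a toy case. The sketch below is not the authors' implementation: it assumes a two-asset basket call under driftless GBM with a single Euler step (N = 1, so there is no Brownian bridge part), hypothetical parameter values, a 2×2 rotation whose first row is proportional to (1, 1), and Gauss-Laguerre quadrature shifted to start at the kink. All function names are illustrative.

```python
import numpy as np

# Hypothetical toy data: two assets, one Euler step.
X0 = np.array([100.0, 100.0])
C = np.array([0.5, 0.5])
SIGMA = np.array([0.2, 0.3])
T, K = 1.0, 100.0
A = np.array([[1.0, 1.0], [-1.0, 1.0]]) / np.sqrt(2.0)  # rotation, first row ~ Z1 + Z2

def phi(y1, y2):
    """Basket value minus strike, written in the rotated variables y = A z."""
    z = A.T @ np.array([y1, y2])                 # A is orthogonal, so A^{-1} = A^T
    return C @ (X0 * (1.0 + SIGMA * np.sqrt(T) * z)) - K

def dphi_dy1(y1, y2):
    """Derivative of phi with respect to y1 (constant here, since phi is linear)."""
    return C @ (X0 * SIGMA * np.sqrt(T) * A.T[:, 0])

def newton_root(y2, y1=0.0, tol=1e-10, max_iter=50):
    """Locate the kink y1*(y2), i.e. the root of phi(., y2), by Newton iteration."""
    for _ in range(max_iter):
        step = phi(y1, y2) / dphi_dy1(y1, y2)
        y1 -= step
        if abs(step) < tol:
            break
    return y1

def pre_integrate(y2, n_q=32):
    """Integrate max(phi, 0) against the N(0,1) density of y1 over [y1*, inf)."""
    y_star = newton_root(y2)
    t, w = np.polynomial.laguerre.laggauss(n_q)  # rule for weight exp(-t) on [0, inf)
    y1 = y_star + t                              # shift nodes so they start at the kink
    # exp(t) cancels the Laguerre weight; it is merged with the Gaussian exponent.
    dens = np.exp(t - 0.5 * y1**2) / np.sqrt(2.0 * np.pi)
    vals = np.array([phi(v, y2) for v in y1])    # payoff equals phi to the right of the kink
    return np.sum(w * dens * vals)

def price(n_outer=20):
    """Outer expectation over y2 of the smooth function pre_integrate (Gauss-Hermite)."""
    y, w = np.polynomial.hermite_e.hermegauss(n_outer)  # rule for weight exp(-y^2/2)
    return sum(wi * pre_integrate(yi) for yi, wi in zip(y, w)) / np.sqrt(2.0 * np.pi)

print(newton_root(1.0), pre_integrate(0.0), price())
```

In this toy case the kink location is available in closed form, so Newton converges in one step; the point of the sketch is only the structure: the quadrature in y1 never straddles the kink, and the remaining integrand in y2 is smooth.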
20. Density Estimation
Goal: Approximate the density $\rho_X$ at $u$, for a stochastic process $X$
\[ \rho_X(u) = E[\delta(X - u)], \quad \delta \text{ is the Dirac delta function.} \]
Without any smoothing techniques (regularization, kernel
density, ...), MC and MLMC fail due to the infinite variance caused by
the singularity of the function $\delta$.
Strategy in (Bayer, Ben Hammouda, and Tempone 2020)
1 Conditioning with respect to the Brownian bridge
\[ \rho_X(u) = \frac{1}{\sqrt{2\pi}}\, E\Big[ \exp\big(-(y_1^*(u))^2 / 2\big)\, \frac{dy_1^*}{dx}(u) \Big] \approx \frac{1}{\sqrt{2\pi}}\, E\Big[ \exp\big(-(\bar{y}_1^*(u))^2 / 2\big)\, \frac{d\bar{y}_1^*}{dx}(u) \Big], \tag{6} \]
$y_1^*$: the exact discontinuity; $\bar{y}_1^*$: the approximated discontinuity.
2 We use the MLMC method to efficiently approximate (6).
Kernel density techniques or parametric regularization as in (Giles,
Nagapetyan, and Ritter 2015) ⇒ a pointwise error that increases
exponentially with respect to the dimension of the state vector $X$.
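Formula (6) can be tried out on a one-dimensional example. The following sketch assumes a driftless GBM discretized by forward Euler, plain Monte Carlo over the Brownian bridge (instead of MLMC), and a vectorized Newton iteration for the root; parameter values and the name `density_estimate` are hypothetical. For each bridge sample it solves for the coarse factor y that makes the terminal value equal to u, and averages the Gaussian density at that root times dy/du.

```python
import numpy as np

def density_estimate(u, x0=1.0, sigma=0.2, T=1.0, n_steps=8, n_samples=20_000, seed=0):
    """Estimate the density of the Euler approximation of X_T at u via conditioning."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Brownian bridge increments: dB_n = dW_n - (dt / T) * W_T.
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_samples, n_steps))
    dB = dW - (dt / T) * dW.sum(axis=1, keepdims=True)
    a = 1.0 + sigma * dB              # part of each Euler factor that does not depend on y
    b = sigma * dt / np.sqrt(T)       # coefficient of the coarse factor y in each factor

    def terminal_and_slope(y):
        factors = a + b * y[:, None]
        x = x0 * factors.prod(axis=1)
        dx_dy = x * (b / factors).sum(axis=1)
        return x, dx_dy

    y = np.zeros(n_samples)
    for _ in range(50):               # Newton iteration for X_T(y) = u, one root per sample
        x, dx_dy = terminal_and_slope(y)
        step = (x - u) / dx_dy
        y -= step
        if np.max(np.abs(step)) < 1e-12:
            break
    _, dx_dy = terminal_and_slope(y)
    # Gaussian density at the root times dy*/du = 1 / (dX_T/dy).
    return np.mean(np.exp(-0.5 * y**2) / np.sqrt(2.0 * np.pi) / dx_dy)

estimate = density_estimate(1.0)
print(estimate)
```

Every sample contributes a bounded, smooth quantity, so the estimator has finite variance without any kernel or bandwidth; the remaining errors are the time-stepping bias and the Newton tolerance.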
21. Why not Kernel Density Techniques
in High Dimension?
Similar to approaches based on parametric regularization as in Giles, Nagapetyan,
and Ritter 2015.
This class of approaches has a pointwise error that increases exponentially with
respect to the dimension of the state vector X.
For a $d$-dimensional problem, a kernel density estimator with a bandwidth matrix
$H = \operatorname{diag}(h, \dots, h)$
\[ \text{MSE} \approx c_1 M^{-1} h^{-d} + c_2 h^4. \tag{7} \]
$M$ is the number of samples, and $c_1$ and $c_2$ are constants.
Our approach in high dimension: For $u \in \mathbb{R}^d$
\[ \rho_X(u) = E[\delta(X - u)] = E\big[\rho_d(y^*(u))\, \det(J(u))\big] \approx E\big[\rho_d(\bar{y}^*(u))\, \det(\bar{J}(u))\big], \tag{8} \]
$J$ is the Jacobian matrix, with $J_{ij} = \frac{\partial y_i^*}{\partial x_j}$; $\rho_d(\cdot)$ is the multivariate Gaussian density.
Thanks to the exact conditional expectation with respect to the Brownian bridge,
the error of our approach is only restricted to the error for finding an approximated
location of the discontinuity ⇒ the error in our approach is insensitive to the
dimension of the problem.
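The dimension dependence of the kernel density estimator can be made explicit by minimizing (7) over the bandwidth $h$. The short derivation below is a standard calculation added for illustration, treating $c_1$ and $c_2$ as fixed constants:

```latex
\begin{align*}
\frac{d}{dh}\left( c_1 M^{-1} h^{-d} + c_2 h^{4} \right)
  &= -d\, c_1 M^{-1} h^{-d-1} + 4 c_2 h^{3} = 0
  \;\Longrightarrow\;
  h_* = \left( \frac{d\, c_1}{4 c_2 M} \right)^{\frac{1}{d+4}}, \\
\text{MSE}(h_*) &= O\!\left( M^{-\frac{4}{d+4}} \right)
  \;\Longrightarrow\;
  M = O\!\left( \mathrm{TOL}^{-\frac{d+4}{2}} \right)
  \text{ samples to reach } \sqrt{\text{MSE}} \le \mathrm{TOL}.
\end{align*}
```

The exponent of the required number of samples grows linearly in $d$, so the cost to reach a fixed tolerance grows exponentially with the dimension of the state vector.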
22. Sparse grids (I)
Notation:
Given $F: \mathbb{R}^d \to \mathbb{R}$ and a multi-index $\beta \in \mathbb{N}_+^d$.
$F_\beta = Q^{m(\beta)}[F]$: a quadrature operator based on a Cartesian
quadrature grid ($m(\beta_n)$ points along $y_n$).
Approximating $E[F]$ with $F_\beta$ is not an appropriate option due to
the well-known curse of dimensionality.
The first-order difference operators
\[ \Delta_i F_\beta = \begin{cases} F_\beta - F_{\beta - e_i}, & \text{if } \beta_i > 1 \\ F_\beta, & \text{if } \beta_i = 1 \end{cases} \]
where $e_i$ denotes the $i$th $d$-dimensional unit vector.
The mixed (first-order tensor) difference operators
\[ \Delta[F_\beta] = \otimes_{i=1}^{d} \Delta_i F_\beta \]
Idea: A quadrature estimate of $E[F]$ is
\[ M_I[F] = \sum_{\beta \in I} \Delta[F_\beta], \]
23. Sparse grids (II)
\[ E[F] \approx M_I[F] = \sum_{\beta \in I} \Delta[F_\beta], \]
Product approach: $I = \{ \|\beta\|_\infty \le \ell;\ \beta \in \mathbb{N}_+^d \}$ ⇒ $E_Q(M) = O(M^{-r/d})$
(for functions with bounded total derivatives up to order $r$).
Regular sparse grids:
$I = \{ \|\beta\|_1 \le \ell + d - 1;\ \beta \in \mathbb{N}_+^d \}$ ⇒ $E_Q(M) = O\big(M^{-s} (\log M)^{(d-1)(s+1)}\big)$
(for functions with bounded mixed derivatives up to order $s$).
Adaptive sparse grids quadrature (ASGQ): $I = I^{\text{ASGQ}}$
(next slides).
$E_Q(M) = O(M^{-s})$ (for functions with bounded weighted mixed
derivatives up to order $s$).
Notation: $M$: number of quadrature points; $E_Q$: quadrature error.
Figure 2.1: Left are product grids $\Delta_{\beta_1} \otimes \Delta_{\beta_2}$ for
$1 \le \beta_1, \beta_2 \le 3$. Right is the corresponding SG construction.
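The difference operators and the regular sparse grid index set can be sketched in a few lines. The code below is an illustration under stated assumptions: it takes $m(\beta_n) = \beta_n$ Gauss-Hermite points per direction and a standard Gaussian weight, and all function names are hypothetical. It is written for clarity, not efficiency (tensor rules are recomputed for every difference).

```python
import itertools
import math
import numpy as np

def tensor_quadrature(F, beta):
    """F_beta: tensor Gauss-Hermite rule with beta[n] points along y_n, for E[F(Y)], Y ~ N(0, I)."""
    rules = [np.polynomial.hermite_e.hermegauss(m) for m in beta]
    total = 0.0
    for idx in itertools.product(*(range(m) for m in beta)):
        y = np.array([rules[n][0][i] for n, i in enumerate(idx)])
        w = math.prod(rules[n][1][i] for n, i in enumerate(idx))
        total += w * F(y)
    return total / (2.0 * math.pi) ** (len(beta) / 2.0)   # normalize the Gaussian weight

def mixed_difference(F, beta):
    """Delta[F_beta]: tensor product of the first-order differences in every direction."""
    total = 0.0
    for alpha in itertools.product((0, 1), repeat=len(beta)):
        lower = tuple(b - a for b, a in zip(beta, alpha))
        if min(lower) >= 1:                                # Delta_i F_beta = F_beta when beta_i = 1
            total += (-1) ** sum(alpha) * tensor_quadrature(F, lower)
    return total

def sparse_quadrature(F, d, level):
    """Sum of mixed differences over the regular sparse index set |beta|_1 <= level + d - 1."""
    index_set = [b for b in itertools.product(range(1, level + 1), repeat=d)
                 if sum(b) <= level + d - 1]
    return sum(mixed_difference(F, b) for b in index_set)

# Usage: a smooth test integrand whose exact mean is exp(0.065).
F = lambda y: math.exp(0.3 * y[0] + 0.2 * y[1])
print(sparse_quadrature(F, d=2, level=5), math.exp(0.065))
```

Summing the mixed differences over a full box of indices telescopes back to the tensor rule at the corner of the box; the sparse index set simply drops the differences that are products of several small terms.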
24. ASGQ in practice
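\[ E[F] \approx M_{I^{\text{ASGQ}}}[F] = \sum_{\beta \in I^{\text{ASGQ}}} \Delta[F_\beta], \]
The construction of $I^{\text{ASGQ}}$ is done by profit thresholding
\[ I^{\text{ASGQ}} = \{ \beta \in \mathbb{N}_+^d : P_\beta \ge T \}. \]
Profit of a hierarchical surplus $P_\beta = \frac{\Delta E_\beta}{\Delta W_\beta}$.
Error contribution: $\Delta E_\beta = \big| M_{I \cup \{\beta\}} - M_I \big|$.
Work contribution: $\Delta W_\beta = \text{Work}[M_{I \cup \{\beta\}}] - \text{Work}[M_I]$.
Figure 2.2: A posteriori, adaptive construction as in
(Haji-Ali et al. 2016): Given an index set $I_k$, compute the
profits of the neighbor indices and select the most profitable
one.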
30. Monte Carlo (MC): Idea
Let $X$ be a stochastic process and $g: \mathbb{R}^d \to \mathbb{R}$ a function of the
state of the system which gives a measurement of interest.
Aim: approximate $E[g(X(T))]$ efficiently, using $\overline{X}^{\Delta t}$ as an
approximate path of $X$.
Let $\mu_M$ be a classical Monte Carlo estimator of $E[g(\overline{X}^{\Delta t}(T))]$
defined by
\[ \mu_M = \frac{1}{M} \sum_{m=1}^{M} g\big(\overline{X}^{\Delta t}_{[m]}(T)\big), \]
where $\overline{X}^{\Delta t}_{[m]}$ are independent paths generated via the approximate
algorithm with a step-size of $\Delta t$.
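As a concrete instance of the estimator $\mu_M$, the sketch below prices a European call under a driftless one-dimensional GBM with forward Euler paths. The parameter values and the function name are illustrative assumptions, and the interest rate is taken to be zero so that no discounting is needed.

```python
import numpy as np

def mc_call_price_euler(x0=100.0, strike=100.0, sigma=0.2, T=1.0,
                        n_steps=16, n_paths=100_000, seed=0):
    """Classical Monte Carlo estimate of E[max(X_T - K, 0)] with forward Euler paths."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, x0)
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        x = x * (1.0 + sigma * dw)          # Euler step for dX = sigma * X dW
    payoff = np.maximum(x - strike, 0.0)
    mu = payoff.mean()                       # the estimator mu_M
    std_err = payoff.std(ddof=1) / np.sqrt(n_paths)   # statistical error, O(1/sqrt(M))
    return mu, std_err

mu, std_err = mc_call_price_euler()
print(mu, std_err)
```

The returned standard error only measures the statistical part of the error; the bias from the time discretization has to be controlled separately through the step size.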
31. Monte Carlo (MC): Complexity
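We can notice that
\[ E[g(X(T))] - \mu_M = \underbrace{E[g(X(T))] - E[g(\overline{X}^{\Delta t}(T))]}_{\text{bias (weak error)}} + \underbrace{E[g(\overline{X}^{\Delta t}(T))] - \mu_M}_{\text{statistical error} = O(1/\sqrt{M})}. \]
If we have an order one method (bias $= O(h) \approx O(\mathrm{TOL})$, with step size $h$)
\[ \text{Total computational complexity} = \underbrace{(\text{cost per path})}_{\approx T/h = \mathrm{TOL}^{-1}} \times \underbrace{(\#\text{paths})}_{= M = \mathrm{TOL}^{-2}} = O(\mathrm{TOL}^{-1}\, \mathrm{TOL}^{-2}) = O(\mathrm{TOL}^{-3}). \]
The Multilevel Monte Carlo (MLMC) estimator introduced in
(Giles 2008a), in the context of SDEs, reduces the total work to
$O(\mathrm{TOL}^{-2.5})$ (worst case), and even to $O(\mathrm{TOL}^{-2})$ in the best case
(see the complexity theorem in (Cliffe et al. 2011)).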
32. Multilevel Monte Carlo (MLMC) (Giles 2008a)
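Aim: Estimate efficiently $E\big[\bar{I}\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big]$
Setting
▸ A hierarchy of nested meshes of the time interval $[0,T]$, indexed by $\{\ell\}_{\ell=0}^{L}$.
▸ $\Delta t_\ell = K^{-\ell} \Delta t_0$: The size of the time steps for levels $\ell \ge 1$, where $K > 1$
is a given integer constant and $\Delta t_0$ the step size used at level $\ell = 0$.
▸ $\bar{I}_\ell$: the level $\ell$ approximation of $\bar{I}$, computed with step size $\Delta t_\ell$, $N_{q,\ell}$ Laguerre
quadrature points, and $\mathrm{TOL}_{\text{Newton},\ell}$ as the tolerance of the Newton method at
level $\ell$.
MLMC estimator
\[ E[\bar{I}_L] = E[\bar{I}_0] + \sum_{\ell=1}^{L} E[\bar{I}_\ell - \bar{I}_{\ell-1}] \]
$\operatorname{Var}[\bar{I}_0] \gg \operatorname{Var}[\bar{I}_\ell - \bar{I}_{\ell-1}]$ as $\ell$ increases
$M_0 \gg M_\ell$ as $\ell$ increases
By defining $Q_0 = \frac{1}{M_0} \sum_{m_0=1}^{M_0} \bar{I}_{0,[m_0]}$; $Q_\ell = \frac{1}{M_\ell} \sum_{m_\ell=1}^{M_\ell} \big(\bar{I}_{\ell,[m_\ell]} - \bar{I}_{\ell-1,[m_\ell]}\big)$, we arrive
at the unbiased MLMC estimator, $\hat{Q}$
\[ \hat{Q} = \sum_{\ell=0}^{L} Q_\ell. \tag{9} \]
Complexity of the MLMC estimator: $O\Big(\mathrm{TOL}^{-2 - \max\left(0, \frac{\gamma - \beta}{\alpha}\right)} \log(\mathrm{TOL})^{2 \times \mathbf{1}_{\{\beta = \gamma\}}}\Big)$.
$(\alpha, \beta, \gamma)$ are weak, strong and work rates respectively.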
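The complexity formula is easy to evaluate for given rates. The helper below is a small illustrative sketch (the function name is hypothetical); it returns the exponent of $\mathrm{TOL}^{-1}$ and the power of the logarithmic factor, and is applied to the two rate triples that the talk contrasts: strong rate 1/2 (no smoothing) versus strong rate 1 (numerical smoothing), both with weak and work rates equal to 1.

```python
def mlmc_complexity(alpha, beta, gamma):
    """Return (p, q) such that the MLMC work is O(TOL^-p * log(TOL)^q).

    alpha, beta, gamma are the weak, strong (variance) and work rates.
    """
    exponent = 2.0 + max(0.0, (gamma - beta) / alpha)
    log_power = 2 if beta == gamma else 0
    return exponent, log_power

without_smoothing = mlmc_complexity(alpha=1.0, beta=0.5, gamma=1.0)
with_smoothing = mlmc_complexity(alpha=1.0, beta=1.0, gamma=1.0)
print(without_smoothing, with_smoothing)
```

Raising the strong rate from 1/2 to 1 is exactly what moves the estimator from the $\mathrm{TOL}^{-2.5}$ regime to the $\mathrm{TOL}^{-2} \log(\mathrm{TOL})^2$ regime.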
34. Error and Work Discussion for ASGQ (I)
$Q_N$: the ASGQ estimator
\[
\begin{aligned}
E[g(X(T))] - Q_N &= \underbrace{E[g(X(T))] - E[g(\overline{X}^{\Delta t}(T))]}_{\text{Error I: bias or weak error}} \\
&+ \underbrace{E\big[I\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big] - E\big[\bar{I}\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big]}_{\text{Error II: numerical smoothing error}} \\
&+ \underbrace{E\big[\bar{I}\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big] - Q_N}_{\text{Error III: ASGQ error}},
\end{aligned} \tag{10}
\]
Schemes based on forward Euler to simulate asset dynamics
\[ \text{Error I} = O(\Delta t). \]
Given the smoothness analysis in (Bayer, Ben Hammouda, and
Tempone 2020), we have
\[ \text{Error III} = O\big(N_{\text{ASGQ}}^{-p}\big), \]
$N_{\text{ASGQ}}$: the number of quadrature points used by the ASGQ
estimator, and $p > 0$ (mixed derivatives of $\bar{I}$, in the $dN - 1$ dimensional
space, are bounded up to order $p$).
35. Error and Work Discussion for ASGQ (II)
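\[
\begin{aligned}
\text{Error II} &= E\big[I\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big] - E\big[\bar{I}\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big] \\
&\le \sup_{y_{-1}, z_{-1}^{(1)}, \dots, z_{-1}^{(d)}} \Big| I\big(y_{-1}, z_{-1}^{(1)}, \dots, z_{-1}^{(d)}\big) - \bar{I}\big(y_{-1}, z_{-1}^{(1)}, \dots, z_{-1}^{(d)}\big) \Big| \\
&= O\big(N_q^{-s}\big) + O\big(|y_1^* - \bar{y}_1^*|^{\kappa + 1}\big) \\
&= O\big(N_q^{-s}\big) + O\big(\mathrm{TOL}_{\text{Newton}}^{\kappa + 1}\big)
\end{aligned} \tag{11}
\]
$y_1^*$: the exact location of the non-smoothness.
$\bar{y}_1^*$: the approximated location of the non-smoothness obtained by Newton iteration
⇒ $|y_1^* - \bar{y}_1^*| = \mathrm{TOL}_{\text{Newton}}$.
$\kappa \ge 0$ ($\kappa = 0$: Heaviside payoff (digital option), and $\kappa = 1$: call or put payoffs).
$N_q$ is the number of points used by the Laguerre quadrature for the one-dimensional
pre-integration step.
$s > 0$: Derivatives of $G$ with respect to $y_1$ are bounded up to order $s$.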
36. Error and Work Discussion for ASGQ (III)
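An optimal performance of ASGQ is given by
\[
\begin{cases}
\min\limits_{(N_{\text{ASGQ}},\, N_q,\, \mathrm{TOL}_{\text{Newton}})} \text{Work}_{\text{ASGQ}} \propto N_{\text{ASGQ}} \times N_q \times \Delta t^{-1} \\
\text{s.t. } E_{\text{total,ASGQ}} = \mathrm{TOL}.
\end{cases} \tag{12}
\]
\[ E_{\text{total,ASGQ}} = E[g(X(T))] - Q_N = O(\Delta t) + O\big(N_{\text{ASGQ}}^{-p}\big) + O\big(N_q^{-s}\big) + O\big(\mathrm{TOL}_{\text{Newton}}^{\kappa + 1}\big). \]
We show in (Bayer, Ben Hammouda, and Tempone 2020) that
under certain conditions for the regularity parameters $s$ and $p$
($p, s \gg 1$)
▸ The ASGQ: $\text{Work}_{\text{ASGQ}} = O(\mathrm{TOL}^{-1})$ (for the best case).
▸ The MC method: $\text{Work}_{\text{MC}} = O(\mathrm{TOL}^{-3})$ (for the best case).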
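A heuristic way to see where the best-case rate for ASGQ comes from is to balance the error terms against TOL and insert the result into the work model (12). The calculation below is only a sketch added for illustration: it ignores how the constants and the dimension $dN - 1$ of the integration problem change as $\Delta t$ decreases, and it uses the fact that the Newton tolerance does not enter the work model.

```latex
\begin{align*}
\Delta t \propto \mathrm{TOL}, \quad
N_{\text{ASGQ}} \propto \mathrm{TOL}^{-1/p}, \quad
N_q \propto \mathrm{TOL}^{-1/s}
\;\Longrightarrow\;
\text{Work}_{\text{ASGQ}} \propto N_{\text{ASGQ}} \times N_q \times \Delta t^{-1}
\propto \mathrm{TOL}^{-1 - \frac{1}{p} - \frac{1}{s}}
\;\xrightarrow[p,\, s \to \infty]{}\; \mathrm{TOL}^{-1}.
\end{align*}
```

The precise conditions under which this rate holds are those stated in (Bayer, Ben Hammouda, and Tempone 2020).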
37. Error and Work Discussion for MLMC (I)
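$\hat{Q}$: the MLMC estimator, as defined in (9).
\[
\begin{aligned}
E[g(X(T))] - \hat{Q} &= \underbrace{E[g(X(T))] - E[g(\overline{X}^{\Delta t_L}(T))]}_{\text{Error I: bias or weak error}} \\
&+ \underbrace{E\big[I_L\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big] - E\big[\bar{I}_L\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big]}_{\text{Error II: numerical smoothing error (same as in (11))}} \\
&+ \underbrace{E\big[\bar{I}_L\big(Y_{-1}, Z_{-1}^{(1)}, \dots, Z_{-1}^{(d)}\big)\big] - \hat{Q}}_{\text{Error III: MLMC statistical error}}.
\end{aligned} \tag{13}
\]
Schemes based on forward Euler to simulate asset dynamics
\[ \text{Error I} = O(\Delta t_L). \]
\[ \text{Error III} = \sum_{\ell=L_0}^{L} M_\ell^{-1} V_\ell = O\Big( \sum_{\ell=L_0}^{L} \sqrt{N_{q,\ell}}\, \log\big(\mathrm{TOL}_{\text{Newton},\ell}^{-1}\big) \Big). \]
Notation: $V_\ell = \operatorname{Var}[\bar{I}_\ell - \bar{I}_{\ell-1}]$; $M_\ell$: number of samples at level $\ell$.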
38. Error and Work Discussion for MLMC (II)
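An optimal performance of MLMC is given by
\[
\begin{cases}
\min\limits_{(L,\, L_0,\, \{M_\ell\}_{\ell=0}^{L},\, N_q,\, \mathrm{TOL}_{\text{Newton}})} \text{Work}_{\text{MLMC}} \propto \sum_{\ell=L_0}^{L} M_\ell \big(N_{q,\ell}\, \Delta t_\ell^{-1}\big) \\
\text{s.t. } E_{\text{total,MLMC}} = \mathrm{TOL}.
\end{cases} \tag{14}
\]
\[ E_{\text{total,MLMC}} = E[g(X(T))] - \hat{Q} = O(\Delta t_L) + O\Big( \sum_{\ell=L_0}^{L} \sqrt{N_{q,\ell}}\, \log\big(\mathrm{TOL}_{\text{Newton},\ell}^{-1}\big) \Big) + O\big(N_{q,L}^{-s}\big) + O\big(\mathrm{TOL}_{\text{Newton},L}^{\kappa + 1}\big). \]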
40. Numerical Results for ASGQ
Example                        | Total relative error | CPU time ratio (ASGQ/MC)
Single digital option (GBM)    | 0.7%                 | $7 \times 10^{-3}$
Single call option (GBM)       | 0.5%                 | $8 \times 10^{-3}$
4d-Basket call option (GBM)    | 0.8%                 | $7.4 \times 10^{-2}$
Single digital option (Heston) | 0.6%                 | $6.2 \times 10^{-2}$
Single call option (Heston)    | 0.5%                 | $17.2 \times 10^{-2}$
Table 1: Summary of relative errors and computational gains, achieved by the
different methods. In this table, we highlight the computational gains
achieved by ASGQ over the MC method to meet a certain error tolerance. We
note that the ratios are computed for the best configuration with Richardson
extrapolation for each method.
41. Numerical Results for MLMC
Method                                          | $\kappa_L$ | $\alpha$ | $\beta$ | $\gamma$ | Numerical complexity
Without smoothing + digital under GBM           | 709 | 1 | 1/2 | 1 | $O(\mathrm{TOL}^{-2.5})$
With numerical smoothing + digital under GBM    | 3   | 1 | 1   | 1 | $O(\mathrm{TOL}^{-2} (\log(\mathrm{TOL}))^2)$
Without smoothing + digital under Heston        | 245 | 1 | 1/2 | 1 | $O(\mathrm{TOL}^{-2.5})$
With numerical smoothing + digital under Heston | 7   | 1 | 1   | 1 | $O(\mathrm{TOL}^{-2} (\log(\mathrm{TOL}))^2)$
With numerical smoothing + GBM density          | 5   | 1 | 1   | 1 | $O(\mathrm{TOL}^{-2} (\log(\mathrm{TOL}))^2)$
With numerical smoothing + Heston density       | 8   | 1 | 1   | 1 | $O(\mathrm{TOL}^{-2} (\log(\mathrm{TOL}))^2)$
Table 2: Summary of the MLMC numerical results observed for different examples. $\kappa_L$ is the kurtosis
at the finest levels of MLMC, $(\alpha, \beta, \gamma)$ are weak, strong and work rates respectively. TOL is the
user-selected MLMC tolerance.
42. Digital Option under the Heston Model:
MLMC Without Smoothing
[Figure: four convergence panels with horizontal axis from 0 to 6; the last panel is labeled kurtosis.]
Figure 4.1: Digital option under Heston: Convergence plots for MLMC
without smoothing, combined with the fixed truncation scheme.
43. Digital Option under the Heston Model:
MLMC With Numerical Smoothing
[Figure: four convergence panels with horizontal axis from 0 to 7; the last panel is labeled kurtosis.]
Figure 4.2: Digital option under Heston: Convergence plots for MLMC with
numerical smoothing, combined with the Heston Ornstein-Uhlenbeck (OU)
based scheme.
44. Digital Option under the Heston Model:
Numerical Complexity Comparison
[Figure: E[W] versus TOL, for TOL from $10^{-4}$ to $10^{-1}$. Legend: MLMC without smoothing, with reference rate $\mathrm{TOL}^{-2.5}$; MLMC + numerical smoothing, with reference rate $\mathrm{TOL}^{-2} \log(\mathrm{TOL})^2$.]
Figure 4.3: Digital option under Heston: Comparison of the numerical
complexity of i) standard MLMC (based on fixed truncation scheme), and ii)
MLMC with numerical smoothing (based on Heston OU based scheme).
46. Conclusions: Context of Deterministic Quadrature
1 We consider cases where we cannot perform an analytic
smoothing (Bayer, Siebenmorgen, and Tempone 2018; Xiao and
Wang 2018; Bayer, Hammouda, and Tempone 2018) and introduce
a novel numerical smoothing technique that is coupled with
ASGQ (combined with the hierarchical Brownian bridge and
Richardson extrapolation to overcome the high dimensionality).
2 Our novel approach substantially outperforms the MC method,
even for high-dimensional cases and for dynamics where
discretization is needed, such as the Heston model.
3 In (Bayer, Ben Hammouda, and Tempone 2020) we provide a
smoothness analysis for the smoothed integrand in the time
stepping setting.
4 More details can be found in
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone.
“Numerical smoothing and hierarchical approximations for
efficient option pricing and density estimation”. In: arXiv
preprint arXiv:2003.05708 (2020)
47. Conclusions: Context of MLMC Methods
1 Compared to works (Giles 2008b; Giles, Debrabant, and Rößler 2013),
our approach can be easily applied to cases where one cannot apply
analytic smoothing.
2 Compared to the case without smoothing, we significantly reduce the
kurtosis at the deep levels of MLMC, and also improve the strong
convergence rate, and consequently the complexity of the MLMC
estimator from $O(\mathrm{TOL}^{-2.5})$ to $O(\mathrm{TOL}^{-2} \log(\mathrm{TOL})^2)$, without the
need to use higher order schemes such as the Milstein scheme as in (Giles
2008b; Giles, Debrabant, and Rößler 2013).
3 Contrary to the smoothing strategy used in (Giles, Nagapetyan, and
Ritter 2015), our numerical smoothing approach
▸ Does not deteriorate the strong convergence behavior.
▸ Is easier to apply for any dynamics and QoI.
▸ When estimating densities: our pointwise error does not increase
exponentially with respect to the dimension of the state vector as in (Giles,
Nagapetyan, and Ritter 2015).
4 More details can be found in
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone.
“Numerical smoothing and hierarchical approximations for efficient
option pricing and density estimation”. In: arXiv preprint
arXiv:2003.05708 (2020)
48. References I
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone.
“Numerical smoothing and hierarchical approximations for
efficient option pricing and density estimation”. In: arXiv
preprint arXiv:2003.05708 (2020).
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone.
“Hierarchical adaptive sparse grids for option pricing under the
rough Bergomi model”. In: arXiv preprint arXiv:1812.08533
(2018).
Christian Bayer, Markus Siebenmorgen, and Raúl Tempone.
“Smoothing the payoff for efficient computation of basket option
pricing”. In: Quantitative Finance 18.3 (2018), pp. 491–505.
K Andrew Cliffe et al. “Multilevel Monte Carlo methods and
applications to elliptic PDEs with random coefficients”. In:
Computing and Visualization in Science 14.1 (2011), p. 3.
49. References II
Michael Giles. “Multi-level Monte Carlo path simulation”. In:
Operations Research 56.3 (2008a), pp. 607–617.
Michael B Giles. “Improved multilevel Monte Carlo convergence
using the Milstein scheme”. In: Monte Carlo and Quasi-Monte
Carlo Methods 2006. Springer, 2008b, pp. 343–358.
Michael B Giles, Kristian Debrabant, and Andreas Rößler.
“Numerical analysis of multilevel Monte Carlo path simulation
using the Milstein discretisation”. In: arXiv preprint
arXiv:1302.4676 (2013).
Michael B Giles, Tigran Nagapetyan, and Klaus Ritter.
“Multilevel Monte Carlo approximation of distribution functions
and densities”. In: SIAM/ASA Journal on Uncertainty
Quantification 3.1 (2015), pp. 267–295.
50. References III
Abdul-Lateef Haji-Ali et al. “Multi-index stochastic collocation
for random PDEs”. In: Computer Methods in Applied Mechanics
and Engineering 306 (2016), pp. 95–122.
J Michael Harrison and Stanley R Pliska. “Martingales and
stochastic integrals in the theory of continuous trading”. In:
Stochastic Processes and their Applications 11.3 (1981),
pp. 215–260.
Ye Xiao and Xiaoqun Wang. “Conditional quasi-Monte Carlo
methods and dimension reduction for option pricing and hedging
with discontinuous functions”. In: Journal of Computational and
Applied Mathematics 343 (2018), pp. 289–308.