Introduction
Clustering Financial Time Series:
How Long is Enough?
25th International Joint Conference on Artificial Intelligence
IJCAI-16
S. Andler, G. Marti, F. Nielsen, P. Donnat
July 14, 2016
Gautier Marti Clustering Financial Time Series: How Long is Enough?
Clustering of Financial Time Series
Goal: Build Risk & Trading AI agents...
[Figure; source: www.datagrapple.com]
... which can thrive on this kind of data.
Clustering of Financial Time Series
Stylized fact I: Financial time series correlations have a strong
hierarchical block diagonal structure (Econophysics [4])
Stylized fact II: Most correlations are spurious (RMT [2])
Motivation for clustering financial time series using correlation as a
similarity measure:
dimensionality reduction ≡ filtering noisy correlations
Challenge for the statistical practitioner
The dilemma:
the longer the time interval, the more precise the correlation
estimates, but also
the longer the time interval, the more unrealistic the
stationarity hypothesis for these time series.
Question: How does the clustering behave with statistical errors
of the correlation estimates?
How long is enough? 30 days? 120 days? 10 years?
A first theoretical approach - simplified setting
We consider the following framework:
financial time series ≡ random walks
they follow a joint elliptical distribution (e.g. Gaussian,
Student) parameterized by a correlation matrix
the correlation matrix has a hierarchical block structure:
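A minimal sketch of this simplified setting, assuming a nested block correlation matrix with three correlation levels. The function names, cluster sizes, degrees of freedom, and correlation values (0.7, 0.4, 0.1) are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

def hierarchical_block_correlation(n_groups=2, clusters_per_group=2, cluster_size=5,
                                   rho_cluster=0.7, rho_group=0.4, rho_background=0.1):
    """Nested block correlation matrix (hypothetical parameter values)."""
    n_clusters = n_groups * clusters_per_group
    cluster = np.repeat(np.arange(n_clusters), cluster_size)  # fine-level labels
    group = cluster // clusters_per_group                     # coarse-level labels
    same_cluster = cluster[:, None] == cluster[None, :]
    same_group = group[:, None] == group[None, :]
    corr = np.full((cluster.size, cluster.size), rho_background)
    corr[same_group] = rho_group      # same coarse group
    corr[same_cluster] = rho_cluster  # same fine cluster (overrides group level)
    np.fill_diagonal(corr, 1.0)
    return corr, cluster

def sample_increments(corr, T, dist="gaussian", dof=3, rng=None):
    """Draw T joint increments (shape T x N) from a Gaussian or Student elliptical law."""
    rng = np.random.default_rng(rng)
    n = corr.shape[0]
    z = rng.multivariate_normal(np.zeros(n), corr, size=T)
    if dist == "gaussian":
        return z
    if dist == "student":
        # Multivariate Student: Gaussian draw rescaled by one chi-square factor per row.
        chi2 = rng.chisquare(dof, size=(T, 1))
        return z * np.sqrt(dof / chi2)
    raise ValueError(f"unknown distribution: {dist}")

corr, labels = hierarchical_block_correlation()
increments = sample_increments(corr, T=250, dist="student", rng=0)
walks = np.cumsum(increments, axis=0)  # random walks: cumulative sums of increments
```

The clustering is then run on correlations estimated from the increments, not from the walks themselves.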
Simulations in the simplified setting
Some influential parameters:
clustering algorithm
number of observations T
number of variables N relative to T
contrast between the correlations, and their values
correlation estimator (e.g. Pearson, Spearman)
[Figure: empirical rates of convergence for Single Linkage, Average Linkage, and Ward (one panel each). x-axis: sample size (100 to 500); y-axis: score (0.0 to 1.0). Curves: Gaussian - Pearson, Gaussian - Spearman, Student - Pearson, Student - Spearman.]
Ratio of the number of correct clusterings obtained to the number of trials,
as a function of T
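This kind of experiment can be sketched as follows. It is a simplified stand-in, not the authors' code: it assumes Gaussian increments only, a flat (two-level) block model with hypothetical correlations 0.7 and 0.2, the distance sqrt(2(1 - rho)), and SciPy's hierarchical clustering cut at the true number of clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr

def block_correlation(n_clusters, cluster_size, rho_in, rho_out):
    """Flat block correlation matrix and the true cluster labels."""
    labels = np.repeat(np.arange(n_clusters), cluster_size)
    corr = np.where(labels[:, None] == labels[None, :], rho_in, rho_out)
    np.fill_diagonal(corr, 1.0)
    return corr, labels

def same_partition(a, b):
    """True when two label vectors define the same partition (up to relabeling)."""
    return np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :])

def success_rate(T, method="average", estimator="pearson", n_trials=50, seed=0):
    """Fraction of trials in which the true clusters are recovered from T observations."""
    rng = np.random.default_rng(seed)
    corr, truth = block_correlation(3, 5, rho_in=0.7, rho_out=0.2)
    k = truth.max() + 1
    hits = 0
    for _ in range(n_trials):
        x = rng.multivariate_normal(np.zeros(len(truth)), corr, size=T)
        if estimator == "pearson":
            est = np.corrcoef(x, rowvar=False)
        else:
            est, _ = spearmanr(x)  # correlation matrix for more than two columns
        dist = np.sqrt(np.clip(2.0 * (1.0 - est), 0.0, None))
        np.fill_diagonal(dist, 0.0)
        z = linkage(squareform(dist, checks=False), method=method)
        found = fcluster(z, t=k, criterion="maxclust")
        hits += same_partition(found, truth)
    return hits / n_trials

for T in (10, 50, 250):
    print(T, success_rate(T, method="single"), success_rate(T, method="ward"))
```

Swapping `method` among "single", "average" and "ward", and `estimator` between "pearson" and "spearman", reproduces the axes of comparison listed above.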
A consistency proof & first convergence bounds
A 2-step proof. First step:
We consider Hierarchical Agglomerative Clustering algorithms
Space contracting vs. Space conserving vs. Space dilating [1]
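Space-contracting:
$$D^{(t+1)}\left(C_i^{(t)} \cup C_j^{(t)},\, C_k^{(t)}\right) \le \min\left(D_{ik}^{(t)},\, D_{jk}^{(t)}\right)$$
Space-conserving:
$$D^{(t+1)}\left(C_i^{(t)} \cup C_j^{(t)},\, C_k^{(t)}\right) \in \left[\min\left(D_{ik}^{(t)},\, D_{jk}^{(t)}\right),\, \max\left(D_{ik}^{(t)},\, D_{jk}^{(t)}\right)\right]$$
Space-dilating:
$$D^{(t+1)}\left(C_i^{(t)} \cup C_j^{(t)},\, C_k^{(t)}\right) \ge \max\left(D_{ik}^{(t)},\, D_{jk}^{(t)}\right)$$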
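As an illustration of these three regimes, the sketch below applies the standard update rules of single, complete and average linkage to one merge and checks where the updated distance falls. The helper names and the numerical values are hypothetical. Boundary cases (an update equal to the min or the max) are reported as space-conserving here.

```python
def merged_distance(d_ik, d_jk, n_i, n_j, method):
    """Distance from the merged cluster C_i U C_j to C_k for three classic linkages."""
    if method == "single":
        return min(d_ik, d_jk)
    if method == "complete":
        return max(d_ik, d_jk)
    if method == "average":
        # Size-weighted mean of the two previous distances.
        return (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
    raise ValueError(f"unknown method: {method}")

def classify_update(d_new, d_ik, d_jk, tol=1e-12):
    """Place an updated distance relative to the interval [min, max] of the old ones."""
    lo, hi = min(d_ik, d_jk), max(d_ik, d_jk)
    if lo - tol <= d_new <= hi + tol:
        return "space-conserving"
    return "space-contracting" if d_new < lo else "space-dilating"

d_ik, d_jk, n_i, n_j = 0.5, 0.9, 2, 3  # illustrative values
for method in ("single", "complete", "average"):
    d_new = merged_distance(d_ik, d_jk, n_i, n_j, method)
    print(method, round(d_new, 3), classify_update(d_new, d_ik, d_jk))
```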
A consistency proof & first convergence bounds
A 2-step proof. First step:
Which geometrical configurations lead to the true clustering?
For space-conserving algorithms (e.g. Single, Complete, Average
Linkage), a sufficient separability condition reads
$$\max D_{\mathrm{intra}} := \max_{\substack{1 \le i,j \le N \\ C(i) = C(j)}} d(X_i, X_j) \;<\; \min_{\substack{1 \le i,j \le N \\ C(i) \ne C(j)}} d(X_i, X_j) =: \min D_{\mathrm{inter}}$$
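The separability condition can be checked directly on a distance matrix. The sketch below is a hypothetical helper that assumes a symmetric distance matrix and at least two clusters; the toy points are made up for illustration.

```python
import numpy as np

def is_separable(dist, labels):
    """Check max intra-cluster distance < min inter-cluster distance."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    max_intra = dist[same].max()   # diagonal zeros are included but never the max
    min_inter = dist[~same].min()  # requires at least two distinct clusters
    return bool(max_intra < min_inter), max_intra, min_inter

# Toy example: five points on a line, two well-separated groups.
points = np.array([0.0, 0.1, 0.2, 5.0, 5.1])
labels = np.array([0, 0, 0, 1, 1])
dist = np.abs(points[:, None] - points[None, :])
ok, max_intra, min_inter = is_separable(dist, labels)
print(ok, max_intra, min_inter)
```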
A consistency proof & first convergence bounds
A 2-step proof. Second step:
How long does it take for the estimates of the correlation
coefficients to be precise enough to be with high probability in
a good configuration for the clustering algorithm?
Answer: Concentration inequalities for correlation coefficients.
Convergence bounds
Combining both steps, we get the following convergence rate:
Convergence rate
The probability of the clustering algorithm making an error is
$$O\left(\sqrt{\frac{\log N}{T}}\right).$$
Proof. Step 1 - A bit more details
By induction.
Let’s assume the separability condition is satisfied at step t,
then
$$\min D_{\mathrm{intra}}^{(t)} \le \max D_{\mathrm{intra}}^{(t)} < \min D_{\mathrm{inter}}^{(t)} \le \max D_{\mathrm{inter}}^{(t)}$$
From the space-conserving property, we get:
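$$D_{\mathrm{intra}}^{(t+1)} \in \left[\min D_{\mathrm{intra}}^{(t)},\, \max D_{\mathrm{intra}}^{(t)}\right] \quad\text{and}\quad D_{\mathrm{inter}}^{(t+1)} \in \left[\min D_{\mathrm{inter}}^{(t)},\, \max D_{\mathrm{inter}}^{(t)}\right].$$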
Therefore:
the separability condition is satisfied at step t + 1,
the clustering algorithm has not linked points from two
different clusters between step t and step t + 1.
Proof. Step 2 - A bit more details
Maximum statistical error
For space-conserving algorithms, the separability condition is met if
$$\|\hat{\Sigma} - \Sigma\|_\infty < \frac{\min_{\rho_i, \rho_j} |\rho_i - \rho_j|}{2},$$
where C(i) ≠ C(j).
This means that the statistical error has to be below the minimum
correlation ‘contrast’ between the clusters.
The weaker the ‘contrast’, the more precise the correlation estimates have to be.
N.B. From the Cramér–Rao lower bound, we get for the Pearson correlation
estimator:
$$\operatorname{var}(\hat{\rho}) \ge \frac{(1 - \rho^2)^2}{1 + \rho^2}.$$
When correlation is high, it is easier to estimate.
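A quick numerical reading of this bound, evaluating the right-hand side exactly as written on the slide for a few illustrative values of rho (the function name is hypothetical):

```python
def pearson_variance_lower_bound(rho):
    """Right-hand side of the bound as written above: (1 - rho^2)^2 / (1 + rho^2)."""
    return (1.0 - rho**2) ** 2 / (1.0 + rho**2)

# The bound shrinks as |rho| grows, so high correlations admit tighter estimates.
for rho in (0.0, 0.3, 0.6, 0.9):
    print(f"rho = {rho:.1f}  bound = {pearson_variance_lower_bound(rho):.4f}")
```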
Correlation estimates concentration bounds
number of variables N, observations T, minimum separation d
Concentration bounds [3]
If $\Sigma$ and $\hat{\Sigma}$ are the population and empirical Spearman correlation
matrices respectively, then for $N \ge \frac{24}{\log T} + 2$, we have with
probability at least $1 - \frac{1}{T^2}$,
$$\|\hat{\Sigma} - \Sigma\|_\infty \le 24 \sqrt{\frac{\log N}{T}}.$$
$$P(\text{``correct clustering''}) \ge 1 - 2N^2 e^{-Td^2/24}$$
Not sharp enough! (for reasonable values of N, T, d)
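To see why the bound is not sharp enough, it can be evaluated for plausible values. The sketch below uses hypothetical inputs (N = 100 series, T = 250 observations, separation d = 0.1) and also solves the bound for the T needed to reach a target confidence.

```python
import math

def correct_clustering_lower_bound(N, T, d):
    """Lower bound 1 - 2 N^2 exp(-T d^2 / 24) on P("correct clustering")."""
    return 1.0 - 2.0 * N**2 * math.exp(-T * d**2 / 24.0)

def min_T_for_confidence(N, d, confidence=0.95):
    """Smallest T for which the bound reaches `confidence` (bound solved for T)."""
    return math.ceil(24.0 / d**2 * math.log(2.0 * N**2 / (1.0 - confidence)))

N, T, d = 100, 250, 0.1  # illustrative values, roughly one year of daily data
print(correct_clustering_lower_bound(N, T, d))  # negative, so the bound is vacuous
print(min_T_for_confidence(N, d))               # tens of thousands of observations
```

With these inputs the bound only becomes informative for sample sizes far beyond what the simulations suggest is needed in practice.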
Future developments
Bounds are not sharp enough. We can try to refine them using:
(theoretical) Intrinsic dimension of the HCBM model [5];
(empirical) A distance between dendrograms (instead of
correct/incorrect) for a finer analysis;
(empirical) A study of ‘correctness’ isoquants:
Precise convergence rates of clustering methodologies can provide
a useful model selection criterion for practitioners!
[1] Zhenmin Chen and John W. Van Ness. Space-conserving agglomerative algorithms. Journal of Classification, 13(1):157–168, 1996.
[2] Laurent Laloux, Pierre Cizeau, Marc Potters, and Jean-Philippe Bouchaud. Random matrix theory and financial correlations. International Journal of Theoretical and Applied Finance, 3(03):391–397, 2000.
[3] Han Liu, Fang Han, Ming Yuan, John Lafferty, Larry Wasserman, et al. High-dimensional semiparametric Gaussian copula graphical models. The Annals of Statistics, 40(4):2293–2326, 2012.
[4] Rosario N. Mantegna. Hierarchical structure in financial markets. The European Physical Journal B - Condensed Matter and Complex Systems, 11(1):193–197, 1999.
[5] Joel A. Tropp. An introduction to matrix concentration inequalities. arXiv preprint arXiv:1501.01571, 2015.
My recent attempts at using GANs for simulating realistic stocks returns
 
Takeaways from ICML 2019, Long Beach, California
Takeaways from ICML 2019, Long Beach, CaliforniaTakeaways from ICML 2019, Long Beach, California
Takeaways from ICML 2019, Long Beach, California
 

Dernier

NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 

Dernier (20)

NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 

Clustering Financial Time Series: How Long is Enough?

  • 1. Introduction Clustering Financial Time Series: How Long is Enough? 25th International Joint Conference on Artificial Intelligence IJCAI-16 S. Andler, G. Marti, F. Nielsen, P. Donnat July 14, 2016 Gautier Marti Clustering Financial Time Series: How Long is Enough?
  • 2. Clustering of Financial Time Series. Goal: build Risk & Trading AI agents that can thrive on this kind of data. [Figure omitted; source: www.datagrapple.com]
  • 3. Clustering of Financial Time Series. Stylized fact I: financial time series correlations have a strong hierarchical block-diagonal structure (Econophysics [4]). Stylized fact II: most correlations are spurious (RMT [2]). Motivation for clustering financial time series using correlation as a similarity measure: dimensionality reduction ≡ filtering noisy correlations.
  • 4. Challenge for the statistical practitioner. The dilemma: the longer the time interval, the more precise the correlation estimates; but also, the longer the time interval, the less realistic the stationarity hypothesis for these time series. Question: how does the clustering behave under statistical errors in the correlation estimates? How long is enough? 30 days? 120 days? 10 years?
  • 5. A first theoretical approach - simplified setting. We consider the following framework: (i) financial time series ≡ random walks; (ii) they follow a joint elliptical distribution (e.g. Gaussian, Student) parameterized by a correlation matrix; (iii) the correlation matrix has a hierarchical block structure.
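The framework above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it uses the simplest two-level block structure (one correlation inside clusters, a lower one across clusters), and the names `block_correlation` and `sample_returns`, the cluster sizes, and the correlation values 0.7 and 0.2 are all assumptions chosen for the example.

```python
import numpy as np

def block_correlation(sizes, rho_intra, rho_inter):
    """Correlation matrix with rho_intra inside each cluster, rho_inter across clusters."""
    n = sum(sizes)
    corr = np.full((n, n), rho_inter, dtype=float)
    start = 0
    for size in sizes:
        corr[start:start + size, start:start + size] = rho_intra
        start += size
    np.fill_diagonal(corr, 1.0)
    return corr

def sample_returns(corr, T, rng, dof=None):
    """Draw T observations: Gaussian if dof is None, multivariate Student otherwise."""
    n = corr.shape[0]
    z = rng.multivariate_normal(np.zeros(n), corr, size=T)
    if dof is None:
        return z
    # Multivariate Student: one common chi-square scaling per observation.
    w = rng.chisquare(dof, size=(T, 1)) / dof
    return z / np.sqrt(w)

rng = np.random.default_rng(0)
corr = block_correlation([5, 5, 5], rho_intra=0.7, rho_inter=0.2)  # illustrative values
returns = sample_returns(corr, T=250, rng=rng, dof=3)
prices = returns.cumsum(axis=0)  # random walks built from the correlated increments
```

A nested hierarchy (clusters within clusters) could be obtained by applying the same block-filling step at several levels with decreasing correlations.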
  • 6. Simulations in the simplified setting. Some influential parameters: clustering algorithm; number of observations T; number of variables N relative to T; contrast between the correlations, and their values; correlation estimator (e.g. Pearson, Spearman). [Figure omitted: three panels, "Empirical rates of convergence" for Single Linkage, Average Linkage and Ward; x-axis: sample size (100 to 500); y-axis: score (0.0 to 1.0); curves: Gaussian - Pearson, Gaussian - Spearman, Student - Pearson, Student - Spearman. Caption: ratio of the number of correct clusterings obtained over the number of trials, as a function of T.]
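The experiment described on this slide can be approximated with SciPy's hierarchical clustering. The sketch below is an assumption-laden stand-in for the authors' setup: it uses a flat three-cluster Gaussian model, the Pearson estimator, the distance d = sqrt(2(1 - ρ)), and a "correct" outcome defined as recovering the true partition exactly. The helper names `same_partition` and `success_rate` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def same_partition(a, b):
    """True if two label vectors define the same partition (labels may be permuted)."""
    a, b = np.asarray(a), np.asarray(b)
    return np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :])

def success_rate(corr, truth, T, method, n_trials, rng):
    """Fraction of trials in which clustering the estimated correlations recovers truth."""
    k = len(set(truth))
    hits = 0
    for _ in range(n_trials):
        x = rng.multivariate_normal(np.zeros(len(truth)), corr, size=T)
        est = np.corrcoef(x, rowvar=False)                       # Pearson estimates
        dist = np.sqrt(np.clip(2.0 * (1.0 - est), 0.0, None))    # assumed distance
        np.fill_diagonal(dist, 0.0)
        Z = linkage(squareform(dist, checks=False), method=method)
        labels = fcluster(Z, t=k, criterion="maxclust")
        hits += int(same_partition(labels, truth))
    return hits / n_trials

truth = np.repeat([0, 1, 2], 5)                                  # 3 clusters of 5 variables
corr = np.where(truth[:, None] == truth[None, :], 0.7, 0.2)      # illustrative contrast
np.fill_diagonal(corr, 1.0)

rng = np.random.default_rng(42)
rates = {T: success_rate(corr, truth, T, "average", 20, rng) for T in (20, 100, 500)}
```

Passing `method="single"` or `method="ward"` gives the analogue of the other two panels; swapping `np.corrcoef` for a rank-based estimator gives the Spearman curves.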
  • 7. A consistency proof & first convergence bounds. A 2-step proof. First step: we consider Hierarchical Agglomerative Clustering algorithms, classified as space-contracting vs. space-conserving vs. space-dilating [1]. Space-contracting: $D^{(t+1)}(C_i^{(t)} \cup C_j^{(t)}, C_k^{(t)}) \le \min(D_{ik}^{(t)}, D_{jk}^{(t)})$. Space-conserving: $D^{(t+1)}(C_i^{(t)} \cup C_j^{(t)}, C_k^{(t)}) \in [\min(D_{ik}^{(t)}, D_{jk}^{(t)}), \max(D_{ik}^{(t)}, D_{jk}^{(t)})]$. Space-dilating: $D^{(t+1)}(C_i^{(t)} \cup C_j^{(t)}, C_k^{(t)}) \ge \max(D_{ik}^{(t)}, D_{jk}^{(t)})$.
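To make the three regimes concrete, the sketch below evaluates one merge step with the standard Lance-Williams update rules for single, complete, average and Ward linkage, and checks whether the updated distance stays inside [min, max]. The function names and the numeric configuration are illustrative assumptions; the Ward result only shows that this particular configuration leaves the interval, not a general classification.

```python
import math

def update_distance(method, d_ik, d_jk, d_ij, n_i, n_j, n_k):
    """Distance between the merged cluster (i U j) and cluster k after one merge."""
    if method == "single":
        return min(d_ik, d_jk)
    if method == "complete":
        return max(d_ik, d_jk)
    if method == "average":
        return (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
    if method == "ward":
        total = n_i + n_j + n_k
        squared = ((n_i + n_k) * d_ik**2 + (n_j + n_k) * d_jk**2 - n_k * d_ij**2) / total
        return math.sqrt(squared)
    raise ValueError(f"unknown method: {method}")

def stays_in_range(new, d_ik, d_jk):
    """Space-conserving check for a single step: min <= new distance <= max."""
    return min(d_ik, d_jk) <= new <= max(d_ik, d_jk)

# Three singleton clusters; i and j are close to each other, k is far from both.
d_ik, d_jk, d_ij = 1.0, 1.1, 0.1
results = {
    m: update_distance(m, d_ik, d_jk, d_ij, 1, 1, 1)
    for m in ("single", "complete", "average", "ward")
}
conserving = {m: stays_in_range(v, d_ik, d_jk) for m, v in results.items()}
```

Here single, complete and average linkage remain within the interval, while the Ward update lands above max(d_ik, d_jk).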
  • 8. A consistency proof & first convergence bounds. A 2-step proof. First step: which geometrical configurations lead to the true clustering? For space-conserving algorithms (e.g. Single, Complete, Average Linkage), a sufficient separability condition reads $\max D_{\text{intra}} := \max_{1 \le i,j \le N,\ C(i) = C(j)} d(X_i, X_j) < \min_{1 \le i,j \le N,\ C(i) \ne C(j)} d(X_i, X_j) =: \min D_{\text{inter}}$.
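The separability condition is easy to test on a given distance matrix. A minimal sketch, assuming a symmetric NumPy distance matrix and an integer label vector (the helper name `separability_gap` and the toy points are hypothetical):

```python
import numpy as np

def separability_gap(dist, labels):
    """Return min D_inter - max D_intra; the condition holds when this is positive."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]   # True where C(i) = C(j), diagonal included
    max_intra = dist[same].max()
    min_inter = dist[~same].min()
    return min_inter - max_intra

# Five points on a line forming two well-separated groups.
points = np.array([0.0, 0.1, 0.2, 5.0, 5.1])
dist = np.abs(points[:, None] - points[None, :])

gap_true = separability_gap(dist, [0, 0, 0, 1, 1])    # labels matching the geometry
gap_wrong = separability_gap(dist, [0, 1, 0, 1, 1])   # labels mixing the two groups
```

With the matching labels the gap is positive (the condition holds); with the mixed labels it is negative.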
  • 9. A consistency proof & first convergence bounds. A 2-step proof. Second step: how long does it take for the estimates of the correlation coefficients to be precise enough that, with high probability, we are in a good configuration for the clustering algorithm? Answer: concentration inequalities for correlation coefficients.
  • 10. Convergence bounds. Combining both steps, we get the following convergence rate. Convergence rate: the probability of the clustering algorithm making an error is $O\left(\frac{\log N}{T}\right)$.
  • 11. Proof. Step 1 - A bit more detail. By induction. Assume the separability condition is satisfied at step t; then $\min D^{(t)}_{\text{intra}} \le \max D^{(t)}_{\text{intra}} < \min D^{(t)}_{\text{inter}} \le \max D^{(t)}_{\text{inter}}$. From the space-conserving property, we get $D^{(t+1)}_{\text{intra}} \in [\min D^{(t)}_{\text{intra}}, \max D^{(t)}_{\text{intra}}]$ and $D^{(t+1)}_{\text{inter}} \in [\min D^{(t)}_{\text{inter}}, \max D^{(t)}_{\text{inter}}]$. Therefore the separability condition is satisfied at step t + 1, and the clustering algorithm has not linked points from two different clusters between step t and step t + 1.
  • 12. Proof. Step 2 - A bit more detail. Maximum statistical error: for space-conserving algorithms, the separability condition is met if $\|\hat{\Sigma} - \Sigma\|_\infty < \frac{\min_{\rho_i, \rho_j} |\rho_i - \rho_j|}{2}$, where $C(i) \ne C(j)$. This means that the statistical error has to be below half the minimum correlation 'contrast' between the clusters. The weaker the 'contrast', the more precise the correlation estimates have to be. N.B. From the Cramér–Rao lower bound, we get for the Pearson correlation estimator: $\operatorname{var}(\hat{\rho}) \ge \frac{(1 - \rho^2)^2}{1 + \rho^2}$. When the correlation is high, it is easier to estimate.
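The factor 1/2 in the condition above can be justified in one line. The notation here is introduced for illustration only: $\delta$ denotes the minimum contrast, $\rho_{\text{intra}}$ a true within-cluster correlation and $\rho_{\text{inter}}$ a true cross-cluster correlation, assuming (as in the block model) that within-cluster correlations exceed cross-cluster ones by at least $\delta$.

```latex
% Assumption: every entry of the estimate is within delta/2 of its true value.
\[
  \|\hat{\Sigma} - \Sigma\|_\infty < \frac{\delta}{2}
  \quad\text{and}\quad
  \rho_{\text{intra}} - \rho_{\text{inter}} \ge \delta
\]
% Then estimated within-cluster correlations stay above estimated cross-cluster ones:
\[
  \hat{\rho}_{\text{intra}}
  > \rho_{\text{intra}} - \frac{\delta}{2}
  \ge \rho_{\text{inter}} + \frac{\delta}{2}
  > \hat{\rho}_{\text{inter}} .
\]
```

For any distance that decreases with correlation, this ordering gives estimated within-cluster distances smaller than estimated cross-cluster distances, which is the separability condition.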
  • 13. Correlation estimates concentration bounds. Notation: number of variables N, number of observations T, minimum separation d. Concentration bounds [3]: if $\Sigma$ and $\hat{\Sigma}$ are the population and empirical Spearman correlation matrices respectively, then for $N \ge \frac{24}{\log T} + 2$, we have with probability at least $1 - \frac{1}{T^2}$, $\|\hat{\Sigma} - \Sigma\|_\infty \le 24 \sqrt{\frac{\log N}{T}}$. Hence $P(\text{“correct clustering”}) \ge 1 - 2N^2 e^{-Td^2/24}$. Not sharp enough! (for reasonable values of N, T, d)
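A quick numerical evaluation shows why the bound is called not sharp enough. The sketch below simply plugs values into the lower bound stated on the slide, 1 - 2N² exp(-T d² / 24), and inverts it for T; the function names and the example values of N, T, d and the target error are illustrative assumptions.

```python
import math

def correct_clustering_lower_bound(N, T, d):
    """Lower bound on P(correct clustering): 1 - 2 N^2 exp(-T d^2 / 24)."""
    return 1.0 - 2.0 * N**2 * math.exp(-T * d**2 / 24.0)

def min_sample_size(N, d, delta):
    """Smallest integer T for which the lower bound is at least 1 - delta."""
    return math.ceil(24.0 * math.log(2.0 * N**2 / delta) / d**2)

N, d = 100, 0.1                                       # 100 series, minimum separation 0.1
one_year = correct_clustering_lower_bound(N, 250, d)  # about one year of daily data
T_needed = min_sample_size(N, d, delta=0.05)
```

With these values the one-year bound is hugely negative, so it says nothing, and the sample size needed for a 95% guarantee is in the tens of thousands of observations, far more than any realistic daily history.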
  • 14. Future developments. Bounds are not sharp enough. We can try to refine them using: (theoretical) the intrinsic dimension of the HCBM model [5]; (empirical) a distance between dendrograms (instead of correct/incorrect) for a finer analysis; (empirical) a study of 'correctness' isoquants. Precise convergence rates of clustering methodologies can provide a useful model selection criterion for practitioners!
  • 15. References. [1] Zhenmin Chen and John W. Van Ness. Space-conserving agglomerative algorithms. Journal of Classification, 13(1):157–168, 1996. [2] Laurent Laloux, Pierre Cizeau, Marc Potters, and Jean-Philippe Bouchaud. Random matrix theory and financial correlations. International Journal of Theoretical and Applied Finance, 3(03):391–397, 2000. [3] Han Liu, Fang Han, Ming Yuan, John Lafferty, Larry Wasserman, et al. High-dimensional semiparametric Gaussian copula graphical models. The Annals of Statistics, 40(4):2293–2326, 2012. [4] Rosario N. Mantegna. Hierarchical structure in financial markets. The European Physical Journal B - Condensed Matter and Complex Systems, 11(1):193–197, 1999.
  • 16. References (continued). [5] Joel A. Tropp. An introduction to matrix concentration inequalities. arXiv preprint arXiv:1501.01571, 2015.