On the stability of clustering financial time series

Gautier Marti, AI Quant at Shell Street Labs
Introduction to financial time series clustering
Empirical results from the clustering stability study
Conclusion
On the Stability of Clustering Financial Time Series: How to investigate?
IEEE ICMLA Miami, Florida, USA, December 9-11, 2015
Gautier Marti, Philippe Very, Philippe Donnat, Frank Nielsen
9 December 2015
Gautier Marti, Philippe Donnat On the Stability of Clustering Financial Time Series
1 Introduction to financial time series clustering
2 Empirical results from the clustering stability study
3 Conclusion
Financial time series (data from www.datagrapple.com)
Clustering?
Definition
Clustering is the task of grouping a set of objects in such a way
that objects in the same group (cluster) are more similar to each
other than those in different groups.
[Figure: French banks (blue) and building materials (red) CDS over 2006-2015]
Why clustering?
Mathematical finance: Use of variance-covariance matrices
(e.g., Markowitz, Value-at-Risk)
Stylized fact: empirical variance-covariance matrices estimated on financial time series are very noisy (Random Matrix Theory; "Noise Dressing of Financial Correlation Matrices", Laloux et al., 1999)
[Figure: Marchenko-Pastur distribution vs. empirical eigenvalue distribution of the correlation matrix; x-axis: λ, y-axis: ρ(λ)]
How to filter these variance-covariance matrices?
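The comparison between the Marchenko-Pastur law and an empirical spectrum can be reproduced on synthetic data. The sketch below is only an illustration under assumed parameters (N = 100 series, T = 500 observations, i.i.d. Gaussian returns), not the data behind the slide; `marchenko_pastur_pdf` is a hypothetical helper name.

```python
import numpy as np

def marchenko_pastur_pdf(lam, q, sigma2=1.0):
    """Marchenko-Pastur density for the ratio q = N / T <= 1 and variance sigma2."""
    lam = np.asarray(lam, dtype=float)
    lam_min = sigma2 * (1.0 - np.sqrt(q)) ** 2
    lam_max = sigma2 * (1.0 + np.sqrt(q)) ** 2
    pdf = np.zeros_like(lam)
    inside = (lam > lam_min) & (lam < lam_max)
    pdf[inside] = np.sqrt((lam_max - lam[inside]) * (lam[inside] - lam_min)) / (
        2.0 * np.pi * sigma2 * q * lam[inside]
    )
    return pdf

# Assumed toy setting: pure-noise returns, so there is no true correlation.
rng = np.random.default_rng(0)
N, T = 100, 500
returns = rng.standard_normal((T, N))
corr = np.corrcoef(returns, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)

# Theoretical support of the noise spectrum for q = N / T.
q = N / T
lam_min, lam_max = (1.0 - np.sqrt(q)) ** 2, (1.0 + np.sqrt(q)) ** 2
share_in_bulk = np.mean((eigenvalues >= lam_min) & (eigenvalues <= lam_max))
print(f"support: [{lam_min:.3f}, {lam_max:.3f}], share of eigenvalues inside: {share_in_bulk:.2f}")
```

On real data, eigenvalues well above `lam_max` are the ones usually read as carrying information, while the bulk is hard to distinguish from noise.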
For filtering, clustering!
Work of Mantegna et al. (1999):
[Figure: three 500 × 500 matrix heatmaps (left, center, right)]
(left) empirical correlation matrix
(center) the same matrix seriated using a hierarchical clustering
(right) correlations filtered using the clustering structure
N.B. Other applications: statistical arbitrage, alternative risk measures
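The three panels can be mimicked on synthetic data. The following sketch assumes a single-linkage hierarchy on the correlation distance sqrt(2(1 − ρ)) and reads the filtered correlations off the cophenetic (ultrametric) distances; the slide does not state the exact procedure used for the figure, so the linkage choice, the toy two-sector data and all variable names are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet, leaves_list
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
T, n_per_cluster = 250, 5
# Two hypothetical sectors, each driven by its own common factor plus noise.
factors = rng.standard_normal((T, 2))
returns = np.repeat(factors, n_per_cluster, axis=1)
returns = returns + 0.7 * rng.standard_normal((T, 2 * n_per_cluster))
# Shuffle the columns so that the block structure is hidden (left panel).
returns = returns[:, rng.permutation(returns.shape[1])]

corr = np.corrcoef(returns, rowvar=False)
dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))  # correlation distance
np.fill_diagonal(dist, 0.0)

Z = linkage(squareform(dist, checks=False), method="single")
order = leaves_list(Z)                   # leaf order of the dendrogram
seriated = corr[np.ix_(order, order)]    # same matrix, reordered (center panel)

coph = squareform(cophenet(Z))           # ultrametric distances from the hierarchy
filtered = 1.0 - coph ** 2 / 2.0         # back to correlations (right panel)
```

With N assets the filtered matrix contains at most N − 1 distinct off-diagonal values, which is where the noise reduction comes from.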
Why stability?
statistical consistency of the clustering method requires assumptions that may not hold in practice: e.g., returns are i.i.d., the underlying copula is elliptical, enough data is available
stability is a weaker property: reproducibility of results across a wide range of slight data perturbations
[Figure: clusters obtained at times t, t + 1, t + 2] Is the difference between successive clusters a “true” signal?
Is the clustering of financial time series stable?
According to [2], clusters are not stable with respect to the clustering algorithm.
However, [2] only considered a squared Euclidean distance, which is not
relevant for clustering assets from their returns (cf. [4]).
Idea: A more relevant distance should increase stability
We investigate the clustering stability resulting from using:
a Euclidean distance
a Pearson correlation distance [3]
a Spearman correlation distance
a distance for comparing two dependent random variables [4]
Some usual distances for clustering financial time series
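Prices: $(P^i_t)_{t \ge 0}$
Log-returns: $S^i_{t+1} = \log P^i_{t+1} - \log P^i_t$, giving the series $(S^i_t)_{t \ge 1}$
Euclidean distance:
$$d(S^i, S^j) = \sqrt{\sum_{t=1}^{T} \left(S^i_t - S^j_t\right)^2}$$
Pearson correl.:
$$\rho(S^i, S^j) = \frac{\sum_{t=1}^{T} \left(S^i_t - \bar{S}^i\right)\left(S^j_t - \bar{S}^j\right)}{\sqrt{\sum_{t=1}^{T} \left(S^i_t - \bar{S}^i\right)^2}\,\sqrt{\sum_{t=1}^{T} \left(S^j_t - \bar{S}^j\right)^2}}$$
Spearman correl.:
$$\rho_S(S^i, S^j) = 1 - \frac{6}{T(T^2 - 1)} \sum_{t=1}^{T} \left(S^i_{(t)} - S^j_{(t)}\right)^2$$
where $\bar{S}^i$ is the sample mean of $S^i$ and $S^i_{(t)}$ is the rank of $S^i_t$ among $S^i_1, \dots, S^i_T$.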
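A minimal NumPy sketch of the quantities on this slide, applied to two simulated price paths. The function names and the random-walk prices are illustrative assumptions; the Euclidean distance is taken with the square root, and the Spearman rank-difference formula is exact only when there are no ties.

```python
import numpy as np
from scipy.stats import rankdata

def log_returns(prices):
    """S_{t+1} = log P_{t+1} - log P_t."""
    return np.diff(np.log(np.asarray(prices, dtype=float)))

def euclidean_distance(s_i, s_j):
    return float(np.sqrt(np.sum((s_i - s_j) ** 2)))

def pearson_correlation(s_i, s_j):
    c_i, c_j = s_i - s_i.mean(), s_j - s_j.mean()
    return float(np.sum(c_i * c_j) / np.sqrt(np.sum(c_i ** 2) * np.sum(c_j ** 2)))

def spearman_correlation(s_i, s_j):
    """Rank-difference formula; assumes no ties in either series."""
    T = len(s_i)
    rank_diff = rankdata(s_i) - rankdata(s_j)
    return float(1.0 - 6.0 * np.sum(rank_diff ** 2) / (T * (T ** 2 - 1)))

# Two hypothetical price series following geometric random walks.
rng = np.random.default_rng(0)
p_i = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal(251)))
p_j = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal(251)))
s_i, s_j = log_returns(p_i), log_returns(p_j)

print(euclidean_distance(s_i, s_j))
print(pearson_correlation(s_i, s_j))
print(spearman_correlation(s_i, s_j))
```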
Generic Non-Parametric Distance [4]
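$$d_\theta^2(X_i, X_j) = \theta \, 3\,\mathbb{E}\left[\left|P_i(X_i) - P_j(X_j)\right|^2\right] + (1 - \theta)\, \frac{1}{2} \int_{\mathbb{R}} \left(\sqrt{\frac{dP_i}{d\lambda}} - \sqrt{\frac{dP_j}{d\lambda}}\right)^2 d\lambda$$
(i) $0 \le d_\theta \le 1$, (ii) for $0 < \theta < 1$, $d_\theta$ is a metric,
(iii) $d_\theta$ is invariant under diffeomorphism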
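One way to make this definition concrete is a plug-in estimate on two samples of equal length. The sketch below is an assumption-laden illustration, not necessarily the estimator of [4]: the values P_i(X_i) are approximated by normalized ranks, the densities by histograms on a shared grid, and the name `d_theta` and the default `bins=50` are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata

def d_theta(x_i, x_j, theta=0.5, bins=50):
    """Plug-in estimate of d_theta for two samples of the same length T."""
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    T = len(x_i)

    # Dependence part: 3 E|P_i(X_i) - P_j(X_j)|^2 with empirical CDF values
    # approximated by ranks divided by T.
    u_i, u_j = rankdata(x_i) / T, rankdata(x_j) / T
    d1_sq = 3.0 * np.mean((u_i - u_j) ** 2)

    # Distribution part: squared Hellinger distance between the two
    # histogram estimates, computed on common bin edges.
    edges = np.histogram_bin_edges(np.concatenate([x_i, x_j]), bins=bins)
    h_i = np.histogram(x_i, bins=edges)[0] / T
    h_j = np.histogram(x_j, bins=edges)[0] / len(x_j)
    d0_sq = 0.5 * np.sum((np.sqrt(h_i) - np.sqrt(h_j)) ** 2)

    return float(np.sqrt(theta * d1_sq + (1.0 - theta) * d0_sq))

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
y = rng.standard_normal(1000)
print(d_theta(x, y))                       # independent, same distribution
print(d_theta(x, np.exp(x), theta=1.0))    # rank part ignores monotone transforms
print(d_theta(x, 3.0 * x, theta=0.0))      # distribution part sees the scale change
```

Setting `theta` close to 1 makes the distance behave like a rank-correlation distance, while `theta` close to 0 compares only the marginal distributions.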
Generic Non-Parametric Distance [4]
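$$d_0^2 : \quad \frac{1}{2} \int_{\mathbb{R}} \left(\sqrt{\frac{dP_i}{d\lambda}} - \sqrt{\frac{dP_j}{d\lambda}}\right)^2 d\lambda = \mathrm{Hellinger}^2$$
$$d_1^2 : \quad 3\,\mathbb{E}\left[\left|P_i(X_i) - P_j(X_j)\right|^2\right] = \frac{1 - \rho_S}{2} = 2 - 6 \int_0^1 \!\! \int_0^1 C(u, v)\, du\, dv$$
Remark: If
$$f(x, \theta) = c\left(F_1(x_1; \nu_1), \dots, F_N(x_N; \nu_N); \theta_c\right) \prod_{i=1}^{N} f_i(x_i; \nu_i)$$
then with CML hypothesis
$$ds^2 = ds^2_{\mathrm{copula}} + \sum_{i=1}^{N} ds^2_{\mathrm{margins}}$$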
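As a sanity check on the identity for $d_1^2$, the copula integral can be evaluated in closed form for three textbook copulas. The only ingredient not on the slide is the standard relation $\rho_S = 12 \int_0^1 \int_0^1 C(u, v)\, du\, dv - 3$, from which $(1 - \rho_S)/2 = 2 - 6 \int_0^1 \int_0^1 C(u, v)\, du\, dv$ follows directly.

```latex
\begin{align*}
\text{independence: } C(u,v) &= uv, &
  \int_0^1\!\!\int_0^1 C\,du\,dv &= \tfrac{1}{4}, &
  d_1^2 &= 2 - \tfrac{6}{4} = \tfrac{1}{2}, & \rho_S &= 0, \\
\text{comonotonicity: } C(u,v) &= \min(u,v), &
  \int_0^1\!\!\int_0^1 C\,du\,dv &= \tfrac{1}{3}, &
  d_1^2 &= 2 - 2 = 0, & \rho_S &= 1, \\
\text{countermonotonicity: } C(u,v) &= \max(u+v-1,\,0), &
  \int_0^1\!\!\int_0^1 C\,du\,dv &= \tfrac{1}{6}, &
  d_1^2 &= 2 - 1 = 1, & \rho_S &= -1.
\end{align*}
```

The three cases span the range $0 \le d_1^2 \le 1$: perfectly co-moving variables are at distance 0, perfectly opposed ones at distance 1.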
2 Empirical results from the clustering stability study
Sliding Window
[Figure: PCA stability curve (red) vs. Euclidean clusters stability curve as a function of time, using results from [1] for fair comparison: clusters are more stable]
most basic perturbation: traders face it every day when monitoring their indicators
we do not want to overfit our analysis to this particular stability goal
stability performance ranking: dist. [4], Spearman, Pearson, Euclidean
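A sliding-window stability check can be sketched as follows: cluster each window, then compare the partitions of successive windows. Everything specific here is an assumption made for illustration (synthetic three-factor returns, average linkage on the correlation distance, the plain Rand index as agreement score, the window and step sizes); the slides do not specify these choices.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_returns(returns, n_clusters):
    """Average-linkage clustering on the correlation distance sqrt(2 (1 - rho))."""
    corr = np.corrcoef(returns, rowvar=False)
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

def rand_index(labels_a, labels_b):
    """Fraction of asset pairs on which two partitions agree."""
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)
    return float(np.mean(same_a[iu] == same_b[iu]))

def sliding_window_stability(returns, window, step, n_clusters):
    """Rand index between the partitions of each pair of successive windows."""
    starts = range(0, returns.shape[0] - window + 1, step)
    partitions = [cluster_returns(returns[s:s + window], n_clusters) for s in starts]
    return [rand_index(p, q) for p, q in zip(partitions[:-1], partitions[1:])]

# Hypothetical data: 12 assets in 3 groups, each group sharing one factor.
rng = np.random.default_rng(0)
T = 600
factors = rng.standard_normal((T, 3))
returns = np.repeat(factors, 4, axis=1) + 0.5 * rng.standard_normal((T, 12))

scores = sliding_window_stability(returns, window=250, step=50, n_clusters=3)
print(scores)
```

Swapping the distance inside `cluster_returns` is the natural way to compare how different distances behave under the same perturbation.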
Odd vs. Even
A clustering algorithm applied on two samples describing the same phenomenon should yield the same results.
How to obtain two of these samples?
[Figure: (un)Stability of clusters with L2 distance]
[Figure: Stability of clusters with the proposed distance [4]]
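The slide title suggests one answer: split the observations by the parity of their time index. The sketch below illustrates that protocol on synthetic data; the three-factor returns, the average linkage, the two metrics passed to `pdist` and the helper names are assumptions, and the outcome on this toy data says nothing about real CDS series.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def partition(returns, metric, n_clusters=3):
    """Cluster the columns (assets) of a T x N return matrix."""
    Z = linkage(pdist(returns.T, metric=metric), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

def same_partition(labels_a, labels_b):
    """True if both labelings group the assets identically (up to renaming)."""
    a, b = np.asarray(labels_a), np.asarray(labels_b)
    return bool(np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :]))

# Hypothetical data: 12 assets in 3 groups, each group sharing one factor.
rng = np.random.default_rng(0)
T = 500
factors = rng.standard_normal((T, 3))
returns = np.repeat(factors, 4, axis=1) + 0.5 * rng.standard_normal((T, 12))

# Two samples of the same phenomenon: even-indexed and odd-indexed observations.
sample_even, sample_odd = returns[0::2], returns[1::2]

agreement = {
    metric: same_partition(partition(sample_even, metric), partition(sample_odd, metric))
    for metric in ("euclidean", "correlation")
}
print(agreement)
```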
Economic Regimes
[Figure: AXA 5-year CDS spread over 2006-2015]
[Figure: average of the pairwise correlations; correlation skyrockets during crises]
Is the clustering structure persistent?
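The average pairwise correlation used to spot regimes can be computed on a rolling window. The sketch below uses a hypothetical one-factor model whose factor loading jumps halfway through the sample to imitate a crisis; the window length, loadings and function names are illustrative assumptions.

```python
import numpy as np

def average_pairwise_correlation(returns):
    """Mean of the off-diagonal entries of the correlation matrix."""
    corr = np.corrcoef(returns, rowvar=False)
    iu = np.triu_indices_from(corr, k=1)
    return float(corr[iu].mean())

def rolling_average_correlation(returns, window):
    """Average pairwise correlation over each window of consecutive observations."""
    T = returns.shape[0]
    return np.array([
        average_pairwise_correlation(returns[t - window:t])
        for t in range(window, T + 1)
    ])

rng = np.random.default_rng(0)
T, N = 400, 10
market = rng.standard_normal((T, 1))
idiosyncratic = rng.standard_normal((T, N))
# Hypothetical "crisis" in the second half: the common factor dominates.
loading = np.where(np.arange(T) < 200, 0.3, 2.0)[:, None]
returns = loading * market + idiosyncratic

series = rolling_average_correlation(returns, window=100)
print(series[0], series[-1])
```

Thresholding such a series is one simple way to label calm and stressed periods before clustering each regime separately.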
Economic Regimes Clustering Stability
Pearson (top left), Spearman (top right),
Euclidean (bottom left), corr+distr (bottom right)
Heart vs. Tails Clustering Stability
≈ orange+red vs. green+yellow periods
Pearson (top left), Spearman (top right),
Euclidean (bottom left), corr+distr (bottom right)
Multiscale
Is the clustering structure persistent across different sampling frequencies?
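Changing the sampling frequency of log-returns amounts to summing consecutive returns, since a k-period log-return is the sum of k one-period log-returns. A minimal sketch, with a hypothetical helper name and assumed horizons of 5 and 21 days standing in for weekly and monthly sampling:

```python
import numpy as np

def aggregate_log_returns(log_returns, k):
    """Sum k consecutive log-returns per asset (rows are dates, columns are assets).

    The incomplete trailing block, if any, is dropped.
    """
    log_returns = np.asarray(log_returns, dtype=float)
    T = (log_returns.shape[0] // k) * k
    return log_returns[:T].reshape(T // k, k, -1).sum(axis=1)

# Hypothetical daily log-returns for 3 assets over 252 days.
rng = np.random.default_rng(0)
daily = 0.01 * rng.standard_normal((252, 3))

weekly = aggregate_log_returns(daily, 5)     # shape (50, 3)
monthly = aggregate_log_returns(daily, 21)   # shape (12, 3)
print(daily.shape, weekly.shape, monthly.shape)
```

Clustering `daily`, `weekly` and `monthly` with the same distance and comparing the resulting partitions gives one multiscale perturbation.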
Multiscale Clustering Stability
Pearson (top left), Spearman (top right),
Euclidean (bottom left), corr+distr (bottom right)
Maturities & Term Structure
An asset is described by several time series whose dynamics are similar:
Nokia Oyj is described here by the cost of insurance against its default
for maturities of {1, 3, 5, 7, 10} years
Maturities & Term Structure Clustering Stability
Pearson (top left), Spearman (top right),
Euclidean (bottom left), corr+distr (bottom right)
3 Conclusion
Discussion and questions?
A given clustering algorithm yields a particular clustering structure, but with a relevant distance this structure can be more stable
The perturbations presented can be readily extended (e.g., using different CDS datasets)
Disclosing stability results is valuable since complex models often perform poorly (their many parameters are somewhat overfitted) and cannot be used by practitioners
The correlation+distribution distance (presented in [4]) may work for your applications (which ones?)
[1] C. Ding and X. He. K-means clustering via principal component analysis. In Proceedings of the Twenty-First International Conference on Machine Learning, page 29. ACM, 2004.
[2] V. Lemieux, P. S. Rahmdel, R. Walker, B. Wong, and M. Flood. Clustering techniques and their effect on portfolio formation and risk analysis. In Proceedings of the International Workshop on Data Science for Macro-Modeling, pages 1–6. ACM, 2014.
[3] R. N. Mantegna and H. E. Stanley. Introduction to Econophysics: Correlations and Complexity in Finance. Cambridge University Press, 1999.
[4] G. Marti, P. Very, and P. Donnat. Toward a generic representation of random variables for machine learning. Pattern Recognition Letters, 2015.
Artificial Intelligence Helps in Drug Designing and Discovery.pptx
Light Pollution for LVIS students par CWBarthlmew
Light Pollution for LVIS studentsLight Pollution for LVIS students
Light Pollution for LVIS students
CWBarthlmew7 vues
PRINCIPLES-OF ASSESSMENT par rbalmagro
PRINCIPLES-OF ASSESSMENTPRINCIPLES-OF ASSESSMENT
PRINCIPLES-OF ASSESSMENT
rbalmagro12 vues

On the stability of clustering financial time series

  • 1. Introduction to financial time series clustering Empirical results from the clustering stability study Conclusion On the Stability of Clustering Financial Time Series – How to investigate? IEEE ICMLA Miami, Florida, USA, December 9-11, 2015 Gautier Marti, Philippe Very, Philippe Donnat, Frank Nielsen 9 December 2015 Gautier Marti, Philippe Donnat On the Stability of Clustering Financial Time Series
  • 2. Outline: 1. Introduction to financial time series clustering; 2. Empirical results from the clustering stability study; 3. Conclusion
  • 3. Financial time series (data from www.datagrapple.com)
  • 4. Clustering? Definition: clustering is the task of grouping a set of objects in such a way that objects in the same group (cluster) are more similar to each other than to those in different groups. [Figure: CDS of French banks (blue) and building materials (red) over 2006-2015]
  • 5. Why clustering? Mathematical finance relies on variance-covariance matrices (e.g., Markowitz, Value-at-Risk). Stylized fact: empirical variance-covariance matrices estimated on financial time series are very noisy (Random Matrix Theory; Noise Dressing of Financial Correlation Matrices, Laloux et al., 1999). [Figure: Marchenko-Pastur distribution vs. empirical eigenvalue distribution of the correlation matrix; x-axis λ, y-axis ρ(λ)] How to filter these variance-covariance matrices?
  • 6. For filtering, clustering! Work of Mantegna et al. (1999). [Figure, three panels: (left) empirical correlation matrix; (center) the same matrix seriated using a hierarchical clustering; (right) correlations filtered using the clustering structure] N.B. Other applications: statistical arbitrage, alternative risk measures.
  • 7. Why stability? Statistical consistency of the clustering method requires assumptions that may not hold in practice, e.g. returns are i.i.d., the underlying copula is elliptical, enough data is available. Stability is a weaker property: reproducibility of results across a wide range of slight data perturbations. Given clusters obtained at times t, t + 1, t + 2: is the difference between successive clusters a "true" signal?
  • 8. Is the clustering of financial time series stable? According to [2], clusters are not stable with respect to the clustering algorithm, but only a squared Euclidean distance was considered, which is not relevant for clustering assets from their returns (cf. [4]). Idea: a more relevant distance should increase stability. We investigate the clustering stability resulting from using: a Euclidean distance; a Pearson correlation distance [3]; a Spearman correlation distance; a distance for comparing two dependent random variables [4].
  • 9. Some usual distances for clustering financial time series. Given prices $(P^i_t)_{t \ge 0}$, the log-returns $(S^i_t)_{t \ge 1}$ are defined by $S^i_{t+1} = \log P^i_{t+1} - \log P^i_t$.
    Euclidean distance: $d(S^i, S^j) = \sqrt{\sum_{t=1}^{T} (S^i_t - S^j_t)^2}$
    Pearson correlation: $\rho(S^i, S^j) = \frac{\sum_{t=1}^{T} (S^i_t - \bar{S}^i)(S^j_t - \bar{S}^j)}{\sqrt{\sum_{t=1}^{T} (S^i_t - \bar{S}^i)^2} \sqrt{\sum_{t=1}^{T} (S^j_t - \bar{S}^j)^2}}$
    Spearman correlation: $\rho_S(S^i, S^j) = 1 - \frac{6}{T(T^2 - 1)} \sum_{t=1}^{T} (S^i_{(t)} - S^j_{(t)})^2$, where $S^i_{(t)}$ denotes the rank of $S^i_t$ among $S^i_1, \ldots, S^i_T$.
  • 10. Generic Non-Parametric Distance [4]
    $d_\theta^2(X_i, X_j) = \theta \, 3 \, \mathbb{E}\left[ |P_i(X_i) - P_j(X_j)|^2 \right] + (1 - \theta) \, \frac{1}{2} \int_{\mathbb{R}} \left( \sqrt{\frac{dP_i}{d\lambda}} - \sqrt{\frac{dP_j}{d\lambda}} \right)^2 d\lambda$
    Properties: (i) $0 \le d_\theta \le 1$; (ii) for $0 < \theta < 1$, $d_\theta$ is a metric; (iii) $d_\theta$ is invariant under diffeomorphism.
  • 11. Generic Non-Parametric Distance [4]
    $d_0^2$: $\frac{1}{2} \int_{\mathbb{R}} \left( \sqrt{\frac{dP_i}{d\lambda}} - \sqrt{\frac{dP_j}{d\lambda}} \right)^2 d\lambda = \mathrm{Hellinger}^2$
    $d_1^2$: $3 \, \mathbb{E}\left[ |P_i(X_i) - P_j(X_j)|^2 \right] = \frac{1 - \rho_S}{2} = 2 - 6 \int_0^1 \int_0^1 C(u, v) \, du \, dv$
    Remark: if $f(x, \theta) = c(F_1(x_1; \nu_1), \ldots, F_N(x_N; \nu_N); \theta_c) \prod_{i=1}^{N} f_i(x_i; \nu_i)$, then under the CML hypothesis $ds^2 = ds^2_{\mathrm{copula}} + \sum_{i=1}^{N} ds^2_{\mathrm{margins}}$.
  • 12. Outline: 1. Introduction to financial time series clustering; 2. Empirical results from the clustering stability study; 3. Conclusion
  • 13. Sliding Window. [Figure: PCA stability curve (red) vs. Euclidean clusters stability curve as a function of time, using results from [1] for fair comparison: clusters are more stable] This is the most basic perturbation: traders face it every day when monitoring their indicators. We do not want to overfit our analysis to this particular stability goal. Stability performance, in the order listed on the slide: distance [4], Spearman, Pearson, Euclidean.
  • 14. Odd vs. Even. A clustering algorithm applied to two samples describing the same phenomenon should yield the same results. How to obtain two of these samples? [Figures: (un)stability of clusters with the L2 distance; stability of clusters with the proposed distance [4]]
  • 15. Economic Regimes. [Figures: AXA 5-year CDS spread over 2006-2015; average of the pairwise correlations] Correlation skyrockets during crises. Is the clustering structure persistent?
  • 16. Economic Regimes: Clustering Stability. [Figure panels: Pearson (top left), Spearman (top right), Euclidean (bottom left), corr+distr (bottom right)]
  • 17. Heart vs. Tails: Clustering Stability. ≈ orange+red vs. green+yellow periods. [Figure panels: Pearson (top left), Spearman (top right), Euclidean (bottom left), corr+distr (bottom right)]
  • 18. Multiscale. Is the clustering structure persistent across different sampling frequencies?
  • 19. Multiscale: Clustering Stability. [Figure panels: Pearson (top left), Spearman (top right), Euclidean (bottom left), corr+distr (bottom right)]
  • 20. Maturities & Term Structure. An asset is described by several time series whose dynamics are similar: Nokia Oyj is described here by the cost of insurance against its default for {1, 3, 5, 7, 10} years.
  • 21. Maturities & Term Structure: Clustering Stability. [Figure panels: Pearson (top left), Spearman (top right), Euclidean (bottom left), corr+distr (bottom right)]
  • 22. Outline: 1. Introduction to financial time series clustering; 2. Empirical results from the clustering stability study; 3. Conclusion
  • 23. Discussion and questions? A given clustering algorithm yields a particular clustering structure, but with a relevant distance it can be more stable. The perturbations presented can be readily extended (e.g. using different CDS datasets). Disclosing stability results is interesting since complex models often perform poorly (the many parameters are somewhat overfitted) and cannot be used by practitioners. The correlation+distribution distance (presented in [4]) may work for your applications (which ones?).
  • 24-25. References
    [1] C. Ding and X. He. K-means clustering via principal component analysis. In Proceedings of the Twenty-First International Conference on Machine Learning, page 29. ACM, 2004.
    [2] V. Lemieux, P. S. Rahmdel, R. Walker, B. Wong, and M. Flood. Clustering techniques and their effect on portfolio formation and risk analysis. In Proceedings of the International Workshop on Data Science for Macro-Modeling, pages 1-6. ACM, 2014.
    [3] R. N. Mantegna and H. E. Stanley. Introduction to Econophysics: Correlations and Complexity in Finance. Cambridge University Press, 1999.
    [4] G. Marti, P. Very, and P. Donnat. Toward a generic representation of random variables for machine learning. Pattern Recognition Letters, 2015.
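
The noise claim on slide 5 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are synthetic i.i.d. Gaussian returns (not the CDS data of the talk), and the function names and the sizes n_assets and n_obs are hypothetical choices. It compares the eigenvalues of an empirical correlation matrix with the Marchenko-Pastur support for unit-variance noise.

```python
import numpy as np

def marchenko_pastur_bounds(n_assets, n_obs):
    """Edges of the Marchenko-Pastur support for unit-variance noise, q = N / T."""
    q = n_assets / n_obs
    return (1.0 - np.sqrt(q)) ** 2, (1.0 + np.sqrt(q)) ** 2

def marchenko_pastur_pdf(lam, n_assets, n_obs):
    """Marchenko-Pastur density rho(lambda); zero outside the support."""
    q = n_assets / n_obs
    lo, hi = marchenko_pastur_bounds(n_assets, n_obs)
    lam = np.asarray(lam, dtype=float)
    pdf = np.zeros_like(lam)
    inside = (lam > lo) & (lam < hi)
    pdf[inside] = np.sqrt((hi - lam[inside]) * (lam[inside] - lo)) / (
        2.0 * np.pi * q * lam[inside]
    )
    return pdf

# Synthetic "returns" with no true correlation at all (assumption for illustration).
n_assets, n_obs = 200, 2000
rng = np.random.default_rng(0)
noise_returns = rng.normal(size=(n_obs, n_assets))

# Eigenvalues of the empirical correlation matrix of pure noise.
correlation = np.corrcoef(noise_returns, rowvar=False)
eigenvalues = np.linalg.eigvalsh(correlation)

lo, hi = marchenko_pastur_bounds(n_assets, n_obs)
print(f"Marchenko-Pastur support: [{lo:.3f}, {hi:.3f}]")
print(f"empirical eigenvalue range: [{eigenvalues.min():.3f}, {eigenvalues.max():.3f}]")
```

Even though the true correlation matrix is the identity here, the empirical eigenvalues spread over the whole Marchenko-Pastur support, which is the sense in which eigenvalues of real data falling inside that support are hard to distinguish from noise.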
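
The distances of slides 9 to 11 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, ranks assume no ties, empirical CDF values are approximated by normalized ranks, and the Hellinger term is estimated from histograms on a shared grid with an arbitrary bin count (the slides do not say which density estimator was used).

```python
import numpy as np

def log_returns(prices):
    """S_{t+1} = log P_{t+1} - log P_t for a (T + 1, N) array of prices."""
    return np.diff(np.log(prices), axis=0)

def ranks(x):
    """Ranks 1..T of a 1-D sample (assumes no ties, as for continuous data)."""
    return np.argsort(np.argsort(x)) + 1

def euclidean(si, sj):
    """Euclidean distance between two return series."""
    return np.sqrt(np.sum((si - sj) ** 2))

def pearson(si, sj):
    """Pearson correlation between two return series."""
    ci, cj = si - si.mean(), sj - sj.mean()
    return np.sum(ci * cj) / np.sqrt(np.sum(ci ** 2) * np.sum(cj ** 2))

def spearman(si, sj):
    """Spearman correlation from the rank-difference formula."""
    T = len(si)
    d = ranks(si) - ranks(sj)
    return 1.0 - 6.0 * np.sum(d ** 2) / (T * (T ** 2 - 1))

def d_theta_sq(xi, xj, theta=0.5, bins=20):
    """Plug-in estimate of the squared distance d_theta^2 of slide 10."""
    T = len(xi)
    # Dependence term d_1^2 = 3 E|P_i(X_i) - P_j(X_j)|^2, with the CDF
    # values approximated by normalized ranks.
    d1_sq = 3.0 * np.mean((ranks(xi) / T - ranks(xj) / T) ** 2)
    # Distribution term d_0^2: squared Hellinger distance between
    # histogram estimates on a common grid (estimator choice is an assumption).
    edges = np.histogram_bin_edges(np.concatenate([xi, xj]), bins=bins)
    p = np.histogram(xi, bins=edges)[0] / T
    q = np.histogram(xj, bins=edges)[0] / T
    d0_sq = 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)
    return theta * d1_sq + (1.0 - theta) * d0_sq

# Usage on synthetic data: two dependent series with different scales.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(size=500)
print(pearson(x, y), spearman(x, y), d_theta_sq(x, y))
```

With theta = 1 the estimate reduces to the rank term, which for large T is close to (1 - ρ_S) / 2, in line with the identity on slide 11.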
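
The "Odd vs. Even" perturbation of slide 14 can be sketched as follows, under several loudly stated assumptions: the two samples are taken to be the odd- and even-indexed observations of the same series, the clustering is average-linkage hierarchical clustering on the correlation distance sqrt(2(1 - ρ)), and agreement is scored with the adjusted Rand index. The slides do not specify the clustering algorithm or the stability score, so these are illustrative choices, and the data are a synthetic factor model rather than CDS spreads.

```python
import numpy as np
from math import comb
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def correlation_distance(returns):
    """Pairwise distance sqrt(2 (1 - rho)) from a (T, N) array of returns."""
    rho = np.corrcoef(returns, rowvar=False)
    dist = np.sqrt(np.clip(2.0 * (1.0 - rho), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    return dist

def cluster(dist, k):
    """Average-linkage hierarchical clustering cut into at most k clusters."""
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=k, criterion="maxclust")

def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same items."""
    _, ai = np.unique(np.asarray(a), return_inverse=True)
    _, bi = np.unique(np.asarray(b), return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1), dtype=int)
    np.add.at(table, (ai, bi), 1)  # contingency table of the two labelings
    sum_cells = sum(comb(int(n), 2) for n in table.ravel())
    sum_rows = sum(comb(int(n), 2) for n in table.sum(axis=1))
    sum_cols = sum(comb(int(n), 2) for n in table.sum(axis=0))
    expected = sum_rows * sum_cols / comb(len(ai), 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

def odd_even_stability(returns, k):
    """Cluster odd- and even-indexed observations separately and compare."""
    labels_even = cluster(correlation_distance(returns[0::2]), k)
    labels_odd = cluster(correlation_distance(returns[1::2]), k)
    return adjusted_rand_index(labels_even, labels_odd)

# Synthetic returns: 3 groups of 10 assets, each group driven by one factor.
rng = np.random.default_rng(0)
n_obs, n_per_cluster, k = 1000, 10, 3
factors = rng.normal(size=(n_obs, k))
returns = np.repeat(factors, n_per_cluster, axis=1) + 0.5 * rng.normal(
    size=(n_obs, k * n_per_cluster)
)
score = odd_even_stability(returns, k)
print(f"odd vs. even adjusted Rand index: {score:.3f}")
```

A score near 1 means the two half-samples yield the same partition. The same helper can be reused for the other perturbations of the talk (sliding windows, regimes, sampling frequencies) by changing how the two subsamples are drawn, and other distance functions can be swapped in for correlation_distance.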