SlideShare une entreprise Scribd logo
1  sur  1
Télécharger pour lire hors ligne
Optimal Copula Transport for Clustering Time Series
Gautier Marti1,2
, Frank Nielsen2
, Philippe Donnat1
1
Hellebore Capital Limited & 2
Ecole Polytechnique
Clustering Time Series
Which Dependence Measure?
For Which Dependence?
Many bivariate dependence measures are avail-
able. Usually, they aim at measuring:
• any deviation from independence,
• any deviation from co/counter-monotonicity.
Motivation: What if
• we aim at specific dependence,
• and try to “ignore” some others?
Dependence to detect (ρij := 1)
Dependence to ignore (ρij := 0)
Problem: A dependence measure powerful
enough to detect y = f(x2
) will also detect
y = g(x), f increasing, g decreasing.
Copulas & Dependence
• Sklar’s Theorem:
F(xi, xj) = Cij(Fi(xi), Fj(xj))
• Cij, the copula, encodes the dependence structure
• Fréchet-Hoeffding bounds:
max{ui + uj − 1, 0} ≤ Cij(ui, uj) ≤ min{ui, uj}
• Bivariate dependence measures:
• deviation from lower and upper bounds
• Spearman’s ρS, Gini’s γ
• deviation from independence uiuj
• Spearman, Copula MMD, Schweizer-Wolff’s σ, Hoeffding’s Φ2
Figure 1: (left) lower-bound copula, (mid) independence copula,
(right) upper-bound copula
Optimal Transport
Wasserstein metrics:
Wp
p (µ, ν) := inf
γ∈Γ(µ,ν) M×M
d(x, y)p
dγ(x, y)
In practice, the distance W1 is estimated on discrete
data by solving the following linear program with
the Hungarian algorithm:
EMD(s1, s2) := min
f 1≤k,l≤n
pk − ql fkl
subject to fkl ≥ 0, 1 ≤ k, l ≤ n,
n
l=1
fkl ≤ wpk
, 1 ≤ k ≤ n,
n
k=1
fkl ≤ wql
, 1 ≤ l ≤ n,
n
k=1
n
l=1
fkl = 1.
It is called the Earth Mover Distance (EMD) in the
CS literature.
A target-oriented dependence
coefficient
• Build the independence copula Cind
• Build the target-dependence copulas {Ck}k
• Compute the empirical copula Cij from xi, xj
TDC(Cij) =
EMD(Cind, Cij)
EMD(Cind, Cij) + mink EMD(Cij, Ck)
Figure 2: Dependence is measured as the relative distance from
independence to the nearest target-dependence
EMD between Copulas
• Probability integral transform of a variable xi:
FT (xk
i ) =
1
T
T
t=1
I(xt
i ≤ xk
i ),
i.e. computing the ranks of the realizations, and
normalizing them into [0,1]
Why the Earth Mover Distance?
Figure 3: Copulas C1, C2, C3 encoding a correlation of
0.5, 0.99, 0.9999 respectively; Which pair of copulas is the
nearest? For Fisher-Rao, Kullback-Leibler, Hellinger and re-
lated divergences: D(C1, C2) ≤ D(C2, C3); EMD(C2, C3) ≤
EMD(C1, C2)
Benchmark: Power of Estimators
Our coefficient can robustly target complex depen-
dence patterns such as the ones displayed in Fig. 4.
• x-axis measures the noise added to the sample
• y-axis measures the frequency the coefficient is
able to discern between the dependent sample
and the independent one
• Basic check: no coefficient can discern between
the “dependent” sample (with no dependence)
and the independent sample.
0.00.40.8
xvals
power.cor[typ,]
xvals
power.cor[typ,]
0.00.40.8
xvals
power.cor[typ,]
xvals
power.cor[typ,]
cor
dCor
MIC
ACE
RDC
TDC
0.00.40.8
xvals
power.cor[typ,]
xvals
power.cor[typ,]
0 20 40 60 80 100
0.00.40.8
xvals
power.cor[typ,]
0 20 40 60 80 100
xvals
power.cor[typ,]
Noise Level
Power
Figure 4: Dependence estimators power as a function of the
noise for several deterministic patterns + noise. Their power is
the percentage of times that they are able to distinguish between
dependent and independent samples.
Clustering of Credit Default Swaps
• We use the two targets from Fig. 2
• Clustering distance: Dij = (1 − TDC(Cij))/2
Figure 5: Impact of different measures on clusters
Conclusion
The methodology presented is
• non-parametric, robust, deterministic.
It has some scalability issues:
• in dimension, non-parametric density estimation;
• in time, EMD is costly to compute.
Approximation schemes or parametric modelling
can alleviate these issues.
Information
• Web: www.datagrapple.com
• Email: gautier.marti@helleborecapital.com

Contenu connexe

Plus de Gautier Marti

My recent attempts at using GANs for simulating realistic stocks returns
My recent attempts at using GANs for simulating realistic stocks returnsMy recent attempts at using GANs for simulating realistic stocks returns
My recent attempts at using GANs for simulating realistic stocks returnsGautier Marti
 
Takeaways from ICML 2019, Long Beach, California
Takeaways from ICML 2019, Long Beach, CaliforniaTakeaways from ICML 2019, Long Beach, California
Takeaways from ICML 2019, Long Beach, CaliforniaGautier Marti
 
A review of two decades of correlations, hierarchies, networks and clustering...
A review of two decades of correlations, hierarchies, networks and clustering...A review of two decades of correlations, hierarchies, networks and clustering...
A review of two decades of correlations, hierarchies, networks and clustering...Gautier Marti
 
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
Autoregressive Convolutional Neural Networks for Asynchronous Time SeriesAutoregressive Convolutional Neural Networks for Asynchronous Time Series
Autoregressive Convolutional Neural Networks for Asynchronous Time SeriesGautier Marti
 
Some contributions to the clustering of financial time series - Applications ...
Some contributions to the clustering of financial time series - Applications ...Some contributions to the clustering of financial time series - Applications ...
Some contributions to the clustering of financial time series - Applications ...Gautier Marti
 
Clustering Financial Time Series using their Correlations and their Distribut...
Clustering Financial Time Series using their Correlations and their Distribut...Clustering Financial Time Series using their Correlations and their Distribut...
Clustering Financial Time Series using their Correlations and their Distribut...Gautier Marti
 
A closer look at correlations
A closer look at correlationsA closer look at correlations
A closer look at correlationsGautier Marti
 
Clustering Financial Time Series: How Long is Enough?
Clustering Financial Time Series: How Long is Enough?Clustering Financial Time Series: How Long is Enough?
Clustering Financial Time Series: How Long is Enough?Gautier Marti
 
Optimal Transport vs. Fisher-Rao distance between Copulas
Optimal Transport vs. Fisher-Rao distance between CopulasOptimal Transport vs. Fisher-Rao distance between Copulas
Optimal Transport vs. Fisher-Rao distance between CopulasGautier Marti
 
On Clustering Financial Time Series - Beyond Correlation
On Clustering Financial Time Series - Beyond CorrelationOn Clustering Financial Time Series - Beyond Correlation
On Clustering Financial Time Series - Beyond CorrelationGautier Marti
 
Optimal Transport between Copulas for Clustering Time Series
Optimal Transport between Copulas for Clustering Time SeriesOptimal Transport between Copulas for Clustering Time Series
Optimal Transport between Copulas for Clustering Time SeriesGautier Marti
 
On the stability of clustering financial time series
On the stability of clustering financial time seriesOn the stability of clustering financial time series
On the stability of clustering financial time seriesGautier Marti
 
Clustering Random Walk Time Series
Clustering Random Walk Time SeriesClustering Random Walk Time Series
Clustering Random Walk Time SeriesGautier Marti
 
On clustering financial time series - A need for distances between dependent ...
On clustering financial time series - A need for distances between dependent ...On clustering financial time series - A need for distances between dependent ...
On clustering financial time series - A need for distances between dependent ...Gautier Marti
 

Plus de Gautier Marti (14)

My recent attempts at using GANs for simulating realistic stocks returns
My recent attempts at using GANs for simulating realistic stocks returnsMy recent attempts at using GANs for simulating realistic stocks returns
My recent attempts at using GANs for simulating realistic stocks returns
 
Takeaways from ICML 2019, Long Beach, California
Takeaways from ICML 2019, Long Beach, CaliforniaTakeaways from ICML 2019, Long Beach, California
Takeaways from ICML 2019, Long Beach, California
 
A review of two decades of correlations, hierarchies, networks and clustering...
A review of two decades of correlations, hierarchies, networks and clustering...A review of two decades of correlations, hierarchies, networks and clustering...
A review of two decades of correlations, hierarchies, networks and clustering...
 
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
Autoregressive Convolutional Neural Networks for Asynchronous Time SeriesAutoregressive Convolutional Neural Networks for Asynchronous Time Series
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
 
Some contributions to the clustering of financial time series - Applications ...
Some contributions to the clustering of financial time series - Applications ...Some contributions to the clustering of financial time series - Applications ...
Some contributions to the clustering of financial time series - Applications ...
 
Clustering Financial Time Series using their Correlations and their Distribut...
Clustering Financial Time Series using their Correlations and their Distribut...Clustering Financial Time Series using their Correlations and their Distribut...
Clustering Financial Time Series using their Correlations and their Distribut...
 
A closer look at correlations
A closer look at correlationsA closer look at correlations
A closer look at correlations
 
Clustering Financial Time Series: How Long is Enough?
Clustering Financial Time Series: How Long is Enough?Clustering Financial Time Series: How Long is Enough?
Clustering Financial Time Series: How Long is Enough?
 
Optimal Transport vs. Fisher-Rao distance between Copulas
Optimal Transport vs. Fisher-Rao distance between CopulasOptimal Transport vs. Fisher-Rao distance between Copulas
Optimal Transport vs. Fisher-Rao distance between Copulas
 
On Clustering Financial Time Series - Beyond Correlation
On Clustering Financial Time Series - Beyond CorrelationOn Clustering Financial Time Series - Beyond Correlation
On Clustering Financial Time Series - Beyond Correlation
 
Optimal Transport between Copulas for Clustering Time Series
Optimal Transport between Copulas for Clustering Time SeriesOptimal Transport between Copulas for Clustering Time Series
Optimal Transport between Copulas for Clustering Time Series
 
On the stability of clustering financial time series
On the stability of clustering financial time seriesOn the stability of clustering financial time series
On the stability of clustering financial time series
 
Clustering Random Walk Time Series
Clustering Random Walk Time SeriesClustering Random Walk Time Series
Clustering Random Walk Time Series
 
On clustering financial time series - A need for distances between dependent ...
On clustering financial time series - A need for distances between dependent ...On clustering financial time series - A need for distances between dependent ...
On clustering financial time series - A need for distances between dependent ...
 

Dernier

call girls in Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in  Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in  Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Tenets of Physiocracy History of Economic
Tenets of Physiocracy History of EconomicTenets of Physiocracy History of Economic
Tenets of Physiocracy History of Economiccinemoviesu
 
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一(办理学位证)加拿大萨省大学毕业证成绩单原版一比一
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一S SDS
 
How Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingHow Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingAggregage
 
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance CompanyInterimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance CompanyTyöeläkeyhtiö Elo
 
fca-bsps-decision-letter-redacted (1).pdf
fca-bsps-decision-letter-redacted (1).pdffca-bsps-decision-letter-redacted (1).pdf
fca-bsps-decision-letter-redacted (1).pdfHenry Tapper
 
原版1:1复刻堪萨斯大学毕业证KU毕业证留信学历认证
原版1:1复刻堪萨斯大学毕业证KU毕业证留信学历认证原版1:1复刻堪萨斯大学毕业证KU毕业证留信学历认证
原版1:1复刻堪萨斯大学毕业证KU毕业证留信学历认证jdkhjh
 
Call Girls In Yusuf Sarai Women Seeking Men 9654467111
Call Girls In Yusuf Sarai Women Seeking Men 9654467111Call Girls In Yusuf Sarai Women Seeking Men 9654467111
Call Girls In Yusuf Sarai Women Seeking Men 9654467111Sapana Sha
 
原版1:1复刻温哥华岛大学毕业证Vancouver毕业证留信学历认证
原版1:1复刻温哥华岛大学毕业证Vancouver毕业证留信学历认证原版1:1复刻温哥华岛大学毕业证Vancouver毕业证留信学历认证
原版1:1复刻温哥华岛大学毕业证Vancouver毕业证留信学历认证rjrjkk
 
Monthly Market Risk Update: April 2024 [SlideShare]
Monthly Market Risk Update: April 2024 [SlideShare]Monthly Market Risk Update: April 2024 [SlideShare]
Monthly Market Risk Update: April 2024 [SlideShare]Commonwealth
 
(中央兰开夏大学毕业证学位证成绩单-案例)
(中央兰开夏大学毕业证学位证成绩单-案例)(中央兰开夏大学毕业证学位证成绩单-案例)
(中央兰开夏大学毕业证学位证成绩单-案例)twfkn8xj
 
Stock Market Brief Deck for "this does not happen often".pdf
Stock Market Brief Deck for "this does not happen often".pdfStock Market Brief Deck for "this does not happen often".pdf
Stock Market Brief Deck for "this does not happen often".pdfMichael Silva
 
Economics, Commerce and Trade Management: An International Journal (ECTIJ)
Economics, Commerce and Trade Management: An International Journal (ECTIJ)Economics, Commerce and Trade Management: An International Journal (ECTIJ)
Economics, Commerce and Trade Management: An International Journal (ECTIJ)ECTIJ
 
Classical Theory of Macroeconomics by Adam Smith
Classical Theory of Macroeconomics by Adam SmithClassical Theory of Macroeconomics by Adam Smith
Classical Theory of Macroeconomics by Adam SmithAdamYassin2
 
NO1 Certified Ilam kala Jadu Specialist Expert In Bahawalpur, Sargodha, Sialk...
NO1 Certified Ilam kala Jadu Specialist Expert In Bahawalpur, Sargodha, Sialk...NO1 Certified Ilam kala Jadu Specialist Expert In Bahawalpur, Sargodha, Sialk...
NO1 Certified Ilam kala Jadu Specialist Expert In Bahawalpur, Sargodha, Sialk...Amil Baba Dawood bangali
 
Call Girls Near Golden Tulip Essential Hotel, New Delhi 9873777170
Call Girls Near Golden Tulip Essential Hotel, New Delhi 9873777170Call Girls Near Golden Tulip Essential Hotel, New Delhi 9873777170
Call Girls Near Golden Tulip Essential Hotel, New Delhi 9873777170Sonam Pathan
 
letter-from-the-chair-to-the-fca-relating-to-british-steel-pensions-scheme-15...
letter-from-the-chair-to-the-fca-relating-to-british-steel-pensions-scheme-15...letter-from-the-chair-to-the-fca-relating-to-british-steel-pensions-scheme-15...
letter-from-the-chair-to-the-fca-relating-to-british-steel-pensions-scheme-15...Henry Tapper
 
AfRESFullPaper22018EmpiricalPerformanceofRealEstateInvestmentTrustsandShareho...
AfRESFullPaper22018EmpiricalPerformanceofRealEstateInvestmentTrustsandShareho...AfRESFullPaper22018EmpiricalPerformanceofRealEstateInvestmentTrustsandShareho...
AfRESFullPaper22018EmpiricalPerformanceofRealEstateInvestmentTrustsandShareho...yordanosyohannes2
 

Dernier (20)

🔝+919953056974 🔝young Delhi Escort service Pusa Road
🔝+919953056974 🔝young Delhi Escort service Pusa Road🔝+919953056974 🔝young Delhi Escort service Pusa Road
🔝+919953056974 🔝young Delhi Escort service Pusa Road
 
call girls in Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in  Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in  Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Tenets of Physiocracy History of Economic
Tenets of Physiocracy History of EconomicTenets of Physiocracy History of Economic
Tenets of Physiocracy History of Economic
 
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一(办理学位证)加拿大萨省大学毕业证成绩单原版一比一
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一
 
How Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingHow Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of Reporting
 
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance CompanyInterimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
 
fca-bsps-decision-letter-redacted (1).pdf
fca-bsps-decision-letter-redacted (1).pdffca-bsps-decision-letter-redacted (1).pdf
fca-bsps-decision-letter-redacted (1).pdf
 
原版1:1复刻堪萨斯大学毕业证KU毕业证留信学历认证
原版1:1复刻堪萨斯大学毕业证KU毕业证留信学历认证原版1:1复刻堪萨斯大学毕业证KU毕业证留信学历认证
原版1:1复刻堪萨斯大学毕业证KU毕业证留信学历认证
 
Call Girls In Yusuf Sarai Women Seeking Men 9654467111
Call Girls In Yusuf Sarai Women Seeking Men 9654467111Call Girls In Yusuf Sarai Women Seeking Men 9654467111
Call Girls In Yusuf Sarai Women Seeking Men 9654467111
 
原版1:1复刻温哥华岛大学毕业证Vancouver毕业证留信学历认证
原版1:1复刻温哥华岛大学毕业证Vancouver毕业证留信学历认证原版1:1复刻温哥华岛大学毕业证Vancouver毕业证留信学历认证
原版1:1复刻温哥华岛大学毕业证Vancouver毕业证留信学历认证
 
Monthly Economic Monitoring of Ukraine No 231, April 2024
Monthly Economic Monitoring of Ukraine No 231, April 2024Monthly Economic Monitoring of Ukraine No 231, April 2024
Monthly Economic Monitoring of Ukraine No 231, April 2024
 
Monthly Market Risk Update: April 2024 [SlideShare]
Monthly Market Risk Update: April 2024 [SlideShare]Monthly Market Risk Update: April 2024 [SlideShare]
Monthly Market Risk Update: April 2024 [SlideShare]
 
(中央兰开夏大学毕业证学位证成绩单-案例)
(中央兰开夏大学毕业证学位证成绩单-案例)(中央兰开夏大学毕业证学位证成绩单-案例)
(中央兰开夏大学毕业证学位证成绩单-案例)
 
Stock Market Brief Deck for "this does not happen often".pdf
Stock Market Brief Deck for "this does not happen often".pdfStock Market Brief Deck for "this does not happen often".pdf
Stock Market Brief Deck for "this does not happen often".pdf
 
Economics, Commerce and Trade Management: An International Journal (ECTIJ)
Economics, Commerce and Trade Management: An International Journal (ECTIJ)Economics, Commerce and Trade Management: An International Journal (ECTIJ)
Economics, Commerce and Trade Management: An International Journal (ECTIJ)
 
Classical Theory of Macroeconomics by Adam Smith
Classical Theory of Macroeconomics by Adam SmithClassical Theory of Macroeconomics by Adam Smith
Classical Theory of Macroeconomics by Adam Smith
 
NO1 Certified Ilam kala Jadu Specialist Expert In Bahawalpur, Sargodha, Sialk...
NO1 Certified Ilam kala Jadu Specialist Expert In Bahawalpur, Sargodha, Sialk...NO1 Certified Ilam kala Jadu Specialist Expert In Bahawalpur, Sargodha, Sialk...
NO1 Certified Ilam kala Jadu Specialist Expert In Bahawalpur, Sargodha, Sialk...
 
Call Girls Near Golden Tulip Essential Hotel, New Delhi 9873777170
Call Girls Near Golden Tulip Essential Hotel, New Delhi 9873777170Call Girls Near Golden Tulip Essential Hotel, New Delhi 9873777170
Call Girls Near Golden Tulip Essential Hotel, New Delhi 9873777170
 
letter-from-the-chair-to-the-fca-relating-to-british-steel-pensions-scheme-15...
letter-from-the-chair-to-the-fca-relating-to-british-steel-pensions-scheme-15...letter-from-the-chair-to-the-fca-relating-to-british-steel-pensions-scheme-15...
letter-from-the-chair-to-the-fca-relating-to-british-steel-pensions-scheme-15...
 
AfRESFullPaper22018EmpiricalPerformanceofRealEstateInvestmentTrustsandShareho...
AfRESFullPaper22018EmpiricalPerformanceofRealEstateInvestmentTrustsandShareho...AfRESFullPaper22018EmpiricalPerformanceofRealEstateInvestmentTrustsandShareho...
AfRESFullPaper22018EmpiricalPerformanceofRealEstateInvestmentTrustsandShareho...
 

Optimal Copula Transport for Clustering Time Series

  • 1. Optimal Copula Transport for Clustering Time Series Gautier Marti1,2 , Frank Nielsen2 , Philippe Donnat1 1 Hellebore Capital Limited & 2 Ecole Polytechnique Clustering Time Series Which Dependence Measure? For Which Dependence? Many bivariate dependence measures are avail- able. Usually, they aim at measuring: • any deviation from independence, • any deviation from co/counter-monotonicity. Motivation: What if • we aim at specific dependence, • and try to “ignore” some others? Dependence to detect (ρij := 1) Dependence to ignore (ρij := 0) Problem: A dependence measure powerful enough to detect y = f(x2 ) will also detect y = g(x), f increasing, g decreasing. Copulas & Dependence • Sklar’s Theorem: F(xi, xj) = Cij(Fi(xi), Fj(xj)) • Cij, the copula, encodes the dependence structure • Fréchet-Hoeffding bounds: max{ui + uj − 1, 0} ≤ Cij(ui, uj) ≤ min{ui, uj} • Bivariate dependence measures: • deviation from lower and upper bounds • Spearman’s ρS, Gini’s γ • deviation from independence uiuj • Spearman, Copula MMD, Schweizer-Wolff’s σ, Hoeffding’s Φ2 Figure 1: (left) lower-bound copula, (mid) independence copula, (right) upper-bound copula Optimal Transport Wasserstein metrics: Wp p (µ, ν) := inf γ∈Γ(µ,ν) M×M d(x, y)p dγ(x, y) In practice, the distance W1 is estimated on discrete data by solving the following linear program with the Hungarian algorithm: EMD(s1, s2) := min f 1≤k,l≤n pk − ql fkl subject to fkl ≥ 0, 1 ≤ k, l ≤ n, n l=1 fkl ≤ wpk , 1 ≤ k ≤ n, n k=1 fkl ≤ wql , 1 ≤ l ≤ n, n k=1 n l=1 fkl = 1. It is called the Earth Mover Distance (EMD) in the CS literature. A target-oriented dependence coefficient • Build the independence copula Cind • Build the target-dependence copulas {Ck}k • Compute the empirical copula Cij from xi, xj TDC(Cij) = EMD(Cind, Cij) EMD(Cind, Cij) + mink EMD(Cij, Ck) Figure 2: Dependence is measured as the relative distance from independence to the nearest target-dependence EMD between Copulas • Probability integral transform of a variable xi: FT (xk i ) = 1 T T t=1 I(xt i ≤ xk i ), i.e. computing the ranks of the realizations, and normalizing them into [0,1] Why the Earth Mover Distance? Figure 3: Copulas C1, C2, C3 encoding a correlation of 0.5, 0.99, 0.9999 respectively; Which pair of copulas is the nearest? For Fisher-Rao, Kullback-Leibler, Hellinger and re- lated divergences: D(C1, C2) ≤ D(C2, C3); EMD(C2, C3) ≤ EMD(C1, C2) Benchmark: Power of Estimators Our coefficient can robustly target complex depen- dence patterns such as the ones displayed in Fig. 4. • x-axis measures the noise added to the sample • y-axis measures the frequency the coefficient is able to discern between the dependent sample and the independent one • Basic check: no coefficient can discern between the “dependent” sample (with no dependence) and the independent sample. 0.00.40.8 xvals power.cor[typ,] xvals power.cor[typ,] 0.00.40.8 xvals power.cor[typ,] xvals power.cor[typ,] cor dCor MIC ACE RDC TDC 0.00.40.8 xvals power.cor[typ,] xvals power.cor[typ,] 0 20 40 60 80 100 0.00.40.8 xvals power.cor[typ,] 0 20 40 60 80 100 xvals power.cor[typ,] Noise Level Power Figure 4: Dependence estimators power as a function of the noise for several deterministic patterns + noise. Their power is the percentage of times that they are able to distinguish between dependent and independent samples. Clustering of Credit Default Swaps • We use the two targets from Fig. 2 • Clustering distance: Dij = (1 − TDC(Cij))/2 Figure 5: Impact of different measures on clusters Conclusion The methodology presented is • non-parametric, robust, deterministic. It has some scalability issues: • in dimension, non-parametric density estimation; • in time, EMD is costly to compute. Approximation schemes or parametric modelling can alleviate these issues. Information • Web: www.datagrapple.com • Email: gautier.marti@helleborecapital.com