A closer look at correlations

You may have read many times that the job of a data scientist is to skim through huge amounts of data in search of correlations between variables of interest, and that one of their worst enemies (besides "correlation does not imply causation") is spurious correlation. But what really is correlation? Are there several types of correlation, some "good", some "bad"? What about their estimation? This talk is a very visual presentation of the notions of correlation and dependence. I will first illustrate how the standard linear correlation (the Pearson coefficient) is estimated, then a more robust alternative, the Spearman coefficient. Building on a geometric understanding of their nature, I will present a generalization that can help data scientists explore, interpret, and measure the dependence (not necessarily linear or comonotonic) between the variables of a given dataset. Financial time series (stocks, credit default swaps, FX rates) and features from the UCI datasets are considered as use cases.
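The contrast between the two coefficients can be sketched numerically. The example below is an illustration, not material from the talk: the synthetic data, the seed, the outlier value, and the helper names `pearson`, `ranks`, and `spearman` are all assumptions. It computes the sample Pearson coefficient directly, computes Spearman as Pearson on ranks, and corrupts a single observation to show how differently the two react.

```python
import numpy as np

def pearson(x, y):
    """Sample Pearson coefficient: centered cross-product over the product of norms."""
    xc, yc = x - np.mean(x), y - np.mean(y)
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def ranks(x):
    """Rank of each value (0 = smallest); assumes no ties, as with continuous data."""
    return np.argsort(np.argsort(x))

def spearman(x, y):
    """Spearman coefficient, computed as Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))

rng = np.random.default_rng(0)          # arbitrary seed, for reproducibility
x = rng.normal(size=200)
y = x + 0.1 * rng.normal(size=200)      # strong linear dependence (illustrative data)

# Replace one observation by an extreme outlier.
x_out, y_out = x.copy(), y.copy()
x_out[0], y_out[0] = 100.0, -100.0

print("clean:   pearson=%.3f spearman=%.3f" % (pearson(x, y), spearman(x, y)))
print("outlier: pearson=%.3f spearman=%.3f" % (pearson(x_out, y_out), spearman(x_out, y_out)))
```

With this setup a single corrupted point is enough to flip the sign of the Pearson estimate, while the rank-based estimate stays close to its clean value, because an outlier can move a rank by at most N - 1 positions.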

Published in: Data & Analytics

A closer look at correlations

  1. A closer look at correlations. Paris Machine Learning Meetup #3, Season 4. G. Marti, S. Andler, F. Nielsen, P. Donnat (Hellebore Capital), November 9, 2016. Sections: Introduction; Standard correlation coefficients; A metric space for copulas; Applications; Conclusion. Footer on every slide: Gautier Marti, A closer look at correlations.
  2. Outline. 1 Introduction. 2 Standard correlation coefficients: Pearson correlation coefficient; Spearman correlation coefficient. 3 A metric space for copulas: On the importance of the normalization; Which metric? (Regularized) Optimal Transport; A customizable dependence coefficient: TFDC. 4 Applications: Explore the correlations with clustering; Query your dataset about correlations with TFDC. 5 Conclusion.
  3. What is correlation? In terms of expectations: $\frac{E[X_i X_j] - E[X_i]E[X_j]}{\sqrt{(E[X_i^2] - E[X_i]^2)(E[X_j^2] - E[X_j]^2)}} \in [-1, 1]$. Estimated from $N$ observations: $\frac{\sum_{k=1}^{N} (x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)}{\sqrt{\sum_{k=1}^{N} (x_{ik} - \bar{x}_i)^2} \sqrt{\sum_{k=1}^{N} (x_{jk} - \bar{x}_j)^2}} \in [-1, 1]$. In Python: import numpy as np; np.corrcoef(x_i, x_j)
  6. Pearson correlation (figure slides 6 to 11)
  12. Pearson correlation with outliers
  14. Spearman correlation: Pearson on ranks (figure slides 14 to 19)
  20. Spearman correlation with outliers
  23. From ranks to empirical copula. Sklar's Theorem [3]: for $(X_i, X_j)$ having continuous marginal cdfs $F_{X_i}, F_{X_j}$, its joint cumulative distribution $F$ is uniquely expressed as $F(X_i, X_j) = C(F_{X_i}(X_i), F_{X_j}(X_j))$, where $C$ is known as the copula of $(X_i, X_j)$.
  24. Minimum, Independence, Maximum copulas. Fréchet–Hoeffding copula bounds: for any copula $C : [0, 1]^2 \to [0, 1]$ and any $(u, v) \in [0, 1]^2$, the following bounds hold: $W(u, v) \le C(u, v) \le M(u, v)$, where $W$ is the copula for counter-monotonic random variables and $M$ is the copula for co-monotonic random variables. [Figure: six panels titled $w(u_i, u_j)$, $W(u_i, u_j)$, $\pi(u_i, u_j)$, $\Pi(u_i, u_j)$, $m(u_i, u_j)$, $M(u_i, u_j)$, each over axes $u_i, u_j \in [0, 1]$.]
  26. A metric space for copulas (figure slides 26 and 27)
  28. Which metric? (Regularized) Optimal Transport. The distance (earth mover's distance, EMD) is the minimum cost of transportation needed to transform one pile of dirt into another, i.e. the amount of dirt moved times the distance over which it is moved. Examples from the figure: $\mathrm{EMD} = |x_1 - x_2|$ and $\mathrm{EMD} = \frac{1}{6}|x_1 - x_3| + \frac{1}{6}|x_2 - x_3|$.
  29. Which metric? (Regularized) Optimal Transport. Its geometry has good properties in general [1], and for copulas [2]. [Figure: copula heatmaps on $[0, 1]^2$, including panels titled "Bregman barycenter copula" and "Wasserstein barycenter copula".]
  30. A metric space for copulas (figure slides 30 to 33)
  35. The Target/Forget Dependence Coefficient (TFDC)
  36. The Target/Forget Dependence Coefficient (TFDC). Now, we can define our bespoke dependence coefficient: build the forget-dependence copulas $\{C^F_l\}_l$; build the target-dependence copulas $\{C^T_k\}_k$; compute the empirical copula $C_{ij}$ from $x_i, x_j$. Then $\mathrm{TFDC}(C_{ij}) = \frac{\min_l D(C^F_l, C_{ij})}{\min_l D(C^F_l, C_{ij}) + \min_k D(C_{ij}, C^T_k)}$.
  37. TFDC Power. [Figure: power (y-axis, 0 to 1) against noise level (x-axis, 0 to 100) for the coefficients cor, dCor, MIC, ACE, MMD, CMMD, RDC, and TFDC.] Figure: Power of several dependence coefficients as a function of the noise level in eight different scenarios. Insets show the noise-free form of each association pattern. The coefficient power was estimated via 500 simulations with sample size 500 each.
  40. Clustering of empirical copulas
  41. Financial correlations: stocks (CAC 40). Figure: Stocks: more mass in the bottom-left corner, i.e. lower tail dependence. Stock prices tend to plummet together.
  42. Financial correlations: credit default swaps. Figure: Credit default swaps: more mass in the top-right corner, i.e. upper tail dependence. The cost of insurance against entities' default tends to soar in stressed markets.
  43. Financial correlations: FX rates. Figure: FX rates: empirical copulas show that the dependence between FX rates is varied. For example, rates may exhibit either strong dependence or independence while being anti-correlated during extreme events.
  44. Associations between features in UCI datasets. Dependence patterns (= clustering centroids) found between features in UCI datasets. [Figure: centroid copulas on $[0, 1]^2$ for the datasets Breast Cancer (wdbc), Libras Movement, Parkinsons, and Gamma Telescope.]
  46. The art of formulating questions about correlations. Encode your dependence hypothesis as a copula, and your query as a "k-NN search".
  48. Summary. Designing data-driven tailored correlation coefficients
  49. Take Home Message
  50. Internships at Hellebore. If you are interested in an internship at Hellebore in applied machine learning for finance (NLP, text classification, information extraction), please contact stage@helleboretech.com; in ML/finance research (copulas, Bayesian inference, clustering, time series analysis), please contact gmarti@helleborecapital.com.
  51. [1] Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems, pages 2292–2300, 2013. [2] Gautier Marti, Sébastien Andler, Frank Nielsen, and Philippe Donnat. Optimal transport vs. Fisher-Rao distance between copulas for clustering multivariate time series. In IEEE Statistical Signal Processing Workshop, SSP 2016, Palma de Mallorca, Spain, June 26-29, 2016, pages 1–5, 2016. [3] A. Sklar. Fonctions de répartition à n dimensions et leurs marges. Université Paris 8, 1959.
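The step from ranks to an empirical copula (slide 23) and the Fréchet–Hoeffding bounds (slide 24) can be checked numerically. The following is a minimal sketch, not the authors' implementation: the 10 x 10 grid, the synthetic data, and the function names `empirical_copula` and `copula_cdf` are assumptions made for illustration. The empirical copula is taken here to be a 2D histogram of normalized ranks, and its cumulative sums are compared with $W(u, v) = \max(u + v - 1, 0)$ and $M(u, v) = \min(u, v)$ on the grid.

```python
import numpy as np

def empirical_copula(x, y, n_bins=10):
    """Histogram of the normalized ranks of (x, y); entries sum to 1."""
    n = len(x)
    u = (np.argsort(np.argsort(x)) + 0.5) / n   # normalized ranks in (0, 1)
    v = (np.argsort(np.argsort(y)) + 0.5) / n
    hist, _, _ = np.histogram2d(u, v, bins=n_bins, range=[[0, 1], [0, 1]])
    return hist / n

def copula_cdf(density):
    """C(u, v) evaluated at the upper edges of the grid cells."""
    return density.cumsum(axis=0).cumsum(axis=1)

n_bins = 10                                    # assumed grid resolution
edges = np.arange(1, n_bins + 1) / n_bins
U, V = np.meshgrid(edges, edges, indexing="ij")
W = np.maximum(U + V - 1.0, 0.0)               # counter-monotonic copula
M = np.minimum(U, V)                           # co-monotonic copula

rng = np.random.default_rng(0)                 # arbitrary seed
x = rng.normal(size=500)
C_noisy = copula_cdf(empirical_copula(x, x + rng.normal(size=500)))
C_como = copula_cdf(empirical_copula(x, np.exp(x)))   # increasing transform of x
C_counter = copula_cdf(empirical_copula(x, -x))       # decreasing transform of x

print(np.all(W - 1e-12 <= C_noisy) and np.all(C_noisy <= M + 1e-12))
```

Because only ranks enter the computation, any increasing transform of `x` (here `np.exp`) yields the co-monotonic copula on the grid, and a decreasing transform yields the counter-monotonic one; this is the normalization that the copula representation provides.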

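The TFDC formula of slide 36 can be sketched end to end under some loudly stated assumptions. The distance $D$ is approximated here by the transport cost of an entropy-regularized plan obtained with plain Sinkhorn iterations, in the spirit of [1]; the regularization strength, iteration count, 10 x 10 grid, squared Euclidean ground cost between bin centers, and the choice of forget and target copulas (independence versus co- and counter-monotonic) are all illustrative choices, not values taken from the talk. The function names are hypothetical.

```python
import numpy as np

def empirical_copula(x, y, n_bins=10):
    """Histogram of the normalized ranks of (x, y); entries sum to 1."""
    n = len(x)
    u = (np.argsort(np.argsort(x)) + 0.5) / n
    v = (np.argsort(np.argsort(y)) + 0.5) / n
    hist, _, _ = np.histogram2d(u, v, bins=n_bins, range=[[0, 1], [0, 1]])
    return hist / n

def sinkhorn_cost(a, b, cost, reg=0.01, n_iter=500):
    """Transport cost of an entropy-regularized plan between histograms a and b.

    reg and n_iter are assumed values; the result is an approximation of D.
    """
    K = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)          # rescale to match the column marginal b
        u = a / (K @ v)            # rescale to match the row marginal a
    plan = u[:, None] * K * v[None, :]
    return float(np.sum(plan * cost))

def tfdc(c_ij, forget, target, cost):
    """TFDC = d_forget / (d_forget + d_target), with minima over each family."""
    d_forget = min(sinkhorn_cost(f.ravel(), c_ij.ravel(), cost) for f in forget)
    d_target = min(sinkhorn_cost(c_ij.ravel(), t.ravel(), cost) for t in target)
    return d_forget / (d_forget + d_target)

n_bins = 10
centers = (np.arange(n_bins) + 0.5) / n_bins
grid = np.array([(p, q) for p in centers for q in centers])   # row-major, matches ravel()
cost = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=-1)

# Assumed hypothesis: forget independence, target perfect monotone dependence.
independence = np.full((n_bins, n_bins), 1.0 / n_bins**2)
comonotone = np.eye(n_bins) / n_bins
countermonotone = np.fliplr(np.eye(n_bins)) / n_bins
forget, target = [independence], [comonotone, countermonotone]

rng = np.random.default_rng(0)     # arbitrary seed
x = rng.normal(size=500)
score_dependent = tfdc(empirical_copula(x, np.exp(x)), forget, target, cost)
score_independent = tfdc(empirical_copula(x, rng.normal(size=500)), forget, target, cost)
print(score_dependent, score_independent)
```

Under these assumptions the coefficient is close to 1 when the empirical copula sits near a target copula and close to 0 when it sits near a forget copula. Swapping in other target copulas is how a different dependence hypothesis would be encoded; a dedicated optimal transport library could replace the hand-written Sinkhorn loop.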