# On clustering financial time series - A need for distances between dependent random variables

Talk at ICMS CIG ISP 2015

1. On clustering financial time series: A need for distances between dependent random variables. Gautier Marti, Frank Nielsen, Philippe Very, Philippe Donnat. 24 September 2015.
2. Outline: (1) Introduction; (2) Dependence and Distribution; (3) Toward an extension to the multivariate case.
3. Motivations: Why clustering? Mathematical finance relies on variance-covariance matrices (e.g., Markowitz, Value-at-Risk). Stylized fact: empirical variance-covariance matrices estimated on financial time series are very noisy (Random Matrix Theory; "Noise Dressing of Financial Correlation Matrices", Laloux et al., 1999). Figure: Marchenko-Pastur distribution vs. eigenvalues of the empirical correlation matrix. How to filter these variance-covariance matrices?
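The comparison in the figure above can be reproduced on synthetic data. The following is a minimal sketch, not the authors' experiment: it assumes pure i.i.d. Gaussian noise in place of real returns, and the sizes `n_assets = 100` and `n_obs = 500` are hypothetical. It computes the eigenvalues of the empirical correlation matrix and checks how many fall inside the Marchenko-Pastur support $[(1 - \sqrt{N/T})^2, (1 + \sqrt{N/T})^2]$.

```python
import numpy as np


def marchenko_pastur_bounds(n_assets, n_obs, sigma2=1.0):
    """Edges of the Marchenko-Pastur support for the ratio q = N / T."""
    q = n_assets / n_obs
    lam_min = sigma2 * (1.0 - np.sqrt(q)) ** 2
    lam_max = sigma2 * (1.0 + np.sqrt(q)) ** 2
    return lam_min, lam_max


rng = np.random.default_rng(0)
n_assets, n_obs = 100, 500  # hypothetical sizes, chosen for illustration

# Pure noise: the true correlation matrix is the identity.
returns = rng.standard_normal((n_obs, n_assets))
corr = np.corrcoef(returns, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)

lam_min, lam_max = marchenko_pastur_bounds(n_assets, n_obs)
inside = np.mean((eigenvalues >= lam_min) & (eigenvalues <= lam_max))

print(f"Marchenko-Pastur support: [{lam_min:.3f}, {lam_max:.3f}]")
print(f"Empirical eigenvalue range: [{eigenvalues.min():.3f}, {eigenvalues.max():.3f}]")
print(f"Fraction of eigenvalues inside the support: {inside:.2f}")
```

Although the true eigenvalues all equal 1 here, the empirical ones spread over the whole support, which is the noise the slide refers to.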
4. Information filtering? Clustering! Mantegna (1999) et al.’s work. Limits: the focus is on $\rho_{ij}$ (Pearson correlation), which is not robust to outliers or heavy tails and could therefore lead to spurious clusters.
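The sensitivity of Pearson correlation to outliers can be seen on a toy example. The sketch below is an illustration under stated assumptions: two independent Gaussian samples, one artificial joint outlier at (50, 50), and Spearman rank correlation chosen here only as one robust point of comparison (the slide does not prescribe it).

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """Ranks 1..n of the values (no tie handling; continuous data assumed)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y = [random.gauss(0.0, 1.0) for _ in range(200)]  # independent of x

# Append a single joint extreme observation to both series.
x_out, y_out = x + [50.0], y + [50.0]

print("Pearson  without / with outlier:", pearson(x, y), pearson(x_out, y_out))
print("Spearman without / with outlier:", spearman(x, y), spearman(x_out, y_out))
```

A single extreme point is enough to make two independent series look strongly Pearson-correlated, while the rank-based measure barely moves.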
5. Modelling. Asset $i$ variations or returns follow a random variable $X_i$. Asset variations or returns are "correlated". i.i.d. observations:

   $$
   \begin{aligned}
   X_1 &: X_1^1, X_1^2, \ldots, X_1^T \\
   X_2 &: X_2^1, X_2^2, \ldots, X_2^T \\
   &\;\;\vdots \\
   X_N &: X_N^1, X_N^2, \ldots, X_N^T
   \end{aligned}
   $$

   Which distances $d(X_i, X_j)$ between dependent random variables?
6. Outline: (1) Introduction; (2) Dependence and Distribution; (3) Toward an extension to the multivariate case.
7. Pitfalls of a basic distance. Let $(X, Y)$ be a bivariate Gaussian vector, with $X \sim \mathcal{N}(\mu_X, \sigma_X^2)$, $Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2)$, and whose correlation is $\rho(X, Y) \in [-1, 1]$. Then

   $$\mathbb{E}[(X - Y)^2] = (\mu_X - \mu_Y)^2 + (\sigma_X - \sigma_Y)^2 + 2\sigma_X\sigma_Y(1 - \rho(X, Y)).$$

   Now, consider the following values for the correlation:

   - $\rho(X, Y) = 0$, so $\mathbb{E}[(X - Y)^2] = (\mu_X - \mu_Y)^2 + \sigma_X^2 + \sigma_Y^2$. Assume $\mu_X = \mu_Y$ and $\sigma_X = \sigma_Y$. For $\sigma_X = \sigma_Y \gg 1$, we obtain $\mathbb{E}[(X - Y)^2] \gg 1$ instead of the distance 0 expected from comparing two equal Gaussians.
   - $\rho(X, Y) = 1$, so $\mathbb{E}[(X - Y)^2] = (\mu_X - \mu_Y)^2 + (\sigma_X - \sigma_Y)^2$.
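The closed form above can be checked by simulation. The sketch below is a Monte Carlo illustration under assumed parameter values (equal means, $\sigma_X = \sigma_Y = 10$); the function names are hypothetical. It draws a bivariate Gaussian pair with the requested correlation and compares the sample mean of $(X - Y)^2$ with the formula.

```python
import math
import random


def expected_sq_diff(mu_x, mu_y, sigma_x, sigma_y, rho):
    """Closed form of E[(X - Y)^2] for a bivariate Gaussian (X, Y)."""
    return (mu_x - mu_y) ** 2 + (sigma_x - sigma_y) ** 2 \
        + 2.0 * sigma_x * sigma_y * (1.0 - rho)


def simulated_sq_diff(mu_x, mu_y, sigma_x, sigma_y, rho, n=200_000, seed=0):
    """Monte Carlo estimate of E[(X - Y)^2] with correlation rho."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mu_x + sigma_x * z1
        # rho * z1 + sqrt(1 - rho^2) * z2 has unit variance and correlation rho with z1.
        y = mu_y + sigma_y * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        total += (x - y) ** 2
    return total / n


# Two identically distributed but uncorrelated Gaussians: the "distance" is large.
print(expected_sq_diff(0.0, 0.0, 10.0, 10.0, rho=0.0),
      simulated_sq_diff(0.0, 0.0, 10.0, 10.0, rho=0.0))

# Perfectly correlated: only the differences in mean and scale remain.
print(expected_sq_diff(0.0, 0.0, 10.0, 10.0, rho=1.0),
      simulated_sq_diff(0.0, 0.0, 10.0, 10.0, rho=1.0))
```

With equal marginals the quantity ranges from 0 to $2\sigma^2$ depending only on $\rho$, so it mixes dependence and distribution information in a way that is hard to interpret.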
8. Pitfalls of a basic distance (Marti, Nielsen, Very, Donnat, ICMLA 2015).
9. The Financial Engineer Bias: Correlation. Correlation patterns are blatant. Mantegna et al. aim at filtering information from the correlation matrix using clustering. $O(N^2)$ (correlation) vs. $O(N)$ (distribution) parameters.
10. Information Geometry and its statistical distances. Original poster: http://www.sonycsl.co.jp/person/nielsen/FrankNielsen-distances-figs.pdf
11. Sklar’s Theorem and the Copula Transform. Theorem (Sklar’s Theorem (1959)): For any random vector $X = (X_1, \ldots, X_N)$ having continuous marginal cdfs $P_i$, $1 \le i \le N$, its joint cumulative distribution $P$ is uniquely expressed as $P(X_1, \ldots, X_N) = C(P_1(X_1), \ldots, P_N(X_N))$, where $C$, the multivariate distribution of uniform marginals, is known as the copula of $X$.
12. Sklar’s Theorem and the Copula Transform. Definition (The Copula Transform): Let $X = (X_1, \ldots, X_N)$ be a random vector with continuous marginal cumulative distribution functions (cdfs) $P_i$, $1 \le i \le N$. The random vector $U = (U_1, \ldots, U_N) := P(X) = (P_1(X_1), \ldots, P_N(X_N))$ is known as the copula transform. The $U_i$, $1 \le i \le N$, are uniformly distributed on $[0, 1]$ (the probability integral transform): for $P_i$ the cdf of $X_i$, we have $x = P_i(P_i^{-1}(x)) = \Pr(X_i \le P_i^{-1}(x)) = \Pr(P_i(X_i) \le x)$, thus $P_i(X_i) \sim \mathcal{U}[0, 1]$.
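In practice the cdfs $P_i$ are unknown and are commonly replaced by empirical cdfs, which amounts to a rank transform. The sketch below assumes that estimator, and the function name `empirical_copula_transform` is hypothetical. It maps each of the $T$ observations of one variable to its rank divided by $T$.

```python
import math


def empirical_copula_transform(sample):
    """Map each observation to rank / T, the empirical cdf at that observation.

    Assumes a continuous marginal, so ties are not handled specially.
    """
    t = len(sample)
    order = sorted(range(t), key=sample.__getitem__)
    u = [0.0] * t
    for rank, index in enumerate(order, start=1):
        u[index] = rank / t
    return u


observations = [0.3, -1.2, 5.0, 0.0]
print(empirical_copula_transform(observations))

# The transform is invariant under strictly increasing maps of the data,
# so the marginal distribution is discarded and only the ordering is kept.
print(empirical_copula_transform([math.exp(v) for v in observations]))
```

Applying this to every variable yields pseudo-observations of the copula, on which dependence can be measured independently of the margins.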
13. Distance Design.

    $$d_\theta^2(X_i, X_j) = \theta \, 3\,\mathbb{E}\left[\,|P_i(X_i) - P_j(X_j)|^2\,\right] + (1 - \theta)\, \frac{1}{2} \int_{\mathbb{R}} \left( \sqrt{\frac{dP_i}{d\lambda}} - \sqrt{\frac{dP_j}{d\lambda}} \right)^2 d\lambda$$
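An empirical version of such a distance can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the expectation is estimated with normalized ranks, the densities are estimated with histograms on a common grid (the `n_bins` parameter is an arbitrary choice), and all function names are hypothetical.

```python
import numpy as np


def rank_transform(x):
    """Normalized ranks in (0, 1], an empirical stand-in for P_i(X_i)."""
    return (np.argsort(np.argsort(x)) + 1) / len(x)


def dependence_term(x, y):
    """Estimate of 3 * E[|P_i(X_i) - P_j(X_j)|^2] from paired observations."""
    u, v = rank_transform(x), rank_transform(y)
    return 3.0 * np.mean((u - v) ** 2)


def distribution_term(x, y, n_bins=50):
    """Half the squared L2 distance between square roots of histogram masses."""
    lo, hi = min(x.min(), y.min()), max(x.max(), y.max())
    edges = np.linspace(lo, hi, n_bins + 1)
    p = np.histogram(x, bins=edges)[0] / len(x)
    q = np.histogram(y, bins=edges)[0] / len(y)
    return 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)


def d_theta_squared(x, y, theta=0.5, n_bins=50):
    """Convex combination of the dependence and distribution terms."""
    return theta * dependence_term(x, y) \
        + (1.0 - theta) * distribution_term(x, y, n_bins)


rng = np.random.default_rng(0)
z = rng.standard_normal(2000)
w = rng.standard_normal(2000)  # independent of z, same distribution

print("same variable:            ", d_theta_squared(z, z))
print("comonotone, rescaled:     ", dependence_term(z, 10 * z), distribution_term(z, 10 * z))
print("independent, same margins:", dependence_term(z, w), distribution_term(z, w))
```

The three cases separate the two sources of information: rescaling changes only the distribution term, while independence with identical margins changes only the dependence term.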
14. Results: Data from Hierarchical Block Model. Adjusted Rand Index:

    | Algo. | Distance | A | B | C |
    |---|---|---|---|---|
    | HC-AL | $(1 - \rho)/2$ | 0.00 ±0.01 | 0.99 ±0.01 | 0.56 ±0.01 |
    | HC-AL | $\mathbb{E}[(X - Y)^2]$ | 0.00 ±0.00 | 0.09 ±0.12 | 0.55 ±0.05 |
    | HC-AL | GPR $\theta = 0$ | 0.34 ±0.01 | 0.01 ±0.01 | 0.06 ±0.02 |
    | HC-AL | GPR $\theta = 1$ | 0.00 ±0.01 | 0.99 ±0.01 | 0.56 ±0.01 |
    | HC-AL | GPR $\theta = .5$ | 0.34 ±0.01 | 0.59 ±0.12 | 0.57 ±0.01 |
    | HC-AL | GNPR $\theta = 0$ | 1 | 0.00 ±0.00 | 0.17 ±0.00 |
    | HC-AL | GNPR $\theta = 1$ | 0.00 ±0.00 | 1 | 0.57 ±0.00 |
    | HC-AL | GNPR $\theta = .5$ | 0.99 ±0.01 | 0.25 ±0.20 | 0.95 ±0.08 |
    | AP | $(1 - \rho)/2$ | 0.00 ±0.00 | 0.99 ±0.07 | 0.48 ±0.02 |
    | AP | $\mathbb{E}[(X - Y)^2]$ | 0.14 ±0.03 | 0.94 ±0.02 | 0.59 ±0.00 |
    | AP | GPR $\theta = 0$ | 0.25 ±0.08 | 0.01 ±0.01 | 0.05 ±0.02 |
    | AP | GPR $\theta = 1$ | 0.00 ±0.01 | 0.99 ±0.01 | 0.48 ±0.02 |
    | AP | GPR $\theta = .5$ | 0.06 ±0.00 | 0.80 ±0.10 | 0.52 ±0.02 |
    | AP | GNPR $\theta = 0$ | 1 | 0.00 ±0.00 | 0.18 ±0.01 |
    | AP | GNPR $\theta = 1$ | 0.00 ±0.01 | 1 | 0.59 ±0.00 |
    | AP | GNPR $\theta = .5$ | 0.39 ±0.02 | 0.39 ±0.11 | 1 |
15. Results: Data from CDS market (Marti, Nielsen, Very, Donnat, ICMLA 2015).
16. Limits and questions. Why a convex combination? There is no a priori support from geometry. In practice: there is no real control on the weight of correlation and on the weight of distribution; stability methods are still prone to overfitting when selecting parameters; $\theta$ actually depends on the convergence rate of the estimators, since correlation measures converge faster than distribution estimation.
17. Outline: (1) Introduction; (2) Dependence and Distribution; (3) Toward an extension to the multivariate case.
18. Overview.
19. Multivariate dependence. What is the state of the art on multivariate dependence? Multivariate mutual information: In information theory there have been various attempts over the years to extend the definition of mutual information to more than two random variables. These attempts have met with a great deal of confusion and a realization that interactions among many random variables are poorly understood.
20. Optimal Copula Transport for intra-dependence. $D_{\mathrm{intra}}(X_1, X_2) := \mathrm{EMD}(s_1, s_2)$, with

    $$\mathrm{EMD}(s_1, s_2) := \min_{f} \sum_{1 \le i, j \le n} \|p_i - q_j\| \, f_{ij}$$

    subject to

    $$
    \begin{aligned}
    f_{ij} &\ge 0, & 1 \le i, j \le n, \\
    \sum_{j=1}^{n} f_{ij} &\le w_{p_i}, & 1 \le i \le n, \\
    \sum_{i=1}^{n} f_{ij} &\le w_{q_j}, & 1 \le j \le n, \\
    \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij} &= 1.
    \end{aligned}
    $$
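The linear program above can be solved directly with a generic LP solver. The sketch below is a small illustration, assuming each signature is given as a list of support points with weights summing to 1; the function name `emd` and the toy signatures are hypothetical, and `scipy.optimize.linprog` with the HiGHS method is used as the solver. Dedicated optimal transport solvers would be preferable for large $n$.

```python
import numpy as np
from scipy.optimize import linprog


def emd(points_p, weights_p, points_q, weights_q):
    """Earth Mover's Distance between two signatures via linear programming."""
    p = np.asarray(points_p, dtype=float)
    q = np.asarray(points_q, dtype=float)
    n, m = len(p), len(q)

    # Ground distances ||p_i - q_j||; flows f_ij are flattened row by row.
    cost = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2).ravel()

    # Row sums <= w_p (first n rows) and column sums <= w_q (last m rows).
    a_ub = np.zeros((n + m, n * m))
    for i in range(n):
        a_ub[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        a_ub[n + j, j::m] = 1.0
    b_ub = np.concatenate([weights_p, weights_q])

    # Total flow equals 1.
    a_eq = np.ones((1, n * m))
    b_eq = [1.0]

    result = linprog(cost, A_ub=a_ub, b_ub=b_ub, A_eq=a_eq, b_eq=b_eq,
                     bounds=(0, None), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.fun


# Two toy signatures on the unit square: all mass must move up by one unit.
bottom = [[0.0, 0.0], [1.0, 0.0]]
top = [[0.0, 1.0], [1.0, 1.0]]
half = [0.5, 0.5]
print(emd(bottom, half, top, half))
```

The number of flow variables grows as $n^2$, which is one concrete reason the approach becomes costly as the number of bins increases.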
21. Optimal Copula Transport for inter-dependence.
22. Limits and questions. The approach does not scale well with even moderate dimensionality: density estimation, computing cost. Full parametric approach? How to connect with the (copula, margins) representation? Information geometry? (Approximate) optimal transport? Kernel embedding of distributions? Contact: gautier.marti@helleborecapital.com
23. References (slides 23 to 43):
    - Daniel Aloise, Amit Deshpande, Pierre Hansen, and Preyas Popat. NP-hardness of Euclidean sum-of-squares clustering. Machine Learning, 75(2):245–248, 2009.
    - Luigi Ambrosio and Nicola Gigli. A user’s guide to optimal transport. In Modelling and optimisation of flows on networks, pages 1–155. Springer, 2013.
    - David Applegate, Tamraparni Dasu, Shankar Krishnan, and Simon Urbanek. Unsupervised clustering of multidimensional distributions using earth mover distance. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 636–644. ACM, 2011.
    - Shai Ben-David, Ulrike Von Luxburg, and Dávid Pál. A sober look at clustering stability. In Learning theory, pages 5–19. Springer, 2006.
    - Petro Borysov, Jan Hannig, and JS Marron. Asymptotics of hierarchical clustering for growing dimension. Journal of Multivariate Analysis, 124:465–479, 2014.
    - Leo Breiman and Jerome H Friedman. Estimating optimal transformations for multiple regression and correlation. Journal of the American statistical Association, 80(391):580–598, 1985.
    - Joël Bun, Romain Allez, Jean-Philippe Bouchaud, and Marc Potters. Rotational invariant estimator for general noisy matrices. arXiv preprint arXiv:1502.06736, 2015.
    - Gunnar Carlsson and Facundo Mémoli. Characterization, stability and convergence of hierarchical clustering methods. The Journal of Machine Learning Research, 11:1425–1470, 2010.
    - Yanping Chen, Eamonn Keogh, Bing Hu, Nurjahan Begum, Anthony Bagnall, Abdullah Mueen, and Gustavo Batista. The UCR time series classification archive, July 2015. www.cs.ucr.edu/~eamonn/time_series_data/.
    - Tamraparni Dasu, Deborah F Swayne, and David Poole. Grouping multivariate time series: A case study. In Proceedings of the IEEE Workshop on Temporal Data Mining: Algorithms, Theory and Applications, in conjunction with the Conference on Data Mining, Houston, pages 25–32, 2005.
    - Paul Deheuvels. La fonction de dépendance empirique et ses propriétés. Un test non paramétrique d’indépendance. Acad. Roy. Belg. Bull. Cl. Sci.(5), 65(6):274–292, 1979.
    - Paul Deheuvels. An asymptotic decomposition for multivariate distribution-free tests of independence. Journal of Multivariate Analysis, 11(1):102–113, 1981.
    - T Di Matteo, T Aste, ST Hyde, and S Ramsden. Interest rates hierarchical structure. Physica A: Statistical Mechanics and its Applications, 355(1):21–33, 2005.
    - T Di Matteo, Francesca Pozzi, and Tomaso Aste. The use of dynamical networks to detect the hierarchical organization of financial market sectors. The European Physical Journal B-Condensed Matter and Complex Systems, 73(1):3–11, 2010.
    - Francis X Diebold and Canlin Li. Forecasting the term structure of government bond yields. Journal of econometrics, 130(2):337–364, 2006.
    - A Adam Ding and Yi Li. Copula correlation: An equitable dependence measure and extension of pearson’s correlation. arXiv preprint arXiv:1312.7214, 2013.
    - Bradley Efron. Bootstrap methods: another look at the jackknife. The annals of Statistics, pages 1–26, 1979.
    - Gal Elidan. Copulas in machine learning. In Copulae in Mathematical and Quantitative Finance, pages 39–60. Springer, 2013.
    - Sira Ferradans, Nicolas Papadakis, Julien Rabin, Gabriel Peyré, and Jean-François Aujol. Regularized discrete optimal transport. Springer, 2013.
    - Hans Gebelein. Das statistische problem der korrelation als variations-und eigenwertproblem und sein zusammenhang mit der ausgleichsrechnung. ZAMM-Journal of Applied Mathematics and Mechanics/Zeitschrift für Angewandte Mathematik und Mechanik, 21(6):364–379, 1941.
    - Cyril Goutte, Peter Toft, Egill Rostrup, Finn Å Nielsen, and Lars Kai Hansen. On clustering fMRI time series. NeuroImage, 9(3):298–310, 1999.
    - Clive WJ Granger and Paul Newbold. Spurious regressions in econometrics. Journal of econometrics, 2(2):111–120, 1974.
    - Isabelle Guyon, Ulrike Von Luxburg, and Robert C Williamson. Clustering: Science or art. In NIPS 2009 Workshop on Clustering Theory, 2009.
    - Jiang Hangjin and Ding Yiming. Equitability of dependence measure. stat, 1050:9, 2015.
    - Keith Henderson, Brian Gallagher, and Tina Eliassi-Rad. EP-MEANS: An efficient nonparametric clustering of empirical probability distributions. 2015.
    - Weiming Hu, Tieniu Tan, Liang Wang, and Steve Maybank. A survey on visual surveillance of object motion and behaviors. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 34(3):334–352, 2004.
    - John C Hull. Options, futures, and other derivatives. Pearson Education, 2006.
    - Anil K Jain. Data clustering: 50 years beyond k-means. Pattern recognition letters, 31(8):651–666, 2010.
    - Konstantinos Kalpakis, Dhiral Gada, and Vasundhara Puttagunta. Distance measures for effective clustering of ARIMA time-series. In Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on, pages 273–280. IEEE, 2001.
    - M Kanevski, V Timonin, A Pozdnoukhov, and M Maignan. Evolution of interest rate curve: empirical analysis of patterns using nonlinear clustering tools. In European Symposium on Time Series Prediction, 2008.
    - Leonid Vitalievich Kantorovich. On the translocation of masses. In Dokl. Akad. Nauk SSSR, volume 37, pages 199–201, 1942.
    - Justin B Kinney and Gurinder S Atwal. Equitability, mutual information, and the maximal information coefficient. Proceedings of the National Academy of Sciences, 111(9):3354–3359, 2014.
    - Jon M. Kleinberg. An impossibility theorem for clustering. In S. Thrun and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 446–453. MIT Press, Cambridge, MA, 2002. URL http://books.nips.cc/papers/files/nips15/LT17.pdf.
    - Laurent Laloux, Pierre Cizeau, Marc Potters, and Jean-Philippe Bouchaud. Random matrix theory and financial correlations. International Journal of Theoretical and Applied Finance, 3(03):391–397, 2000.
    - Victoria Lemieux, Payam S Rahmdel, Rick Walker, BL Wong, and Mark Flood. Clustering techniques and their effect on portfolio formation and risk analysis. In Proceedings of the International Workshop on Data Science for Macro-Modeling, pages 1–6. ACM, 2014.
    - Erel Levine and Eytan Domany. Resampling method for unsupervised estimation of cluster validity. Neural computation, 13(11):2573–2593, 2001.
    - T Warren Liao. Clustering of time series data - a survey. Pattern recognition, 38(11):1857–1874, 2005.
    - Jessica Lin, Eamonn Keogh, Stefano Lonardi, and Bill Chiu. A symbolic representation of time series, with implications for streaming algorithms. In Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, pages 2–11. ACM, 2003.
    - Jessica Lin, Michail Vlachos, Eamonn Keogh, and Dimitrios Gunopulos. Iterative incremental clustering of time series. In Advances in Database Technology-EDBT 2004, pages 106–122. Springer, 2004.
    - Jessica Lin, Eamonn Keogh, Li Wei, and Stefano Lonardi. Experiencing SAX: a novel symbolic representation of time series. Data Mining and knowledge discovery, 15(2):107–144, 2007.
    - David Lopez-Paz, Philipp Hennig, and Bernhard Schölkopf. The randomized dependence coefficient. arXiv preprint arXiv:1304.7717, 2013.
    - Rosario N Mantegna. Hierarchical structure in financial markets. The European Physical Journal B-Condensed Matter and Complex Systems, 11(1):193–197, 1999.
    - Martin Martens and Ser-Huang Poon. Returns synchronization and daily correlation dynamics between international stock markets. Journal of Banking & Finance, 25(10):1805–1827, 2001.
    - Gautier Marti, Philippe Donnat, Frank Nielsen, and Philippe Very. HCMapper: An interactive visualization tool to compare partition-based flat clustering extracted from pairs of dendrograms. arXiv preprint arXiv:1507.08137, 2015a.
    - Gautier Marti, Philippe Very, and Philippe Donnat. Toward a generic representation of random variables for machine learning. arXiv preprint arXiv:1506.00976, 2015b.
    - Sergio Mayordomo, Juan Ignacio Peña, and Eduardo S Schwartz. Are all credit default swap databases equal? Technical report, National Bureau of Economic Research, 2010.
    - Sergio Mayordomo, Juan Ignacio Peña, and Eduardo S Schwartz. Are all credit default swap databases equal? European Financial Management, 20(4):677–713, 2014.
    - Gaspard Monge. Mémoire sur la théorie des déblais et des remblais. De l’Imprimerie Royale, 1781.
    - James Munkres. Algorithms for the assignment and transportation problems. Journal of the Society for Industrial and Applied Mathematics, 5(1):32–38, 1957.
    - Nicolo Musmeci, Tomaso Aste, and Tiziana Di Matteo. Relation between financial market structure and the real economy: Comparison between clustering methods. Available at SSRN 2525291, 2014.
    - Nicoló Musmeci, Tomaso Aste, and Tiziana Di Matteo. Relation between financial market structure and the real economy: comparison between clustering methods. 2015.
    - Roger B Nelsen. An introduction to copulas, volume 139. Springer Science & Business Media, 2013.
    - Dominic O’Kane. Modelling single-name and multi-name credit derivatives, volume 573. John Wiley & Sons, 2011.
    - Barnabás Póczos, Zoubin Ghahramani, and Jeff Schneider. Copula-based kernel dependency measures. arXiv preprint arXiv:1206.4682, 2012.
    - David N Reshef, Yakir A Reshef, Hilary K Finucane, Sharon R Grossman, Gilean McVean, Peter J Turnbaugh, Eric S Lander, Michael Mitzenmacher, and Pardis C Sabeti. Detecting novel associations in large data sets. science, 334(6062):1518–1524, 2011.
    - David N Reshef, Yakir A Reshef, Pardis C Sabeti, and Michael M Mitzenmacher. An empirical study of leading measures of dependence. arXiv preprint arXiv:1505.02214, 2015a.
    - Yakir A Reshef, David N Reshef, Hilary K Finucane, Pardis C Sabeti, and Michael M Mitzenmacher. Measuring dependence powerfully and equitably. arXiv preprint arXiv:1505.02213, 2015b.
    - Yakir A Reshef, David N Reshef, Pardis C Sabeti, and Michael M Mitzenmacher. Equitability, interval estimation, and statistical power. arXiv preprint arXiv:1505.02212, 2015c.
    - Yossi Rubner, Carlo Tomasi, and Leonidas J Guibas. The earth mover’s distance as a metric for image retrieval. International journal of computer vision, 40(2):99–121, 2000.
    - Daniil Ryabko. Clustering processes. arXiv preprint arXiv:1004.5194, 2010.
    - Ohad Shamir and Naftali Tishby. Cluster stability for finite samples. In NIPS, 2007.
    - Robert H Shumway. Time-frequency clustering and discriminant analysis. Statistics & probability letters, 63(3):307–314, 2003.
    - Noah Simon and Robert Tibshirani. Comment on "detecting novel associations in large data sets" by reshef et al, science dec 16, 2011. arXiv preprint arXiv:1401.7645, 2014.
    - Ashish Singhal and Dale E Seborg. Clustering of multivariate time-series data. Journal of Chemometrics, 19:427–438, 2005.
    - A Sklar. Fonctions de répartition à n dimensions et leurs marges. Université Paris 8, 1959.
    - Won-Min Song, T Di Matteo, and Tomaso Aste. Hierarchical information clustering by means of topologically embedded graphs. PLoS One, 7(3):e31929, 2012.
    - Jimeng Sun, Christos Faloutsos, Spiros Papadimitriou, and Philip S Yu. Graphscope: parameter-free mining of large time-evolving graphs. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 687–696. ACM, 2007.
    - Gábor J Székely, Maria L Rizzo, Nail K Bakirov, et al. Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35(6):2769–2794, 2007.
    - Chayant Tantipathananandh and Tanya Y Berger-Wolf. Finding communities in dynamic social networks. In Data Mining (ICDM), 2011 IEEE 11th International Conference on, pages 1236–1241. IEEE, 2011.
    - Vincenzo Tola, Fabrizio Lillo, Mauro Gallegati, and Rosario N Mantegna. Cluster analysis for portfolio optimization. Journal of Economic Dynamics and Control, 32(1):235–258, 2008.
    - Michele Tumminello, Tomaso Aste, Tiziana Di Matteo, and Rosario N Mantegna. A tool for filtering information in complex systems. Proceedings of the National Academy of Sciences of the United States of America, 102(30):10421–10426, 2005.
    - Michele Tumminello, Fabrizio Lillo, and Rosario N Mantegna. Correlation, hierarchies, and networks in financial markets. Journal of Economic Behavior & Organization, 75(1):40–58, 2010.
    - Cédric Villani. Optimal transport: old and new, volume 338. Springer Science & Business Media, 2008.
    - Kiyoung Yang and Cyrus Shahabi. A pca-based similarity measure for multivariate time series. In Proceedings of the 2nd ACM international workshop on Multimedia databases, pages 65–74. ACM, 2004.
    - Kiyoung Yang and Cyrus Shahabi. On the stationarity of multivariate time series for correlation-based data analysis. In Data Mining, Fifth IEEE International Conference on, pages 4–pp. IEEE, 2005.