Abstract
The "maximum similarity correlation" definition introduced in this study is motivated by the seminal work of Székely et al. on "distance covariance" (Ann. Statist. 2007, 35: 2769-2794; Ann. Appl. Stat. 2009, 3: 1236-1265). Instead of using Euclidean distances "d" as in Székely et al., we use "similarity", which can be defined as "exp(-d/s)", where the scaling parameter s>0 controls how rapidly the similarity falls off with distance. Scale parameters are chosen by maximizing the similarity correlation. The motivation for using "similarity" originates in spectral clustering theory (see e.g. Ng et al. 2001, Advances in Neural Information Processing Systems 14: 849-856). We show that a particular form of similarity correlation is asymptotically equivalent to distance correlation for large values of the scale parameter. Furthermore, we extend similarity correlation to coherence between complex-valued vectors, including its partitioning into real and imaginary contributions. Several toy examples are used to compare distance and similarity correlations. For instance, points on a noiseless straight line give distance and similarity correlation values equal to 1, whereas points on a noiseless circle produce a near-zero distance correlation (dCorr=0.02) but a distinctly nonzero similarity correlation (sCorr=0.36). In contrast to the distance approach, similarity gives more importance to small distances, which emphasizes the local properties of functional relations. This paper represents a preliminary empirical study, showing that the novel similarity association has some distinct practical advantages over distance-based association. For the sake of reproducible research, the software implementing all methods presented here (written in Lazarus Free Pascal, "www.lazarus.freepascal.org"), together with all test data, is freely available at: "sites.google.com/site/pascualmarqui/home/similaritycovariance".
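To make the construction above concrete, the following is a minimal Python sketch, not the authors' Free Pascal implementation. It assumes that similarity correlation is obtained by replacing the pairwise distance matrices in the sample distance correlation of Székely et al. with the similarity matrices exp(-d/s), and that the scale parameters are chosen by a plain grid search. The function names, the grid of candidate scales, and the square-root normalization are illustrative assumptions; the exact estimator used in the paper may differ.

```python
import numpy as np

def pairwise_dist(x):
    """Euclidean distance matrix for n samples (a 1-D input is treated as n scalars)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def double_center(m):
    """Subtract row and column means and add back the grand mean."""
    return m - m.mean(axis=0, keepdims=True) - m.mean(axis=1, keepdims=True) + m.mean()

def centered_corr(a, b):
    """Correlation of two double-centered matrices, with the square root taken
    as in the sample distance correlation (an assumption for the similarity case)."""
    a, b = double_center(a), double_center(b)
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0.0:
        return 0.0  # at least one variable is constant
    return float(np.sqrt(max((a * b).sum(), 0.0) / denom))

def distance_corr(x, y):
    """Sample distance correlation."""
    return centered_corr(pairwise_dist(x), pairwise_dist(y))

def similarity_corr(x, y, s_x, s_y):
    """Same statistic with similarities exp(-d/s) in place of distances."""
    sim_x = np.exp(-pairwise_dist(x) / s_x)
    sim_y = np.exp(-pairwise_dist(y) / s_y)
    return centered_corr(sim_x, sim_y)

def max_similarity_corr(x, y, scales):
    """Maximize over a hypothetical grid of candidate scales for both variables."""
    return max(similarity_corr(x, y, s_x, s_y) for s_x in scales for s_y in scales)

if __name__ == "__main__":
    t = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
    x, y = np.cos(t), np.sin(t)  # points on a noiseless circle
    print("dCorr:", distance_corr(x, y))
    print("max sCorr:", max_similarity_corr(x, y, [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]))
```

Under this assumed form, a very large scale gives exp(-d/s) ≈ 1 - d/s, so the double-centered similarity matrix is approximately a rescaled distance matrix and the value should approach the distance correlation, in line with the asymptotic equivalence stated above. The printed values depend on the sampling and on the scale grid, and need not reproduce the figures quoted in the abstract.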