# Equivalence of distance-based and RKHS-based statistics in hypothesis testing

@article{Sejdinovic2012EquivalenceOD,
title={Equivalence of distance-based and RKHS-based statistics in hypothesis testing},
author={D. Sejdinovic and Bharath K. Sriperumbudur and Arthur Gretton and Kenji Fukumizu},
journal={ArXiv},
year={2012},
volume={abs/1207.6076}
}
We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, maximum mean discrepancies (MMD), that is, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. In the case where the energy distance is computed with a semimetric of negative type, a positive definite…
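The link between the two classes of statistics can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the Euclidean distance (a metric of negative type), an arbitrary center point `z0`, and the distance-induced kernel k(x, y) = ½[ρ(x, z0) + ρ(y, z0) − ρ(x, y)] usually associated with this equivalence. All function names are hypothetical. Under these assumptions, the V-statistic energy distance should equal twice the V-statistic squared MMD.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between all rows of a (n x d) and b (m x d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def energy_distance(x, y):
    """V-statistic estimate of the energy distance between samples x and y."""
    return (2.0 * pairwise_dist(x, y).mean()
            - pairwise_dist(x, x).mean()
            - pairwise_dist(y, y).mean())


def distance_kernel(a, b, z0):
    """Distance-induced kernel centered at z0 (assumed construction)."""
    da = pairwise_dist(a, z0[None, :])  # shape (n, 1)
    db = pairwise_dist(b, z0[None, :])  # shape (m, 1)
    return 0.5 * (da + db.T - pairwise_dist(a, b))


def mmd_squared(x, y, z0):
    """V-statistic estimate of the squared MMD under the distance kernel."""
    return (distance_kernel(x, x, z0).mean()
            + distance_kernel(y, y, z0).mean()
            - 2.0 * distance_kernel(x, y, z0).mean())


# Synthetic samples; sizes, dimension, and shift are arbitrary choices.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
y = rng.normal(loc=0.5, size=(40, 3))
z0 = np.zeros(3)

e_stat = energy_distance(x, y)
mmd2 = mmd_squared(x, y, z0)
print(np.isclose(e_stat, 2.0 * mmd2))
```

The terms involving `z0` cancel in the MMD, so the comparison does not depend on which center point is chosen.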
402 Citations

The Exact Equivalence of Distance and Kernel Methods for Hypothesis Testing
• Computer Science, Mathematics
• ArXiv
• 2018
A new bijective transformation between metrics and kernels is proposed that simplifies the fixed-point transformation, inherits similar theoretical properties, allows distance methods to be exactly the same as kernel methods for sample statistics and p-value, and better preserves the data structure upon transformation.
Adaptivity and Computation-Statistics Tradeoffs for Kernel and Distance based High Dimensional Two Sample Testing
• Mathematics, Computer Science
• ArXiv
• 2015
This paper formally characterizes the power of popular tests for GDA like the Maximum Mean Discrepancy with the Gaussian kernel (gMMD) and bandwidth-dependent variants of the Energy Distance with the Euclidean norm (eED) in the high-dimensional MDA regime.
Distance-based and RKHS-based dependence metrics in high dimension
• Mathematics
• 2019
In this paper, we study distance covariance, Hilbert-Schmidt covariance (aka Hilbert-Schmidt independence criterion [Gretton et al. (2008)]) and related independence tests under the high dimensional…
Comparing distributions: $\ell_1$ geometry improves kernel two-sample testing
• Computer Science, Mathematics
• NeurIPS
• 2019
Experiments on artificial and real-world problems demonstrate a better power/time tradeoff than the state of the art, based on $\ell_2$ norms, and in some cases, better outright power than even the most expensive quadratic-time tests.
On distance covariance in metric and Hilbert spaces
Distance covariance is a measure of dependence between two random variables that take values in two, in general different, metric spaces, see Szekely, Rizzo and Bakirov (2007) and Lyons (2013). It is…
An Adaptive Test of Independence with Analytic Kernel Embeddings
• Mathematics, Computer Science
• ICML
• 2017
A new computationally efficient dependence measure, and an adaptive statistical test of independence, are proposed, which perform comparably to the state-of-the-art quadratic-time HSIC test, and outperform competing O(n) and O(n log n) tests.
A kernel-based measure for conditional mean dependence
• Computer Science, Mathematics
• Comput. Stat. Data Anal.
• 2021
Simulation studies indicate that the tests based on the KCMD have power close to that of the tests based on martingale difference divergence under monotone dependence, but excel in cases of nonlinear relationships or when the moment restriction on X is violated.
The Exact Equivalence of Independence Testing and Two-Sample Testing
• Computer Science, Mathematics
• ArXiv
• 2019
It is shown that two-sample testing is a special case of independence testing via an auxiliary label vector, and it is proved that distance correlation is exactly equivalent to the energy statistic in terms of the population statistic, the sample statistic, and the testing p-value via permutation test.
Optimal Kernel Combination for Test of Independence against Local Alternatives
• Mathematics
• 2014
Testing the independence between two random variables $x$ and $y$ is an important problem in statistics and machine learning, where the kernel-based tests of independence is focused to address the…
Independence test and canonical correlation analysis based on the alignment between kernel matrices for multivariate functional data
• Mathematics, Computer Science
• Artificial Intelligence Review
• 2018
This paper constructs an independence test and nonlinear canonical variables for multivariate functional data and shows that functional variants of the proposed measures give much better results in recognizing nonlinear dependence.

#### References

SHOWING 1-10 OF 41 REFERENCES
Hypothesis testing using pairwise distances and associated kernels
• Computer Science, Mathematics
• ICML
• 2012
It is shown that the energy distance most commonly employed in statistics is just one member of a parametric family of kernels, and that other choices from this family can yield more powerful tests.
A Fast, Consistent Kernel Two-Sample Test
• Computer Science, Mathematics
• NIPS
• 2009
A novel estimate of the null distribution is computed from the eigen-spectrum of the Gram matrix on the aggregate sample from P and Q, and has lower computational cost than the bootstrap.
Kernel Choice and Classifiability for RKHS Embeddings of Probability Distributions
• Computer Science, Mathematics
• NIPS
• 2009
It is established that MMD corresponds to the optimal risk of a kernel classifier, thus forming a natural link between the distance between distributions and their ease of classification, and a generalization of the MMD is proposed for families of kernels.
Optimal kernel choice for large-scale two-sample tests
The new kernel selection approach yields a more powerful test than earlier kernel selection heuristics, and makes the kernel selection and test procedures suited to data streams, where the observations cannot all be stored in memory.
A Kernel Two-Sample Test
• Mathematics, Computer Science
• J. Mach. Learn. Res.
• 2012
This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine if two samples are drawn from different distributions, and presents two distribution-free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
Hilbert Space Embeddings and Metrics on Probability Measures
• Mathematics, Computer Science
• J. Mach. Learn. Res.
• 2010
It is shown that the distance between distributions under $\gamma_k$ results from an interplay between the properties of the kernel and the distributions, by demonstrating that distributions are close in the embedding space when their differences occur at higher frequencies.
Injective Hilbert Space Embeddings of Probability Measures
• Mathematics, Computer Science
• COLT
• 2008
This work considers more broadly the problem of specifying characteristic kernels, defined as kernels for which the RKHS embedding of probability measures is injective, and restricts attention to translation-invariant kernels on Euclidean space.
Measuring Statistical Dependence with Hilbert-Schmidt Norms
• Computer Science, Mathematics
• ALT
• 2005
We propose an independence criterion based on the eigen-spectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm…
On the relation between universality, characteristic kernels and RKHS embedding of measures
• Mathematics, Computer Science
• AISTATS
• 2010
The main contribution of this paper is to clarify the relation between universal and characteristic kernels by presenting a unifying study relating them to RKHS embedding of measures, in addition to clarifying their relation to other common notions of strictly pd, conditionally strictly pd, and integrally strictly pd kernels.
Mixture density estimation via Hilbert space embedding of measures
• Bharath K. Sriperumbudur
• Mathematics, Computer Science
• 2011 IEEE International Symposium on Information Theory Proceedings
• 2011
This paper analyzes the estimation and approximation errors for an M-estimator and shows the estimation error rate to be better than that obtained with KL divergence while achieving the same approximation error rate.