This work proposes a framework for analyzing and comparing distributions and uses it to construct statistical tests of whether two samples are drawn from different distributions. It presents two distribution-free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
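To make the statistic concrete, here is a minimal sketch of an unbiased empirical estimate of the squared MMD. It assumes a Gaussian kernel, and the names `gaussian_kernel`, `mmd2_unbiased`, and the bandwidth `sigma` are illustrative choices rather than anything taken from the work itself. It computes only the statistic, not the test thresholds derived from the large deviation bounds.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd2_unbiased(xs, ys, sigma=1.0):
    """Unbiased estimate of squared MMD between samples xs and ys.

    Within-sample averages skip the diagonal (i == j); the cross term
    averages over all pairs. The estimate can be slightly negative.
    """
    m, n = len(xs), len(ys)
    if m < 2 or n < 2:
        raise ValueError("each sample needs at least two points")
    k_xx = sum(
        gaussian_kernel(xs[i], xs[j], sigma)
        for i in range(m) for j in range(m) if i != j
    ) / (m * (m - 1))
    k_yy = sum(
        gaussian_kernel(ys[i], ys[j], sigma)
        for i in range(n) for j in range(n) if i != j
    ) / (n * (n - 1))
    k_xy = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy


# Hypothetical usage: a grid of points and a copy shifted far away.
sample_a = [(i / 10.0,) for i in range(20)]
sample_b = [(i / 10.0 + 5.0,) for i in range(20)]
print(mmd2_unbiased(sample_a, sample_b))
```

A large value suggests the two samples come from different distributions; a value near zero is what one would expect when they come from the same one. How large is "large" depends on the test threshold, which this sketch does not compute.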

This work proposes two statistical tests of whether two samples are from different distributions and applies them to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where the test performs strongly.

We propose an independence criterion based on the eigen-spectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm…

A nonparametric method is presented that directly produces resampling weights without distribution estimation. It works by matching distributions between training and testing sets in feature space.

This work presents a novel statistical test of whether two samples are from the same distribution. The test is compatible with both multivariate and structured data, is fast and easy to implement, and works well in the reported experiments.

We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a reproducing kernel Hilbert…

This work clarifies the question of bias in GAN loss functions raised by recent work. It shows that the gradient estimators used in the optimization process for both MMD GANs and Wasserstein GANs are unbiased, but that learning a discriminator based on samples leads to biased gradients for the generator parameters.

This work presents a novel test of the independence hypothesis for one particular kernel independence measure, the Hilbert-Schmidt independence criterion (HSIC). The test outperforms established contingency table and functional correlation-based tests, and its advantage is greater for multivariate data.
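As an illustration of the quantity being tested, here is a minimal sketch of a biased empirical HSIC estimate, computed as the trace of K H L H divided by n squared, where K and L are Gram matrices and H is the centering matrix. The Gaussian kernel, the 1/n^2 normalisation (some references use 1/(n-1)^2), and the names `gram`, `center`, and `hsic_biased` are assumptions made for this example. The sketch does not include the test's null distribution or threshold.

```python
import math


def gram(points, sigma=1.0):
    """Gaussian-kernel Gram matrix for a list of points (tuples of floats)."""
    return [
        [
            math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / (2.0 * sigma ** 2))
            for q in points
        ]
        for p in points
    ]


def center(K):
    """Return H K H, where H = I - (1/n) * ones is the centering matrix."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - col_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic_biased(xs, ys, sigma=1.0):
    """Biased HSIC estimate trace(K H L H) / n^2 for paired samples xs, ys."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must be paired and of equal length")
    n = len(xs)
    Kc = center(gram(xs, sigma))
    L = gram(ys, sigma)
    # trace(H K H L) equals the elementwise sum because L is symmetric.
    return sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n)) / n ** 2


# Hypothetical usage: y is a deterministic function of x, so the two are dependent.
xs = [(i / 10.0,) for i in range(20)]
ys = [(x[0] ** 2,) for x in xs]
print(hsic_biased(xs, ys))
```

Values near zero are consistent with independence, while larger values indicate dependence. Deciding significance requires the null distribution, which is the subject of the test itself.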

A simple universal ranking algorithm for data lying in Euclidean space, such as text or image data, which ranks the data with respect to the intrinsic manifold structure collectively revealed by a large amount of data.
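The summary does not spell out the procedure, so the following is only a sketch of one common formulation of ranking on a data manifold: build a Gaussian affinity graph, normalise it symmetrically, and spread a query's score over the graph until it settles. The fully connected graph, the parameters `sigma`, `alpha`, and `iters`, and the function name `manifold_rank` are assumptions made for this example.

```python
import math


def manifold_rank(points, query_index, sigma=1.0, alpha=0.9, iters=200):
    """Score every point by its graph-based relevance to points[query_index].

    Iterates f <- alpha * S f + (1 - alpha) * y, where S is the symmetrically
    normalised affinity matrix and y is the indicator of the query point.
    """
    n = len(points)
    # Gaussian affinities with a zero diagonal (no self-loops).
    W = [
        [
            0.0 if i == j else math.exp(
                -sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                / (2.0 * sigma ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]
    degree = [sum(row) for row in W]
    S = [
        [W[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    y = [1.0 if i == query_index else 0.0 for i in range(n)]
    f = list(y)
    for _ in range(iters):
        f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1.0 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Hypothetical usage: two well-separated clusters, query taken from the first.
data = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
scores = manifold_rank(data, query_index=0)
ranking = sorted(range(len(data)), key=lambda i: -scores[i])
print(ranking)
```

Points in the query's own cluster receive higher scores than points in the distant cluster, because scores propagate mainly along strong graph edges rather than by direct distance to the query alone.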

The new kernel selection approach yields a more powerful test than earlier kernel selection heuristics, and makes the kernel selection and test procedures suited to data streams, where the observations cannot all be stored in memory.
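The summary does not name the statistic used. As an assumption, the sketch below shows a linear-time estimate of the squared MMD, one statistic that fits the streaming setting because it consumes paired observations in a single pass and keeps only a running sum. The Gaussian kernel, the bandwidth `sigma`, and the function names are illustrative, and the kernel selection step itself is not shown.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def linear_time_mmd2(pairs, sigma=1.0):
    """Linear-time estimate of squared MMD from a stream of (x, y) pairs.

    Consecutive observations are grouped two at a time; each group adds
    k(x1, x2) + k(y1, y2) - k(x1, y2) - k(x2, y1) to a running sum.
    Only one previous observation is held in memory at any time.
    """
    total, count = 0.0, 0
    previous = None
    for x, y in pairs:
        if previous is None:
            previous = (x, y)
            continue
        x1, y1 = previous
        total += (
            gaussian_kernel(x1, x, sigma)
            + gaussian_kernel(y1, y, sigma)
            - gaussian_kernel(x1, y, sigma)
            - gaussian_kernel(x, y1, sigma)
        )
        count += 1
        previous = None
    if count == 0:
        raise ValueError("need at least two (x, y) observations")
    return total / count


# Hypothetical usage: a generator stands in for a stream that is never stored.
stream = (((i / 10.0,), (i / 10.0 + 5.0,)) for i in range(20))
print(linear_time_mmd2(stream))
```

The trade-off is higher variance than the quadratic-time estimate on the same number of points, in exchange for constant memory and a single pass over the data.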