We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic…

We propose a framework for analyzing and comparing distributions, allowing us to design statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the…

We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first recover…

We propose an independence criterion based on the eigenspectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm…

MOTIVATION
Many problems in data integration in bioinformatics can be posed as one common question: Are two sets of observations generated by the same distribution? We propose a kernel-based…

The Google search engine has enjoyed huge success with its web page ranking algorithm, which exploits global, rather than local, hyperlink structure of the web using random walks. Here we propose a…

Given sets of observations of training and test data, we consider the problem of re-weighting the training data such that its distribution more closely matches that of the test data. We achieve this…

A Hilbert space embedding for probability measures has recently been proposed (Gretton et al., 2007; Smola et al., 2007), with applications including dimensionality reduction, homogeneity testing and…

Although kernel measures of independence have been widely applied in machine learning (notably in kernel ICA), there is as yet no method to determine whether they have detected statistically…