Measuring Statistical Dependence with Hilbert-Schmidt Norms

@inproceedings{Gretton2005MeasuringSD,
  title={Measuring Statistical Dependence with Hilbert-Schmidt Norms},
  author={Arthur Gretton and Olivier Bousquet and Alex Smola and Bernhard Sch{\"o}lkopf},
  booktitle={ALT},
  year={2005}
}
We propose an independence criterion based on the eigen-spectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm of the cross-covariance operator (we term this a Hilbert-Schmidt Independence Criterion, or HSIC). [...] First, the empirical estimate is simpler than any other kernel dependence test, and requires no user-defined regularisation. Second, there is a clearly defined population quantity which the empirical…
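The empirical estimate referred to above reduces to a trace of centred kernel matrices, HSIC_emp = (n-1)^{-2} tr(K H L H). A minimal NumPy sketch of that biased estimator follows; the Gaussian kernel and the bandwidth values are illustrative assumptions rather than choices fixed by the abstract.

```python
import numpy as np

def rbf_kernel(A, sigma):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-||a_i - a_j||^2 / (2 sigma^2))."""
    sq = np.sum(A ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * A @ A.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic_biased(X, Y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC, (n - 1)^{-2} * trace(K H L H), for X, Y of shape (n, d)."""
    n = X.shape[0]
    K = rbf_kernel(X, sigma_x)
    L = rbf_kernel(Y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n   # centring matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```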
Kernel Learning with Hilbert-Schmidt Independence Criterion
TLDR
This work introduces a unifying view of kernel learning with the Hilbert-Schmidt independence criterion (HSIC), a kernel method for measuring the statistical dependence between random variables, and proposes an effective Gaussian kernel optimization method for classification by maximizing the HSIC.
Kernel learning and optimization with Hilbert–Schmidt independence criterion
  • Tinghua Wang, Wei Li
  • Mathematics, Computer Science
  • Int. J. Mach. Learn. Cybern.
  • 2018
TLDR
A unifying view of kernel learning via statistical dependence estimation is presented, and a Gaussian kernel optimization method for classification by maximizing the HSIC is proposed, in which two forms of Gaussian kernels (spherical and ellipsoidal) are considered.
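The two entries above both describe selecting a Gaussian kernel by maximizing HSIC against a label kernel. A hypothetical sketch of that idea, restricted to a spherical Gaussian kernel and a plain grid search (the papers' actual optimizers and the ellipsoidal kernel are not reproduced), reusing rbf_kernel from the sketch above:

```python
import numpy as np

def label_kernel(y):
    """Delta kernel on class labels: L[i, j] = 1 if y_i == y_j, else 0."""
    y = np.asarray(y).reshape(-1, 1)
    return (y == y.T).astype(float)

def select_sigma_by_hsic(X, y, sigmas):
    """Return the RBF bandwidth whose kernel has maximal (biased) HSIC with the labels."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    HLH = H @ label_kernel(y) @ H
    # rbf_kernel as defined in the HSIC sketch above
    scores = [np.trace(rbf_kernel(X, s) @ HLH) / (n - 1) ** 2 for s in sigmas]
    return sigmas[int(np.argmax(scores))]
```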
On Kernel Parameter Selection in Hilbert-Schmidt Independence Criterion
TLDR
This paper shows that HSIC can be regarded as an approximation to LSMI, which allows cross-validation of LSMI to be used for determining kernel parameters in HSIC, so that both computational efficiency and cross-validation-based model selection can be achieved.
Sensitivity maps of the Hilbert-Schmidt independence criterion
TLDR
Sensitivity maps (SMs) for the Hilbert–Schmidt independence criterion (HSIC) are introduced, and a randomized HSIC and its corresponding sensitivity maps are presented to cope with large-scale problems.
A simpler condition for consistency of a kernel independence test
A statistical test of independence may be constructed using the Hilbert-Schmidt Independence Criterion (HSIC) as a test statistic. The HSIC is defined as the distance between the embedding of the…
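One common way to calibrate such a test is a permutation test: HSIC is recomputed on shuffled copies of Y to approximate the null distribution. The sketch below reuses hsic_biased from above; the permutation calibration and the bandwidths are illustrative assumptions, not necessarily the procedure used in the papers listed here.

```python
import numpy as np

def hsic_permutation_test(X, Y, n_perm=500, sigma_x=1.0, sigma_y=1.0, seed=0):
    """Permutation p-value for the null hypothesis that X and Y (shape (n, d)) are independent."""
    rng = np.random.default_rng(seed)
    stat = hsic_biased(X, Y, sigma_x, sigma_y)          # observed statistic
    null = np.array([
        hsic_biased(X, Y[rng.permutation(len(Y))], sigma_x, sigma_y)
        for _ in range(n_perm)
    ])
    p_value = (1.0 + np.sum(null >= stat)) / (1.0 + n_perm)
    return stat, p_value
```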
Spectral Non-Convex Optimization for Dimension Reduction with Hilbert-Schmidt Independence Criterion
TLDR
A spectral-based optimization algorithm is proposed that extends beyond the linear kernel and provides first- and second-order local guarantees when a fixed point is reached, thereby removing the need to restart the algorithm from random initialization points.
A kernel-based measure for conditional mean dependence
TLDR
Simulation studies indicate that tests based on the KCMD have power close to that of tests based on martingale difference divergence under monotone dependence, but excel in cases of nonlinear relationships or when the moment restriction on X is violated.
Aggregated test of independence based on HSIC measures
Dependence measures based on reproducing kernel Hilbert spaces, also known as the Hilbert-Schmidt Independence Criterion and denoted HSIC, are widely used to statistically decide whether or not two…
Estimations of singular functions of kernel cross-covariance operators
TLDR
This paper establishes learning rates for estimators associated with kernel cross-covariance operators, and bounds a weighted summation of squared estimation errors of empirical singular functions of Hilbert–Schmidt operators by 16 times the estimation error of the empirical cross-covariance.
Kernel Methods for Measuring Independence
TLDR
Two new functionals, the constrained covariance and the kernel mutual information, are introduced to measure the degree of independence of random variables and it is proved that when the RKHSs are universal, both functionals are zero if and only if the random variables are pairwise independent.

References

Showing 1-10 of 23 references
Dimensionality Reduction for Supervised Learning with Reproducing Kernel Hilbert Spaces
TLDR
This work treats the problem of dimensionality reduction as that of finding a low-dimensional “effective subspace” of X which retains the statistical relationship between X and Y and establishes a general nonparametric characterization of conditional independence using covariance operators on a reproducing kernel Hilbert space.
Statistical Properties of Kernel Principal Component Analysis
TLDR
This work focuses on Kernel Principal Component Analysis (KPCA) and obtains sharp excess risk bounds for the reconstruction error using local Rademacher averages; the dependence on the decay of the spectrum and on the closeness of successive eigenvalues is made explicit.
The kernel mutual information
  • A. Gretton, R. Herbrich, Alex Smola
  • Mathematics, Computer Science
  • 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
  • 2003
TLDR
A new contrast function, the kernel mutual information (KMI), is introduced to measure the degree of independence of continuous random variables, and it is suggested that the addition of a regularising term in the KGV causes it to approach the KMI.
Kernel independent component analysis
  • F. Bach, Michael I. Jordan
  • Computer Science, Mathematics
  • 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
  • 2003
TLDR
A class of algorithms for independent component analysis which use contrast functions based on canonical correlations in a reproducing kernel Hilbert space is presented, showing that these algorithms outperform many of the presently known algorithms.
ICA Using Spacings Estimates of Entropy
TLDR
A new algorithm for the independent component analysis (ICA) problem is presented, based on an efficient entropy estimator; it is simple, computationally efficient, intuitively appealing, and outperforms other well-known algorithms.
Joint measures and cross-covariance operators
Let H1 (resp., H2) be a real and separable Hilbert space with Borel σ-field Γ1 (resp., Γ2), and let (H1 × H2, Γ1 × Γ2) be the product measurable space generated by the measurable rectangles. This…
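For context, in the RKHS setting of the main paper the cross-covariance operator that this reference treats in general Hilbert spaces is characterised by the covariance identity below, and HSIC is its squared Hilbert-Schmidt norm (a restatement of standard definitions, not of this reference's results):

```latex
% Cross-covariance operator C_{xy}: \mathcal{G} \to \mathcal{F} between RKHSs
% \mathcal{F} (on X) and \mathcal{G} (on Y); HSIC is its squared Hilbert-Schmidt norm.
\[
  \langle f, C_{xy} g \rangle_{\mathcal{F}}
    = \mathbf{E}_{x,y}\!\left[f(x)\,g(y)\right]
      - \mathbf{E}_{x}\!\left[f(x)\right]\mathbf{E}_{y}\!\left[g(y)\right]
    = \operatorname{Cov}\!\left(f(x), g(y)\right),
  \qquad
  \operatorname{HSIC}(p_{xy},\mathcal{F},\mathcal{G}) := \lVert C_{xy} \rVert_{\mathrm{HS}}^{2}.
\]
```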
On the Influence of the Kernel on the Consistency of Support Vector Machines
TLDR
It is shown that soft margin algorithms with universal kernels are consistent for a large class of classification problems, including some kinds of noisy tasks, provided that the regularization parameter is chosen well.
An Information-Maximization Approach to Blind Separation and Blind Deconvolution
TLDR
It is suggested that information maximization provides a unifying framework for problems in "blind" signal processing and dependencies of information transfer on time delays are derived.
QUADRATIC DEPENDENCE MEASURE FOR NONLINEAR BLIND SOURCES SEPARATION
This work focuses on a quadratic dependence measure which can be used for blind source separation. After defining it, we show some links with other quadratic dependence measures used by Feuerverger…
Canonical correlation analysis when the data are curves.
SUMMARY It is not immediately straightforward to extend canonical correlation analysis to the context of functional data analysis, where the data are themselves curves or functions. The obvious…