• Corpus ID: 239009914

# A Distribution-Free Independence Test for High Dimension Data

@inproceedings{Cai2021ADI,
title={A Distribution-Free Independence Test for High Dimension Data},
author={Zhanrui Cai and Jing Lei and Kathryn Roeder},
year={2021}
}
• Zhanrui Cai
• Published 14 October 2021
• Mathematics
Test of independence is of fundamental importance in modern data analysis, with broad applications in variable selection, graphical models, and causal inference. When the data is high dimensional and the potential dependence signal is sparse, independence testing becomes very challenging without distributional or structural assumptions. In this paper we propose a general framework for independence testing by first fitting a classifier that distinguishes the joint and product distributions, and…
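
The first step named in the abstract, fitting a classifier that separates the joint distribution from the product of the marginals, can be sketched in a few lines. The abstract is cut off before it describes the test statistic, so the second half of this sketch is a stand-in and not the authors' procedure: it uses held-out accuracy with a label-permutation p-value. The k-nearest-neighbour classifier, the scalar `x` and `y`, and the function names `simulate`, `split_joint_product`, `knn_predict`, and `classifier_independence_test` are all assumptions made for illustration.

```python
import random


def simulate(n, dependent, seed=0):
    """Toy data (an assumption): y = x + small noise if dependent, else independent noise."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [x + 0.1 * rng.gauss(0, 1) if dependent else rng.gauss(0, 1) for x in xs]
    return list(zip(xs, ys))


def split_joint_product(pairs):
    """Label the first half of `pairs` as joint samples (1) and turn the second
    half into approximate product samples (0) by pairing x_i with y_{i+1}."""
    half = len(pairs) // 2
    joint = [(x, y, 1) for x, y in pairs[:half]]
    rest = pairs[half:]
    product = [
        (rest[i][0], rest[(i + 1) % len(rest)][1], 0) for i in range(len(rest))
    ]
    return joint + product


def knn_predict(train, point, k=5):
    """Majority vote among the k nearest labelled training points (k odd)."""
    px, py = point
    nearest = sorted(train, key=lambda t: (t[0] - px) ** 2 + (t[1] - py) ** 2)[:k]
    return 1 if 2 * sum(label for _, _, label in nearest) > k else 0


def classifier_independence_test(pairs, n_perm=200, seed=0):
    """Return (held-out accuracy, permutation p-value).

    Stand-in statistic: the predictions on the held-out half are kept fixed,
    and the held-out labels are shuffled to build the null distribution.
    """
    rng = random.Random(seed)
    half = len(pairs) // 2
    train = split_joint_product(pairs[:half])   # used only to fit the classifier
    test = split_joint_product(pairs[half:])    # used only to evaluate it
    preds = [knn_predict(train, (x, y)) for x, y, _ in test]
    labels = [label for _, _, label in test]

    def accuracy(lbls):
        return sum(p == l for p, l in zip(preds, lbls)) / len(lbls)

    observed = accuracy(labels)
    exceed = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        if accuracy(shuffled) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)
```

Calling `classifier_independence_test(simulate(400, dependent=True))` returns the held-out accuracy and a p-value; a small p-value suggests the classifier can tell observed pairs from re-paired ones, which is evidence against independence. The sketch assumes the input pairs are i.i.d., so that under independence the held-out joint and re-paired points share one distribution and their labels are exchangeable.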

## References

Testing mutual independence in high dimension via distance covariance
• Mathematics
• 2016
We introduce an L2-type test for testing mutual independence and banded dependence structure for high dimensional data. The test is constructed on the basis of the pairwise distance covariance and
Global and local two-sample tests via regression
• Mathematics
Electronic Journal of Statistics
• 2019
Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data.
The distance correlation t-test of independence in high dimension
• Computer Science, Mathematics
J. Multivar. Anal.
• 2013
A modified distance correlation statistic is proposed, such that under independence the distribution of a transformation of the statistic converges to Student t, as dimension tends to infinity, and the resulting t-test is unbiased for every sample size greater than three and all significance levels.
On some exact distribution-free tests of independence between two random vectors of arbitrary dimensions
• Mathematics
• 2016
Several nonparametric methods are available in the literature to test the independence between two random vectors. But many of them perform poorly for high dimensional data and are not
Nonparametric independence testing via mutual information
• Mathematics, Computer Science
Biometrika
• 2019
This work proposes a test of independence of two multivariate random vectors, given a sample from the underlying population, based on the estimation of mutual information, whose decomposition into joint and marginal entropies facilitates the use of recently-developed efficient entropy estimators derived from nearest neighbour distances.
A Kernel Statistical Test of Independence
• Computer Science, Mathematics
NIPS
• 2007
A novel test of the independence hypothesis for one particular kernel independence measure, the Hilbert-Schmidt independence criterion (HSIC), which outperforms established contingency table and functional correlation-based tests, with an advantage that is greater for multivariate data.
Universal inference
• Medicine, Computer Science
Proceedings of the National Academy of Sciences
• 2020
A surprisingly simple method for producing statistical significance statements without any regularity conditions is presented, and it is shown that in settings where computing the MLE is hard, it is sufficient to upper bound the maximum likelihood for the purpose of constructing valid tests and intervals.
Classification Accuracy as a Proxy for Two Sample Testing
• Mathematics, Computer Science
ArXiv
• 2016
This work proves two results that hold for all classifiers in any dimension: if a classifier's true error remains $\epsilon$-better than chance for some $\epsilon>0$ as $d,n \to \infty$, then (a) the permutation-based test is consistent (has power approaching one), and (b) a computationally efficient test based on a Gaussian approximation of the null distribution is also consistent.
Multivariate Rank-Based Distribution-Free Nonparametric Testing Using Measure Transportation
• Mathematics
• 2019
In this paper, we propose a general framework for distribution-free nonparametric testing in multi-dimensions, based on a notion of multivariate ranks defined using the theory of measure
A Distribution-Free Test of Covariate Shift Using Conformal Prediction
• Computer Science, Mathematics
• 2020
This is the first successful attempt at using conformal prediction for testing statistical hypotheses, and it can be effectively combined with existing classification algorithms to find good conformity score functions.