Minimax estimation in sparse canonical correlation analysis

@article{Gao2015MinimaxEI,
title={Minimax estimation in sparse canonical correlation analysis},
author={Chao Gao and Zongming Ma and Zhao Ren and Harrison H. Zhou},
journal={Annals of Statistics},
year={2015},
volume={43},
pages={2168-2197}
}
Canonical correlation analysis is a widely used multivariate statistical technique for exploring the relation between two sets of variables. This paper considers the problem of estimating the leading canonical correlation directions in high-dimensional settings. Recently, under the assumption that the leading canonical correlation directions are sparse, various procedures have been proposed for many high-dimensional applications involving massive data sets. However, there has been few…
Subspace perspective on canonical correlation analysis: Dimension reduction and minimax rates
• Mathematics
• 2016
Canonical correlation analysis (CCA) is a fundamental statistical tool for exploring the correlation structure between two sets of random variables. In this paper, motivated by recent success of…
An Efficient and Optimal Method for Sparse Canonical Correlation Analysis
• Mathematics
• 2014
Canonical correlation analysis (CCA) is an important multivariate technique for exploring the relationship between two sets of variables which finds applications in many fields. This paper considers…
Sparse CCA: Adaptive Estimation and Computational Barriers
• Mathematics
• 2014
Canonical correlation analysis is a classical technique for exploring the relationship between two sets of variables. It has important applications in analyzing high dimensional datasets originated…
An iterative penalized least squares approach to sparse canonical correlation analysis.
• Computer Science, Medicine
• Biometrics
• 2019
This work proposes a new sparse CCA (SCCA) method that recasts high-dimensional CCA as an iterative penalized least squares problem and produces nested solutions, which provides great convenience in practice.
Sparse Generalized Eigenvalue Problem: Optimal Statistical Rates via Truncated Rayleigh Flow
• Mathematics
• 2016
Sparse generalized eigenvalue problem (GEP) plays a pivotal role in a large family of high-dimensional statistical models, including sparse Fisher's discriminant analysis, canonical correlation…
Resistant multiple sparse canonical correlation
• Mathematics, Medicine
• Statistical applications in genetics and molecular biology
• 2016
This paper demonstrates the success of resistant estimation in variable selection using SCCA and uses it to find multiple canonical pairs for extended knowledge about the datasets at hand; resistant estimators provided more accurate estimates than standard estimators in the multiple canonical correlation setting.
Rate-Optimal Perturbation Bounds for Singular Subspaces with Applications to High-Dimensional Statistics
• Mathematics
• 2016
Perturbation bounds for singular spaces, in particular Wedin's $\sin \Theta$ theorem, are a fundamental tool in many fields including high-dimensional statistics, machine learning, and applied…
The optimal rate of canonical correlation analysis for stochastic processes
• Mathematics
• 2020
Functional canonical correlation analysis (FCCA) has been applied in many contexts, but the asymptotic properties have not yet been studied enough. In this paper we consider a general setup…
Sparse GCA and Thresholded Gradient Descent
• Sheng Gao
• Computer Science, Mathematics
• ArXiv
• 2021
This work formulates sparse GCA as generalized eigenvalue problems at both population and sample levels via a careful choice of normalization constraints and proposes a thresholded gradient descent algorithm for estimating GCA loading vectors and matrices in high dimensions.
On Bayesian sparse canonical correlation analysis via Rayleigh quotient framework
• Mathematics, Computer Science
• ArXiv
• 2020
This work proposes a semi-parametric Bayesian method for the principal canonical pair that employs the scaled Rayleigh quotient as a quasi-log-likelihood with the spike-and-slab prior as the sparsity constraint, and uses it to maximally correlate clinical variables and proteomic data for a better understanding of COVID-19.

References

Sparse CCA: Adaptive Estimation and Computational Barriers
• Mathematics
• 2014
Canonical correlation analysis is a classical technique for exploring the relationship between two sets of variables. It has important applications in analyzing high dimensional datasets originated…
Statistical and computational trade-offs in estimation of sparse principal components
• Computer Science, Mathematics
• 2014
It is shown that there is an effective sample size regime in which no randomised polynomial time algorithm can achieve the minimax optimal rate.
MINIMAX SPARSE PRINCIPAL SUBSPACE ESTIMATION IN HIGH DIMENSIONS
• Mathematics
• 2013
We study sparse principal components analysis in high dimensions, where p (the number of variables) can be much larger than n (the number of observations), and analyze the problem of estimating the…
Rate-optimal posterior contraction for sparse PCA
• Mathematics
• 2015
Principal component analysis (PCA) is possibly one of the most widely used statistical tools to recover a low-rank structure of the data. In the high-dimensional settings, the leading eigenvector of…
Sparse PCA: Optimal rates and adaptive estimation
• Mathematics
• 2013
Principal component analysis (PCA) is one of the most commonly used statistical procedures with a wide range of applications. This paper considers both minimax and adaptive estimation of the…
A greedy approach to sparse canonical correlation analysis
• Mathematics
• 2008
We consider the problem of sparse canonical correlation analysis (CCA), i.e., the search for two linear combinations, one for each multivariate, that yield maximum correlation using a specified…
Sparse CCA via Precision Adjusted Iterative Thresholding
• Mathematics
• 2013
Sparse Canonical Correlation Analysis (CCA) has received considerable attention in high-dimensional data analysis to study the relationship between two sets of random variables. However, there has…
High-dimensional analysis of semidefinite relaxations for sparse principal components
• Mathematics, Computer Science
• 2008 IEEE International Symposium on Information Theory
• 2008
This paper analyzes a simple and computationally inexpensive diagonal cut-off method, establishes a threshold of the order $\theta_{\mathrm{diag}} = n/[k^2 \log(p-k)]$ separating success from failure, and proves that a more complex semidefinite programming (SDP) relaxation due to d'Aspremont et al. succeeds once the sample size is of the order $\theta_{\mathrm{sdp}}$.
On Consistency and Sparsity for Principal Components Analysis in High Dimensions
• Mathematics, Medicine
• Journal of the American Statistical Association
• 2009
A simple algorithm for selecting a subset of coordinates with largest sample variances is provided, and it is shown that if PCA is done on the selected subset, then consistency is recovered, even if p(n) ≫ n.
Sparse canonical correlation analysis
• Mathematics, Computer Science
• Machine Learning
• 2010
A novel method for solving canonical correlation analysis (CCA) in a sparse convex framework using a least squares approach is presented; when the number of original features is large, SCCA outperforms kernel CCA (KCCA), learning the common semantic space from a sparse set of features.