Corpus ID: 88517109

Spiked covariances and principal components analysis in high-dimensional random effects models

@article{Fan2018SpikedCA,
  title={Spiked covariances and principal components analysis in high-dimensional random effects models},
  author={Zhou Fan and Iain M. Johnstone and Yi Sun},
  journal={arXiv: Statistics Theory},
  year={2018}
}
We study principal components analyses in multivariate random and mixed effects linear models, assuming a spherical-plus-spikes structure for the covariance matrix of each random effect. We characterize the behavior of outlier sample eigenvalues and eigenvectors of MANOVA variance components estimators in such models under a high-dimensional asymptotic regime. Our results show that an aliasing phenomenon may occur in high dimensions, in which eigenvalues and eigenvectors of the MANOVA estimate… 

PRINCIPAL COMPONENTS IN LINEAR MIXED MODELS WITH GENERAL BULK
We study the outlier eigenvalues and eigenvectors in variance components estimates for high-dimensional mixed effects linear models using a free probability approach. We quantify the almost-sure
Principal components in linear mixed models with general bulk
We study the principal components of covariance estimators in multivariate mixed-effects linear models. We show that, in high dimensions, the principal eigenvalues and eigenvectors may exhibit bias
EIGENVALUE DISTRIBUTIONS OF VARIANCE COMPONENTS ESTIMATORS IN HIGH-DIMENSIONAL RANDOM EFFECTS MODELS.
TLDR
This work studies the spectra of MANOVA estimators for variance component covariance matrices in multivariate random effects models, and establishes a general asymptotic freeness result for families of rectangular orthogonally-invariant random matrices, which is of independent interest.
Tracy-Widom at each edge of real covariance and MANOVA estimators
We study the sample covariance matrix for real-valued data with general population covariance, as well as MANOVA-type covariance estimators in variance components models under null hypotheses of
Matrix Means and a Novel High-Dimensional Shrinkage Phenomenon
Many statistical settings call for estimating a population parameter, most typically the population mean, from a sample of matrices. The most natural estimate of the population mean is the arithmetic
Notes on asymptotics of sample eigenstructure for spiked covariance models with non-Gaussian data
TLDR
In the spiked covariance model, results on asymptotic normality of sample leading eigenvalues and certain projections of the corresponding sample eigenvectors are developed.
Spectral Methods for Data Science: A Statistical Perspective
TLDR
This monograph aims to present a systematic, comprehensive, yet accessible introduction to spectral methods from a modern statistical perspective, highlighting their algorithmic implications in diverse large-scale applications.
Analysis of Information Transfer from Heterogeneous Sources via Precise High-dimensional Asymptotics
TLDR
The main ingredient of the analysis is finding the high-dimensional asymptotic limits of various functions involving the sum of two independent sample covariance matrices with different population covariance matrices, which may be of independent interest.
Singular vector and singular subspace distribution for the matrix denoising model
In this paper, we study the matrix denoising model $Y=S+X$, where $S$ is a low-rank deterministic signal matrix and $X$ is a random noise matrix, and both are $M\times n$. In the scenario that $M$ and
Anomaly detection for electricity consumption in cloud computing: framework, methods, applications, and challenges
TLDR
The basic definition of anomaly detection for electricity consumption is introduced and a new framework with cloud computing is proposed and the applications of centralized and decentralized detection methods for the anomaly electricity consumption are listed.

References

EIGENVALUE DISTRIBUTIONS OF VARIANCE COMPONENTS ESTIMATORS IN HIGH-DIMENSIONAL RANDOM EFFECTS MODELS.
TLDR
This work studies the spectra of MANOVA estimators for variance component covariance matrices in multivariate random effects models, and establishes a general asymptotic freeness result for families of rectangular orthogonally-invariant random matrices, which is of independent interest.
ASYMPTOTICS OF SAMPLE EIGENSTRUCTURE FOR A LARGE DIMENSIONAL SPIKED COVARIANCE MODEL
This paper deals with a multivariate Gaussian observation model where the eigenvalues of the covariance matrix are all one, except for a finite number which are larger. Of interest is the asymptotic
Central limit theorems for eigenvalues in a spiked population model
In a spiked population model, the population covariance matrix has all its eigenvalues equal to units except for a few fixed eigenvalues (spikes). This model is proposed by Johnstone to cope with
Estimation of Variance and Covariance Components in Linear Models
We write a linear model in the form , where is an unknown parameter and ξ is a hypothetical random variable with a given dispersion structure but containing unknown parameters called
Perils of Parsimony: Properties of Reduced-Rank Estimates of Genetic Covariance Matrices
TLDR
It is emphasized that the rank of the genetic covariance matrix should be chosen sufficiently large to accommodate all important genetic principal components, even though, paradoxically, this may require including a number of components with negligible eigenvalues.
Tracy-Widom at each edge of real covariance and MANOVA estimators
We study the sample covariance matrix for real-valued data with general population covariance, as well as MANOVA-type covariance estimators in variance components models under null hypotheses of
Restricted maximum likelihood estimation of genetic principal components and smoothed covariance matrices
TLDR
It is shown that reduced rank estimation can reduce computational requirements of multivariate analyses substantially and an application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given.
Finite sample approximation results for principal component analysis: a matrix perturbation approach
TLDR
A matrix perturbation view of the "phase transition phenomenon," and a simple linear-algebra based derivation of the eigenvalue and eigenvector overlap in this asymptotic limit of finite sample PCA are presented.
Direct Estimation of Genetic Principal Components
TLDR
Direct estimation of the principal components reduces the number of parameters to be estimated, uses the data efficiently, and provides the basis for new estimation algorithms.