Statistical Inference on Explained Variation in High-dimensional Linear Model with Dense Effects
@inproceedings{Chen2022StatisticalIO, title={Statistical Inference on Explained Variation in High-dimensional Linear Model with Dense Effects}, author={Hua Yun Chen}, year={2022} }
Statistical inference on the explained variation of an outcome by a set of covariates is of particular interest in practice. When the covariates are of moderate to high dimension and the effects are not sparse, several approaches have been proposed for estimation and inference. One major problem with the existing approaches is that the inference procedures are not robust to the normality assumption on the covariates and the residual errors. In this paper, we propose an estimating equation…
3 Citations
Statistical Methods for Assessing the Explained Variation of a Health Outcome by a Mixture of Exposures
- Engineering
- International Journal of Environmental Research and Public Health
- 2022
Exposures to environmental pollutants are often composed of mixtures of chemicals that can be highly correlated because of similar sources and/or chemical structures. The effect of an individual…
Powering Research through Innovative Methods for Mixtures in Epidemiology (PRIME) Program: Novel and Expanded Statistical Methods
- Biology
- International Journal of Environmental Research and Public Health
- 2022
37 new methods from PRIME projects are reviewed and summarized to enable more informed analyses of environmental mixtures and stress training for early career scientists as well as innovation in statistical methodology as an ongoing need.
A zero-estimator approach for estimating the signal level in a high-dimensional model-free setting
- Mathematics, Computer Science
- 2022
This work presents an unbiased and consistent estimator and then improves it by using a zero-estimator approach, where a zero-estimator is a statistic whose expected value is zero.
References
Variance estimation in high-dimensional linear models
- Mathematics
- 2014
The residual variance and the proportion of explained variation are important quantities in many statistical models and model fitting procedures. They play an important role in regression diagnostics…
Maximum Likelihood for Variance Estimation in High-Dimensional Linear Models
- Mathematics
- AISTATS
- 2016
We study maximum likelihood estimators (MLEs) for the residual variance, the signal-to-noise ratio, and other variance parameters in high-dimensional linear models. These parameters are essential in…
Semi-supervised Inference for Explained Variance in High-dimensional Linear Regression and Its Applications
- Computer Science
- 2018
It is shown that the estimator achieves the minimax optimal rate of convergence in the general semi-supervised framework; the limiting distribution for the proposed estimator is established, and data-driven confidence intervals for the explained variance are constructed.
Fast and Accurate Construction of Confidence Intervals for Heritability
- Mathematics
- American Journal of Human Genetics
- 2016
EigenPrism: inference for high dimensional signal‐to‐noise ratios
- Mathematics, Computer Science
- Journal of the Royal Statistical Society. Series B, Statistical Methodology
- 2017
A novel procedure, called EigenPrism, is derived; it is asymptotically correct when the covariates are multivariate Gaussian and produces valid confidence intervals in finite samples as well. It is applied to a genetic data set to estimate the genetic signal‐to‐noise ratio for a number of continuous phenotypes.
Optimal Estimation of Co-heritability in High-dimensional Linear Models
- Mathematics
- 2016
Co-heritability is an important concept that characterizes the genetic associations within pairs of quantitative traits. There has been significant recent interest in estimating the co-heritability…
Variance estimation using refitted cross‐validation in ultrahigh dimensional regression
- Computer Science
- Journal of the Royal Statistical Society. Series B, Statistical Methodology
- 2012
A two‐stage refitted procedure via a data splitting technique, called refitted cross‐validation, to attenuate the influence of irrelevant variables with high spurious correlations is proposed and results show that the resulting procedure performs as well as the oracle estimator, which knows in advance the mean regression function.
Adaptive estimation of high-dimensional signal-to-noise ratios
- Computer Science, Mathematics
- Bernoulli
- 2018
This work considers the equivalent problems of estimating the residual variance, the proportion of explained variance $\eta$ and the signal strength in a high-dimensional linear regression model with Gaussian random design and builds an adaptive procedure whose convergence rate achieves the minimax risk over all up to a logarithmic loss.
Common SNPs explain a large proportion of the heritability for human height
- Biology
- Nature Genetics
- 2010
Evidence is provided that the remaining heritability is due to incomplete linkage disequilibrium between causal variants and genotyped SNPs, exacerbated by causal variants having lower minor allele frequency than the SNPs explored to date.
On asymptotics of eigenvectors of large sample covariance matrix
- Mathematics
- 2007
Let $\{X_{ij}\}$, $i, j = \ldots$, be a double array of i.i.d. complex random variables with $EX_{11} = 0$, $E|X_{11}|^2 = 1$ and $E|X_{11}|^4 < \infty$, and let $A_n = \frac{1}{N} T_n^{1/2} X_n X_n^* T_n^{1/2}$, where $T_n^{1/2}$ is the…