• Corpus ID: 246210238

# Statistical Inference on Explained Variation in High-dimensional Linear Model with Dense Effects

@inproceedings{Chen2022StatisticalIO,
title={Statistical Inference on Explained Variation in High-dimensional Linear Model with Dense Effects},
author={Hua Yun Chen},
year={2022}
}
• H. Y. Chen
• Published 18 January 2022
• Computer Science
Statistical inference on the explained variation of an outcome by a set of covariates is of particular interest in practice. When the covariates are of moderate to high dimension and the effects are not sparse, several approaches have been proposed for estimation and inference. One major problem with the existing approaches is that the inference procedures are not robust to the normality assumption on the covariates and the residual errors. In this paper, we propose an estimating equation…
3 Citations


• Engineering
International journal of environmental research and public health
• 2022
Exposures to environmental pollutants are often composed of mixtures of chemicals that can be highly correlated because of similar sources and/or chemical structures. The effect of an individual
• Biology
International journal of environmental research and public health
• 2022
37 new methods from PRIME projects are reviewed and summarized to enable more informed analyses of environmental mixtures and stress training for early career scientists as well as innovation in statistical methodology as an ongoing need.
• Mathematics, Computer Science
• 2022
This work presents an unbiased and consistent estimator and then improves it by using a zero-estimator approach, where a zero-estimator is a statistic whose expected value is zero.

## References


The residual variance and the proportion of explained variation are important quantities in many statistical models and model fitting procedures. They play an important role in regression diagnostics
• Mathematics
AISTATS
• 2016
We study maximum likelihood estimators (MLEs) for the residual variance, the signal-to-noise ratio, and other variance parameters in high-dimensional linear models. These parameters are essential in
• Computer Science
• 2018
It is shown that the estimator achieves the minimax optimal rate of convergence in the general semi-supervised framework and the limiting distribution for the proposed estimator is established and data-driven confidence intervals for the explained variance are constructed.
• Mathematics, Computer Science
Journal of the Royal Statistical Society. Series B, Statistical methodology
• 2017
A novel procedure is derived, called EigenPrism, which is asymptotically correct when the covariates are multivariate Gaussian and produces valid confidence intervals in finite samples as well and applies to a genetic data set to estimate the genetic signal‐to‐noise ratio for a number of continuous phenotypes.
• Mathematics
• 2016
Co-heritability is an important concept that characterizes the genetic associations within pairs of quantitative traits. There has been significant recent interest in estimating the co-heritability
• Computer Science
Journal of the Royal Statistical Society. Series B, Statistical methodology
• 2012
A two‐stage refitted procedure via a data splitting technique, called refitted cross‐validation, to attenuate the influence of irrelevant variables with high spurious correlations is proposed and results show that the resulting procedure performs as well as the oracle estimator, which knows in advance the mean regression function.
• Computer Science, Mathematics
Bernoulli
• 2018
This work considers the equivalent problems of estimating the residual variance, the proportion of explained variance $\eta$ and the signal strength in a high-dimensional linear regression model with Gaussian random design and builds an adaptive procedure whose convergence rate achieves the minimax risk over all up to a logarithmic loss.
• Biology
Nature Genetics
• 2010
Evidence is provided that the remaining heritability is due to incomplete linkage disequilibrium between causal variants and genotyped SNPs, exacerbated by causal variants having lower minor allele frequency than the SNPs explored to date.
• Mathematics
• 2007
Let $\{X_{ij}\}$, $i, j = \ldots$, be a double array of i.i.d. complex random variables with $EX_{11} = 0$, $E|X_{11}|^2 = 1$ and $E|X_{11}|^4 < \infty$, and let $A_n = \frac{1}{N} T_n^{1/2} X_n X_n^* T_n^{1/2}$, where $T_n^{1/2}$ is the