Estimating R² Shrinkage in Multiple Regression: A Comparison of Different Analytical Methods

@article{Yin2001EstimatingR2,
  title={Estimating {R}$^2$ Shrinkage in Multiple Regression: A Comparison of Different Analytical Methods},
  author={Ping Yin and Xitao Fan},
  journal={The Journal of Experimental Education},
  year={2001},
  volume={69},
  pages={203--224}
}
  • Ping Yin, Xitao Fan
  • Published 1 January 2001
  • Mathematics
  • The Journal of Experimental Education
Abstract The effectiveness of various analytical formulas for estimating R² shrinkage in multiple regression analysis was investigated. Two categories of formulas were identified: estimators of the squared population multiple correlation coefficient (ρ²) and those of the squared population cross-validity coefficient (ρ_c²). The authors conducted a Monte Carlo experiment to investigate the effectiveness of the analytical formulas for estimating R² shrinkage, with 4 fully crossed factors… 
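The two categories of formulas named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article: the listing does not say which formulas were compared, so the two used here are assumptions drawn from general knowledge. One is the adjustment commonly attributed to Ezekiel (often also called the Wherry adjustment), an estimator of ρ². The other is a form commonly attributed to Stein and Darlington, an estimator of ρ_c². The function names, the one-predictor simulation, and its settings (`n=10`, `reps=4000`, `seed=1`) are hypothetical choices made only for illustration.

```python
import random


def ezekiel_adjusted_r2(r2, n, p):
    """Ezekiel-type adjusted R^2 (estimator of rho^2).

    r2: sample R^2, n: sample size, p: number of predictors.
    """
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def stein_darlington_r2c(r2, n, p):
    """Stein/Darlington-type estimator of rho_c^2 (form assumed, see lead-in)."""
    factor = ((n - 1) / (n - p - 1)) * ((n - 2) / (n - p - 2)) * ((n + 1) / n)
    return 1.0 - factor * (1.0 - r2)


def sample_r2_one_predictor(rng, n):
    """Sample R^2 for one predictor that is unrelated to the criterion,
    so the population rho^2 is 0 and any nonzero R^2 is sampling error."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def simulate(n=10, reps=4000, seed=1):
    """Tiny Monte Carlo run: mean raw R^2 and mean adjusted R^2 when rho^2 = 0."""
    rng = random.Random(seed)
    r2s = [sample_r2_one_predictor(rng, n) for _ in range(reps)]
    mean_r2 = sum(r2s) / reps
    mean_adj = sum(ezekiel_adjusted_r2(v, n, 1) for v in r2s) / reps
    return mean_r2, mean_adj


if __name__ == "__main__":
    print("adjusted R^2:       ", ezekiel_adjusted_r2(0.5, 50, 5))
    print("cross-validity est.:", stein_darlington_r2c(0.5, 50, 5))
    print("mean raw / adjusted R^2 under rho^2 = 0:", simulate())
```

With one predictor and ρ² = 0, the expected sample R² is 1/(N − 1), so the raw mean in this sketch should sit near 0.11 for N = 10, while the adjusted mean should sit near zero. For the same R², N, and p, the cross-validity estimate is smaller than the adjusted R², since an equation fitted in one sample predicts new samples less well than the population equation would.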
ESTIMATING R² SHRINKAGE IN REGRESSION
The effectiveness of various analytical formulas for estimating R² shrinkage in multiple regression analysis was investigated. Two categories of formulas were identified: estimators of the squared
Improved Shrinkage Estimation of Squared Multiple Correlation Coefficient and Squared Cross-Validity Coefficient
The sample squared multiple correlation coefficient is widely used for describing the usefulness of a multiple linear regression model in many areas of science. In this article, the author considers
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models
  • G. Shieh
  • Business
    Multivariate behavioral research
  • 2009
TLDR
All the currently available exact methods for interval estimation, power calculation, and sample size determination of the squared multiple correlation coefficient are naturally modified and extended to the analysis of the squared cross-validity coefficient.
Investigating bias in squared regression structure coefficients
TLDR
Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections.
Estimation of the simple correlation coefficient
  • G. Shieh
  • Mathematics
    Behavior research methods
  • 2010
TLDR
The results reveal specific situations in which the sample correlation coefficient performs better than the unbiased and nearly unbiased estimators, facilitating recommendation of r as an effect size index for the strength of linear association between two variables.
Sample size requirements for interval estimation of the strength of association effect sizes in multiple regression analysis.
TLDR
The simulation results showed that the simple sample size procedures proposed by Bonett and Wright for precise interval estimation of the squared multiple correlation coefficient, which aim to attain a desired precision of expected width, provide satisfactory results only when sample sizes are large.
Attenuation of the Squared Canonical Correlation Coefficient Under Varying Estimates of Score Reliability
Research pertaining to the distortion of the squared canonical correlation coefficient has traditionally been limited to the effects of sampling error and associated correction formulas. The purpose
Ezekiel’s classic estimator of the population squared multiple correlation coefficient: Monte Carlo-based extensions and refinements
  • J. Hittner
  • Psychology
    The Journal of general psychology
  • 2019
TLDR
Results indicated that sample size-to-predictor ratios of 66.67 and greater were associated with low bias and that ratios of this magnitude were accompanied by large sample sizes, thus suggesting that researchers using Ezekiel’s adjusted R² should aim for sample sizes of 200 or greater in order to minimize bias when estimating the population squared multiple correlation coefficient.
Improving on Adjusted R-Squared
  • J. Karch
  • Mathematics
    Collabra: Psychology
  • 2020
The amount of variance explained is widely reported for quantifying the model fit of a multiple linear regression model. The default adjusted R-squared estimator has the disadvantage of not being
Bias and Precision of the Squared Canonical Correlation Coefficient under Nonnormal Data Conditions
This dissertation: (a) investigated the degree to which the squared canonical correlation coefficient is biased in multivariate nonnormal distributions and (b) identified formulae that adjust the

References

SHOWING 1-10 OF 51 REFERENCES
Estimating the Coefficient of Cross-Validity in Multiple Regression: A Comparison of Analytical and Empirical Methods
Abstract In predictive applications of multiple regression, interest centers on the estimation of the population coefficient of cross-validation rather than the population multiple correlation. The
Use of Empirical Estimates of Shrinkage in Multiple Regression: A Caution
Empirical techniques to estimate the shrinkage of the sample R² have been advocated as alternatives to analytical formulae. Although such techniques may be appropriate for estimating the coefficient
Comparison of Different Shrinkage Formulas in Estimating Population Multiple Correlation Coefficients
Five different shrinkage formulas were compared to see which most accurately reduced the positive bias in sample R² values as estimators of the squared population multiple correlation coefficient
Estimation of the Squared Cross-Validity Coefficient in the Context of Best Subset Regression
A Monte Carlo study was conducted to examine the performance of several strategies for estimating the squared cross-validity coefficient of a sample regression equation in the context of best subset
Estimation in Multiple Correlation/Prediction
The distinction between the square of a population correlation coefficient (ρ²) and of the true validity of a sample prediction equation (ρ_v²) as the parameters of interest in multiple correlation
Multiple Regression and Validity Estimation in One Sample
This study empirically investigated equations for estimating the value of the multiple correlation coefficient in the population underlying a sample and the value of the population validity
Predicting Shrinkage in the Multiple Correlation Coefficient
one sample from a defined population, we are often interested in determining how accurate this same equation would be in predicting the same criterion variable for new samples from the same
PREDICTIVE VALIDITY OF A LINEAR REGRESSION EQUATION
The squared correlation coefficient, w², between an empirically chosen linear function of predictors, B0 + B′x, and a criterion, y, is employed as a measure of predictive precision. This coefficient
The Parameters of Cross-Validation
Abstract : The validation of predictor weights, derived in one sample, by computing the correlation of the weighted sum of the predictors with the criterion in new samples is called cross-validation.
A study of reduced rank models for multiple prediction
Abstract : The present study proceeds along both theoretical and empirical lines. First an attempt is made to work out some of the consequences of regression theory for reduced-rank models. Since, as