• Corpus ID: 237108357

Covariate Selection Based on an Assumption-free Approach to Linear Regression with Exact Probabilities

Laurie Davies and Lutz Dümbgen
In this paper we give a completely new approach to the problem of covariate selection in linear regression. A covariate or a set of covariates is included only if it is better, in the sense of least squares, than the same number of Gaussian covariates consisting of i.i.d. N(0, 1) random variables. The Gaussian P-value is defined as the probability that the Gaussian covariates are better. It is given in terms of the Beta distribution, is exact, and holds for all data. The covariate selection…
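The definition of the Gaussian P-value in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: it estimates the probability by Monte Carlo instead of using the closed-form Beta expression the abstract mentions, and it assumes the simplest setting of a single candidate covariate, no intercept, and no previously selected covariates. The function names `rss_ratio` and `gaussian_p_value_mc` are hypothetical.

```python
import random


def rss_ratio(y, x):
    """Residual sum of squares of y regressed on x (no intercept),
    divided by the sum of squares of y itself. Smaller is better."""
    xy = sum(a * b for a, b in zip(x, y))
    xx = sum(a * a for a in x)
    yy = sum(b * b for b in y)
    return 1.0 - (xy * xy) / (xx * yy)


def gaussian_p_value_mc(y, x, n_sim=20000, seed=0):
    """Monte Carlo estimate of the probability that a covariate of
    i.i.d. N(0, 1) values fits y better (in least squares) than x.
    Assumption: one covariate, no intercept, nothing selected so far."""
    rng = random.Random(seed)
    observed = rss_ratio(y, x)
    better = 0
    for _ in range(n_sim):
        # Draw a Gaussian covariate of the same length as y.
        z = [rng.gauss(0.0, 1.0) for _ in y]
        if rss_ratio(y, z) < observed:
            better += 1
    return better / n_sim


if __name__ == "__main__":
    y = [1.2, -0.7, 3.1, 0.4, -2.2, 1.9]
    x = [1.0, -1.0, 2.5, 0.0, -2.0, 2.0]
    # A small estimate suggests x is rarely beaten by pure noise.
    print(gaussian_p_value_mc(y, x))
```

Under this reading, a covariate would be kept only when the estimated probability falls below a chosen threshold; the threshold itself is not given in the text above.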


Linear Regression, Covariate Selection and the Failure of Modelling
It is argued that all model based approaches to the selection of covariates in linear regression have failed. This applies to frequentist approaches based on P-values and to Bayesian approaches


A simple test statistic based on lasso fitted values is proposed, called the covariance test statistic, and it is shown that when the true model is linear, this statistic has an Exp(1) asymptotic distribution under the null hypothesis (the null being that all truly active variables are contained in the current lasso model).
Model selection and estimation in regression with grouped variables
Summary. We consider the problem of selecting grouped variables (factors) for accurate prediction in regression. Such a problem arises naturally in many practical situations with the multifactor
Lasso, knockoff and Gaussian covariates: a comparison
Given data $\mathbf{y}$ and $k$ covariates $\mathbf{x}_j$ one problem in linear regression is to decide which if any of the covariates to include when regressing the dependent variable $\mathbf{y}$
High-dimensional graphs and variable selection with the Lasso
The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims at
Panning for Gold: Model-X Knockoffs for High-dimensional Controlled Variable Selection
A new framework of model-X knockoffs is proposed that views the knockoff procedure, originally designed to control the false discovery rate in linear models, from a different perspective; simulations demonstrate the superior power of knockoffs.
Extensions of Smoothing via Taut Strings
Suppose that we observe independent, identically distributed random pairs $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$. Our goal is to estimate regression functions such as the conditional mean or $\beta$-quantile
the Significance Test
We describe the significance test approach to statistical detection and derive its form for the detection of multiple signals in Weibull clutter. The significance test, as we use it, is a statistical
Local Extremes, Runs, Strings and Multiresolution
The paper considers the problem of nonparametric regression with emphasis on controlling the number of local extremes. Two methods, the run method and the taut-string multiresolution method, are
Rectangular Confidence Regions for the Means of Multivariate Normal Distributions
For rectangular confidence regions for the mean values of multivariate normal distributions the following conjecture of O. J. Dunn [3], [4] is proved: Such a confidence region constructed
Extreme value theory : an introduction
This treatment of extreme value theory is unique in book literature in that it focuses on some beautiful theoretical results along with applications. All the main topics covering the heart of the