Corpus ID: 202734155

Double-estimation-friendly inference for high-dimensional misspecified models

@article{shah2019double,
  title={Double-estimation-friendly inference for high-dimensional misspecified models},
  author={Rajen Dinesh Shah and Peter B{\"u}hlmann},
  journal={arXiv: Statistics Theory},
}
All models may be wrong---but that is not necessarily a problem for inference. Consider the standard $t$-test for the significance of a variable $X$ for predicting response $Y$ whilst controlling for $p$ other covariates $Z$ in a random design linear model. This yields correct asymptotic type~I error control for the null hypothesis that $X$ is conditionally independent of $Y$ given $Z$ under an \emph{arbitrary} regression model of $Y$ on $(X, Z)$, provided that a linear regression model for $X… 
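The t-test setting described in the abstract can be illustrated with a small simulation. This is a hypothetical sketch, not code from the paper: the null holds (Y depends on Z only, through a non-linear map) while X follows a linear model in Z, which is the condition under which the test retains type I error control despite the misspecified model for Y. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 5

# Random design: a linear regression model holds for X given Z -- the key
# assumption for valid type I error control when the model for Y is arbitrary.
Z = rng.normal(size=(n, p))
X = Z @ rng.normal(size=p) + rng.normal(size=n)

# Null hypothesis holds: Y depends on Z only, through a non-linear map.
Y = np.sin(Z[:, 0]) + Z[:, 1] ** 2 + rng.normal(size=n)

# Ordinary least squares of Y on (intercept, X, Z).
D = np.column_stack([np.ones(n), X, Z])
beta, *_ = np.linalg.lstsq(D, Y, rcond=None)
resid = Y - D @ beta
sigma2 = resid @ resid / (n - D.shape[1])
cov = sigma2 * np.linalg.inv(D.T @ D)
t_stat = beta[1] / np.sqrt(cov[1, 1])  # t-statistic for the X coefficient
print(round(t_stat, 3))
```

Under the null, the statistic is approximately standard normal, so comparing it to N(0, 1) quantiles gives asymptotic type I error control even though the linear model for Y is wrong.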
Inference in High-dimensional Linear Regression
We develop an approach to inference in a linear regression model when the number of potential explanatory variables is larger than the sample size. Our approach treats each regression coefficient in…
High-dimensional regression with potential prior information on variable importance
It is shown that the computational cost for fitting all models when ridge regression is used is no more than for a single fit of ridge regression, and a strategy for Lasso regression is described that makes use of previous fits to greatly speed up fitting the entire sequence of models.
High-Dimensional Feature Selection for Sample Efficient Treatment Effect Estimation
A common objective function involving outcomes across treatment cohorts with nonconvex joint sparsity regularization that is guaranteed to recover S with high probability under a linear outcome model for Y and subgaussian covariates for each of the treatment cohorts is proposed.


High-dimensional doubly robust tests for regression parameters
This work proposes tests of the null that are uniformly valid under sparsity conditions weaker than those typically invoked in the literature, assuming working models for the exposure and outcome are both correctly specified.
A unifying approach for doubly-robust $\ell_1$ regularized estimation of causal contrasts
We consider inference about a scalar parameter under a non-parametric model based on a one-step estimator computed as a plug in estimator plus the empirical mean of an estimator of the parameter's…
The hardness of conditional independence testing and the generalised covariance measure
It is a common saying that testing for conditional independence, i.e., testing whether two random vectors $X$ and $Y$ are independent given $Z$, is a hard statistical problem if $Z$ is a…
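The generalised covariance measure referenced in this title can be sketched as follows; this is an illustrative simulation under assumed data-generating models, not code from the paper. The test regresses X on Z and Y on Z (any regression method may play this role; plain OLS is used here for brevity) and normalizes the mean product of the residuals.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 400, 3

# Z drives both X and Y, but X and Y are conditionally independent given Z.
Z = rng.normal(size=(n, p))
X = Z @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=n)
Y = Z[:, 0] ** 2 + rng.normal(size=n)

def residuals_on_Z(target, Z):
    # Stand-in regression step; any (machine learning) regression works here.
    D = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(D, target, rcond=None)
    return target - D @ coef

R = residuals_on_Z(X, Z) * residuals_on_Z(Y, Z)  # products of residuals
T = np.sqrt(n) * R.mean() / R.std()  # approximately N(0, 1) under the null
print(round(T, 3))
```

Large values of |T| indicate residual dependence between X and Y after adjusting for Z, so the null of conditional independence is rejected when |T| exceeds a standard normal quantile.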
Goodness-of-fit testing in high-dimensional generalized linear models
A family of tests to assess the goodness-of-fit of a high-dimensional generalized linear model is proposed; the tests may be used to construct an omnibus test, be directed against specific non-linearities and interaction effects, or test the significance of groups of variables.
Linear Hypothesis Testing in Dense High-Dimensional Linear Models
This work proposes a methodology for testing linear hypotheses in high-dimensional linear models, and establishes asymptotically exact control of the type I error without imposing any sparsity assumptions on the model parameter or on the vector representing the linear hypothesis.
A General Theory of Hypothesis Tests and Confidence Regions for Sparse High Dimensional Models
The decorrelated score function can be used to construct point and confidence-region estimators that are semiparametrically efficient, and is extended to handle high-dimensional null hypotheses, where the number of parameters of interest can increase exponentially fast with the sample size.
Goodness‐of‐fit tests for high dimensional linear models
It is argued that residual prediction tests can be designed to test for as diverse model misspecifications as heteroscedasticity and non‐linearity and that some form of the parametric bootstrap can do the same when the high dimensional linear model is under consideration.
Doubly Robust Estimation in Missing Data and Causal Inference Models
The results of simulation studies are presented which demonstrate that the finite sample performance of DR estimators is as impressive as theory would predict and the proposed method is applied to a cardiovascular clinical trial.
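The double robustness referenced in the title can be illustrated with a small simulation of the augmented inverse probability weighting (AIPW) estimator; this is a hypothetical sketch, not the paper's own setup. The outcome models are deliberately misspecified (plain group means), yet the estimator stays consistent because the propensity score is correct.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
Z = rng.normal(size=(n, 2))

# True propensity score and outcome model; the true average treatment
# effect is 1.0 by construction.
e = 1.0 / (1.0 + np.exp(-(Z[:, 0] - 0.5 * Z[:, 1])))
A = rng.binomial(1, e)
Y = 2.0 + Z[:, 0] + 1.0 * A + rng.normal(size=n)

# Deliberately misspecified outcome models (group means ignoring Z),
# combined with the correct propensity: AIPW remains consistent.
mu1 = np.full(n, Y[A == 1].mean())
mu0 = np.full(n, Y[A == 0].mean())
psi1 = mu1 + A * (Y - mu1) / e
psi0 = mu0 + (1 - A) * (Y - mu0) / (1.0 - e)
ate_hat = (psi1 - psi0).mean()
print(round(ate_hat, 3))
```

The symmetric case (correct outcome models, wrong propensity) also yields a consistent estimate, which is the sense in which the estimator is "doubly robust".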
High-dimensional inference in misspecified linear models
This work focuses on the de-sparsified Lasso procedure and describes some correct interpretations and corresponding sufficient assumptions for valid asymptotic inference of the model parameters, which still have a useful meaning when the model is misspecified.
Double machine learning for treatment and causal parameters
The resulting method could be called a "double ML" method because it relies on estimating primary and auxiliary predictive models; it achieves the fastest rates of convergence and exhibits robust behavior with respect to a broader class of probability distributions than naive "single" ML estimators.
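The double ML recipe summarized above can be sketched in a partially linear model; this is an illustrative simulation with assumed models, not code from the paper. Nuisance functions are fit on one half of the data and residualized on the other (cross-fitting), and the target parameter is recovered by regressing outcome residuals on treatment residuals.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 600, 4

# Partially linear model: treatment D and outcome Y both depend on controls Z;
# the target parameter theta (the treatment effect) is 1.5 by construction.
Z = rng.normal(size=(n, p))
D = Z @ rng.normal(size=p) + rng.normal(size=n)
Y = 1.5 * D + Z[:, 0] + rng.normal(size=n)

def ols_predict(train_x, train_y, test_x):
    # Stand-in nuisance learner; in practice any ML method may be plugged in.
    A = np.column_stack([np.ones(len(train_x)), train_x])
    B = np.column_stack([np.ones(len(test_x)), test_x])
    coef, *_ = np.linalg.lstsq(A, train_y, rcond=None)
    return B @ coef

# Two-fold cross-fitting: nuisances are fit on one half and used to form
# residuals on the other half, avoiding overfitting bias.
idx = rng.permutation(n)
folds = [idx[: n // 2], idx[n // 2:]]
rY, rD = np.empty(n), np.empty(n)
for fit_f, out_f in [(folds[0], folds[1]), (folds[1], folds[0])]:
    rY[out_f] = Y[out_f] - ols_predict(Z[fit_f], Y[fit_f], Z[out_f])
    rD[out_f] = D[out_f] - ols_predict(Z[fit_f], D[fit_f], Z[out_f])

theta_hat = (rD @ rY) / (rD @ rD)  # residual-on-residual regression
print(round(theta_hat, 3))
```

The residual-on-residual step is Neyman orthogonal, which is why first-stage estimation error in the nuisance models enters the estimate of theta only at second order.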