# Double-estimation-friendly inference for high-dimensional misspecified models

@article{Shah2019DoubleestimationfriendlyIF, title={Double-estimation-friendly inference for high-dimensional misspecified models}, author={Rajen Dinesh Shah and Peter Buhlmann}, journal={arXiv: Statistics Theory}, year={2019} }

All models may be wrong, but that is not necessarily a problem for inference. Consider the standard $t$-test for the significance of a variable $X$ for predicting response $Y$ whilst controlling for $p$ other covariates $Z$ in a random design linear model. This yields correct asymptotic type I error control for the null hypothesis that $X$ is conditionally independent of $Y$ given $Z$ under an *arbitrary* regression model of $Y$ on $(X, Z)$, provided that a linear regression model for $X…
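The setting described in the abstract can be explored with a small simulation. The sketch below is illustrative only and is not taken from the paper: the function names `ols_t_test` and `null_rejection_rate`, the data-generating process (here $X$ is linear in $Z$ with independent Gaussian noise, and $Y$ depends non-linearly on $Z$ alone), and the use of a normal approximation for the p-value are all assumptions made for this example.

```python
import math
import numpy as np


def ols_t_test(y, x, Z):
    """t-statistic and asymptotic two-sided p-value for the coefficient
    on x in an OLS fit of y on (intercept, x, Z)."""
    n = y.shape[0]
    design = np.column_stack([np.ones(n), x, Z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = n - design.shape[1]
    sigma2 = float(resid @ resid) / dof          # residual variance estimate
    xtx_inv = np.linalg.inv(design.T @ design)
    t_stat = coef[1] / math.sqrt(sigma2 * xtx_inv[1, 1])
    # Normal approximation to the t distribution (an assumption, fine for large n).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return float(t_stat), p_value


def null_rejection_rate(n_reps=500, n=200, p=3, alpha=0.05, seed=0):
    """Fraction of rejections when X is independent of Y given Z, but the
    linear model for Y is misspecified (hypothetical simulation design)."""
    rng = np.random.default_rng(seed)
    gamma = np.ones(p)
    rejections = 0
    for _ in range(n_reps):
        Z = rng.standard_normal((n, p))
        x = Z @ gamma + rng.standard_normal(n)                       # X linear in Z
        y = np.sin(Z[:, 0]) + Z[:, 1] ** 2 + rng.standard_normal(n)  # Y non-linear in Z only
        _, p_value = ols_t_test(y, x, Z)
        rejections += p_value < alpha
    return rejections / n_reps
```

Calling `null_rejection_rate()` returns the empirical rejection frequency under this simulated null, which can be compared with the nominal level `alpha`. The design above is one convenient choice among many and says nothing about the paper's precise conditions.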

## 3 Citations

Inference in High-dimensional Linear Regression

- Mathematics
- 2021

We develop an approach to inference in a linear regression model when the number of potential explanatory variables is larger than the sample size. Our approach treats each regression coefficient in…

High-dimensional regression with potential prior information on variable importance

- Computer Science
- Stat. Comput.
- 2022

It is shown that the computational cost for fitting all models when ridge regression is used is no more than for a single fit of ridge regression, and a strategy for Lasso regression is described that makes use of previous fits to greatly speed up fitting the entire sequence of models.

High-Dimensional Feature Selection for Sample Efficient Treatment Effect Estimation

- Mathematics, Computer Science
- 2021

A common objective function involving outcomes across treatment cohorts with nonconvex joint sparsity regularization is proposed; it is guaranteed to recover S with high probability under a linear outcome model for Y and subgaussian covariates for each of the treatment cohorts.

## References

SHOWING 1-10 OF 72 REFERENCES

High-dimensional doubly robust tests for regression parameters

- Computer Science
- 2018

This work proposes tests of the null that are uniformly valid under sparsity conditions weaker than those typically invoked in the literature, assuming working models for the exposure and outcome are both correctly specified.

A unifying approach for doubly-robust $\ell_1$ regularized estimation of causal contrasts

- Mathematics
- 2019

We consider inference about a scalar parameter under a non-parametric model based on a one-step estimator computed as a plug-in estimator plus the empirical mean of an estimator of the parameter's…

The hardness of conditional independence testing and the generalised covariance measure

- Mathematics
- The Annals of Statistics
- 2020

It is a common saying that testing for conditional independence, i.e., testing whether two random vectors $X$ and $Y$ are independent, given $Z$, is a hard statistical problem if $Z$ is a…

Goodness-of-fit testing in high-dimensional generalized linear models

- Computer Science, Mathematics
- 2019

A family of tests to assess the goodness-of-fit of a high-dimensional generalized linear model is proposed; it may be used to construct an omnibus test, be directed against specific non-linearities and interaction effects, or test the significance of groups of variables.

Linear Hypothesis Testing in Dense High-Dimensional Linear Models

- Computer Science, Mathematics
- Journal of the American Statistical Association
- 2018

This work proposes a methodology for testing linear hypotheses in high-dimensional linear models and establishes asymptotically exact control of Type I error without imposing any sparsity assumptions on the model parameter or the vector representing the linear hypothesis.

A General Theory of Hypothesis Tests and Confidence Regions for Sparse High Dimensional Models

- Computer Science, Mathematics
- 2014

The decorrelated score function can be used to construct point and confidence region estimators that are semiparametrically efficient, and can be extended to handle high-dimensional null hypotheses, where the number of parameters of interest can increase exponentially fast with the sample size.

Goodness‐of‐fit tests for high dimensional linear models

- Computer Science
- 2015

It is argued that residual prediction tests can be designed to test for as diverse model misspecifications as heteroscedasticity and non‐linearity and that some form of the parametric bootstrap can do the same when the high dimensional linear model is under consideration.

Doubly Robust Estimation in Missing Data and Causal Inference Models

- Mathematics
- Biometrics
- 2005

The results of simulation studies are presented which demonstrate that the finite sample performance of DR estimators is as impressive as theory would predict and the proposed method is applied to a cardiovascular clinical trial.

High-dimensional inference in misspecified linear models

- Computer Science, Economics
- 2015

This work focuses on the de-sparsified Lasso procedure and describes some correct interpretations and corresponding sufficient assumptions for valid asymptotic inference of the model parameters, which still have a useful meaning when the model is misspecified.

Double machine learning for treatment and causal parameters

- Computer Science, Mathematics
- 2016

The resulting method could be called a "double ML" method because it relies on estimating primary and auxiliary predictive models; it achieves the fastest rates of convergence and exhibits robust good behavior with respect to a broader class of probability distributions than naive "single" ML estimators.