# Splitting strategies for post-selection inference

```bibtex
@inproceedings{Rasines2021SplittingSF,
  title  = {Splitting strategies for post-selection inference},
  author = {Daniel Garcia Rasines and G. Alastair Young},
  year   = {2021}
}
```

We consider the problem of providing valid inference for a selected parameter in a sparse regression setting. It is well known that classical regression tools can be unreliable in this context due to the bias generated in the selection step. Many approaches have been proposed in recent years to ensure inferential validity. Here, we consider a simple alternative to data splitting based on randomising the response vector, which allows for higher selection and inferential power than the former and…
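The randomisation idea can be illustrated with a minimal sketch for Gaussian responses with known noise level. The construction below (one simple scheme of this kind, not necessarily the paper's exact formulation) adds independent Gaussian noise to the response to form one part for selection and subtracts a scaled version to form an independent part for inference; the function name and the tuning parameter `gamma` are illustrative choices.

```python
import numpy as np

def randomised_split(y, sigma, gamma=1.0, rng=None):
    """Split a Gaussian response y ~ N(mu, sigma^2 I) into two parts:
    u = y + gamma*w for selection and v = y - w/gamma for inference,
    where w ~ N(0, sigma^2 I). Since Cov(u, v) = sigma^2 - sigma^2 = 0
    and (u, v) is jointly Gaussian, u and v are independent, so any
    selection based on u does not bias inference carried out with v."""
    rng = np.random.default_rng(rng)
    w = rng.normal(0.0, sigma, size=len(y))
    u = y + gamma * w
    v = y - w / gamma
    return u, v
```

Unlike ordinary data splitting, every observation contributes to both the selection and the inference stage, which is the source of the claimed gain in power.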

## 12 Citations

### Approximate Post-Selective Inference for Regression with the Group LASSO

- Computer Science
- 2020

A consistent, post-selective, Bayesian method is developed to address existing gaps by deriving a likelihood adjustment factor, and an approximation thereof, that eliminates bias from the selection of groups.

### Selective Inference in Propensity Score Analysis

- Mathematics, Computer Science
- 2021

This paper develops selective inference in propensity score analysis with a semiparametric approach, which has become a standard tool in causal inference.

### Post-Selection Inference via Algorithmic Stability

- Computer Science, Mathematics
- 2020

This work revisits the PoSI problem through the lens of algorithmic stability and shows that the stability parameters of a selection method alone suffice to provide non-trivial corrections to classical z-test and t-test intervals.

### Conditional Versus Unconditional Approaches to Selective Inference

- Computer Science
- 2022

It is shown that selective inference methods based on selection and conditioning are always dominated by multiple testing methods defined directly on the full universe of hypotheses, even when this universe is potentially infinite and only defined implicitly, such as in data splitting.

### Inference in High-dimensional Linear Regression

- Mathematics
- 2021

This paper develops an approach to inference in a linear regression model when the number of potential explanatory variables is larger than the sample size. The approach treats each regression…

### Selective inference for k-means clustering

- Computer Science, Mathematics
- 2022

A finite-sample p-value is proposed that controls the selective Type I error for a test of the difference in means between a pair of clusters obtained using k-means clustering, and it is shown that it can be efficiently computed.

### Some Perspectives on Inference in High Dimensions

- Computer Science, Statistical Science
- 2022

The main emphasis of the present paper lies on contexts where formulation in terms of a probabilistic model is feasible and fruitful but to be at all realistic large numbers of unknown parameters need consideration.

### Empirical Bayes and Selective Inference

- Mathematics, Journal of the Indian Institute of Science
- 2022

We review the empirical Bayes approach to large-scale inference. In the context of the problem of inference for a high-dimensional normal mean, empirical Bayes methods are advocated as they exhibit…

### More powerful selective inference for the graph fused lasso

- Computer Science, Journal of Computational and Graphical Statistics
- 2022

This work proposes a new test for this task that controls the selective Type I error, and conditions on less information than existing approaches, leading to substantially higher power.

### Data blurring: sample splitting a single sample with applications to selective confidence intervals in generalized linear models

- Computer Science, Mathematics
- 2021

A more general methodology for splitting a sample of a random vector X is proposed, borrowing ideas from Bayesian inference to yield a (frequentist) solution that can be viewed as a continuous analog of data splitting.

## References

Showing 1–10 of 35 references

### Exact Post Model Selection Inference for Marginal Screening

- Computer Science, Mathematics, NIPS
- 2014

A framework for post model selection inference, via marginal screening, in linear regression is developed that characterizes the exact distribution of linear functions of the response $y$, conditional on the model being selected (the "condition on selection" framework).

### Bootstrapping and sample splitting for high-dimensional, assumption-lean inference

- Computer Science, Mathematics, The Annals of Statistics
- 2019

This paper revisits sample splitting combined with the bootstrap and shows that this leads to a simple, assumption-free approach to inference; it also finds new bounds on the accuracy of the bootstrap and the Normal approximation for general nonlinear parameters with increasing dimension.

### Integrative methods for post-selection inference under convex constraints

- Computer Science, The Annals of Statistics
- 2021

Methods for carrying out inference conditional on selection are developed, which are more flexible in the sense that they naturally accommodate different models for the data, instead of requiring a case-by-case treatment.

### Inference after black box selection.

- Computer Science, Mathematics
- 2019

The problem of inference for parameters selected for reporting only after running some algorithm is considered, the canonical example being inference for model parameters after a model selection procedure; the problem is recast as a statistical learning problem that can be fit with off-the-shelf models for binary regression.

### Uniformly valid confidence intervals post-model-selection

- Mathematics, Computer Science, The Annals of Statistics
- 2020

This work suggests general methods to construct asymptotically uniformly valid confidence intervals post-model-selection based on principles recently proposed by Berk et al. (2013); these intervals perform remarkably well, even when compared to existing methods tailored to specific model selection procedures.

### Exact post-selection inference, with application to the lasso

- Computer Science
- 2013

A general approach to valid inference after model selection by the lasso is developed to form valid confidence intervals for the selected coefficients and test whether all relevant variables have been included in the model.

### Optimal Inference After Model Selection

- Computer Science
- 2014

To perform inference after model selection, we propose controlling the selective type I error; i.e., the error rate of a test given that it was performed. By doing so, we recover long-run frequency…

### Valid post-selection inference

- Mathematics
- 2013

It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees…

### HIGH DIMENSIONAL VARIABLE SELECTION.

- Economics, Annals of Statistics
- 2009

This paper looks at the error rates and power of some multi-stage regression methods and considers three screening methods: the lasso, marginal regression, and forward stepwise regression.

### Valid confidence intervals for post-model-selection predictors

- Mathematics, Computer Science, The Annals of Statistics
- 2019

The PoSI intervals are generalized to post-model-selection predictors in linear regression, and their applications in inference and model selection are considered.