# Selective Sequential Model Selection

```bibtex
@article{Fithian2015SelectiveSM,
  title   = {Selective Sequential Model Selection},
  author  = {William Fithian and Jonathan E. Taylor and Robert Tibshirani and Ryan J. Tibshirani},
  journal = {arXiv: Methodology},
  year    = {2015}
}
```

Many model selection algorithms produce a path of fits specifying a sequence of increasingly complex models. Given such a sequence and the data used to produce them, we consider the problem of choosing the least complex model that is not falsified by the data. Extending the selected-model tests of Fithian et al. (2014), we construct p-values for each step in the path which account for the adaptive selection of the model path using the data. In the case of linear regression, we propose two…
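As a rough illustration of the setting (not the paper's selective tests, which additionally condition on the selection event), the sketch below builds a forward-stepwise path in linear regression and reports a *naive* chi-square p-value for each added variable; the paper's contribution is to replace these naive p-values with ones that account for how the path was chosen adaptively from the data.

```python
import numpy as np
from math import erfc, sqrt

def forward_stepwise_pvalues(X, y, sigma=1.0):
    """Greedy forward stepwise selection. At each step, report a naive
    chi-square p-value for the drop in RSS (valid only if the variable
    had been fixed in advance -- selective p-values correct for the
    data-driven choice of the path)."""
    n, p = X.shape
    active, pvals = [], []
    rss_prev = y @ y  # RSS of the empty model (no intercept, for simplicity)
    for _ in range(p):
        # Add the variable giving the largest drop in residual sum of squares.
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in active:
                continue
            Xa = X[:, active + [j]]
            beta, *_ = np.linalg.lstsq(Xa, y, rcond=None)
            rss = np.sum((y - Xa @ beta) ** 2)
            if rss < best_rss:
                best_j, best_rss = j, rss
        drop = max(rss_prev - best_rss, 0.0) / sigma**2
        # Naively, drop ~ chi^2_1 under the null; P(chi^2_1 > x) = erfc(sqrt(x/2)).
        pvals.append(erfc(sqrt(drop / 2)))
        active.append(best_j)
        rss_prev = best_rss
    return active, pvals

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
y = 3 * X[:, 0] + rng.standard_normal(100)  # only variable 0 is truly active
order, pvals = forward_stepwise_pvalues(X, y)
```

Here one would stop the path at the first large p-value, keeping the least complex model not falsified by the data; the naive p-values, however, are anti-conservative precisely because each variable was chosen to maximize the RSS drop.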


## 50 Citations

More Powerful Selective Kernel Tests for Feature Selection

- Computer Science, AISTATS
- 2020

This work extends two recent proposals for selecting features using the Maximum Mean Discrepancy and Hilbert Schmidt Independence Criterion to condition on the minimal conditioning event and shows how recent advances in multiscale bootstrap makes conditioning on the minimum selection event possible.

Exact post-selection inference for the generalized lasso path

- Computer Science
- 2018

Practical aspects of the methods are described, such as valid (i.e., fully accounted-for) post-processing of generalized lasso estimates before performing inference in order to improve power, and problem-specific visualization aids that help the data analyst choose linear contrasts to test.

Testing-Based Forward Model Selection

- Economics
- 2015

This work introduces a theoretical foundation for a procedure called 'testing-based forward model selection' in regression problems. Forward selection is a general term referring to a model selection…

Exact Post-Selection Inference for Sequential Regression Procedures

- Mathematics
- 2014

We propose new inference tools for forward stepwise regression, least angle regression, and the lasso. Assuming a Gaussian model for the observation vector y, we first describe a general…

A One-Covariate at a Time, Multiple Testing Approach to Variable Selection in High-Dimensional Linear Regression Models

- Computer Science
- 2016

The OCMT provides an alternative to penalised regression methods that is based on statistical inference and is therefore easier to interpret and relate to classical statistical analysis; it allows working under more general assumptions, it is faster, and it performs well in small samples for almost all of the different sets of experiments considered in this paper.

Exact Post-Selection Inference for Changepoint Detection and Other Generalized Lasso Problems

- Computer Science
- 2016

Practical aspects of the methods are described, such as valid post-processing of generalized lasso estimates before performing inference in order to improve power, and problem-specific visualization aids that help the data analyst choose linear contrasts to test.

Efficient test-based variable selection for high-dimensional linear models

- Mathematics, J. Multivar. Anal.
- 2018

More Powerful Conditional Selective Inference for Generalized Lasso by Parametric Programming

- Computer Science, ArXiv
- 2021

This study proposes a more powerful and general conditional selective inference (SI) method for a class of problems that can be converted into quadratic parametric programming, which includes the generalized lasso, and improves the performance and practicality of SI in various respects.

Analysis of Testing‐Based Forward Model Selection

- Mathematics, Computer Science, Econometrica
- 2020

This paper proves probabilistic bounds, which depend on the quality of the tests, for prediction error and the number of selected covariates in linear regression problems; the bounds can be specialized to the case of heteroscedastic data.

More powerful post-selection inference, with application to the Lasso

- Computer Science
- 2018

This work shows how to generate hypotheses in a strategic manner that sharply reduces the cost of data exploration and results in useful confidence intervals.

## References

Showing 1–10 of 34 references

A significance test for forward stepwise model selection

- Mathematics
- 2014

We apply the methods developed by Lockhart et al. (2013) and Taylor et al. (2013) on significance tests for penalized regression to forward stepwise model selection. A general framework for selection…

Testing-Based Forward Model Selection

- Economics
- 2015

This work introduces a theoretical foundation for a procedure called 'testing-based forward model selection' in regression problems. Forward selection is a general term referring to a model selection…

Exact Post-Selection Inference for Sequential Regression Procedures

- Mathematics
- 2014

We propose new inference tools for forward stepwise regression, least angle regression, and the lasso. Assuming a Gaussian model for the observation vector y, we first describe a general…

Sequential selection procedures and false discovery rate control

- Business, Computer Science
- 2013

This work proposes two new testing procedures and proves that they control the false discovery rate in the ordered testing setting and shows how the methods can be applied to model selection by using recent results on p‐values in sequential model selection settings.
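One of the stopping rules from this line of work, the ForwardStop rule of G'Sell et al., admits a very short implementation: given p-values ordered by the model path, reject hypotheses 1 through the largest k whose running average of -log(1 - p_i) is at most the target FDR level α. A minimal sketch:

```python
import math

def forward_stop(pvalues, alpha=0.1):
    """ForwardStop rule: given p-values in path order, return k_hat, the
    largest k such that the running average of -log(1 - p_i) over the
    first k p-values is at most alpha. Reject hypotheses 1..k_hat.
    Assumes each p-value is strictly less than 1."""
    k_hat, running = 0, 0.0
    for k, p in enumerate(pvalues, start=1):
        running += -math.log(1.0 - p)
        if running / k <= alpha:
            k_hat = k
    return k_hat

# Strong signals early in the path, noise later:
print(forward_stop([0.001, 0.002, 0.01, 0.6, 0.9], alpha=0.1))  # → 3
```

Under independence of the ordered p-values, this rule controls the false discovery rate at level α; note that k_hat can exceed an index whose average momentarily rose above α, since only the largest qualifying k matters.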

A simple forward selection procedure based on false discovery rate control

- Computer Science
- 2009

It is shown that FDR-based procedures have good performance; in particular, the newly proposed method emerges as having empirical minimax performance. Interestingly, using an FDR level of 0.05 is a global best.

Uniform asymptotic inference and the bootstrap after model selection

- Mathematics, Computer Science, The Annals of Statistics
- 2018

The large-sample properties of this method are studied without assuming normality, and it is proved that the test statistic of Tibshirani et al. (2016) is asymptotically valid as the number of samples n grows and the dimension d of the regression problem stays fixed.

Accumulation Tests for FDR Control in Ordered Hypothesis Testing

- Computer Science
- 2015

This article develops a family of “accumulation tests” to choose a cutoff k that adapts to the amount of signal at the top of the ranked list, and introduces a new method in this family, the HingeExp method, which offers higher power to detect true signals compared to existing techniques.

Stability selection

- Computer Science
- 2010

It is proved for the randomized lasso that stability selection will be variable selection consistent even if the necessary conditions for consistency of the original lasso method are violated.

Optimal Inference After Model Selection

- Computer Science
- 2014

To perform inference after model selection, we propose controlling the selective type I error; i.e., the error rate of a test given that it was performed. By doing so, we recover long-run frequency…

A new look at the statistical model identification

- Mathematics
- 1974

The history of the development of statistical hypothesis testing in time series analysis is reviewed briefly and it is pointed out that the hypothesis testing procedure is not adequately defined as…