# Feature subset selection for logistic regression via mixed integer optimization

@article{Sato2016FeatureSS, title={Feature subset selection for logistic regression via mixed integer optimization}, author={Toshiki Sato and Yuichi Takano and Ryuhei Miyashiro and Akiko Yoshise}, journal={Computational Optimization and Applications}, year={2016}, volume={64}, pages={865-880} }

This paper concerns a method of selecting a subset of features for a logistic regression model. Information criteria, such as the Akaike information criterion and Bayesian information criterion, are employed as a goodness-of-fit measure. The purpose of our work is to establish a computational framework for selecting a subset of features with an optimality guarantee. For this purpose, we devise mixed integer optimization formulations for feature subset selection in logistic regression…
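The information criteria named in the abstract can be illustrated with a minimal sketch. It uses the standard definitions AIC = -2 log L + 2k and BIC = -2 log L + k ln n. The function names, the label convention y in {-1, +1}, and the choice to count the intercept as one of the k parameters are assumptions made for illustration, not details taken from the paper.

```python
import math


def _log_sigmoid(v):
    """Return log(1 / (1 + exp(-v))) in a numerically stable way."""
    if v >= 0:
        return -math.log1p(math.exp(-v))
    return v - math.log1p(math.exp(v))


def log_likelihood(X, y, w, b):
    """Log-likelihood of labels y in {-1, +1} (assumed convention)
    under the logistic model P(y | x) = 1 / (1 + exp(-y * (w.x + b)))."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        score = sum(w_j * x_ij for w_j, x_ij in zip(w, x_i)) + b
        total += _log_sigmoid(y_i * score)
    return total


def aic(loglik, num_params):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * loglik + 2.0 * num_params


def bic(loglik, num_params, num_samples):
    """Bayesian information criterion: -2 log L + k ln n."""
    return -2.0 * loglik + num_params * math.log(num_samples)
```

For a candidate feature subset, one would fit the coefficients on the selected columns only, evaluate `log_likelihood`, and pass the number of fitted parameters to `aic` or `bic`; the subset with the smallest criterion value is preferred.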

## 44 Citations

An effective procedure for feature subset selection in logistic regression based on information criteria

- Computer Science
- Comput. Optim. Appl.
- 2021

This paper proposes a new approach that combines mixed-integer programming and decomposition techniques to overcome scalability issues, and provides a theoretical characterization of the proposed algorithm's properties.

Piecewise-Linear Approximation for Feature Subset Selection in a Sequential Logit Model

- Computer Science, Business
- ArXiv
- 2015

This paper applies a piecewise-linear approximation to the logistic loss function of a sequential logit model to frame the feature subset selection problem of minimizing an information criterion as a mixed integer linear optimization problem.
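One common way to build a piecewise-linear approximation of the logistic loss f(v) = log(1 + exp(-v)) is to take the maximum of several tangent lines. Because f is convex, each tangent lies on or below f, so the maximum is an underestimate that is exact at the tangent points. The sketch below illustrates that generic technique; the tangent points and function names are assumptions, and it is not claimed to be the exact formulation used in the paper.

```python
import math


def logistic_loss(v):
    """f(v) = log(1 + exp(-v)), evaluated in a numerically stable way."""
    if v >= 0:
        return math.log1p(math.exp(-v))
    return -v + math.log1p(math.exp(v))


def logistic_loss_slope(v):
    """f'(v) = -1 / (1 + exp(v)), evaluated in a numerically stable way."""
    if v >= 0:
        e = math.exp(-v)
        return -e / (1.0 + e)
    return -1.0 / (1.0 + math.exp(v))


def tangent_lines(points):
    """Return (slope, intercept) pairs for the tangents of f at each point."""
    lines = []
    for p in points:
        slope = logistic_loss_slope(p)
        intercept = logistic_loss(p) - slope * p
        lines.append((slope, intercept))
    return lines


def piecewise_linear_loss(v, lines):
    """Pointwise maximum of the tangent lines: a convex underestimate of f."""
    return max(slope * v + intercept for slope, intercept in lines)


# Hypothetical tangent points chosen for illustration only.
TANGENT_POINTS = [-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0]
LINES = tangent_lines(TANGENT_POINTS)
```

In a mixed integer linear model, the same idea is usually expressed with an auxiliary variable t and one linear constraint t >= slope * v + intercept per tangent line, so minimizing t recovers the piecewise-linear value.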

Application of a mixed integer nonlinear programming approach to variable selection in logistic regression

- Computer Science
- Journal of the Operations Research Society of Japan
- 2019

This paper applies a mixed integer nonlinear programming approach to AIC minimization for linear regression and shows that the approach outperformed existing approaches, including a piecewise linear approximation approach, in terms of computational time.

Feature subset selection for the multinomial logit model via mixed-integer optimization

- Computer Science, Mathematics
- AISTATS
- 2019

This paper proposes an exact algorithm based on a mixed-integer optimization approach with an outer approximation method for the multinomial logit (MNL) model and proves the convergence properties of the algorithm for more general models including generalized linear models for multiclass classification.

Feature Subset Selection for Ordered Logit Model via Tangent-Plane-Based Approximation

- Mathematics, Business
- IEICE Trans. Inf. Syst.
- 2019

This paper is concerned with a mixed-integer optimization (MIO) approach to selecting a subset of relevant features from among many candidates. For ordinal classification, a sequential logit…

Stochastic Discrete First-Order Algorithm for Feature Subset Selection

- Computer Science
- IEICE Trans. Inf. Syst.
- 2020

This paper proposes a stochastic discrete first-order (SDFO) algorithm for feature subset selection of candidate features to use for multiple linear regression, and derives the optimal step size in the gradient-descent direction to accelerate convergence of the algorithm.

An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series

- Computer Science
- Comput. Optim. Appl.
- 2019

This paper models both problems as mixed-integer nonlinear optimization problems and proposes numerical procedures, based on state-of-the-art optimization tools, to solve them.

A mathematical programming approach for integrated multiple linear regression subset selection and validation

- Mathematics
- Pattern Recognit.
- 2020

Mixed integer second-order cone programming formulations for variable selection in linear regression

- Mathematics
- Eur. J. Oper. Res.
- 2015

Best subset selection via cross-validation criterion

- Computer Science
- 2020

The purpose of this paper is to establish a mixed-integer optimization approach to selecting the best subset of explanatory variables via the cross-validation criterion, which can be formulated as a bilevel MIO problem and reduced to a single-level mixed-integer quadratic optimization problem.

## References


Piecewise-Linear Approximation for Feature Subset Selection in a Sequential Logit Model

- Computer Science, Business
- ArXiv
- 2015

This paper applies a piecewise-linear approximation to the logistic loss function of a sequential logit model to frame the feature subset selection problem of minimizing an information criterion as a mixed integer linear optimization problem.

Mixed integer second-order cone programming formulations for variable selection in linear regression

- Mathematics
- Eur. J. Oper. Res.
- 2015

Subset selection by Mallows' Cp: A mixed integer programming approach

- Computer Science
- Expert Syst. Appl.
- 2015

A Branch and Bound Algorithm for Feature Subset Selection

- Computer Science
- IEEE Transactions on Computers
- 1977

A feature subset selection algorithm based on branch and bound techniques is developed to select the best subset of m features from an n-feature set with the computational effort of evaluating only 6000 subsets.

A variable selection method based on Tabu search for logistic regression models

- Business
- Eur. J. Oper. Res.
- 2009

Best Subset Selection via a Modern Optimization Lens

- Computer Science
- 2015

It is established via numerical experiments that the MIO approach performs better than Lasso and other popularly used sparse learning procedures, in terms of achieving sparse solutions with good predictive power.

Fast branch & bound algorithms for optimal feature selection

- Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2004

A novel search principle for optimal feature subset selection with the branch & bound method is introduced, based on a simple mechanism for predicting criterion values. Two implementations of the prediction mechanism are proposed, suitable for use with nonrecursive and recursive criterion forms.

Choosing the best set of variables in regression analysis using integer programming

- Business, Mathematics
- J. Glob. Optim.
- 2009

An algorithm for selecting the best set of s variables out of k (> s) candidate variables in a multiple linear regression model is employed, and a heuristic algorithm is proposed to obtain a close-to-optimal set of variables in terms of squared deviation.