Feature subset selection for logistic regression via mixed integer optimization

@article{Sato2016FeatureSS,
  title={Feature subset selection for logistic regression via mixed integer optimization},
  author={Toshiki Sato and Yuichi Takano and Ryuhei Miyashiro and Akiko Yoshise},
  journal={Computational Optimization and Applications},
  year={2016},
  volume={64},
  pages={865--880}
}

This paper concerns a method of selecting a subset of features for a logistic regression model. Information criteria, such as the Akaike information criterion and Bayesian information criterion, are employed as goodness-of-fit measures. The purpose of our work is to establish a computational framework for selecting a subset of features with an optimality guarantee. For this purpose, we devise mixed integer optimization formulations for feature subset selection in logistic regression…
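
To make the optimization problem concrete, here is a minimal sketch of the kind of formulation the abstract refers to: AIC minimization for logistic regression written as a mixed integer program with big-M indicator constraints. The binary indicators z_j, the big-M constant M, the {-1, +1} label coding, and the omission of an intercept are conventions of this sketch rather than details taken from the paper, whose actual formulations (and the linearizations that make them tractable) appear in the full text.

  \min_{\beta \in \mathbb{R}^p,\; z \in \{0,1\}^p} \quad 2 \sum_{i=1}^{n} \log\bigl(1 + \exp(-y_i \mathbf{x}_i^{\top} \beta)\bigr) \;+\; 2 \sum_{j=1}^{p} z_j
  \text{subject to} \quad -M z_j \;\le\; \beta_j \;\le\; M z_j, \qquad j = 1, \dots, p

Here y_i \in \{-1, +1\} are the class labels, \mathbf{x}_i the feature vectors, and z_j = 1 indicates that feature j enters the model; the first term is minus twice the log-likelihood and the second is the AIC penalty on the number of selected features (replacing the coefficient 2 on \sum_j z_j with \log n gives the BIC).
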
Citations

An effective procedure for feature subset selection in logistic regression based on information criteria
TLDR: This paper proposes a new approach that combines mixed-integer programming and decomposition techniques to overcome scalability issues, and provides a theoretical characterization of the proposed algorithm's properties.
Piecewise-Linear Approximation for Feature Subset Selection in a Sequential Logit Model
TLDR: This paper applies a piecewise-linear approximation to the logistic loss function of a sequential logit model to frame the feature subset selection problem of minimizing an information criterion as a mixed integer linear optimization problem (a sketch of this tangent-line device appears after this list).
Application of a Mixed Integer Nonlinear Programming Approach to Variable Selection in Logistic Regression
K. Kimura, Journal of the Operations Research Society of Japan, 2019
TLDR: A mixed integer nonlinear programming approach to AIC minimization, originally developed for linear regression, is applied to variable selection in logistic regression and shown to outperform existing approaches, including a piecewise-linear approximation approach, in terms of computational time.
Feature subset selection for the multinomial logit model via mixed-integer optimization
TLDR: This paper proposes an exact algorithm based on a mixed-integer optimization approach with an outer approximation method for the multinomial logit (MNL) model, and proves convergence properties of the algorithm for more general models, including generalized linear models for multiclass classification.
Feature Subset Selection for Ordered Logit Model via Tangent-Plane-Based Approximation
SUMMARY: This paper is concerned with a mixed-integer optimization (MIO) approach to selecting a subset of relevant features from among many candidates. For ordinal classification, a sequential logit…
Stochastic Discrete First-Order Algorithm for Feature Subset Selection
TLDR: This paper proposes a stochastic discrete first-order (SDFO) algorithm for selecting a subset of candidate features for multiple linear regression, and derives the optimal step size in the gradient-descent direction to accelerate convergence of the algorithm.
An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series
TLDR: This paper models both problems as mixed integer nonlinear optimization problems and proposes numerical procedures based on state-of-the-art optimization tools to solve them.
Best subset selection via cross-validation criterion
TLDR: The purpose of this paper is to establish a mixed-integer optimization approach to selecting the best subset of explanatory variables via the cross-validation criterion, which can be formulated as a bilevel MIO problem and reduced to a single-level mixed-integer quadratic optimization problem.
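
Several of the entries above (the piecewise-linear and tangent-plane ones in particular) rest on the same basic device, sketched here: the logistic loss f(v) = \log(1 + \exp(-v)) is convex, so every tangent line lies below it, and the pointwise maximum of a finite family of tangent lines is a piecewise-linear underestimator that a mixed integer linear solver can handle through an epigraph variable. The tangent points v_1, \dots, v_K and the epigraph variables t_i below are illustrative choices of this sketch, not taken from any one of the papers listed.

  f(v) \;\ge\; f(v_k) + f'(v_k)\,(v - v_k), \qquad f'(v_k) = -\frac{1}{1 + \exp(v_k)}, \qquad k = 1, \dots, K,

so minimizing \sum_i f(v_i) with v_i = y_i \mathbf{x}_i^{\top} \beta can be relaxed to minimizing \sum_i t_i subject to the linear constraints t_i \ge f(v_k) + f'(v_k)(v_i - v_k) for every i and k; adding more tangent points tightens the approximation.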

References

Showing 1-10 of 48 references
Piecewise-Linear Approximation for Feature Subset Selection in a Sequential Logit Model
TLDR: This paper applies a piecewise-linear approximation to the logistic loss function of a sequential logit model to frame the feature subset selection problem of minimizing an information criterion as a mixed integer linear optimization problem.
A Branch and Bound Algorithm for Feature Subset Selection
TLDR: A feature subset selection algorithm based on branch-and-bound techniques is developed to select the best subset of m features from an n-feature set with the computational effort of evaluating only about 6000 subsets.
Best Subset Selection via a Modern Optimization Lens
TLDR: It is established via numerical experiments that the MIO approach performs better than the lasso and other popularly used sparse learning procedures in terms of achieving sparse solutions with good predictive power.
Fast branch & bound algorithms for optimal feature selection
TLDR: A novel search principle for optimal feature subset selection using the branch & bound method is introduced, based on a simple mechanism for predicting criterion values, and two implementations of the prediction mechanism are proposed, suitable for use with nonrecursive and recursive criterion forms.
Choosing the best set of variables in regression analysis using integer programming
TLDR: An algorithm for selecting the best set of s variables out of k (> s) candidate variables in a multiple linear regression model is presented, and a heuristic algorithm is proposed to obtain a close-to-optimal set of variables in terms of squared deviation.
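
As a point of reference for the branch-and-bound and integer-programming methods listed above, the following is a minimal Python sketch (using NumPy and SciPy; the function names and the exhaustive-enumeration strategy are choices of this sketch, not any listed paper's method) of the naive baseline those methods aim to improve on: fit a logistic regression for every subset of features and keep the subset with the smallest AIC.

import itertools

import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(beta, X, y):
    # Logistic loss with labels y in {-1, +1}: sum_i log(1 + exp(-y_i * x_i^T beta)).
    margins = y * (X @ beta)
    return np.sum(np.logaddexp(0.0, -margins))

def aic_of_subset(X, y, subset):
    # Fit the submodel by maximum likelihood and return its AIC = 2k - 2 log L.
    Xs = X[:, subset]
    fit = minimize(neg_log_likelihood, np.zeros(Xs.shape[1]), args=(Xs, y), method="BFGS")
    return 2.0 * len(subset) + 2.0 * fit.fun

def best_subset_by_aic(X, y):
    # Exhaustive enumeration: exact, but O(2^p), so feasible only for small p.
    best_aic, best_subset = np.inf, ()
    p = X.shape[1]
    for r in range(1, p + 1):
        for subset in itertools.combinations(range(p), r):
            aic = aic_of_subset(X, y, list(subset))
            if aic < best_aic:
                best_aic, best_subset = aic, subset
    return best_subset, best_aic

With p = 10 candidate features this enumerates 1023 subsets; branch-and-bound pruning and the MIO formulations above aim to certify the same optimum without visiting anywhere near that many, which is what makes subset selection with an optimality guarantee practical at larger p.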