Optimal randomized classification trees
Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río and Dolores Romero Morales. Computers & Operations Research.
Abstract: Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold. This greedy approach trains trees very fast, but, by its nature, the resulting classification accuracy may not be competitive with other state-of-the-art procedures. Moreover, controlling critical issues, such as the…
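To make the greedy procedure described in the abstract concrete, the following is a minimal illustrative sketch of CART-style split selection by Gini impurity; the toy data, function names, and structure are placeholders for illustration, not the formulation from the paper.

```python
# Illustrative sketch of greedy CART split selection (toy data; not the
# optimization model proposed in the paper).

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((k / n) ** 2 for k in counts.values())

def best_split(X, y):
    """Greedily pick the (feature, threshold) pair minimizing the weighted
    impurity of the two children -- no lookahead beyond one split."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[j] <= t]
            right = [y[i] for i, row in enumerate(X) if row[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best  # (weighted child impurity, feature index, threshold)

X = [[2.0, 1.0], [3.0, 1.0], [10.0, 2.0], [11.0, 2.0]]
y = [0, 0, 1, 1]
print(best_split(X, y))  # -> (0.0, 0, 3.0): a perfectly separating split
```

A full CART then recurses on each child, which is exactly why the procedure is fast but locally myopic.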
Sparsity in Optimal Randomized Classification Trees
This paper proposes a continuous optimization approach to build sparse optimal classification trees, based on oblique cuts, with the aim of using fewer predictor variables in the cuts as well as along the whole tree.
Generalized Optimal Sparse Decision Trees
The contribution in this work is to provide a general framework for decision tree optimization that addresses the two significant open problems in the area: treatment of imbalanced data and fully optimizing over continuous variables.
On multivariate randomized classification trees: l0-based sparsity, VC dimension and decomposition methods
This work investigates the nonlinear continuous optimization formulation proposed in Blanquero et al. (2020) for (sparse) optimal randomized classification trees and proposes a general decomposition scheme and an efficient version of it.
Optimal Sparse Decision Trees
This work introduces the first practical algorithm for optimal decision trees for binary variables, a co-design of analytical bounds that reduce the search space and modern systems techniques, including data structures and a custom bit-vector library.
Mathematical optimization in classification and regression trees
Classification and regression trees, as well as their variants, are off-the-shelf methods in Machine Learning. In this paper, we review recent contributions within the Continuous Optimization and the…
Variable selection for Naïve Bayes classification
A sparse version of the Naïve Bayes classifier, characterized by three properties, is proposed; it leads to a smart search with competitive running times, while flexibility in the performance measure used for classification is integrated.
Constrained Naïve Bayes with application to unbalanced data classification
This paper addresses misclassification in the Naïve Bayes classifier by adding constraints to the optimization problem underlying the estimation process; at a reasonable computational cost, the performance measures under consideration achieve the desired levels, yielding a user-friendly constrained classification procedure.
Multiclass Optimal Classification Trees with SVM-splits
A novel mathematical optimization-based methodology to construct tree-shaped classification rules for multiclass instances by means of an SVM separating hyperplane, which provides a Mixed-Integer Nonlinear Programming formulation for the problem.
Learning Optimal Classification Trees Using a Binary Linear Program Formulation
This work provides a new formulation for the problem of learning the optimal classification tree of a given depth as a binary linear program by making the formulation size largely independent from the training data size.
On sparse ensemble methods: An application to short-term predictions of the evolution of COVID-19
A novel Mathematical Optimization model is proposed to build a sparse ensemble, which trades off the accuracy of the ensemble and the number of base regressors used, and the latter is controlled by means of a regularization term that penalizes regressors with a poor individual performance.


Optimal classification trees
Optimal classification trees are presented: a novel formulation of the decision tree problem using modern MIO techniques that yields the optimal decision tree for axis-aligned splits. Synthetic tests demonstrate that these methods recover the true decision tree more closely than heuristics, refuting the notion that optimal methods overfit the training data.
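The optimal-versus-greedy distinction above can be seen on an XOR toy problem, where no single axis-aligned split reduces error yet a depth-2 tree is perfect. Below is a hedged brute-force enumeration of depth-2 trees (illustrative only; the paper itself uses an MIO formulation, not exhaustive search, and all names here are placeholders).

```python
# Brute-force search for the best depth-2 axis-aligned tree on a toy XOR
# dataset. Any single greedy split leaves 2 of 4 points misclassified, but
# the globally optimal depth-2 tree is perfect. (Illustrative enumeration,
# not the MIO formulation from the paper.)
from itertools import product

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]

def leaf_errors(idxs):
    """Misclassifications when the majority class labels a leaf."""
    labels = [y[i] for i in idxs]
    return len(labels) - max(labels.count(0), labels.count(1)) if labels else 0

def best_subtree(idxs, splits):
    """Fewest errors achievable with one split (or a plain leaf) on idxs."""
    best = leaf_errors(idxs)
    for j, t in splits:
        left = [i for i in idxs if X[i][j] <= t]
        right = [i for i in idxs if X[i][j] > t]
        best = min(best, leaf_errors(left) + leaf_errors(right))
    return best

splits = list(product(range(2), [0]))  # candidate tests: x_j <= 0 vs x_j > 0
root_best = min(
    best_subtree([i for i in range(4) if X[i][j] <= t], splits)
    + best_subtree([i for i in range(4) if X[i][j] > t], splits)
    for j, t in splits
)
print(root_best)  # -> 0: the optimal depth-2 tree classifies XOR perfectly
```

By contrast, `best_subtree(range(4), splits)` returns 2, the same error as a bare leaf, so a greedy root split gains nothing here.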
Efficient Non-greedy Optimization of Decision Trees
It is shown that finding optimal linear-combination splits for decision trees is related to structured prediction with latent variables; a convex-concave upper bound on the tree's empirical loss is formed, and optimization with stochastic gradient descent enables effective training on large datasets.
Optimal trees for prediction and prescription
For the past 30 years, decision tree methods have been one of the most widely used approaches in machine learning across industry and academia, due in large part to their interpretability. However, …
Classification and Regression by randomForest
Random forests are proposed, which add an additional layer of randomness to bagging and are robust against overfitting; the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.
Multivariate classification trees based on minimum features discrete support vector machines
Decision trees have been widely recognized as one of the most effective techniques for classification in the data mining context, particularly when dealing with business-oriented applications, such…
Optimal decision trees for categorical data via integer programming
A mixed-integer programming formulation is proposed to construct optimal decision trees of a prespecified size; it takes the special structure of categorical features into account and allows combinatorial decisions (based on subsets of feature values) at each node.
Classification and regression trees
  W. Loh. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2011.
This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weakness in two examples.
Do we need hundreds of classifiers to solve real world classification problems?
The random forest is clearly the best family of classifiers (3 of the 5 best classifiers are RF), followed by SVM (4 classifiers in the top 10), neural networks and boosting ensembles (5 and 3 members in the top 20, respectively).
Optimal Decision Trees
We propose an Extreme Point Tabu Search (EPTS) algorithm that constructs globally optimal decision trees for classification problems. Typically, decision tree algorithms are greedy. They optimize the…
Fast growing and interpretable oblique trees via logistic regression models
The focus of this thesis is to grow oblique trees in a fast, deterministic manner and to propose ways of making them more interpretable; since the proposed approach to finding oblique splits makes use of logistic regression, well-founded variable selection techniques are introduced to classification trees.