# Optimization of Tree Ensembles

@inproceedings{Mivsic2017OptimizationOT, title={Optimization of Tree Ensembles}, author={Velibor V. Mi{\v{s}}i{\'c}}, year={2017} }

Tree ensemble models such as random forests and boosted trees are among the most widely used and practically successful predictive models in applied machine learning and business analytics. Although such models have been used to make predictions based on exogenous, uncontrollable independent variables, they are increasingly being used to make predictions where the independent variables are controllable and are also decision variables. In this paper, we study the problem of tree ensemble…
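The setting can be illustrated with a toy sketch, assuming a hand-built ensemble of depth-1 trees (stumps) with made-up thresholds and leaf values: the ensemble's prediction is piecewise constant in the controllable input, so the input can be optimized by checking one candidate per interval between split points. The paper's actual contribution is a mixed-integer optimization formulation; the brute-force enumeration here is only a stand-in to make the decision problem concrete.

```python
# Toy sketch: a hand-built ensemble of regression stumps whose input x
# is a decision variable. All thresholds and leaf values are made up.

def stump(threshold, left_value, right_value):
    """A depth-1 regression tree: one split on a scalar input x."""
    return lambda x: left_value if x <= threshold else right_value

ensemble = [
    stump(0.3, 1.0, 2.0),
    stump(0.5, 0.5, 3.0),
    stump(0.7, 2.0, 1.0),
]

def predict(x):
    """Ensemble prediction: the average of the trees' outputs."""
    return sum(tree(x) for tree in ensemble) / len(ensemble)

# Because each tree is piecewise constant, predict(x) only changes at
# the split thresholds, so it suffices to test one point per interval.
thresholds = sorted({0.3, 0.5, 0.7})
candidates = [thresholds[0] - 0.1] + [t + 0.01 for t in thresholds]
best_x = max(candidates, key=predict)
```

With these made-up numbers the optimum lies in the interval (0.5, 0.7], where two stumps return 2.0 and one returns 3.0; a MIO formulation encodes the same leaf-selection logic with binary variables instead of enumeration.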

#### 15 Citations

ENTMOOT: A Framework for Optimization over Ensemble Tree Models

- Computer Science, Mathematics
- Comput. Chem. Eng.
- 2021

This work shows how the ENTMOOT approach allows a simple integration of tree models into decision-making and black-box optimization, where it proves to be a strong competitor to commonly used frameworks.

Mathematical optimization in classification and regression trees

- TOP
- 2021

Classification and regression trees, as well as their variants, are off-the-shelf methods in Machine Learning. In this paper, we review recent contributions within the Continuous Optimization and the…

Learning Optimal Classification Trees: Strong Max-Flow Formulations

- Computer Science, Mathematics
- ArXiv
- 2020

This work proposes a flow-based MIP formulation for optimal binary classification trees that has a stronger linear programming relaxation and exploits the structure and max-flow/min-cut duality to derive a Benders' decomposition method, which scales to larger instances.

The Use of Binary Choice Forests to Model and Estimate Discrete Choices

- Computer Science, Economics
- ArXiv
- 2019

We show the equivalence of discrete choice models and the class of binary choice forests, which are random forests based on binary choice trees. This suggests that standard machine learning…

JANOS: An Integrated Predictive and Prescriptive Modeling Framework

- Computer Science, Mathematics
- ArXiv
- 2019

A modeling framework, JANOS, is described that seamlessly integrates predictive and prescriptive analytics, for the first time allowing researchers and practitioners to embed machine learning models in an optimization framework.

Rare-Event Simulation for Neural Network and Random Forest Predictors

- Computer Science, Mathematics
- ArXiv
- 2020

An importance sampling scheme is investigated that integrates the dominating point machinery in large deviations and sequential mixed integer programming to locate the underlying dominating points in rare-event simulation.

Decision Forest: A Nonparametric Approach to Modeling Irrational Choice

- Computer Science, Mathematics
- ArXiv
- 2019

A new nonparametric choice model is proposed that relaxes the rationality assumption and can model a wider range of customer behavior, such as decoy effects between products; it outperforms both rational and non-rational benchmark models in out-of-sample predictive ability.

Exact Logit-Based Product Design

- Mathematics
- 2021

The share-of-choice product design (SOCPD) problem is to find the product, as defined by its attributes, that maximizes market share arising from a collection of customer types or segments. When…

BooST: Boosting Smooth Trees for Partial Effect Estimation in Nonlinear Regressions

- Mathematics, Computer Science
- ArXiv
- 2018

A new machine learning model for nonlinear regression, the Boosted Smooth Transition Regression Trees (BooST), combines boosting algorithms with smooth transition regression trees and can provide more interpretation of the mapping between the covariates and the dependent variable than other tree-based models.
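The "smooth transition" idea in the summary above can be sketched in a few lines: replace the hard indicator split of an ordinary regression tree with a logistic transition, so the fitted function is differentiable in the covariate. The constants below (split point, transition slope, leaf values) are illustrative and not taken from the BooST paper.

```python
import math

# A hard split is a step function of x; a smooth transition blends the
# two leaf values with a logistic weight, making the output (and hence
# partial effects) differentiable in x. All constants are made up.

def hard_split(x, c=0.5, left=1.0, right=3.0):
    """Ordinary tree split: a discontinuous jump at the threshold c."""
    return left if x <= c else right

def smooth_split(x, c=0.5, gamma=10.0, left=1.0, right=3.0):
    """Smooth transition: gamma controls how sharp the transition is."""
    s = 1.0 / (1.0 + math.exp(-gamma * (x - c)))  # logistic weight in (0, 1)
    return (1 - s) * left + s * right
```

As gamma grows, `smooth_split` approaches `hard_split`; at the split point itself it returns the average of the two leaf values.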

#### References

SHOWING 1-10 OF 54 REFERENCES

Optimal classification trees

- Mathematics, Computer Science
- Machine Learning
- 2017

Optimal classification trees are presented: a novel formulation of the decision tree problem using modern MIO techniques that yields the optimal decision tree for axes-aligned splits. Synthetic tests demonstrate that these methods recover the true decision tree more closely than heuristics, refuting the notion that optimal methods overfit the training data.

Adaptive Concentration of Regression Trees, with Application to Random Forests

- Mathematics
- 2015

We study the convergence of the predictive surface of regression trees and forests. To support our analysis we introduce a notion of adaptive concentration for regression trees. This approach breaks…

A working guide to boosted regression trees.

- Computer Science, Medicine
- The Journal of animal ecology
- 2008

This study provides a working guide to boosted regression trees (BRT), an ensemble method for fitting statistical models that differs fundamentally from conventional techniques that aim to fit a single parsimonious model.
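The mechanism behind boosted regression trees, as summarized above, can be shown with a minimal from-scratch sketch under squared-error loss: each stage fits a depth-1 tree (stump) to the current residuals and is added with a shrinkage factor. All hyperparameter values here are illustrative; the cited guide discusses how to choose them in practice.

```python
# Minimal gradient boosting with regression stumps, squared-error loss.
# Each stage fits a stump to the residuals of the running prediction.

def fit_stump(xs, residuals):
    """Pick the split minimizing squared error of the two leaf means."""
    best = None
    for c in xs:
        left = [r for x, r in zip(xs, residuals) if x <= c]
        right = [r for x, r in zip(xs, residuals) if x > c]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, c, ml, mr)
    _, c, ml, mr = best
    return lambda x: ml if x <= c else mr

def boost(xs, ys, n_stages=50, lr=0.1):
    """Fit n_stages stumps sequentially, each shrunk by lr."""
    pred = [0.0] * len(xs)
    stumps = []
    for _ in range(n_stages):
        residuals = [y - p for y, p in zip(ys, pred)]
        s = fit_stump(xs, residuals)
        stumps.append(s)
        pred = [p + lr * s(x) for x, p in zip(xs, pred)]
    return lambda x: lr * sum(s(x) for s in stumps)

# Fit a simple step function: the residuals shrink geometrically.
xs = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8]
ys = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
model = boost(xs, ys)
```

On this toy step-function data the training error shrinks by a factor of (1 - lr) per stage, which is exactly the role of shrinkage discussed in guides to BRT.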

Do we need hundreds of classifiers to solve real world classification problems?

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2014

The random forest is clearly the best family of classifiers (3 of the 5 best classifiers are RF), followed by SVM (4 classifiers in the top 10), neural networks, and boosting ensembles (5 and 3 members in the top 20, respectively).

Classification and Regression by randomForest

- Computer Science
- 2007

Random forests are proposed, which add an additional layer of randomness to bagging and are robust against overfitting; the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.
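The "additional layer of randomness" mentioned above can be illustrated schematically: each tree is trained on a bootstrap resample of the data (bagging), and predictions are combined by majority vote. The stump learner and data below are toy stand-ins, not the algorithm implemented by the randomForest package.

```python
import random

# Toy forest: each "tree" is a stump trained on a bootstrap resample;
# class predictions are combined by majority vote. Purely schematic.

def train_stump(sample):
    """Choose the threshold on x that best separates the two classes."""
    best = None
    for x, _ in sample:
        correct = sum(1 for xi, yi in sample if (xi > x) == (yi == 1))
        if best is None or correct > best[0]:
            best = (correct, x)
    c = best[1]
    return lambda x: 1 if x > c else 0

def train_forest(data, n_trees=25, seed=0):
    """Bagging: each stump sees its own bootstrap resample of data."""
    rng = random.Random(seed)
    trees = [train_stump([rng.choice(data) for _ in data])
             for _ in range(n_trees)]

    def predict(x):
        votes = sum(tree(x) for tree in trees)
        return 1 if votes * 2 > len(trees) else 0

    return predict

# Two well-separated classes on the real line.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
forest = train_forest(data)
```

A real random forest also samples a random subset of features at each split; with one feature, bagging is the only layer of randomness this sketch can show.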

Consistency of Random Forests

- Mathematics
- 2015

Random forests are a learning algorithm proposed by Breiman [Mach. Learn. 45 (2001) 5–32] that combines several randomized decision trees and aggregates their predictions by averaging. Despite its…

Best Subset Selection via a Modern Optimization Lens

- Mathematics
- 2015

In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200 billion factor speedup in solving Mixed…

The Power and Limits of Predictive Approaches to Observational-Data-Driven Optimization

- Mathematics
- 2016

While data-driven decision-making is transforming modern operations, most large-scale data is of an observational nature, such as transactional records. These data pose unique challenges in a variety…

OR Forum - An Algorithmic Approach to Linear Regression

- Mathematics, Computer Science
- Oper. Res.
- 2016

This work presents an MIQO-based approach for designing high-quality linear regression models that explicitly addresses various competing objectives, and demonstrates the effectiveness of the approach on both real and synthetic data sets.

Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling

- Mathematics, Computer Science
- J. Chem. Inf. Comput. Sci.
- 2003

It is the combination of relatively high prediction accuracy and its collection of desired features that makes Random Forest uniquely suited for modeling in cheminformatics.