Corpus ID: 189928424

Linear Aggregation in Tree-based Estimators

  • Soren R. Kunzel, Theo F. Saarinen, Edward W. Liu, Jasjeet S. Sekhon
  • arXiv: Methodology
Regression trees and their ensembles are popular methods for non-parametric regression, combining strong predictive performance with interpretable estimators. In order to improve their utility for smooth response surfaces, we study regression trees and random forests with linear aggregation functions. We introduce a new algorithm which finds the best axis-aligned split to fit optimal linear aggregation functions on the corresponding nodes and implement this method in the provably…
Empirical Bayes mean estimation with nonparametric errors via order statistic regression
We study empirical Bayes estimation of the effect sizes of $N$ units from $K$ noisy observations on each unit. We show that it is possible to achieve near-Bayes optimal mean squared error, without…
A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds
  • Yan Shuo Tan, Abhineet Agarwal, Bin Yu
  • Mathematics, Computer Science
  • 2021
A sharp squared error generalization lower bound is proved for a large class of decision tree algorithms fitted to sparse additive models with C component functions, and a novel connection is established between decision tree estimation and rate-distortion theory, a sub-field of information theory.
Bayesian Additive Regression Trees with Model Trees
An extension of BART, called Model Trees BART (MOTR-BART), that considers piecewise linear functions at node levels instead of piecewise constants, so that local linearities are captured more efficiently and fewer trees are required to achieve equal or better performance than BART.
Nonparametric identification is not enough, but randomized controlled trials are
We argue that randomized controlled trials (RCTs) are special even among settings where average treatment effects are identified by a nonparametric unconfoundedness assumption. This claim follows…


Local Linear Forests
A central limit theorem valid under regularity conditions on the forest and smoothness constraints is proved, a computationally efficient construction for confidence intervals is proposed, and a causal inference application is discussed.
Generalized random forests
We propose generalized random forests, a method for non-parametric statistical estimation based on random forests (Breiman, 2001) that can be used to fit any quantity of interest identified as the…
Functional Models for Regression Tree Leaves
  • L. Torgo
  • Mathematics, Computer Science
  • ICML
  • 1997
This study indicates that by integrating regression trees with other regression approaches the authors are able to overcome the limitations of individual methods in terms of both accuracy and computational efficiency.
Quantile Regression Forests
It is shown here that random forests provide information about the full conditional distribution of the response variable, not only about the conditional mean, in order to be competitive in terms of predictive power.
BART: Bayesian Additive Regression Trees
We develop a Bayesian "sum-of-trees" model where each tree is constrained by a regularization prior to be a weak learner, and fitting and inference are accomplished via an iterative Bayesian…
Supervised Neighborhoods for Distributed Nonparametric Regression
A new method, Silo, is proposed for fitting prediction-time local models using supervised neighborhoods that adapt to the local shape of the regression surface; it works well in both the serial and distributed settings.
Classification and regression trees
  • W. Loh
  • Computer Science
  • Wiley Interdiscip. Rev. Data Min. Knowl. Discov.
  • 2011
This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples.
Model-Based Recursive Partitioning
Recursive partitioning is embedded into the general and well-established class of parametric models that can be fitted using M-type estimators (including maximum likelihood). An algorithm for…
Analysis of a Random Forests Model
  • G. Biau
  • Mathematics, Computer Science
  • J. Mach. Learn. Res.
  • 2012
An in-depth analysis of a random forests model suggested by Breiman (2004), which is very close to the original algorithm, showing in particular that the procedure is consistent and adapts to sparsity, in the sense that its rate of convergence depends only on the number of strong features and not on how many noise variables are present.
Fitting Prediction Rule Ensembles with R Package pre
Results indicate that pre derives ensembles with predictive accuracy similar to that of random forests while using a smaller number of variables for prediction, and that it provides better accuracy and sparsity than the original RuleFit implementation.