• Corpus ID: 59453855

Adaptive Concentration of Regression Trees, with Application to Random Forests

Stefan Wager and Guenther Walther
arXiv: Statistics Theory

We study the convergence of the predictive surface of regression trees and forests. To support our analysis we introduce a notion of adaptive concentration for regression trees. This approach breaks tree training into a model selection phase in which we pick the tree splits, followed by a model fitting phase where we find the best regression model consistent with these splits. We then show that the fitted regression tree concentrates around the optimal predictor with the same splits: as d and n… 
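
The two-phase view in the abstract can be illustrated with a minimal sketch for a single split (a depth-one tree). This is not the authors' code: the function names `select_split`, `fit_leaves`, and `predict` are hypothetical, and the squared-error criterion used in the selection phase is an assumption chosen for illustration.

```python
import numpy as np

def select_split(X, y):
    """Model selection phase: choose the (feature, threshold) pair with the
    smallest within-leaf sum of squared errors. Assumes at least one feature
    takes two or more distinct values."""
    best = None
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        # Candidate thresholds are midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2.0:
            left = X[:, j] <= t
            sse = ((y[left] - y[left].mean()) ** 2).sum() \
                + ((y[~left] - y[~left].mean()) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t)
    return best[1], best[2]

def fit_leaves(X, y, feature, threshold):
    """Model fitting phase: with the split held fixed, the least-squares
    piecewise-constant fit is the mean response in each leaf."""
    left = X[:, feature] <= threshold
    return y[left].mean(), y[~left].mean()

def predict(X, feature, threshold, leaf_means):
    """Predict with the fitted depth-one tree."""
    return np.where(X[:, feature] <= threshold, leaf_means[0], leaf_means[1])

X = np.array([[0.1, 0.9], [0.2, 0.1], [0.8, 0.5], [0.9, 0.3]])
y = np.array([0.0, 0.0, 1.0, 1.0])
feature, threshold = select_split(X, y)
leaf_means = fit_leaves(X, y, feature, threshold)
```

Separating the two functions mirrors the decomposition described above: the partition comes from `select_split`, and `fit_leaves` only computes the best constant prediction within that partition.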

Best Split Nodes for Regression Trees

It is shown that the size of a terminal subnode for a variable is small when the MDI for that variable is large and that this relationship is exponential, confirming theoretically that decision trees with CART have small bias and are adaptive to signal strength and direction.

Sharp Analysis of a Simple Model for Random Forests

A historically important random forest model, where a feature is selected at random and the split occurs at the midpoint of the node along the chosen feature, is revisited, and it is shown that this rate cannot be improved in general.

Analyzing CART.

This paper studies the statistical properties of regression trees constructed with CART and finds that the training error is governed by Pearson's correlation between the optimal decision stump and response data in each node, which is bounded by solving a quadratic program.

Asymptotic Normality for Multivariate Random Forest Estimators.

  • Kevin Li
  • Mathematics, Computer Science
  • 2020
This paper shows that the covariance matrix of the limiting normal distribution is diagonal, so that the estimates at any two points are independent in sufficiently deep trees, and proposes a covariance bound.

Posterior concentration for Bayesian regression trees and forests

A spike-and-tree variant of the popular Bayesian CART prior is proposed and new theoretical results are established showing that regression trees (and their ensembles) are capable of recovering smooth regression surfaces, achieving optimal rates up to a log factor and can adapt to the unknown level of smoothness.

Universal Consistency of Decision Trees in High Dimensions

This paper shows that decision trees constructed with Classification and Regression Trees (CART) methodology are universally consistent in an additive model context, even when the number of predictor

Estimation and Inference with Trees and Forests in High Dimensions

It is proved that if only $r$ of the features are relevant for the mean outcome function, then shallow trees built greedily via the CART empirical MSE criterion achieve MSE rates that depend only logarithmically on the ambient dimension $d$.

Universal Consistency of Decision Trees for High Dimensional Additive Models

This paper shows that decision trees constructed with Classification and Regression Trees (CART) methodology are universally consistent for additive models, even when the dimensionality scales

Impact of subsampling and tree depth on random forests

It is proved that quantile forests (a specific type of random forest) based on subsampling and quantile forests whose tree construction is terminated early have similar performance, as long as their respective parameters (subsampling rate and tree depth) are well chosen.

Complete Analysis of a Random Forest Model

This paper revisits a random forest model originally proposed by Breiman (2004) and later studied by Biau (2012), where a feature is selected at random and the split occurs at the midpoint of the block containing the chosen feature.
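
The splitting rule described in this summary can be sketched as follows, assuming features scaled to the unit cube. The function name `centered_cell` is hypothetical, and the sketch only follows the path of a single query point rather than building a full tree or forest.

```python
import random

def centered_cell(x, depth, rng):
    """Return the hyperrectangle of [0, 1]^d that contains x after `depth`
    splits, each made at the midpoint of the current cell along a feature
    chosen uniformly at random."""
    d = len(x)
    lower, upper = [0.0] * d, [1.0] * d
    for _ in range(depth):
        j = rng.randrange(d)                # feature selected at random
        mid = (lower[j] + upper[j]) / 2.0   # midpoint of the current cell
        if x[j] <= mid:
            upper[j] = mid                  # x falls in the lower half
        else:
            lower[j] = mid                  # x falls in the upper half
    return lower, upper

x = [0.3, 0.7, 0.55]
lower, upper = centered_cell(x, depth=5, rng=random.Random(0))

# Every split halves the cell, so its volume is 2 ** -depth.
volume = 1.0
for lo, hi in zip(lower, upper):
    volume *= hi - lo
```

Because the splits never look at the responses, the shape of the cell depends only on the random feature choices and the query point, which is what makes this model tractable to analyze.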



Random Forests

Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.

Consistency of Random Forests

A step forward in forest exploration is taken by proving a consistency result for Breiman's original algorithm in the context of additive regression models; the analysis also sheds light on how random forests adapt to sparsity.

Tree-structured regression and the differentiation of integrals

This paper provides answers to questions regarding the almost sure limiting behavior of rooted, binary tree-structured rules for regression. Examples show that questions raised by Gordon and Olshen

On the asymptotics of random forests

Analysis of purely random forests bias

Under some regularity assumptions on the regression function, it is shown that the bias of an infinite forest decreases at a faster rate (with respect to the size of each tree) than a single tree, and infinite forests attain a strictly better risk rate than single trees.

Histogram regression estimation using data-dependent partitions

  • A. Nobel
  • Computer Science, Mathematics
  • 1996
The consistency of histogram regression estimates based on cubic partitions with data-dependent offsets, k-thresholding in one dimension, and empirically optimal nearest-neighbor clustering schemes is established.

Analysis of a Random Forests Model

  • G. Biau
  • Computer Science
  • J. Mach. Learn. Res.
  • 2012
An in-depth analysis of a random forests model suggested by Breiman (2004), which is very close to the original algorithm, shows in particular that the procedure is consistent and adapts to sparsity, in the sense that its rate of convergence depends only on the number of strong features and not on how many noise variables are present.

Random Forests and Adaptive Nearest Neighbors

It is shown that random forests with adaptive splitting schemes assign weights to k-PNNs in a desirable way: for the estimation at a given target point, these random forests assign voting weights to the k-PNNs of the target point according to the local importance of different input variables.

Quantile Regression Forests

It is shown here that random forests provide information about the full conditional distribution of the response variable, not only about the conditional mean, in order to be competitive in terms of predictive power.

Impact of subsampling and pruning on random forests

It is shown that fully developed subsampled forests and pruned (without subsampling) forests have similar performance, as long as their respective parameters are well chosen.