Random Forests

@article{Breiman2001RandomF,
  title={Random Forests},
  author={L. Breiman},
  journal={Machine Learning},
  year={2001},
  volume={45},
  pages={5-32}
}
  • L. Breiman
  • Published 1 October 2001
  • Computer Science
  • Machine Learning
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features… 
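
The scheme the abstract describes (a bootstrap sample per tree, a random feature subset at each split, aggregation by majority vote) can be illustrated with a minimal sketch in Python; the names build_forest and forest_predict and all parameter values are illustrative choices, not taken from the paper, and inputs are assumed to be NumPy arrays with integer-coded class labels.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def build_forest(X, y, n_trees=100, seed=0):
    # grow each tree on its own bootstrap sample, with a random feature
    # subset considered at every split (the "random vector" per tree)
    rng = np.random.default_rng(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)  # bootstrap sample, drawn with replacement
        tree = DecisionTreeClassifier(
            max_features="sqrt",          # random subset of features at each split
            random_state=int(rng.integers(2**31 - 1)),
        )
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def forest_predict(trees, X):
    votes = np.stack([t.predict(X).astype(int) for t in trees])  # (n_trees, n_samples)
    # majority vote across trees
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)

Shrinking max_features decorrelates the trees at the cost of weakening each one individually, which is the strength-versus-correlation trade-off the abstract refers to.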

Ensemble of optimal trees, random forest and random projection ensemble classification

This work investigates the idea of integrating trees that are accurate and diverse and uses out-of-bag observations as a validation sample from the training bootstrap samples, to choose the best trees based on their individual performance and assess these trees for diversity using the Brier score on an independent validation sample.
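
A rough sketch of that selection idea, under stated assumptions: trees are first ranked by out-of-bag accuracy, and a candidate is then kept only if it does not worsen the ensemble's Brier score on an independent validation sample. The names brier_score and select_trees, the top_frac cut-off, and the acceptance rule are hypothetical, not the authors' exact procedure.

import numpy as np

def brier_score(p_pos, y):
    # mean squared difference between the predicted P(class 1) and the 0/1 label
    return np.mean((p_pos - y) ** 2)

def select_trees(trees, oob_accuracy, X_val, y_val, top_frac=0.5):
    # 1) keep the individually strongest trees, ranked by out-of-bag accuracy
    order = np.argsort(oob_accuracy)[::-1]
    candidates = [trees[i] for i in order[: max(1, int(top_frac * len(trees)))]]
    # 2) add candidates greedily, accepting a tree only if the validation
    #    Brier score of the averaged class-1 probability does not get worse
    selected, probs, best = [], [], np.inf
    for tree in candidates:
        p = tree.predict_proba(X_val)[:, 1]
        score = brier_score(np.mean(probs + [p], axis=0), y_val)
        if score <= best:
            selected.append(tree)
            probs.append(p)
            best = score
    return selected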

Effects of stopping criterion on the growth of trees in regression random forests

This work develops a straightforward method for incorporating weights into the random forest analysis of survey data and demonstrates that the generalization error under the proposed approach is competitive with that attained by the original random forest approach when the data have large random error variability.

An Ensemble of Optimal Trees for Classification and Regression (OTE)

This work investigates the idea of integrating trees that are accurate and diverse, and utilizes out-of-bag observations as a validation sample from the training bootstrap samples to choose the best trees based on their individual performance and then assess these trees for diversity using the Brier score.

Improvement of randomized ensembles of trees for supervised learning in very high dimension

Empirical experiments show that combining the monotone LASSO with features extracted from tree ensembles leads to a drastic reduction in the number of features and can at the same time improve accuracy with respect to unpruned ensembles of trees.
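
A sketch of the general pipeline only, with a plain L1-penalized logistic regression standing in for the monotone LASSO used in the paper: leaf memberships from a tree ensemble become sparse indicator features, and the L1 penalty then discards most of them. The function name and parameter values are illustrative.

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

def tree_embedding_l1(X_train, y_train, n_trees=200, C=0.1):
    # features extracted from a tree ensemble: one indicator per reached leaf
    forest = RandomForestClassifier(n_estimators=n_trees).fit(X_train, y_train)
    leaves = forest.apply(X_train)                 # (n_samples, n_trees) leaf indices
    encoder = OneHotEncoder(handle_unknown="ignore").fit(leaves)
    Z = encoder.transform(leaves)                  # sparse leaf-indicator matrix
    # the L1 penalty keeps only a small subset of the leaf features
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(Z, y_train)
    return forest, encoder, clf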

On the selection of decision trees in Random Forests

It is shown that better subsets of decision trees can be obtained even using a sub-optimal classifier selection method, which proves that the “classical” RF induction process, in which randomized trees are arbitrarily added to the ensemble, is not the best approach for producing accurate RF classifiers.

Trees Weighting Random Forest Method for Classifying High-Dimensional Noisy Data

This paper presents a new approach to the problem of noisy trees in random forests, named Trees Weighting Random Forest (TWRF), which weights the trees according to their classification ability.
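
A hypothetical sketch of such tree weighting, using per-tree out-of-bag accuracy as a stand-in for "classification ability"; the exact weighting scheme of TWRF may differ.

import numpy as np

def weighted_vote(trees, weights, X, n_classes):
    # each tree's vote counts in proportion to its estimated classification ability
    scores = np.zeros((len(X), n_classes))
    for tree, w in zip(trees, weights):            # e.g. weights = per-tree OOB accuracy
        pred = tree.predict(X).astype(int)         # integer-coded class labels assumed
        scores[np.arange(len(X)), pred] += w
    return scores.argmax(axis=1)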

Random Forests and Decision Trees Classifiers: Effects of Data Quality on the Learning Curve

It appeared that random forests and individual decision trees have different sensitivities to those perturbation factors; counterintuitively, random forests show a greater sensitivity to noise than decision trees do.

Is rotation forest the best classifier for problems with continuous features?

It is demonstrated that on large problems rotation forest can be made an order of magnitude faster without significant loss of accuracy, and it is maintained that without any domain knowledge to indicate an algorithm preference, rotation forest should be the default algorithm of choice for problems with continuous attributes.

Analysis of purely random forests bias

Under some regularity assumptions on the regression function, it is shown that the bias of an infinite forest decreases at a faster rate (with respect to the size of each tree) than a single tree, and infinite forests attain a strictly better risk rate than single trees.
...

References

SHOWING 1-10 OF 19 REFERENCES

On the Algorithmic Implementation of Stochastic Discrimination

  • E. Kleinberg
  • Computer Science
  • IEEE Trans. Pattern Anal. Mach. Intell.
  • 2000
The underlying mathematical theory of stochastic discrimination is outlined, and a remark concerning boosting is made, which provides a theoretical justification for properties of that method observed in practice, including its ability to generalize.

Boosting the margin: A new explanation for the effectiveness of voting methods

It is shown that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error.
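
For a voting classifier built from base hypotheses h_t with non-negative, normalized weights \alpha_t, the margin of a labelled example in this line of work is the weighted vote for the correct label minus the largest weighted vote for any wrong label:

\[
\operatorname{margin}(x, y) \;=\; \sum_{t:\,h_t(x) = y} \alpha_t \;-\; \max_{y' \neq y} \sum_{t:\,h_t(x) = y'} \alpha_t,
\qquad \alpha_t \ge 0,\quad \sum_t \alpha_t = 1.
\]

A large positive margin means the example is classified correctly by a wide vote; the paper relates the distribution of these margins over the training set to the test error.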

Shape Quantization and Recognition with Randomized Trees

A new approach to shape recognition is presented, based on a virtually infinite family of binary features (queries) of the image data, designed to accommodate prior information about shape invariance and regularity, along with a comparison with artificial neural network methods.

An Efficient Method To Estimate Bagging's Generalization Error

This paper presents several techniques for estimating the generalization error of a bagged learning algorithm without invoking yet more training of the underlying learning algorithm (beyond that of the bagging itself), as is required by cross-validation-based estimation.
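
One way such an estimate can be computed, sketched here assuming an out-of-bag style scheme: each training point is scored only by the bagged members whose bootstrap sample did not contain it, so no training beyond the bagging itself is required. The function and argument names are illustrative.

import numpy as np

def oob_error(trees, bootstrap_indices, X, y, n_classes):
    # aggregate, for each training point, only the members that never saw it
    votes = np.zeros((len(X), n_classes))
    for tree, idx in zip(trees, bootstrap_indices):
        oob = np.setdiff1d(np.arange(len(X)), idx)   # points left out of this bootstrap
        if len(oob):
            pred = tree.predict(X[oob]).astype(int)
            votes[oob, pred] += 1
    covered = votes.sum(axis=1) > 0                  # left out by at least one member
    return np.mean(votes[covered].argmax(axis=1) != y[covered])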

Arcing classifier (with discussion and a rejoinder by the author)

Two arcing algorithms are explored, compared to each other and to bagging, and the definitions of bias and variance for a classifier as components of the test set error are introduced.

Experiments with a New Boosting Algorithm

This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.

Bagging predictors

Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.

Bias, Variance and Prediction Error for Classification Rules

A decomposition of prediction error into its natural components is developed, and a bootstrap estimate of the error of a "bagged" classifier is obtained.
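
For squared-error loss the familiar additive decomposition of prediction error at a point x is

\[
\mathbb{E}\big[(Y - \hat f(x))^2\big] \;=\; \sigma_\varepsilon^2 \;+\; \big(\mathbb{E}[\hat f(x)] - f(x)\big)^2 \;+\; \operatorname{Var}\big(\hat f(x)\big),
\]

i.e. irreducible noise plus squared bias plus variance; the decomposition the paper develops for classification rules under 0-1 loss takes a different form, so this display is only the regression analogue.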

Randomizing Outputs to Increase Prediction Accuracy

Two methods of randomizing outputs are experimented with: one is called output smearing and the other output flipping; both are shown to consistently do better than bagging.
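
An illustrative sketch of output flipping for classification, under the simplifying assumption that flipped labels are reassigned uniformly at random (the paper's exact flipping scheme may differ); flip_rate and the other names are illustrative.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def output_flipping_ensemble(X, y, n_members=50, flip_rate=0.1, seed=0):
    # each member is trained on labels where a small random fraction has been
    # reassigned, injecting diversity on the output side instead of resampling inputs
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    members = []
    for _ in range(n_members):
        y_noisy = y.copy()
        flip = rng.random(len(y)) < flip_rate
        y_noisy[flip] = rng.choice(classes, size=int(flip.sum()))
        members.append(DecisionTreeClassifier().fit(X, y_noisy))
    return members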

An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization

The experiments show that in situations with little or no classification noise, randomization is competitive with (and perhaps slightly superior to) bagging but not as accurate as boosting, while in situations with substantial classification noise, bagging is much better than boosting and sometimes better than randomization.