Extremely randomized trees

@article{Geurts2006ExtremelyRT,
  title={Extremely randomized trees},
  author={Pierre Geurts and Damien Ernst and Louis Wehenkel},
  journal={Machine Learning},
  year={2006},
  volume={63},
  pages={3-42}
}
This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of randomizing strongly both attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the…
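The node-splitting rule summarized above is simple enough to sketch directly. What follows is a minimal Python/numpy sketch of one such split, not the authors' reference implementation: the score argument (standing in for a measure such as the normalized information gain used in the paper) and the helper name random_split are illustrative assumptions.

# Minimal sketch of one Extra-Trees-style node split (illustrative only,
# not the authors' implementation). Assumes numeric features in X and a
# `score` function measuring split quality, e.g. a normalized information gain.
import numpy as np

def random_split(X, y, K, score, rng=np.random.default_rng()):
    """Draw K candidate attributes, one uniform cut-point per attribute,
    and keep the (attribute, cut-point) pair with the best score."""
    n_features = X.shape[1]
    candidates = rng.choice(n_features, size=min(K, n_features), replace=False)
    best = None
    for a in candidates:
        lo, hi = X[:, a].min(), X[:, a].max()
        if lo == hi:                  # constant attribute: no valid cut-point
            continue
        cut = rng.uniform(lo, hi)     # cut-point drawn at random, not optimized
        s = score(y, X[:, a] < cut)   # only this comparison looks at the outputs
        if best is None or s > best[0]:
            best = (s, a, cut)
    return best  # (score, attribute index, cut-point) or None

# With K = 1 the tree structure is essentially independent of the outputs
# (the totally randomized trees of the abstract); larger K moves the
# procedure closer to classical greedy tree induction.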
Citations

Improvement of randomized ensembles of trees for supervised learning in very high dimension
Tree-based ensemble methods, such as random forests and extremely randomized trees, are methods of choice for handling high-dimensional problems. One important drawback of these methods, however, is…
Learning with Ensembles of Randomized Trees: New Insights
A connection with kernel target alignment, a measure of kernel quality, is pointed out, which suggests that randomization is a way to obtain a high alignment, leading to possibly low generalization error.
Towards generating random forests via extremely randomized trees
The results on several public datasets show that random partition without exhaustive search at each node of a decision tree can yield better performance with less computational complexity.
Consistency of Random Forests
Random forests are a learning algorithm proposed by Breiman [Mach. Learn. 45 (2001) 5–32] that combines several randomized decision trees and aggregates their predictions by averaging. Despite its…
Influence of Hyperparameters on Random Forest Accuracy
This work evaluates the Forest-RI algorithm on several machine learning problems and with different settings of K in order to understand how K affects RF performance, and shows that the default values of K traditionally used in the literature are globally near-optimal, except for some cases in which they are significantly sub-optimal. (A usage sketch of this parameter appears after this list.)
AMF: Aggregated Mondrian Forests for Online Learning
AMF, an online random forest algorithm based on Mondrian forests, is introduced, and it is shown that an exact aggregation over all prunings of the trees can be performed efficiently, yielding a truly online, parameter-free algorithm.
Random Forests with Stochastic Induction of Decision Trees
The proposed algorithm uses a stochastic process to induce each decision tree, assigning a probability to the selection of the split attribute at every tree node; these probabilities are designed to create strong and independent trees.
Embedding Monte Carlo Search of Features in Tree-Based Ensemble Methods
A general scheme is proposed to embed feature generation in a wide range of tree-based learning algorithms, including single decision trees, random forests, and tree boosting. It is based on formalizing feature construction as a sequential decision-making problem, addressed by a tractable Monte Carlo search algorithm coupled with node splitting.
An Empirical Comparison of Supervised Ensemble Learning Approaches
We present an extensive empirical comparison between twenty prototypical supervised ensemble learning algorithms, including Boosting, Bagging, Random Forests, Rotation Forests, Arc-X4, …
An extensive empirical comparison of ensemble learning methods for binary classification
We present an extensive empirical comparison between nineteen prototypical supervised ensemble learning algorithms, including Boosting, Bagging, Random Forests, Rotation Forests, Arc-X4, …
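The hyperparameter entry above concerns the number K of attributes examined at each node. As a purely illustrative usage sketch (assuming scikit-learn is installed; the dataset, the parameter values, and the reading of max_features as the K analogue are assumptions, not results from the papers listed here), the same knob can be varied in an off-the-shelf implementation:

# Usage sketch only: default vs. single-attribute choice at each split.
# Assumes scikit-learn; dataset and settings are arbitrary.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# max_features plays the role of K: "sqrt" is the usual classification default,
# max_features=1 means a single randomly chosen attribute per split.
for max_features in ("sqrt", 1):
    et = ExtraTreesClassifier(n_estimators=200, max_features=max_features, random_state=0)
    rf = RandomForestClassifier(n_estimators=200, max_features=max_features, random_state=0)
    print(max_features,
          cross_val_score(et, X, y, cv=5).mean(),
          cross_val_score(rf, X, y, cv=5).mean())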

References

Showing 1–10 of 76 references
The Random Subspace Method for Constructing Decision Forests
  • T. Ho
  • IEEE Trans. Pattern Anal. Mach. Intell.
  • 1998
A method to construct a decision-tree-based classifier is proposed that maintains highest accuracy on training data and improves on generalization accuracy as it grows in complexity.
An Empirical Comparison of Selection Measures for Decision-Tree Induction
The paper considers a number of different measures and experimentally examines their behavior in four domains, showing that the choice of measure affects the size of a tree but not its accuracy, which remains the same even when attributes are selected randomly.
Approximate Splitting for Ensembles of Trees using Histograms
The approach combines the ideas behind discretization through histograms and randomization in ensembles: decision trees are created by randomly selecting a split point in an interval around the best bin boundary in the histogram. (A short sketch of this idea follows the reference list.)
An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization
The experiments show that in situations with little or no classification noise, randomization is competitive with (and perhaps slightly superior to) bagging but not as accurate as boosting; with substantial classification noise, bagging is much better than boosting, and sometimes better than randomization.
A further comparison of splitting rules for decision-tree induction
The results indicate that random splitting leads to increased error, and they are at variance with those presented by Mingers.
A Further Comparison of Splitting Rules for Decision-Tree Induction
One approach to learning classification rules from examples is to build decision trees. A review and comparison paper by Mingers (Mingers, 1989) looked at the first stage of tree building, which uses…
Some Infinity Theory for Predictor Ensembles
To dispel some of the mystery about what makes tree ensembles work, they are looked at in distribution space, i.e., the limit case of "infinite" sample size. It is shown that the simplest kind of trees…
Random Forests
  • L. Breiman
  • Machine Learning
  • 2001
Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
PERT – Perfect Random Tree Ensembles
Ensemble classifiers originated in the machine learning community. They work by fitting many individual classifiers and combining them by weighted or unweighted voting. The ensemble classifier is…
An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants
It is found that Bagging improves when probabilistic estimates in conjunction with no pruning are used, as well as when the data are backfit, and that Arc-x4 behaves differently from AdaBoost if reweighting is used instead of resampling, indicating a fundamental difference.
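The histogram-based splitting idea referenced above ("Approximate Splitting for Ensembles of Trees using Histograms") can also be sketched in a few lines. This is an illustrative numpy sketch only: the bin count, the impurity_reduction scorer, and the width of the randomization interval are assumptions, not details taken from that paper.

# Sketch of histogram-based approximate splitting with randomization
# (illustrative; bin count, scorer, and interval width are assumptions).
import numpy as np

def histogram_random_split(x, y, impurity_reduction, n_bins=32,
                           rng=np.random.default_rng()):
    """Score histogram bin boundaries, then draw the actual cut-point
    uniformly from the interval around the best boundary."""
    edges = np.histogram_bin_edges(x, bins=n_bins)
    boundaries = edges[1:-1]                      # interior bin boundaries
    scores = [impurity_reduction(y, x < b) for b in boundaries]
    i = int(np.argmax(scores))                    # best boundary is edges[i + 1]
    return rng.uniform(edges[i], edges[i + 2])    # random cut near the best boundary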