An Analysis of Reduced Error Pruning
@article{Elomaa2001AnAO, title={An Analysis of Reduced Error Pruning}, author={Tapio Elomaa and Matti K{\"a}{\"a}ri{\"a}inen}, journal={J. Artif. Intell. Res.}, year={2001}, volume={15}, pages={163-187} }
Top-down induction of decision trees has been observed to suffer from the inadequate functioning of the pruning phase. In particular, it is known that the size of the resulting tree grows linearly with the sample size, even though the accuracy of the tree does not improve. Reduced Error Pruning is an algorithm that has been used as a representative technique in attempts to explain the problems of decision tree learning.
In this paper we present analyses of Reduced Error Pruning in three…
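Reduced Error Pruning, the subject of the abstract above, works bottom-up over a tree grown on training data, replacing a subtree with its majority-class leaf whenever that replacement does not increase error on a held-out pruning set. A minimal Python sketch of the idea (the `Node` class and its field names are illustrative, not taken from the paper):

```python
# Minimal sketch of Reduced Error Pruning (REP) for a binary decision
# tree. Internal nodes also store the majority label of the training
# examples that reached them, so each subtree has a candidate leaf.

class Node:
    def __init__(self, feature=None, threshold=None, left=None,
                 right=None, majority_label=None):
        self.feature = feature              # split feature index (None for a leaf)
        self.threshold = threshold          # split threshold
        self.left = left                    # subtree for x[feature] <= threshold
        self.right = right                  # subtree for x[feature] > threshold
        self.majority_label = majority_label  # label predicted if made a leaf

    def is_leaf(self):
        return self.feature is None

def predict(node, x):
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.majority_label

def errors(node, data):
    """Count misclassifications of `node` on labeled pairs (x, y)."""
    return sum(predict(node, x) != y for x, y in data)

def reduced_error_prune(node, prune_data):
    """Bottom-up: replace a subtree by its majority-class leaf whenever
    that does not increase error on the held-out pruning set."""
    if node.is_leaf():
        return node
    node.left = reduced_error_prune(
        node.left, [(x, y) for x, y in prune_data if x[node.feature] <= node.threshold])
    node.right = reduced_error_prune(
        node.right, [(x, y) for x, y in prune_data if x[node.feature] > node.threshold])
    leaf = Node(majority_label=node.majority_label)
    if errors(leaf, prune_data) <= errors(node, prune_data):
        return leaf
    return node
```

Because each subtree is judged only on the pruning examples that reach it, a single bottom-up pass suffices, which is part of why REP is a convenient object of theoretical analysis.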
122 Citations
The Difficulty of Reduced Error Pruning of Leveled Branching Programs
- Computer Science, Annals of Mathematics and Artificial Intelligence
- 2004
Although the underlying optimization problem is proved to be APX-hard, the experiments show that, despite the negative theoretical results, heuristic pruning of branching programs can reduce their size without significantly altering the accuracy.
Is Error-Based Pruning Redeemable?
- Computer Science, Int. J. Artif. Intell. Tools
- 2003
Experimental results support the conclusion that error-based pruning can produce appropriately sized trees with good accuracy when compared with reduced error pruning.
A novel decision tree classification based on post-pruning with Bayes minimum risk
- Computer Science, PLoS ONE
- 2018
A post-pruning method that considers various evaluation standards such as attribute selection, accuracy, tree complexity, and time taken to prune the tree, precision/recall scores, TP/FN rates and area under ROC is proposed.
A k-norm pruning algorithm for decision tree classifiers based on error rate estimation
- Computer Science, Machine Learning
- 2007
This work applies Lidstone’s Law of Succession for the estimation of the class probabilities and error rates of decision tree classifiers, and proposes an efficient pruning algorithm, called k-norm pruning, that has a clear theoretical interpretation, is easily implemented, and does not require a validation set.
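Lidstone's Law of Succession smooths a leaf's class-probability estimate to P(c) = (n_c + k) / (n + kC) for a leaf that saw n examples, n_c of them in class c, with C classes and smoothing parameter k > 0 (k = 1 gives Laplace's rule). A small illustrative sketch, with function names that are mine rather than the paper's:

```python
# Lidstone-smoothed class probabilities and the resulting error-rate
# estimate for a leaf that predicts its majority class. No validation
# set is needed: the estimate uses only the counts at the leaf.

def lidstone_probability(class_count, total, num_classes, k=1.0):
    """P(c) = (n_c + k) / (n + k*C); k=1 is Laplace smoothing."""
    return (class_count + k) / (total + k * num_classes)

def lidstone_error_rate(counts, k=1.0):
    """Estimated error rate of a leaf predicting the majority class,
    given the per-class example counts seen at the leaf."""
    total = sum(counts)
    num_classes = len(counts)
    majority = max(counts)
    return 1.0 - lidstone_probability(majority, total, num_classes, k)
```

The smoothing keeps every class probability strictly positive, so even a pure leaf is assigned a nonzero estimated error, which is what makes count-based pruning criteria like this one well defined without held-out data.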
Selective Rademacher Penalization and Reduced Error Pruning of Decision Trees
- Computer Science, J. Mach. Learn. Res.
- 2004
This paper applies Rademacher penalization to the practically important hypothesis class of unrestricted decision trees by considering the prunings of a given decision tree rather than the tree-growing phase, and generalizes the error-bounding approach from binary classification to multi-class settings.
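Empirical Rademacher penalization measures how well a hypothesis class can fit random ±1 labels on the given sample: the better it fits noise, the larger the complexity penalty. For a finite class, such as a set of prunings of a fixed tree, it can be estimated by Monte Carlo. The sketch below is illustrative only and is not the paper's method, which avoids enumerating prunings:

```python
import random

# Monte Carlo estimate of the empirical Rademacher complexity of a
# finite classifier set. Each classifier is represented simply by its
# vector of predictions in {-1, +1} on the n sample points.

def empirical_rademacher(prediction_vectors, trials=1000, seed=0):
    """Estimate E_sigma[ sup_h (1/n) * sum_i sigma_i * h(x_i) ] over
    random sign vectors sigma, for a finite set of classifiers h."""
    rng = random.Random(seed)
    n = len(prediction_vectors[0])
    total = 0.0
    for _ in range(trials):
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        total += max(sum(s * p for s, p in zip(sigma, preds)) / n
                     for preds in prediction_vectors)
    return total / trials
```

A single constant classifier yields a penalty near zero, while a class containing both constant classifiers can always match the sign of the average noise and so incurs a strictly positive penalty.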
An analysis of misclassification rates for decision trees
- Computer Science
- 2007
This dissertation focuses on the minimization of the misclassification rate for decision tree classifiers, and proposes an efficient pruning algorithm that has a clear theoretical interpretation, is easily implemented, and does not require a validation set.
Error-Based Pruning of Decision Trees Grown on Very Large Data Sets Can Work!
- Computer Science, ICTAI
- 2002
It is shown that, in general, an appropriate setting of the certainty factor for error-based pruning will cause decision tree size to plateau when accuracy is not increasing with more training data.
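The certainty factor in C4.5-style error-based pruning controls how pessimistically a leaf's error is judged: the observed error rate is replaced by the upper limit of a binomial confidence interval, and lowering the certainty factor makes the estimate more pessimistic and the pruning more aggressive. The sketch below uses the common Wilson (normal-approximation) upper bound, which differs slightly from C4.5's exact computation:

```python
import math
from statistics import NormalDist

# Pessimistic error estimate for a leaf with `errors` misclassifications
# out of `n` cases: the Wilson upper confidence bound on the true error
# rate at confidence level 1 - cf. C4.5's default certainty factor is
# cf = 0.25; this normal approximation is a stand-in for its exact code.

def pessimistic_error(errors, n, cf=0.25):
    z = NormalDist().inv_cdf(1 - cf)      # one-sided z-score
    f = errors / n                        # observed error rate
    numerator = (f + z * z / (2 * n)
                 + z * math.sqrt(f * (1 - f) / n + z * z / (4 * n * n)))
    return numerator / (1 + z * z / n)
```

Even a leaf with zero observed errors receives a positive pessimistic error, so small subtrees that merely memorize a few cases can still be pruned away.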
Experiments with an innovative tree pruning algorithm
- Computer Science, Artificial Intelligence and Applications
- 2007
This paper presents experimental comparisons of the 2-norm pruning algorithm with two classical pruning algorithms, the Minimal Cost-Complexity algorithm (used in CART) and the Error-Based Pruning algorithm (used in C4.5), and confirms that the 2-norm pruning algorithm is superior in both accuracy and speed.
Contribution to Decision Tree Induction with Python: A Review
- Computer Science, Data Mining - Methods, Applications and Systems
- 2021
This review presents the essential steps for understanding the fundamental concepts and mathematics behind decision trees, from training and building through the split criteria and pruning algorithms that have been proposed to control complexity and optimize decision tree performance.
A new minimum description length based pruning technique for rule induction algorithms
- Computer Science
- 2008
A new pruning technique built on the sound foundation of the minimum description length principle is presented, which is designed to improve the performance of the RULe Extraction System family of inductive learning algorithms, but can be used for pruning rule sets created by other learning algorithms.
References
A Fast, Bottom-Up Decision Tree Pruning Algorithm with Near-Optimal Generalization
- Computer Science, ICML
- 1998
In this work, we present a new bottom-up algorithm for decision tree pruning that is very efficient (requiring only a single pass through the given tree), and prove a strong performance guarantee for…
A Comparative Analysis of Methods for Pruning Decision Trees
- Computer Science, IEEE Trans. Pattern Anal. Mach. Intell.
- 1997
A comparative study of six well-known pruning methods, aimed at understanding their theoretical foundations, their computational complexity, and the strengths and weaknesses of their formulations, together with an objective evaluation of each method's tendency to overprune or underprune.
Predicting Nearly As Well As the Best Pruning of a Decision Tree
- Computer Science, COLT '95
- 1995
This paper presents a new method of making predictions on test data, and proves that the algorithm's performance will not be “much worse” than the predictions made by the best reasonably small pruning of the given decision tree, and is guaranteed to be competitive with any pruning algorithm.
The Effects of Training Set Size on Decision Tree Complexity
- Computer Science, ICML
- 1997
This paper presents experiments with 19 datasets and 5 decision tree pruning algorithms that show that increasing training set size often results in a linear increase in tree size, even when that…
Decision Tree Pruning as a Search in the State Space
- Computer Science, ECML
- 1993
The introduction of the state space shows that very simple search strategies are used by the postpruning methods considered, and some empirical results allow theoretical observations on strengths and weaknesses of pruning methods to be better understood.
Toward a Theoretical Understanding of Why and When Decision Tree Pruning Algorithms Fail
- Computer Science, AAAI/IAAI
- 1999
This work constructs a statistical model of reduced error pruning that is shown to control tree growth far better than the original algorithm, and makes predictions about how to lessen the effects of these pruning failures.
An Efficient Extension to Mixture Techniques for Prediction and Decision Trees
- Computer Science, COLT '97
- 1997
An efficient method for maintaining mixtures of prunings of a prediction or decision tree that extends the previous methods for "node-based" prunings to the larger class of edge-based prunings, and it is proved that the algorithm correctly maintains the mixture weights for edge-based prunings under any bounded loss function.
Pruning Decision Trees and Lists
- Computer Science
- 2000
This thesis presents pruning algorithms for decision trees and lists that are based on significance tests, explains why pruning is often necessary to obtain small and accurate models, and shows that the performance of standard pruning algorithms can be improved by taking the statistical significance of observations into account.
Overpruning Large Decision Trees
- Computer Science, IJCAI
- 1991
This paper presents empirical evidence for five hypotheses about learning from large noisy domains: that trees built from very large training sets are larger and more accurate than trees built from…