Predicting Nearly as Well as the Best Pruning of a Decision Tree

  • D. Helmbold, R. Schapire
  • Published in COLT 1995
  • Computer Science
  • Many algorithms for inferring a decision tree from data involve a two-phase process: First, a very large decision tree is grown which typically ends up “over-fitting” the data. To reduce over-fitting, in the second phase, the tree is pruned using one of a number of available methods. The final tree is then output and used for classification on test data. In this paper, we suggest an alternative approach to the pruning phase. Using a given unpruned decision tree, we present a new method of…
    101 Citations
    A Fast, Bottom-Up Decision Tree Pruning Algorithm with Near-Optimal Generalization
    • 105
    An Analysis of Reduced Error Pruning
    • 104
    On-Line Algorithm to Predict Nearly as Well as the Best Pruning of a Decision Tree
    On growing better decision trees from data
    • 142
    An Efficient Extension to Mixture Techniques for Prediction and Decision Trees
    • 20
    An Efficient Extension to Mixture Techniques for Prediction and Decision Trees
    • 2
    Learning Small Trees and Graphs that Generalize
    • 19


    Multiple decision trees
    • 171
    A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting
    • 6,396
    The Weighted Majority Algorithm
    • 1,337
    How to use expert advice
    • 611
    Learning classification trees
    • 419
    C4.5: Programs for Machine Learning
    • 20,975
    Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm
    • N. Littlestone
    • Mathematics
    • 28th Annual Symposium on Foundations of Computer Science (sfcs 1987)
    • 1987
    • 1,618
    The context-tree weighting method: basic properties
    • 677
    Optimal sequential probability assignment for individual sequences
    • 96