An Exact Probability Metric for Decision Tree Splitting and Stopping

J. Kent Martin
Machine Learning
ID3's information gain heuristic is well known to be biased towards multi-valued attributes. This bias is only partially compensated for by C4.5's gain ratio. Several alternatives have been proposed and are examined here (distance, orthogonality, a Beta function, and two chi-squared tests). All of these metrics are biased towards splits with smaller branches, where low-entropy splits are likely to occur by chance. Both classical and Bayesian statistics lead to the multiple hypergeometric…
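The bias described above can be illustrated with a small sketch. The code below is not taken from the paper: the function names and the toy tables are hypothetical, and the probability function uses the standard textbook formula for the multiple hypergeometric probability of a contingency table with fixed row and column totals, which is assumed here to be the distribution the abstract refers to. Each table is a list of branches, and each branch is a list of class counts.

```python
from math import exp, lgamma, log2


def entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)


def information_gain(table):
    """ID3-style gain: parent class entropy minus weighted branch entropy."""
    class_totals = [sum(col) for col in zip(*table)]
    n = sum(class_totals)
    remainder = sum(sum(row) / n * entropy(row) for row in table if sum(row) > 0)
    return entropy(class_totals) - remainder


def gain_ratio(table):
    """C4.5-style gain ratio: gain divided by the entropy of the branch sizes."""
    split_info = entropy([sum(row) for row in table])
    return information_gain(table) / split_info if split_info > 0 else 0.0


def table_probability(table):
    """Probability of this exact table given its row and column totals
    (standard multiple hypergeometric formula; an assumption, see lead-in)."""
    def log_fact(k):
        return lgamma(k + 1)

    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    log_p = (
        sum(log_fact(r) for r in rows)
        + sum(log_fact(c) for c in cols)
        - log_fact(sum(rows))
        - sum(log_fact(x) for row in table for x in row)
    )
    return exp(log_p)


# Hypothetical data: 8 examples, 2 classes, 4 examples per class.
binary_split = [[4, 0], [0, 4]]          # two pure branches
eight_way_split = [[1, 0]] * 4 + [[0, 1]] * 4  # one example per branch
```

On this toy data, both splits receive the maximum information gain of 1 bit, even though an attribute with one value per example is pure by construction. Gain ratio penalizes the eight-way split (1/3 against 1 for the binary split) but still gives it a positive score. The `table_probability` function gives the probability of observing a given table under fixed margins, the kind of exact quantity the abstract contrasts with entropy-based heuristics.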




