Data Set Used
The technology for building knowledge-based systems by inductive inference from examples has been demonstrated successfully in several practical applications. This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and it describes one such system, ID3, in detail. Results from recent studies show ways in… (More)
This paper describes FORE, a system that learns Horn clauses from data expressed as relations. FOIL is based on ideas that have proved effective in attribute-value learning systems, but extends them to a first-order formalism. This new system has been applied successfully to several tasks taken from the machine learning literature.
A reported weakness of C4.5 in domains with continuous attributes is addressed by modifying the formation and evaluation of tests on continuous attributes. An MDL-inspired penalty is applied to such tests, eliminating some of them from consideration and altering the relative desirability of all tests. Empirical trials show that the modiications lead to… (More)
This paper concerns learning tasks that require the prediction of a continuous value rather than a discrete class. A general method is presented that allows predictions to use both instance-based and model-based learning. Results with three approaches to constructing models and with eight datasets demonstrate improvements due to the composite method.
Breiman's bagging and Freund and Schapire's boosting are recent methods for improving the predictive power of classiier learning systems. Both form a set of classiiers that are combined by voting, bagging by generating replicated boot-strap samples of the data, and boosting by adjusting the weights of training instances. This paper reports results of… (More)
FOIL is a learning system that constructs Horn clause programs from examples. This paper summarises the development of FOIL from 1989 up to early 1993 and evaluates its eeectiveness on a non-trivial sequence of learning tasks taken from a Prolog programming text. Although many of these tasks are handled reasonably well, the experiment highlights some… (More)
This paper presents the top 10 data mining algorithms identified by the IEEE These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide a description of the algorithm, discuss the impact of the algorithm, and review current and further research on the algorithm. These 10… (More)
We explore the use of Rissanen's minimum description length principle for the construction of decision trees. Empirical results comparing this approach to other methods are given. 0 1989 Academic Press, Inc.