Type Extension Trees for feature construction and learning in relational domains
We introduce relational information gain, a refinement scoring function that measures the informativeness of newly introduced variables. The gain can be interpreted as a conditional entropy in a well-defined sense and can be approximated efficiently. Combined with simple greedy general-to-specific search algorithms such as FOIL, it yields an efficient algorithm that is competitive in both predictive accuracy and compactness of the learned theory. Combined with the decision tree learner TILDE, it offers an alternative to lookahead that achieves similar performance while significantly reducing the number of evaluated literals.
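
As a point of reference for the conditional-entropy reading mentioned in the abstract, the following minimal Python sketch computes the classical propositional information gain, H(Y) - H(Y | X), over a flat list of examples. It is not the relational information gain introduced here, whose definition is not given in this abstract; the function names `entropy`, `conditional_entropy`, and `information_gain` are illustrative assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def conditional_entropy(labels, values):
    """H(labels | values): label entropy averaged over groups sharing a value."""
    groups = {}
    for y, v in zip(labels, values):
        groups.setdefault(v, []).append(y)
    total = len(labels)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def information_gain(labels, values):
    """Classical (propositional) gain: reduction in label entropy given values."""
    return entropy(labels) - conditional_entropy(labels, values)


# Toy data (assumed for illustration): one attribute separates the classes
# perfectly, the other carries no information about them.
labels = [1, 1, 0, 0]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]

gain_informative = information_gain(labels, informative)
gain_uninformative = information_gain(labels, uninformative)
```

In this toy setting the perfectly separating attribute recovers the full label entropy, while the uninformative one yields no gain. A relational variant would have to score the bindings of newly introduced variables rather than the values of a fixed attribute.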