Detecting Feature Interactions from Accuracies of Random Feature Subsets

Abstract

Interaction among features notoriously causes difficulty for machine learning algorithms because the relevance of one feature for predicting the target class can depend on the values of other features. In this paper, we introduce a new method for detecting feature interactions by evaluating the accuracies of a learning algorithm on random subsets of features. We give an operational definition for feature interactions based on when a set of features allows a learning algorithm to achieve higher than expected accuracy, assuming independence. Then we show how to adjust the sampling of random subsets in a way that is fair and balanced, given a limited amount of time. Finally, we show how decision trees built from sets of interacting features can be converted into DNF expressions to form constructed features. We demonstrate the effectiveness of the method empirically by showing that it can improve the accuracy of the C4.5 decision-tree algorithm on several benchmark databases.
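The core idea of scoring random feature subsets by accuracy can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy XOR data, the lookup-table learner standing in for C4.5, the helper names, and in particular the `interaction_gain` score, which simply compares a subset's accuracy with its best single feature rather than using the paper's expected-accuracy-under-independence definition.

```python
import random
from collections import Counter, defaultdict


def make_xor_data(n, seed=0):
    """Toy data (hypothetical): 4 binary features; label is f0 XOR f1, f2 and f3 are noise."""
    rng = random.Random(seed)
    rows = [tuple(rng.randint(0, 1) for _ in range(4)) for _ in range(n)]
    labels = [r[0] ^ r[1] for r in rows]
    return rows, labels


def subset_accuracy(rows, labels, subset):
    """Held-out accuracy of a majority-vote lookup table that sees only `subset`."""
    half = len(rows) // 2
    votes = defaultdict(Counter)
    # Train on the first half: count labels per combination of subset values.
    for row, y in zip(rows[:half], labels[:half]):
        votes[tuple(row[i] for i in subset)][y] += 1
    default = Counter(labels[:half]).most_common(1)[0][0]
    # Test on the second half, falling back to the majority class for unseen keys.
    correct = 0
    for row, y in zip(rows[half:], labels[half:]):
        key = tuple(row[i] for i in subset)
        pred = votes[key].most_common(1)[0][0] if key in votes else default
        correct += pred == y
    return correct / (len(rows) - half)


def interaction_gain(rows, labels, subset):
    """Stand-in score (an assumption): subset accuracy minus best single-feature accuracy."""
    best_single = max(subset_accuracy(rows, labels, (i,)) for i in subset)
    return subset_accuracy(rows, labels, subset) - best_single


def sample_subsets(n_features, size, n_samples, seed=0):
    """Draw random feature subsets of a fixed size, returned as sorted tuples."""
    rng = random.Random(seed)
    return {tuple(sorted(rng.sample(range(n_features), size))) for _ in range(n_samples)}


if __name__ == "__main__":
    rows, labels = make_xor_data(1000)
    for subset in sorted(sample_subsets(4, 2, 20)):
        print(subset, round(interaction_gain(rows, labels, subset), 3))
```

On this toy data, a subset containing both XOR inputs should score far above its individual features, while subsets made of noise features should show a gain near zero. The uniform sampling used here does not attempt the fair and balanced sampling adjustment that the abstract mentions.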

Cite this paper

@inproceedings{Ioerger1999DetectingFI, title={Detecting Feature Interactions from Accuracies of Random Feature Subsets}, author={Thomas R. Ioerger}, booktitle={AAAI/IAAI}, year={1999} }