Corpus ID: 234342525

A Rigorous Information-Theoretic Definition of Redundancy and Relevancy in Feature Selection Based on (Partial) Information Decomposition

@article{Wollstadt2021ARI,
  title={A Rigorous Information-Theoretic Definition of Redundancy and Relevancy in Feature Selection Based on (Partial) Information Decomposition},
  author={Patricia Wollstadt and Sebastian Schmitt and Michael Wibral},
  journal={ArXiv},
  year={2021},
  volume={abs/2105.04187}
}
Selecting a minimal feature set that is maximally informative about a target variable is a central task in machine learning and statistics. Information theory provides a powerful framework for formulating feature selection algorithms—yet, a rigorous, information-theoretic definition of feature relevancy, which accounts for feature interactions such as redundant and synergistic contributions, is still missing. We argue that this lack is inherent to classical information theory which does not…
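For orientation, partial information decomposition (PID) in the two-source case splits the joint mutual information about a target T into four nonnegative atoms. The following is a standard sketch of the Williams–Beer decomposition; the notation is conventional and not quoted from the paper:

I(T; S_1, S_2) = I_{rdn}(T; S_1, S_2) + I_{unq}(T; S_1) + I_{unq}(T; S_2) + I_{syn}(T; S_1, S_2)

Redundancy captures what both sources share about T, the unique terms what only one source provides, and synergy what is available only from both sources jointly.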
Citations

Estimating the Unique Information of Continuous Variables
This work presents a method for estimating the unique information in continuous distributions, for the case of two sources and one target, by combining copula decompositions and techniques developed to optimize variational autoencoders.
Estimating the Unique Information of Continuous Variables in Recurrent Networks
This work presents a method for estimating the unique information in continuous distributions, for the case of one versus two variables, and solves the associated optimization problem over the space of distributions with fixed bivariate marginals by combining copula decompositions and techniques developed to optimize variational autoencoders.

References

Showing 1-10 of 84 references
A New Perspective for Information Theoretic Feature Selection
This paper shows how to naturally derive a space of possible ranking criteria, demonstrates that several recent contributions in the feature selection literature are points within this continuous space, and notes that many points in the space have never been explored.
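In the usual presentation of this framework, the space of criteria is spanned by two coefficients weighting a redundancy term and a conditional-redundancy term; the following is a sketch in standard notation, not quoted from the paper:

J(X_k) = I(X_k; Y) - \beta \sum_{X_j \in S} I(X_k; X_j) + \gamma \sum_{X_j \in S} I(X_k; X_j \mid Y)

Setting \beta = \gamma = 0 recovers plain mutual-information ranking, while particular choices of (\beta, \gamma) recover criteria such as MIFS, mRMR, and JMI.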
Conditional Infomax Learning: An Integrated Framework for Feature Extraction and Fusion
This work introduces an information-theoretic framework for feature learning in classification, built around a novel concept called class-relevant redundancy, and formulates a new algorithm, Conditional Informative Feature Extraction, which maximizes the joint class-relevant information by explicitly reducing the class-relevant redundancies among features.
Nearest neighbor estimate of conditional mutual information in feature selection
A nearest-neighbor estimate of conditional mutual information (CMI), appropriate for high-dimensional variables, is proposed, and an iterative scheme for sequential feature selection with a termination criterion, called CMINN, is built on it; the scheme is equivalent to MI-based filter selection in the presence of solely single-feature effects and is more appropriate for combined feature effects.
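For reference, the conditional mutual information targeted by such estimators has the standard (discrete) form

I(X; Y \mid Z) = \sum_{x, y, z} p(x, y, z) \log \frac{p(x, y \mid z)}{p(x \mid z)\, p(y \mid z)}

where, roughly speaking, nearest-neighbor approaches replace the plug-in probabilities with local density estimates built from neighbor distances.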
Nonnegative Decomposition of Multivariate Information
This work reconsiders from first principles the general structure of the information that a set of sources provides about a given variable and proposes a definition of partial information atoms that exhaustively decompose the Shannon information in a multivariate system in terms of the redundancy between synergies of subsets of the sources.
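The atoms of such a decomposition are tied to classical mutual information terms by consistency equations; in the two-source case (a standard identity, using the notation of the sketch above):

I(T; S_1) = I_{rdn}(T; S_1, S_2) + I_{unq}(T; S_1)
I(T; S_2) = I_{rdn}(T; S_1, S_2) + I_{unq}(T; S_2)

Together with the four-atom decomposition of the joint mutual information, fixing any one atom (e.g., redundancy) determines the remaining three.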
A Simple Filter Benchmark for Feature Selection
A new correlation-based filter approach for simple, fast, and effective feature selection (FS) is proposed. The association strength between each feature and the response variable (relevance) and…
Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy
This work derives an equivalent form, the minimal-redundancy-maximal-relevance (mRMR) criterion, for first-order incremental feature selection, and presents a two-stage feature selection algorithm that combines mRMR with more sophisticated feature selectors (e.g., wrappers).
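In its first-order incremental form, the mRMR criterion selects at each step the candidate feature that maximizes relevance to the class c minus average redundancy with the already-selected set S (a standard statement of the criterion):

x^* = \arg\max_{x_j \notin S} \left[ I(x_j; c) - \frac{1}{|S|} \sum_{x_i \in S} I(x_j; x_i) \right]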
A Bivariate Measure of Redundant Information
A new formalism for redundant information is introduced, and it is proved to satisfy all the necessary properties outlined in earlier work, as well as an additional criterion proposed as necessary to capture redundancy.
Searching for interacting features in subset selection
This paper takes up the challenge of designing a special data structure for feature quality evaluation and of employing an information-theoretic feature ranking mechanism to efficiently handle feature interaction in subset selection.
Using mutual information for selecting features in supervised neural net learning
  • R. Battiti
  • IEEE Trans. Neural Networks
  • 1994
This paper investigates the application of the mutual information criterion to evaluate a set of candidate features and to select an informative subset to be used as input data for a neural network.
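A greedy mutual-information filter in this MIFS style is simple enough to sketch. The following is a minimal illustration under assumptions not in the entry (integer-coded discrete features, a plug-in histogram MI estimator, and hypothetical names mi and mifs); it is not Battiti's implementation.

import numpy as np

def mi(x, y):
    """Plug-in mutual information (nats) between two discrete 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=(len(np.unique(x)), len(np.unique(y))))
    pxy = joint / joint.sum()            # joint distribution
    px = pxy.sum(axis=1, keepdims=True)  # marginal of x
    py = pxy.sum(axis=0, keepdims=True)  # marginal of y
    nz = pxy > 0                         # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def mifs(X, y, k, beta=0.5):
    """Greedily pick k column indices of X, scoring relevance to y minus
    beta times total redundancy with the already-selected features."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k and remaining:
        scores = {j: mi(X[:, j], y)
                     - beta * sum(mi(X[:, j], X[:, i]) for i in selected)
                  for j in remaining}
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

For example, mifs(X, y, k=5) returns the indices of five greedily chosen features; beta trades off relevance against redundancy, as in Battiti's criterion.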
Conditional Mutual Information-Based Feature Selection Analyzing for Synergy and Redundancy
An automated greedy feature selection algorithm called conditional mutual information-based feature selection (CMIFS) is proposed; it takes account of both redundancy and synergy interactions among features, identifies discriminative features, and can achieve higher classification accuracy.
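The synergy/redundancy bookkeeping behind such CMI-based criteria is often summarized by the interaction information; under one common sign convention (a standard identity, not notation from this entry):

I(X; Y; C) = I(X; Y) - I(X; Y \mid C)

A positive value indicates that two features X and Y carry redundant information about the class C; a negative value indicates synergy, i.e., conditioning on the class increases their dependence.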