Corpus ID: 6418797

Bayes Optimal Multilabel Classification via Probabilistic Classifier Chains

@inproceedings{Dembczynski2010BayesOM,
  title={Bayes Optimal Multilabel Classification via Probabilistic Classifier Chains},
  author={K. Dembczynski and Weiwei Cheng and E. H{\"u}llermeier},
  booktitle={ICML},
  year={2010}
}
In the realm of multilabel classification (MLC), it has become an opinio communis that optimal predictive performance can only be achieved by learners that explicitly take label dependence into account. The goal of this paper is to elaborate on this postulate in a critical way. To this end, we formalize and analyze MLC within a probabilistic setting. Thus, it becomes possible to look at the problem from the point of view of risk minimization and Bayes optimal prediction. Moreover, inspired by…
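As background for the risk-minimization view above: under subset 0/1 loss, the Bayes-optimal prediction is the mode of the joint label distribution, which a probabilistic classifier chain (PCC) factorizes into per-label conditionals and can find exactly by enumerating all label vectors. A minimal sketch; the toy conditional model is a hypothetical stand-in for trained per-label classifiers, not from the paper:

```python
from itertools import product

def pcc_joint(cond_prob, y, x):
    """Joint probability P(y | x) as the chain product
    P(y1 | x) * P(y2 | x, y1) * ... * P(ym | x, y1..y_{m-1})."""
    p = 1.0
    for i in range(len(y)):
        p_i = cond_prob(x, y[:i], i)        # P(y_i = 1 | x, y_<i)
        p *= p_i if y[i] == 1 else 1.0 - p_i
    return p

def pcc_exhaustive_mode(cond_prob, x, m):
    """Bayes-optimal prediction for subset 0/1 loss: the mode of the
    joint distribution, found by enumerating all 2^m label vectors."""
    return max(product((0, 1), repeat=m),
               key=lambda y: pcc_joint(cond_prob, y, x))

# Hypothetical toy conditional model: label 0 depends only on x,
# and each later label is likely 1 iff the previous label was 1.
def toy_cond_prob(x, prefix, i):
    if i == 0:
        return 0.9 if x > 0 else 0.1
    return 0.8 if prefix[-1] == 1 else 0.2

print(pcc_exhaustive_mode(toy_cond_prob, x=1.0, m=3))  # -> (1, 1, 1)
```

The enumeration is exponential in the number of labels m, which is exactly why the citing works below study beam search and sampling as tractable alternatives.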
Citations

Learning and Inference in Probabilistic Classifier Chains with Beam Search
It is shown how to use beam search to perform tractable test-time inference, and how to integrate beam search with training to determine a suitable tag ordering, dramatically extending the practical viability of probabilistic classifier chains.
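The beam-search inference described here can be sketched in a few lines: keep only the top-k most probable label prefixes at each position in the chain instead of enumerating all 2^m label vectors. The conditional model passed in is a hypothetical stand-in for the trained per-label classifiers:

```python
def pcc_beam_search(cond_prob, x, m, beam_width=3):
    """Approximate the mode of the PCC joint by keeping only the
    beam_width most probable label prefixes at each chain position."""
    beam = [((), 1.0)]                      # (prefix, joint probability)
    for i in range(m):
        candidates = []
        for prefix, p in beam:
            p1 = cond_prob(x, prefix, i)    # P(y_i = 1 | x, y_<i)
            candidates.append((prefix + (1,), p * p1))
            candidates.append((prefix + (0,), p * (1.0 - p1)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:beam_width]
    return beam[0][0]

# Hypothetical toy model: label 0 depends on x, later labels on y_{i-1}.
cond = lambda x, prefix, i: (0.9 if (i == 0 and x > 0)
                             or (i > 0 and prefix[-1] == 1) else 0.2)
print(pcc_beam_search(cond, x=1.0, m=4, beam_width=2))  # -> (1, 1, 1, 1)
```

With beam_width = 1 this reduces to greedy chaining; with beam_width = 2^m it recovers exhaustive inference.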
Consistent Multilabel Classification
This work shows that for multilabel metrics constructed as instance-, micro-, and macro-averages, the population-optimal classifier can be decomposed into binary classifiers based on the marginal instance-conditional distribution of each label, with a weak association between labels via the threshold.
Hierarchical Multilabel Classification with Minimum Bayes Risk
  • Wei Bi, J. Kwok
  • Mathematics, Computer Science
  • 2012 IEEE 12th International Conference on Data Mining
  • 2012
This paper uses Bayesian decision theory to develop a Bayes-optimal classifier that outperforms existing HMC methods and can be efficiently solved using a greedy algorithm on both tree- and DAG-structured label hierarchies.
Reliable Multi-label Classification: Prediction with Partial Abstention
This paper proposes a formalization of MLC with abstention in terms of a generalized loss minimization problem and presents first results, both theoretical and experimental, for the case of the Hamming loss, rank loss, and F-measure.
Efficient Monte Carlo optimization for multi-label classifier chains
This paper presents a novel double-Monte Carlo scheme (M2CC), both for finding a good chain sequence and for performing efficient inference, which remains tractable for high-dimensional data sets and obtains the best overall accuracy.
Beam search algorithms for multilabel learning
This paper shows how to apply beam search to make inference tractable, and how to integrate beam search with training to determine a suitable tag ordering, and shows that the proposed improvements yield a state-of-the-art method for multilabel learning.
Multilabel classifiers with a probabilistic thresholding strategy
This paper introduces a family of thresholding strategies that take into account the posterior probability of all possible labels to determine a different threshold for each instance, and finds experimentally that these strategies outperform other thresholding options for multilabel classification.
Bayes-Optimal Hierarchical Multilabel Classification
  • Wei Bi, J. Kwok
  • Computer Science
  • IEEE Transactions on Knowledge and Data Engineering
  • 2015
This work proposes hierarchical extensions of the Hamming loss and ranking loss that take the mistake at every node of the label hierarchy into consideration, and develops Bayes-optimal predictions that minimize the corresponding risks with the trained model.
Probabilistic Classifier Chain Inference via Gibbs Sampling
This work proposes a novel inference method based on Gibbs sampling, motivated by the observation that the probabilistic classifier chain is a special case of a Bayesian network, and may inspire more inference algorithms for PCC.
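The Gibbs-sampling idea can be sketched as follows: resample each label in turn from its full conditional, which only requires renormalizing the chain joint over the two values of that label. The conditional model is again a hypothetical stand-in for trained classifiers:

```python
import random

def chain_joint(cond_prob, y, x):
    """P(y | x) as a chain product of per-label conditionals."""
    p = 1.0
    for i in range(len(y)):
        p_i = cond_prob(x, y[:i], i)        # P(y_i = 1 | x, y_<i)
        p *= p_i if y[i] == 1 else 1.0 - p_i
    return p

def gibbs_pcc(cond_prob, x, m, n_sweeps=500, seed=0):
    """Draw samples from the PCC joint by resampling each label from
    its full conditional, renormalized over y_i in {0, 1}."""
    rng = random.Random(seed)
    y = [0] * m
    samples = []
    for _ in range(n_sweeps):
        for i in range(m):
            probs = []
            for v in (0, 1):
                y[i] = v
                probs.append(chain_joint(cond_prob, tuple(y), x))
            # Sample y_i = 1 with probability probs[1] / (probs[0] + probs[1]).
            y[i] = 1 if rng.random() * (probs[0] + probs[1]) < probs[1] else 0
        samples.append(tuple(y))
    return samples
```

Label marginals or the empirical mode can then be estimated from the returned samples, trading exactness for tractability relative to full enumeration.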
Binary relevance efficacy for multilabel classification
Some interesting properties of BR (binary relevance) are discussed, mainly that it produces optimal models for several multilabel loss functions, and the use of synthetic datasets to better analyze the behavior of multilabel methods in domains with different characteristics is proposed.
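Binary relevance itself is simple to state: predict each label independently by thresholding its marginal probability, which minimizes expected Hamming loss when the marginals are correct. A minimal sketch:

```python
def binary_relevance_predict(marginals, threshold=0.5):
    """Binary relevance prediction: threshold each label's marginal
    probability P(y_i = 1 | x) independently, ignoring label
    dependence. Bayes-optimal for Hamming loss given correct marginals."""
    return tuple(int(p >= threshold) for p in marginals)

print(binary_relevance_predict([0.9, 0.3, 0.6]))  # -> (1, 0, 1)
```

This is the baseline that the label-dependence methods above are measured against: for losses like subset 0/1 it can be suboptimal, since the joint mode need not equal the vector of marginal modes.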

References

Combining instance-based learning and logistic regression for multilabel classification
This paper proposes a new approach to multilabel classification, based on a framework that unifies instance-based learning and logistic regression, comprising both methods as special cases; it allows one to capture interdependencies between labels and to combine model-based and similarity-based inference for multilabel classification.
Classifier chains for multi-label classification
This paper presents a novel classifier chains method that can model label correlations while maintaining acceptable computational complexity, and illustrates the competitiveness of the chaining method against related and state-of-the-art methods, both in terms of predictive performance and time complexity.
Collective multi-label classification
Experiments show that the models outperform their single-label counterparts on standard text corpora and improve subset classification error by as much as 40% when multi-labels are sparse.
Discriminative Methods for Multi-labeled Classification
A new technique is presented for combining text features and features indicating relationships between classes, which can be used with any discriminative algorithm; it beats the accuracy of existing methods with statistically significant improvements.
Statistical Comparisons of Classifiers over Multiple Data Sets
  • J. Demsar
  • Computer Science
  • J. Mach. Learn. Res.
  • 2006
A set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers is recommended: the Wilcoxon signed-ranks test for comparison of two classifiers, and the Friedman test with the corresponding post-hoc tests for comparisons of more classifiers over multiple data sets.
Multi-Label Classification: An Overview
The task of multi-label classification is introduced, the sparse related literature is organized into a structured presentation, and comparative experimental results of certain multilabel classification methods are presented.
Predicting Multivariate Responses in Multiple Linear Regression
We look at the problem of predicting several response variables from the same set of explanatory variables. The question is how to take advantage of correlations between the response variables to…
Collective multilabel classification
  • CIKM '05
  • 2005