Willem Waegeman

Learn More
Most of the multi-label classification (MLC) methods proposed in recent years intended to exploit, in one way or the other, dependencies between the class labels. Comparing to simple binary relevance learning as a baseline, any gain in performance is normally explained by the fact that this method is ignoring such dependencies. Without questioning the(More)
The F-measure, originally introduced in information retrieval, is nowadays routinely used as a performance metric for problems such as binary classification, multi-label classification, and structured output prediction. Optimizing this measure remains a statistically and computationally challenging problem, since no closed-form maximizer exists. Current(More)
The aim of this paper is to elaborate on the important issue of label dependence in multi-label classification (MLC). Looking at the problem from a statistical perspective, we claim that two different types of label dependence should be distinguished, namely conditional and unconditional. We formally explain the differences and connections between both(More)
The idea of classifier chains has recently been introduced as a promising technique for multi-label classification. However, despite being intuitively appealing and showing strong performance in empirical studies, still very little is known about the main principles underlying this type of method. In this paper, we provide a detailed probabilistic analysis(More)
The F-measure, which has originally been introduced in information retrieval, is nowadays routinely used as a performance metric for problems such as binary classification, multi-label classification, and structured output prediction. Optimizing this measure is a statistically and computationally challenging problem, since no closed-form solution exists.(More)
We compare the plug-in rule approach for optimizing the Fβ-measure in multi-label classification with an approach based on structured loss minimization, such as the structured support vector machine (SSVM). Whereas the former derives an optimal prediction from a probabilistic model in a separate inference step, the latter seeks to optimize the Fβ-measure(More)
At present, there is much variability between MALDI-TOF MS methodology for the characterization of bacteria through differences in e.g., sample preparation methods, matrix solutions, organic solvents, acquisition methods and data analysis methods. After evaluation of the existing methods, a standard protocol was developed to generate MALDI-TOF mass spectra(More)
In multi-label classification (MLC), each instance is associated with a subset of labels instead of a single class, as in conventional classification, and this generalization enables the definition of a multitude of loss functions. Indeed, a large number of losses has already been proposed and is commonly applied as performance metrics in experimental(More)