Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers

@article{allwein2000reducing,
  title={Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers},
  author={Erin Allwein and Robert E. Schapire and Yoram Singer},
  journal={J. Mach. Learn. Res.},
  year={2000}
}
We present a unifying framework for studying the solution of multiclass categorization problems by reducing them to multiple binary problems that are then solved using a margin-based binary learning algorithm. The proposed framework unifies some of the most popular approaches in which each class is compared against all others, or in which all pairs of classes are compared to each other, or in which output codes with error-correcting properties are used. We propose a general method for combining… 
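As a concrete illustration of the coding-matrix view described in the abstract, the sketch below (pure Python, illustrative function names not taken from the paper) shows how the one-vs-all and all-pairs reductions arise as special choices of a coding matrix with entries in {−1, 0, +1}, one row per class and one column per binary problem:

```python
from itertools import combinations

def one_vs_all_matrix(k):
    # Row r has +1 in column r and -1 elsewhere: each binary
    # problem separates one class from all the others.
    return [[1 if c == r else -1 for c in range(k)] for r in range(k)]

def all_pairs_matrix(k):
    # One column per unordered pair (i, j): class i is labeled +1,
    # class j is labeled -1, and every other class is 0 (ignored
    # when training that binary problem).
    pairs = list(combinations(range(k), 2))
    return [[1 if r == i else (-1 if r == j else 0) for (i, j) in pairs]
            for r in range(k)]
```

With k = 3 the one-vs-all matrix has 3 columns and the all-pairs matrix has 3 columns (pairs (0,1), (0,2), (1,2)); for general k the counts are k and k(k−1)/2.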


An Empirical Comparison of Hierarchical vs. Two-Level Approaches to Multiclass Problems

The results show that while there is no clear advantage to either technique in terms of classification accuracy, BHCs typically achieve this performance using fewer classifiers, and have the added advantage of automatically generating a hierarchy of classes.

Superior Multi-Class Classification Through a Margin-Optimized Single Binary Problem

A new multi-class classification method is developed that reduces the multi-class problem to a single binary classifier (SBC); the results indicate that it can outperform one-vs-all, all-pairs, and the error-correcting output coding scheme.

Multiclass Boosting with Hinge Loss based on Output Coding

A new output-coding-based multiclass boosting algorithm using the multiclass hinge loss, called HingeBoost.OC, is proposed; it shows better performance than the existing multiclass boosting algorithm AdaBoost.

Large Margin Multiclass Learning: Models and Algorithms

This thesis investigates large margin based approaches to the problem of learning over multiple classes and proposes a convergent additive reweighting strategy that is able to improve the margin of the examples of the training set and a framework for general multiclass problems and algorithms.

Enhancing directed binary trees for multi-class classification

A New Multiclass Support Vector Machine: An Approach Using Iterative Majorization and Huber Hinge Errors

A new multiclass Support Vector Machine (SVM) is presented, which can be used to find the optimal decision boundaries in a multiclass classification problem, and it is believed that the proposed method provides a promising new way of looking at multiclass classification problems.

Multiclass boosting with repartitioning

This paper proposes a new multiclass boosting algorithm that modifies the coding matrix according to the learning ability of the base learner, and shows experimentally that this algorithm is very efficient in optimizing the multiclass margin cost, and outperforms existing multiclass algorithms such as AdaBoost.

Finding shareable informative patterns and optimal coding matrix for multiclass boosting

A new multiclass boosting algorithm called AdaBoost.SIP is proposed that considers both two factors simultaneously, the qualities of the base classifiers and the coding matrix, and its effectiveness in multiclass classification is shown.

Convex Optimization for Binary Classifier Aggregation in Multiclass Problems

This paper presents a convex optimization method for an optimal aggregation of binary classifiers to estimate class membership probabilities in multiclass problems and demonstrates that this method outperforms existing aggregation methods as well as direct methods, in terms of the classification accuracy and the quality of class membership probability estimates.

A comparison of methods for multiclass support vector machines

Decomposition implementations for two "all-together" multiclass SVM methods are given, and it is shown that for large problems, methods that consider all data at once generally need fewer support vectors.

Solving Multiclass Learning Problems via Error-Correcting Output Codes

It is demonstrated that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
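Prediction with error-correcting output codes reduces to nearest-code-word decoding: run every binary classifier, then output the class whose row of the coding matrix disagrees with the predictions in the fewest positions. A minimal sketch (illustrative names; this uses simple Hamming decoding over ±1 codes, not the loss-based decoding of the unifying framework above):

```python
def hamming_decode(M, preds):
    # M: coding matrix, one row per class, entries in {-1, +1}.
    # preds: list of {-1, +1} outputs, one per binary classifier.
    # Predict the class whose code word disagrees with preds least.
    def distance(row):
        return sum(1 for m, p in zip(row, preds) if m != p)
    return min(range(len(M)), key=lambda r: distance(M[r]))
```

Because decoding picks the nearest code word, a matrix whose rows are far apart in Hamming distance can correct some binary classifiers' mistakes, which is the error-correcting property the paper exploits.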

Multiclass learning, boosting, and error-correcting codes

The paper presents ECC, which, by using a different weighting of the votes of the weak hypotheses, improves on the performance of ADABOOST.OC and is arguably a more direct reduction of multiclass learning to binary learning than previous multiclass boosting algorithms.

Improved Boosting Algorithms Using Confidence-rated Predictions

We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions.

Special Invited Paper-Additive logistic regression: A statistical view of boosting

This work shows that this seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood, and develops more direct approximations and shows that they exhibit nearly identical results to boosting.

Arcing the edge

A framework for understanding arcing algorithms is defined and a relation is derived between the optimal reduction in the maximum value of the edge and the PAC concept of weak learner.

Boosting the margin: A new explanation for the effectiveness of voting methods

It is shown that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error.

SSVM: A Smooth Support Vector Machine for Classification

Smoothing methods are applied here to generate and solve an unconstrained smooth reformulation of the support vector machine for pattern classification using a completely arbitrary kernel, which converges globally and quadratically.
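The smoothing idea can be illustrated on the plus function (x)₊ = max(x, 0) that appears in the SVM objective; a sketch assuming the log-exponential smooth approximation p(x, α) = x + α⁻¹ log(1 + e^(−αx)), which is differentiable everywhere and approaches (x)₊ as α grows:

```python
import math

def plus(x):
    # The plus function (x)_+ = max(x, 0); non-smooth at x = 0.
    return max(x, 0.0)

def smooth_plus(x, alpha=5.0):
    # Smooth approximation: p(x, alpha) = x + (1/alpha) * log(1 + exp(-alpha*x)).
    # Everywhere differentiable; error vanishes as alpha grows.
    return x + math.log(1.0 + math.exp(-alpha * x)) / alpha
```

Replacing (x)₊ with a smooth surrogate lets unconstrained smooth optimizers (e.g. Newton-type methods) be applied to the reformulated problem, which is the point of the smoothing approach summarized above.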

Training support vector machines: an application to face detection

  • E. Osuna, R. Freund, F. Girosi
  • Computer Science
    Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition
  • 1997
A decomposition algorithm that guarantees global optimality, and can be used to train SVM's over very large data sets is presented, and the feasibility of the approach on a face detection problem that involves a data set of 50,000 data points is demonstrated.

Additive Logistic Regression: A Statistical View of Boosting

This work develops more direct approximations of boosting that exhibit performance comparable to other recently proposed multi-class generalizations of boosting, and suggests a minor modification to boosting that can reduce computation, often by factors of 10 to 50.