Ultraconservative Online Algorithms for Multiclass Problems
In this paper we study a paradigm to generalize online classification algorithms for binary classification problems to multiclass problems. The particular hypotheses we investigate maintain one prototype vector per class. Given an input instance, a multiclass hypothesis computes a similarity-score between each prototype and the input instance and sets the predicted label to be the index of the prototype achieving the highest similarity. To design and analyze the learning algorithms in this paper we introduce the notion of ultraconservativeness. Ultraconservative algorithms are algorithms that update only the prototypes attaining similarity-scores which are higher than the score of the correct label's prototype. We start by describing a family of additive ultraconservative algorithms where each algorithm in the family updates its prototypes by finding a feasible solution for a set of linear constraints that depend on the instantaneous similarity-scores. We then discuss a specific online algorithm that seeks a set of prototypes which have a small norm. The resulting algorithm, which we term MIRA (for Margin Infused Relaxed Algorithm) is ultraconservative as well. We derive mistake bounds for all the algorithms and provide further analysis of MIRA using a generalized notion of the margin for multiclass problems. We discuss the form the algorithms take in the binary case and show that all the algorithms from the first family reduce to the Perceptron algorithm while MIRA provides a new Perceptron-like algorithm with a margin-dependent learning rate. We then return to multiclass problems and describe an analogous multiplicative family of algorithms with corresponding mistake bounds. We end the formal part by deriving and analyzing a multiclass version of Li and Long's ROMMA algorithm. We conclude with a discussion of experimental results that demonstrate the merits of our algorithms.