Learn More
We present a family of online learning, margin based, algorithms for various prediction tasks. In particular we derive and analyze algorithms for binary and multiclass categorization, regression, uniclass prediction and sequence prediction. All of the algorithms we present can utilize kernel functions. The update steps of our different algorithms are all(More)
A common problem of kernel-based online algorithms, such as the kernel-based Perceptron algorithm , is the amount of memory required to store the online hypothesis, which may increase without bound as the algorithm progresses. Furthermore, the computational load of such algorithms grows linearly with the amount of memory used to store the hypothesis. To(More)
We present a discriminative online algorithm with a bounded memory growth, which is based on the kernel-based Perceptron. Generally, the required memory of the kernel-based Perceptron for storing the online hypothesis is not bounded. Previous work has been focused on discarding part of the instances in order to keep the memory bounded. In the proposed(More)
We consider the problem of binary classification where the classifier may abstain instead of classifying each observation. The Bayes decision rule for this setup, known as Chow's rule, is defined by two thresholds on posterior probabilities. From simple desiderata, namely the consistency and the sparsity of the classifier, we derive the double hinge loss(More)
We describe a new approach for phoneme recognition which aims at minimizing the phoneme error rate. Building on structured prediction techniques, we formulate the phoneme recognizer as a linear combination of feature functions. We state a PAC-Bayesian generalization bound, which gives an upper-bound on the expected phoneme error rate in terms of the(More)
We describe a new method for phoneme sequence recognition given a speech utterance, which is not based on the HMM. In contrast to HMM-based approaches, our method uses a discriminative kernel-based training procedure in which the learning process is tailored to the goal of minimizing the Levenshtein distance between the predicted phoneme sequence and the(More)
We present a method for efficiently training binary and multiclass kernelized SVMs on a Graphics Processing Unit (GPU). Our methods apply to a broad range of kernels, including the popular Gaus- sian kernel, on datasets as large as the amount of available memory on the graphics card. Our approach is distinguished from earlier work in that it cleanly and(More)