The weighted majority algorithm
- N. Littlestone, Manfred K. Warmuth
- Computer Science · 30th Annual Symposium on Foundations of Computer…
- 30 October 1989
A simple and effective method, based on weighted voting, is introduced for constructing a compound algorithm in a situation in which a learner faces a sequence of trials, and the goal of the learner is to make few mistakes.
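The weighted-voting idea can be sketched in a few lines. This is an illustrative implementation under assumptions not spelled out in the snippet above: binary predictions, a single demotion factor `beta` (0.5 here is a common textbook choice), and the function name `weighted_majority` is mine.

```python
def weighted_majority(expert_preds, outcomes, beta=0.5):
    """Sketch of weighted-majority voting over a sequence of trials.

    expert_preds: list of trials, each a list of 0/1 predictions (one per expert).
    outcomes:     list of true 0/1 labels, one per trial.
    Returns (mistakes made by the compound algorithm, final expert weights).
    """
    n = len(expert_preds[0])
    weights = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_preds, outcomes):
        # predict by weighted vote over the experts
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_one >= vote_zero else 0
        if guess != y:
            mistakes += 1
        # demote every expert that erred on this trial
        weights = [w * (beta if p != y else 1.0)
                   for w, p in zip(weights, preds)]
    return mistakes, weights
```

With expert 0 always correct and expert 1 often wrong, expert 1's weight shrinks geometrically while expert 0 keeps weight 1, so the vote quickly follows the good expert.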
Learnability and the Vapnik-Chervonenkis dimension
This paper shows that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned.
Exponentiated Gradient Versus Gradient Descent for Linear Predictors
The bounds suggest that the losses of the two algorithms are in general incomparable, but that EG(+/-) has a much smaller loss if only a few components of the input are relevant for the predictions; these bounds are quite tight already on simple artificial data.
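A single exponentiated-gradient step can be sketched as follows. This is a hedged illustration, not the paper's exact algorithm: it assumes squared loss, weights kept on the probability simplex, and a learning rate `eta` chosen for illustration; the function name `eg_update` is mine.

```python
import math

def eg_update(weights, x, y, eta=0.1):
    """One EG step for a linear predictor under squared loss.

    Weights are updated multiplicatively by the exponentiated negative
    gradient, then renormalized to the probability simplex.
    """
    y_hat = sum(w * xi for w, xi in zip(weights, x))
    grad = [2.0 * (y_hat - y) * xi for xi in x]  # d/dw_i of (y_hat - y)^2
    new_w = [w * math.exp(-eta * g) for w, g in zip(weights, grad)]
    z = sum(new_w)
    return [w / z for w in new_w]
```

The multiplicative form is what favors sparse targets: weights of irrelevant components decay quickly, whereas additive gradient descent moves all components at the same absolute scale.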
Tracking the Best Expert
The generalization allows the sequence to be partitioned into segments, and the goal is to bound the algorithm's additional loss over the sum of the losses of the best expert for each segment; this models situations in which the examples change and different experts are best for different segments of the sequence.
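A share-style update in the spirit of this line of work can be sketched as a loss update followed by mixing a small fraction of the weight back in uniformly, so that an expert that was bad in one segment can recover after a switch. The parameter names (`eta`, `alpha`) and the exact mixing form are illustrative assumptions, not the paper's specification.

```python
import math

def share_update(weights, losses, eta=1.0, alpha=0.05):
    """One round of exponential-weights plus a uniform share step.

    weights: current distribution over experts (sums to 1).
    losses:  each expert's loss this round.
    """
    n = len(weights)
    # loss update: exponentially penalize experts by their losses
    w = [wi * math.exp(-eta * li) for wi, li in zip(weights, losses)]
    z = sum(w)
    w = [wi / z for wi in w]
    # share step: redistribute a fraction alpha uniformly so no expert's
    # weight collapses to zero, enabling fast recovery after a segment switch
    return [(1 - alpha) * wi + alpha / n for wi in w]
```

The floor of roughly `alpha / n` on every weight is what bounds the extra loss when the identity of the best expert changes between segments.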
How to use expert advice
- N. Cesa-Bianchi, Y. Freund, D. Helmbold, D. Haussler, R. Schapire, Manfred K. Warmuth
- Computer Science · Symposium on the Theory of Computing
- 1 June 1993
This work analyzes algorithms that predict a binary value by combining the predictions of several prediction strategies, called `experts', and shows how this leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best previously known results in this context.
On‐Line Portfolio Selection Using Multiplicative Updates
- D. Helmbold, R. Schapire, Y. Singer, Manfred K. Warmuth
- Computer Science · International Conference on Machine Learning
- 1 October 1998
We present an on‐line investment algorithm that achieves almost the same wealth as the best constant‐rebalanced portfolio determined in hindsight from the actual market outcomes. The algorithm…
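One rebalancing step of a multiplicative-update portfolio rule can be sketched as below. This is an assumed illustrative form (weights scaled by the exponentiated per-asset return relative to the day's portfolio growth, then renormalized); the learning rate `eta` and the name `eg_portfolio_step` are mine, not from the paper.

```python
import math

def eg_portfolio_step(weights, price_rel, eta=0.05):
    """One multiplicative rebalancing step.

    weights:   current portfolio distribution over assets (sums to 1).
    price_rel: price relatives for the day (closing / opening price per asset).
    Returns the new portfolio and the day's wealth growth factor.
    """
    r = sum(w * x for w, x in zip(weights, price_rel))  # portfolio growth today
    # scale each asset's weight by how much better it did than the portfolio
    new_w = [w * math.exp(eta * x / r) for w, x in zip(weights, price_rel)]
    z = sum(new_w)
    return [w / z for w in new_w], r
```

Each day the portfolio shifts weight toward the assets that outperformed it, while the small `eta` keeps it close to a constant-rebalanced strategy.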
Relating Data Compression and Learnability
It is demonstrated that the existence of a suitable data compression scheme is sufficient to ensure learnability, and the introduced compression scheme provides a rigorous model for studying data compression in connection with machine learning.
Relative Loss Bounds for On-Line Density Estimation with the Exponential Family of Distributions
This work considers on-line density estimation with a parameterized density from the exponential family, using a Bregman divergence to derive and analyze algorithms with the best possible relative loss bounds.
Tracking a Small Set of Experts by Mixing Past Posteriors
Tracking the Best Linear Predictor
This paper develops the methodology for lifting known static bounds to the shifting case and obtains bounds when the comparison class consists of linear neurons (linear combinations of experts).