A decision-theoretic generalization of on-line learning and an application to boosting

@article{Freund1995ADG,
  title={A decision-theoretic generalization of on-line learning and an application to boosting},
  author={Yoav Freund and Robert E. Schapire},
  journal={J. Comput. Syst. Sci.},
  year={1997},
  volume={55},
  number={1},
  pages={119--139}
}
In the first part of the paper we consider the problem of dynamically apportioning resources among a set of options in a worst-case on-line framework. The model we study can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting. We show that the multiplicative weight-update Littlestone-Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more…
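To make the allocation model concrete, below is a minimal Python sketch of a Hedge-style multiplicative weight-update strategy of the kind the abstract describes: weights over N options are renormalized into an allocation each round, and each weight is then multiplied by beta raised to that option's loss. The function name, the default beta = 0.9, and the synthetic loss sequence are illustrative assumptions for this sketch, not code from the paper.

```python
import numpy as np

def hedge(loss_rounds, beta=0.9):
    """Run a Hedge-style multiplicative weight-update allocation strategy.

    loss_rounds : sequence of length-N arrays, each entry in [0, 1] giving
                  the loss of one of the N options in that round.
    beta        : update parameter in (0, 1); smaller values react faster.

    Returns the total expected loss of the mixed strategy.
    """
    w = np.ones(len(loss_rounds[0]))      # uniform initial weights
    total_loss = 0.0
    for losses in loss_rounds:
        losses = np.asarray(losses, dtype=float)
        p = w / w.sum()                   # allocation: a probability vector
        total_loss += float(p @ losses)   # loss of the mixture this round
        w = w * beta ** losses            # multiplicative weight update
    return total_loss

# Toy usage: the mixture's total loss stays close to that of the best option.
rng = np.random.default_rng(0)
rounds = [rng.random(5) for _ in range(200)]
print(hedge(rounds), np.sum(rounds, axis=0).min())
```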
Citations

Potential-Based Algorithms in On-Line Prediction and Game Theory
This paper shows that several known algorithms for sequential prediction problems (including Weighted Majority and the quasi-additive family of Grove, Littlestone, and Schuurmans), for playing iterated games, and for boosting are special cases of a general decision strategy based on the notion of potential, and it describes a notion of generalized regret and its applications in learning theory.
Special Invited Paper - Additive logistic regression: A statistical view of boosting
Boosting is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data…
On-line Learning and the Metrical Task System Problem
An experimental comparison is presented of how these algorithms perform on a process migration problem, which combines aspects of both the experts-tracking and MTS formalisms.
Boosting a Weak Learning Algorithm by Majority (to appear in Information and Computation)
We present an algorithm for improving the accuracy of algorithms for learning binary concepts. The improvement is achieved by combining a large number of hypotheses, each of which is generated by…
Practical Algorithms for On-line Sampling
This paper presents two on-line sampling algorithms for selecting a hypothesis, gives theoretical bounds on the number of examples needed, and analyses them experimentally, studying the problem of how to determine which of the hypotheses in the class is almost the best one.
Experiments with a New Boosting Algorithm
This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
Improved Boosting Algorithms Using Confidence-rated Predictions
We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a…

References

Showing 1-10 of 43 references.
Experiments with a New Boosting Algorithm
This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
Boosting a weak learning algorithm by majority
An algorithm is presented for improving the accuracy of algorithms for learning binary concepts by combining a large number of hypotheses, each of which is generated by training the given learning algorithm on a different set of examples.
The Weighted Majority Algorithm
A simple and effective method based on weighted voting is introduced for constructing a compound algorithm that is robust in the presence of errors in the data; the result is called the Weighted Majority Algorithm.
Boosting Decision Trees
A constructive, incremental learning system for regression problems that models data by means of locally linear experts that do not compete for data during learning, and derives asymptotic results for this method.
How to use expert advice
This work analyzes algorithms that predict a binary value by combining the predictions of several prediction strategies, called "experts", and shows how this leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best results currently known in this context.
Data filtering and distribution modeling algorithms for machine learning
This thesis is concerned with the analysis of algorithms for machine learning; it describes and analyses an algorithm for improving the performance of a general concept learning algorithm by selecting the labeled instances that are most informative.
Tight worst-case loss bounds for predicting with expert advice
This work considers on-line algorithms for predicting binary or continuous-valued outcomes when the algorithm has available the predictions made by N experts, and shows that for a large class of loss functions with binary outcomes, the total loss of the algorithm proposed by Vovk exceeds the total loss of the best expert by at most c ln N, where c is a constant determined by the loss function.
What Size Net Gives Valid Generalization?
It is shown that if m ≥ O((W/ε) log(N/ε)) random examples can be loaded on a feedforward network of linear threshold functions with N nodes and W weights, so that at least a fraction 1 − ε/2 of the examples are correctly classified, then one has confidence approaching certainty that the network will correctly classify a fraction 1 − ε of future test examples drawn from the same distribution.
Solving Multiclass Learning Problems via Error-Correcting Output Codes
It is demonstrated that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
Game theory, on-line prediction and boosting
An algorithm for learning to play repeated games based on the on-line prediction methods of Littlestone and Warmuth is described, which yields a simple proof of von Neumann's famous minmax theorem, as well as a provable method of approximately solving a game.