# Prediction Games and Arcing Algorithms

@article{Breiman1999PredictionGA, title={Prediction Games and Arcing Algorithms}, author={Leo Breiman}, journal={Neural Computation}, year={1999}, volume={11}, pages={1493-1517} }

The theory behind the success of adaptive reweighting and combining algorithms (arcing) such as AdaBoost (Freund & Schapire, 1996a, 1997) and others in reducing generalization error has not been well understood. By formulating prediction as a game where one player makes a selection from instances in the training set and the other a convex linear combination of predictors from a finite set, existing arcing algorithms are shown to be algorithms for finding good game strategies. The minimax…
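
To make the game formulation concrete, here is a minimal sketch on a hypothetical random error matrix (not data from the paper): the row player keeps weights over training instances, the column player best-responds with the base classifier of lowest weighted error, and the multiplicative reweighting of missed instances drives the averaged play toward an approximate minimax convex combination of predictors.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 0/1 error matrix: err[i, j] = 1 iff base classifier j
# misclassifies training instance i.
err = (rng.random((8, 5)) < 0.3).astype(float)

eta = 0.5                       # step size for the multiplicative update
w = np.ones(err.shape[0])       # instance weights: the row player's strategy
plays = np.zeros(err.shape[1])  # how often each classifier is chosen

for t in range(200):
    p = w / w.sum()             # current distribution over instances
    j = int(np.argmin(p @ err)) # column player best-responds with the
                                # classifier of lowest weighted error
    plays[j] += 1
    w *= np.exp(eta * err[:, j])  # upweight the instances classifier j missed

alpha = plays / plays.sum()     # empirical mixture over classifiers: an
print(alpha)                    # approximate minimax convex combination
```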


#### 538 Citations

On the Convergence Properties of Optimal AdaBoost

- Computer Science, Mathematics
- ArXiv
- 2012

This paper establishes the convergence of "Optimal AdaBoost," a term coined by Rudin, Daubechies, and Schapire in 2004, and proves the convergence, with the number of rounds, of the classifier itself, its generalization error, and its resulting margins for fixed data sets, under certain reasonable conditions.

Axiomatic Characterization of AdaBoost and the Multiplicative Weight Update Procedure

- Computer Science
- ECML/PKDD
- 2018

It is proved that any method that satisfies three natural axioms on adaptive re-weighting and combining algorithms must be minimizing the composition of an exponential loss with an additive function, and that the weights must be updated according to the multiplicative weight update procedure.
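
The exponential-loss view can be illustrated directly. Below is a minimal sketch on hypothetical random data (labels and base-classifier predictions in {-1, +1}): the instance weights are maintained as w_i = exp(-y_i F(x_i)) for the additive model F built so far, each round's reweighting is a multiplicative update, and the printed exponential loss never increases.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: labels y and the predictions of four base
# classifiers on ten instances (H[i, j] = h_j(x_i)).
y = rng.choice([-1.0, 1.0], size=10)
H = rng.choice([-1.0, 1.0], size=(10, 4))

F = np.zeros_like(y)            # additive model built so far
w = np.exp(-y * F)              # instance weights; initially all ones

for t in range(4):
    p = w / w.sum()
    mis = (H * y[:, None] < 0).astype(float)   # misclassification indicators
    errs = p @ mis                             # weighted error of each h_j
    j = int(np.argmin(errs))                   # pick the best weak learner
    eps = float(np.clip(errs[j], 1e-9, 1 - 1e-9))
    a = 0.5 * np.log((1 - eps) / eps)          # AdaBoost coefficient
    F += a * H[:, j]                           # additive update of F
    w *= np.exp(-a * y * H[:, j])              # multiplicative weight update
    print(t, np.exp(-y * F).mean())            # exponential loss never increases
```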

The Dynamics of AdaBoost: Cyclic Behavior and Convergence of Margins

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2004

This work reduces AdaBoost to a nonlinear iterated map and studies the evolution of its weight vectors to understand AdaBoost's convergence properties completely, and shows that AdaBoost does not always converge to a maximum margin combined classifier, answering an open question.

Deriving and Analyzing Learning Algorithms

Project summary: There is a large variety of learning problems across all disciplines waiting for the right algorithms. Many of these are on-line problems, where the learning algorithm continually…

Additive Logistic Regression: a Statistical View of Boosting

- Computer Science
- 1998

This work develops more direct approximations of boosting that exhibit performance comparable to other recently proposed multi-class generalizations of boosting, and suggests a minor modification to boosting that can reduce computation, often by factors of 10 to 50.

Additive Logistic Regression: a Statistical View of Boosting

- 1998

Boosting (Freund & Schapire 1996, Schapire & Singer 1998) is one of the most important recent developments in classification methodology. The performance of many classification algorithms often can be…

On the Dynamics of Boosting

- Computer Science, Mathematics
- NIPS
- 2003

By considering AdaBoost as a dynamical system, this work is able to prove Rätsch and Warmuth's conjecture that AdaBoost may fail to converge to a maximal-margin combined classifier when given a 'non-optimal' weak learning algorithm.

Improving Policy Functions in High-Dimensional Dynamic Games

- Computer Science
- 2015

The approach combines ideas from the machine learning literature and the econometric analysis of games to derive a one-step improvement policy over any given benchmark policy in high-dimensional Markov dynamic optimization problems, focusing in particular on dynamic games.

The Rate of Convergence of Adaboost

- Mathematics, Computer Science
- COLT
- 2011

The rate at which AdaBoost iteratively converges to the minimum of the "exponential loss" is studied, and the dependence of the rate on ε is shown to be optimal up to constant factors: at least Ω(1/ε) rounds are necessary to come within ε of the optimal exponential loss.

Greedy Function Approximation: A Gradient Boosting Machine

- Mathematics
- 1999

Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and…
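
As a sketch of this function-space view (with made-up data and a hand-rolled stump learner, not Friedman's implementation): for squared loss the negative gradient at the current fit is just the residual vector, and each boosting stage fits a base learner to it and adds a shrunken copy to the stagewise additive expansion.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, size=200)            # hypothetical 1-D regression data
y = np.sin(x) + 0.1 * rng.normal(size=200)

def fit_stump(x, r):
    """Best single-threshold split minimizing squared error on residuals r."""
    best = None
    for s in np.quantile(x, np.linspace(0.05, 0.95, 19)):
        left, right = r[x <= s], r[x > s]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean())**2).sum() + ((right - right.mean())**2).sum()
        if best is None or sse < best[0]:
            best = (sse, s, left.mean(), right.mean())
    _, s, a, b = best
    return lambda z: np.where(z <= s, a, b)

nu = 0.1                        # shrinkage (learning rate)
F = np.zeros_like(y)            # stagewise additive expansion built so far
ensemble = []
for m in range(100):
    r = y - F                   # negative gradient of squared loss at F
    h = fit_stump(x, r)         # steepest-descent step in function space
    ensemble.append(h)
    F += nu * h(x)              # add a shrunken copy of the base learner

print("training MSE:", np.mean((y - F)**2))
```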

#### References

Showing 10 of 23 references.

Arcing the edge

- 1997

Recent work has shown that adaptively reweighting the training set, growing a classifier using the new weights, and combining the classifiers constructed to date can significantly decrease…

A decision-theoretic generalization of on-line learning and an application to boosting

- Computer Science
- EuroCOLT
- 1995

The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and the multiplicative weight-update rule of Littlestone and Warmuth can be adapted to this model, yielding bounds that are slightly weaker in some cases but applicable to a considerably more general class of learning problems.
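
A minimal sketch of that multiplicative weight-update rule in the experts setting it generalizes (the per-round expert losses here are hypothetical random draws): each expert's weight is scaled by beta raised to its loss, and the learner's cumulative expected loss stays close to that of the best expert in hindsight.

```python
import numpy as np

rng = np.random.default_rng(3)
n_experts, T = 5, 500
beta = 0.9                        # update factor in (0, 1)
w = np.ones(n_experts)            # one weight per expert
cum = np.zeros(n_experts)         # cumulative loss of each expert
learner_loss = 0.0

for t in range(T):
    p = w / w.sum()               # distribution over experts
    loss = rng.random(n_experts)  # hypothetical per-expert losses in [0, 1]
    learner_loss += p @ loss      # learner suffers the expected loss
    cum += loss
    w *= beta ** loss             # multiplicative weight update

print("learner:", learner_loss, "best expert in hindsight:", cum.min())
```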

Bias, Variance, and Arcing Classifiers

- Computer Science
- 1996

This work explores two arcing algorithms, compares them to each other and to bagging, and tries to understand how arcing works; arcing is more successful than bagging at variance reduction.

Game theory, on-line prediction and boosting

- Mathematics, Computer Science
- COLT '96
- 1996

An algorithm for learning to play repeated games based on the on-line prediction methods of Littlestone and Warmuth is described, which yields a simple proof of von Neumann's famous minimax theorem, as well as a provable method of approximately solving a game.

Self bounding learning algorithms

- Mathematics, Computer Science
- COLT' 98
- 1998

A self-bounding learning algorithm is an algorithm which, in addition to the hypothesis that it outputs, outputs a reliable upper bound on the generalization error of this hypothesis.

Boosting Decision Trees

- Computer Science
- NIPS
- 1995

A constructive, incremental learning system for regression problems that models data by means of locally linear experts that does not compete for data during learning and derives asymptotic results for this method.

Experiments with a New Boosting Algorithm

- Computer Science
- ICML
- 1996

This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.

Generalization in Decision Trees and DNF: Does Size Matter?

- Mathematics, Computer Science
- NIPS
- 1997

This paper shows that with high probability any decision tree of depth no more than d that is consistent with m training examples has misclassification probability no more than O(((N_eff · VCdim(U) · log² m · log d) / m)^{1/2}), where U is the class of node decision functions and N_eff ≤ N can be thought of as the effective number of leaves.

Combinations of Weak Classifiers

- Mathematics, Computer Science
- NIPS
- 1996

The method developed is able to obtain combinations of weak classifiers with good generalization performance and fast training time on a variety of test problems and real applications; when the strength of the weak classifiers is properly chosen, combinations of weak classifiers can achieve good generalization performance with polynomial space and time complexity.

Bagging, Boosting, and C4.5

- Computer Science
- AAAI/IAAI, Vol. 1
- 1996

Results of applying Breiman's bagging and Freund and Schapire's boosting to a system that learns decision trees, tested on a representative collection of datasets, show that boosting yields the greater benefit.