Boosting Algorithms as Gradient Descent in Function Space


Much recent attention, both experimental and theoretical, has been focused on classification algorithms which produce voted combinations of classifiers. Recent theoretical work has shown that the impressive generalization performance of algorithms like AdaBoost can be attributed to the classifier having large margins on the training data. We present abstract algorithms for finding linear and convex combinations of functions that minimize arbitrary cost functionals (i.e., functionals that do not necessarily depend on the margin). Many existing voting methods can be shown to be special cases of these abstract algorithms. Then, following previous theoretical results bounding the generalization performance of convex combinations of classifiers in terms of general cost functions of the margin, we present a new algorithm (DOOM II) for performing a gradient descent optimization of such cost functions. Experiments on several data sets from the UC Irvine repository demonstrate that DOOM II generally outperforms AdaBoost, especially in high noise situations. Margin distribution plots verify that DOOM II is willing to 'give up' on examples that are too hard in order to avoid overfitting. We also show that the overfitting behavior exhibited by AdaBoost can be quantified in terms of our proposed cost function.
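The core idea in the abstract, boosting as gradient descent on a cost functional of the margins, can be sketched in a few lines. This is a minimal illustration, not the paper's DOOM II algorithm: the 1-D toy dataset, the decision-stump grid, and the choice of the exponential cost c(m) = exp(-m) (which recovers AdaBoost-style example weights) are all assumptions made here for concreteness. Each round reweights examples by the negative gradient of the cost with respect to the current combination F, picks the weak hypothesis most correlated with that direction, and line-searches its voting weight.

```python
import numpy as np

# Toy 1-D dataset with labels in {-1, +1} (illustrative, not from the paper).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=200)
y = np.where(X + 0.1 * rng.normal(size=200) > 0, 1.0, -1.0)

def stump(theta, s):
    """Weak classifier h(x) = s * sign(x - theta), with outputs in {-1, +1}."""
    return lambda x: s * np.where(x >= theta, 1.0, -1.0)

def boost(X, y, cost, cost_grad, rounds=20):
    """Gradient descent in function space on C(F) = sum_i c(y_i F(x_i)):
    each round picks the weak hypothesis most correlated with -grad C(F),
    then line-searches its coefficient in the voted combination."""
    F = np.zeros_like(y)                     # current combination F(x_i)
    thetas = np.linspace(-1.0, 1.0, 41)      # candidate stump thresholds
    for _ in range(rounds):
        # -dC/dF(x_i) = -y_i c'(y_i F(x_i)) gives the example weights
        w = -y * cost_grad(y * F)
        best_h, best_corr = None, -np.inf
        for theta in thetas:
            for s in (-1.0, 1.0):
                h = stump(theta, s)
                corr = np.dot(w, h(X))       # correlation with -gradient
                if corr > best_corr:
                    best_h, best_corr = h, corr
        hx = best_h(X)
        # crude line search over the voting weight of the new hypothesis
        alphas = np.linspace(0.01, 1.0, 50)
        alpha = alphas[np.argmin([cost(y * (F + a * hx)).sum() for a in alphas])]
        F = F + alpha * hx
    return F

# Exponential cost c(m) = exp(-m): its gradient yields AdaBoost-like weights.
F = boost(X, y, cost=lambda m: np.exp(-m), cost_grad=lambda m: -np.exp(-m))
train_err = float(np.mean(np.sign(F) != y))
```

Swapping in a different margin cost (for example, one that flattens out for large negative margins, so the algorithm can 'give up' on very hard examples) changes only the `cost` and `cost_grad` arguments, which is the flexibility the abstract algorithms above are meant to capture.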



Cite this paper

@inproceedings{Mason1999BoostingAA,
  title  = {Boosting Algorithms as Gradient Descent in Function Space},
  author = {Llew Mason and Jonathan Baxter and Peter L. Bartlett and Marcus R. Frean},
  year   = {1999}
}