Special Invited Paper-Additive logistic regression: A statistical view of boosting

@article{Friedman2000SpecialIP,
  title={Special Invited Paper-Additive logistic regression: A statistical view of boosting},
  author={Jerome H. Friedman and Trevor Hastie and Robert Tibshirani},
  journal={Annals of Statistics},
  year={2000},
  volume={28},
  pages={337-407}
}
  • J. Friedman, T. Hastie, R. Tibshirani
  • Published 2000
  • Mathematics
  • Annals of Statistics
Boosting is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data and then taking a weighted majority vote of the sequence of classifiers thus produced. For many classification algorithms, this simple strategy results in dramatic improvements in performance. We show that this seemingly mysterious phenomenon can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood…
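The reweight-and-vote scheme described in the abstract is the Discrete AdaBoost procedure the paper analyzes. A minimal sketch, assuming NumPy and scikit-learn decision stumps as the weak learner; the function names and defaults are illustrative, not code from the paper:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=50):
    """Discrete AdaBoost; class labels y must be coded as -1/+1."""
    n = len(y)
    w = np.full(n, 1.0 / n)                    # start with uniform weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)       # weak learner on reweighted data
        pred = stump.predict(X)
        err = np.clip(np.sum(w * (pred != y)), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # vote weight for this round
        w *= np.exp(-alpha * y * pred)         # up-weight misclassified examples
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    """Weighted majority vote of the fitted weak classifiers."""
    votes = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(votes)
```

The paper's contribution is the statistical reading of this procedure as stagewise fitting of an additive logistic model, not any particular implementation.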

Citations

Noise peeling methods to improve boosting algorithms
It is found that peeling methods generally perform better than AdaBoost and other noise-resistant boosters, especially when high levels of noise are present in the data.
Robust Boosting via Convex Optimization: Theory and Applications
It is shown that boosting can be used to solve large-scale constrained optimization problems whose solutions are well characterizable, and convergence guarantees are derived for a quite general family of boosting algorithms.
Gradient and Newton Boosting for Classification and Regression
The experiments show that Newton boosting outperforms gradient and hybrid gradient-Newton boosting in terms of predictive accuracy on the majority of datasets, and empirical evidence is presented that this difference is not primarily due to faster convergence of Newton boosting, but rather to Newton boosting often achieving lower test errors while at the same time having lower training losses.
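To make the distinction concrete, here is a hedged sketch of one boosting iteration under the binomial log-loss, contrasting the gradient step (fit the base learner to the negative gradient) with the Newton step (a Hessian-weighted fit). Variable names, the shrinkage value, and the tree depth are illustrative assumptions, not taken from the cited paper:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boosting_step(X, y01, F, kind="newton", nu=0.1, max_depth=3):
    """One boosting iteration for binomial log-loss; y01 in {0,1}, F = current log-odds."""
    p = 1.0 / (1.0 + np.exp(-F))     # current probability estimates
    grad = p - y01                   # first derivative of the loss w.r.t. F
    hess = p * (1.0 - p)             # second derivative of the loss w.r.t. F
    tree = DecisionTreeRegressor(max_depth=max_depth)
    if kind == "gradient":
        tree.fit(X, -grad)           # gradient boosting: fit the negative gradient
    else:
        # Newton boosting: Hessian-weighted least-squares fit to -grad/hess
        tree.fit(X, -grad / np.maximum(hess, 1e-12), sample_weight=hess)
    return F + nu * tree.predict(X)  # shrunken update of the log-odds
```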
A New Perspective on Boosting in Linear Regression via Subgradient Optimization and Relatives
This paper derives novel, comprehensive computational guarantees for several boosting algorithms in linear regression by using techniques of modern first-order methods in convex optimization, and provides a precise theoretical description of the amount of data-fidelity and regularization imparted by running a boosting algorithm with a prespecified learning rate for a fixed but arbitrary number of iterations, for any dataset.
Multi-resolution boosting for classification and regression problems
This paper proposes a novel multi-resolution approach for choosing the weak learners during additive modeling; it applies insights from multi-resolution analysis and chooses the optimal learners at multiple resolutions during different iterations of boosting algorithms, which are simple yet powerful additive modeling methods.
An adaptive multiclass boosting algorithm for classification
  • Shixun Wang, Peng Pan, Y. Lu
  • Mathematics, Computer Science
  • 2014 International Joint Conference on Neural Networks (IJCNN)
  • 2014
A novel large-margin loss function, designed to directly yield a multiclass classifier with guaranteed Bayes consistency and global optimization, is minimized by gradient descent or Newton's method in a multidimensional functional space.
Greedy Function Approximation: A Gradient Boosting Machine
Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization…
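A minimal sketch of this function-space view for squared-error regression: each stage fits a base learner to the negative gradient of the loss (here simply the residuals) and adds it to the expansion with a small shrinkage factor. The names and defaults are illustrative assumptions, not Friedman's implementation:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_regression(X, y, n_stages=100, nu=0.1, max_depth=3):
    """Stagewise additive expansion F(x) = F0 + sum_m nu * f_m(x) for squared error."""
    F0 = y.mean()                                # constant initial approximation
    F = np.full(len(y), F0)
    trees = []
    for _ in range(n_stages):
        residuals = y - F                        # negative gradient of 1/2*(y - F)^2
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        F += nu * tree.predict(X)                # shrunken stagewise update
        trees.append(tree)
    return F0, trees

def gradient_boost_predict(F0, trees, X, nu=0.1):
    return F0 + nu * sum(t.predict(X) for t in trees)
```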
Improved Boosting Algorithms Using Confidence-rated Predictions
We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a…
Boosting Algorithms as Gradient Descent in Function Space
Much recent attention, both experimental and theoretical, has been focused on classification algorithms which produce voted combinations of classifiers. Recent theoretical work has shown that the…

References

Showing 1-10 of 71 references
A decision-theoretic generalization of on-line learning and an application to boosting
The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and the multiplicative weight-update Littlestone-Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems.
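A small sketch of the multiplicative weight-update (Hedge) rule referred to above: each expert's weight is multiplied by beta raised to its loss after every trial, and predictions follow the normalized weights. The function name, the beta default, and the loss encoding are illustrative assumptions:

```python
import numpy as np

def hedge(losses, beta=0.9):
    """losses: array of shape (n_trials, n_experts) with per-trial losses in [0, 1]."""
    n_trials, n_experts = losses.shape
    w = np.ones(n_experts)               # start with equal weight on every expert
    total_loss = 0.0
    for t in range(n_trials):
        p = w / w.sum()                  # probability of following each expert
        total_loss += p @ losses[t]      # expected loss of the weighted mixture
        w = w * beta ** losses[t]        # multiplicative weight update
    return w / w.sum(), total_loss
```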
Improved Boosting Algorithms Using Confidence-rated Predictions
We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a…
Experiments with a New Boosting Algorithm
This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
Greedy function approximation: A gradient boosting machine.
Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization…
Boosting in the Limit: Maximizing the Margin of Learned Ensembles
The crucial question as to why boosting works so well in practice, and how to further improve upon it, remains mostly open, and it is concluded that no simple version of the minimum-margin story can be complete.
Boosting with Bayesian stumps
  • D. Denison
  • Mathematics, Computer Science
  • Stat. Comput.
  • 2001
This paper reviews some of the commonly used methods for performing boosting and shows how they can be fit into a Bayesian setup at each iteration of the algorithm.
Boosting the margin: A new explanation for the effectiveness of voting methods
It is shown that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error.
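The margin of a voted ensemble on an example is its normalized, signed vote total. A small illustrative sketch, assuming weak classifiers with a scikit-learn-style predict method and -1/+1 labels (for instance the output of the AdaBoost sketch earlier on this page); the names are hypothetical:

```python
import numpy as np

def voting_margins(classifiers, alphas, X, y):
    """Per-example margins y_i * sum_t a_t h_t(x_i) / sum_t a_t, each in [-1, 1]."""
    alphas = np.asarray(alphas, dtype=float)
    votes = sum(a * clf.predict(X) for clf, a in zip(classifiers, alphas))
    return y * votes / alphas.sum()      # large positive values: confident, correct votes
```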
Boosting Decision Trees
A constructive, incremental learning system for regression problems is presented that models data by means of locally linear experts which do not compete for data during learning, and asymptotic results are derived for this method.
Using output codes to boost multiclass learning problems
This paper describes a new technique for multiclass learning problems obtained by combining Freund and Schapire's boosting algorithm with the main ideas of Dietterich and Bakiri's method of error-correcting output codes (ECOC), and shows that the new hybrid method has the advantages of both.
Boosting a weak learning algorithm by majority
An algorithm for improving the accuracy of algorithms for learning binary concepts by combining a large number of hypotheses, each of which is generated by training the given learning algorithm on a different set of examples, is presented.