Greedy function approximation: A gradient boosting machine.

@article{Friedman2001GreedyFA,
  title={Greedy function approximation: A gradient boosting machine.},
  author={Jerome H. Friedman},
  journal={Annals of Statistics},
  year={2001},
  volume={29},
  pages={1189-1232}
}
Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient descent "boosting" paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification.
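To make the steepest-descent view concrete, here is a minimal sketch (not Friedman's reference implementation; the stump base learner, stage count, and shrinkage value are illustrative assumptions). With squared-error loss the negative functional gradient at each training point is exactly the current residual, so each stage fits a small least-squares learner to the residuals and takes a damped step:

```python
import numpy as np

def fit_stump(X, r):
    """Least-squares regression stump: best single-feature threshold split."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            c_left, c_right = r[left].mean(), r[~left].mean()
            sse = ((r[left] - c_left) ** 2).sum() + ((r[~left] - c_right) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, c_left, c_right)
    _, j, t, c_left, c_right = best
    return lambda Z: np.where(Z[:, j] <= t, c_left, c_right)

def gradient_boost_ls(X, y, n_stages=200, shrinkage=0.1):
    """Stagewise additive expansion via steepest descent on squared-error loss,
    where the negative gradient at each point is the ordinary residual."""
    f0 = y.mean()                        # optimal constant model
    pred = np.full(len(y), f0)
    stages = []
    for _ in range(n_stages):
        residual = y - pred              # negative gradient of 0.5 * (y - f)^2
        h = fit_stump(X, residual)       # least-squares fit to pseudo-residuals
        stages.append(h)
        pred += shrinkage * h(X)         # damped update along the fitted direction
    return lambda Z: f0 + shrinkage * sum(h(Z) for h in stages)

# Toy usage: recover a smooth 1-D signal from noisy samples.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)
model = gradient_boost_ls(X, y)
```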
Citations

Optimization by gradient boosting
TLDR: A thorough analysis of two widespread versions of gradient boosting is provided, and a general framework for studying these algorithms from the point of view of functional optimization is introduced.
Gradient Boosting Trees
Gradient boosting is a machine learning technique for regression and classification problems, which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees.
Stochastic gradient boosting
Gradient boosting constructs additive regression models by sequentially fitting a simple parameterized function (base learner) to current "pseudo"-residuals by least squares at each iteration. The pseudo-residuals are the gradient of the loss functional being minimized, with respect to the model values at each training data point, evaluated at the current step.
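A minimal sketch of the stochastic variant described above (the scikit-learn tree is a stand-in base learner; depth, subsample fraction, and shrinkage are illustrative assumptions): at each iteration the base learner is fit, by least squares, to the pseudo-residuals of a random subsample only.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor  # stand-in base learner

def stochastic_gradient_boost(X, y, n_stages=100, shrinkage=0.1,
                              subsample=0.5, seed=0):
    """Squared-error gradient boosting where each base learner sees only a
    random subsample of the pseudo-residuals (the stochastic variant)."""
    rng = np.random.default_rng(seed)
    f0 = y.mean()
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_stages):
        idx = rng.choice(len(y), size=int(subsample * len(y)), replace=False)
        residual = y[idx] - pred[idx]          # pseudo-residuals on the subsample
        tree = DecisionTreeRegressor(max_depth=3).fit(X[idx], residual)
        trees.append(tree)
        pred += shrinkage * tree.predict(X)    # update the full fit
    return lambda Z: f0 + shrinkage * sum(t.predict(Z) for t in trees)
```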
A Fast Sampling Gradient Tree Boosting Framework
TLDR: This work combines gradient tree boosting with importance sampling, which achieves better performance by reducing the stochastic variance, and uses a regularizer to improve the diagonal approximation in the Newton step of gradient boosting.
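The exact sampling scheme and regularizer are specific to that paper; as a generic sketch of the underlying idea only (an assumption on my part, not the authors' algorithm), importance sampling for boosting typically draws points with probability proportional to gradient magnitude and attaches inverse-probability weights so the subsampled loss remains an unbiased estimate of the full loss:

```python
import numpy as np

def importance_subsample(g, frac=0.3, seed=0):
    """Draw indices with probability proportional to |gradient| and return
    weights that keep the weighted subsample loss unbiased for the full loss."""
    rng = np.random.default_rng(seed)
    p = np.abs(g) + 1e-12
    p /= p.sum()
    m = max(1, int(frac * len(g)))
    idx = rng.choice(len(g), size=m, replace=True, p=p)  # with replacement
    weights = 1.0 / (m * p[idx])   # importance weights: E[sum w_i L_i] = sum_i L_i
    return idx, weights

# Usage inside a boosting loop (sketch): fit the base learner with these
# weights, e.g. tree.fit(X[idx], residual[idx], sample_weight=weights).
```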
Conjugate direction boosting for regression
  • 2004
Boosting in the context of linear regression has gained additional attraction by the invention of least angle regression (LARS), where the connection between the lasso and forward stagewise fitting…
Boosting Algorithms: Regularization, Prediction and Model Fitting
We present a statistical perspective on boosting. Special emphasis is given to estimating potentially complex parametric or nonparametric models, including generalized linear and additive models as…
Additive logistic regression: A statistical view of boosting (Special Invited Paper)
Boosting is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data and then taking a weighted majority vote of the sequence of classifiers thus produced.
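A compact sketch of that reweighting loop (decision stumps and the exponential reweighting rule are the standard AdaBoost choices, not anything specific to this paper; labels are assumed to be in {-1, +1}):

```python
import numpy as np

def weighted_stump(X, y, w):
    """Best single-feature threshold classifier under example weights w."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= t, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    _, j, t, sign = best
    return lambda Z: sign * np.where(Z[:, j] <= t, 1, -1)

def adaboost(X, y, n_rounds=30):
    """AdaBoost: each round reweights the data to emphasize current mistakes."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(n_rounds):
        h = weighted_stump(X, y, w)
        pred = h(X)
        err = w[pred != y].sum()
        if err >= 0.5:                      # weak learner no better than chance
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        w *= np.exp(-alpha * y * pred)      # upweight misclassified examples
        w /= w.sum()
        ensemble.append((alpha, h))
    return lambda Z: np.sign(sum(a * h(Z) for a, h in ensemble))
```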
Boosting with early stopping: Convergence and consistency
Boosting is one of the most significant advances in machine learning for classification and regression. In its original and computationally flexible version, boosting seeks to minimize empirically a loss function in a greedy fashion.
Gradient and Newton Boosting for Classification and Regression
TLDR: The experiments show that Newton boosting outperforms gradient and hybrid gradient-Newton boosting in terms of predictive accuracy on the majority of datasets, and empirical evidence is presented that this difference is not primarily due to faster convergence of Newton boosting, but rather to the fact that Newton boosting often achieves lower test errors while at the same time having lower training losses.
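To make the gradient/Newton contrast concrete, here is a rough sketch (assumptions: binary labels in {0, 1}, logistic loss, and a tree whose leaf partition is given): a gradient step sets each leaf to the mean negative gradient, while a Newton step divides the summed gradient by the summed second derivative.

```python
import numpy as np

def logistic_grad_hess(y, f):
    """First and second derivatives of the logistic loss w.r.t. the margin f."""
    p = 1.0 / (1.0 + np.exp(-f))
    return p - y, p * (1.0 - p)              # gradient, (diagonal) Hessian

def leaf_values(g, h, leaf_index, n_leaves, newton=True):
    """Per-leaf step: Newton uses -sum(g)/sum(h); plain gradient uses -mean(g)."""
    vals = np.zeros(n_leaves)
    for k in range(n_leaves):
        mask = leaf_index == k
        if not mask.any():
            continue
        if newton:
            vals[k] = -g[mask].sum() / (h[mask].sum() + 1e-12)
        else:
            vals[k] = -g[mask].mean()
    return vals
```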

References

Showing 1-10 of 32 references
Additive Logistic Regression: a Statistical View of Boosting
TLDR: This work develops more direct approximations of boosting that exhibit performance comparable to other recently proposed multi-class generalizations of boosting, and suggests a minor modification to boosting that can reduce computation, often by factors of 10 to 50.
A Geometric Approach to Leveraging Weak Learners
TLDR: A new leveraging algorithm is introduced, based on a natural potential function, whose bounds are incomparable to AdaBoost's and whose empirical performance is similar to AdaBoost's.
Generalized Additive Models
TLDR: The class of generalized additive models is introduced, which replaces the linear form Σj βjXj by a sum of smooth functions Σj sj(Xj), and has the advantage of being completely automatic, i.e., no "detective work" is needed on the part of the statistician.
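A minimal backfitting sketch of that additive form (the running-mean smoother and fixed sweep count are illustrative assumptions, not the authors' smoother of choice): each component sj is refit to the partial residuals that exclude it, cycling over coordinates.

```python
import numpy as np

def running_mean_smoother(x, r, window=15):
    """Crude univariate smoother: running mean of r over x-sorted neighbours."""
    order = np.argsort(x)
    smoothed = np.convolve(r[order], np.ones(window) / window, mode="same")
    out = np.empty_like(smoothed)
    out[order] = smoothed                  # map back to the original ordering
    return out

def backfit_gam(X, y, n_sweeps=20):
    """Fit y ~ alpha + sum_j s_j(X_j) by backfitting over coordinates."""
    n, p = X.shape
    alpha = y.mean()
    s = np.zeros((p, n))                   # fitted s_j evaluated at the data
    for _ in range(n_sweeps):
        for j in range(p):
            partial = y - alpha - s.sum(axis=0) + s[j]   # residual without s_j
            s[j] = running_mean_smoother(X[:, j], partial)
            s[j] -= s[j].mean()            # center each component
    return alpha, s
```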
Improved Boosting Algorithms using Confidence-Rated Predictions
We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a simplified analysis of AdaBoost in this setting…
Experiments with a New Boosting Algorithm
TLDR: This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
Prediction Games and Arcing Algorithms
  • L. Breiman
  • Mathematics, Computer Science
  • Neural Computation
  • 1999
TLDR: The theory behind the success of adaptive reweighting and combining (arcing) algorithms such as AdaBoost in reducing generalization error has not been well understood; an explanation of why AdaBoost works is offered in terms of its ability to produce generally high margins.
Radial Basis Functions
TLDR: This paper gives a selective but up-to-date survey of several recent developments, explains their usefulness from the theoretical point of view, and contributes useful new classes of radial basis functions.
Learning representations by back-propagating errors
TLDR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector; in doing so, the network's hidden units come to represent important features of the task domain.
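A minimal numpy sketch of the rule described (assumptions: one hidden layer, sigmoid units, squared error, and an illustrative learning rate): the output error is propagated backwards through the layers to obtain each weight's gradient.

```python
import numpy as np

def backprop_step(W1, W2, x, t, lr=0.1):
    """One weight update for a single-hidden-layer sigmoid network,
    minimizing 0.5 * ||y - t||^2 between output y and target t."""
    sigma = lambda z: 1.0 / (1.0 + np.exp(-z))
    # Forward pass
    h = sigma(W1 @ x)                         # hidden activations
    y = sigma(W2 @ h)                         # network output
    # Backward pass: error signals for each layer
    delta2 = (y - t) * y * (1 - y)            # dE/d(pre-activation) at output
    delta1 = (W2.T @ delta2) * h * (1 - h)    # propagated back to hidden layer
    # Gradient descent on the weights
    W2 -= lr * np.outer(delta2, h)
    W1 -= lr * np.outer(delta1, x)
    return W1, W2, 0.5 * np.sum((y - t) ** 2)
```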
Improving Regressors using Boosting Techniques
TLDR: This work uses regression trees as fundamental building blocks in bagging and boosting committee machines, building a committee of regressors that may be superior to a single regressor.