• Corpus ID: 2354909

Discussion of the Paper "Additive Logistic Regression: A Statistical View of Boosting"

  • Jerome H. Friedman, Trevor J. Hastie, Robert Tibshirani, Yoav Freund, Robert E. Schapire
The main contribution of this paper is in establishing a connection between boosting, a newcomer to the statistics scene, and additive models. One of the main properties of boosting that has made it interesting to statisticians and others is its relative (but not complete) immunity to overfitting. As pointed out by the authors, the current paper does not address this issue. Leo Breiman [1] tried to explain this behaviour in terms of bias and variance. In our paper with Bartlett and…

Boosting as a Regularized Path to a Maximum Margin Classifier

This work builds on recent results of Efron et al. to show that boosting approximately (and in some cases exactly) minimizes its loss criterion under an l1 constraint on the coefficient vector, and that as the constraint is relaxed the solution converges (in the separable case) to an "l1-optimal" separating hyper-plane.
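The path equivalence described above can be illustrated with a toy forward-stagewise fit, in the spirit of Efron et al.'s stagewise/lasso connection. This is a hedged sketch: the function name, data, and step size are illustrative choices, not taken from the paper.

```python
import numpy as np

def eps_stagewise(X, y, eps=0.01, n_steps=500):
    """Forward-stagewise fitting: at each step, nudge the coefficient of the
    predictor most correlated with the current residual by a tiny amount eps.
    As eps -> 0 the sequence of coefficient vectors traces an approximately
    l1-regularized path, mirroring the boosting-as-regularized-path view."""
    beta = np.zeros(X.shape[1])
    path = []
    for _ in range(n_steps):
        r = y - X @ beta                 # current residual
        corr = X.T @ r                   # correlation of each predictor with r
        j = np.argmax(np.abs(corr))      # most correlated predictor
        beta[j] += eps * np.sign(corr[j])
        path.append(beta.copy())
    return beta, path
```

Plotting `path` against the cumulative l1 arc length of the coefficients is one way to visualize the regularized path the blurb refers to.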

Improving Boosting by Exploiting Former Assumptions

This study proposes a new, hybrid approach consisting of modifications to the AdaBoost algorithm, and demonstrates that the performance of boosting can be improved by exploiting the assumptions generated in former iterations to correct the weights of the examples.

Response to Mease and Wyner, Evidence Contrary to the Statistical View of Boosting, JMLR 9:131-156, 2008

These myriad interpretations of AdaBoost form a robust theory of the algorithm that provides understanding from an extraordinary range of points of view in which each perspective tells us something unique about the algorithm.

The Fast Convergence of Boosting

This manuscript considers the convergence rate of boosting under a large class of losses, including the exponential and logistic losses, where the best previous rate of convergence was O(exp(1/ε²)); the principal technical hurdle throughout this work is the potential unattainability of the infimal empirical risk.

Boosting with the L2-Loss: Regression and Classification

This paper investigates a computationally simple variant of boosting, L2Boost, which is constructed from a functional gradient descent algorithm with the L2-loss function, and demonstrates that L2 boosting with a novel component-wise cubic smoothing spline is both practical and effective in the presence of high-dimensional predictors.
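The functional gradient descent construction behind L2Boost can be sketched in a few lines: each step fits a weak learner to the current residuals (the negative gradient of the L2 loss) and takes a shrunken step. In this illustrative sketch the paper's component-wise cubic smoothing splines are swapped for single-split regression stumps; the function names and the shrinkage value are assumptions, not the paper's.

```python
import numpy as np

def fit_stump(X, r):
    """Fit the best single-split regression stump to the residuals r."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:        # candidate split points
            left = X[:, j] <= t
            cl, cr = r[left].mean(), r[~left].mean()
            sse = ((r - np.where(left, cl, cr)) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, cl, cr)
    return best[1:]

def l2_boost(X, y, n_steps=100, nu=0.5):
    """Minimal L2Boost: repeatedly fit a weak learner to the residuals."""
    F = np.zeros(len(y))                         # current ensemble fit
    stumps = []
    for _ in range(n_steps):
        r = y - F                                # residuals = negative L2 gradient
        j, t, cl, cr = fit_stump(X, r)
        F += nu * np.where(X[:, j] <= t, cl, cr)  # shrunken gradient step
        stumps.append((j, t, cl, cr))
    return F, stumps
```

Because each step removes a fraction of the remaining residual, the training fit improves monotonically, which is the bias-reduction behaviour the paper analyzes.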

Boosting and Support Vector Machines as Optimal Separators

It is shown that boosting approximately (and in some cases exactly) minimizes its loss criterion with an L1 constraint and that as the constraint diminishes, or equivalently as the boosting iterations proceed, the solution converges in the separable case to an “L1-optimal” separating hyper-plane.

A Discussion of: "Process Consistency for AdaBoost" by Wenxin Jiang, "On the Bayes-risk consistency of regularized boosting methods" by Gábor Lugosi and Nicolas Vayatis, and "Statistical Behavior and Consistency of Classification Methods based on Convex Risk Minimization" by Tong Zhang

A view of boosting is given that is very different from its original interpretation and analysis as an algorithm for improving the accuracy of a weak learner, showing that conditional probability estimates P{y|x} can be obtained when minimizing any smooth convex loss function.
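For the exponential loss, for instance, the population minimizer of the boosting score is F(x) = ½ log(P{y=1|x} / P{y=−1|x}), which inverts to a conditional probability estimate. A one-line sketch (the function name is an illustrative assumption; `F_x` stands for a fitted real-valued boosting score):

```python
import math

def prob_from_score(F_x):
    """Conditional probability estimate P(y=1|x) implied by an
    exponential-loss boosting score F(x) = 0.5 * log(p / (1 - p))."""
    return 1.0 / (1.0 + math.exp(-2.0 * F_x))
```

A score of 0 maps to probability 0.5, and large positive or negative scores saturate toward 1 or 0, matching the sigmoidal link the statistical view of boosting implies.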

Improving the Predictive Power of AdaBoost: A Case Study in Classifying Borrowers

Applying these proposed modifications of the original version of boosting to three data sets from the banking industry provides results which indicate a significant and substantial improvement in predictive power over the original AdaBoost algorithm.

On Boosting Improvement: Error Reduction and Convergence Speed-Up

This article proposes a slight modification of the weight update rule of the AdaBoost algorithm, and shows that by exploiting an adaptive measure of local entropy, computed from a neighborhood graph built on the examples, it is possible to identify not only the outliers but also the examples located in the Bayesian error region.
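For reference, the unmodified AdaBoost weight-update rule that such variants start from can be sketched as follows. This is a minimal illustration with decision stumps as weak learners, not the authors' modified rule; the function names and data layout are assumptions.

```python
import numpy as np

def adaboost(X, y, n_rounds=10):
    """Minimal discrete AdaBoost with decision stumps (labels in {-1, +1})."""
    n = len(y)
    w = np.full(n, 1.0 / n)              # example weights, updated each round
    ensemble = []                         # list of (alpha, feature, threshold, sign)
    for _ in range(n_rounds):
        # Exhaustively pick the stump with lowest weighted error.
        best = None
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = s * np.where(X[:, j] <= t, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, s, pred)
        err, j, t, s, pred = best
        err = max(err, 1e-10)             # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        # The weight update rule: misclassified examples gain weight.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        ensemble.append((alpha, j, t, s))
    return ensemble

def predict(ensemble, X):
    score = sum(a * s * np.where(X[:, j] <= t, 1, -1) for a, j, t, s in ensemble)
    return np.sign(score)
```

The multiplicative update `w *= exp(-alpha * y * pred)` is the step that outlier-aware variants, such as the entropy-based modification described above, replace or temper.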

L2 Boosting in Kernel Regression

In this paper, we investigate the theoretical and empirical properties of L2 boosting with kernel regression estimates as weak learners. We show that each step of L2 boosting reduces the bias of the…

Boosting the margin: A new explanation for the effectiveness of voting methods

It is shown that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error.

The Alternating Decision Tree Learning Algorithm

A new type of classification rule is introduced: the alternating decision tree, a generalization of decision trees, voted decision trees, and voted decision stumps, which generates rules that are usually smaller in size and thus easier to interpret.

An Adaptive Version of the Boost by Majority Algorithm

The paper describes two methods for finding approximate solutions to the differential equations: one that results in a provably polynomial-time algorithm, and one based on the Newton-Raphson minimization procedure, which is much more efficient in practice but is not known to be polynomial.

Arcing classifiers

  • The Annals of Statistics,
  • 1998

Boosting the margin: A new explanation for the effectiveness of voting methods

  • Machine Learning: Proceedings of the Fourteenth International Conference
  • 1997
