On the doubt about margin explanation of boosting

@article{Gao2013OnTD,
  title={On the doubt about margin explanation of boosting},
  author={Wei Gao and Zhi-Hua Zhou},
  journal={Artif. Intell.},
  year={2013},
  volume={203},
  pages={1--18}
}

Citations

On the Insufficiency of the Large Margins Theory in Explaining the Performance of Ensemble Methods

The large margins theory is shown to be insufficient for explaining the performance of voting classifiers: the margin distribution of an ensemble solution can be improved, with complexity held fixed, without any improvement in test set performance.

Margins are Insufficient for Explaining Gradient Boosting

This work demonstrates that the $k$'th margin bound is inadequate for explaining the performance of state-of-the-art gradient boosters, and proves a stronger and more refined margin-based generalization bound for boosted classifiers that does succeed in explaining the performance of modern gradient boosters.
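
As context for the claim above, a hedged sketch of the $k$'th margin bound (constants and exact log factors omitted; this is not the paper's precise statement): for a voting classifier $f$ over a finite base class $\mathcal{H}$ and a training sample $S$ of size $m$, with probability at least $1-\delta$, for every $k$,

$$\Pr_{\mathcal{D}}[yf(x) \le 0] \;\le\; \frac{k}{m} + O\!\left(\sqrt{\frac{k}{m}\cdot\frac{\ln|\mathcal{H}|\ln m}{m\,\theta_k^2}} + \frac{\ln|\mathcal{H}|\ln m}{m\,\theta_k^2} + \frac{\ln(1/\delta)}{m}\right),$$

where $\theta_k$ is the $k$'th smallest training margin, so that $k/m = \Pr_S[yf(x) \le \theta_k]$.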

Margin-Based Generalization Lower Bounds for Boosted Classifiers

The first margin-based lower bounds on the generalization error of boosted classifiers are given; they nearly match the $k$th margin bound and thus almost settle the generalization performance of boosting classifiers in terms of margins.

Boosting via Approaching Optimal Margin Distribution

A new boosting method is proposed that utilizes the Emargin bound to approach the optimal margin distribution, and boosting via \(k^*\)-optimization of the margin distribution is shown to be sound and efficient.

On the Optimization of Margin Distribution

This work provides a new generalization error bound that depends strongly on the margin distribution, incorporating ingredients such as the average margin and the semi-variance, a new margin statistic for characterizing the margin distribution.
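
As a hedged reading of the statistics named above (the notation here is illustrative, not necessarily the paper's): given training margins $\gamma_i = y_i f(x_i)$, the average margin and the downside semi-variance can be written as

$$\bar{\gamma} = \frac{1}{m}\sum_{i=1}^{m} \gamma_i, \qquad V^{-} = \frac{1}{m}\sum_{i:\,\gamma_i < \bar{\gamma}} (\gamma_i - \bar{\gamma})^2,$$

so the semi-variance, unlike the ordinary variance, penalizes only those examples whose margin falls below the mean.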

Large margin distribution machine

The Large margin Distribution Machine (LDM), which tries to achieve better generalization performance by optimizing the margin distribution, is proposed, and its superiority is verified both theoretically and empirically.
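
A hedged sketch of the kind of objective LDM optimizes (illustrative notation and weighting; not the paper's verbatim formulation): with margins $\gamma_i = y_i\,w^{\top}\phi(x_i)$, empirical margin mean $\bar{\gamma}$, and empirical margin variance $\hat{V}$, LDM augments a standard hinge-loss objective with first- and second-order margin statistics,

$$\min_{w,\,\xi}\;\; \frac{1}{2}\lVert w\rVert^2 + \lambda_1 \hat{V} - \lambda_2 \bar{\gamma} + C\sum_{i=1}^{m}\xi_i \quad\text{s.t.}\quad y_i\,w^{\top}\phi(x_i) \ge 1-\xi_i,\;\; \xi_i \ge 0,$$

explicitly rewarding a large margin mean and penalizing margin variance instead of maximizing the minimum margin alone.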

On the Current State of Research in Explaining Ensemble Performance Using Margins

Several techniques are proposed, and evidence is provided suggesting that the generalization error of a voting classifier might be reduced by increasing the mean and decreasing the variance of the margins, in line with the current margin-based explanation of ensemble performance.

Optimal Margin Distribution Machine

The Optimal margin Distribution Machine (ODM), which can achieve better generalization performance by explicitly optimizing the margin distribution, is proposed, and its superiority is verified both theoretically and empirically.

The role of margins in boosting and ensemble performance

The role of margins in the performance of boosting and ensemble methods is examined; boosting can be very robust to overfitting, in most instances yielding lower generalization error than competing ensemble methodologies such as bagging and random forests.

Optimal Minimal Margin Maximization with Boosting

A new algorithm is presented that refutes the conjectured optimal trade-off between the number of hypotheses trained and the minimal margin over all training points, and a lower bound is proved which implies that the new algorithm is optimal.
...

References

Showing 1-10 of 57 references

A Refined Margin Analysis for Boosting Algorithms via Equilibrium Margin

A refined analysis of the margin theory is made, proving a bound in terms of a new margin measure called the Equilibrium margin (Emargin) that is uniformly sharper than Breiman's minimum margin bound.

Boosting in the Limit: Maximizing the Margin of Learned Ensembles

The crucial question of why boosting works so well in practice, and how to further improve upon it, remains mostly open; it is concluded that no simple version of the minimum-margin story can be complete.

How boosting the margin can also boost classifier complexity

A close look at Breiman's compelling but puzzling results finds that the poorer performance of arc-gv can be explained by the increased complexity of the base classifiers it uses, an explanation supported by experiments and entirely consistent with the margins theory.

Boosting the margin: A new explanation for the effectiveness of voting methods

It is shown that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error.
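
For reference, a sketch of the framework this paper introduced (stated up to constants): a voting classifier is $f(x) = \sum_t \alpha_t h_t(x)$ with $\alpha_t \ge 0$ and $\sum_t \alpha_t = 1$, and the margin of an example $(x,y)$ is $yf(x) \in [-1,1]$. The classic bound then states that, with probability at least $1-\delta$, for every $f$ and every $\theta > 0$,

$$\Pr_{\mathcal{D}}[yf(x) \le 0] \;\le\; \Pr_S[yf(x) \le \theta] + O\!\left(\sqrt{\frac{\ln|\mathcal{H}|\ln m}{m\,\theta^2} + \frac{\ln(1/\delta)}{m}}\right),$$

so a training margin distribution concentrated above a large $\theta$ yields a small test error, independently of the number of boosting rounds.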

Boosting Through Optimization of Margin Distributions

A new boosting algorithm is designed, termed margin-distribution boosting (MDBoost), which directly maximizes the average margin and minimizes the margin variance at the same time, and a totally corrective optimization algorithm based on column generation is proposed to implement MDBoost.
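
A hedged sketch of the objective described above (illustrative normalization; the paper's exact form may differ): with ensemble margins $\gamma_i = y_i \sum_t \alpha_t h_t(x_i)$, MDBoost solves roughly

$$\max_{\alpha \ge 0,\; \sum_t \alpha_t = 1}\;\; \frac{1}{m}\sum_{i=1}^{m}\gamma_i \;-\; \frac{\lambda}{2m(m-1)}\sum_{i<j}(\gamma_i - \gamma_j)^2,$$

maximizing the average margin while penalizing margin variance via its pairwise form; column generation then adds one weak hypothesis at a time, re-solving the problem to full (totally corrective) optimality after each addition.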

Empirical margin distributions and bounding the generalization error of combined classifiers

New probabilistic upper bounds on the generalization error of complex classifiers that are combinations of simple classifiers are proved, based on methods from the theory of Gaussian and empirical processes.
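
A hedged sketch of the flavor of these bounds (up to constants): for $f$ in the convex hull of a base class $\mathcal{H}$ with empirical Rademacher complexity $\mathfrak{R}_m(\mathcal{H})$, with probability at least $1-\delta$, for every margin level $\theta > 0$,

$$\Pr_{\mathcal{D}}[yf(x) \le 0] \;\le\; \Pr_S[yf(x) \le \theta] + O\!\left(\frac{\mathfrak{R}_m(\mathcal{H})}{\theta}\right) + O\!\left(\sqrt{\frac{\ln(1/\delta)}{m}}\right),$$

exploiting the key fact that taking a convex hull does not increase Rademacher complexity.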

Data-dependent margin-based generalization bounds for classification

New margin-based inequalities for the probability of error of classifiers are presented that can be calculated using the training data and may therefore be effectively used for model selection; they appear to be sharper and more general than recent results involving empirical complexity measures.

Convexity, Classification, and Risk Bounds

A general quantitative relationship between the risk as assessed using the 0–1 loss and the risk as assessed using any nonnegative surrogate loss function is provided, and it is shown that this relationship gives nontrivial upper bounds on excess risk under the weakest possible condition on the loss function.
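
A hedged sketch of the central relationship (the standard statement of this result, in my notation): let $R(f)$ be the 0–1 risk, $R^*$ the Bayes risk, and $R_\phi(f)$, $R_\phi^*$ the corresponding quantities for a surrogate loss $\phi$. For a classification-calibrated $\phi$ there is a nondecreasing transform $\psi$ with $\psi(0) = 0$ such that

$$\psi\bigl(R(f) - R^*\bigr) \;\le\; R_\phi(f) - R_\phi^*,$$

so driving the excess surrogate risk to zero forces the excess 0–1 risk to zero; for the hinge loss, for instance, $\psi(\theta) = |\theta|$.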

Special Invited Paper-Additive logistic regression: A statistical view of boosting

This work shows that the seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood, and develops more direct approximations that exhibit nearly identical results to boosting.
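
A brief sketch of the statistical view summarized above: AdaBoost can be read as stagewise additive modeling under the exponential loss, fitting $F(x) = \sum_t \alpha_t h_t(x)$ by greedily minimizing $\mathbb{E}\bigl[e^{-yF(x)}\bigr]$, whose population minimizer is

$$F^*(x) = \frac{1}{2}\ln\frac{P(y=1 \mid x)}{P(y=-1 \mid x)},$$

i.e., half the log-odds; this connection motivates the more direct likelihood-based approximations (such as LogitBoost) developed in the paper.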

Generalization Performance of Classifiers in Terms of Observed Covering Numbers

It is shown that one can utilize an analogous argument in terms of the observed covering numbers on a single m-sample (the actual observed data points) to bound the generalization performance of a classifier via a margin-based analysis.
...