Corpus ID: 7600536

Algorithmic Stability and Generalization Performance

@inproceedings{Bousquet2000AlgorithmicSA,
  title={Algorithmic Stability and Generalization Performance},
  author={Olivier Bousquet and Andr{\'e} Elisseeff},
  booktitle={NIPS},
  year={2000}
}
We present a novel way of obtaining PAC-style bounds on the generalization error of learning algorithms, explicitly using their stability properties. A stable learner is one for which the learned solution does not change much with small changes in the training set. The bounds we obtain do not depend on any measure of the complexity of the hypothesis space (e.g. VC dimension) but rather depend on how the learning algorithm searches this space, and can thus be applied even when the VC dimension is infinite.
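To make the stability notion concrete, here is a minimal sketch in notation introduced for illustration (m, β, M, ℓ, S, A_S are labels assumed here, not quoted from the paper) of uniform stability and the kind of bound this line of work yields:

% Sketch under assumed notation: S is a training set of m i.i.d. examples,
% A_S is the hypothesis the algorithm A returns on S, S^{\setminus i} is S with
% the i-th example removed, and \ell is a loss function bounded by M.
% Uniform (beta-)stability of A:
\[
  \forall S,\ \forall i,\ \forall z:\qquad
  \bigl|\,\ell(A_S, z) - \ell(A_{S^{\setminus i}}, z)\,\bigr| \;\le\; \beta .
\]
% A commonly stated form of the resulting bound: with probability at least
% 1 - \delta over the random draw of S, the risk R and empirical risk
% \widehat{R}_{\mathrm{emp}} of the learned hypothesis satisfy
\[
  R(A_S) \;\le\; \widehat{R}_{\mathrm{emp}}(A_S) \;+\; 2\beta
         \;+\; \bigl(4 m \beta + M\bigr)\sqrt{\frac{\ln(1/\delta)}{2m}} .
\]

When β decays as O(1/m), as it does for suitably regularized kernel or regularization-network algorithms, the bound vanishes at the usual O(1/√m) rate without any reference to the VC dimension of the hypothesis space.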
Almost-everywhere Algorithmic Stability and Generalization Error
The new notion of training stability of a learning algorithm is introduced and it is shown that, in a general setting, it is sufficient for good bounds on generalization error.
Stability and Generalization
These notions of stability for learning algorithms are defined and it is shown how to use these notions to derive generalization error bounds based on the empirical error and the leave-one-out error.
Stability Bounds for Non-i.i.d. Processes
The notion of algorithmic stability has been used effectively in the past to derive tight generalization bounds. A key advantage of these bounds is that they are designed for specific learning algorithms.
Stability Bounds for Non-i.i.d. Processes
Novel stability-based generalization bounds that hold even in this more general setting are proved, which strictly generalize the bounds given in the i.i.d. case.
Stability Analysis and Learning Bounds for Transductive Regression Algorithms
The notion of algorithmic stability is used to derive novel generalization bounds for several families of transductive regression algorithms, both by using convexity and closed-form solutions, and it is shown that a number of widely used transductive regression algorithms are in fact unstable.
Stability Bounds for Stationary φ-mixing and β-mixing Processes
Most generalization bounds in learning theory are based on some measure of the complexity of the hypothesis class used, independently of any algorithm. In contrast, the notion of algorithmic stability can be used to derive bounds tailored to specific learning algorithms, exploiting their particular properties.
Algorithmic Luckiness
This paper studies learning algorithms more directly and in a way that allows us to exploit the serendipity of the training sample, and presents an application of this framework to the maximum margin algorithm for linear classifiers, which results in a bound that exploits the margin.
Stable Foundations for Learning: a foundational framework for learning theory in both the classical and modern regime.
I consider here the class of supervised learning algorithms known as Empirical Risk Minimization (ERM). The classical theory by Vapnik and others characterizes universal consistency of ERM in the classical regime.
A Different Type of Convergence for Statistical Learning Algorithms
We discuss stability for a class of learning algorithms with respect to noisy labels. The algorithms we consider are for regression, and they involve the minimization of regularized risk functionals.
Stable Foundations for Learning: a framework for learning theory (in both the classical and modern regime)
I consider here the class of supervised learning algorithms known as Empirical Risk Minimization (ERM). The classical theory by Vapnik and others characterizes universal consistency of ERM in the classical regime.

References

A Study about Algorithmic Stability and Its Relation to Generalization
This technical report presents some results about how to control the generalization error for stable algorithms. We define a new notion of stable algorithm and derive confidence bounds. It is shown…
Scale-sensitive dimensions, uniform convergence, and learnability
A characterization of learnability in the probabilistic concept model is given, solving an open problem posed by Kearns and Schapire, and it is shown that the accuracy parameter plays a crucial role in determining the effective complexity of the learner's hypothesis class.
Self bounding learning algorithms
  • Y. Freund, COLT '98, 1998.
A self-bounding learning algorithm is an algorithm which, in addition to the hypothesis that it outputs, outputs a reliable upper bound on the generalization error of this hypothesis.
Algorithmic Stability and Sanity-Check Bounds for Leave-One-Out Cross-Validation
In this article we prove sanity-check bounds for the error of the leave-one-out cross-validation estimate of the generalization error: that is, bounds showing that the worst-case error of this estimate…
Generalization Performance of Classifiers in Terms of Observed Covering Numbers
It is shown that one can utilize an analogous argument in terms of the observed covering numbers on a single m-sample (being the actual observed data points) to bound the generalization performance of a classifier by using a margin-based analysis.
Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks
A theory is reported that shows the equivalence between regularization and a class of three-layer networks called regularization networks or hyper basis functions.
A Unified Framework for Regularization Networks and Support Vector Machines
This work presents Regularization Networks and Support Vector Machines in a unified framework in the context of Vapnik's theory of statistical learning, which provides a general foundation for the learning problem, combining functional analysis and statistics.
Theory of Reproducing Kernels.
Abstract: The present paper may be considered as a sequel to our previous paper in the Proceedings of the Cambridge Philosophical Society, Theorie generale de noyaux reproduisants - Premiere partie…
Distribution-free performance bounds for potential function rules
It is shown that the mean-square difference between the probability of error for the rule and its deleted estimate is bounded by A/√n, where A is an explicitly given constant depending only on M and the potential function.
A framework for structural risk minimisation
The paper introduces a framework for studying structural risk minimisation in a PAC context and considers the more general case when the hierarchy of classes is chosen in response to the data.