Corpus ID: 235606222

Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound

@inproceedings{Zantedeschi2021LearningSM,
  title={Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound},
  author={Valentina Zantedeschi and Paul Viallard and Emilie Morvant and R{\'e}mi Emonet and Amaury Habrard and Pascal Germain and Benjamin Guedj},
  booktitle={NeurIPS},
  year={2021}
}
We investigate a stochastic counterpart of majority votes over finite ensembles of classifiers, and study its generalization properties. While our approach holds for arbitrary distributions, we instantiate it with Dirichlet distributions: this allows for a closed-form and differentiable expression for the expected risk, which then turns the generalization bound into a tractable training objective. The resulting stochastic majority vote learning algorithm achieves state-of-the-art accuracy and… 
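The abstract names the three ingredients of the approach: a Dirichlet distribution over the voters' weights, the expected risk of the resulting stochastic majority vote, and a PAC-Bayes bound used as the training objective. Below is a minimal sketch of that recipe, assuming a Monte-Carlo estimate of the expected risk, a McAllester-style complexity term, and a crude random search over the Dirichlet parameters; the paper itself derives a closed-form, differentiable expected risk and optimizes the bound directly, so none of these simplifications should be read as the actual algorithm.

```python
# Illustrative sketch only (not the paper's algorithm): voter weights drawn from
# a Dirichlet(alpha), expected majority-vote risk estimated by Monte Carlo, plus
# a KL penalty to a uniform Dirichlet prior as a PAC-Bayes-flavoured objective.
import numpy as np
from scipy.special import gammaln, digamma

rng = np.random.default_rng(0)

def kl_dirichlet(alpha, beta):
    """KL( Dir(alpha) || Dir(beta) ) in closed form."""
    a0, b0 = alpha.sum(), beta.sum()
    return (gammaln(a0) - gammaln(alpha).sum()
            - gammaln(b0) + gammaln(beta).sum()
            + np.dot(alpha - beta, digamma(alpha) - digamma(a0)))

def expected_mv_risk(votes, y, alpha, n_samples=200):
    """Monte-Carlo estimate of E_{w ~ Dir(alpha)}[ error of the w-weighted vote ].
    votes: (n_voters, n_examples) array in {-1,+1}; y: labels in {-1,+1}."""
    risks = []
    for w in rng.dirichlet(alpha, size=n_samples):
        mv = np.sign(w @ votes)            # weighted majority vote per example
        risks.append(np.mean(mv != y))
    return float(np.mean(risks))

# Toy ensemble: 5 weak voters on 200 examples, each voter ~35% error.
n_voters, n = 5, 200
y = rng.choice([-1, 1], size=n)
votes = np.where(rng.random((n_voters, n)) < 0.35, -y, y)

prior = np.ones(n_voters)                  # uniform Dirichlet prior
best = None
for _ in range(50):                        # crude random search over alpha
    alpha = rng.uniform(0.5, 5.0, size=n_voters)
    # empirical expected risk + sqrt(KL / 2n) complexity term
    obj = expected_mv_risk(votes, y, alpha) + np.sqrt(kl_dirichlet(alpha, prior) / (2 * n))
    if best is None or obj < best[0]:
        best = (obj, alpha)

print("best objective:", round(best[0], 3), "alpha:", np.round(best[1], 2))
```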
On Margins and Generalisation for Voting Classifiers
TLDR
This work studies the generalisation properties of majority voting on finite ensembles of classifiers, proving margin-based generalisation bounds via the PAC-Bayes theory, and adds perspective to the debate on the “margins theory” proposed by Schapire et al.
A PAC-Bayes Analysis of Adversarial Robustness
We propose the first general PAC-Bayesian generalization bounds for adversarial robustness, that estimate, at test time, how much a model will be invariant to imperceptible perturbations in the…
On Margins and Derandomisation in PAC-Bayes
TLDR
This work gives a general recipe for derandomising PAC-Bayesian bounds using margins, extending to partially-derandomised predictors where only some of the randomness is removed, and extends bounds to cases where the concentration properties of the predictors are otherwise poor.
Shedding a PAC-Bayesian Light on Adaptive Sliced-Wasserstein Distances
TLDR
The PAC-Bayesian theory and the central observation that SW actually hinges on a slice-distribution-dependent Gibbs risk are leveraged to bring new contributions to this line of research.
Non-Vacuous Generalisation Bounds for Shallow Neural Networks
TLDR
This work derives new generalisation bounds through the PAC-Bayesian theory for shallow neural networks with a single hidden layer, namely those with L2-normalised data and either a sigmoid-shaped Gaussian error function (“erf”) activation or a Gaussian Error Linear Unit (GELU) activation.

References

SHOWING 1-10 OF 47 REFERENCES
Self-Bounding Majority Vote Learning Algorithms by the Direct Minimization of a Tight PAC-Bayesian C-Bound
TLDR
By directly optimizing PAC-Bayesian guarantees on the C-Bound, self-bounding majority vote learning algorithms are derived that are scalable and lead to accurate predictors paired with non-vacuous guarantees.
From PAC-Bayes Bounds to Quadratic Programs for Majority Votes
TLDR
A new PAC-Bayes theorem is proposed that bounds the difference between the "true" value of the C-bound and its empirical estimate and that, unexpectedly, contains no KL-divergence.
Second Order PAC-Bayesian Bounds for the Weighted Majority Vote
We present a novel analysis of the expected risk of weighted majority vote in multiclass classification. The analysis takes correlation of predictions by ensemble members into account and provides a…
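The summary above is cut off; for orientation, second-order analyses of the weighted majority vote typically measure the correlation of voters' errors through the tandem loss, the ρ²-weighted probability that a pair of voters err on the same example, and bound the majority-vote risk by a constant multiple (commonly four) of it. The sketch below only shows how that quantity can be estimated empirically; identifying it with the referenced paper's exact bound is an assumption.

```python
# Sketch: empirical tandem loss of a weighted ensemble, i.e. the rho^2-weighted
# probability that a pair of voters err simultaneously.  The factor-4 relation
# printed below is the commonly cited second-order bound; treating it as the
# referenced paper's exact statement is an assumption made for illustration.
import numpy as np

def tandem_loss(votes, y, rho):
    """E_{(h,h')~rho^2} P(h and h' both wrong).  votes: (n_voters, n_examples) in {-1,+1}."""
    errors = (votes != y).astype(float)              # 1 where a voter errs
    pairwise = errors @ errors.T / votes.shape[1]    # joint error probability for all pairs
    return float(rho @ pairwise @ rho)

rng = np.random.default_rng(1)
y = rng.choice([-1, 1], size=500)
votes = np.where(rng.random((4, 500)) < 0.3, -y, y)  # 4 voters, ~30% error each
rho = np.ones(4) / 4                                  # uniform weights

t = tandem_loss(votes, y, rho)
print("tandem loss:", round(t, 3), " second-order risk bound (4x):", round(4 * t, 3))
```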
Risk bounds for the majority vote: from a PAC-Bayesian analysis to a learning algorithm
TLDR
An extensive analysis of the behavior of majority votes in binary classification is proposed and a risk bound for majority votes, called the C-bound, is introduced that takes into account the average quality of the voters and their average disagreement.
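For binary labels in {−1, +1} and margin M_ρ(x, y) = y · E_{h∼ρ}[h(x)], the C-bound is commonly stated as R(MV_ρ) ≤ 1 − (E[M_ρ])² / E[M_ρ²], valid when the first moment is positive: it decreases when the voters are individually good (large first moment) and when they disagree (small second moment). A small sketch of the empirical version follows; the toy data and uniform weights are illustrative assumptions, not taken from the referenced paper.

```python
# Sketch of the empirical C-bound for a binary weighted majority vote:
# with margin M(x, y) = y * sum_i rho_i h_i(x), the bound reads
#     R(MV) <= 1 - E[M]^2 / E[M^2]      (valid when E[M] > 0).
import numpy as np

def empirical_c_bound(votes, y, rho):
    """votes: (n_voters, n_examples) in {-1,+1}; y in {-1,+1}; rho: voter weights."""
    margins = y * (rho @ votes)                # M(x, y) for every example
    m1, m2 = margins.mean(), (margins ** 2).mean()
    if m1 <= 0:
        return 1.0                             # bound is vacuous otherwise
    return 1.0 - m1 ** 2 / m2

rng = np.random.default_rng(2)
y = rng.choice([-1, 1], size=1000)
votes = np.where(rng.random((7, 1000)) < 0.25, -y, y)   # 7 voters, ~25% error each
rho = np.ones(7) / 7

print("empirical C-bound:", round(empirical_c_bound(votes, y, rho), 3))
```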
On PAC-Bayesian bounds for random forests
TLDR
Various PAC-Bayesian approaches are discussed and evaluated to derive generalization bounds for random forests on various benchmark data sets, finding that bounds based on the analysis of Gibbs classifiers are typically superior and often reasonably tight.
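For context on the “Gibbs classifiers” mentioned above: the Gibbs risk is the ρ-weighted average of the individual voters' risks, and the classical first-order relation bounds the majority-vote risk by twice the Gibbs risk. The sketch below illustrates both quantities on toy data; it does not reproduce the referenced paper's random-forest experiments.

```python
# Sketch: empirical Gibbs risk (rho-weighted average of voter errors) and the
# classical first-order relation R(MV) <= 2 * R(Gibbs).  Illustration only.
import numpy as np

rng = np.random.default_rng(3)
y = rng.choice([-1, 1], size=800)
votes = np.where(rng.random((10, 800)) < 0.3, -y, y)     # 10 trees, ~30% error each
rho = np.ones(10) / 10                                    # uniform posterior over trees

gibbs_risk = float(rho @ (votes != y).mean(axis=1))       # E_{h~rho} R(h)
mv_risk = float(np.mean(np.sign(rho @ votes) != y))       # majority-vote risk (ties count as errors)

print(f"Gibbs risk: {gibbs_risk:.3f}  MV risk: {mv_risk:.3f}  2*Gibbs: {2*gibbs_risk:.3f}")
```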
PAC-Bayes Analysis Beyond the Usual Bounds
TLDR
A basic PAC-Bayes inequality for stochastic kernels is presented, from which one may derive extensions of various known PAC- Bayes bounds as well as novel bounds, and a simple bound for a loss function with unbounded range is presented.
A General Framework for the Derandomization of PAC-Bayesian Bounds
TLDR
Three new PAC-Bayesian generalization bounds are introduced whose originality is to be pointwise, meaning that they provide guarantees for a single hypothesis rather than the usual averaged analysis.
A finite sample analysis of the Naive Bayes classifier
TLDR
This work revisits the classical decision-theoretic problem of weighted expert voting and examines the consistency (both asymptotic and finitary) of the optimal Naive Bayes weighted majority and related rules.
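The “optimal Naive Bayes weighted majority” is the classical rule that, for independent experts with known competences p_i, weights each expert by the log-odds log(p_i / (1 − p_i)). The sketch below implements that rule on synthetic votes with competences assumed known; the finite-sample question studied in the referenced paper (estimating the p_i from data) is not reproduced.

```python
# Sketch of the classical weighted-majority rule for independent experts with
# known competences p_i (probability of voting correctly): optimal weights are
# the log-odds w_i = log(p_i / (1 - p_i)).  Competences are assumed known here.
import numpy as np

rng = np.random.default_rng(4)
p = np.array([0.9, 0.7, 0.6, 0.55, 0.52])         # expert competences (assumed)
w = np.log(p / (1 - p))                            # log-odds weights

y = rng.choice([-1, 1], size=5000)
votes = np.where(rng.random((len(p), 5000)) < p[:, None], y, -y)  # expert i correct w.p. p_i

weighted = np.sign(w @ votes)                      # log-odds weighted vote
unweighted = np.sign(votes.sum(axis=0))            # plain majority vote
print("weighted-vote error:  ", np.mean(weighted != y))
print("unweighted-vote error:", np.mean(unweighted != y))
```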
On the properties of variational approximations of Gibbs posteriors
TLDR
The main finding is that such a variational approximation of the Gibbs posterior has often the same rate of convergence as the original PAC-Bayesian procedure it approximates.
...