Corpus ID: 235765642

On Margins and Derandomisation in PAC-Bayes

@article{Biggs2021OnMA,
  title={On Margins and Derandomisation in PAC-Bayes},
  author={Felix Biggs and Benjamin Guedj},
  journal={ArXiv},
  year={2021},
  volume={abs/2107.03955}
}
We develop a framework for derandomising PAC-Bayesian generalisation bounds achieving a margin on training data, relating this process to the concentration-of-measure phenomenon. We apply these tools to linear prediction, single-hidden-layer neural networks with an unusual erf activation function, and deep ReLU networks, obtaining new bounds. The approach is also extended to the idea of “partial-derandomisation” where only some layers are derandomised and the others are stochastic. This allows…
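For background only (this is not the derandomised margin bound developed in the paper), the stochastic-classifier guarantees that such analyses typically start from have the PAC-Bayes-kl form, which is evaluated numerically by inverting the binary KL divergence. The following minimal Python/SciPy sketch is illustrative; the function names and the example numbers are hypothetical.

import numpy as np
from scipy.optimize import brentq

def binary_kl(q, p):
    # kl(q || p): KL divergence between Bernoulli(q) and Bernoulli(p).
    eps = 1e-12
    q = min(max(q, eps), 1 - eps)
    p = min(max(p, eps), 1 - eps)
    return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

def pac_bayes_kl_bound(emp_risk, kl_qp, m, delta):
    # One standard PAC-Bayes-kl bound (Langford-Seeger / Maurer form): with probability
    # at least 1 - delta over an i.i.d. sample of size m,
    #   kl(emp_risk || true_risk) <= (KL(Q||P) + log(2*sqrt(m)/delta)) / m.
    # The bound on the true risk of the Gibbs classifier is recovered by inverting
    # the binary KL in its second argument.
    rhs = (kl_qp + np.log(2.0 * np.sqrt(m) / delta)) / m
    upper = 1.0 - 1e-9
    if binary_kl(emp_risk, upper) <= rhs:
        return 1.0  # the bound is vacuous
    return brentq(lambda p: binary_kl(emp_risk, p) - rhs, emp_risk, upper)

# Hypothetical numbers: m = 10000 examples, empirical Gibbs risk 0.05, KL(Q||P) = 50, delta = 0.05.
print(pac_bayes_kl_bound(0.05, 50.0, 10_000, 0.05))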


References

Showing 1–10 of 32 references
A General Framework for the Derandomization of PAC-Bayesian Bounds
Three new PAC-Bayesian generalization bounds are introduced that have the originality of being pointwise, meaning that they provide guarantees for a single hypothesis instead of the usual averaged analysis.
Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
By optimizing the PAC-Bayes bound directly, the approach of Langford and Caruana (2001) is extended to obtain nonvacuous generalization bounds for deep stochastic neural network classifiers with millions of parameters trained on only tens of thousands of examples.
PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning
This monograph deals with adaptive supervised classification, using tools borrowed from statistical mechanics and information theory, stemming from the PAC-Bayesian approach pioneered by David McAllester.
Non-vacuous Generalization Bounds at the ImageNet Scale: a PAC-Bayesian Compression Approach
This paper provides the first non-vacuous generalization guarantees for realistic architectures applied to the ImageNet classification problem and establishes an absolute limit on expected compressibility as a function of expected generalization error.
Entropy-SGD optimizes the prior of a PAC-Bayes bound: Data-dependent PAC-Bayes priors via differential privacy
We show that Entropy-SGD (Chaudhari et al., 2017), when viewed as a learning algorithm, optimizes a PAC-Bayes bound on the risk of a Gibbs (posterior) classifier, i.e., a randomized classifier…
Generalization bounds for deep learning
Desiderata for techniques that predict generalization errors for deep learning models in supervised learning are introduced, and a marginal-likelihood PAC-Bayesian bound is derived that fulfills desiderata 1–3 and 5.
Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks
This work develops an end-to-end framework to train binary activated deep neural networks, and provides nonvacuous PAC-Bayesian generalization bounds for such networks.
Rademacher and Gaussian Complexities: Risk Bounds and Structural Results
This work investigates the use of certain data-dependent estimates of the complexity of a function class, called Rademacher and Gaussian complexities, and proves general risk bounds in terms of these complexities in a decision-theoretic setting.
Weight Uncertainty in Neural Network
This work introduces a new, efficient, principled and backpropagation-compatible algorithm for learning a probability distribution on the weights of a neural network, called Bayes by Backprop, and shows how the learnt uncertainty in the weights can be used to improve generalisation in non-linear regression problems.
Simplified PAC-Bayesian Margin Bounds
This paper provides an explicit solution to the variational problem implicit in the Langford-Shawe-Taylor bound and shows that the resulting PAC-Bayesian margin bounds are significantly tighter.
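For context, the linear-classifier margin bounds referenced above take, schematically (constants and logarithmic factors suppressed, so this display is indicative rather than a precise statement of any of the cited results), the form

\[
\Pr_{(x,y)\sim D}\big[\,y\,\langle w, x\rangle \le 0\,\big]
\;\le\;
\frac{1}{m}\sum_{i=1}^{m}\mathbf{1}\big[\,y_i\,\langle w, x_i\rangle \le \gamma\,\big]
\;+\;
\tilde{O}\!\left(\sqrt{\frac{\|w\|^{2}/\gamma^{2} + \ln(1/\delta)}{m}}\right),
\]

holding with probability at least \(1-\delta\) over an i.i.d. sample of size \(m\) with inputs normalised so that \(\|x\|\le 1\): the population zero-one error is controlled by the empirical error at margin \(\gamma\) plus a complexity term driven by \(\|w\|^{2}/\gamma^{2}\).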