Improved PAC-Bayesian Bounds for Linear Regression

@article{Shalaeva2020ImprovedPB,
  title={Improved PAC-Bayesian Bounds for Linear Regression},
  author={Vera Shalaeva and Alireza Fakhrizadeh Esfahani and Pascal Germain and Mih{\'a}ly Petreczky},
  journal={ArXiv},
  year={2020},
  volume={abs/1912.03036}
}
In this paper, we improve the PAC-Bayesian error bound for linear regression derived in Germain et al. (2016). The improvements are two-fold. First, the proposed error bound is tighter, and converges to the generalization loss with a well-chosen temperature parameter. Second, the error bound also holds for training data that are not independently sampled. In particular, the error bound applies to certain time series generated by well-known classes of dynamical models, such as ARX models. 
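For context, the kind of bound being improved can be sketched as follows. This is a background sketch of the generic PAC-Bayesian bound in the style of Alquier et al. (2016), which Germain et al. (2016) instantiate for linear regression; the symbols π, ρ, λ, δ, and Ψ_π are introduced here only for illustration, and the paper's improved bound is not reproduced. For a prior π over predictors, a temperature parameter λ > 0, and δ ∈ (0, 1], with probability at least 1 − δ over an i.i.d. sample S of size n, every posterior ρ satisfies

\[
\mathbb{E}_{\theta \sim \rho}\big[L(\theta)\big]
\;\le\;
\mathbb{E}_{\theta \sim \rho}\big[\hat{L}_S(\theta)\big]
+ \frac{1}{\lambda}\left[ \mathrm{KL}(\rho \,\|\, \pi) + \ln\tfrac{1}{\delta} + \Psi_{\pi}(\lambda, n) \right],
\qquad
\Psi_{\pi}(\lambda, n) = \ln \mathbb{E}_{\theta \sim \pi}\, \mathbb{E}_{S}\, e^{\lambda \left( L(\theta) - \hat{L}_S(\theta) \right)},
\]

where L(θ) is the generalization loss and \hat{L}_S(θ) the empirical loss on S. The abstract's temperature parameter plays the role of λ above; the paper's contributions are a tighter version of such a bound that converges to the generalization loss for a well-chosen λ, and an extension in which the i.i.d. assumption on S is relaxed to cover, e.g., ARX-generated time series.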
PAC-Bayes Unleashed: Generalisation Bounds with Unbounded Losses
TLDR
A novel PAC-Bayesian generalisation bound for unbounded loss functions is derived and instantiated on a linear regression problem, and to make the theory usable by the largest audience possible, it includes discussions on actual computation, practicality and limitations of the authors' assumptions.
PAC-Bayes Analysis Beyond the Usual Bounds
TLDR
A basic PAC-Bayes inequality for stochastic kernels is presented, from which one may derive extensions of various known PAC-Bayes bounds as well as novel bounds, along with a simple bound for a loss function with unbounded range.
PAC-Bayesian theory for stochastic LTI systems
TLDR
A PAC-Bayesian error bound for autonomous stochastic LTI state-space models is derived and it is shown that these bounds will allow deriving similar error bounds for more general dynamical systems, including recurrent neural networks.
Novel Change of Measure Inequalities with Applications to PAC-Bayesian Bounds and Monte Carlo Estimation
TLDR
Several applications are presented, including PAC-Bayesian bounds for various classes of losses, non-asymptotic intervals for Monte Carlo estimates, and a generalized version of the Hammersley-Chapman-Robbins inequality.
Learning under Model Misspecification: Applications to Variational and Ensemble methods
TLDR
This work presents a novel analysis of the generalization performance of Bayesian model averaging under model misspecification and i.i.d. data using a new family of second-order PAC-Bayes bounds, and derives a new family of Bayes-like algorithms, which can be implemented as variational and ensemble methods.

References

SHOWING 1-10 OF 32 REFERENCES
Tighter PAC-Bayes Bounds
TLDR
A PAC-Bayes bound to measure the performance of Support Vector Machine (SVM) classifiers is proposed, based on learning a prior over the distribution of classifiers with a part of the training samples, resulting in an enhancement of the predictive capabilities of the PAC-Bayes bound.
The Safe Bayesian - Learning the Learning Rate via the Mixability Gap
TLDR
This work presents a modification of Bayesian inference which continues to achieve good rates with wrong models, and adapts the Bayesian learning rate to the data, picking the rate minimizing the cumulative loss of sequential prediction by posterior randomization.
Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity
TLDR
Sharp PAC-Bayesian risk bounds are obtained for aggregates defined via exponential weights, under general assumptions on the distribution of errors and on the functions to aggregate, which are applied to derive sparsity oracle inequalities.
Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
TLDR
By optimizing the PAC-Bayes bound directly, the approach of Langford and Caruana (2001) is extended to obtain nonvacuous generalization bounds for deep stochastic neural network classifiers with millions of parameters trained on only tens of thousands of examples.
Some PAC-Bayesian Theorems
TLDR
The PAC-Bayesian theorems given here apply to an arbitrary prior measure on an arbitrary concept space and provide an alternative to the use of VC dimension in proving PAC bounds for parameterized concepts.
Excess Risk Bounds for the Bayes Risk using Variational Inference in Latent Gaussian Models
TLDR
Previous results for variational algorithms are strengthened by showing they are competitive with any point-estimate predictor, and bounds are provided on the risk of the Bayesian predictor, not just the risk of the Gibbs predictor, for the same approximate posterior.
PAC-Bayesian Theory Meets Bayesian Inference
TLDR
For the negative log-likelihood loss function, it is shown that the minimization of PAC-Bayesian generalization risk bounds maximizes the Bayesian marginal likelihood.
Theory and Algorithms for Forecasting Time Series
TLDR
The authors' learning guarantees are expressed in terms of a data-dependent measure of sequential complexity and a discrepancy measure that can be estimated from data under some mild assumptions.
PAC-BAYESIAN SUPERVISED CLASSIFICATION: The Thermodynamics of Statistical Learning
TLDR
An alternative selection scheme based on relative bounds between estimators is described and studied, and a two-step localization technique that can handle the selection of a parametric model from a family of models is presented.
Information-theoretic upper and lower bounds for statistical estimation (Tong Zhang, IEEE Transactions on Information Theory, 2006)
TLDR
This paper establishes upper and lower bounds for some statistical estimation problems through concise information-theoretic arguments based on a simple yet general inequality, which naturally leads to a general randomized estimation method, for which performance upper bounds can be obtained.
...