• Corpus ID: 209405177

Learning from i.i.d. data under model miss-specification

  • Andrés R. Masegosa
  • Published 18 December 2019
  • Computer Science
  • ArXiv
This paper introduces a new approach to learning from i.i.d. data under model miss-specification. This approach casts the problem of learning as minimizing the expected code-length of a Bayesian mixture code. To solve this problem, we build on PAC-Bayes bounds, information theory and a new family of second-order Jensen bounds. The key insight of this paper is that the use of the standard (first-order) Jensen bounds in learning is suboptimal when our model class is miss-specified (i.e. it does… 
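As a rough illustration of the gap the abstract refers to, the sketch below compares the code-length of a Bayesian mixture, -log E_ρ[p(x|θ)], with its standard first-order Jensen upper bound, E_ρ[-log p(x|θ)]. The toy data, the two-point model set, the weights, and all function names are assumptions made for illustration; none of them come from the paper, and the paper's second-order bounds are not implemented here.

```python
import math

def gaussian_pdf(x, mean, std=1.0):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_code_length(xs, means, weights):
    """Average of -log E_rho[p(x | theta)] over the sample xs (in nats)."""
    total = 0.0
    for x in xs:
        mix = sum(w * gaussian_pdf(x, m) for m, w in zip(means, weights))
        total += -math.log(mix)
    return total / len(xs)

def jensen_first_order_bound(xs, means, weights):
    """Average of E_rho[-log p(x | theta)] over xs: the first-order Jensen bound."""
    total = 0.0
    for x in xs:
        total += sum(-w * math.log(gaussian_pdf(x, m))
                     for m, w in zip(means, weights))
    return total / len(xs)

# Hypothetical bimodal sample: no single unit-variance Gaussian fits it,
# so the model class {N(theta, 1)} is miss-specified for this data.
xs = [-2.1, -1.9, -2.0, 2.0, 1.8, 2.2]
means = [-2.0, 2.0]     # the two candidate values of theta
spread = [0.5, 0.5]     # rho spreads its mass over both models
point = [1.0, 0.0]      # rho is a point mass on the first model

gap_spread = (jensen_first_order_bound(xs, means, spread)
              - mixture_code_length(xs, means, spread))
gap_point = (jensen_first_order_bound(xs, means, point)
             - mixture_code_length(xs, means, point))

print(f"Jensen gap, spread rho: {gap_spread:.3f} nats")
print(f"Jensen gap, point-mass rho: {gap_point:.3f} nats")
```

In this toy setting the gap is strictly positive when ρ spreads its mass over both models and vanishes when ρ is a point mass, so minimizing the first-order bound favors a concentrated ρ even though the mixture yields a shorter code.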

Distilling Ensembles Improves Uncertainty Estimates

This work obtains negative theoretical results on the possibility of approximating deep ensemble weights by batch ensemble weights, and so turns to distillation.



Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It

We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression

PAC-Bayesian Theory Meets Bayesian Inference

For the negative log-likelihood loss function, it is shown that the minimization of PAC-Bayesian generalization risk bounds maximizes the Bayesian marginal likelihood.

Fast-rate PAC-Bayes Generalization Bounds via Shifted Rademacher Processes

A new framework is established for deriving fast-rate PAC-Bayes bounds in terms of the "flatness" of the empirical risk surface on which the posterior concentrates and yields new insights on PAC-Bayesian theory.

Variational Inference: A Review for Statisticians

Variational inference (VI), a method from machine learning that approximates probability densities through optimization, is reviewed and a variant that uses stochastic optimization to scale up to massive data is derived.

On PAC-Bayesian bounds for random forests

Various PAC-Bayesian approaches are discussed and evaluated to derive generalization bounds for random forests on various benchmark data sets, finding that bounds based on the analysis of Gibbs classifiers are typically superior and often reasonably tight.

PAC-BAYESIAN SUPERVISED CLASSIFICATION: The Thermodynamics of Statistical Learning

An alternative selection scheme based on relative bounds between estimators is described and studied, and a two-step localization technique that can handle the selection of a parametric model from a family of such models is presented.

Monte Carlo Gradient Estimation in Machine Learning

A broad and accessible survey of the methods for Monte Carlo gradient estimation in machine learning and across the statistical sciences, covering three strategies (the pathwise, score function, and measure-valued gradient estimators) along with their historical development, derivation, and underlying assumptions.
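To make two of the three strategies concrete, here is a minimal sketch that estimates the gradient of E[x²] with respect to μ for x ~ N(μ, σ²), whose exact value is 2μ. The test function, the sample size, the seed, and every name in the code are assumptions chosen for illustration and are not taken from the survey; the measure-valued estimator is omitted.

```python
import random

def f(x):
    """Hypothetical cost function whose expectation is differentiated."""
    return x * x

def df(x):
    """Derivative of f, needed only by the pathwise estimator."""
    return 2.0 * x

def gradient_terms(mu, sigma=1.0, n=100_000, seed=0):
    """Per-sample terms of two estimators of d/dmu E[f(x)], x ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    score_terms, path_terms = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        x = mu + sigma * eps  # reparameterized sample
        # Score function: f(x) * d/dmu log N(x; mu, sigma^2)
        score_terms.append(f(x) * (x - mu) / sigma ** 2)
        # Pathwise: f'(x) * dx/dmu, with dx/dmu = 1
        path_terms.append(df(x))
    return score_terms, path_terms

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

mu = 1.5
score_terms, path_terms = gradient_terms(mu)
score_est, path_est = mean(score_terms), mean(path_terms)
score_var, path_var = variance(score_terms), variance(path_terms)

print(f"exact gradient:       {2.0 * mu:.3f}")
print(f"score-function est.:  {score_est:.3f} (per-sample variance {score_var:.2f})")
print(f"pathwise est.:        {path_est:.3f} (per-sample variance {path_var:.2f})")
```

Both estimators are unbiased for this problem, but the pathwise terms have much lower variance here. The score-function estimator needs only evaluations of f, while the pathwise estimator requires f to be differentiable.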

PAC-Bayesian model averaging

The method constructs an optimized weighted mixture of concepts analogous to a Bayesian posterior distribution. The main result is stated for bounded loss; a preliminary analysis for unbounded loss is also given.

Advances in Variational Inference

An overview of recent trends in variational inference is given and a summary of promising future research directions is provided.

Machine learning - a probabilistic perspective

  • K. Murphy
  • Computer Science
    Adaptive computation and machine learning series
  • 2012
This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.