General Bayesian loss function selection and the use of improper models

  • Authors: Jack Jewson and David Rossell
  • Published: 2 June 2021
  • Journal: Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Statisticians often face the choice between using probability models or a paradigm defined by minimising a loss function. Both approaches are useful and, if the loss can be re-cast into a proper probability model, there are many tools to decide which model or loss is more appropriate for the observed data, in the sense of explaining the data’s nature. However, when the loss leads to an improper model, there are no principled ways to guide this choice. We address this task by combining the Hyvärinen score…
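For reference, the Hyvärinen score mentioned in the abstract evaluates a possibly unnormalised density p through derivatives of its logarithm only, so any normalising constant cancels. In standard notation (a sketch, with Δ the Laplacian and ∇ the gradient in the observation y):

```latex
H(y; p) \;=\; 2\,\Delta_y \log p(y) \;+\; \big\lVert \nabla_y \log p(y) \big\rVert_2^2
```

For a scalar observation this reduces to 2 (log p)''(y) + ((log p)'(y))^2, which is why it applies to improper models whose normalising constant is unavailable.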


On Selection Criteria for the Tuning Parameter in Robust Divergence

A selection criterion is proposed based on an asymptotic approximation of the Hyvärinen score applied to an unnormalized model defined by a robust divergence; numerical studies using normal distributions and regularized linear regression demonstrate the usefulness of the proposed method.

Adaptation of the Tuning Parameter in General Bayesian Inference with Robust Divergence

A novel methodology is introduced for robust Bayesian estimation with a robust divergence, treating the exponential of the robust divergence as an unnormalisable statistical model and estimating the tuning parameter by minimising the Hyvärinen score.

Generalised Bayesian Inference for Discrete Intractable Likelihood

The main idea is to update beliefs about model parameters using a discrete Fisher divergence, in lieu of the problematic intractable likelihood, to create a generalised posterior that can be sampled using standard computational tools, circumventing the intractable normalising constant.

Approximate Gibbs sampler for Bayesian Huberized lasso

A new posterior computation algorithm for Bayesian Huberized lasso regression is proposed, based on an approximation of the full conditional distribution, and it makes it possible to estimate a tuning parameter controlling the robustness of the pseudo-Huber loss function.

General Bayesian L2 calibration of mathematical models

Methodology is proposed for the general Bayesian calibration of mathematical models, where the resulting posterior distributions estimate the parameter values that minimize the L2 norm of the difference between the mathematical model and the true physical system.

Exact Mean Integrated Squared Error

An exact and easily computable expression for the mean integrated squared error (MISE) of the kernel estimator of a general normal mixture density is given for Gaussian kernels of arbitrary order.

Robust and efficient estimation by minimising a density power divergence

A minimum divergence estimation method is developed for robust parameter estimation. The proposed approach uses new density-based divergences which, unlike existing methods of this type, avoid the use of nonparametric density estimation and its associated complications.
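The density power divergence underlying this approach can be sketched as follows, for a data density g, a model density f, and a tuning constant α > 0 (z ranges over the sample space):

```latex
d_\alpha(g, f) \;=\; \int \Big\{ f(z)^{1+\alpha} \;-\; \Big(1 + \tfrac{1}{\alpha}\Big)\, g(z)\, f(z)^{\alpha} \;+\; \tfrac{1}{\alpha}\, g(z)^{1+\alpha} \Big\}\, dz
```

As α → 0 this recovers the Kullback–Leibler divergence, while larger α trades statistical efficiency for robustness to outliers, which is why α is the tuning parameter targeted by the selection methods above.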

On choosing mixture components via non‐local priors

The NLP‐induced sparsity is characterized theoretically, tractable expressions and algorithms are derived, and an estimator for posterior model probabilities under local priors and NLPs is proposed, showing that Bayes factors are ratios of posterior‐to‐prior empty‐cluster probabilities.

Tractable Bayesian Variable Selection: Beyond Normality

This work focuses on nonlocal priors to induce extra sparsity and ameliorate finite-sample effects caused by misspecification, and shows the importance, for Bayesian variable selection, of considering the likelihood rather than solely the prior.

A Tuning-free Robust and Efficient Approach to High-dimensional Regression

A novel approach to high-dimensional regression with theoretical guarantees is proposed that overcomes the challenge of tuning-parameter selection in the Lasso; it is robust, with substantial efficiency gains for heavy-tailed random errors, while maintaining high efficiency for normal random errors.

Information criteria for non-normalized models

Simulation results and applications to natural image and RNA-seq data demonstrate that the proposed criteria enable selection of the appropriate non-normalized model in a data-driven manner.

Objective Bayesian inference with proper scoring rules

This paper discusses the use of scoring rules in the Bayes formula to compute a posterior distribution, named the SR-posterior distribution, establishes its asymptotic normality, and proposes a procedure for building default priors for the unknown parameter of interest that can be used to update the information provided by the scoring rule in the SR-posterior distribution.

A general framework for updating belief distributions

It is argued that a valid update of a prior belief distribution to a posterior can be made for parameters which are connected to observations through a loss function rather than the traditional likelihood function, which is recovered as a special case.
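A minimal numerical sketch of such a loss-based ("general Bayesian") update, assuming a flat prior on a parameter grid, an absolute-error loss, and a hypothetical loss weight w = 1; none of these specific choices are prescribed by the paper:

```python
import numpy as np

# General Bayesian update: pi(theta | x) ∝ exp{-w * loss(theta, x)} * pi(theta).
# Here the loss is sum_i |x_i - theta|, so the posterior mode is the sample median.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=200)   # observed data

theta = np.linspace(-5.0, 5.0, 1001)           # parameter grid (spacing 0.01)
w = 1.0                                        # hypothetical loss weight

# Total absolute-error loss at each grid point, via broadcasting.
loss = np.abs(x[:, None] - theta[None, :]).sum(axis=0)

log_post = -w * loss                           # flat prior: log pi(theta) = const
log_post -= log_post.max()                     # stabilise before exponentiating
post = np.exp(log_post)
post /= post.sum() * (theta[1] - theta[0])     # normalise on the grid

theta_hat = theta[int(np.argmax(post))]        # posterior mode ≈ sample median
print(theta_hat)
```

Replacing the absolute-error loss with the negative log-likelihood and w = 1 recovers the standard Bayesian posterior, which is the special case mentioned above.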

Estimation of Non-Normalized Statistical Models by Score Matching

While estimating the gradient of a log-density function is, in principle, a very difficult non-parametric problem, a surprising result is proved: the estimation objective admits a simple formula, a sample average of a sum of certain derivatives of the log-density given by the model.
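A minimal sketch of score matching for a zero-mean Gaussian with unknown variance s, assuming the standard empirical objective mean[ψ'(x) + ψ(x)²/2] with model score ψ(x) = −x/s; note that no normalising constant is ever computed:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=2.0, size=100_000)  # true variance is 4.0

def sm_objective(s, x):
    """Empirical score-matching objective for N(0, s)."""
    psi = -x / s           # model score: d/dx log p(x; s)
    dpsi = -1.0 / s        # its derivative in x
    return np.mean(dpsi + 0.5 * psi**2)

# Minimise over a grid of candidate variances.
grid = np.linspace(0.5, 10.0, 2000)
vals = [sm_objective(s, x) for s in grid]
s_hat = grid[int(np.argmin(vals))]

# For this model the minimiser has a closed form: the empirical second moment.
print(s_hat, np.mean(x**2))
```

The objective uses only derivatives of the log-density, so the same code would run unchanged if p were known only up to an intractable normalising constant, which is the point of the method.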

Interpretation and Generalization of Score Matching

This paper provides a formal link between maximum likelihood and score matching and develops a generalization of score matching, showing that score matching finds model parameters that are more robust to noise in the training data.