# General Bayesian loss function selection and the use of improper models

@article{Jewson2022GeneralBL, title={General Bayesian loss function selection and the use of improper models}, author={Jack Jewson and David Rossell}, journal={Journal of the Royal Statistical Society: Series B (Statistical Methodology)}, year={2022} }

Statisticians often face the choice between using probability models or a paradigm defined by minimising a loss function. Both approaches are useful and, if the loss can be recast into a proper probability model, there are many tools to decide which model or loss is more appropriate for the observed data, in the sense of explaining the data's nature. However, when the loss leads to an improper model, there are no principled ways to guide this choice. We address this task by combining the Hyvärinen…
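The abstract truncates just as its key ingredient is introduced. For orientation, the Hyvärinen score of a (possibly unnormalised) model density $p$ at an observation $y \in \mathbb{R}^d$ has the standard form (a textbook statement of the score, not a quotation from the paper):

```latex
H(y, p) \;=\; 2\,\Delta_y \log p(y) \;+\; \big\lVert \nabla_y \log p(y) \big\rVert_2^2
```

Because the score depends on $p$ only through derivatives of $\log p$, any normalising constant cancels, which is what makes it usable for the improper models the paper studies.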

## 5 Citations

### On Selection Criteria for the Tuning Parameter in Robust Divergence

- Computer Science
- Entropy
- 2021

A selection criterion based on an asymptotic approximation of the Hyvärinen score, applied to an unnormalized model defined by robust divergence, is proposed; the usefulness of the proposed method is demonstrated via numerical studies using normal distributions and regularized linear regression.

### Adaptation of the Tuning Parameter in General Bayesian Inference with Robust Divergence

- Computer Science
- 2021

A novel methodology for robust Bayesian estimation with robust divergence is introduced, treating the exponential of robust divergence as an unnormalisable statistical model and estimating the tuning parameter by minimising the Hyvärinen score.

### Generalised Bayesian Inference for Discrete Intractable Likelihood

- Computer Science, Mathematics
- 2022

The main idea is to update beliefs about model parameters using a discrete Fisher divergence, in lieu of the problematic intractable likelihood, to create a generalised posterior that can be sampled using standard computational tools, circumventing the intractable normalising constant.

### Approximate Gibbs sampler for Bayesian Huberized lasso

- Computer Science
- Journal of Statistical Computation and Simulation
- 2022

A new posterior computation algorithm for the Bayesian Huberized lasso regression is proposed based on the approximation of full conditional distribution and it is possible to estimate a tuning parameter for robustness of the pseudo-Huber loss function.

### General Bayesian L2 calibration of mathematical models

- Computer Science
- 2021

Methodology is proposed for the general Bayesian calibration of mathematical models where the resulting posterior distributions estimate the values of the parameters that minimize the L2 norm of the difference between the mathematical model and true physical system.

## References

Showing 1–10 of 81 references.

### EXACT MEAN INTEGRATED SQUARED ERROR

- Mathematics
- 1992

An exact and easily computable expression for the mean integrated squared error (MISE) of the kernel estimator of a general normal mixture density is given for Gaussian kernels of arbitrary order.…

### Robust and efficient estimation by minimising a density power divergence

- Mathematics
- 1998

A minimum divergence estimation method is developed for robust parameter estimation. The proposed approach uses new density-based divergences which, unlike existing methods of this type such as…

### On choosing mixture components via non‐local priors

- Computer Science
- Journal of the Royal Statistical Society: Series B (Statistical Methodology)
- 2019

The NLP‐induced sparsity is characterized theoretically, tractable expressions and algorithms are derived, and an estimator for posterior model probabilities under local priors and NLPs is proposed, showing that Bayes factors are ratios of posterior‐to‐prior empty cluster probabilities.

### Tractable Bayesian Variable Selection: Beyond Normality

- Economics
- Journal of the American Statistical Association
- 2018

This work focuses on nonlocal priors to induce extra sparsity and ameliorate finite-sample effects caused by misspecification, and shows the importance of considering the likelihood rather than solely the prior, for Bayesian variable selection.

### A Tuning-free Robust and Efficient Approach to High-dimensional Regression

- Computer Science
- 2020

A novel approach for high-dimensional regression with theoretical guarantees that overcomes the challenge of tuning parameter selection of Lasso and possesses several appealing properties, and is robust with substantial efficiency gain for heavy-tailed random errors while maintaining high efficiency for normal random errors.

### Information criteria for non-normalized models

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2021

Simulation results and application to natural image and RNAseq data demonstrate that the proposed criteria enable selection of the appropriate non-normalized model in a data-driven manner.

### Objective Bayesian inference with proper scoring rules

- Computer Science
- TEST
- 2018

This paper discusses the use of scoring rules in the Bayes formula to compute a posterior distribution, named the SR-posterior distribution, establishes its asymptotic normality, and proposes a procedure for building default priors for the unknown parameter of interest that can be used to update the information provided by the scoring rule in the SR-posterior distribution.

### A general framework for updating belief distributions

- Computer Science
- Journal of the Royal Statistical Society: Series B (Statistical Methodology)
- 2016

It is argued that a valid update of a prior belief distribution to a posterior can be made for parameters which are connected to observations through a loss function rather than the traditional likelihood function, which is recovered as a special case.
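The update described above — a posterior built from a loss function rather than a likelihood — can be sketched numerically. The following is a minimal illustration on a parameter grid, with an absolute-error loss standing in for the negative log-likelihood; the data, prior, loss, and learning rate `w` are all illustrative choices, not the paper's:

```python
import numpy as np

# General Bayesian (Gibbs) update on a grid:
#   posterior(theta) ∝ prior(theta) * exp(-w * sum_i loss(theta, y_i))

rng = np.random.default_rng(0)
y = rng.standard_t(df=2, size=50) + 3.0    # heavy-tailed data centred near 3

theta = np.linspace(-5, 10, 1001)          # parameter grid
log_prior = -0.5 * (theta / 10.0) ** 2     # N(0, 10^2) prior, unnormalised

w = 1.0                                    # learning rate (fixed here)
# Absolute-error loss replaces the negative log-likelihood.
cum_loss = np.abs(y[:, None] - theta[None, :]).sum(axis=0)

log_post = log_prior - w * cum_loss
log_post -= log_post.max()                 # stabilise before exponentiating
post = np.exp(log_post)
post /= post.sum()                         # normalise over the grid

post_mean = (theta * post).sum()           # concentrates near the data median
```

Because the absolute-error loss is minimised at the median, this generalised posterior concentrates around the sample median rather than the mean, which is what makes such updates attractive under heavy-tailed data.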

### Estimation of Non-Normalized Statistical Models by Score Matching

- Computer Science
- J. Mach. Learn. Res.
- 2005

While estimating the gradient of the log-density function is, in principle, a very difficult non-parametric problem, a surprising result is proved: the objective simplifies to a sample average of a sum of derivatives of the log-density given by the model.
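The "sample average of derivatives of the log-density" can be made concrete for a Gaussian model, where the score-matching objective has a closed form. This sketch (my own illustration, not code from the paper) fits a Gaussian by grid search over the empirical objective; for this model the minimiser coincides with the sample mean and variance:

```python
import numpy as np

# Score-matching objective for a Gaussian model N(mu, v).
# With log p(x) = -(x - mu)^2 / (2 v) + const:
#   psi(x)  = d/dx   log p(x) = -(x - mu) / v
#   psi'(x) = d2/dx2 log p(x) = -1 / v
# The objective is J(mu, v) = mean_i [ psi'(x_i) + 0.5 * psi(x_i)^2 ],
# which never touches the normalising constant.

def score_matching_objective(x, mu, v):
    psi = -(x - mu) / v
    dpsi = -1.0 / v
    return np.mean(dpsi + 0.5 * psi**2)

rng = np.random.default_rng(1)
x = rng.normal(2.0, 1.5, size=2000)

# Coarse grid search for the minimiser (illustrative, not efficient).
mus = np.linspace(0, 4, 201)
vs = np.linspace(0.5, 5, 181)
J = np.array([[score_matching_objective(x, m, v) for v in vs] for m in mus])
i, j = np.unravel_index(J.argmin(), J.shape)
mu_hat, v_hat = mus[i], vs[j]
```

Analytically, J(mu, v) = -1/v + mean((x - mu)^2) / (2 v^2), so the minimiser is the sample mean and the sample second central moment — the grid search above recovers both up to grid resolution.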

### Interpretation and Generalization of Score Matching

- Computer Science
- UAI
- 2009

This paper provides a formal link between maximum likelihood and score matching, and develops a generalization of score matching which finds model parameters that are more robust to noisy training data.