# Optimal Thinning of MCMC Output

```bibtex
@article{Riabiz2020OptimalTO,
  title   = {Optimal Thinning of MCMC Output},
  author  = {Marina Riabiz and Wilson Ye Chen and Jon Cockayne and Pawel Swietach and Steven A. Niederer and Lester W. Mackey and Chris J. Oates},
  journal = {arXiv: Methodology},
  year    = {2020}
}
```

The use of heuristics to assess convergence and to compress the output of Markov chain Monte Carlo can be sub-optimal in terms of the empirical approximations that are produced. Typically a number of the initial states are attributed to "burn-in" and removed, while the remainder of the chain is "thinned" if compression is also required. In this paper we consider the problem of retrospectively selecting a subset of states, of fixed cardinality, from the sample path such that the approximation…
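The selection procedure the paper develops, Stein thinning, greedily picks states that minimise a kernel Stein discrepancy. A minimal sketch of that idea, assuming an IMQ base kernel k(x, y) = (1 + ||x − y||²)^(−1/2), a standard-normal toy target with score s(x) = −x, and illustrative helper names (`imq_stein_kernel_matrix`, `stein_thin` are not from the paper's code):

```python
import numpy as np

def imq_stein_kernel_matrix(X, S):
    """Langevin Stein kernel matrix for the IMQ base kernel
    k(x, y) = (1 + ||x - y||^2)^(-1/2), given samples X and scores S."""
    n, d = X.shape
    diff = X[:, None, :] - X[None, :, :]          # pairwise differences
    r2 = (diff ** 2).sum(-1)
    u = 1.0 + r2
    div = -3.0 * r2 * u ** -2.5 + d * u ** -1.5   # trace(grad_x grad_y k)
    lin = u ** -1.5 * ((S[:, None, :] - S[None, :, :]) * diff).sum(-1)
    quad = u ** -0.5 * (S @ S.T)                  # k(x, y) * s(x).s(y)
    return div + lin + quad

def stein_thin(X, score, m):
    """Greedily select m indices (repeats allowed) whose empirical
    distribution minimises the kernel Stein discrepancy."""
    K = imq_stein_kernel_matrix(X, score(X))
    obj = np.diag(K) / 2.0                        # k_P(x, x) / 2 for each candidate
    chosen = []
    for _ in range(m):
        i = int(np.argmin(obj))
        chosen.append(i)
        obj = obj + K[i]                          # add interaction with the new point
    return np.array(chosen)

# Toy usage: compress 200 draws targeting N(0, I_2) down to 20 states.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
idx = stein_thin(X, lambda x: -x, 20)
```

For this target the first selected state is the candidate minimising k_P(x, x), i.e. the sample closest to the origin; later selections trade proximity to the mode against spread.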

#### 15 Citations

Distribution Compression in Near-linear Time

- Computer Science, Mathematics
- ArXiv
- 2021

Compress++ is introduced: a simple meta-procedure for speeding up any thinning algorithm while suffering at most a factor-of-4 increase in error; it reduces the runtime of super-quadratic algorithms by a square-root factor.

Generalized Kernel Thinning

- Mathematics, Computer Science
- ArXiv
- 2021

The kernel thinning (KT) algorithm of Dwivedi and Mackey (2021) compresses an n point distributional summary into a √n point summary with better-than-Monte-Carlo maximum mean discrepancy for a…

Kernel Thinning

- Mathematics, Computer Science
- COLT
- 2021

Kernel thinning is introduced, a new procedure for compressing a distribution P more effectively than i.i.d. sampling or standard thinning, and explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Matérn, and B-spline kernels are derived.
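Maximum mean discrepancy (MMD) is the quality measure these compression guarantees are stated in. A short sketch, assuming a Gaussian base kernel and illustrative names, of how one might check a thinned subset against the full chain with the standard V-statistic estimator:

```python
import numpy as np

def mmd2(X, Y, bandwidth=1.0):
    """Squared empirical MMD between sample sets X and Y (Gaussian kernel).
    This is the biased V-statistic estimator, which is always >= 0."""
    def gram(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * bandwidth ** 2))
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()

# Compare standard thinning (keep every 10th state) against the full chain.
rng = np.random.default_rng(1)
chain = rng.normal(size=(1000, 2))
thinned = chain[::10]
gap = mmd2(chain, thinned)
```

A smaller `gap` indicates the compressed summary better preserves expectations of smooth test functions; kernel thinning targets a better-than-√n decay of this quantity.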

Optimal Quantisation of Probability Measures Using Maximum Mean Discrepancy

- Computer Science, Mathematics
- AISTATS
- 2021

A novel non-myopic algorithm is proposed, and a variant that applies this technique to a mini-batch of the candidate set at each iteration is investigated, in order both to improve statistical efficiency and to reduce computational cost.

Fast Compression of MCMC Output

- Computer Science, Mathematics
- Entropy
- 2021

A novel method is presented for compressing the output of an MCMC (Markov chain Monte Carlo) algorithm when control variates are available, based on the cube method; it compares favourably to previous methods such as Stein thinning.

Kernel Stein Discrepancy Descent

- Mathematics, Computer Science
- ICML
- 2021

The convergence properties of KSD Descent are studied and its practical relevance is demonstrated, but failure cases are highlighted by showing that the algorithm can get stuck in spurious local minima.

Minimum Discrepancy Methods in Uncertainty Quantification

- Mathematics
- 2021

Post-Processing of MCMC

- Computer Science
- 2021

State-of-the-art techniques for post-processing Markov chain output are reviewed, including methods based on discrepancy minimisation, which directly address the bias-variance trade-off, as well as general-purpose control variate methods for approximating expected quantities of interest.

Postprocessing of MCMC

- Mathematics
- Annual Review of Statistics and Its Application
- 2021

Markov chain Monte Carlo is the engine of modern Bayesian statistics, being used to approximate the posterior and derived quantities of interest. Despite this, the issue of how the output from a…

Robust Generalised Bayesian Inference for Intractable Likelihoods

- Mathematics
- 2021

Generalised Bayesian inference updates prior beliefs using a loss function, rather than a likelihood, and can therefore be used to confer robustness against possible misspecification of the…

#### References

Showing 1–10 of 101 references

Revisiting the Gelman–Rubin Diagnostic

- Mathematics
- Statistical Science
- 2021

Gelman and Rubin's (1992) convergence diagnostic is one of the most popular methods for terminating a Markov chain Monte Carlo (MCMC) sampler. Since the seminal paper, researchers have developed…

Variazioni e fluttuazioni del numero d’individui in specie animali conviventi [Variations and fluctuations in the number of individuals in coexisting animal species]

- 2019

Black-box Importance Sampling

- Mathematics, Computer Science
- AISTATS
- 2017

Black-box importance sampling methods are studied that calculate importance weights for samples generated from any unknown proposal or black-box mechanism, allowing better and richer proposals for difficult problems and improving estimation accuracy beyond typical importance sampling.
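As background, the standard self-normalised importance-sampling estimator that such black-box schemes generalise can be sketched as follows; the densities and names here are illustrative assumptions, not the paper's KSD-based weights:

```python
import numpy as np

def snis(f, samples, log_p, log_q):
    """Self-normalised importance sampling estimate of E_p[f(X)],
    using draws from the proposal q; normalising constants may be unknown
    because they cancel when the weights are normalised."""
    lw = log_p(samples) - log_q(samples)   # unnormalised log-weights
    w = np.exp(lw - lw.max())              # stabilise before exponentiating
    w /= w.sum()
    return float(np.sum(w * f(samples)))

# Target N(0, 1), proposal N(0, 4); only the quadratic terms are needed.
rng = np.random.default_rng(2)
x = 2.0 * rng.normal(size=100_000)
mean_est = snis(lambda t: t, x, lambda t: -0.5 * t ** 2, lambda t: -t ** 2 / 8.0)
```

The black-box methods of the cited paper replace the `log_p - log_q` weights with weights chosen by minimising a kernelized Stein discrepancy, so no proposal density is required.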

Stan: A Probabilistic Programming Language

- Computer Science
- 2017

Stan is a probabilistic programming language for specifying statistical models that provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler and an adaptive form of Hamiltonian Monte Carlo sampling.

Deterministic Sampling of Expensive Posteriors Using Minimum Energy Designs

- Computer Science, Mathematics
- Technometrics
- 2019

An efficient algorithm is developed that can generate MinED samples with few posterior evaluations, together with several improvements to the MinED criterion that make it perform better in high dimensions.

Stein Points Markov Chain Monte Carlo

- In Proceedings of the 36th International Conference on Machine Learning
- 2019

Stein Points (Briol, and C. J. Oates)

- In Proceedings of the 35th International Conference on Machine Learning
- 2018

A Kernel Test of Goodness of Fit

- Mathematics, Computer Science
- ICML
- 2016

A nonparametric statistical test for goodness-of-fit is proposed: given a set of samples, the test determines how likely it is that these were generated from a target density function, taking the form of a V-statistic in terms of the log gradients of the target density and the kernel.

On the Equivalence between Herding and Conditional Gradient Algorithms

- Mathematics, Computer Science
- ICML
- 2012

The experiments indicate that while the conditional gradient algorithm can improve over the herding procedure of Welling (2009) on the task of approximating integrals, the original herding algorithm tends to approach the maximum entropy distribution more often, shedding more light on the learning bias behind herding.

Reproducing kernel Hilbert spaces in probability and statistics

- Mathematics
- 2004

Contents: 1. Theory; 2. RKHS and Stochastic Processes; 3. Nonparametric Curve Estimation; 4. Measures and Random Measures; 5. Miscellaneous Applications; 6. Computational Aspects; 7. A Collection of Examples; …