Corpus ID: 218571291

Optimal Thinning of MCMC Output

Marina Riabiz, Wilson Ye Chen, Jon Cockayne, Pawel Swietach, Steven A. Niederer, Lester W. Mackey, Chris J. Oates
arXiv: Methodology
The use of heuristics to assess the convergence and compress the output of Markov chain Monte Carlo can be sub-optimal in terms of the empirical approximations that are produced. Typically a number of the initial states are attributed to "burn in" and removed, whilst the remainder of the chain is "thinned" if compression is also required. In this paper we consider the problem of retrospectively selecting a subset of states, of fixed cardinality, from the sample path such that the approximation…
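The heuristic pipeline that the abstract contrasts against (burn-in removal followed by uniform thinning) can be sketched as follows; the function and variable names are illustrative, not taken from the paper:

```python
import numpy as np

def burn_in_and_thin(chain, burn_in, target_size):
    """Conventional heuristic compression of an MCMC sample path:
    discard the first `burn_in` states, then keep every k-th state
    so that roughly `target_size` states remain."""
    kept = chain[burn_in:]                  # drop the "burn-in" states
    stride = max(1, len(kept) // target_size)
    return kept[::stride][:target_size]     # "thin" the remainder

# Toy example: a 1-D random-walk chain of 10,000 states.
rng = np.random.default_rng(0)
chain = np.cumsum(rng.normal(size=10_000)) * 0.01
subset = burn_in_and_thin(chain, burn_in=1_000, target_size=100)
print(subset.shape)  # (100,)
```

The paper's point is that this stride-based selection ignores the target distribution entirely; its proposed method instead chooses the fixed-cardinality subset to optimise the quality of the resulting empirical approximation.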


Distribution Compression in Near-linear Time
Compress++ is introduced, a simple meta-procedure for speeding up any thinning algorithm while suffering at most a factor-of-four increase in error; it reduces the runtime of super-quadratic algorithms by a square-root factor.
Generalized Kernel Thinning
The kernel thinning (KT) algorithm of Dwivedi and Mackey (2021) compresses an n-point distributional summary into a √n-point summary with better-than-Monte-Carlo maximum mean discrepancy for a…
Kernel Thinning
Kernel thinning is introduced, a new procedure for compressing a distribution P more effectively than i.i.d. sampling or standard thinning, and explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Matérn, and B-spline kernels are derived.
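For context, the maximum mean discrepancy (MMD) that these compression guarantees are stated in can be estimated directly from two samples. The sketch below computes the V-statistic form of MMD² under an RBF kernel; it illustrates the quality measure only, not the kernel thinning algorithm itself, and all names are illustrative:

```python
import numpy as np

def mmd_squared(x, y, h=1.0):
    """Squared maximum mean discrepancy (V-statistic form) between two
    1-D samples under an RBF kernel with bandwidth `h`."""
    def gram(a, b):
        return np.exp(-(a[:, None] - b[None, :])**2 / (2 * h**2))
    return gram(x, x).mean() - 2 * gram(x, y).mean() + gram(y, y).mean()

rng = np.random.default_rng(0)
full = rng.normal(size=1024)
# Compare a size-32 (≈ √1024) uniform subsample against the full sample.
subset = rng.choice(full, size=32, replace=False)
print(mmd_squared(full, subset) >= 0)  # True (MMD² is non-negative)
```

A thinning procedure with better-than-Monte-Carlo guarantees would select the √n-point subset so that this discrepancy shrinks faster than it does for an i.i.d. subsample.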
Optimal Quantisation of Probability Measures Using Maximum Mean Discrepancy
A novel non-myopic algorithm is proposed and a variant that applies this technique to a mini-batch of the candidate set at each iteration is investigated, in order to both improve statistical efficiency and reduce computational cost.
Fast Compression of MCMC Output
A novel method for compressing the output of an MCMC (Markov chain Monte Carlo) algorithm when control variates are available, using the cube method, which compares favourably to previous methods, such as Stein thinning.
Kernel Stein Discrepancy Descent
The convergence properties of KSD Descent are studied and its practical relevance is demonstrated, but failure cases are highlighted by showing that the algorithm can get stuck in spurious local minima.
Minimum Discrepancy Methods in Uncertainty Quantification
Post-Processing of MCMC
State-of-the-art techniques for post-processing Markov chain output are reviewed, including methods based on discrepancy minimisation, which directly address the bias-variance trade-off, as well as general-purpose control variate methods for approximating expected quantities of interest.
Postprocessing of MCMC
Markov chain Monte Carlo is the engine of modern Bayesian statistics, being used to approximate the posterior and derived quantities of interest. Despite this, the issue of how the output from a…
Robust Generalised Bayesian Inference for Intractable Likelihoods
Generalised Bayesian inference updates prior beliefs using a loss function, rather than a likelihood, and can therefore be used to confer robustness against possible misspecification of the…


Revisiting the Gelman–Rubin Diagnostic
Gelman and Rubin's (1992) convergence diagnostic is one of the most popular methods for terminating a Markov chain Monte Carlo (MCMC) sampler. Since the seminal paper, researchers have developed…
Variazioni e fluttuazioni del numero d'individui in specie animali conviventi [Variations and fluctuations of the number of individuals in animal species living together]
  • 2019
Black-box Importance Sampling
Black-box importance sampling methods that calculate importance weights for samples generated from any unknown proposal or black-box mechanism are studied, allowing for better and richer proposals to solve difficult problems and improving the estimation accuracy beyond typical importance sampling.
Stan: A Probabilistic Programming Language
Stan is a probabilistic programming language for specifying statistical models that provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler and an adaptive form of Hamiltonian Monte Carlo sampling.
Deterministic Sampling of Expensive Posteriors Using Minimum Energy Designs
An efficient algorithm is developed that can generate MinED samples with few posterior evaluations, and several improvements to the MinED criterion make it perform better in high dimensions.
Stein points Markov chain Monte Carlo
  • In Proceedings of the 36th International Conference on Machine Learning,
  • 2019
Briol, and C. J. Oates. Stein points.
  • In Proceedings of the 35th International Conference on Machine Learning,
  • 2018
A Kernel Test of Goodness of Fit
A nonparametric statistical test for goodness-of-fit is proposed: given a set of samples, the test determines how likely it is that these were generated from a target density function, taking the form of a V-statistic in terms of the log gradients of the target density and the kernel.
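A minimal sketch of such a V-statistic, assuming a standard-normal target (score s(x) = -x) and an RBF base kernel in one dimension; this is a hypothetical helper for illustration, not the authors' implementation:

```python
import numpy as np

def ksd_v_statistic(x, score, h=1.0):
    """V-statistic estimate of the squared kernel Stein discrepancy
    between a 1-D sample `x` and a target with score function `score`
    (the gradient of the log target density), using the Langevin Stein
    kernel built from an RBF base kernel with bandwidth `h`."""
    d = x[:, None] - x[None, :]
    k = np.exp(-d**2 / (2 * h**2))           # base kernel k(x_i, x_j)
    dk_dx = -d / h**2 * k                    # ∂k/∂x_i
    dk_dy = d / h**2 * k                     # ∂k/∂x_j
    d2k = (1 / h**2 - d**2 / h**4) * k       # ∂²k/∂x_i∂x_j
    s = score(x)
    # Stein kernel: combines the kernel and the target's log gradients.
    kp = (d2k + s[:, None] * dk_dy + s[None, :] * dk_dx
          + s[:, None] * s[None, :] * k)
    return kp.mean()                         # V-statistic (1/n²) Σ_ij k_p

# Samples from the standard-normal target (score s(x) = -x) should give
# a small discrepancy; a shifted sample should give a larger one.
rng = np.random.default_rng(1)
good = rng.normal(size=500)
bad = good + 2.0
score = lambda x: -x
print(ksd_v_statistic(good, score) < ksd_v_statistic(bad, score))  # True
```

The appeal of this quantity for thinning and goodness-of-fit testing is that it needs only the log gradient of the target, so the (typically unknown) normalising constant of the posterior cancels out.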
On the Equivalence between Herding and Conditional Gradient Algorithms
The experiments indicate that while conditional gradient variants can improve over the herding procedure of Welling (2009) on the task of approximating integrals, the original herding algorithm tends to approach the maximum entropy distribution more often, shedding more light on the learning bias behind herding.
Reproducing kernel Hilbert spaces in probability and statistics
1 Theory. 2 RKHS and Stochastic Processes. 3 Nonparametric Curve Estimation. 4 Measures and Random Measures. 5 Miscellaneous Applications. 6 Computational Aspects. 7 A Collection of Examples. …