Online, Informative MCMC Thinning with Kernelized Stein Discrepancy
@article{Hawkins2022OnlineIM,
  title={Online, Informative MCMC Thinning with Kernelized Stein Discrepancy},
  author={Cole Hawkins and Alec Koppel and Zheng Zhang},
  journal={ArXiv},
  year={2022},
  volume={abs/2201.07130}
}
A fundamental challenge in Bayesian inference is efficient representation of a target distribution. Many non-parametric approaches do so by sampling a large number of points using variants of Markov Chain Monte Carlo (MCMC). We propose an MCMC variant, which we call KSD Thinning, that retains only those posterior samples which exceed a KSD threshold. We establish the convergence and complexity tradeoffs for several settings of KSD Thinning as a function of the KSD threshold parameter, sample size…
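The abstract states the selection rule only at a high level, so the sketch below is one illustrative reading rather than the authors' algorithm: it assumes the standard Langevin Stein kernel built from an inverse multiquadric (IMQ) base kernel, access to the score function (the gradient of the log-density), a biased V-statistic estimate of the squared KSD, and a hypothetical `ksd_thinning_step` rule that keeps a candidate only when it improves the running KSD of the retained set by more than the threshold.

```python
import numpy as np

def imq_stein_kernel(x, y, score_x, score_y, c=1.0, beta=-0.5):
    """Langevin Stein kernel k_p(x, y) built from the IMQ base kernel
    k(x, y) = (c^2 + ||x - y||^2)^beta and the score s = grad log p."""
    d = x - y
    r2 = d @ d
    base = c**2 + r2
    k = base**beta
    grad_x_k = 2.0 * beta * base**(beta - 1) * d   # gradient of k in x
    grad_y_k = -grad_x_k                           # gradient of k in y
    trace_term = (-4.0 * beta * (beta - 1) * base**(beta - 2) * r2
                  - 2.0 * beta * len(d) * base**(beta - 1))
    return (trace_term + grad_x_k @ score_y + grad_y_k @ score_x
            + k * (score_x @ score_y))

def ksd_squared(samples, scores, c=1.0, beta=-0.5):
    """Biased (V-statistic) estimate of the squared KSD of a sample set."""
    n = len(samples)
    total = sum(imq_stein_kernel(samples[i], samples[j], scores[i], scores[j], c, beta)
                for i in range(n) for j in range(n))
    return total / n**2

def ksd_thinning_step(retained, retained_scores, candidate, candidate_score,
                      threshold, c=1.0, beta=-0.5):
    """Hypothetical acceptance rule: keep the candidate MCMC sample only if it
    improves the running KSD of the retained set by more than `threshold`."""
    if not retained:
        return True
    current = ksd_squared(retained, retained_scores, c, beta)
    augmented = ksd_squared(retained + [candidate],
                            retained_scores + [candidate_score], c, beta)
    return current - augmented > threshold
```

A streaming implementation would update the running double sum incrementally rather than recomputing the full V-statistic for every candidate; the quadratic recomputation here is purely for clarity.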
3 Citations
A stochastic Stein Variational Newton method
- Computer Science, ArXiv
- 2022
This paper derives, and provides a practical implementation of, a stochastic variant of SVN (sSVN) that is asymptotically correct, converges rapidly, and is a promising approach to accelerating high-precision Bayesian inference tasks of modest dimension.
Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning
- Computer Science
- 2022
This work proposes a novel Kernelized Stein Discrepancy-based Posterior Sampling for RL algorithm (named KSRL), which extends model-based RL based upon posterior sampling (PSRL) in several ways, and develops a novel regret analysis of PSRL based upon integral probability metrics.
References
Showing 1-10 of 35 references
Measuring Sample Quality with Kernels
- Mathematics, ICML
- 2017
A theory of weak convergence for KSDs based on Stein's method is developed; it is demonstrated that commonly used KSDs fail to detect non-convergence even for Gaussian targets, and it is shown that kernels with slowly decaying tails provably determine convergence for a large class of target distributions.
Dimension-independent likelihood-informed MCMC
- Computer Science, Mathematics, J. Comput. Phys.
- 2016
Optimal thinning of MCMC output
- Computer Science, Journal of the Royal Statistical Society: Series B (Statistical Methodology)
- 2022
A novel method is proposed, based on greedy minimisation of a kernel Stein discrepancy, that is suitable for problems where heavy compression is required and its effectiveness is demonstrated in the challenging context of parameter inference for ordinary differential equations.
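As a rough illustration of the greedy rule summarised above, here is a sketch assuming the Stein-kernel Gram matrix over the MCMC output has been precomputed; it is not the reference implementation from the paper:

```python
import numpy as np

def greedy_stein_thinning(K, m):
    """Greedily select m indices from MCMC output given the Stein-kernel Gram
    matrix K (K[i, j] = k_p(x_i, x_j)), each step adding the point that most
    reduces the KSD of the selected set; selection is with replacement."""
    n = K.shape[0]
    selected = []
    running = np.zeros(n)   # running[i] = sum of K[i, j] over already selected j
    for _ in range(m):
        objective = 0.5 * np.diag(K) + running
        j = int(np.argmin(objective))
        selected.append(j)
        running += K[:, j]
    return selected
```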
Bayesian Learning via Stochastic Gradient Langevin Dynamics
- Computer Science, ICML
- 2011
In this paper we propose a new framework for learning from large scale datasets based on iterative learning from small mini-batches. By adding the right amount of noise to a standard stochastic…
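The summary is cut off, but the SGLD update it refers to is simple enough to sketch; the function and argument names below are illustrative, not from the paper:

```python
import numpy as np

def sgld_step(theta, grad_log_prior, grad_log_lik_batch, n_data, batch_size,
              step_size, rng):
    """One Stochastic Gradient Langevin Dynamics update: a stochastic-gradient
    step on the log-posterior (minibatch gradient rescaled to the full dataset)
    plus Gaussian noise whose variance equals the step size."""
    grad = grad_log_prior(theta) + (n_data / batch_size) * grad_log_lik_batch(theta)
    noise = rng.normal(0.0, np.sqrt(step_size), size=theta.shape)
    return theta + 0.5 * step_size * grad + noise
```

With a decaying step size, the injected Gaussian noise eventually dominates the minibatch gradient noise, which is what allows the iterates to be treated as approximate posterior samples.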
Stein Point Markov Chain Monte Carlo
- Computer Science, ICML
- 2019
This paper removes the need to solve a global optimisation problem at each iteration by selecting each new point from a Markov chain sample path, which significantly reduces the computational cost of Stein Points and leads to a suite of algorithms that are straightforward to implement.
A Complete Recipe for Stochastic Gradient MCMC
- Computer Science, Mathematics, NIPS
- 2015
This paper provides a general recipe for constructing MCMC samplers--including stochastic gradient versions--based on continuous Markov processes specified via two matrices, and uses the recipe to straightforwardly propose a new state-adaptive sampler: stochastic gradient Riemann Hamiltonian Monte Carlo (SGRHMC).
What Are Bayesian Neural Network Posteriors Really Like?
- Computer Science, ICML
- 2021
It is shown that BNNs can achieve significant performance gains over standard training and deep ensembles, that a single long HMC chain can provide a representation of the posterior comparable to multiple shorter chains, and that posterior tempering is not needed for near-optimal performance.
Subspace Inference for Bayesian Deep Learning
- Computer Science, UAI
- 2019
This work constructs low-dimensional subspaces of parameter space, such as the first principal components of the stochastic gradient descent (SGD) trajectory, that contain diverse sets of high-performing models, and shows that Bayesian model averaging over the induced posterior produces accurate predictions and well-calibrated predictive uncertainty for both regression and image classification.
Consistent Online Gaussian Process Regression Without the Sample Complexity Bottleneck
- Computer Science, 2019 American Control Conference (ACC)
- 2019
This work develops the first compression sub-routine for online Gaussian processes that preserves their convergence to the population posterior, i.e., asymptotic posterior consistency, while ameliorating their intractable complexity growth with the sample size.
A Stochastic Newton MCMC Method for Large-Scale Statistical Inverse Problems with Application to Seismic Inversion
- Mathematics, SIAM J. Sci. Comput.
- 2012
This work addresses the solution of large-scale statistical inverse problems in the framework of Bayesian inference using a so-called stochastic Newton MCMC method.