# Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget

@inproceedings{Korattikara2013AusterityIM,
  title     = {Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget},
  author    = {Anoop Korattikara and Yutian Chen and Max Welling},
  booktitle = {International Conference on Machine Learning},
  year      = {2013}
}

Can we make Bayesian posterior MCMC sampling more efficient when faced with very large datasets? We argue that computing the likelihood for N datapoints in the Metropolis-Hastings (MH) test to reach a single binary decision is computationally inefficient. We introduce an approximate MH rule based on a sequential hypothesis test that allows us to accept or reject samples with high confidence using only a fraction of the data required for the exact MH rule. While this method introduces an…
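As a rough illustration of the idea in the abstract, the sketch below is our reconstruction, not the authors' code: it decides the MH accept/reject by testing whether the mean per-datapoint log-likelihood difference exceeds a threshold `mu0` (which, in the exact rule, is computed from the uniform draw, the prior ratio, and the proposal ratio), using a growing subsample. A normal approximation stands in for the paper's Student-t test.

```python
import math
import numpy as np

def approx_mh_test(loglik_diffs, mu0, eps=0.05, batch_size=50, seed=0):
    """Sequential approximate MH accept/reject test (sketch).

    loglik_diffs : per-datapoint log-likelihood differences
                   l_i = log p(x_i | theta') - log p(x_i | theta).
    mu0          : threshold the mean of l_i is compared against; in the
                   exact MH rule it derives from the uniform draw u, the
                   prior ratio, and the proposal ratio.
    eps          : tolerated probability of deciding the wrong way.
    Returns (accept, n_used): the decision and how many datapoints it saw.
    """
    rng = np.random.default_rng(seed)
    diffs = np.asarray(loglik_diffs, dtype=float)
    N = len(diffs)
    perm = rng.permutation(N)          # subsample without replacement
    n = 0
    while True:
        n = min(n + batch_size, N)
        sample = diffs[perm[:n]]
        mean = sample.mean()
        if n == N:                     # used all data: exact MH decision
            return mean > mu0, n
        sd = sample.std(ddof=1)
        # standard error with finite-population correction
        se = sd / math.sqrt(n) * math.sqrt(1.0 - (n - 1) / (N - 1))
        if se == 0.0:
            return mean > mu0, n
        # normal approximation to the paper's Student-t test: probability
        # that the true mean lies on the other side of mu0
        z = (mean - mu0) / se
        p_wrong = 0.5 * math.erfc(abs(z) / math.sqrt(2.0))
        if p_wrong < eps:              # confident enough: stop early
            return mean > mu0, n
```

When the proposed state is clearly better or clearly worse, the test typically terminates after one or two mini-batches, which is where the computational savings come from.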

## 278 Citations

### An Adaptive Subsampling Approach for MCMC Inference in Large Datasets

- Computer Science
- 2014

This paper describes a methodology that aims to scale up the Metropolis-Hastings (MH) algorithm, proposing an approximate implementation of the accept/reject step that requires evaluating the likelihood of only a random subset of the data, yet is guaranteed to coincide with the full-data accept/reject step with probability exceeding a user-specified tolerance level.

### No Free Lunch for Approximate MCMC

- Computer Science
- 2020

It is pointed out that well-known MCMC convergence results often imply that these "subsampling" MCMC algorithms cannot greatly improve performance, and generic results are applied to realistic statistical problems and proposed algorithms.

### Sequential Tests for Large Scale Learning

- Computer Science
- 2015

Algorithms that use sequential hypothesis tests to adaptively select a subset of data points and can be used to control the efficiency and accuracy of learning or inference are introduced.

### Speeding Up MCMC by Efficient Data Subsampling

- Computer Science, Mathematics
- Journal of the American Statistical Association
- 2018

Subsampling Markov chain Monte Carlo is shown to be substantially more efficient than standard MCMC in terms of sampling efficiency for a given computational budget, and to outperform other subsampling MCMC methods proposed in the literature.

### Scalable MCMC for Large Data Problems Using Data Subsampling and the Difference Estimator

- Computer Science
- 2015

A generic Markov chain Monte Carlo algorithm is proposed that speeds up computations for datasets with many observations by using the highly efficient difference estimator from the survey-sampling literature to estimate the log-likelihood accurately from only a small fraction of the data.

### Mini-Batch Metropolis–Hastings With Reversible SGLD Proposal

- Computer Science
- Journal of the American Statistical Association
- 2020

This work proposes a general framework for performing MH-MCMC using mini-batches of the whole dataset and shows that this gives rise to approximately a tempered stationary distribution, and proves that the algorithm preserves the modes of the original target distribution.

### Sequential Tests for Large-Scale Learning

- Computer Science
- Neural Computation
- 2016

Algorithms that use sequential hypothesis tests to adaptively select a subset of data points are introduced, and it is shown how the statistical properties of this subsampling process can be used to control the efficiency and accuracy of learning or inference.

### An Efficient Minibatch Acceptance Test for Metropolis-Hastings

- Computer Science
- UAI
- 2017

A Metropolis-Hastings acceptance test for large datasets is presented that uses small expected-size mini-batches of data and can be tuned to arbitrarily small batch sizes by adjusting either the proposal step size or the temperature.

### AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC

- Computer Science
- AISTATS
- 2020

This work proposes a novel second-order SG-MCMC algorithm, AMAGOLD, that infrequently applies Metropolis-Hastings (M-H) corrections to remove bias, and proves that AMAGOLD converges to the target distribution with a fixed, rather than diminishing, step size, at a rate at most a constant factor slower than a full-batch baseline.

## References

Showing 1-10 of 19 references.

### Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring

- Computer Science, Mathematics
- 2012

By leveraging the Bayesian Central Limit Theorem, the SGLD algorithm is extended so that at high mixing rates it will sample from a normal approximation of the posterior, while at slow mixing rates it will mimic the behavior of SGLD with a preconditioner matrix.

### Bayesian Learning via Stochastic Gradient Langevin Dynamics

- Computer Science
- ICML
- 2011

In this paper we propose a new framework for learning from large scale datasets based on iterative learning from small mini-batches. By adding the right amount of noise to a standard stochastic…
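The SGLD update this blurb describes, a stochastic-gradient step plus injected Gaussian noise, can be sketched for a scalar parameter as follows. This is a minimal illustration under assumed gradient functions, not the paper's implementation; the function names are our own.

```python
import numpy as np

def sgld_step(theta, data, grad_log_prior, grad_log_lik, batch_size, step, rng):
    """One SGLD update (sketch): half a step along a minibatch estimate of
    the log-posterior gradient, plus Gaussian noise with variance `step`."""
    N = len(data)
    idx = rng.choice(N, size=batch_size, replace=False)
    # rescale the minibatch log-likelihood gradient by N / batch_size so it
    # is an unbiased estimate of the full-data gradient
    grad = grad_log_prior(theta) + (N / batch_size) * np.sum(
        grad_log_lik(theta, data[idx]))
    noise = rng.normal(0.0, np.sqrt(step))
    return theta + 0.5 * step * grad + noise
```

For example, with data drawn from a unit-variance Gaussian, `grad_log_lik = lambda th, x: x - th`, and a broad prior, iterating this step produces samples that hover around the posterior mean; the injected noise is what turns the stochastic gradient ascent into a sampler rather than an optimizer.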

### Monte Carlo MCMC: Efficient Inference by Approximate Sampling

- Computer Science
- EMNLP
- 2012

This paper proposes an alternative MCMC sampling scheme in which transition probabilities are approximated by sampling from the set of relevant factors, and demonstrates that this method converges more quickly than a traditional MCMC sampler for both marginal and MAP inference.

### Reversible jump Markov chain Monte Carlo computation and Bayesian model determination

- Mathematics
- 1995

Markov chain Monte Carlo methods for Bayesian computation have until recently been restricted to problems where the joint distribution of all variables has a density with respect to some fixed…

### The pseudo-marginal approach for efficient Monte Carlo computations

- Computer Science
- 2009

A powerful and flexible MCMC algorithm for stochastic simulation is presented that builds on a pseudo-marginal method, showing how algorithms that are approximations to an idealized marginal algorithm can share the same marginal stationary distribution as the idealized method.

### Monte Carlo Sampling Methods Using Markov Chains and Their Applications

- Mathematics
- 1970

A generalization of the sampling method introduced by Metropolis et al. (1953) is presented, along with an exposition of the relevant theory, techniques of application and methods and…

### A noisy Monte Carlo algorithm

- Computer Science
- 2000

A Monte Carlo algorithm is proposed that promotes the Kennedy-Kuti linear accept/reject algorithm, which accommodates unbiased stochastic estimates of the acceptance probability, to an exact one; it is tested on the five-state model with desirable results.

### Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference

- Computer Science
- Technometrics
- 2008


### Bayesian additive regression kernels

- Computer Science, Mathematics
- 2008

It is shown that the α-stable prior on the kernel regression coefficients may be approximated by tα distributions, and a reversible-jump Markov chain Monte Carlo algorithm is developed to make posterior inference on the unknown mean function.