Corpus ID: 1607379

Firefly Monte Carlo: Exact MCMC with Subsets of Data

@inproceedings{Maclaurin2014FireflyMC,
  title={Firefly Monte Carlo: Exact MCMC with Subsets of Data},
  author={Dougal Maclaurin and Ryan P. Adams},
  booktitle={Conference on Uncertainty in Artificial Intelligence},
  year={2014}
}
Markov chain Monte Carlo (MCMC) is a popular tool for Bayesian inference. However, MCMC cannot be practically applied to large data sets because of the prohibitive cost of evaluating every likelihood term at every iteration. Here we present Firefly Monte Carlo (FlyMC), an MCMC algorithm with auxiliary variables that queries the likelihoods of only a subset of the data at each iteration yet simulates from the exact posterior distribution. FlyMC is compatible with modern MCMC algorithms, and only…
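
The auxiliary-variable idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is a toy under stated assumptions: a Student-t location model with a Gaussian lower bound on each likelihood term, a wide Gaussian prior, a random-walk Metropolis update for the parameter, and Gibbs resampling of a random subset of the auxiliary "bright/dark" indicators. The names `flymc_sketch` and `log_ratio`, and all constants, are hypothetical.

```python
import numpy as np

NU = 4.0  # Student-t degrees of freedom (assumed toy value)


def log_ratio(theta, x):
    """log L_n(theta) - log B_n(theta) for each datum; always >= 0.

    L_n is the Student-t likelihood and B_n the Gaussian lower bound
    obtained from log(1 + u) <= u (normalising constants cancel).
    """
    r2 = (x - theta) ** 2
    return 0.5 * (NU + 1) * (r2 / NU - np.log1p(r2 / NU))


def flymc_sketch(x, n_iter=3000, step=0.05, resample_frac=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n = x.size
    s1, s2 = x.sum(), (x ** 2).sum()   # sufficient statistics for prod_n B_n
    bright = np.zeros(n, dtype=bool)   # auxiliary indicators z_n, all dark
    theta = x.mean()
    m = max(1, int(resample_frac * n))
    samples, frac_bright = [], []

    def log_joint(t):
        # Assumed N(0, 10^2) prior + collapsed product of bounds over all data
        sum_r2 = n * t * t - 2.0 * t * s1 + s2
        lp = -0.5 * t * t / 100.0 - (NU + 1) * sum_r2 / (2.0 * NU)
        # Correction log((L_n - B_n) / B_n), evaluated on bright points only
        with np.errstate(divide="ignore"):
            lp += np.log(np.expm1(log_ratio(t, x[bright]))).sum()
        return lp

    for _ in range(n_iter):
        # Gibbs step on a random subset: P(z_n = 1 | theta) = 1 - B_n / L_n
        idx = rng.choice(n, size=m, replace=False)
        p_bright = -np.expm1(-log_ratio(theta, x[idx]))
        bright[idx] = rng.uniform(size=m) < p_bright

        # Random-walk Metropolis step on theta given the indicators
        prop = theta + step * rng.normal()
        if np.log(rng.uniform()) < log_joint(prop) - log_joint(theta):
            theta = prop

        samples.append(theta)
        frac_bright.append(bright.mean())

    return np.array(samples), float(np.mean(frac_bright))
```

Each parameter update evaluates likelihood terms only for the currently bright points, while the bound product over the full data set reduces to two precomputed sums. Summing the joint over the indicators recovers the prior times the full likelihood, which is why the marginal over the parameter is the exact posterior in this toy setting.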

An algorithm for distributed Bayesian inference

A scalable extension of Monte Carlo algorithms using the divide‐and‐conquer (D&C) technique that divides the data into a sufficiently large number of subsets, draws parameters in parallel on the subsets using a powered likelihood and produces Monte Carlo draws of the parameter by combining parameter draws obtained from each subset.

No Free Lunch for Approximate MCMC

It is pointed out that well-known MCMC convergence results often imply that these "subsampling" MCMC algorithms cannot greatly improve performance, and generic results are applied to realistic statistical problems and proposed algorithms.

An Approximate MCMC Method for Convex Hulls

The initial work in this thesis is to define a data-augmentation algorithm along the lines of FlyMC, which uses a pseudo-marginal algorithm (PMMH) to replace the distribution of the parameter of interest, conditional on the augmented variable, with an estimator, and introduces an auxiliary random variable to mark subsets.

An Algorithm for Distributed Bayesian Inference in Generalized Linear Models

This work develops a scalable extension of Monte Carlo algorithms using the divide-and-conquer technique that divides the data into a sufficiently large number of subsets, draws parameters in parallel on the subsets using a powered likelihood, and produces Monte Carlo draws of the parameter by combining parameter draws obtained from each subset.

Parallelizing MCMC with Random Partition Trees

A new EP-MCMC algorithm PART is proposed that applies random partition trees to combine the subset posterior draws, which is distribution-free, easy to re-sample from and can adapt to multiple scales.

Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets

The Scalable Metropolis-Hastings (SMH) kernel is proposed, a kernel that exploits Gaussian concentration of the posterior to require processing on average only $O(1)$ or even $O(1/\sqrt{n})$ data points per step.

Variational Consensus Monte Carlo

Practitioners of Bayesian statistics have long depended on Markov chain Monte Carlo (MCMC) to obtain samples from intractable posterior distributions. Unfortunately, MCMC algorithms are typically

On Markov chain Monte Carlo methods for tall data

An original subsampling-based approach is proposed which samples from a distribution provably close to the posterior distribution of interest, yet can require less than $O(n)$ data point likelihood evaluations at each iteration for certain statistical models in favourable scenarios.

Speeding Up MCMC by Efficient Data Subsampling

Subsampling Markov chain Monte Carlo is substantially more efficient than standard MCMC in terms of sampling efficiency for a given computational budget, and it outperforms other subsampling methods for MCMC proposed in the literature.

Comparing consensus Monte Carlo strategies for distributed Bayesian computation

It is found that resampling and kernel density based methods break down after 10 or sometimes fewer dimensions, while the new mixture-based approach works well, but the necessary mixture models take too long to fit.

References

CODA: convergence diagnosis and output analysis for MCMC

The coda package for R, which supports Bayesian inference with Markov chain Monte Carlo, contains a set of functions designed to help the user answer questions about how many samples are required to accurately estimate posterior quantities of interest.

Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget

This work introduces an approximate MH rule based on a sequential hypothesis test that allows us to accept or reject samples with high confidence using only a fraction of the data required for the exact MH rule.

The pseudo-marginal approach for efficient Monte Carlo computations

A powerful and flexible MCMC algorithm for stochastic simulation that builds on a pseudo-marginal method, showing how algorithms that are approximations to an idealized marginal algorithm can share the same marginal stationary distribution as the idealized method.

Bayesian Learning via Stochastic Gradient Langevin Dynamics

In this paper we propose a new framework for learning from large scale datasets based on iterative learning from small mini-batches. By adding the right amount of noise to a standard stochastic

Black Box Variational Inference

This paper presents a "black box" variational inference algorithm, one that can be quickly applied to many models with little additional derivation, based on a stochastic optimization of the variational objective where the noisy gradient is computed from Monte Carlo samples from the variational distribution.

Stochastic variational inference

Stochastic variational inference lets us apply complex Bayesian models to massive data sets, and it is shown that the Bayesian nonparametric topic model outperforms its parametric counterpart.

Slice Sampling

Markov chain sampling methods that adapt to characteristics of the distribution being sampled can be constructed using the principle that one can sample from a distribution by sampling uniformly from

Accelerating MCMC via Parallel Predictive Prefetching

This work speculatively evaluates many potential steps of an MCMC chain in parallel while exploiting fast, iterative approximations to the target density, and achieves speedup close to linear in the number of available cores.

Optimal scaling of discrete approximations to Langevin diffusions

An asymptotic diffusion limit theorem is proved and it is shown that, as a function of dimension $n$, the complexity of the algorithm is $O(n^{1/3})$, which compares favourably with the $O(n)$ complexity of random walk Metropolis algorithms.

Weak convergence and optimal scaling of random walk Metropolis algorithms

This paper considers the problem of scaling the proposal distribution of a multidimensional random walk Metropolis algorithm in order to maximize the efficiency of the algorithm. The main result is a