# DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm

@article{Plassier2021DGLMCAT,
  title   = {DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm},
  author  = {Vincent Plassier and Maxime Vono and Alain Durmus and {\'E}ric Moulines},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2106.06300}
}

Performing reliable Bayesian inference at big-data scale is becoming a keystone of the modern era of machine learning. A workhorse class of methods for this task are Markov chain Monte Carlo (MCMC) algorithms, and their design for distributed datasets has been the subject of many works. However, existing methods are either not completely reliable or not computationally efficient. In this paper, we propose to fill this gap in the case where the dataset is partitioned and stored on…
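As a rough illustration of the setting the abstract describes — a synchronous Langevin Monte Carlo step over a partitioned dataset, where each worker contributes a gradient computed on its local shard — here is a minimal toy sketch. This is a generic synchronous scheme on a conjugate Gaussian model, not the DG-LMC algorithm itself; the model, step size, and shard count are assumptions for illustration.

```python
import numpy as np

# Toy model: unit-variance Gaussian likelihood with unknown mean and a flat
# prior, so the posterior over theta is N(data mean, 1/N). The dataset is
# partitioned across 4 "workers".
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=1000)
shards = np.array_split(data, 4)

def local_grad(theta, shard):
    # Gradient of this shard's Gaussian log-likelihood w.r.t. theta.
    return np.sum(shard - theta)

def lmc_step(theta, step):
    # Synchronous step: every worker returns its local gradient, the master
    # sums them and performs one Langevin update with injected Gaussian noise.
    grad = sum(local_grad(theta, s) for s in shards)
    return theta + step * grad + np.sqrt(2.0 * step) * rng.normal()

theta, step = 0.0, 1e-4
samples = []
for t in range(5000):
    theta = lmc_step(theta, step)
    if t >= 1000:  # discard burn-in
        samples.append(theta)

print(np.mean(samples))  # close to the data mean (about 2.0)
```

The point of the sketch is the communication pattern: each iteration needs one round of local gradient computations plus one synchronization, which is the regime that scalable synchronous schemes aim to make cheap.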

#### One Citation

Asynchronous and Distributed Data Augmentation for Massive Data Settings

- Mathematics
- 2021

Data augmentation (DA) algorithms are widely used for Bayesian inference due to their simplicity. In massive data settings, however, DA algorithms are prohibitively slow because they pass through the…

#### References

Showing 1–10 of 52 references

Distributed Stochastic Gradient MCMC

- Computer Science
- ICML
- 2014

This work argues that stochastic gradient MCMC algorithms are particularly suited for distributed inference because individual chains can draw mini-batches from their local pool of data for a flexible amount of time before jumping to or syncing with other chains, which greatly reduces communication overhead and allows adaptive load balancing.
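The idea summarized above — a chain drawing mini-batches from its local data pool between synchronizations — can be sketched with a minimal stochastic gradient Langevin dynamics (SGLD) step on one worker. The Gaussian model, batch size, and step size are illustrative assumptions, not the cited paper's exact scheme.

```python
import numpy as np

# One worker's local data pool under a unit-variance Gaussian model with
# unknown mean and a flat prior.
rng = np.random.default_rng(2)
local_data = rng.normal(loc=2.0, scale=1.0, size=2000)
N, batch = len(local_data), 100

def sgld_step(theta, step):
    # Draw a mini-batch from the local pool and rescale its gradient by N/batch
    # so the estimate of the full local gradient is unbiased.
    mb = rng.choice(local_data, size=batch, replace=False)
    grad = (N / batch) * np.sum(mb - theta)
    return theta + 0.5 * step * grad + np.sqrt(step) * rng.normal()

theta, samples = 0.0, []
for t in range(5000):
    theta = sgld_step(theta, 1e-4)
    if t >= 1000:  # discard burn-in
        samples.append(theta)
```

Because each step touches only a mini-batch of local data and no other chain, the worker can run many such steps between synchronizations, which is the source of the reduced communication overhead the summary mentions.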

Parallel and Distributed MCMC via Shepherding Distributions

- Computer Science
- AISTATS
- 2018

A general algorithmic framework for developing easily parallelizable/distributable Markov chain Monte Carlo algorithms that relies on the introduction of an auxiliary distribution called a shepherding distribution that is used to control several MCMC chains that run in parallel.

Parallelizing MCMC with Random Partition Trees

- Mathematics, Computer Science
- NIPS
- 2015

A new EP-MCMC algorithm, PART, is proposed that applies random partition trees to combine the subset posterior draws, which is distribution-free, easy to re-sample from, and can adapt to multiple scales.

Asymptotically Exact, Embarrassingly Parallel MCMC

- Computer Science, Mathematics
- UAI
- 2014

This paper presents a parallel Markov chain Monte Carlo (MCMC) algorithm in which subsets of data are processed independently, with very little communication, and proves that it generates asymptotically exact samples and empirically demonstrates its ability to parallelize burn-in and sampling in several models.
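The recipe described above — sample each data subset's "subposterior" independently, then combine the draws — can be sketched with the parametric (Gaussian-product) combination rule on a conjugate toy model. This is an illustrative assumption-laden sketch, not the cited paper's exact nonparametric estimator; in particular, the subposteriors here are Gaussian by construction, so we draw from them directly instead of running a full MCMC chain per shard.

```python
import numpy as np

# Toy model: unit-variance Gaussian likelihood with unknown mean, flat prior,
# data partitioned across 4 workers.
rng = np.random.default_rng(3)
data = rng.normal(loc=2.0, scale=1.0, size=4000)
shards = np.array_split(data, 4)

# Each worker samples its subposterior independently. For this model the
# subposterior of the mean is N(shard mean, 1/len(shard)), so we can sample
# it in closed form in place of an MCMC run.
sub_draws = [rng.normal(s.mean(), 1.0 / np.sqrt(len(s)), size=5000)
             for s in shards]

# Combine: fit a Gaussian to each worker's draws and form their product
# (precisions add; the mean is the precision-weighted average).
m = np.array([d.mean() for d in sub_draws])
v = np.array([d.var() for d in sub_draws])
full_var = 1.0 / np.sum(1.0 / v)
full_mean = full_var * np.sum(m / v)
print(full_mean, full_var)  # close to the full-data posterior N(data.mean(), 1/4000)
```

The only communication happens at the very end, when the fitted moments (or the draws themselves) are shipped to a single node for combination — the "embarrassingly parallel" property.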

Efficient MCMC Sampling with Dimension-Free Convergence Rate using ADMM-type Splitting

- Mathematics, Computer Science
- 2019

A detailed theoretical study of a recent alternative class of MCMC schemes exploiting a splitting strategy akin to the one used by the celebrated ADMM optimization algorithm, known as the split Gibbs sampler.

Embarrassingly Parallel MCMC using Deep Invertible Transformations

- Computer Science, Mathematics
- UAI
- 2019

This work introduces deep invertible transformations to approximate each of the subposteriors; these approximations can be made accurate even for complex distributions and serve as intermediate representations, keeping the total communication cost limited.

Split-and-Augmented Gibbs Sampler—Application to Large-Scale Inference Problems

- Computer Science, Mathematics
- IEEE Transactions on Signal Processing
- 2019

This paper derives two new optimization-driven Monte Carlo algorithms inspired by variable splitting and data augmentation, which enable faster and more efficient sampling schemes than current state-of-the-art methods and can embed them as special cases.

Global Consensus Monte Carlo

- Mathematics, Computer Science
- 2018

An instrumental hierarchical model associating auxiliary statistical parameters with each term, which are conditionally independent given the top-level parameters, leads to a distributed MCMC algorithm on an extended state space yielding approximations of posterior expectations.

Communication-Efficient Distributed Statistical Inference

- Computer Science, Mathematics
- 2016

CSL provides a communication-efficient surrogate to the global likelihood that can be used for low-dimensional estimation, high-dimensional regularized estimation, and Bayesian inference, and significantly improves the computational efficiency of Markov chain Monte Carlo algorithms even in a non-distributed setting.

Stochastic Gradient MCMC with Stale Gradients

- Computer Science, Mathematics
- NIPS
- 2016

Stochastic gradient MCMC (SG-MCMC) has played an important role in large-scale Bayesian learning, with well-developed theoretical convergence properties. In such applications of SG-MCMC, it is…