Corpus ID: 235417432

DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm

@article{Plassier2021DGLMCAT,
  title={DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm},
  author={Vincent Plassier and Maxime Vono and Alain Durmus and {\'E}ric Moulines},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.06300}
}
Performing reliable Bayesian inference at a big data scale is becoming a keystone of modern machine learning. A workhorse class of methods for this task is Markov chain Monte Carlo (MCMC), and the design of MCMC algorithms that can handle distributed datasets has been the subject of many works. However, existing methods are either not fully reliable or not computationally efficient. In this paper, we propose to fill this gap in the case where the dataset is partitioned and stored on…
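The abstract above describes the setting only at a high level. As a rough, hedged sketch of the kind of update that gradient-based distributed MCMC builds on, the toy Python snippet below performs one unadjusted Langevin step with the data partitioned into shards held by workers, each returning the gradient of its local log-likelihood. It illustrates the master/workers setting under simplifying assumptions and is not the DG-LMC algorithm itself; every name in it (ula_step_distributed, grad_log_lik_shard, ...) is a placeholder.

    import numpy as np

    def ula_step_distributed(theta, shards, grad_log_prior, grad_log_lik_shard, step, rng):
        """One unadjusted Langevin (ULA) step with the data split across shards.

        Each "worker" returns the gradient of the log-likelihood of its own data
        shard; the "master" sums them with the log-prior gradient and adds
        Gaussian noise.  Toy illustration of the master/workers setting, not the
        DG-LMC algorithm.
        """
        local_grads = [grad_log_lik_shard(theta, shard) for shard in shards]  # workers
        grad = grad_log_prior(theta) + np.sum(local_grads, axis=0)            # master
        return theta + step * grad + np.sqrt(2.0 * step) * rng.normal(size=theta.shape)

    # Usage sketch: Bayesian linear regression with unit noise and a N(0, I) prior.
    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(1000, 5)), rng.normal(size=1000)
    shards = [(X[i::4], y[i::4]) for i in range(4)]                 # 4 workers
    grad_log_prior = lambda th: -th
    grad_log_lik_shard = lambda th, s: s[0].T @ (s[1] - s[0] @ th)
    theta = np.zeros(5)
    for _ in range(500):
        theta = ula_step_distributed(theta, shards, grad_log_prior,
                                     grad_log_lik_shard, step=1e-4, rng=rng)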


Asynchronous and Distributed Data Augmentation for Massive Data Settings
  • Jiayuan Zhou, K. Khare, Sanvesh Srivastava
  • Mathematics
  • 2021
Data augmentation (DA) algorithms are widely used for Bayesian inference due to their simplicity. In massive data settings, however, DA algorithms are prohibitively slow because they pass through the…

References

Showing 1–10 of 52 references
Distributed Stochastic Gradient MCMC
TLDR
This work argues that stochastic gradient MCMC algorithms are particularly suited for distributed inference because individual chains can draw mini-batches from their local pool of data for a flexible amount of time before jumping to or syncing with other chains, which greatly reduces communication overhead and allows adaptive load balancing.
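As a hedged illustration of the mini-batch idea mentioned in this summary, and not that paper's exact scheme, the sketch below shows one stochastic gradient Langevin step that a single worker could run on its local data pool; grad_log_prior, grad_log_lik_point, and the rescaling convention are illustrative assumptions.

    import numpy as np

    def sgld_step_local(theta, X_local, y_local, grad_log_prior,
                        grad_log_lik_point, step, batch_size, rng):
        """One stochastic gradient Langevin step on a worker's local data pool.

        The mini-batch gradient is rescaled by n_local / batch_size so that it
        is an unbiased estimate of the full local-likelihood gradient.  How the
        prior and the other workers' shards are accounted for depends on the
        particular distributed scheme; this is a toy sketch only.
        """
        n_local = X_local.shape[0]
        idx = rng.choice(n_local, size=batch_size, replace=False)
        grad = grad_log_prior(theta) + (n_local / batch_size) * sum(
            grad_log_lik_point(theta, X_local[i], y_local[i]) for i in idx)
        return theta + step * grad + np.sqrt(2.0 * step) * rng.normal(size=theta.shape)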
Parallel and Distributed MCMC via Shepherding Distributions
TLDR
A general algorithmic framework for developing easily parallelizable/distributable Markov chain Monte Carlo algorithms, relying on an auxiliary "shepherding distribution" used to control several MCMC chains running in parallel.
Parallelizing MCMC with Random Partition Trees
TLDR
A new EP-MCMC algorithm, PART, is proposed that applies random partition trees to combine the subset posterior draws; the resulting combination is distribution-free, easy to re-sample from, and can adapt to multiple scales.
Asymptotically Exact, Embarrassingly Parallel MCMC
TLDR
This paper presents a parallel Markov chain Monte Carlo (MCMC) algorithm in which subsets of data are processed independently with very little communication, proves that it generates asymptotically exact samples, and empirically demonstrates its ability to parallelize burn-in and sampling in several models.
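For intuition about the combination stage of embarrassingly parallel MCMC, the sketch below uses the simplest parametric rule: fit a Gaussian to each subposterior's draws and multiply the Gaussians. It is only one of several combination rules studied in this line of work and is given purely as an illustration; combine_gaussian is a hypothetical helper name.

    import numpy as np

    def combine_gaussian(subposterior_samples):
        """Combine subposterior sample sets with a product-of-Gaussians rule.

        subposterior_samples: list of (n_i, d) arrays of draws, one array per
        data subset.  Each subposterior is approximated by a Gaussian fitted to
        its draws; their product is again Gaussian, with precision equal to the
        sum of the fitted precisions and a precision-weighted mean.  This is the
        simple parametric combination rule, shown only as an illustration.
        """
        precisions, weighted_means = [], []
        for samples in subposterior_samples:
            mu = samples.mean(axis=0)
            prec = np.linalg.inv(np.cov(samples, rowvar=False))
            precisions.append(prec)
            weighted_means.append(prec @ mu)
        cov = np.linalg.inv(np.sum(precisions, axis=0))
        mean = cov @ np.sum(weighted_means, axis=0)
        return mean, cov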
Efficient MCMC Sampling with Dimension-Free Convergence Rate using ADMM-type Splitting.
TLDR
A detailed theoretical study of a recent alternative class of MCMC schemes, known as the split Gibbs sampler, which exploits a splitting strategy akin to the one used by the celebrated ADMM optimization algorithm.
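A minimal, hedged illustration of the splitting idea: the variable of interest is duplicated, the two copies are tied together by a Gaussian coupling of width rho, and one alternates between the two (now simpler) conditionals. The one-dimensional Gaussian choices of f and g below are illustrative assumptions made so that both conditionals can be sampled exactly; this is not the algorithm analysed in the paper.

    import numpy as np

    def split_gibbs_gaussian(n_iter, rho, rng):
        """Toy split Gibbs sampler for pi(theta) proportional to exp(-f(theta) - g(theta)).

        The augmented target is
            pi_rho(theta, z) ∝ exp(-f(z) - g(theta) - (theta - z)**2 / (2 * rho**2)),
        so the theta-update only involves g and the z-update only involves f.
        Here f(x) = (x - 2)**2 / 2 and g(x) = x**2 / 2, purely so that both
        conditionals are Gaussian and can be sampled exactly.  Illustration only.
        """
        theta, z = 0.0, 0.0
        draws = np.empty(n_iter)
        var = 1.0 / (1.0 + 1.0 / rho**2)   # same precision for both conditionals here
        for t in range(n_iter):
            theta = rng.normal(var * z / rho**2, np.sqrt(var))          # theta | z
            z = rng.normal(var * (2.0 + theta / rho**2), np.sqrt(var))  # z | theta
            draws[t] = theta
        return draws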
Embarrassingly Parallel MCMC using Deep Invertible Transformations
TLDR
This work introduces deep invertible transformations to approximate each of the subposteriors; these approximations can be made accurate even for complex distributions and serve as intermediate representations, keeping the total communication cost limited.
Split-and-Augmented Gibbs Sampler—Application to Large-Scale Inference Problems
TLDR
This paper derives two new optimization-driven Monte Carlo algorithms inspired by variable splitting and data augmentation, which enable faster and more efficient sampling schemes than current state-of-the-art methods and can embed the latter.
Global Consensus Monte Carlo
TLDR
An instrumental hierarchical model associates auxiliary statistical parameters with each likelihood term; these are conditionally independent given the top-level parameters, leading to a distributed MCMC algorithm on an extended state space that yields approximations of posterior expectations.
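To make the communication pattern of such hierarchical, auxiliary-variable constructions concrete, here is a fully Gaussian toy Gibbs sampler in which workers update their block-specific auxiliary variables given the top-level parameter, and the master updates the top-level parameter given the auxiliary variables. All distributions, variances, and names below are illustrative assumptions, not the model of the paper.

    import numpy as np

    def consensus_gibbs(local_obs, rho, prior_var, n_iter, rng):
        """Toy Gibbs sampler with one auxiliary variable per data block.

        Purely illustrative, fully Gaussian instrumental model:
            theta        ~ N(0, prior_var)
            z_b | theta  ~ N(theta, rho**2)   one auxiliary variable per block
            y_b | z_b    ~ N(z_b, 1)          one observation per block (local_obs[b])
        Workers would update their z_b given theta; the master updates theta
        given all the z_b.  Only the z_b values are communicated, not the data.
        """
        b = len(local_obs)
        theta, zs = 0.0, np.zeros(b)
        for _ in range(n_iter):
            # "Workers": z_b | theta, y_b is Gaussian (conjugate update).
            var_z = 1.0 / (1.0 / rho**2 + 1.0)
            for i, y in enumerate(local_obs):
                zs[i] = rng.normal(var_z * (theta / rho**2 + y), np.sqrt(var_z))
            # "Master": theta | z_1, ..., z_b is Gaussian.
            var_t = 1.0 / (1.0 / prior_var + b / rho**2)
            theta = rng.normal(var_t * zs.sum() / rho**2, np.sqrt(var_t))
        return theta, zs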
Communication-Efficient Distributed Statistical Inference
TLDR
CSL provides a communication-efficient surrogate to the global likelihood that can be used for low-dimensional estimation, high-dimensional regularized estimation, and Bayesian inference, and it significantly improves the computational efficiency of Markov chain Monte Carlo algorithms even in a non-distributed setting.
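A hedged sketch of the surrogate-likelihood construction this summary refers to: the central machine keeps its own local log-likelihood exactly and adds a linear correction built from one round of gradients communicated by the other machines, so that the surrogate's gradient matches the global one at a preliminary estimate. Function names are placeholders and the snippet illustrates the general idea only, not the paper's full method.

    import numpy as np

    def make_surrogate_log_lik(log_lik_local, grad_log_lik_local,
                               other_machine_grads, theta_bar):
        """Build a communication-efficient surrogate of the global log-likelihood.

        log_lik_local / grad_log_lik_local: the central machine's own averaged
        log-likelihood and its gradient.  other_machine_grads: the remaining
        machines' averaged log-likelihood gradients, each evaluated once at the
        preliminary estimate theta_bar (one communication round).  Illustrative
        sketch only.
        """
        grads_at_bar = [grad_log_lik_local(theta_bar)] + list(other_machine_grads)
        global_grad = np.mean(grads_at_bar, axis=0)
        # Linear correction so that grad(surrogate)(theta_bar) = global_grad.
        correction = global_grad - grad_log_lik_local(theta_bar)

        def surrogate(theta):
            return log_lik_local(theta) + correction @ (theta - theta_bar)

        return surrogate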
Stochastic Gradient MCMC with Stale Gradients
Stochastic gradient MCMC (SG-MCMC) has played an important role in large-scale Bayesian learning, with well-developed theoretical convergence properties. In such applications of SG-MCMC, it is…