Average of Recentered Parallel MCMC for Big Data
@article{Wu2017AverageOR,
  title={Average of Recentered Parallel MCMC for Big Data},
  author={Changye Wu and Christian P. Robert},
  journal={arXiv: Computation},
  year={2017}
}
In the big data context, traditional MCMC methods such as Metropolis-Hastings algorithms and hybrid Monte Carlo scale poorly because they must evaluate the likelihood over the whole data set at each iteration. To resurrect MCMC methods, numerous approaches have been proposed, falling into two categories: divide-and-conquer and subsampling. In this article, we study parallel MCMC and propose a new combination method within the divide-and-conquer framework. Compared with some parallel…
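The divide-and-conquer pattern the abstract refers to can be illustrated with a minimal consensus-style sketch: split the data into shards, draw from each subposterior independently, and combine the draws by averaging. This is not the paper's exact recentering scheme, only the generic pattern it builds on; the toy conjugate Gaussian model, shard count, and draw count below are illustrative assumptions.

```python
import random, math

random.seed(0)

# Toy model: x_i ~ N(theta, 1) with a flat prior, so the full posterior for
# theta is N(mean(x), 1/n).  Split the data into S shards; each subposterior
# is then N(mean(x_s), 1/n_s) and can be sampled in parallel.
data = [random.gauss(2.0, 1.0) for _ in range(1000)]
S = 4
shards = [data[s::S] for s in range(S)]

def subposterior_draws(shard, m=5000):
    mu = sum(shard) / len(shard)
    sd = 1.0 / math.sqrt(len(shard))
    return [random.gauss(mu, sd) for _ in range(m)]

draws = [subposterior_draws(sh) for sh in shards]

# Consensus-style combination: average the i-th draw across shards
# (equal weights, since the shards here are equal-sized).
combined = [sum(d[i] for d in draws) / S for i in range(5000)]

est = sum(combined) / len(combined)
print(round(est, 2))  # close to the true theta = 2.0
```

In this Gaussian special case the averaged draws have exactly the full-posterior mean and variance; for non-Gaussian subposteriors, plain averaging is only approximate, which is what motivates more careful combination rules such as the recentering studied in the paper.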
3 Citations
Divide and Recombine for Large and Complex Data: Model Likelihood Functions Using MCMC and TRMM Big Data Analysis
- Mathematics
- 2018
An innovative D&R procedure is proposed to compute likelihood functions of data-model (DM) parameters for big data by fitting a density to the MCMC draws from each subset DM likelihood function; the fitted densities are then recombined.
Modeling Network Populations via Graph Distances
- Mathematics, Computer Science, Journal of the American Statistical Association
- 2020
A new class of models for multiple networks to parameterize a distribution on labeled graphs in terms of a Fréchet mean graph and a parameter that controls the concentration of this distribution about its mean is introduced.
A Survey of Bayesian Statistical Approaches for Big Data
- Computer Science, Case Studies in Applied Bayesian Data Science
- 2020
The question of whether focusing only on improving computational algorithms and infrastructure will be enough to face the challenges of Big Data is addressed.
References
SHOWING 1-10 OF 16 REFERENCES
Parallelizing MCMC with Random Partition Trees
- Computer Science, NIPS
- 2015
A new EP-MCMC algorithm PART is proposed that applies random partition trees to combine the subset posterior draws, which is distribution-free, easy to re-sample from and can adapt to multiple scales.
On Markov chain Monte Carlo methods for tall data
- Computer Science, J. Mach. Learn. Res.
- 2017
An original subsampling-based approach is proposed which samples from a distribution provably close to the posterior distribution of interest, yet can require less than $O(n)$ data point likelihood evaluations at each iteration for certain statistical models in favourable scenarios.
Parallelizing MCMC via Weierstrass Sampler
- Computer Science
- 2013
This article proposes a new Weierstrass sampler for parallel MCMC based on independent subsets that approximates the full data posterior samples via combining the posterior draws from independent subset MCMC chains, and thus enjoys a higher computational efficiency.
Asymptotically Exact, Embarrassingly Parallel MCMC
- Computer Science, UAI
- 2014
This paper presents a parallel Markov chain Monte Carlo (MCMC) algorithm in which subsets of data are processed independently, with very little communication, and proves that it generates asymptotically exact samples and empirically demonstrate its ability to parallelize burn-in and sampling in several models.
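The combination step in this line of work multiplies subposterior densities rather than averaging draws. A minimal sketch of the parametric (Gaussian-approximation) variant: fit a Gaussian to each machine's draws, then multiply the Gaussians analytically via precision weighting. The three hypothetical subposterior means and spreads below are illustrative, not taken from the paper.

```python
import random

random.seed(1)

# Hypothetical subposterior draws from 3 machines, each roughly Gaussian.
# In the parametric variant, each subposterior is approximated by a Gaussian
# and the product of the approximations is computed in closed form.
subposts = [[random.gauss(mu, sd) for _ in range(4000)]
            for mu, sd in [(1.9, 0.30), (2.1, 0.25), (2.0, 0.35)]]

def fit(draws):
    """Sample mean and variance of one machine's draws."""
    m = sum(draws) / len(draws)
    v = sum((x - m) ** 2 for x in draws) / (len(draws) - 1)
    return m, v

fits = [fit(d) for d in subposts]

# Product of Gaussians: precisions add, and the combined mean is the
# precision-weighted average of the subposterior means.
prec = sum(1.0 / v for _, v in fits)
mean = sum(m / v for m, v in fits) / prec
```

The combined approximation is tighter than any single subposterior (its precision is the sum of theirs), which is the expected behavior when each machine sees only part of the data.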
Expectation Propagation as a Way of Life
- Computer Science
- 2020
EP is revisited as a prototype for scalable algorithms that partition big datasets into many parts and analyze each part in parallel to perform inference on shared parameters; the approach is shown to be particularly efficient for hierarchical models.
WASP: Scalable Bayes via barycenters of subset posteriors
- Computer Science, AISTATS
- 2015
The Wasserstein posterior (WASP) has an atomic form, facilitating straightforward estimation of posterior summaries of functionals of interest; theoretical justification is provided in terms of posterior consistency and algorithm efficiency.
Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget
- Computer Science, ICML 2014
- 2013
This work introduces an approximate MH rule based on a sequential hypothesis test that allows us to accept or reject samples with high confidence using only a fraction of the data required for the exact MH rule.
Stochastic Gradient Hamiltonian Monte Carlo
- Computer Science, ICML
- 2014
A variant is introduced that uses second-order Langevin dynamics with a friction term that counteracts the effects of the noisy gradient, maintaining the desired target distribution as the invariant distribution.
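The friction idea can be sketched in a few lines for a 1D standard-normal target, U(theta) = theta^2/2, so grad U(theta) = theta. Gaussian noise added to the gradient stands in for minibatch noise; the step size, friction constant, and the omission of the small gradient-noise correction term are simplifying assumptions of this sketch, not settings from the paper.

```python
import random, math

random.seed(2)

eps, C = 0.01, 1.0        # step size and friction (illustrative values)
theta, r = 0.0, 0.0       # position and momentum
samples = []
for t in range(200000):
    # grad U(theta) = theta, plus artificial noise mimicking a minibatch estimate
    noisy_grad = theta + random.gauss(0.0, 1.0)
    # Momentum update: the -eps*C*r friction term dissipates the extra energy
    # injected by the noisy gradient; the injected Gaussian noise balances
    # the friction so the target stays (approximately) invariant.
    r += -eps * noisy_grad - eps * C * r + random.gauss(0.0, math.sqrt(2.0 * eps * C))
    theta += eps * r
    if t > 20000:         # discard burn-in
        samples.append(theta)

m = sum(samples) / len(samples)
v = sum((x - m) ** 2 for x in samples) / len(samples)
```

With friction, the long-run mean and variance of the samples stay close to those of the N(0, 1) target despite the noisy gradients; dropping the friction term would let the chain heat up and overestimate the variance.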
Patterns of Scalable Bayesian Inference
- Computer Science, Found. Trends Mach. Learn.
- 2016
This paper seeks to identify unifying principles, patterns, and intuitions for scaling Bayesian inference by reviewing existing work on utilizing modern computing resources with both MCMC and variational approximation techniques.
Scalable and Robust Bayesian Inference via the Median Posterior
- Computer Science, ICML
- 2014
This work proposes a novel general approach to Bayesian inference that is scalable and robust to corruption in the data, based on the idea of splitting the data into several non-overlapping subgroups, evaluating the posterior distribution given each independent subgroup, and then combining the results.