Corpus ID: 232092991

Challenges and Opportunities in High-dimensional Variational Inference

@inproceedings{Dhaka2021ChallengesAO,
  title={Challenges and Opportunities in High-dimensional Variational Inference},
  author={Akash Kumar Dhaka and Alejandro Catalina and Manushi K. V. Welandawe and Michael Riis Andersen and Jonathan Huggins and Aki Vehtari},
  booktitle={NeurIPS},
  year={2021}
}
Current black-box variational inference (BBVI) methods require the user to make numerous design choices—such as the selection of variational objective and approximating family—yet there is little principled guidance on how to do so. We develop a conceptual framework and set of experimental tools to understand the effects of these choices, which we leverage to propose best practices for maximizing posterior approximation accuracy. Our approach is based on studying the pre-asymptotic tail… 
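
As a minimal sketch of the design choices the abstract refers to (not the paper's own code), the Python snippet below fits a mean-field Gaussian approximation to a toy two-dimensional Gaussian target by stochastic gradient ascent on a reparameterized ELBO estimate; the target precision matrix, step size, and Monte Carlo sample counts are illustrative assumptions.

# Minimal black-box VI sketch: design choice 1 is the approximating family
# (mean-field Gaussian), design choice 2 is the objective (ELBO estimated with
# reparameterized draws). Target, step size, and sample counts are illustrative
# assumptions, not the paper's configuration.
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[2.0, 0.9],          # toy target: zero-mean Gaussian
              [0.9, 1.0]])         # with precision matrix A
def grad_log_p(z):
    return -A @ z

mu, log_sigma = np.zeros(2), np.zeros(2)   # mean-field Gaussian parameters
step, n_iters, n_mc = 0.05, 2000, 8

for _ in range(n_iters):
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal((n_mc, 2))
    z = mu + sigma * eps                            # reparameterization trick
    g = np.array([grad_log_p(zi) for zi in z])
    mu += step * g.mean(axis=0)                     # pathwise gradient w.r.t. mu
    log_sigma += step * ((g * eps * sigma).mean(axis=0) + 1.0)  # + entropy gradient

print("variational mean:", mu)
print("variational stds:", np.exp(log_sigma))
print("true marginal stds:", np.sqrt(np.diag(np.linalg.inv(A))))  # mean-field VI underestimates these
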

Citations

Mixture weights optimisation for Alpha-Divergence Variational Inference
TLDR
The link between Power Descent and Entropic Mirror Descent is investigated, and first-order approximations allow the authors to introduce the Rényi Descent, a novel algorithm for which they prove an O(1/N) convergence rate.
Bernstein Flows for Flexible Posteriors in Variational Bayes
TLDR
This paper presents Bernstein flow variational inference (BF-VI), a robust and easy-to-use method feasible enough to approximate complex multivariate posteriors, and shows for low-dimensional models that BF-VI accurately approximates the true posterior; in higher-dimensional models, BF-VI outperforms other VI methods.
Pathfinder: Parallel quasi-Newton variational inference
TLDR
Evaluating Pathfinder on a wide range of posterior distributions shows that its approximate draws are better than those from automatic differentiation variational inference (ADVI) and comparable to those produced by short chains of dynamic Hamiltonian Monte Carlo (HMC), as measured by 1-Wasserstein distance.
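
For reference, the 1-Wasserstein comparison mentioned in this summary reduces, for a single marginal and two equal-sized sets of draws, to the mean absolute difference between sorted samples; the sketch below is a generic illustration of that computation (the toy draws are assumptions), not Pathfinder's evaluation code.

# 1-D 1-Wasserstein distance between two equal-sized samples: after sorting,
# it is the mean absolute difference between matched order statistics.
import numpy as np

def wasserstein1_1d(x, y):
    x, y = np.sort(np.asarray(x)), np.sort(np.asarray(y))
    assert x.shape == y.shape, "this simple form assumes equal sample sizes"
    return np.mean(np.abs(x - y))

rng = np.random.default_rng(1)
approx_draws = rng.normal(0.1, 1.2, size=4000)     # toy stand-in for VI draws
reference_draws = rng.normal(0.0, 1.0, size=4000)  # toy stand-in for long-run HMC draws
print("W1 distance:", wasserstein1_1d(approx_draws, reference_draws))
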
Gradients should stay on Path: Better Estimators of the Reverse- and Forward KL Divergence for Normalizing Flows
We propose an algorithm to estimate the path-gradient of both the reverse and forward Kullback–Leibler divergence for an arbitrary manifestly invertible normalizing flow. The resulting path-gradient…
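
As background on the two objectives (this is not the path-gradient estimator proposed in that paper), the sketch below contrasts a plain Monte Carlo estimate of the reverse KL with a self-normalized importance-sampling estimate of the forward KL, both computed from draws of q; the one-dimensional Gaussians standing in for the flow q and the target p are assumptions.

# Reverse KL(q||p) by plain Monte Carlo under q, and forward KL(p||q) by
# self-normalized importance sampling, both using the same draws from q.
import numpy as np

rng = np.random.default_rng(2)
mu_q, sd_q = 0.5, 0.8                               # illustrative q parameters

def log_q(z):
    return -0.5 * ((z - mu_q) / sd_q) ** 2 - np.log(sd_q) - 0.5 * np.log(2 * np.pi)

def log_p(z):                                       # standard normal target
    return -0.5 * z ** 2 - 0.5 * np.log(2 * np.pi)

z = mu_q + sd_q * rng.standard_normal(20000)        # z ~ q
log_w = log_p(z) - log_q(z)                         # log importance weights

reverse_kl = np.mean(-log_w)                        # E_q[log q - log p]
w = np.exp(log_w - log_w.max()); w /= w.sum()       # self-normalized weights
forward_kl = np.sum(w * log_w)                      # approximates E_p[log p - log q]
print("reverse KL:", round(reverse_kl, 3), "forward KL:", round(forward_kl, 3))
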
Markov Chain Score Ascent: A Unifying Framework of Variational Inference with Markovian Gradients
TLDR
This paper provides the first non-asymptotic convergence analysis of Markov chain score ascent methods by establishing their mixing rate and gradient variance, and proposes a novel MCSA scheme, parallel MCSA (pMCSA), that achieves a tighter bound on the gradient variance.
Robust, Automated, and Accurate Black-box Variational Inference
TLDR
This paper proposes Robust, Automated, and Accurate BBVI, a framework for reliable BBVI optimization that is based on rigorously justified automation techniques, includes just a small number of intuitive tuning parameters, and detects inaccurate estimates of the optimal variational approximation.
Distilling importance sampling
TLDR
A novel approach combining features of both sampling and optimisation is proposed, which uses a flexible parameterised family of densities, such as a normalising flow, to produce a weighted sample from a more accurate posterior approximation.

References

SHOWING 1-10 OF 40 REFERENCES
Variational Inference with Normalizing Flows
TLDR
It is demonstrated that the theoretical advantages of having posteriors that better match the true posterior, combined with the scalability of amortized variational approaches, provide a clear improvement in the performance and applicability of variational inference.
Robust, Accurate Stochastic Optimization for Variational Inference
TLDR
A more robust and accurate stochastic optimization framework is developed by viewing the underlying optimization algorithm as producing a Markov chain, which includes a diagnostic for convergence and a novel stopping rule, both of which are robust to noisy evaluations of the objective function.
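
As a rough illustration of that Markov-chain view (not the paper's exact diagnostic or stopping rule), the sketch below applies the standard split-R-hat statistic to the tail of a noisy optimization trace; values near 1 suggest the iterates have settled into a stationary region. The synthetic trace and the choice of four splits are assumptions.

# Treat the tail of a stochastic-optimization trace as a Markov chain and
# compute split-R-hat over it; values near 1 indicate apparent stationarity.
import numpy as np

def split_rhat(trace, n_splits=4):
    n = (len(trace) // n_splits) * n_splits
    chains = np.asarray(trace[-n:]).reshape(n_splits, -1)     # consecutive segments
    n_per = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()
    between = n_per * chains.mean(axis=1).var(ddof=1)
    var_hat = (n_per - 1) / n_per * within + between / n_per
    return np.sqrt(var_hat / within)

rng = np.random.default_rng(3)
trace = 2.0 * np.exp(-np.arange(2000) / 200) + 0.05 * rng.standard_normal(2000)  # synthetic trace for illustration
print("split R-hat over the last half of the trace:", split_rhat(trace[1000:]))
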
Advances in Black-Box VI: Normalizing Flows, Importance Weighting, and Optimization
TLDR
This paper postulates that black-box VI is best addressed through a careful combination of numerous algorithmic components, and evaluates components relating to optimization, flows, and Monte-Carlo methods on a benchmark of 30 models from the Stan model library.
Variational Inference with Tail-adaptive f-Divergence
TLDR
A new class of tail-adaptive f-divergences is proposed that adaptively changes the convex function f with the tail of the importance weights, in a way that theoretically guarantees finite moments while simultaneously achieving mass-covering properties.
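
The tail-adaptive construction itself is not reproduced here; the sketch below shows a simpler, related check in a similar spirit: fit a generalized Pareto distribution to the largest importance weights and inspect the estimated shape k, where large values (roughly k above 0.7) signal heavy-tailed weights and unreliable importance-weighted estimates. The densities, threshold fraction, and use of scipy.stats.genpareto.fit are illustrative assumptions.

# Fit a generalized Pareto distribution to the largest importance weights and
# report the shape estimate; a narrow q against a wider p gives heavy tails.
# Densities and threshold fraction below are toy choices for illustration.
import numpy as np
from scipy.stats import genpareto

def weight_tail_shape(log_p, log_q, z, tail_frac=0.2):
    log_w = log_p(z) - log_q(z)
    w = np.exp(log_w - log_w.max())               # stabilized importance weights
    threshold = np.quantile(w, 1.0 - tail_frac)   # keep the largest weights
    exceedances = w[w > threshold] - threshold
    k_hat, _, _ = genpareto.fit(exceedances, floc=0)
    return k_hat

rng = np.random.default_rng(4)
z = rng.normal(0.0, 0.7, size=5000)                        # draws from a narrow q
log_q = lambda x: -0.5 * (x / 0.7) ** 2 - np.log(0.7)
log_p = lambda x: -0.5 * x ** 2                            # wider target
print("estimated tail shape k:", weight_tail_shape(log_p, log_q, z))
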
Perturbative Black Box Variational Inference
TLDR
This paper views BBVI with generalized divergences as a form of estimating the marginal likelihood via biased importance sampling, and builds a family of new variational bounds that captures the standard KL bound for $K=1$ and converges to the exact marginal likelihood as $K \to \infty$.
Validated Variational Inference via Practical Posterior Error Bounds
TLDR
This paper provides rigorous bounds on the error of posterior mean and uncertainty estimates that arise from full-distribution approximations, as in variational inference.
Variational Inference: A Review for Statisticians
TLDR
Variational inference (VI), a method from machine learning that approximates probability densities through optimization, is reviewed and a variant that uses stochastic optimization to scale up to massive data is derived.
Empirical Evaluation of Biased Methods for Alpha Divergence Minimization
TLDR
Biased methods for alpha-divergence minimization are evaluated empirically, showing that weight degeneracy does indeed occur with these estimators in cases where they return highly biased solutions, and relating these results to the curse of dimensionality.
Markovian Score Climbing: Variational Inference with KL(p||q)
TLDR
This paper develops Markovian score climbing, a simple algorithm that melds VI and MCMC to reliably minimize the inclusive KL, and demonstrates its utility on Bayesian probit regression for classification as well as a stochastic volatility model for financial data.
Gaussian Variational Approximation With a Factor Covariance Structure
TLDR
General stochastic gradient ascent methods are described for efficient implementation, with gradient estimates obtained using the so-called “reparameterization trick”; the end result is a flexible and efficient approach to high-dimensional Gaussian variational approximation.
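
The factor structure summarized above corresponds to a covariance of the form Sigma = B B^T + diag(d^2) with a tall, skinny loading matrix B; the sketch below shows only the matching reparameterized sampling step (dimensions and parameter values are placeholders), not the full gradient-ascent algorithm.

# Reparameterized draws from a Gaussian with factor covariance
# Sigma = B @ B.T + diag(d**2): z = mu + B @ eps_f + d * eps_d.
import numpy as np

rng = np.random.default_rng(5)
dim, n_factors = 50, 3                             # placeholder dimensions
mu = np.zeros(dim)
B = 0.1 * rng.standard_normal((dim, n_factors))    # placeholder factor loadings
d = np.full(dim, 0.5)                              # placeholder diagonal std devs

def sample_q(n_draws):
    eps_f = rng.standard_normal((n_draws, n_factors))
    eps_d = rng.standard_normal((n_draws, dim))
    return mu + eps_f @ B.T + eps_d * d

draws = sample_q(10000)
print("empirical vs analytic variance of coordinate 0:",
      round(draws[:, 0].var(), 4), round((B @ B.T + np.diag(d ** 2))[0, 0], 4))
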
...