Corpus ID: 1898841

An Adaptive Learning Rate for Stochastic Variational Inference

@inproceedings{Ranganath2013AnAL,
  title={An Adaptive Learning Rate for Stochastic Variational Inference},
  author={Rajesh Ranganath and Chong Wang and David M. Blei and Eric P. Xing},
  booktitle={ICML},
  year={2013}
}
Stochastic variational inference finds good posterior approximations of probabilistic models with very large data sets. It optimizes the variational objective with stochastic optimization, following noisy estimates of the natural gradient. Operationally, stochastic inference iteratively subsamples from the data, analyzes the subsample, and updates parameters with a decreasing learning rate. However, the algorithm is sensitive to that rate, which usually requires hand-tuning to each application… 
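
As a rough illustration of the procedure the abstract describes, the sketch below (Python) shows a generic SVI loop with a Robbins-Monro decreasing step size rho_t = (tau0 + t)^(-kappa). The callable compute_lambda_hat, the schedule, and all parameter names are illustrative assumptions; this is not the paper's adaptive learning rate, only the baseline update it seeks to improve.

import numpy as np

def svi(data, lam_init, compute_lambda_hat, n_iters=1000, tau0=1.0, kappa=0.7, seed=0):
    """Generic SVI loop with a decreasing Robbins-Monro step size (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    lam = np.array(lam_init, dtype=float)        # global variational parameters
    n_total = len(data)
    for t in range(1, n_iters + 1):
        x = data[rng.integers(n_total)]          # subsample a single data point
        # Intermediate global parameters: what lam would be if the whole data set
        # looked like this subsample (model-specific; supplied by the caller).
        lam_hat = compute_lambda_hat(x, lam, n_total)
        rho = (tau0 + t) ** (-kappa)             # decreasing learning rate
        lam = (1.0 - rho) * lam + rho * lam_hat  # noisy natural-gradient step
    return lam

For conjugate exponential-family models, the convex-combination update above is exactly a step along a noisy estimate of the natural gradient, which is why the choice of the decreasing schedule rho_t has such a strong effect on the quality of the approximation.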

Citations

Tuning the Learning Rate for Stochastic Variational Inference
TLDR
A novel algorithm is developed, which tunes the learning rate of each iteration adaptively and performs better and converges faster than commonly used learning rates.
Deterministic Annealing for Stochastic Variational Inference
TLDR
Deterministic annealing for SVI is introduced: a temperature parameter deterministically deforms the objective and is then reduced over the course of the optimization.
Stochastic variational inference
TLDR
Stochastic variational inference lets us apply complex Bayesian models to massive data sets, and it is shown that the Bayesian nonparametric topic model outperforms its parametric counterpart.
Variance Reduction for Stochastic Gradient Optimization
TLDR
This paper demonstrates how to construct control variates for two practical problems solved with stochastic gradient optimization: one convex (MAP estimation for logistic regression) and one non-convex (stochastic variational inference for latent Dirichlet allocation); a brief illustrative sketch of the control-variate idea follows this list.
Incremental Variational Inference for Latent Dirichlet Allocation
TLDR
A stochastic approximation of incremental variational inference is introduced which extends to the asynchronous distributed setting; the resulting distributed algorithm achieves performance comparable to single-host incremental variational inference, but with a significant speed-up.
Multicanonical Stochastic Variational Inference
TLDR
Compared to the traditional SVI algorithm, both approaches find improved predictive likelihoods on held-out data, with MVI being close to the best-tuned annealing schedule.
Memoized Online Variational Inference for Dirichlet Process Mixture Models
TLDR
A new algorithm, memoized online variational inference, is presented; it scales to very large (yet finite) datasets while avoiding the complexities of stochastic gradients, requiring some additional memory but still scaling to millions of examples.
A trust-region method for stochastic variational inference with applications to streaming data
TLDR
This work replaces the natural gradient step of stochastic variational inference with a trust-region update, and shows that this leads to generally better results and reduced sensitivity to hyperparameters.
Accelerating Stochastic Probabilistic Inference
TLDR
This paper derives the Hessian matrix of the variational objective and devises two numerical schemes to implement second-order SVI efficiently, bridging the gap between second-order methods and stochastic variational inference.
A Filtering Approach to Stochastic Variational Inference
TLDR
An alternative perspective on SVI as approximate parallel coordinate ascent is presented, along with a model that automates this process and outperforms the original SVI schedule and a state-of-the-art adaptive SVI algorithm in two diverse domains.
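
As referenced in the variance-reduction entry above, here is a minimal, self-contained sketch (Python, with synthetic data rather than the cited paper's construction) of the control-variate idea: a correlated quantity h with known mean is subtracted from a noisy estimate g using the variance-minimizing coefficient a* = Cov(g, h) / Var(h), which leaves the mean unchanged while shrinking the variance.

import numpy as np

# Synthetic example (assumed data, not from the cited paper): g plays the role of a
# noisy scalar gradient estimate, h is a correlated control variate with known mean 0.
rng = np.random.default_rng(0)
h = rng.normal(size=10_000)                          # control variate, E[h] = 0
g = 2.0 + 0.8 * h + 0.3 * rng.normal(size=10_000)    # noisy estimates of a quantity with mean 2.0

a_star = np.cov(g, h, bias=True)[0, 1] / np.var(h)   # variance-minimizing coefficient
g_cv = g - a_star * (h - 0.0)                        # corrected estimator, same mean

print(g.mean(), g_cv.mean())   # both are close to 2.0, so the correction stays unbiased
print(g.var(), g_cv.var())     # the corrected estimator has substantially lower variance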

References

SHOWING 1-10 OF 28 REFERENCES
Stochastic variational inference
TLDR
Stochastic variational inference lets us apply complex Bayesian models to massive data sets, and it is shown that the Bayesian nonparametric topic model outperforms its parametric counterpart.
On Smoothing and Inference for Topic Models
TLDR
Using the insights gained from this comparative study, it is shown how accurate topic models can be learned in several seconds on text corpora with thousands of documents.
Online Variational Inference for the Hierarchical Dirichlet Process
TLDR
This work proposes an online variational inference algorithm for the HDP, an algorithm that is easily applicable to massive and streaming data, and lets us analyze much larger data sets.
Online Learning for Latent Dirichlet Allocation
TLDR
An online variational Bayes (VB) algorithm for Latent Dirichlet Allocation (LDA) based on online stochastic optimization with a natural gradient step is developed, and it is shown to converge to a local optimum of the VB objective function.
Practical Bayesian Optimization of Machine Learning Algorithms
TLDR
This work describes new algorithms that take into account the variable cost of learning-algorithm experiments and can leverage multiple cores for parallel experimentation; these algorithms are shown to improve on previous automatic procedures and to reach or surpass human expert-level optimization for many algorithms.
No more pesky learning rates
TLDR
The proposed method automatically adjusts multiple learning rates so as to minimize the expected error at any one time; it relies on local gradient variations across samples, making it suitable for non-stationary problems.
On Bayesian Learning and Stochastic Approximation
  • Y. Chien, K. Fu, IEEE Trans. Syst. Sci. Cybern., 1967
TLDR
The results of this work may provide an alternative approach to the study of learning theory and suggest a different mathematical basis for the analysis and synthesis of learning systems in pattern recognition, automatic control, and statistical communications.
The Discrete Infinite Logistic Normal Distribution
TLDR
A stochastic variational inference algorithm for DILN is developed and compared with similar algorithms for HDP and latent Dirichlet allocation on a collection of 350,000 articles from Nature.
Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
TLDR
This paper reviews the literature on deterministic and stochastic stepsize rules, derives formulas for optimal stepsizes that minimize estimation error, and proposes an approximation for the case where the parameters are unknown.
Robust Stochastic Approximation Approach to Stochastic Programming
TLDR
It is intended to demonstrate that a properly modified SA approach can be competitive with and even significantly outperform the SAA method for a certain class of convex stochastic problems.