Stochastic Variational Inference


We derive a stochastic optimization algorithm for mean-field variational inference, which we call online variational inference. Our algorithm approximates the posterior distribution of a probabilistic model with hidden variables, and it can handle large (or even streaming) data sets of observations.

Let $x = x_{1:n}$ be $n$ observations, $\beta$ be global hidden variables, and $z = z_{1:n}$ be $n$ local hidden variables. We assume that the joint distribution of the hidden variables and observations is

$$p(\beta, z_{1:n}, x_{1:n}) = p(\beta \mid \alpha) \prod_{i=1}^{n} p(z_i \mid \beta)\, p(x_i \mid z_i, \beta), \tag{1}$$

where $\alpha$ are fixed hyperparameters. In this model, the global variables $\beta$ can govern the distributions of any of the other variables. The local variables $z_i$ govern only the distributions of their respective observations $x_i$. Figure 1 illustrates this model. Our goal is to approximate the posterior $p(\beta, z \mid x)$.

The distinction between local and global variables is what allows us to develop online inference. In Bayesian statistics, for example, think of $\beta$ as parameters with a prior and $z_{1:n}$ as hidden variables that are individual to each observation. In a Bayesian mixture of Gaussians, the global variables $\beta$ are the mixture components and mixture proportions; the local variables $z_i$ are the mixture assignments for each data point.

We make the assumption of conditional conjugacy, which means that the model satisfies two properties. The first property is that each hidden variable's factor in Equation 1 is in an exponential family,

$$p(\beta \mid \alpha) = \exp\{\alpha^\top t(\beta) - a(\alpha)\} \tag{2}$$

$$p(z_i \mid \beta) = \exp\{\beta^\top t(z_i) - a(\beta)\},$$
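The global/local structure of the joint distribution in Equation 1 can be made concrete by ancestral sampling from a toy Bayesian mixture of Gaussians. The following is a minimal sketch and not the inference algorithm itself: the function name `sample_mixture`, the Dirichlet prior on the proportions, the Gaussian prior on the component means, the unit observation variance, and every default value are assumptions chosen only for illustration.

```python
import numpy as np


def sample_mixture(n, k=3, alpha=1.0, prior_scale=5.0, seed=0):
    """Draw (beta, z, x) from p(beta | alpha) * prod_i p(z_i | beta) p(x_i | z_i, beta).

    Hypothetical toy model; all distributional choices are assumptions:
      global beta = (mixture proportions, component means),
      local z_i   = component assignment of observation i.
    """
    rng = np.random.default_rng(seed)

    # Global hidden variables beta: drawn once, shared by every observation.
    proportions = rng.dirichlet(alpha * np.ones(k))
    means = rng.normal(0.0, prior_scale, size=k)

    # Local hidden variables z_i: one assignment per observation, given beta.
    z = rng.choice(k, size=n, p=proportions)

    # Observations x_i: each depends only on its own z_i and on beta.
    x = rng.normal(means[z], 1.0)
    return proportions, means, z, x


proportions, means, z, x = sample_mixture(n=1000)
print(proportions.shape, means.shape, z.shape, x.shape)
```

The global variables are sampled once, while the local variables and observations are sampled per data point. This is the separation that lets an algorithm process observations one at a time or in small batches.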
