Stan: A Probabilistic Programming Language
TLDR
Stan is a probabilistic programming language for specifying statistical models that provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling.
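As a concrete illustration, here is a minimal sketch (not from the paper) of a Bernoulli model written in the Stan modeling language and sampled from Python via CmdStanPy; the file name and data values are illustrative assumptions, and a working CmdStan installation is required.

```python
# A minimal sketch, assuming CmdStanPy is installed; model and data are illustrative.
from cmdstanpy import CmdStanModel

stan_program = """
data {
  int<lower=0> N;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real<lower=0, upper=1> theta;
}
model {
  theta ~ beta(1, 1);      // uniform prior
  y ~ bernoulli(theta);    // likelihood
}
"""

with open("bernoulli.stan", "w") as f:
    f.write(stan_program)

model = CmdStanModel(stan_file="bernoulli.stan")
# NUTS/HMC sampling is Stan's default inference engine.
fit = model.sample(data={"N": 10, "y": [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]})
print(fit.summary())
```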
Stochastic variational inference
TLDR
Stochastic variational inference makes it possible to apply complex Bayesian models to massive data sets; the Bayesian nonparametric topic model is shown to outperform its parametric counterpart.
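The core update is simple enough to sketch. Below is a schematic Python loop (not the paper's code) for the generic stochastic variational inference step; `local_step` and `intermediate_global` are hypothetical model-specific hooks.

```python
import numpy as np

# Schematic sketch of stochastic variational inference: at step t, sample one
# data point, form the intermediate global parameter lambda_hat as if that
# point were replicated N times, then take a natural-gradient step with a
# decaying Robbins-Monro rate. The two callables are hypothetical hooks.
def svi(data, lam, local_step, intermediate_global, tau0=1.0, kappa=0.7, iters=1000):
    N = len(data)
    rng = np.random.default_rng(0)
    for t in range(1, iters + 1):
        x = data[rng.integers(N)]                  # sample a data point
        phi = local_step(x, lam)                   # optimize local variational params
        lam_hat = intermediate_global(x, phi, N)   # pretend x was seen N times
        rho = (tau0 + t) ** (-kappa)               # step size, kappa in (0.5, 1]
        lam = (1.0 - rho) * lam + rho * lam_hat    # natural-gradient update
    return lam
```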
The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo
TLDR
The No-U-Turn Sampler (NUTS), an extension of HMC that eliminates the need to set the number of steps L, is presented, along with a method for adapting the step size parameter ε on the fly based on primal-dual averaging.
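The stopping rule that gives the sampler its name can be sketched compactly; the following Python snippet is a simplification (the full algorithm builds a balanced binary tree of leapfrog steps) that checks whether a trajectory has begun to double back on itself.

```python
import numpy as np

# Simplified "no-U-turn" criterion: doubling stops when the momentum at either
# end of the trajectory points away from the displacement between the ends.
def no_u_turn(theta_minus, theta_plus, r_minus, r_plus):
    delta = theta_plus - theta_minus
    return (np.dot(delta, r_minus) >= 0) and (np.dot(delta, r_plus) >= 0)
```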
Online Learning for Latent Dirichlet Allocation
TLDR
An online variational Bayes (VB) algorithm for Latent Dirichlet Allocation (LDA), based on online stochastic optimization with a natural gradient step, is developed and shown to converge to a local optimum of the VB objective function.
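This online VB algorithm is what gensim's LdaModel implements, so a brief usage sketch is possible; the toy corpus below is illustrative, and `decay` and `offset` play the roles of κ and τ0 in the paper's step size ρ_t = (τ0 + t)^(−κ).

```python
# A minimal usage sketch, assuming gensim; corpus and parameter values are illustrative.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

docs = [["topic", "model", "inference"],
        ["gradient", "online", "update"],
        ["topic", "online", "inference"]]
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]

# chunksize=1 makes each document a minibatch for the online updates.
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               decay=0.5, offset=1.0, chunksize=1)
print(lda.print_topics())
```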
Variational Autoencoders for Collaborative Filtering
TLDR
A generative model with a multinomial likelihood, using Bayesian inference for parameter estimation, is introduced; the pros and cons of employing a principled Bayesian inference approach are identified, and the settings where it provides the most significant improvements are characterized.
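The multinomial likelihood at the heart of the model is easy to sketch: the decoder maps a latent code to logits over the item catalog, and a user's click vector is scored against the resulting log-softmax. The snippet below is a minimal illustration with made-up values.

```python
import numpy as np

# Multinomial log-likelihood of a click vector x given decoder logits:
# log p(x | z) = sum_i x_i * log softmax(logits)_i
def multinomial_log_likelihood(x, logits):
    log_probs = logits - np.logaddexp.reduce(logits)  # log softmax
    return float(np.dot(x, log_probs))

x = np.array([0., 1., 0., 1., 0.])          # user clicked items 1 and 3
logits = np.array([0.1, 2.0, -1.0, 1.5, 0.0])
print(multinomial_log_likelihood(x, logits))
```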
Stochastic Gradient Descent as Approximate Bayesian Inference
TLDR
It is demonstrated that constant SGD gives rise to a new variational EM algorithm that optimizes hyperparameters in complex probabilistic models, and a scalable approximate MCMC algorithm, the Averaged Stochastic Gradient Sampler, is proposed.
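The underlying intuition is that constant-rate SGD with noisy gradients does not converge to a point but to a stationary distribution around the optimum. The toy sketch below (not from the paper) shows this for a one-dimensional quadratic loss.

```python
import numpy as np

# Constant-step-size SGD as an approximate sampler: with noisy gradients of
# the quadratic loss 0.5 * theta^2, the iterates reach a stationary
# distribution whose spread depends on the learning rate.
rng = np.random.default_rng(0)
theta, lr, noise = 0.0, 0.1, 1.0
samples = []
for t in range(5000):
    grad = theta + noise * rng.standard_normal()   # noisy gradient
    theta -= lr * grad                             # constant-rate SGD step
    if t > 500:                                    # discard burn-in
        samples.append(theta)
print(np.mean(samples), np.std(samples))           # spread ~ noise * sqrt(lr / (2 - lr))
```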
Learning Activation Functions to Improve Deep Neural Networks
TLDR
A novel form of piecewise linear activation function that is learned independently for each neuron using gradient descent is designed, achieving state-of-the-art performance on CIFAR-10, CIFAR-100, and a benchmark from high-energy physics involving Higgs boson decay modes.
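The adaptive piecewise linear (APL) unit has a closed form, h(x) = max(0, x) + Σ_s a_s · max(0, −x + b_s), with a_s and b_s learned per neuron. A minimal numpy sketch with fixed illustrative parameters:

```python
import numpy as np

# APL unit: a ReLU plus a learned sum of hinges; a and b would be trained by
# gradient descent, but are fixed here for illustration.
def apl(x, a, b):
    hinge = np.maximum(0.0, -x[..., None] + b)   # one hinge per (a_s, b_s) pair
    return np.maximum(0.0, x) + (a * hinge).sum(axis=-1)

x = np.linspace(-3, 3, 7)
a = np.array([0.2, -0.5])   # slopes (would be learned)
b = np.array([1.0, -1.0])   # breakpoints (would be learned)
print(apl(x, a, b))
```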
Sparse stochastic inference for latent Dirichlet allocation
TLDR
A hybrid algorithm for Bayesian topic models that combines the efficiency of sparse Gibbs sampling with the scalability of online stochastic inference is presented; it reduces the bias of variational inference and generalizes to many Bayesian hidden-variable models.
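Schematically, the hybrid replaces the variational local step with Gibbs sampling of topic assignments within each minibatch, then feeds the resulting sufficient statistics into the usual stochastic update of the global parameters. The sketch below uses hypothetical helpers (`gibbs_sample`, `suff_stats`) and omits the prior term for brevity.

```python
# Schematic hybrid step: Gibbs-sample local assignments, rescale the minibatch
# statistics to corpus scale, then take a decaying stochastic update of the
# global topic parameters lambda. Helpers are hypothetical placeholders.
def hybrid_step(minibatch, lam, t, N, gibbs_sample, suff_stats, tau0=1.0, kappa=0.7):
    stats = sum(suff_stats(gibbs_sample(doc, lam)) for doc in minibatch)
    lam_hat = stats * (N / len(minibatch))   # as if the minibatch were the corpus
    rho = (tau0 + t) ** (-kappa)             # decaying step size
    return (1.0 - rho) * lam + rho * lam_hat
```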
Music Transformer: Generating Music with Long-Term Structure
TLDR
It is demonstrated that a Transformer with the modified relative attention mechanism can generate minute-long compositions with compelling structure, generate continuations that coherently elaborate on a given motif, and in a seq2seq setup generate accompaniments conditioned on melodies.
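The key implementation detail is the memory-efficient "skewing" trick, which turns the product of queries with relative-position embeddings into the matrix of relative attention logits without materializing a large intermediate tensor. A numpy sketch (shapes illustrative):

```python
import numpy as np

# Skewing trick: given qe[i, r] = q_i . e_r, where column r indexes relative
# distance r - (L - 1), shift rows so that entry [i, j] uses distance j - i
# (valid for j <= i; future positions are removed by the causal mask).
def skew(qe):
    L = qe.shape[0]
    padded = np.pad(qe, [(0, 0), (1, 0)])    # prepend a dummy column
    return padded.reshape(L + 1, L)[1:, :]   # (L, L) relative logits

q = np.random.default_rng(0).standard_normal((4, 8))
er = np.random.default_rng(1).standard_normal((4, 8))  # one embedding per distance
rel_logits = skew(q @ er.T)   # added to Q @ K^T before the masked softmax
print(rel_logits.shape)       # (4, 4)
```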
Nonparametric variational inference
TLDR
The efficacy of the nonparametric approximation is demonstrated with a hierarchical logistic regression model and a nonlinear matrix factorization model, obtaining predictive performance as good as or better than more specialized variational methods and MCMC approximations.
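The variational family here is a uniform mixture of Gaussians, which can capture multimodal posteriors that a single Gaussian cannot; evaluating its log density is straightforward (the paper's contribution lies in bounding the entropy term of the ELBO). A one-dimensional sketch with illustrative values:

```python
import numpy as np

# Log density of a uniform mixture of N Gaussians with shared scale sigma:
# log( (1/N) * sum_n Normal(x; mu_n, sigma^2) )
def mixture_log_density(x, means, sigma):
    sq = -0.5 * ((x - means) / sigma) ** 2
    log_norm = -0.5 * np.log(2 * np.pi * sigma ** 2)
    return np.logaddexp.reduce(sq + log_norm) - np.log(len(means))

means = np.array([-2.0, 0.0, 2.0])   # illustrative component means
print(mixture_log_density(0.5, means, sigma=0.5))
```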