• Corpus ID: 239998630

CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator

  • Aleksandar Dimitriev, Mingyuan Zhou
  • Published 26 October 2021
  • Computer Science, Mathematics
  • ArXiv
Accurately backpropagating the gradient through categorical variables is a challenging task that arises in various domains, such as training discrete latent variable models. To this end, we propose CARMS, an unbiased estimator for categorical random variables based on multiple mutually negatively correlated (jointly antithetic) samples. CARMS combines REINFORCE with copula-based sampling to avoid duplicate samples and reduce its variance, while keeping the estimator unbiased using importance…
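
As a point of reference for the REINFORCE component mentioned in the abstract, the following is a minimal sketch of a multi-sample REINFORCE estimator with a leave-one-out baseline for a single categorical variable. It draws independent samples, so it is not CARMS itself: the copula-based antithetic sampling and the importance weighting are not reproduced here. The function names, the logits parameterization, and the objective `f` are assumptions made for illustration.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def loo_reinforce_grad(logits, f, num_samples, rng):
    """Estimate d/d(logits) of E[f(z)], z ~ Categorical(softmax(logits)).

    Uses independent samples (assumption: NOT the antithetic sampling of
    CARMS) and a leave-one-out baseline; requires num_samples >= 2.
    """
    probs = softmax(logits)
    num_categories = probs.shape[0]
    samples = rng.choice(num_categories, size=num_samples, p=probs)
    values = np.array([f(z) for z in samples], dtype=float)
    # Baseline for sample i: mean of f over the other samples.
    baselines = (values.sum() - values) / (num_samples - 1)
    # Score function: d log p(z) / d logits = onehot(z) - probs.
    scores = np.eye(num_categories)[samples] - probs
    return ((values - baselines)[:, None] * scores).mean(axis=0)


def exact_grad(logits, f):
    """Exact gradient by enumerating all categories, for comparison."""
    probs = softmax(logits)
    values = np.array([f(z) for z in range(probs.shape[0])], dtype=float)
    # d/d logits_j sum_k p_k f_k = p_j * (f_j - sum_k p_k f_k)
    return probs * (values - probs @ values)
```

Averaging many calls to `loo_reinforce_grad` should approach `exact_grad` for the same logits, since each baseline is independent of the sample it is subtracted from. Per the abstract, CARMS instead uses jointly antithetic samples to lower the variance further.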

ARMS: Antithetic-REINFORCE-Multi-Sample Gradient for Binary Variables
  • A. Dimitriev, Mingyuan Zhou
  • Computer Science, Mathematics
  • 2021
ARMS, an Antithetic REINFORCE-based Multi-Sample gradient estimator that uses a copula to generate any number of mutually antithetic samples, is proposed and evaluated on several datasets for training generative models; the experimental results show that it outperforms competing methods.
ARSM: Augment-REINFORCE-Swap-Merge Estimator for Gradient Backpropagation Through Categorical Variables
Experimental results show ARSM closely resembles the performance of the true gradient for optimization in univariate settings; outperforms existing estimators by a large margin when applied to categorical variational auto-encoders; and provides a "try-and-see self-critic" variance reduction method for discrete-action policy gradient.
REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
This work introduces a modification to the continuous relaxation of discrete variables and shows that the tightness of the relaxation can be adapted online, removing it as a hyperparameter, leading to faster convergence to a better final log-likelihood.
Coupled Gradient Estimators for Discrete Latent Variables
Gradient estimators based on reparameterizing categorical variables as sequences of binary variables and Rao-Blackwellization are introduced and it is shown that these proposed categorical gradient estimators provide state-of-the-art performance.
Categorical Reparameterization with Gumbel-Softmax
It is shown that the Gumbel-Softmax estimator outperforms state-of-the-art gradient estimators on structured output prediction and unsupervised generative modeling tasks with categorical latent variables, and enables large speedups on semi-supervised classification.
Variational Inference for Monte Carlo Objectives
The first unbiased gradient estimator designed for importance-sampled objectives is developed, which is both simpler and more effective than the NVIL estimator proposed for the single-sample variational objective, and is competitive with the currently used biased estimators.
Rao-Blackwellizing the Straight-Through Gumbel-Softmax Gradient Estimator
It is shown that the variance of the straight-through variant of the popular Gumbel-Softmax estimator can be reduced through Rao-Blackwellization without increasing the number of function evaluations, which provably reduces the mean squared error.
Importance Weighted Autoencoders
The importance weighted autoencoder (IWAE) is a generative model with the same architecture as the VAE that uses a strictly tighter log-likelihood lower bound derived from importance weighting; empirically, IWAEs learn richer latent space representations than VAEs, leading to improved test log-likelihood on density estimation benchmarks.
Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement
It is shown that sequences sampled without replacement can be used to construct low-variance estimators for expected sentence-level BLEU score and model entropy.
Neural Variational Inference and Learning in Belief Networks
This work proposes a fast non-iterative approximate inference method that uses a feedforward network to implement efficient exact sampling from the variational posterior and shows that it outperforms the wake-sleep algorithm on MNIST and achieves state-of-the-art results on the Reuters RCV1 document dataset.