Corpus ID: 242758258

Perturb-and-max-product: Sampling and learning in discrete energy-based models

@inproceedings{LzaroGredilla2021PerturbandmaxproductSA,
  title={Perturb-and-max-product: Sampling and learning in discrete energy-based models},
  author={Miguel L{\'a}zaro-Gredilla and Antoine Dedieu and Dileep George},
  booktitle={Neural Information Processing Systems},
  year={2021}
}
Perturb-and-MAP offers an elegant approach to approximately sample from an energy-based model (EBM) by computing the maximum-a-posteriori (MAP) configuration of a perturbed version of the model. Sampling in turn enables learning. However, this line of research has been hindered by the general intractability of the MAP computation. Very few works venture outside tractable models, and when they do, they use linear programming approaches, which, as we show, have several limitations. In this work… 
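For intuition, here is a minimal sketch (not from the paper; the function name and toy model are illustrative assumptions) of the perturb-and-MAP idea on a fully enumerable model: Gumbel noise is injected into the log-potentials and the MAP of the perturbed model is returned as a sample.

```python
import numpy as np

def perturb_and_map_sample(theta, rng):
    """Draw one sample from p(x) ∝ exp(theta[x]) for a tiny discrete model by
    perturbing the fully enumerated log-potentials with i.i.d. Gumbel noise
    and returning the MAP of the perturbed model."""
    gumbel = rng.gumbel(size=theta.shape)        # one Gumbel per joint configuration
    return np.unravel_index(np.argmax(theta + gumbel), theta.shape)

# Hypothetical toy example: two binary variables, theta[x1, x2] are log-potentials.
rng = np.random.default_rng(0)
theta = np.array([[0.0, 1.0], [1.0, 3.0]])
samples = [perturb_and_map_sample(theta, rng) for _ in range(5000)]
counts = np.zeros_like(theta)
for x1, x2 in samples:
    counts[x1, x2] += 1
print(counts / counts.sum())                     # empirical frequencies
print(np.exp(theta) / np.exp(theta).sum())       # ≈ the Gibbs distribution
```

With one independent Gumbel per joint configuration, as in this toy, the samples are exact; the practical schemes studied in this line of work use low-dimensional perturbations and an approximate MAP solver, which is where the limitations discussed in the abstract arise.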

PGMax: Factor Graphs for Discrete Probabilistic Graphical Models and Loopy Belief Propagation in JAX

PGMax is an open-source Python package for easy specification of discrete probabilistic graphical models as factor graphs, and automatic derivation of scalable loopy belief propagation implementations in JAX.
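PGMax's actual API is not reproduced here; as a hedged illustration of the underlying computation, the sketch below runs max-product message passing on a small chain factor graph in plain NumPy (exact on trees; the same messages are iterated on loopy graphs).

```python
import numpy as np

def chain_max_product(unaries, pairwise):
    """Max-product (Viterbi-style) message passing on a chain factor graph.

    unaries:  list of length-K arrays of log-potentials, one per variable.
    pairwise: list of KxK arrays of log-potentials, one per edge (i, i+1).
    Returns the MAP assignment. Illustrative only; does not use the PGMax API.
    """
    n = len(unaries)
    msgs = [np.zeros_like(u) for u in unaries]   # message arriving from the left
    back = [None] * n
    for i in range(1, n):
        scores = (unaries[i - 1] + msgs[i - 1])[:, None] + pairwise[i - 1]
        msgs[i] = scores.max(axis=0)
        back[i] = scores.argmax(axis=0)
    # Decode by backtracking from the last variable.
    x = [int(np.argmax(unaries[-1] + msgs[-1]))]
    for i in range(n - 1, 0, -1):
        x.append(int(back[i][x[-1]]))
    return x[::-1]

# Hypothetical 3-variable binary chain that prefers equal neighbours.
unaries = [np.array([0.0, 0.5]), np.array([0.2, 0.0]), np.array([0.0, 1.0])]
pairwise = [np.array([[1.0, 0.0], [0.0, 1.0]])] * 2
print(chain_max_product(unaries, pairwise))      # e.g. [1, 1, 1]
```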

References

Showing 1-10 of 41 references

Perturb-and-MAP random fields: Using discrete optimization to learn and sample from energy models

A novel way to induce a random field from an energy function on discrete labels by locally injecting noise to the energy potentials, followed by finding the global minimum of the perturbed energy function is proposed.
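The identity behind this construction is the standard Gumbel-max trick, restated here for context (notation is ours, not the paper's):

```latex
% With one i.i.d. Gumbel(0,1) variable \gamma(x) per joint configuration x,
% the perturbed MAP is an exact sample from the Gibbs distribution:
\Pr\!\Big[\,x^{*}=\arg\max_{x}\big\{\theta(x)+\gamma(x)\big\}\Big]
  \;=\; \frac{\exp\big(\theta(x^{*})\big)}{\sum_{x'}\exp\big(\theta(x')\big)}.
% Perturb-and-MAP replaces the exponentially large full perturbation by local,
% low-dimensional noise injected into the potentials, trading exactness for a
% single MAP computation.
```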

Marginal Weighted Maximum Log-likelihood for Efficient Learning of Perturb-and-Map models

It is shown that for log-supermodular pairwise models these operations can be performed efficiently using the machinery of dynamic graph cuts, and double stochastic gradient descent, over both the data and the perturbations, is proposed for efficient learning.
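A minimal sketch of the double stochastic gradient idea, under stated assumptions: the log-likelihood gradient of an exponential-family EBM is a difference of feature expectations, and each step uses one random data point and one fresh perturbation. `map_solver` stands in for whatever exact or approximate MAP routine is available (e.g. dynamic graph cuts in the log-supermodular case); all names here are illustrative.

```python
import numpy as np

def features(x):
    """Sufficient statistics of a tiny fully visible binary model:
    singleton terms plus all pairwise products (illustrative choice)."""
    outer = np.outer(x, x)[np.triu_indices(len(x), k=1)]
    return np.concatenate([x, outer])

def double_sgd_step(w, data, map_solver, lr=0.1, rng=None):
    """One moment-matching step: E_data[features] - E_model[features], with the
    model expectation replaced by a single perturb-and-MAP sample (stochastic
    in both the data minibatch and the perturbation)."""
    if rng is None:
        rng = np.random.default_rng()
    x_data = data[rng.integers(len(data))]       # stochastic over data
    noise = rng.gumbel(size=w.shape)             # stochastic over perturbations
    x_model = map_solver(w, noise)               # approximate model sample
    return w + lr * (features(x_data) - features(x_model))

# Usage with a hypothetical brute-force MAP solver over 3 binary variables.
def brute_force_map(w, noise):
    configs = [np.array([a, b, c]) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    return max(configs, key=lambda x: float((w + noise) @ features(x)))

data = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
w = np.zeros(6)                                  # 3 singleton + 3 pairwise weights
for _ in range(200):
    w = double_sgd_step(w, data, brute_force_map)
```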

On the parameter learning for Perturb-and-MAP models

It is shown that a stochastic technique can be applied to the proposed perturb-and-MAP approximation while maintaining convergence and making it faster in practice, yielding an efficient and scalable generalization of the parameter learning approach.

Learning in Markov Random Fields with Contrastive Free Energies

A new framework for learning MRF models based on the contrastive free energy (CF) objective function is presented and it is shown that maximum likelihood, mean field, contrastive divergence and pseudo-likelihood objectives can be understood in this paradigm.

MAP Estimation, Linear Programming and Belief Propagation with Convex Free Energies

Convex BP is defined as the class of BP algorithms based on a convex free energy approximation; it is shown that this class includes ordinary BP on single-cycle graphs, tree-reweighted BP and many other BP variants, and that fixed points of convex max-product BP provably give the MAP solution when there are no ties.

High Dimensional Inference With Random Maximum A-Posteriori Perturbations

It is shown that the expected value of perturb-max inference with low dimensional perturbations can be used sequentially to generate unbiased samples from the Gibbs distribution.
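The relevant expectation identities, paraphrased from the perturb-max literature (notation ours):

```latex
% With a full Gumbel perturbation per configuration, the perturb-max value
% recovers the log-partition function exactly:
\log Z \;=\; \mathbb{E}_{\gamma}\Big[\max_{x}\big\{\theta(x)+\gamma(x)\big\}\Big].
% With low-dimensional (per-variable) perturbations, the same expectation gives
% an upper bound, which is what makes high-dimensional inference and sequential
% sampling schemes of the kind studied in this reference tractable:
\log Z \;\le\; \mathbb{E}_{\gamma}\Big[\max_{x}\big\{\theta(x)+\textstyle\sum_{i}\gamma_{i}(x_{i})\big\}\Big].
```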

Smooth and Strong: MAP Inference with Linear Convergence

This work introduces strong convexity by adding a quadratic term to the LP relaxation objective, and provides theoretical guarantees for the resulting programs, bounding the difference between their optimal value and the original optimum.
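Schematically, and in our own notation rather than the paper's exact primal/dual formulation, the idea is to augment the LP relaxation over the local marginal polytope with a quadratic term of weight \(\gamma > 0\) so the objective becomes strongly concave:

```latex
% Quadratically regularized LP relaxation (schematic; constants are assumptions):
\max_{\mu \in \mathcal{M}_L}\; \langle \theta, \mu \rangle \;-\; \frac{\gamma}{2}\,\lVert \mu \rVert_2^{2}.
```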

Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

This work proposes a general and scalable approximate sampling strategy for probabilistic models with discrete variables; energy-based models trained with it outperform variational auto-encoders and existing energy-based models, and bounds are given showing that the approach is near-optimal in the class of samplers that propose local updates.
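A minimal sketch of a gradient-informed flip proposal for binary variables in this spirit (constants and details are assumptions, not the paper's reference implementation):

```python
import numpy as np

def gwg_step(x, log_p, grad_log_p, rng):
    """One Metropolis-Hastings step with a gradient-informed flip proposal for
    a binary {0,1} float vector x. Each coordinate is scored by the estimated
    change in log-probability from flipping it; a flip is proposed from the
    resulting softmax and accepted with the usual MH correction."""
    def flip_scores(z):
        # Taylor estimate of the log-prob change of flipping each bit.
        return -(2.0 * z - 1.0) * grad_log_p(z)

    fwd = flip_scores(x) / 2.0
    q_fwd = np.exp(fwd - fwd.max()); q_fwd /= q_fwd.sum()
    i = rng.choice(len(x), p=q_fwd)

    x_new = x.copy(); x_new[i] = 1.0 - x_new[i]
    rev = flip_scores(x_new) / 2.0
    q_rev = np.exp(rev - rev.max()); q_rev /= q_rev.sum()

    log_accept = log_p(x_new) - log_p(x) + np.log(q_rev[i]) - np.log(q_fwd[i])
    return x_new if np.log(rng.uniform()) < log_accept else x
```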

Neural Variational Inference and Learning in Undirected Graphical Models

This work proposes black-box learning and inference algorithms for undirected models that optimize a variational approximation to the log-likelihood of the model via a unified variational inference framework and empirically demonstrates the effectiveness of the method on several popular generative modeling datasets.

What Cannot be Learned with Bethe Approximations

The results provide a novel approach to analyzing learning with Bethe approximations and highlight when it can be expected to work or fail.