Corpus ID: 207847493

Improving Joint Training of Inference Networks and Structured Prediction Energy Networks

  • Lifu Tu, Richard Yuanzhe Pang, Kevin Gimpel
  • Published 2019
  • Computer Science
  • ArXiv
  • Deep energy-based models are powerful, but pose challenges for learning and inference (Belanger and McCallum, 2016). Tu and Gimpel (2018) developed an efficient framework for energy-based models by training "inference networks" to approximate structured inference instead of using gradient descent. However, their alternating optimization approach suffers from instabilities during training, requiring additional loss terms and careful hyperparameter tuning. In this paper, we contribute several…
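
The contrast the abstract draws, between running gradient descent on the output for every input and training an "inference network" that predicts a low-energy output directly, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's model: the energy is a fixed quadratic with hypothetical parameters `W`, the inference network is a single linear map `V`, and the energy is held fixed rather than trained jointly with the network.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))  # hypothetical, fixed parameters of the toy energy


def energy(x, y):
    # Toy quadratic energy (assumption): low when y is close to W @ x.
    return float(np.sum((y - W @ x) ** 2))


def gd_inference(x, steps=200, lr=0.1):
    # Inference by gradient descent: optimize y from scratch for one input.
    y = np.zeros(3)
    for _ in range(steps):
        y -= lr * 2.0 * (y - W @ x)  # gradient of the energy w.r.t. y
    return y


def train_inference_network(xs, epochs=500, lr=0.05):
    # Linear inference network y = V @ x (assumption), trained to minimize
    # the average energy of its own outputs over the training inputs.
    V = np.zeros((3, 4))
    for _ in range(epochs):
        grad = 2.0 * (V - W) @ xs.T @ xs / len(xs)  # gradient w.r.t. V
        V -= lr * grad
    return V


xs = rng.normal(size=(64, 4))
V = train_inference_network(xs)

x_new = rng.normal(size=4)
y_gd = gd_inference(x_new)  # iterative optimization at test time
y_net = V @ x_new           # a single forward pass at test time
```

In this toy setting both routes reach nearly the same low-energy output. The difference is where the cost falls: gradient descent repeats the optimization for every new input, while the inference network pays once during training and then needs only a forward pass.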


    ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation
    An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks


    Publications referenced by this paper.
    Adam: A Method for Stochastic Optimization
    Generative Adversarial Nets
    Learning Approximate Inference Networks for Structured Prediction
    BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
    End-to-End Learning for Structured Prediction Energy Networks
    Improved Training of Wasserstein GANs
    Structured Prediction Energy Networks
    Energy-based Generative Adversarial Network
    A Tutorial on Energy-Based Learning