Corpus ID: 195791459

On the Weaknesses of Reinforcement Learning for Neural Machine Translation

  • Leshem Choshen, Lior Fox, Zohar Aizenbud, Omri Abend
  • Published 2020
  • Computer Science
  • ArXiv
  • Reinforcement learning (RL) is frequently used to increase performance in text generation tasks, including machine translation (MT), notably through the use of Minimum Risk Training (MRT) and Generative Adversarial Networks (GAN). However, little is known about what and how these methods learn in the context of MT. We prove that one of the most common RL methods for MT does not optimize the expected reward, as well as show that other methods take an infeasibly long time to converge. In fact…
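The abstract's central object is the expected-reward objective that policy-gradient methods such as REINFORCE (and, in expectation, MRT) optimize for MT. As a rough illustration only, here is a minimal REINFORCE-style sketch on a toy three-candidate "translation" bandit; the candidates, rewards, and hyperparameters are invented for illustration and are not from the paper.

```python
import math
import random

# Toy setup: a softmax "policy" over three fixed candidate translations.
# The rewards stand in for sentence-level scores such as BLEU.
random.seed(0)

rewards = [0.2, 0.9, 0.5]   # hypothetical reward per candidate
logits = [0.0, 0.0, 0.0]    # policy parameters
lr = 0.1                    # learning rate
baseline = 0.0              # running-average baseline for variance reduction

def probs(logits):
    """Softmax with max-subtraction for numerical stability."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

for _ in range(5000):
    p = probs(logits)
    a = random.choices(range(len(rewards)), weights=p)[0]  # sample a candidate
    advantage = rewards[a] - baseline
    baseline = 0.99 * baseline + 0.01 * rewards[a]
    # REINFORCE update: advantage * grad log p(a) with respect to the logits
    for i in range(len(logits)):
        indicator = 1.0 if i == a else 0.0
        logits[i] += lr * advantage * (indicator - p[i])

p = probs(logits)
print([round(x, 3) for x in p])  # the highest-reward candidate should dominate
```

In this idealized bandit the policy concentrates on the highest-reward candidate; the paper's point is that in real MT the reward signal is far sparser and such convergence can be infeasibly slow.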
    14 Citations


    Text Generation by Learning from Off-Policy Demonstrations
    BERT as a Teacher: Contextual Embeddings for Sequence-Level Reward (3 citations)
    Cost-Sensitive Training for Autoregressive Models
    MLE-guided parameter search for task loss minimization in neural sequence modeling (1 citation)
    Stochasticity and Non-Autoregressive Modeling in Deep Generative Models of Text
    Neural Machine Translation: A Review of Methods, Resources, and Tools


    References

    A Study of Reinforcement Learning for Neural Machine Translation (70 citations)
    Classical Structured Prediction Losses for Sequence to Sequence Learning (104 citations; highly influential)
    Adversarial Neural Machine Translation (89 citations)
    Language Generation with Recurrent Generative Adversarial Networks without Pre-training (79 citations)
    Language GANs Falling Short (81 citations; highly influential)
    SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient (1,219 citations)
    Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets (117 citations; highly influential)
    Adversarial Learning for Neural Dialogue Generation (615 citations)
    Evaluating Text GANs as Language Models (21 citations)