To capture rich language phenomena, neural machine translation models have to use a large vocabulary, which requires high computing time and large memory usage. In this paper, we alleviate this issue by introducing a sentence-level or batch-level vocabulary, which is only a very small subset of the full output vocabulary. For each sentence or …
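The abstract is truncated above, but the core idea (restricting the output softmax to a small per-sentence or per-batch candidate vocabulary) can be sketched roughly as follows. The candidate construction, the `lex_table` lexicon, and the `top_k` cutoff are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

def build_batch_vocab(src_batch, tgt_batch, lex_table, top_k=10):
    """Collect a small candidate target vocabulary (word ids) for one batch."""
    cand = set()
    for tgt_sent in tgt_batch:            # target words observed in the batch itself
        cand.update(tgt_sent)
    for src_sent in src_batch:            # top-k lexical translations of each source word
        for w in src_sent:
            cand.update(lex_table.get(w, [])[:top_k])
    return sorted(cand)

def restricted_softmax(hidden, output_embeddings, batch_vocab):
    """Softmax over the candidate vocabulary only, instead of the full output vocabulary."""
    W = output_embeddings[batch_vocab]    # (|V_batch|, d) slice of the output matrix
    scores = W @ hidden                   # (|V_batch|,)
    scores -= scores.max()
    probs = np.exp(scores) / np.exp(scores).sum()
    return dict(zip(batch_vocab, probs))
```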
In this paper, we enhance attention-based neural machine translation by adding an explicit coverage embedding model to alleviate the issues of repeating and dropping translations in NMT. For each source word, our model starts with a full coverage embedding vector, and then keeps updating it with a gated recurrent unit as the translation proceeds. All the …
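A minimal sketch of the per-source-word coverage update described here, assuming PyTorch and a GRU cell whose input combines the attention weight and the decoder state (the exact GRU inputs are an assumption):

```python
import torch
import torch.nn as nn

class CoverageEmbedding(nn.Module):
    def __init__(self, cov_dim, dec_dim):
        super().__init__()
        self.full = nn.Parameter(torch.randn(cov_dim))      # "full coverage" start vector
        self.gru = nn.GRUCell(input_size=dec_dim + 1, hidden_size=cov_dim)

    def init_coverage(self, src_len):
        # every source position starts with the same full-coverage vector
        return self.full.unsqueeze(0).expand(src_len, -1).contiguous()

    def update(self, coverage, attn_weights, dec_state):
        # coverage: (src_len, cov_dim), attn_weights: (src_len,), dec_state: (dec_dim,)
        src_len = coverage.size(0)
        inp = torch.cat([attn_weights.unsqueeze(1),
                         dec_state.unsqueeze(0).expand(src_len, -1)], dim=1)
        return self.gru(inp, coverage)   # updated coverage embedding for every source word
```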
In this paper, we improve the attention (alignment) accuracy of neural machine translation by utilizing the alignments of training sentence pairs. We simply compute the distance between the machine attentions and the "true" alignments, and minimize this cost during training. Our experiments on a large-scale Chinese-to-English task show that our …
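A hedged sketch of such alignment supervision: add a distance term between the attention matrix and a gold alignment matrix to the usual training loss. The mean-squared-error distance and the weight `lam` are assumptions; the paper's exact distance measure may differ.

```python
import torch

def attention_guided_loss(nll_loss, attn, gold_align, lam=1.0):
    # attn, gold_align: (tgt_len, src_len); gold_align is a 0/1 matrix from word alignments
    # normalise gold rows so they are comparable to attention rows (avoid /0 for unaligned words)
    gold = gold_align / gold_align.sum(dim=1, keepdim=True).clamp(min=1.0)
    align_loss = ((attn - gold) ** 2).mean()
    return nll_loss + lam * align_loss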
In this paper, we enhance attention-based neural machine translation (NMT) by adding explicit coverage embedding models to alleviate issues of repeating and dropping translations in NMT. For each source word, our model starts with a full coverage embedding vector to track the coverage status, and then keeps updating it with neural networks as the …
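To complement the coverage-update sketch above, this is one plausible way such per-source-word coverage embeddings could enter the attention scorer, so that already-covered words receive less attention. The additive (Bahdanau-style) scorer and layer sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class CoverageAttention(nn.Module):
    def __init__(self, dec_dim, enc_dim, cov_dim, attn_dim):
        super().__init__()
        self.proj = nn.Linear(dec_dim + enc_dim + cov_dim, attn_dim)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, dec_state, enc_states, coverage):
        # dec_state: (dec_dim,), enc_states: (src_len, enc_dim), coverage: (src_len, cov_dim)
        src_len = enc_states.size(0)
        dec = dec_state.unsqueeze(0).expand(src_len, -1)
        feats = torch.cat([dec, enc_states, coverage], dim=1)
        scores = self.v(torch.tanh(self.proj(feats)))
        return torch.softmax(scores.squeeze(1), dim=0)   # attention weights over source words
```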
Attention-based Neural Machine Translation (NMT) models suffer from attention deficiency issues, as has been observed in recent research. We propose a novel mechanism to address some of these limitations and improve NMT attention. Specifically, our approach memorizes the alignments temporally (within each sentence) and modulates the attention with the …
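One possible reading of this temporal memorization, sketched under the assumption that accumulated past attention down-weights source words that were attended to heavily at earlier decoding steps:

```python
import torch

def temporal_attention(scores, history):
    # scores:  (src_len,) raw attention scores at the current decoding step
    # history: (src_len,) running sum of exp(scores) from earlier steps (zeros at step 1)
    exp_scores = torch.exp(scores - scores.max())
    modulated = exp_scores / (1.0 + history)     # penalise frequently attended positions
    attn = modulated / modulated.sum()
    new_history = history + exp_scores
    return attn, new_history
```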