Neural Machine Translation by Jointly Learning to Align and Translate

Published as a conference paper at ICLR 2015

Abstract

Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance. The models proposed recently for neural machine translation often belong to a family of encoder–decoders and encode a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder–decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
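The (soft-)search the abstract describes corresponds to the additive alignment model the paper develops: a small network scores each encoder annotation against the previous decoder state, and a softmax turns those scores into alignment weights used to form a context vector. Below is a minimal NumPy sketch of one decoding step; the variable names, dimensions, and random parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def additive_attention(s_prev, H, W_s, W_h, v):
    """One (soft-)search step, as in the paper's alignment model.

    s_prev : (d_s,)   previous decoder state s_{i-1}
    H      : (T, d_h) encoder annotations h_1..h_T, one per source word
    W_s, W_h, v       learned parameters of the alignment network
                      (shapes (d_a, d_s), (d_a, d_h), (d_a,))
    Returns the alignment weights alpha_i and the context vector c_i.
    """
    # Alignment scores e_ij = v^T tanh(W_s s_{i-1} + W_h h_j)
    e = np.tanh(s_prev @ W_s.T + H @ W_h.T) @ v   # (T,)
    # Soft alignment: a probability distribution over source positions
    alpha = softmax(e)                            # (T,)
    # Context vector: expected annotation under the soft alignment
    c = alpha @ H                                 # (d_h,)
    return alpha, c

# Toy usage with random parameters (dimensions are arbitrary assumptions)
rng = np.random.default_rng(0)
T, d_h, d_s, d_a = 6, 8, 8, 10
H = rng.normal(size=(T, d_h))
s_prev = rng.normal(size=d_s)
W_s = rng.normal(size=(d_a, d_s))
W_h = rng.normal(size=(d_a, d_h))
v = rng.normal(size=d_a)
alpha, c = additive_attention(s_prev, H, W_s, W_h, v)
print(alpha.round(3), c.shape)  # weights sum to 1; c has shape (d_h,)
```

In the full model, the context vector feeds the decoder that predicts the next target word, and the weights alpha are the (soft-)alignments visualized in the paper's qualitative analysis.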
