Corpus ID: 11212020

Neural Machine Translation by Jointly Learning to Align and Translate

@article{Bahdanau2015NeuralMT,
  title={Neural Machine Translation by Jointly Learning to Align and Translate},
  author={Dzmitry Bahdanau and Kyunghyun Cho and Yoshua Bengio},
  journal={CoRR},
  year={2015},
  volume={abs/1409.0473}
}
Neural machine translation is a recently proposed approach to machine translation. [...] Key result: qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
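
The soft alignments highlighted above are produced by an attention mechanism: each source annotation h_j is scored against the previous decoder state s_{t-1} with a small feedforward network, the scores are normalized with a softmax to give alignment weights, and the weighted sum of annotations forms the context vector. The NumPy sketch below illustrates that additive scoring, e_tj = v_a^T tanh(W_a s_{t-1} + U_a h_j); it is not the authors' code, and the dimensions and variable names are illustrative assumptions.

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def additive_attention(s_prev, H, W_a, U_a, v_a):
    """Soft alignment over source annotations H (T x 2n) given the previous
    decoder state s_prev, following e_tj = v_a^T tanh(W_a s_{t-1} + U_a h_j)."""
    energies = np.tanh(s_prev @ W_a.T + H @ U_a.T) @ v_a   # (T,) unnormalized alignment scores
    alpha = softmax(energies)                               # (T,) soft alignment weights, sum to 1
    context = alpha @ H                                     # (2n,) expected annotation (context vector)
    return alpha, context

# Toy example: 5 source positions, annotation size 8, decoder state size 6, attention size 4.
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))
s_prev = rng.normal(size=6)
W_a, U_a, v_a = rng.normal(size=(4, 6)), rng.normal(size=(4, 8)), rng.normal(size=4)
alpha, context = additive_attention(s_prev, H, W_a, U_a, v_a)
print(alpha.round(3), context.shape)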

Improved Neural Machine Translation with SMT Features

TLDR
The proposed method incorporates statistical machine translation (SMT) features, such as a translation model and an n-gram language model, into the NMT model under a log-linear framework, and significantly improves the translation quality of a state-of-the-art NMT system on Chinese-to-English translation tasks.
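
The log-linear framework mentioned in this summary simply ranks translation candidates by a weighted sum of feature scores, which is what lets the NMT score sit alongside classical SMT features. The short sketch below shows only that general form; the feature names, values, and weights are made up for illustration and are not the paper's actual features or tuned weights.

def log_linear_score(features, weights):
    """Generic log-linear model: score(y | x) = sum_k lambda_k * h_k(x, y).
    Candidates are ranked (or renormalized) by this score."""
    return sum(weights[name] * value for name, value in features.items())

# Hypothetical feature scores (log-domain) for one translation candidate.
features = {"nmt_log_prob": -12.3, "smt_translation_model": -8.7,
            "ngram_lm_log_prob": -15.2, "word_penalty": -9.0}
weights = {"nmt_log_prob": 1.0, "smt_translation_model": 0.4,
           "ngram_lm_log_prob": 0.3, "word_penalty": -0.1}
print(log_linear_score(features, weights))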

Neural Machine Translation With Explicit Phrase Alignment

TLDR
The key idea is to build a search space for neural machine translation similar to that of phrase-based statistical machine translation, in which phrase alignment is readily available, and to design a new decoding algorithm that can easily impose lexical and structural constraints.

An Efficient Character-Level Neural Machine Translation

TLDR
An efficient architecture for training a deep character-level neural machine translation model is proposed, introducing a decimator and an interpolator; the resulting model is able to translate misspelled words much as humans do.

Variational Neural Machine Translation

TLDR
This paper builds a neural posterior approximator conditioned on both the source and the target sides and equips it with a reparameterization technique to estimate the variational lower bound; the proposed variational neural machine translation is shown to achieve significant improvements over vanilla neural machine translation baselines.
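
The reparameterization technique referred to here is the standard trick of drawing the latent variable as a deterministic function of the variational parameters plus independent noise, so that a Monte Carlo estimate of the variational lower bound stays differentiable. The NumPy sketch below shows that trick for a diagonal Gaussian together with its KL term; it is a generic illustration, not the paper's model.

import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z ~ N(mu, diag(exp(log_var))) as z = mu + sigma * eps, eps ~ N(0, I),
    which keeps the sample differentiable with respect to mu and log_var."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def gaussian_kl(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), the regularizer in the variational lower bound."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

rng = np.random.default_rng(0)
mu, log_var = np.array([0.2, -0.1]), np.array([-1.0, -2.0])
z = reparameterize(mu, log_var, rng)
# ELBO = E_q[log p(y | x, z)] - KL(q(z | x, y) || p(z | x)); only the KL part is evaluated here.
print(z, gaussian_kl(mu, log_var))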

Re-encoding in Neural Machine Translation

TLDR
This thesis investigates two architectures that use re-encoding, compares them to multiple baselines, and shows that re-encoding can potentially improve the performance of neural models, especially on longer sentences.

Pre-Translation for Neural Machine Translation

TLDR
This work used phrase-based machine translation to pre-translate the input into the target language and analyzed the influence of the quality of the initial system on the final result.

Improving Sequence to Sequence Neural Machine Translation by Utilizing Syntactic Dependency Information

TLDR
This work proposes an approach that utilizes richer grammatical information, such as syntactic dependencies, so that the output is generated from more abundant information; it addresses two existing problems of neural machine translation: ineffective translation of long sentences and over-translation.

Incorporating Source-Side Phrase Structures into Neural Machine Translation

TLDR
This model, called a tree-to-sequence NMT model, extends a sequence-to-sequence model with the source-side phrase structure; its attention mechanism enables the decoder to generate a translated word while softly aligning it with phrases as well as words of the source sentence.

Language-Independent Representor for Neural Machine Translation

TLDR
A language-independent representor is proposed to replace the encoder and decoder of NMT by using weight sharing, which not only reduces a large portion of the network parameters but also makes it possible to fully explore language duality by jointly training source-to-target, target-to-source, left-to-right and right-to-left translations within a multi-task learning framework.
...

References

Showing 1–10 of 32 references

On the Properties of Neural Machine Translation: Encoder–Decoder Approaches

TLDR
It is shown that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.

Continuous Space Translation Models for Phrase-Based Statistical Machine Translation

TLDR
Experimental evidence is provided that the approach seems to be able to infer meaningful translation probabilities for phrase pairs not seen in the training data, or even predict a list of the most likely translations given a source phrase.

Recurrent Continuous Translation Models

We introduce a class of probabilistic continuous translation models called Recurrent Continuous Translation Models that are purely based on continuous representations for words, phrases and sentences.

Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation

TLDR
Qualitatively, the proposed RNN Encoder–Decoder model learns a semantically and syntactically meaningful representation of linguistic phrases.

Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation

TLDR
A way is proposed to address the significant drop in translation quality when translating long sentences: the input sentence is automatically segmented into phrases that can be easily translated by the neural network translation model.
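
The segmentation idea is easy to illustrate: cut a long input into pieces the model handles well, translate each piece, and concatenate the outputs. The sketch below uses a naive punctuation-and-length criterion and a placeholder translate_fn; the paper's actual segmentation criterion is not reproduced here.

def translate_by_segments(sentence, translate_fn, max_words=15):
    """Split a long sentence into short segments, translate each with the
    underlying model, and concatenate the results (a rough sketch of the idea)."""
    words, segments, current = sentence.split(), [], []
    for w in words:
        current.append(w)
        # Naive criterion: cut at clause punctuation or when the segment grows too long.
        if w.endswith((",", ";")) or len(current) >= max_words:
            segments.append(" ".join(current))
            current = []
    if current:
        segments.append(" ".join(current))
    return " ".join(translate_fn(seg) for seg in segments)

# Placeholder "translator" used only to show the plumbing.
print(translate_by_segments("a very long source sentence , with several clauses , goes here .",
                            translate_fn=str.upper))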

Continuous Space Language Models for Statistical Machine Translation

TLDR
This work proposes to use a new statistical language model that is based on a continuous representation of the words in the vocabulary, which achieves consistent improvements in the BLEU score on the development and test data.

Sequence to Sequence Learning with Neural Networks

TLDR
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions about the sequence structure, and finds that reversing the order of the words in all source sentences markedly improved the LSTM's performance, because doing so introduced many short-term dependencies between the source and the target sentence, which made the optimization problem easier.
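
The reversal trick described in this summary is purely a data transformation: the target side is left untouched while the tokens of every source sentence are reversed before encoding, which puts the beginning of the source close to the beginning of the target. A minimal illustration (the token lists are made up):

def reverse_source(pairs):
    """Reverse only the source token sequence of each (source, target) training pair."""
    return [(list(reversed(src)), tgt) for src, tgt in pairs]

pairs = [(["je", "suis", "etudiant"], ["i", "am", "a", "student"])]
print(reverse_source(pairs))  # [(['etudiant', 'suis', 'je'], ['i', 'am', 'a', 'student'])]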

Statistical Phrase-Based Translation

TLDR
The empirical results suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations.
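
The lexical weighting mentioned here scores a phrase pair by word-level translation probabilities under a word alignment a. In the standard phrase-based formulation (a paraphrase of the common definition, not a quotation from this paper) it reads:

\[ \mathrm{lex}(\bar{f} \mid \bar{e}, a) = \prod_{i=1}^{n} \frac{1}{\lvert \{\, j : (i, j) \in a \,\} \rvert} \sum_{(i, j) \in a} w(f_i \mid e_j) \]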

Statistical Machine Translation

M. Osborne. Encyclopedia of Machine Learning and Data Mining, 2010.
TLDR
Statistical machine translation deals with automatically translating sentences in one human language into another human language (such as English); its models are estimated from parallel corpora and also from monolingual corpora (examples of target sentences).

A Neural Probabilistic Language Model

TLDR
This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.
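
The distributed representation described in this summary can be made concrete with a small feedforward language model: each context word is mapped to a learned embedding, the embeddings are concatenated, passed through a tanh hidden layer, and a softmax over the vocabulary gives the next-word distribution. The NumPy forward pass below is only an illustrative sketch with arbitrary sizes and untrained random weights, not the original model.

import numpy as np

def nplm_forward(context_ids, C, W_h, b_h, W_o, b_o):
    """Feedforward neural LM: embed the context words, concatenate, apply a
    tanh hidden layer, and return a softmax distribution over the vocabulary."""
    x = C[context_ids].reshape(-1)            # concatenated context embeddings
    h = np.tanh(W_h @ x + b_h)                # hidden representation
    logits = W_o @ h + b_o                    # one score per vocabulary word
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()                # P(next word | context)

rng = np.random.default_rng(0)
V, d, n_ctx, d_h = 50, 16, 3, 32              # vocab size, embedding dim, context length, hidden dim
C = rng.normal(size=(V, d))                   # embedding table: the "distributed representation"
W_h, b_h = rng.normal(size=(d_h, n_ctx * d)), np.zeros(d_h)
W_o, b_o = rng.normal(size=(V, d_h)), np.zeros(V)
p = nplm_forward(np.array([3, 17, 42]), C, W_h, b_h, W_o, b_o)
print(p.shape, round(p.sum(), 6))             # (50,) 1.0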