Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation

@inproceedings{Cho2014LearningPR,
  title={Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation},
  author={Kyunghyun Cho and Bart van Merrienboer and Çaglar G{\"u}lçehre and Dzmitry Bahdanau and Fethi Bougares and Holger Schwenk and Yoshua Bengio},
  booktitle={EMNLP},
  year={2014}
}
In this paper, we propose a novel neural network model called RNN Encoder–Decoder that consists of two recurrent neural networks (RNNs). [...] Key result: Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.
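
The abstract sketches the two-RNN architecture only in outline, so here is a minimal illustrative sketch of such an encoder-decoder, written in PyTorch. The choice of GRU cells, the layer sizes, and the cross-entropy training step shown are assumptions made for the sketch, not the paper's exact configuration.

import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hid_dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # Encoder RNN compresses the source sentence into a fixed-length summary vector.
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Decoder RNN generates the target sentence conditioned on that summary.
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # src_ids, tgt_ids: (batch, seq_len) integer token ids.
        _, summary = self.encoder(self.src_emb(src_ids))      # summary: (1, batch, hid_dim)
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), summary)
        return self.out(dec_out)                              # (batch, tgt_len, tgt_vocab)

# Training maximizes the conditional log-likelihood of the target given the source,
# here as cross-entropy over next-token predictions with teacher forcing.
model = EncoderDecoder(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (2, 7))
tgt = torch.randint(0, 1000, (2, 5))
logits = model(src, tgt[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, 1000), tgt[:, 1:].reshape(-1))
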
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
TLDR
It is shown that neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as sentence length and the number of unknown words increase.
Unsupervised Learning of Sentence Representations using Convolutional Neural Networks
TLDR
Experimental results on several benchmark datasets, and across a broad range of applications, demonstrate the superiority of the proposed model over competing methods.
Convolutional Encoders for Neural Machine Translation
We propose a general Convolutional Neural Network (CNN) encoder model for machine translation that fits within the framework of Encoder-Decoder models proposed by Cho et al. [1]. [...]
Feedforward sequential memory networks based encoder-decoder model for machine translation
TLDR
A sequence-to-sequence model is presented that replaces the recurrent neural networks with feedforward sequential memory networks in both the encoder and decoder, enabling the new architecture to encode the entire source sentence simultaneously.
Learning Generic Sentence Representations Using Convolutional Neural Networks
TLDR
Experimental results on several benchmark datasets, and across a broad range of applications, demonstrate the superiority of the proposed model over competing methods.
Neural Machine Translation with Word Predictions
TLDR
This paper proposes word prediction as a mechanism for direct supervision, predicting the vocabulary of the target sentence and ensuring better representations in the encoder and decoder without using any extra data or annotation.
Neural Machine Transliteration: Preliminary Results
TLDR
A character-based encoder-decoder model has been proposed that consists of two recurrent neural networks that are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Learning bilingual phrase representations with recurrent neural networks
TLDR
A novel method is introduced which transforms a sequence of word feature vectors into a fixed-length phrase vector across two languages, and can be used to predict the semantic equivalence of source and target word sequences in the phrasal translation units used in phrase-based statistical machine translation (a minimal scoring sketch follows this list).
Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder
TLDR
This paper proposes a bidirectional tree encoder which learns both sequential and tree-structured representations, and a tree-coverage model that lets the attention depend on the source-side syntax; experiments demonstrate that the proposed models outperform the sequential attentional model.
Universal Vector Neural Machine Translation With Effective Attention
TLDR
A neutral/universal model representation is introduced that can be used to predict more than one language depending on the source and a provided target, reducing the number of models needed for multilingual translation applications.
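
As a companion to the "Learning bilingual phrase representations with recurrent neural networks" entry above, the following is a minimal sketch of scoring a phrase pair for semantic equivalence with two recurrent encoders. The GRU cells, the cosine-similarity score, and all sizes are illustrative assumptions, not that paper's actual model.

import torch
import torch.nn as nn

class PhraseEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, phrase_len) -> fixed-length phrase vector (batch, hid_dim)
        _, h = self.rnn(self.emb(token_ids))
        return h.squeeze(0)

src_enc = PhraseEncoder(vocab_size=5000)
tgt_enc = PhraseEncoder(vocab_size=5000)
src_phrase = torch.randint(0, 5000, (1, 3))   # e.g. a 3-word source phrase
tgt_phrase = torch.randint(0, 5000, (1, 4))   # a candidate 4-word translation
# Cosine similarity of the two fixed-length vectors serves as a semantic-equivalence score.
score = nn.functional.cosine_similarity(src_enc(src_phrase), tgt_enc(tgt_phrase))
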

References

Showing 1–10 of 39 references
Joint Language and Translation Modeling with Recurrent Neural Networks
TLDR
This work presents a joint language and translation model based on a recurrent neural network which predicts target words from an unbounded history of both source and target words, and which shows competitive accuracy compared to traditional channel model features.
Learning Semantic Representations for the Phrase Translation Model
TLDR
The results show that the new semantic-based phrase translation model significantly improves the performance of a state-of-the-art phrase-based statistical machine translation system, leading to a gain of 0.7-1.0 BLEU points.
Continuous Space Translation Models for Phrase-Based Statistical Machine Translation
TLDR
Experimental evidence is provided that the approach seems to be able to infer meaningful translation probabilities for phrase pairs not seen in the training data, or even predict a list of the most likely translations given a source phrase.
Continuous Space Translation Models with Neural Networks
TLDR
Several continuous space translation models are explored, where translation probabilities are estimated using a continuous representation of translation units in lieu of standard discrete representations, jointly computed using a multi-layer neural network with a SOUL architecture.
Decoding with Large-Scale Neural Language Models Improves Translation
TLDR
This work develops a new model that combines the neural probabilistic language model of Bengio et al., rectified linear units, and noise-contrastive estimation, and incorporates it into a machine translation system both by reranking k-best lists and by direct integration into the decoder.
Recurrent Continuous Translation Models
We introduce a class of probabilistic continuous translation models called Recurrent Continuous Translation Models that are purely based on continuous representations for words, phrases and sentences. [...]
A Neural Probabilistic Language Model
TLDR
This work proposes to fight the curse of dimensionality by learning a distributed representation for words, which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences (a minimal sketch follows this reference list).
Statistical Phrase-Based Translation
TLDR
The empirical results suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations.
Continuous space language models
TLDR
Highly efficient learning algorithms are described that enable the use of training corpora of several hundred million words, and it is shown that this approach can be incorporated into a large vocabulary continuous speech recognizer using a lattice rescoring framework at a very low additional processing time.
How to Construct Deep Recurrent Neural Networks
TLDR
Two novel architectures of a deep RNN are proposed which are orthogonal to an earlier attempt of stacking multiple recurrent layers to build a deep RNN, and an alternative interpretation is provided using a novel framework based on neural operators.
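
For the "A Neural Probabilistic Language Model" reference above, here is a minimal sketch of the core idea: shared word embeddings (a distributed representation) feed a feedforward layer that scores the next word. The context size, dimensions, and tanh hidden layer are assumptions for illustration, not the original model's settings.

import torch
import torch.nn as nn

class NPLM(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, context=3, hidden=128):
        super().__init__()
        # One embedding table shared across all context positions: the distributed representation.
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(context * emb_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, vocab_size),
        )

    def forward(self, context_ids):
        # context_ids: (batch, context) previous word ids -> next-word logits.
        e = self.emb(context_ids).flatten(start_dim=1)
        return self.mlp(e)

model = NPLM(vocab_size=10000)
logits = model(torch.randint(0, 10000, (4, 3)))  # (4, vocab_size) next-word scores
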