
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

@article{wu2016gnmt,
  title={Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation},
  author={Yonghui Wu and Mike Schuster and Z. Chen and Quoc V. Le and Mohammad Norouzi and Wolfgang Macherey and Maxim Krikun and Yuan Cao and Qin Gao and Klaus Macherey and Jeff Klingner and Apurva Shah and Melvin Johnson and Xiaobing Liu and Lukasz Kaiser and Stephan Gouws and Yoshikiyo Kato and Taku Kudo and Hideto Kazawa and Keith Stevens and George Kurian and Nishant Patil and Wei Wang and Cliff Young and Jason R. Smith and Jason Riesa and Alex Rudnick and Oriol Vinyals and Gregory S. Corrado and Macduff Hughes and Jeffrey Dean},
  journal={arXiv preprint arXiv:1609.08144},
  year={2016}
}
Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. The paper's wordpiece method provides a good balance between the flexibility of character-delimited models and the efficiency of word-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system.
Deep Neural Network--based Machine Translation System Combination
This article proposes a deep neural network--based system combination framework leveraging both minimum Bayes-risk decoding and multi-source NMT, which takes as input the N-best outputs of NMT and SMT systems and produces the final translation.
Improving the Quality of Neural Machine Translation
This thesis proposes a fertility-based approach to neural machine translation, which couples the concept of fertility with sparse and constrained attention transformations, and presents two machine translation metrics that quantify how much the model suffers from over- and under-translation.
Making sense of neural machine translation
This paper introduces NMT, and explains in detail, without the mathematical complexity, how neural machine translation systems work, how they are trained, and their main differences with SMT systems.
A Survey on Document-level Neural Machine Translation
The aim of this survey article is to highlight the major works that have been undertaken in the space of document-level machine translation after the neural revolution, so researchers can recognize the current state and future directions of this field.
Multilingual Neural Machine Translation for Low Resource Languages
This work shows how so-called multilingual NMT can help to tackle the challenges associated with low-resourced language translation, and introduces the recently proposed iterative self-training method, which incrementally improves a multilingual NMT model on a zero-shot direction by relying only on monolingual data.
Incorporating Statistical Machine Translation Word Knowledge Into Neural Machine Translation
  • Xing Wang, Zhaopeng Tu, Min Zhang
  • IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • 2018
Experimental results on Chinese-to-English and English-to-German translation tasks show that the proposed framework can take advantage of the SMT word knowledge and consistently achieve significant improvements over NMT and SMT baseline systems.
Machine Translation Using Improved Attention-based Transformer with Hybrid Input
An attention-based deep learning architecture is proposed for MT: a transformer with multi-layer encoders/decoders whose layers rely exclusively on multi-head attention, and whose hybrid input yields more accurate translations than non-hybrid input.
Neural Machine Translation With Explicit Phrase Alignment
The key idea is to build a search space similar to that of phrase-based statistical machine translation for neural machine translation where phrase alignment is readily available and a new decoding algorithm is designed that can easily impose lexical and structural constraints.
A comparative study of Neural Machine Translation frameworks for the automatic translation of open data resources
This work introduces the conventional theoretical models behind NMT together with the required background to provide a comprehensive view, and develops state-of-the-art NMT systems built on top of two well-known machine learning frameworks, TensorFlow and MXNet.
NoDeeLe: A Novel Deep Learning Schema for Evaluating Neural Machine Translation Systems
This paper seeks to put to the test an evaluation model based on a novel deep learning schema (NoDeeLe) used to compare two NMT systems on four different text genres, i.e. medical, legal, marketing and literary in the English-Greek language pair.


Addressing the Rare Word Problem in Neural Machine Translation
This paper proposes and implements an effective technique to address end-to-end neural machine translation's inability to correctly translate very rare words; the resulting system is the first to surpass the best result achieved on a WMT’14 contest task.
Neural Machine Translation by Jointly Learning to Align and Translate
It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend it by allowing the model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
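The soft-search idea above can be sketched numerically: each source annotation is scored against the current decoder state, the scores are softmax-normalized into alignment weights, and the context vector is their weighted sum. The function and parameter names below are illustrative, not taken from the paper; a minimal NumPy sketch of additive (Bahdanau-style) attention:

```python
import numpy as np

def additive_attention(query, keys, W_q, W_k, v):
    """Simplified additive attention.

    query: decoder state, shape (d,); keys: source annotations, shape (n, d).
    W_q, W_k, v stand in for trainable parameters (illustrative names).
    """
    # score_i = v . tanh(W_q q + W_k k_i) for each source position i
    scores = np.array([v @ np.tanh(W_q @ query + W_k @ k) for k in keys])
    # softmax turns scores into alignment weights over source positions
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # context vector: weighted sum (expectation) of the source annotations
    context = weights @ keys
    return weights, context
```

Because the weights are a proper distribution over source positions, the decoder attends softly to all of them rather than committing to a hard segment.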
Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation
This work introduces a new type of linear connections, named fast-forward connections, based on deep Long Short-Term Memory (LSTM) networks, and an interleaved bi-directional architecture for stacking the LSTM layers, and achieves state-of-the-art performance and outperforms the best conventional model by 0.7 BLEU points.
Neural Machine Translation of Rare Words with Subword Units
This paper introduces a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units, and empirically shows that subword models improve over a back-off dictionary baseline for the WMT 15 translation tasks English-German and English-Russian by 1.3 BLEU.
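The core of the subword-unit approach (byte-pair encoding over a word-frequency vocabulary) can be sketched in a few lines: repeatedly count adjacent symbol pairs and merge the most frequent one. This is a toy illustration of the merge loop under simplifying assumptions (naive string replacement, space-separated symbols), not the paper's full implementation:

```python
from collections import Counter

def get_pair_counts(vocab):
    # count adjacent symbol pairs, weighted by word frequency
    pairs = Counter()
    for word, freq in vocab.items():
        syms = word.split()
        for pair in zip(syms, syms[1:]):
            pairs[pair] += freq
    return pairs

def merge_pair(pair, vocab):
    # replace every occurrence of "a b" with the merged symbol "ab"
    # (naive string replace; adequate for this toy example)
    a, b = pair
    return {word.replace(f"{a} {b}", a + b): freq for word, freq in vocab.items()}

def learn_bpe(vocab, num_merges):
    # greedily learn merge operations from a space-separated symbol vocabulary
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges, vocab
```

Running this on a toy vocabulary such as `{"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6, "w i d e s t </w>": 3}` first merges frequent pairs like `("e", "s")`, gradually building subword units such as `est</w>` that rare words can be composed from.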
Effective Approaches to Attention-based Neural Machine Translation
A global approach which always attends to all source words and a local one that only looks at a subset of source words at a time are examined, demonstrating the effectiveness of both approaches on the WMT translation tasks between English and German in both directions.
Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models
A novel word-character solution to open-vocabulary NMT is presented that successfully learns not only to generate well-formed words for Czech, a highly inflected language with a very complex vocabulary, but also to build correct representations for English source words.
On Using Very Large Target Vocabulary for Neural Machine Translation
It is shown that decoding can be efficiently done even with the model having a very large target vocabulary by selecting only a small subset of the whole target vocabulary.
A Character-level Decoder without Explicit Segmentation for Neural Machine Translation
The existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation; this paper asks whether NMT can instead translate at the character level, without any explicit segmentation.
Character-based Neural Machine Translation
A neural MT system is proposed that uses character-based embeddings in combination with convolutional and highway layers in place of the standard lookup-based word representations, providing improved results even when the source language is not morphologically rich.
Coverage-based Neural Machine Translation
Experiments show that coverage-based NMT significantly improves both alignment and translation quality over NMT without coverage.
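The coverage mechanism can be sketched as follows: a coverage vector accumulates, per source word, the attention mass spent across decoder steps, and overlap between past coverage and the current attention distribution signals over-translation. The function name and the overlap measure below are illustrative simplifications, not the paper's exact model:

```python
import numpy as np

def track_coverage(attention_steps):
    """attention_steps: array of shape (T, n), one row of attention weights
    over the n source words per decoder step."""
    coverage = np.zeros(attention_steps.shape[1])
    overlaps = []
    for weights in attention_steps:
        # mass re-spent on already-attended source words (a simple overlap measure)
        overlaps.append(float(np.minimum(coverage, weights).sum()))
        coverage += weights
    return coverage, overlaps
```

A source word whose final coverage stays near zero was likely under-translated; a large overlap at some step suggests the decoder re-attended to words it had already translated.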