• Corpus ID: 3603249

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

  • Yonghui Wu, Mike Schuster, Z. Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason R. Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Gregory S. Corrado, Macduff Hughes, Jeffrey Dean
Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. To handle rare words, the system divides words into a limited set of common sub-word units ("wordpieces") for both input and output. This method provides a good balance between the flexibility of character-delimited models and the efficiency of word-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system.
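The sub-word idea in the abstract above can be sketched with a greedy longest-match segmenter; the vocabulary and the "##" continuation prefix here are toy assumptions for illustration, not the actual GNMT wordpiece model.

```python
def wordpiece_segment(word, vocab):
    """Greedily split a word into the longest in-vocabulary sub-word units.

    Toy sketch of wordpiece-style segmentation: rare words are broken into
    known units, so the model never faces an out-of-vocabulary token.
    Non-initial pieces carry a hypothetical "##" continuation prefix.
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                pieces.append(piece)
                break
            end -= 1
        if end == start:  # no matching unit at all: fall back to an unknown marker
            return ["<unk>"]
        start = end
    return pieces

vocab = {"trans", "##lat", "##ion", "##s", "dog"}
print(wordpiece_segment("translations", vocab))  # ['trans', '##lat', '##ion', '##s']
```

A frequent word stays whole if it is in the vocabulary, while a rare word decomposes into a few pieces, which is the balance between character- and word-level models the abstract refers to.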
Deep Neural Network--based Machine Translation System Combination
This article proposes a deep neural network--based system combination framework leveraging both minimum Bayes-risk decoding and multi-source NMT, which takes as input the N-best outputs of NMT and SMT systems and produces the final translation.
Improving the Quality of Neural Machine Translation
This thesis proposes a fertility-based approach to neural machine translation, which couples the concept of fertility with sparse and constrained attention transformations, and presents two machine translation metrics that allow us to understand how much the problems of over- and under-translation are affecting the model.
Making sense of neural machine translation
This paper introduces NMT, and explains in detail, without the mathematical complexity, how neural machine translation systems work, how they are trained, and their main differences with SMT systems.
A Survey on Document-level Neural Machine Translation
The aim of this survey article is to highlight the major works that have been undertaken in the space of document-level machine translation after the neural revolution, so researchers can recognize the current state and future directions of this field.
Incorporating Statistical Machine Translation Word Knowledge Into Neural Machine Translation
  • Xing Wang, Zhaopeng Tu, Min Zhang
  • Computer Science
    IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • 2018
Experimental results on Chinese-to-English and English-to-German translation tasks show that the proposed framework can take advantage of the SMT word knowledge and consistently achieve significant improvements over NMT and SMT baseline systems.
Machine Translation Using Improved Attention-based Transformer with Hybrid Input
An attention-based deep learning architecture is proposed for MT, built on a Transformer with multi-layer encoders/decoders whose layers rely exclusively on multi-head attention; its hybrid input yields more accurate translations than non-hybrid inputs.
Neural Machine Translation With Explicit Phrase Alignment
The key idea is to build a search space similar to that of phrase-based statistical machine translation for neural machine translation where phrase alignment is readily available and a new decoding algorithm is designed that can easily impose lexical and structural constraints.
A comparative study of Neural Machine Translation frameworks for the automatic translation of open data resources
This work introduces the conventional theoretical models behind NMT, together with the required background, to provide a comprehensive view, and develops state-of-the-art NMT systems built on top of two well-known machine learning frameworks, TensorFlow and MXNet.
NoDeeLe: A Novel Deep Learning Schema for Evaluating Neural Machine Translation Systems
This paper puts to the test an evaluation model based on a novel deep learning schema (NoDeeLe), used to compare two NMT systems on four different text genres, i.e. medical, legal, marketing and literary, in the English-Greek language pair.
Statistical machine translation outperforms neural machine translation in software engineering: why and how
This work hypothesizes that software engineering corpora have inherent characteristics that pose challenges for NMT compared to a state-of-the-art translation engine based on Statistical Machine Translation, and implements and optimizes both the original SMT and NMT to mitigate those challenges.


Neural Machine Translation by Jointly Learning to Align and Translate
It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
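The soft-search described above can be sketched as a softmax over alignment scores followed by a weighted average of source states; for simplicity this illustration uses dot-product scores rather than the paper's learned alignment network.

```python
import math

def soft_attention(query, source_states):
    """Soft-search over source positions.

    Scores each source state against the decoder query, normalizes with a
    softmax, and returns (weights, context): the context vector is a
    weighted average of source states, so the decoder is not forced
    through a single fixed-length bottleneck vector.
    """
    scores = [sum(q * s for q, s in zip(query, state)) for state in source_states]
    m = max(scores)                          # subtract max for numerical stability
    exp = [math.exp(sc - m) for sc in scores]
    total = sum(exp)
    weights = [e / total for e in exp]
    context = [sum(w * state[i] for w, state in zip(weights, source_states))
               for i in range(len(query))]
    return weights, context

weights, context = soft_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Here the first source state is more similar to the query, so it receives the larger weight, and the context vector leans toward it: no hard segmentation decision is ever made.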
Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation
This work introduces fast-forward connections, a new type of linear connection for deep Long Short-Term Memory (LSTM) networks, together with an interleaved bi-directional architecture for stacking the LSTM layers, and achieves state-of-the-art performance, outperforming the best conventional model by 0.7 BLEU points.
Effective Approaches to Attention-based Neural Machine Translation
A global approach which always attends to all source words and a local one that only looks at a subset of source words at a time are examined, demonstrating the effectiveness of both approaches on the WMT translation tasks between English and German in both directions.
Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models
A novel word-character solution to achieving open-vocabulary NMT is presented that can successfully learn not only to generate well-formed words for Czech, a highly inflected language with a very complex vocabulary, but also to build correct representations for English source words.
On Using Very Large Target Vocabulary for Neural Machine Translation
It is shown that decoding can be efficiently done even with the model having a very large target vocabulary by selecting only a small subset of the whole target vocabulary.
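The subset trick above can be sketched as a softmax restricted to a candidate shortlist; the shortlist construction here (an explicit id list) is a simplifying assumption, whereas the paper builds candidates from frequent words and alignment dictionaries.

```python
import math

def shortlist_softmax(logits, candidate_ids):
    """Softmax restricted to a small candidate subset of a huge vocabulary.

    Only the shortlisted ids are scored and normalized, so the cost of the
    output layer scales with the shortlist size rather than the full
    target vocabulary size.
    """
    sub = [logits[i] for i in candidate_ids]
    m = max(sub)                      # subtract max for numerical stability
    exp = [math.exp(x - m) for x in sub]
    total = sum(exp)
    return {i: e / total for i, e in zip(candidate_ids, exp)}

# A full vocabulary of 100k logits, but only four candidates are scored.
logits = [0.0] * 100_000
logits[7], logits[42] = 2.0, 1.0
probs = shortlist_softmax(logits, [7, 42, 99, 100])
```

Decoding then picks the argmax (or beam candidates) from this small distribution, never touching the other ~100k entries.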
Character-based Neural Machine Translation
A neural MT system is proposed that replaces the standard lookup-based word representations with character-based embeddings combined with convolutional and highway layers, providing improved results even when the source language is not morphologically rich.
Coverage-based Neural Machine Translation
Experiments show that coverage-based NMT significantly improves both alignment and translation quality over NMT without coverage.
Sequence to Sequence Learning with Neural Networks
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. It finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence, which made the optimization problem easier.
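The reversal trick above is a one-line preprocessing step; the example sentence pair below is an illustrative assumption.

```python
def reverse_source(pairs):
    """Reverse the token order of each source sentence, keeping the target
    unchanged. After reversal, the first source words sit adjacent to the
    first target words, shortening the early dependencies the LSTM must
    bridge during training.
    """
    return [(list(reversed(src)), tgt) for src, tgt in pairs]

pairs = [(["je", "suis", "étudiant"], ["i", "am", "a", "student"])]
print(reverse_source(pairs))
# [(['étudiant', 'suis', 'je'], ['i', 'am', 'a', 'student'])]
```

Note the transformation is applied identically at training and inference time, so the model only ever sees reversed source order.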
Multi-Task Learning for Multiple Language Translation
The recently proposed neural machine translation model is extended to a multi-task learning framework which shares the source-language representation and separates the modeling of each target-language translation.
Pointing the Unknown Words
A novel attention-based way to deal with rare and unseen words in neural network models is proposed, which uses two softmax layers to predict the next word in conditional language models.
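The two-softmax idea can be sketched as mixing a shortlist-vocabulary distribution with a copy distribution over source positions, gated by a switch probability; the gate value and the toy distributions below are illustrative assumptions, not the paper's learned components.

```python
def pointer_mix(p_vocab, attn_weights, source_tokens, p_copy):
    """Mix a shortlist-vocabulary distribution with a pointer distribution.

    One softmax scores shortlist words (p_vocab); the other (represented
    here by attention weights over source positions) points at source
    tokens to copy. A switch probability p_copy decides how much mass
    goes to copying, so rare or unseen words can be produced by pointing.
    """
    mixed = {w: (1.0 - p_copy) * p for w, p in p_vocab.items()}
    for w, a in zip(source_tokens, attn_weights):
        mixed[w] = mixed.get(w, 0.0) + p_copy * a
    return mixed

# A rare source name absent from the shortlist becomes the top candidate
# once half the probability mass is routed through the pointer.
mixed = pointer_mix({"the": 0.6, "cat": 0.4}, [1.0], ["Kopenhagen"], 0.5)
```

With `p_copy = 0.0` the model reduces to ordinary shortlist decoding; with `p_copy = 1.0` it can only copy from the source.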