Neural Machine Translation of Rare Words with Subword Units
In this paper, we introduce a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units.
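The subword vocabulary in this paper is learned with byte-pair encoding (BPE): starting from characters, the most frequent pair of adjacent symbols is merged repeatedly. A minimal sketch of the merge-learning loop, following the algorithm and toy vocabulary given in the paper:

```python
import re, collections

def get_stats(vocab):
    """Count frequencies of adjacent symbol pairs in the vocabulary."""
    pairs = collections.defaultdict(int)
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[symbols[i], symbols[i + 1]] += freq
    return pairs

def merge_vocab(pair, vocab):
    """Merge every occurrence of the given symbol pair into one symbol."""
    bigram = re.escape(' '.join(pair))
    pattern = re.compile(r'(?<!\S)' + bigram + r'(?!\S)')
    return {pattern.sub(''.join(pair), word): freq for word, freq in vocab.items()}

# Words are represented as space-separated characters plus an end-of-word marker.
vocab = {'l o w </w>': 5, 'l o w e r </w>': 2,
         'n e w e s t </w>': 6, 'w i d e s t </w>': 3}
num_merges = 10
for _ in range(num_merges):
    pairs = get_stats(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_vocab(best, vocab)
    print(best)
```

At test time, rare words are segmented by applying the learned merges in order, so any word can be represented with the fixed subword vocabulary.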
Improving Neural Machine Translation Models with Monolingual Data
Neural Machine Translation (NMT) has obtained state-of-the-art performance for several language pairs, while only using parallel data for training. We show that target-side monolingual data can also be exploited: it is paired with automatic back-translations into the source language and treated as additional parallel training data.
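The monolingual data enters training through back-translation: target-side sentences are translated into the source language by a reverse model, and the resulting synthetic pairs are mixed with the real parallel data. A minimal sketch of the data-preparation step, assuming a hypothetical translate_to_source function standing in for the reverse model:

```python
def back_translate_corpus(mono_target, translate_to_source):
    """Create synthetic parallel data from target-side monolingual text.

    mono_target: list of sentences in the target language.
    translate_to_source: a target->source translation function
    (e.g. a reverse NMT model); hypothetical placeholder here.
    """
    return [(translate_to_source(t), t) for t in mono_target]

def build_training_data(parallel, mono_target, translate_to_source):
    # Mix real and synthetic pairs; the synthetic target side remains
    # human-written text, only the source side is machine-generated.
    synthetic = back_translate_corpus(mono_target, translate_to_source)
    return parallel + synthetic

# Example with a dummy reverse model:
dummy_reverse = lambda t: "[src] " + t
data = build_training_data([("hallo", "hello")], ["good morning"], dummy_reverse)
print(data)  # [('hallo', 'hello'), ('[src] good morning', 'good morning')]
```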
Edinburgh Neural Machine Translation Systems for WMT 16
We participated in the WMT 2016 shared news translation task by building neural translation systems for four language pairs, each trained in both directions: English–Czech, English–German, English–Romanian and English–Russian.
Nematus: a Toolkit for Neural Machine Translation
We present Nematus, a toolkit for Neural Machine Translation. The toolkit prioritizes high translation accuracy, usability, and extensibility. Nematus has been used to build top-performing submissions to shared translation tasks at WMT and IWSLT, and to train systems for production environments.
Context-Aware Neural Machine Translation Learns Anaphora Resolution
We introduce a context-aware neural machine translation model designed in such a way that the flow of information from the extended context to the translation model can be controlled and analyzed.
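One simple way to make the context's influence both controllable and inspectable is a learned gate that interpolates between the decoder state and a summary of the context sentence. The sketch below is an illustrative simplification, not the paper's exact architecture; all module and tensor names are assumptions:

```python
import torch
import torch.nn as nn

class ContextGate(nn.Module):
    """Sigmoid gate interpolating between the current decoder state and a
    context-sentence summary; the gate value shows how much context is used."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, decoder_state, context_summary):
        g = torch.sigmoid(
            self.gate(torch.cat([decoder_state, context_summary], dim=-1)))
        # g near 1: rely on the current sentence; g near 0: rely on context.
        return g * decoder_state + (1 - g) * context_summary
```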
Linguistic Input Features Improve Neural Machine Translation
Neural machine translation has recently achieved impressive results, while using little in the way of external linguistic information. We generalize the embedding layer of the encoder to support additional input features, such as lemmas, part-of-speech tags, and dependency labels, alongside the surface word form.
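Concretely, the generalized embedding layer embeds each input factor separately and concatenates the results into one input vector per token. A minimal sketch, with an illustrative feature inventory and dimensions:

```python
import torch
import torch.nn as nn

class FactoredEmbedding(nn.Module):
    """Embed each input factor (word, POS tag, dependency label, ...)
    separately and concatenate the embeddings."""
    def __init__(self, vocab_sizes, dims):
        super().__init__()
        self.tables = nn.ModuleList(
            nn.Embedding(v, d) for v, d in zip(vocab_sizes, dims))

    def forward(self, factors):
        # factors: one LongTensor of ids per feature, all the same shape.
        return torch.cat(
            [table(f) for table, f in zip(self.tables, factors)], dim=-1)

# e.g. words (50k ids, 400 dims), POS tags (50 ids, 10 dims),
# dependency labels (40 ids, 10 dims) -> 420-dim token vectors
emb = FactoredEmbedding([50000, 50, 40], [400, 10, 10])
```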
Controlling Politeness in Neural Machine Translation via Side Constraints
We propose a simple and effective method for including target-side T-V annotation in the training of a neural machine translation (NMT) system, which allows us to control the level of politeness at test time through side constraints.
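Mechanically, a side constraint is just an extra pseudo-token attached to the source sentence during preprocessing. A minimal sketch, with illustrative tag strings (the exact tokens used in the paper may differ):

```python
def add_side_constraint(source_sentence, politeness):
    """Append a politeness pseudo-token to the source sentence.
    politeness: 'polite' or 'informal', derived from the T-V forms
    observed on the target side of the training pair."""
    tag = '<T>' if politeness == 'informal' else '<V>'
    return f"{source_sentence} {tag}"

# Training: tag each pair with the politeness of its reference translation.
# Test: choose the tag to control the register of the output.
print(add_side_constraint("How are you ?", "polite"))  # -> "How are you ? <V>"
```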
Evaluating Discourse Phenomena in Neural Machine Translation
In this article, we present hand-crafted discourse test sets designed to test the models' ability to exploit previous source and target sentences.
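Test sets of this kind are typically scored contrastively: the model assigns probabilities to a correct translation and to a minimally different incorrect one, and is credited when it prefers the correct one. A sketch of such a scoring loop, assuming a hypothetical score_fn for the model's log-probability:

```python
def contrastive_accuracy(examples, score_fn):
    """examples: list of (source, correct_target, contrastive_target),
    where source may include the previous sentence(s) as context.
    score_fn(source, target) -> model log-probability of target."""
    correct = sum(
        score_fn(src, good) > score_fn(src, bad)
        for src, good, bad in examples)
    return correct / len(examples)
```

Because the two candidates differ only in the discourse-sensitive item (e.g. an anaphoric pronoun), above-chance accuracy indicates that the model actually uses the context.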
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
We evaluate the contribution made by individual attention heads to the overall performance of the model and analyze the roles played by them in the encoder.
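A simple way to estimate a head's contribution, in the spirit of this analysis, is to silence one head at a time and measure the resulting quality drop; the paper itself goes further and prunes heads with learned gates under an L0 penalty. A sketch of the ablation loop, with a hypothetical evaluate hook:

```python
import torch

def head_ablation(evaluate, num_heads):
    """evaluate(head_mask) -> quality score (e.g. BLEU) of the model with
    each head's output multiplied by the corresponding mask entry;
    a hypothetical evaluation hook, not an API of any library."""
    baseline = evaluate(torch.ones(num_heads))
    drops = []
    for h in range(num_heads):
        mask = torch.ones(num_heads)
        mask[h] = 0.0  # silence one head
        drops.append(baseline - evaluate(mask))
    return drops  # larger drop = more important ("specialized") head
```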