Corpus ID: 230435667

Decoding Time Lexical Domain Adaptation for Neural Machine Translation

@article{Bogoychev2021DecodingTL,
  title={Decoding Time Lexical Domain Adaptation for Neural Machine Translation},
  author={Nikolay Bogoychev and Pinzhen Chen},
  journal={ArXiv},
  year={2021},
  volume={abs/2101.00421}
}
  • Nikolay Bogoychev, Pinzhen Chen
  • Published 2021
  • Computer Science
  • ArXiv
  • Machine translation systems are vulnerable to domain mismatch, especially when the task is low-resource. In this setting, out-of-domain translations are often of poor quality and prone to hallucinations, because the translation model prefers to predict common words it has seen during training rather than the less common ones from a different domain. We present two simple methods for improving translation quality in this particular setting: First, we use lexical shortlisting in order to…
