Corpus ID: 239009896

Alternative Input Signals Ease Transfer in Multilingual Machine Translation

Simeng Sun, Angela Fan, James Cross, Vishrav Chaudhary, C. Tran, Philipp Koehn, Francisco Guzmán
Recent work in multilingual machine translation (MMT) has focused on the potential for positive transfer between languages, particularly cases where higher-resourced languages can benefit lower-resourced ones. While training an MMT model, the supervision signal learned from one language pair can be transferred to others via tokens shared by multiple source languages. However, this transfer is inhibited when the token overlap among source languages is small, which manifests naturally when…
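The transfer mechanism described above hinges on how much vocabulary two source languages share. A minimal sketch of measuring that overlap (a hypothetical helper for illustration, not code from the paper):

```python
# Sketch: Jaccard overlap between the token vocabularies of two toy corpora.
# Languages sharing a script tend to overlap far more than those that do not.

def token_overlap(corpus_a, corpus_b):
    """Jaccard overlap between the token sets of two tokenized corpora."""
    vocab_a = {tok for sent in corpus_a for tok in sent.split()}
    vocab_b = {tok for sent in corpus_b for tok in sent.split()}
    return len(vocab_a & vocab_b) / len(vocab_a | vocab_b)

hi = ["mera naam", "naam kya"]   # romanized Hindi (toy data)
ne = ["mero naam", "naam ke"]    # romanized Nepali (toy data)
print(token_overlap(hi, ne))     # → 0.2
```

When the two languages use different scripts, this overlap collapses toward zero, which is exactly the regime where alternative input signals (e.g. transliteration) can help.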


Multi-Source Neural Machine Translation with Missing Data
This study focuses on the use of incomplete multilingual corpora in multi-encoder NMT and mixture-of-NMT-experts models, and examines a very simple implementation where missing source translations are replaced by a special symbol.
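The special-symbol idea can be sketched as follows; the dictionary-based example format and the `<NULL>` token name are assumptions for illustration, not the paper's actual interface:

```python
# Sketch: pad incomplete multi-source examples with a special <NULL> token
# so a multi-encoder model always receives an input for every source slot.
NULL = "<NULL>"

def complete_sources(example, source_langs):
    """Return one input per source language, substituting <NULL> when absent."""
    return [example.get(lang, NULL) for lang in source_langs]

example = {"en": "Hello world", "fr": "Bonjour le monde"}  # German side missing
print(complete_sources(example, ["en", "fr", "de"]))
# → ['Hello world', 'Bonjour le monde', '<NULL>']
```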
Multilingual Neural Machine Translation With Soft Decoupled Encoding
This paper proposes Soft Decoupled Encoding (SDE), a multilingual lexicon encoding framework specifically designed to share lexical-level information intelligently without requiring heuristic preprocessing such as pre-segmenting the data.
Massively Multilingual Neural Machine Translation
It is shown that massively multilingual many-to-many models are effective in low resource settings, outperforming the previous state-of-the-art while supporting up to 59 languages in 116 translation directions in a single model.
Unsupervised Neural Machine Translation
This work proposes a novel method to train an NMT system in a completely unsupervised manner, relying on nothing but monolingual corpora, and consists of a slightly modified attentional encoder-decoder model that can be trained on monolingual corpora alone using a combination of denoising and backtranslation.
Beyond English-Centric Multilingual Machine Translation
This work creates a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages and explores how to effectively increase model capacity through a combination of dense scaling and language-specific sparse parameters to create high quality models.
Word Translation Without Parallel Data
It is shown that a bilingual dictionary can be built between two languages without using any parallel corpora, by aligning monolingual word embedding spaces in an unsupervised way.
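A standard building block for aligning two monolingual embedding spaces is an orthogonal (Procrustes) mapping. The cited work obtains its seed pairs without any supervision; the sketch below instead uses synthetic embeddings with a known rotation, purely to illustrate the closed-form alignment step:

```python
import numpy as np

def procrustes(X, Y):
    """Closed-form orthogonal W minimizing ||XW - Y||_F, via SVD of X^T Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))                    # "source language" embeddings
W_true, _ = np.linalg.qr(rng.normal(size=(8, 8)))
Y = X @ W_true                                   # "target language" embeddings

W = procrustes(X, Y)
print(np.allclose(X @ W, Y))                     # → True: rotation recovered
```

In the unsupervised setting, adversarial training first produces a rough mapping, and this Procrustes step then refines it iteratively.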
Unsupervised Machine Translation On Dravidian Languages
It is shown that transliteration is essential in unsupervised translation between Dravidian languages, as they do not share a common writing system, and several model architectures are explored that use the auxiliary data in order to maximize knowledge sharing and enable UNMT for dissimilar language pairs.
Ensemble Learning for Multi-Source Neural Machine Translation
This paper proposes several methods with different degrees of parameterization to combine individual predictions of NMT systems so that they mutually compensate for each other’s mistakes and improve overall performance, finding that the biggest improvements can be obtained from a context-dependent weighting scheme for multi-source ensembles.
Neural Machine Translation of Rare Words with Subword Units
This paper introduces a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units, and empirically shows that subword models improve over a back-off dictionary baseline for the WMT 15 translation tasks English-German and English-Russian by 1.3 BLEU.
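The subword idea can be illustrated with a minimal byte-pair-encoding sketch in the spirit of the cited paper: repeatedly merge the most frequent adjacent symbol pair. Toy vocabulary and two merge operations only; this is not the paper's full implementation:

```python
import re
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        syms = word.split()
        for a, b in zip(syms, syms[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Replace every occurrence of the pair with its concatenation."""
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), w): f for w, f in vocab.items()}

# Words represented as space-separated characters with an end-of-word marker.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6}
for _ in range(2):
    best = get_pair_counts(vocab).most_common(1)[0][0]
    vocab = merge_pair(best, vocab)
print(vocab)
# → {'lo w </w>': 5, 'lo we r </w>': 2, 'n e we s t </w>': 6}
```

Rare or unknown words then decompose into these learned subword units, so the model never needs a back-off dictionary.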
The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English
This work introduces the FLORES evaluation datasets for Nepali–English and Sinhala–English, based on sentences translated from Wikipedia, and demonstrates that current state-of-the-art methods perform rather poorly on this benchmark, posing a challenge to the research community working on low-resource MT.