The MeMAD Submission to the WMT18 Multimodal Translation Task

@inproceedings{Grnroos2018TheMS,
  title={The MeMAD Submission to the WMT18 Multimodal Translation Task},
  author={Stig-Arne Gr{\"o}nroos and Benoit Huet and Mikko Kurimo and Jorma T. Laaksonen and Bernard M{\'e}rialdo and Phu Pham and Mats Sj{\"o}berg and Umut Sulubacak and J{\"o}rg Tiedemann and Raphael Troncy and Ra{\'u}l V{\'a}zquez},
  booktitle={WMT},
  year={2018}
}
This paper describes the MeMAD project entry to the WMT Multimodal Machine Translation Shared Task. We propose adapting the Transformer neural machine translation (NMT) architecture to a multimodal setting, and also describe the preliminary experiments with text-only translation systems that led us to this choice. Our system is the top-scoring one for both English-to-German and English-to-French according to the automatic metrics on the flickr18 test set. Our experiments show that the effect…


TMU Japanese-English Multimodal Machine Translation System for WAT 2020
The experimental results indicate that translation performance can be improved using the method of textual data augmentation with noising on the target side and probabilistic dropping of either context vector in the decoder.
Findings of the Third Shared Task on Multimodal Machine Translation
Compared to last year, the performance of the multimodal submissions improved, but text-only systems remain competitive.
Multimodal Machine Translation with Embedding Prediction
This study effectively combines two approaches to improve NMT of low-resource domains in the context of multimodal NMT and explores how to take full advantage of pretrained word embeddings to better translate rare words.
Good for Misconceived Reasons: Revisiting Neural Multimodal Machine Translation (2020)
This work revisits the recent development of neural multimodal machine translation by proposing two interpretable MMT models that achieve new state-of-the-art results on the standard Multi30k dataset, and reports empirical findings that underscore the importance of MMT models' interpretability and set new paradigms for future MMT research.
Debiasing Word Embeddings Improves Multimodal Machine Translation
This study examines various kinds of word embeddings and introduces two debiasing techniques for three multimodal NMT models and two language pairs, English-German and English-French, finding that with optimal settings the overall performance of multimodal models is improved.
Probing the Need for Visual Context in Multimodal Machine Translation
This paper probes the contribution of the visual modality to state-of-the-art MMT models by conducting a systematic analysis where the models are partially deprived from source-side textual context and shows that under limited textual context, models are capable of leveraging the visual input to generate better translations.
Make the Blind Translator See The World: A Novel Transfer Learning Solution for Multimodal Machine Translation
The experimental results demonstrate that transfer learning from a monomodal pre-trained NMT model to multimodal NMT tasks yields considerable gains, as evaluated on the Multi30k en-de and en-fr datasets.
Supervised Visual Attention for Multimodal Neural Machine Translation
The experiments show that a Transformer-based MNMT model can be improved by incorporating the proposed supervised visual attention mechanism and that further improvements can be achieved by combining it with a supervised cross-lingual attention mechanism.
Distilling Translations with Visual Awareness
This work proposes a translate-and-refine approach to this problem where images are only used by a second stage decoder and shows that it has the ability to recover from erroneous or missing words in the source language.
Multimodal machine translation through visuals and speech
The paper concludes with a discussion of directions for future research in multimodal machine translation: the need for more expansive and challenging datasets, for targeted evaluations of model performance, and for multimodality in both the input and output space.

References

Showing 1-10 of 25 references
LIUM-CVC Submissions for WMT17 Multimodal Translation Task
The monomodal and multimodal Neural Machine Translation systems developed by LIUM and CVC for WMT17 Shared Task on Multimodal Translation ranked first for both En-De and En-Fr language pairs according to the automatic evaluation metrics METEOR and BLEU.
DCU System Report on the WMT 2017 Multi-modal Machine Translation Task
We report experiments with multi-modal neural machine translation models that incorporate global visual features in different parts of the encoder and decoder, using the VGG19 network to extract these features.
The Helsinki Neural Machine Translation System
We introduce the Helsinki Neural Machine Translation system (HNMT) and describe how it is applied in the news translation task at WMT 2017, where it ranked first in both the human and automatic evaluations.
Pre-Translation for Neural Machine Translation
This work used phrase-based machine translation to pre-translate the input into the target language and analyzed the influence of the quality of the initial system on the final result.
Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
This work proposes a simple solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages using a shared wordpiece vocabulary, and introduces an artificial token at the beginning of the input sentence to specify the required target language.
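The mechanism this reference describes is simple enough to sketch: an artificial token prepended to the source sentence tells a single shared model which target language to produce. The sketch below is a minimal illustration, not the paper's code, and the `<2xx>` token spelling is an assumed convention.

```python
def add_target_token(source_sentence: str, target_lang: str) -> str:
    """Prepend an artificial token that routes a shared multilingual NMT
    model to the desired target language. The '<2xx>' spelling is one
    common convention, assumed here for illustration."""
    return f"<2{target_lang}> {source_sentence}"

# The same English source, routed to German or to French:
german_input = add_target_token("Hello world !", "de")   # "<2de> Hello world !"
french_input = add_target_token("Hello world !", "fr")   # "<2fr> Hello world !"
```

Because the token is just another vocabulary item, the rest of the NMT system is unmodified, which is what makes zero-shot directions possible.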
Imagination Improves Multimodal Translation
This work decomposes multimodal translation into two sub-tasks: learning to translate and learning visually grounded representations, and finds improvements if the translation model is trained on the external News Commentary parallel text dataset.
Neural Machine Translation of Rare Words with Subword Units
This paper introduces a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units, and empirically shows that subword models improve over a back-off dictionary baseline for the WMT 15 translation tasks English-German and English-Russian by 1.3 BLEU.
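The subword-unit approach in this reference builds on byte-pair encoding: repeatedly merge the most frequent adjacent symbol pair in a character-split vocabulary. A simplified merge-learning loop, not the reference implementation, might look like this:

```python
import re
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs across a (word -> frequency) vocab."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Merge every occurrence of the pair into a single symbol."""
    bigram = re.escape(" ".join(pair))
    pattern = re.compile(r"(?<!\S)" + bigram + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq
            for word, freq in vocab.items()}

def learn_bpe(words, num_merges):
    """Learn num_merges byte-pair-encoding merge operations."""
    # Words start as space-separated sequences of characters.
    vocab = Counter(" ".join(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges
```

For example, `learn_bpe(["lower", "lowest", "low"], 2)` learns to merge `l o` and then `lo w`, so the common stem `low` becomes a single subword unit.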
Attention Is All You Need
A new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely is proposed, which generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
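The core operation of the Transformer named in this reference, scaled dot-product attention, fits in a few lines. This is a pure-Python sketch of the published formula softmax(Q K^T / sqrt(d_k)) V, not the paper's implementation:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V are lists of equal-length float vectors; written in pure
    Python for clarity rather than speed.
    """
    d_k = len(K[0])
    outputs = []
    for q in Q:
        # Dot each query against every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        # Output is the weight-averaged mix of the value rows.
        outputs.append([sum(w * v[j] for w, v in zip(weights, V))
                        for j in range(len(V[0]))])
    return outputs
```

With a query that closely matches the first key, the output row is dominated by the first value vector, which is the selective-mixing behavior the paper builds on.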
chrF: character n-gram F-score for automatic MT evaluation
The proposed use of character n-gram F-score for automatic evaluation of machine translation output shows very promising results, especially for the CHRF3 score – for translation from English, this variant showed the highest segment-level correlations outperforming even the best metrics on the WMT14 shared evaluation task.
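The metric in this reference is straightforward to approximate: character n-gram precision and recall are averaged over n-gram orders and combined with an F-beta score, where beta=3 (chrF3) weights recall three times as much as precision. Below is a simplified sketch that strips whitespace and averages orders uniformly; the official implementation differs in detail.

```python
from collections import Counter

def char_ngrams(text, n):
    """All character n-grams of order n, with spaces removed."""
    text = text.replace(" ", "")
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf(hypothesis, reference, max_order=6, beta=3.0):
    """Simplified chrF: average character n-gram precision and recall
    over orders 1..max_order, then combine with an F-beta score."""
    precisions, recalls = [], []
    for n in range(1, max_order + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        if sum(hyp.values()) > 0:
            precisions.append(overlap / sum(hyp.values()))
        if sum(ref.values()) > 0:
            recalls.append(overlap / sum(ref.values()))
    if not precisions or not recalls:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)
```

Identical strings score 1.0 and fully disjoint strings score 0.0, with partial character overlap falling in between.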
OpenNMT: Open-Source Toolkit for Neural Machine Translation
The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements.