Corpus ID: 44157913

A Survey of Domain Adaptation for Neural Machine Translation

@inproceedings{Chu2018ASO,
  title={A Survey of Domain Adaptation for Neural Machine Translation},
  author={Chenhui Chu and Rui Wang},
  booktitle={COLING},
  year={2018}
}
Neural machine translation (NMT) is a deep learning based approach to machine translation that yields state-of-the-art translation performance in scenarios where large-scale parallel corpora are available. Although high-quality, domain-specific translation is crucial in the real world, domain-specific corpora are usually scarce or nonexistent, and thus vanilla NMT performs poorly in such scenarios. Domain adaptation that leverages both out-of-domain parallel corpora as well as… 

Citations

A Survey of Domain Adaptation for Machine Translation
TLDR
This paper gives a comprehensive survey of state-of-the-art domain adaptation techniques for MT; it briefly reviews domain adaptation for SMT but devotes most of its attention to domain adaptation for NMT.
Multilingual Multi-Domain Adaptation Approaches for Neural Machine Translation
TLDR
Experiments show that multilingual multi-domain adaptation can significantly improve both resource-poor in-domain and resource-rich out-of-domain translations, and the combination of the proposed methods with mixed fine tuning achieves the best performance.
Lexical Micro-adaptation for Neural Machine Translation
TLDR
A generic framework applied at inference is introduced in which a subset of segment pairs is first extracted from the training data according to their similarity to the input sentences, thus performing a lexical micro-adaptation in a generic NMT network.
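As a rough illustration of the retrieval step described above, the sketch below extracts the training pairs most similar to an input sentence. The character-level difflib ratio and the function names are illustrative assumptions, not the paper's actual similarity measure:

```python
# Sketch: retrieve training segment pairs similar to an input sentence,
# the first step of lexical micro-adaptation. difflib stands in for
# whatever similarity measure the paper actually uses (an assumption).
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Character-level ratio in [0, 1]; a stand-in similarity measure."""
    return SequenceMatcher(None, a, b).ratio()

def retrieve_similar_pairs(input_sentence, train_pairs, k=3):
    """Return the k (source, target) training pairs whose source side
    is most similar to the input sentence."""
    scored = sorted(train_pairs,
                    key=lambda pair: similarity(input_sentence, pair[0]),
                    reverse=True)
    return scored[:k]

train_pairs = [
    ("the patient received a high dose", "le patient a reçu une forte dose"),
    ("the court dismissed the appeal", "la cour a rejeté l'appel"),
    ("the patient received treatment", "le patient a reçu un traitement"),
]
print(retrieve_similar_pairs("the patient received a low dose", train_pairs, k=2))
```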
Unsupervised Domain Adaptation for Neural Machine Translation with Iterative Back Translation
TLDR
An Iterative Back-Translation training scheme is applied to in-domain monolingual data, which repeatedly uses a Transformer-based NMT model to create in-domain pseudo-parallel sentence pairs in one translation direction on the fly and then uses them to train the model in the other direction.
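A minimal sketch of that iterative loop, with stub `translate` and `fine_tune` functions standing in for real Transformer models; only the control flow reflects the method:

```python
# Sketch of the iterative back-translation loop described above.
# `translate` and `fine_tune` are toy stubs standing in for real
# Transformer NMT models (an assumption for runnability).

def translate(model, sentences):
    # Stub: a real system would run beam search with `model`.
    return [f"<{model}-translation of: {s}>" for s in sentences]

def fine_tune(model, pseudo_parallel):
    # Stub: a real system would update model parameters on the pairs.
    return f"{model}+tuned"

src_to_tgt, tgt_to_src = "S2T", "T2S"
mono_src = ["in-domain source sentence"]
mono_tgt = ["in-domain target sentence"]

for iteration in range(3):
    # Back-translate target monolingual data to make pseudo pairs,
    # train the forward model on them, then do the reverse direction.
    pseudo_src = translate(tgt_to_src, mono_tgt)
    src_to_tgt = fine_tune(src_to_tgt, list(zip(pseudo_src, mono_tgt)))

    pseudo_tgt = translate(src_to_tgt, mono_src)
    tgt_to_src = fine_tune(tgt_to_src, list(zip(mono_src, pseudo_tgt)))
```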
Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey
TLDR
This work focuses on robust approaches to domain adaptation for NMT, particularly where a system may need to translate across multiple domains, and divides techniques into those revolving around data selection or generation, model architecture, parameter adaptation procedure, and inference procedure.
An Empirical Study of Domain Adaptation for Unsupervised Neural Machine Translation
TLDR
This work empirically examines different scenarios for unsupervised domain-specific neural machine translation and proposes several potential solutions to improve the performance of domain-specific UNMT systems.
Unsupervised Domain Adaptation for Neural Machine Translation with Domain-Aware Feature Embeddings
TLDR
This work proposes an approach that adapts models with domain-aware feature embeddings, which are learned via an auxiliary language modeling task, allowing the model to assign domain-specific representations to words and to output sentences in the desired domain.
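A small sketch of the representational idea, under the assumption that the domain-aware representation is the sum of a word embedding and a learned domain embedding; the dimensions and random initialization are illustrative:

```python
# Sketch: add a learned domain embedding to each word embedding, the
# core representational idea of domain-aware feature embeddings.
# Vocab, dimensions, and random init are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
vocab = {"the": 0, "dose": 1, "appeal": 2}
domains = {"medical": 0, "law": 1}
dim = 4

word_emb = rng.normal(size=(len(vocab), dim))      # trained via the NMT loss
domain_emb = rng.normal(size=(len(domains), dim))  # trained via the auxiliary LM loss

def embed(tokens, domain):
    """Domain-aware representation: word embedding + domain embedding."""
    d = domain_emb[domains[domain]]
    return np.stack([word_emb[vocab[t]] + d for t in tokens])

print(embed(["the", "dose"], "medical").shape)  # (2, 4)
```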
Rapid Domain Adaptation for Machine Translation with Monolingual Data
TLDR
This paper proposes an approach that enables rapid domain adaptation from the perspective of unsupervised translation, which only requires in-domain monolingual data and can be quickly applied to a preexisting translation system trained on a general domain.
Domain Adaptation of Neural Machine Translation by Lexicon Induction
TLDR
This paper proposes an unsupervised adaptation method that fine-tunes a pre-trained out-of-domain NMT model using a pseudo-in-domain corpus: it performs lexicon induction to extract an in-domain lexicon and constructs a pseudo-parallel in-domain corpus.
Domain Adaptation for NMT via Filtered Iterative Back-Translation
TLDR
A simpler filtering approach based on a domain classifier, applied only to the pseudo-training data, can consistently perform better, providing BLEU gains of 1.40, 1.82, and 0.76 for Medical, Law, and IT, respectively, in one translation direction.
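A toy sketch of such classifier-based filtering of pseudo-training data; the keyword-count scorer stands in for the paper's actual domain classifier (an assumption), and scoring the human-written target side is likewise illustrative:

```python
# Sketch: keep only back-translated pseudo pairs that a domain
# classifier scores as in-domain. The keyword-count "classifier" is a
# toy stand-in for a trained model (an assumption).

MEDICAL_TERMS = {"dose", "patient", "symptom"}

def domain_score(sentence: str) -> float:
    """Toy in-domain score: fraction of tokens that are domain terms."""
    tokens = sentence.lower().split()
    return sum(t in MEDICAL_TERMS for t in tokens) / max(len(tokens), 1)

def filter_pseudo_pairs(pairs, threshold=0.2):
    """Apply the filter only to pseudo-training data, as the summary notes."""
    return [(s, t) for s, t in pairs if domain_score(t) >= threshold]

pseudo = [("src a", "the patient took a dose"), ("src b", "the court ruled")]
print(filter_pseudo_pairs(pseudo))  # keeps only the medical-looking pair
```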

References

Showing 1–10 of 99 references
An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation
TLDR
A novel domain adaptation method named “mixed fine tuning” for neural machine translation (NMT) is proposed, which combines two existing approaches, namely fine tuning and multi-domain NMT.
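A minimal sketch of the data preparation behind mixed fine tuning as described here: oversample the small in-domain corpus to roughly match the out-of-domain corpus and optionally tag each source sentence with its domain. The tag spelling and corpus contents are illustrative assumptions:

```python
# Sketch of mixed fine-tuning's data preparation: oversample the small
# in-domain corpus, optionally add domain tags, and fine-tune a model
# pre-trained on out-of-domain data on the resulting mixture.

def mix(in_domain, out_domain, tag_domains=True):
    # Oversample in-domain pairs to roughly match the out-of-domain size.
    factor = max(len(out_domain) // max(len(in_domain), 1), 1)
    oversampled = in_domain * factor
    if tag_domains:
        # Illustrative tag format; the real tokens are a design choice.
        oversampled = [("<in> " + s, t) for s, t in oversampled]
        out_domain = [("<out> " + s, t) for s, t in out_domain]
    return oversampled + out_domain

in_dom = [("in src", "in tgt")]
out_dom = [("out src %d" % i, "out tgt %d" % i) for i in range(4)]
mixed = mix(in_dom, out_dom)
print(len(mixed), mixed[0])
# A model pre-trained on out-of-domain data is then fine-tuned on `mixed`.
```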
Sentence Embedding for Neural Machine Translation Domain Adaptation
TLDR
The NMT system’s internal embedding of the source sentence is exploited, and sentence-embedding similarity is used to select sentences that are close to in-domain data, substantially improving NMT performance.
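A rough sketch of embedding-based sentence selection, assuming sentence embeddings are already available; the random vectors stand in for the NMT encoder's internal representations:

```python
# Sketch: select out-of-domain sentences whose embeddings lie close to
# the in-domain centroid. Random vectors stand in for the encoder's
# sentence embeddings (an assumption for runnability).
import numpy as np

rng = np.random.default_rng(1)
in_domain_emb = rng.normal(loc=1.0, size=(50, 8))   # in-domain sentence embeddings
candidate_emb = rng.normal(loc=0.0, size=(200, 8))  # candidate sentence embeddings

centroid = in_domain_emb.mean(axis=0)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

scores = np.array([cosine(e, centroid) for e in candidate_emb])
selected = np.argsort(scores)[::-1][:20]  # indices of the 20 closest sentences
print(selected[:5], scores[selected[:5]])
```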
Multilingual and Multi-Domain Adaptation for Neural Machine Translation
TLDR
This paper proposes to simultaneously use both multilingual and multi-domain data for domain adaptation of NMT, which might outperform methods that use them independently.
Domain Control for Neural Machine Translation
TLDR
A new technique for neural machine translation (NMT), called domain control, is proposed; it is applied at runtime using a single neural network covering multiple domains, and it shows quality improvements over dedicated single-domain models on any of the covered domains and even on out-of-domain data.
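A minimal sketch of the token flavor of this idea: prepend a domain tag to each source sentence so that a single multi-domain model can be steered at runtime. The `@med@`-style tag format is an illustrative assumption:

```python
# Sketch of token-based domain control: a domain tag prepended to the
# source sentence steers one multi-domain model at runtime.
# The tag spelling is illustrative, not the paper's exact format.

def add_domain_tag(sentence: str, domain: str) -> str:
    return f"@{domain}@ {sentence}"

# Training: every source sentence carries its domain tag.
train = [add_domain_tag("the patient received a dose", "med"),
         add_domain_tag("the court dismissed the appeal", "law")]

# Inference: the user (or a classifier) picks the tag for the input.
print(add_domain_tag("symptoms improved after treatment", "med"))
```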
Multi-domain Adaptation for Statistical Machine Translation Based on Feature Augmentation
TLDR
This paper presents domain adaptation methods for machine translation that assume multiple domains and combine two model types: a corpus-concatenated model covering multiple domains and single-domain models that are accurate but sparse in specific domains.
Fast Domain Adaptation for Neural Machine Translation
TLDR
This paper proposes an approach for adapting an NMT system to a new domain, built on the main idea behind domain adaptation: the availability of large out-of-domain training data and a small amount of in-domain training data.
Semi-Supervised Learning for Neural Machine Translation
TLDR
This work proposes a semi-supervised approach for training NMT models on the concatenation of labeled (parallel) and unlabeled (monolingual) corpora, in which the source-to-target and target-to-source translation models serve as the encoder and decoder, respectively.
Instance Weighting for Neural Machine Translation Domain Adaptation
TLDR
Two instance weighting techniques, i.e., sentence weighting and domain weighting with a dynamic weight learning strategy, are proposed for NMT domain adaptation, and empirical results show that the proposed methods can substantially improve NMT performance.
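A toy sketch of sentence weighting as a weighted training objective, with made-up per-sentence losses and weights; a dynamic strategy would additionally update the weights during training:

```python
# Sketch: sentence-level instance weighting as a weighted training
# loss, so in-domain-like sentences contribute more per update.
# The per-sentence losses and weights are toy numbers.
import numpy as np

sent_loss = np.array([2.3, 1.7, 3.1])       # per-sentence NMT loss
domain_weight = np.array([1.0, 0.2, 0.9])   # higher = more in-domain-like

# Weighted objective: scale each sentence's loss by its domain weight
# before averaging.
weighted_loss = (domain_weight * sent_loss).sum() / domain_weight.sum()
print(weighted_loss)
```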
A Comprehensive Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation
TLDR
This work empirically compares existing domain adaptation methods for NMT and proposes a novel domain adaptation method named mixed fine tuning, which combines two existing methods, namely fine tuning and multi-domain NMT.
Sentence Selection and Weighting for Neural Machine Translation Domain Adaptation
TLDR
Empirical results show that the sentence selection and weighting methods can significantly improve NMT performance, outperforming existing baselines.