Publications
Text Simplification for Reading Assistance: A Project Note
TLDR
The issues that must be addressed to realize text simplification are discussed, and present results on three different aspects of this task are reported: readability assessment, paraphrase representation, and post-transfer error detection.
Enhancement of Encoder and Attention Using Target Monolingual Corpora in Neural Machine Translation
TLDR
The experimental results show that translation quality improves as the number of synthetic source sentences per target sentence increases, and that quality close to that obtained with a manually created parallel corpus was achieved.
Automatic Generation of Syntactically Well-formed and Semantically Appropriate Paraphrases
TLDR
A paraphrase generation model consisting of a case assignment rule and a handful of LCS transformation rules is implemented, with particular focus on verb alternation and compound noun decomposition; experimental results indicate that the model significantly outperforms conventional models.
Exploiting Multilingualism through Multistage Fine-Tuning for Low-Resource Neural Machine Translation
TLDR
This paper reports on a systematic comparison of multistage fine-tuning configurations, confirming that multi-parallel corpora are extremely useful despite their scarcity and content-wise redundancy, thus exhibiting the true power of multilingualism.
NICT’s Unsupervised Neural and Statistical Machine Translation Systems for the WMT19 News Translation Task
TLDR
NICT’s participation in the WMT19 unsupervised news translation task is presented; the system ranked first for the German-to-Czech translation task, using only the data provided by the organizers (“constrained”), according to both BLEU-cased and human evaluation.
Unsupervised Neural Machine Translation Initialized by Unsupervised Statistical Machine Translation
TLDR
This work proposes to define unsupervised NMT (UNMT) as NMT trained with the supervision of synthetic bilingual data; this definition straightforwardly enables the use of state-of-the-art architectures proposed for supervised NMT by replacing human-made bilingual data with synthetic bilingual data for training.
Enlarging Paraphrase Collections through Generalization and Instantiation
TLDR
A paraphrase acquisition method that uncovers and exploits generalities underlying paraphrases, using both bilingual parallel and monolingual corpora: paraphrase patterns are first induced and then used to collect novel instances.
Paraphrasing of Japanese Light-verb Constructions Based on Lexical Conceptual Structure
TLDR
Experimental results show that the LCS-based paraphrasing model characterizes some of the semantic features of those verbs required for generating paraphrases, such as the direction of an action and the relationship between arguments and surface cases.
NICT’s Participation in WAT 2018: Approaches Using Multilingualism and Recurrently Stacked Layers
TLDR
This paper describes all the NMT systems for the translation tasks the authors participated in, noting that a single multilingual/bidirectional model (without ensembling) has the potential to achieve (near) state-of-the-art results for all the language pairs.
Combination of Neural Machine Translation Systems at WMT20
TLDR
This paper presents neural machine translation systems, and their combination, built for the WMT20 English→Polish and Japanese→English translation tasks, and reveals that the presence of translationese texts in the validation data led to decisions in building the NMT systems that were not optimal for obtaining the best results on the test data.