• Corpus ID: 218591349

Evaluating the suitability of human-oriented text simplification for machine translation

Rei Miyata and Midori Tatsumi
We present the results of an experiment to evaluate the suitability of simplified text as a source for machine translation (MT). Focusing on Japanese as the source language, we first proposed the simplest possible rule set for writing text that can be easily understood by language learners and children. Following this rule set, we manually rewrote expository sentences concerning Japanese cultural assets in simplified Japanese, through two steps: (1) splitting long sentences into short complete…


Japanese controlled language rules to improve machine translatability of municipal documents
A comparison of four MT systems showed that the effectiveness of CL rules varies with the particular MT system, and a preliminary selection of optimal rules for each system yielded an increase of more than 15% in MT performance.
Readability and Translatability Judgments for “Controlled Japanese”
An experiment to test the efficacy of ‘controlled language’ authoring of technical documents in Japanese, with respect both to the readability of the Japanese source and the quality of the English machine-translated output.
Improving Machine Translation of English Relative Clauses with Automatic Text Simplification
This article explores the use of automatic sentence simplification as a preprocessing step in neural machine translation of English relative clauses into grammatically complex languages and shows that this approach can reduce technical post-editing effort to obtain correct translation.
Can Text Simplification Help Machine Translation?
The use of text simplification as a pre-processing step for statistical machine translation of grammatically complex under-resourced languages can improve grammaticality (fluency) of the translation output and reduce technical post-editing effort.
Impact of controlled language on translation quality and post-editing in a statistical machine translation environment
This paper examines whether the use of CL improves productivity in terms of reduced PE effort, using character-based edit distance, and measures the degree of impact of CL rules on MT quality based on differences in human evaluation as well as BLEU scores.
A linguistically motivated taxonomy for Machine Translation error analysis
This paper significantly extends previous error taxonomies so that translation errors associated with Romance language specificities can be accommodated and carries out an extensive analysis of the errors generated by four different systems.
Simplified Corpus with Core Vocabulary
Despite vocabulary restrictions, the simplified corpus for the Japanese language achieved high quality in grammaticality and meaning preservation and it is believed that the same quality can be obtained by extending this corpus.
Sentence Splitting for Vietnamese-English Machine Translation
A rule-based technique is proposed to split long Vietnamese sentences based on linguistic information and is used for translating sentences with two types of constraints, wall and zone, showing an improvement in BLEU and NIST scores.
Overview of the Patent Machine Translation Task at the NTCIR-10 Workshop
An overview of the Patent Machine Translation Task (PatentMT) at NTCIR-10 is given by describing the test collection, evaluation methods, and evaluation results.
A Survey of Automated Text Simplification
This survey identifies and classifies simplification research within the period 1998-2013 and gives an overview of contemporary research whilst taking into account the history that has brought text simplification to its current state.