Learn More
This paper describes methods and results for the annotation of two discourse-level phenomena, connectives and pronouns, over a multilingual parallel corpus. Excerpts from Europarl in English and French have been annotated with disambiguation information for connectives and pronouns, for about 3600 tokens. This data is then used in several ways: for(More)
Inter-sentential dependencies such as discourse connectives or pronouns have an impact on the translation of these items. These dependencies have classically been analyzed within complex theoretical frameworks, often monolingual ones, and the resulting fine-grained descriptions, although relevant to translation, are likely beyond reach of statistical(More)
In this paper, we are going to evaluate a statistical machine translation (SMT) system, Moses, trained for several language pairs to explore how it cope with multiword expression (MWE) translation. We will experiment with Czech-Russian, English-French and English-Sinhala language pairs to make sure that our conclusions are as language-independent as(More)
This paper presents a solution to evaluate spoken post-editing of imperfect machine translation output by a human translator. We compare two approaches to the combination of machine translation (MT) and automatic speech recognition (ASR): a heuristic algorithm and a machine learning method. To obtain a data set with spoken post-editing information, we use(More)
  • 1