Fabienne Cap

The current state-of-the-art in statistical machine translation (SMT) suffers from issues of sparsity and inadequate modeling power when translating into morphologically rich languages. We model both inflection and word-formation for the task of translating into German. We translate from English words to an underspecified German representation and then use…
Compounding in morphologically rich languages is a highly productive process which often causes SMT approaches to fail because of unseen words. We present an approach for translation into a compounding language that splits compounds into simple words for training and, due to an underspecified representation, allows for free merging of simple words into…
Support-verb constructions (i.e., multiword expressions combining a semantically light verb with a predicative noun) are problematic for standard statistical machine translation systems, because SMT systems cannot distinguish between literal and idiomatic uses of the verb. We work on the German-to-English translation direction, for which the identification…
This paper summarises the contributions of the teams at the University of Helsinki, Uppsala University and the University of Turku to the news translation tasks for translating from and to Finnish. Our models address the problem of treating morphology and data coverage in various ways. We introduce a new efficient tool for word alignment and discuss…
The paper presents an approach to morphological compound splitting that takes the degree of compositionality into account. We apply our approach to German noun compounds and particle verbs within a German–English SMT system, and study the effect of only splitting compositional compounds as opposed to an aggressive splitting. A qualitative study explores the…
We present the CimS submissions to the 2014 Shared Task for the language pair EN→DE. We address the major problems that arise when translating into German: complex nominal and verbal morphology, productive compounding and flexible word ordering. Our morphology-aware translation systems handle word formation issues on different levels of morpho-syntactic…
We present the CimS submissions to the WMT 2015 Shared Task for the translation direction English to German. Similar to our previous submissions, all of our systems are aware of the complex nominal morphology of German. In this paper, we combine source-side reordering and target-side compound processing with basic morphological processing in order to obtain…