Daniel Déchelotte

This paper presents a two-way speech translation system that is completely hosted on an off-the-shelf handheld device. Specifically, this end-to-end system includes an HMM-based large vocabulary continuous speech recognizer (LVCSR) for both English and Chinese using statistical n-grams, a two-way translation system between English and Chinese, and, a …
This paper describes our statistical machine translation systems based on the Moses toolkit for the WMT08 shared task. We address the Europarl and News conditions for the following language pairs: English with French, German and Spanish. For Europarl, n-best rescoring is performed using an enhanced n-gram or a neuronal language model; for the News …
This paper reports on recent experiments for speech to text (STT) translation of European Parliamentary speeches. A Spanish speech to English text translation system has been built using data from the TC-STAR European project. The speech recognizer is a state-of-the-art multipass system trained for the Spanish EPPS task and the statistical translation …
This paper describes an approach for computing a consensus translation from the outputs of multiple machine translation (MT) systems. The consensus translation is computed by weighted majority voting on a confusion network, similarly to the well-established ROVER approach of Fiscus for combining speech recognition hypotheses. To create the confusion …
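The voting step named in this abstract can be illustrated with a minimal sketch. It assumes the confusion network has already been built and is given as a list of slots, each holding one (system index, word) pair per MT system, with an empty string standing for "no word in this slot". The function name `consensus`, the data layout, and the weights are hypothetical choices made for illustration; they are not taken from the paper, and the alignment procedure that creates the network is not shown.

```python
from collections import defaultdict

EPS = ""  # assumed marker for an empty arc: the system proposes no word in this slot


def consensus(confusion_network, system_weights):
    """Weighted majority voting: in each slot, keep the word whose
    supporting systems have the largest total weight."""
    output = []
    for slot in confusion_network:
        votes = defaultdict(float)
        for system_id, word in slot:
            votes[word] += system_weights[system_id]
        # On a tie, max() returns the first word that reached the top score.
        best = max(votes, key=votes.get)
        if best != EPS:  # an empty winner contributes nothing to the output
            output.append(best)
    return output


# Toy network for three hypothetical systems (illustrative data only).
network = [
    [(0, "the"), (1, "the"), (2, "a")],
    [(0, "house"), (1, "home"), (2, "house")],
    [(0, EPS), (1, "is"), (2, "is")],
    [(0, "small"), (1, "small"), (2, "little")],
]
weights = [0.4, 0.35, 0.25]

print(consensus(network, weights))  # ['the', 'house', 'is', 'small']
```

In the third slot the two lower-weighted systems together outvote the highest-weighted one, which is the behaviour that distinguishes weighted voting from simply selecting the best single system.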
Combining automatic speech recognition and machine translation is frequent in current research programs. This paper first presents several pre-processing steps to limit the performance degradation observed when translating an automatic transcription (as opposed to a manual transcription). Indeed, automatically transcribed speech often differs significantly …
The purpose of this work is to explore the integration of morphosyntactic information into the translation model itself, by enriching words with their morphosyntactic categories. We investigate word disambiguation using morphosyntactic categories, n-best hypotheses reranking, and the combination of both methods with word or morphosyntactic n-gram language …
This paper describes a statistical machine translation system based on freely available programs such as Moses. Several new features were added, in particular a two-pass decoding strategy using n-best lists and a continuous space language model that aims at taking better advantage of the limited training data. We also investigated lexical disambiguation …
In the French administrative "département" of Côte-d'Or, between 1982 and 1990, the crude incidence rate and the age-adjusted world standardised incidence rate (ASR) for corpus uteri cancer were respectively 16.0 ± 0.8 and 10.7 ± 0.6 per 100,000 women per year. The incidence increased after 50 years of age, reaching a maximum of 66.7 per 100,000 women …