• Publications
cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models
We present cdec, an open source framework for decoding, aligning with, and training a number of statistical machine translation models, including word-based models, phrase-based models, and models…
  • 241
  • 18
Improved Speech-to-Text Translation with the Fisher and Callhome Spanish–English Speech Translation Corpus
Research into the translation of the output of automatic speech recognition (ASR) systems is hindered by the dearth of datasets developed for that explicit purpose. For Spanish–English translation, in…
  • 69
  • 12
A unified framework for phrase-based, hierarchical, and syntax-based statistical machine translation
Despite many differences between phrase-based, hierarchical, and syntax-based translation models, their training and testing pipelines are strikingly similar. Drawing on this fact, we extend the…
  • 71
  • 12
Tera-Scale Translation Models via Pattern Matching
  • A. Lopez
  • Computer Science
  • COLING
  • 18 August 2008
Translation model size is growing at a pace that outstrips improvements in computing power, and this hinders research on many interesting models. We show how an algorithmic scaling technique can be…
  • 64
  • 10
Hierarchical Phrase-Based Translation with Suffix Arrays
  • A. Lopez
  • Computer Science
  • EMNLP-CoNLL
  • 1 June 2007
A major engineering challenge in statistical machine translation systems is the efficient representation of extremely large translation rulesets. In phrase-based models, this problem can be addressed…
  • 120
  • 8
Neural Networks For Negation Scope Detection
Automatic negation scope detection is a task that has been tackled using different classifiers and heuristics. Most systems are however 1) highly-engineered, 2) English-specific, and 3) only tested…
  • 54
  • 8
Pre-training on high-resource speech recognition improves low-resource speech-to-text translation
We present a simple approach to improve direct speech-to-text translation (ST) when the source language is low-resource: we pre-train the model on a high-resource automatic speech recognition (ASR)…
  • 54
  • 6
The Hiero Machine Translation System: Extensions, Evaluation, and Analysis
Hierarchical organization is a well known property of language, and yet the notion of hierarchical structure has been largely absent from the best performing machine translation systems in recent…
  • 62
  • 5
A Systematic Analysis of Translation Model Search Spaces
Translation systems are complex, and most metrics do little to pinpoint causes of error or isolate system differences. We use a simple technique to discover induction errors, which occur when good…
  • 36
  • 5