Moses: Open Source Toolkit for Statistical Machine Translation
TLDR
We describe an open-source toolkit for statistical machine translation whose novel contributions are (a) support for linguistically motivated factors, (b) confusion network decoding, and (c) efficient data formats for translation models and language models.
  • 5,437
  • 775
PPDB: The Paraphrase Database
TLDR
We present the 1.0 release of our paraphrase database, PPDB, which contains over 220 million paraphrase pairs, consisting of 73 million phrasal and 8 million lexical paraphrases.
  • 613
  • 109
Optimizing Statistical Machine Translation for Text Simplification
TLDR
We present an in-depth adaptation of statistical machine translation to perform text simplification, taking advantage of large-scale paraphrases learned from bilingual texts and a small amount of manual simplifications with multiple references.
  • 196
  • 81
Findings of the 2012 Workshop on Statistical Machine Translation
TLDR
This paper presents the results of the WMT12 shared tasks, which included a translation task, a task for machine translation evaluation metrics, and a task for run-time estimation of machine translation quality.
  • 337
  • 59
PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification
We present a new release of the Paraphrase Database. PPDB 2.0 includes a discriminatively re-ranked set of paraphrases that achieve a higher correlation with human judgments than PPDB 1.0’s heuristic…
  • 229
  • 57
Paraphrasing with Bilingual Parallel Corpora
TLDR
We introduce a novel method for extracting paraphrases that uses bilingual parallel corpora, a much more commonly available resource.
  • 581
  • 56
(Meta-) Evaluation of Machine Translation
TLDR
This paper evaluates the translation quality of machine translation systems for 8 language pairs: translating French, German, Spanish, and Czech to English and back.
  • 394
  • 48
Problems in Current Text Simplification Research: New Data Can Help
TLDR
We introduce a new simplification dataset that is a significant improvement over Simple Wikipedia, and present a novel quantitative-comparative approach to study the quality of simplification data resources.
  • 173
  • 45
Edinburgh system description for the 2005 IWSLT speech translation evaluation
TLDR
We adapted our statistical machine translation system, which performed successfully in previous DARPA competitions on open-domain text translation, to the limited-domain speech translation task.
  • 403
  • 43