Moses: Open Source Toolkit for Statistical Machine Translation
We describe an open-source toolkit for statistical machine translation whose novel contributions are (a) support for linguistically motivated factors, (b) confusion network decoding, and (c) ...
PPDB: The Paraphrase Database
TLDR
The 1.0 release of the paraphrase database, PPDB, contains over 220 million paraphrase pairs, consisting of 73 million phrasal and 8 million lexical paraphrases, as well as 140 million paraphrase patterns, which capture many meaning-preserving syntactic transformations.
Optimizing Statistical Machine Translation for Text Simplification
TLDR
This work is the first to design automatic metrics that are effective for tuning and evaluating simplification systems, which will facilitate iterative development for this task.
PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification
TLDR
PPDB 2.0 includes a discriminatively re-ranked set of paraphrases that achieve a higher correlation with human judgments than PPDB 1.0's heuristic rankings.
Findings of the 2012 Workshop on Statistical Machine Translation
TLDR
A large-scale manual evaluation of 103 machine translation systems submitted by 34 teams was conducted, and the resulting system rankings were used to measure how strongly 12 automatic evaluation metrics correlate with human judgments of translation quality.
Problems in Current Text Simplification Research: New Data Can Help
TLDR
This opinion paper argues that focusing on Wikipedia limits simplification research, introduces a new simplification dataset that is a significant improvement over Simple Wikipedia, and presents a novel quantitative-comparative approach to studying the quality of simplification data resources.
Paraphrasing with Bilingual Parallel Corpora
TLDR
This work defines a paraphrase probability that allows paraphrases extracted from a bilingual parallel corpus to be ranked using translation probabilities, and shows how it can be refined to take contextual information into account.
(Meta-) Evaluation of Machine Translation
TLDR
An extensive human evaluation was carried out not only to rank the different MT systems, but also to perform higher-level analysis of the evaluation process, revealing surprising facts about the most commonly used methodologies.
Edinburgh System Description for the 2005 IWSLT Speech Translation Evaluation
TLDR
This work adapted the statistical machine translation system that performed successfully in previous DARPA competitions on open-domain text translation to work on limited-domain speech data in the IWSLT 2005 speech translation task.