Moses: Open Source Toolkit for Statistical Machine Translation
We describe an open-source toolkit for statistical machine translation whose novel contributions are (a) support for linguistically motivated factors, (b) confusion network decoding, and (c)…
Optimizing Statistical Machine Translation for Text Simplification
- W. Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch
- Computer Science · TACL
- 27 July 2016
This work is the first to design automatic metrics that are effective for tuning and evaluating simplification systems, which will facilitate iterative development for this task.
PPDB: The Paraphrase Database
The 1.0 release of the paraphrase database, PPDB, contains over 220 million paraphrase pairs, consisting of 73 million phrasal and 8 million lexical paraphrases, as well as 140 million paraphrase patterns, which capture many meaning-preserving syntactic transformations.
Problems in Current Text Simplification Research: New Data Can Help
This opinion paper argues that the field's focus on Wikipedia limits simplification research; it introduces a new simplification dataset that is a significant improvement over Simple Wikipedia, and presents a novel quantitative-comparative approach to studying the quality of simplification data resources.
Open Source Toolkit for Statistical Machine Translation: Factored Translation Models and Lattice Decoding
PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification
- Ellie Pavlick, Pushpendre Rastogi, Juri Ganitkevitch, Benjamin Van Durme, Chris Callison-Burch
- Computer Science · ACL
- 1 July 2015
PPDB 2.0 includes a discriminatively re-ranked set of paraphrases that achieve a higher correlation with human judgments than PPDB 1.0's heuristic rankings.
Paraphrasing with Bilingual Parallel Corpora
This work defines a paraphrase probability that allows paraphrases extracted from a bilingual parallel corpus to be ranked using translation probabilities, and shows how it can be refined to take contextual information into account.
Findings of the 2012 Workshop on Statistical Machine Translation
- Chris Callison-Burch, Philipp Koehn, Christof Monz, Matt Post, Radu Soricut, Lucia Specia
- Computer Science, Psychology · WMT@NAACL-HLT
- 7 June 2012
A large-scale manual evaluation of 103 machine translation systems submitted by 34 teams was conducted; the resulting system rankings were then used to measure how strongly 12 automatic evaluation metrics correlate with human judgments of translation quality.
(Meta-) Evaluation of Machine Translation
- Chris Callison-Burch, C. Fordyce, Philipp Koehn, Christof Monz, J. Schroeder
- Computer Science · WMT@ACL
- 23 June 2007
An extensive human evaluation was carried out not only to rank the different MT systems, but also to perform higher-level analysis of the evaluation process, revealing surprising facts about the most commonly used methodologies.
Re-evaluating the Role of Bleu in Machine Translation Research
It is shown that an improved Bleu score is neither necessary nor sufficient for achieving an actual improvement in translation quality, and two significant counterexamples to Bleu’s correlation with human judgments of quality are given.