Optimizing Statistical Machine Translation for Text Simplification
TLDR
We present an in-depth adaptation of statistical machine translation to perform text simplification, taking advantage of large-scale paraphrases learned from bilingual texts and a small amount of manual simplifications with multiple references.
  • 186
  • 81
Problems in Current Text Simplification Research: New Data Can Help
TLDR
We introduce a new simplification dataset that is a significant improvement over Simple Wikipedia, and present a novel quantitative-comparative approach to study the quality of simplification data resources.
  • 169
  • 44
Annotated Gigaword
TLDR
We have created layers of annotation on the English Gigaword v.5 corpus to render it useful as a standardized corpus for knowledge extraction and distributional semantics.
  • 227
  • 34
JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction
TLDR
We present a new parallel corpus, the JHU FLuency-Extended GUG corpus (JFLEG), for developing and evaluating grammatical error correction (GEC).
  • 83
  • 19
Ground Truth for Grammaticality Correction Metrics
TLDR
We establish a ground truth for GEC by conducting a human evaluation and producing a human ranking of the systems entered into the CoNLL-2014 Shared Task on GEC.
  • 33
  • 12
A Report on the Automatic Evaluation of Scientific Writing Shared Task
TLDR
The Automated Evaluation of Scientific Writing, or AESW, is the task of identifying sentences in need of correction to ensure their appropriateness in scientific prose.
  • 32
  • 7
Evaluating Sentence Compression: Pitfalls and Suggested Remedies
TLDR
This work surveys existing evaluation methodologies for the task of sentence compression, identifies their shortcomings, and proposes alternatives.
  • 41
  • 7
Reassessing the Goals of Grammatical Error Correction: Fluency Instead of Grammaticality
TLDR
The field of grammatical error correction (GEC) has grown substantially in recent years, with research directed at both evaluation metrics and improved system performance against those metrics.
  • 37
  • 7
Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus
TLDR
This work presents a dataset and annotation scheme for the new task of identifying “good” conversations that occur online, which we call ERICs: Engaging, Respectful, and/or Informative Conversations.
  • 35
  • 7
There's No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction
TLDR
We show that reference-less grammaticality metrics correlate very strongly with human judgments and are competitive with the leading reference-based evaluation metrics.
  • 31
  • 7