JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction

@inproceedings{Napoles2017JFLEGAF,
  title={JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction},
  author={Courtney Napoles and Keisuke Sakaguchi and Joel R. Tetreault},
  booktitle={EACL},
  year={2017}
}
We present a new parallel corpus, the JHU FLuency-Extended GUG corpus (JFLEG), for developing and evaluating grammatical error correction (GEC). Unlike other corpora, it represents a broad range of language proficiency levels and uses holistic fluency edits not only to correct grammatical errors but also to make the original text more native sounding. We describe the types of corrections made and benchmark four leading GEC systems on this corpus, identifying specific areas in which they do well and how…
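
As a concrete starting point, here is a minimal sketch of loading JFLEG as (source, references) pairs. The dev.src / dev.ref0-dev.ref3 file layout is assumed from the public release at github.com/keisks/jfleg; adjust the paths if your copy differs.

```python
# Minimal sketch: load a JFLEG split as (source, [references]) pairs.
# File layout (dev.src plus dev.ref0..dev.ref3) is an assumption based on
# the public JFLEG release; each source has four fluency-edited references.
from pathlib import Path

def load_jfleg(split_dir: str, split: str, n_refs: int = 4):
    base = Path(split_dir)
    sources = (base / f"{split}.src").read_text(encoding="utf-8").splitlines()
    refs = [
        (base / f"{split}.ref{i}").read_text(encoding="utf-8").splitlines()
        for i in range(n_refs)
    ]
    return [(src, [r[j] for r in refs]) for j, src in enumerate(sources)]

pairs = load_jfleg("jfleg/dev", "dev")
src, references = pairs[0]
print(src)
for ref in references:
    print("  ->", ref)
```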

Citations

Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses

TLDR
It is demonstrated that a factor behind this is the inability of systems to rely on a strong internal language model in low error density domains, and it is hoped that this work will facilitate the development of open-domain GEC models that generalize to different topics and genres.

UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language

TLDR
A corpus professionally annotated for grammatical error correction (GEC) and fluency edits in the Ukrainian language is presented; it can be used for researching multilingual and low-resource NLP, morphologically rich languages, document-level GEC, and fluency correction.

Cool English: a Grammatical Error Correction System Based on Large Learner Corpora

TLDR
The sequence-to-sequence model, which is frequently used in machine translation and text summarization, is applied to this GEC task and achieves competitive performance on a number of publicly available test sets.

Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation

TLDR
This GEC system preserves the accuracy of SMT output and, at the same time, generates more fluent sentences, as is typical for NMT; it is closer to reaching human-level performance than any other GEC system reported so far.

Language Model Based Grammatical Error Correction without Annotated Training Data

TLDR
This paper re-examines LMs in GEC and shows that it is entirely possible to build a simple system that not only requires minimal annotated data but is also fairly competitive with several state-of-the-art systems.
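
To make the LM-only idea concrete, here is a toy sketch (not the paper's system): candidates are generated from a small hand-written confusion set, every full candidate sentence is scored with an add-one-smoothed bigram LM trained on a few in-code sentences, and the highest-scoring candidate is returned.

```python
# Toy LM-based correction: propose candidates from a confusion set, score each
# whole sentence under a smoothed bigram LM, keep the most probable one.
# The training text and confusion set are stand-ins, not the paper's resources.
from collections import Counter
from itertools import product
import math

TRAIN = "we went to the store . they went to the park . we like the park ."
tokens = TRAIN.split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))

def logprob(sentence):
    words = sentence.split()
    score = 0.0
    for a, b in zip(words, words[1:]):
        # Add-one smoothed bigram probability.
        score += math.log((bigrams[(a, b)] + 1) / (unigrams[a] + len(unigrams)))
    return score

CONFUSIONS = {"too": ["to", "too"], "to": ["to", "too"], "the": ["the", "a"]}

def correct(sentence):
    options = [CONFUSIONS.get(w, [w]) for w in sentence.split()]
    return max((" ".join(c) for c in product(*options)), key=logprob)

print(correct("we went too the store ."))  # -> "we went to the store ."
```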

Controlling Grammatical Error Correction Using Word Edit Rate

TLDR
It is shown that it is possible to actually control the degree of GEC by using new training data annotated with word edit rate, so that diverse corrected sentences are obtained from a single erroneous sentence.
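
A minimal sketch of the control signal, assuming it amounts to tagging each training source with a bucketed word edit rate (the bucket boundaries and the <wer*> token names are illustrative, not the paper's):

```python
# Compute token-level word edit rate (Levenshtein distance / source length)
# and prepend a bucketed control token to the source side of a training pair.
# Boundaries 0.1 / 0.3 and the <wer*> tokens are illustrative choices.

def levenshtein(a, b):
    # Standard dynamic-programming edit distance over token lists.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def word_edit_rate(src, tgt):
    s, t = src.split(), tgt.split()
    return levenshtein(s, t) / max(len(s), 1)

def tag_source(src, tgt, bounds=(0.1, 0.3)):
    bucket = sum(word_edit_rate(src, tgt) > b for b in bounds)
    return f"<wer{bucket}> {src}"  # 0 = minimal edits, 2 = heavy rewriting

print(tag_source("she go to school yesterday", "she went to school yesterday"))
# -> "<wer1> she go to school yesterday"
```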

How Good (really) are Grammatical Error Correction Systems?

TLDR
This analysis paper studies the performance of GEC systems against closest-gold, a gold reference text created with respect to the output of each system, and shows that real performance is 20-40 points better than standard evaluations suggest.
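
The paper constructs a closest-gold reference per system; the sketch below uses a simpler proxy, selecting whichever existing reference is most similar to a hypothesis so that a system is not penalized for a valid alternative correction. The difflib similarity is my stand-in, not the paper's construction.

```python
# Pick, for each system hypothesis, the most similar reference ("closest gold"
# by selection rather than construction) before computing any score.
from difflib import SequenceMatcher

def closest_gold(hypothesis, references):
    return max(
        references,
        key=lambda ref: SequenceMatcher(None, hypothesis.split(), ref.split()).ratio(),
    )

refs = ["She went to school yesterday .", "Yesterday she went to school ."]
hyp = "Yesterday she go to school ."
print(closest_gold(hyp, refs))  # the word-order-matching reference is chosen
```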

A Reference-less Evaluation Metric Based on Grammaticality, Fluency, and Meaning Preservation in Grammatical Error Correction

TLDR
It is empirically shown that a reference-less metric that combines fluency and meaning preservation with grammaticality provides a better estimate of manual scores than commonly used reference-based metrics.
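
A hedged sketch of the recipe's shape: three component scores combined by a weighted sum. All three proxies and the equal weights are placeholders; the paper uses trained estimators (a grammaticality model, an LM for fluency, and an MT-style metric for meaning preservation).

```python
# Reference-less scoring skeleton: combine grammaticality, fluency, and
# meaning preservation. Every component below is a crude placeholder.
from difflib import SequenceMatcher

def grammaticality(sent):
    # Placeholder; the paper uses a trained grammaticality model.
    toks = sent.split()
    return sum(t.isalpha() or t in ".,!?" for t in toks) / max(len(toks), 1)

def fluency(sent):
    # Placeholder for an LM-based fluency score (e.g., normalized log-prob).
    return 1.0 / (1.0 + abs(len(sent.split()) - 10) / 10)

def meaning_preservation(src, hyp):
    # Placeholder for an MT-metric-style source/output similarity.
    return SequenceMatcher(None, src.split(), hyp.split()).ratio()

def referenceless_score(src, hyp, w=(1 / 3, 1 / 3, 1 / 3)):
    return (w[0] * grammaticality(hyp)
            + w[1] * fluency(hyp)
            + w[2] * meaning_preservation(src, hyp))

print(referenceless_score("She go to school .", "She goes to school ."))
```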

Connecting the Dots: Towards Human-Level Grammatical Error Correction

We build a grammatical error correction (GEC) system primarily based on the state-of-the-art statistical machine translation (SMT) approach, using task-specific features and tuning, and further…

An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction

TLDR
This study investigates how pseudo data should be generated and used when training grammatical error correction models, achieving state-of-the-art performance on the CoNLL-2014 test set and the official test set of the BEA-2019 shared task.
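
As one illustration of what "generating pseudo data" can mean, the sketch below corrupts clean sentences with probabilistic noise to produce (noisy, clean) training pairs; the operations and rates are invented for illustration and are not the specific recipes the study compares.

```python
# Generate a pseudo training pair by injecting synthetic errors (word drops
# and confusion-set swaps) into a clean sentence. Rates are illustrative.
import random

CONFUSIONS = {"to": "too", "their": "there", "a": "the", "the": "a"}

def corrupt(sentence, p_drop=0.05, p_swap=0.1, seed=0):
    rng = random.Random(seed)
    noisy = []
    for w in sentence.split():
        r = rng.random()
        if r < p_drop:
            continue                     # simulate a missing word
        if r < p_drop + p_swap and w in CONFUSIONS:
            noisy.append(CONFUSIONS[w])  # simulate a confusion error
        else:
            noisy.append(w)
    return " ".join(noisy)

clean = "they went to the park to see their friends"
print((corrupt(clean), clean))  # (noisy source, clean target)
```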
...

References

Showing 1-10 of 18 references

Reassessing the Goals of Grammatical Error Correction: Fluency Instead of Grammaticality

TLDR
It is shown that automatic evaluation with the authors' new annotation scheme correlates very strongly with expert rankings, and a fundamental and necessary shift in the goal of GEC is advocated: from correcting small, labeled error types to producing text with native fluency.
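
The fluency-oriented metric associated with this line of work (GLEU) rewards hypothesis n-grams found in the reference and penalizes ones found only in the source. Below is a deliberately simplified, single-reference rendering of that intuition; consult the authors' released scorer for the exact formulation.

```python
# Simplified GLEU-style score: BLEU-like n-gram precision against the
# reference, minus credit for n-grams the hypothesis shares only with the
# uncorrected source. Not the official definition; intuition only.
from collections import Counter
import math

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_gleu(source, hypothesis, reference, max_n=4):
    s, h, r = source.split(), hypothesis.split(), reference.split()
    log_sum = 0.0
    for n in range(1, max_n + 1):
        hn, rn, sn = ngrams(h, n), ngrams(r, n), ngrams(s, n)
        match = sum((hn & rn).values())
        penalty = sum(((hn & sn) - rn).values())  # n-grams only in the source
        p = max(match - penalty, 0) / max(sum(hn.values()), 1)
        log_sum += math.log(p) if p > 0 else math.log(1e-9)
    bp = min(1.0, math.exp(1 - len(r) / max(len(h), 1)))  # brevity penalty
    return bp * math.exp(log_sum / max_n)

src = "she go to school every day"
print(simple_gleu(src, "she goes to school every day",
                  "she goes to school every day"))  # ~1.0 for a perfect fix
```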

Grammatical error correction using neural machine translation

This paper presents the first study using neural machine translation (NMT) for grammatical error correction (GEC). We propose a two-step approach to handle the rare word problem in NMT, which has been…

Better Evaluation for Grammatical Error Correction

TLDR
This work presents a novel method for evaluating grammatical error correction: an algorithm for efficiently computing the sequence of phrase-level edits between a source sentence and a system hypothesis that achieves the highest overlap with the gold-standard annotation.
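
For intuition about what "phrase-level edits" look like, the sketch below extracts them from a token alignment using difflib. The paper's actual algorithm (the basis of the M^2 scorer) instead searches for the edit sequence with maximum overlap with the gold annotation, which difflib does not attempt.

```python
# Extract phrase-level edits (replace/insert/delete spans) between a source
# sentence and a system hypothesis from a token-level alignment.
from difflib import SequenceMatcher

def phrase_edits(source, hypothesis):
    s, h = source.split(), hypothesis.split()
    edits = []
    for op, i1, i2, j1, j2 in SequenceMatcher(None, s, h).get_opcodes():
        if op != "equal":
            # (source start, source end, original span, replacement span)
            edits.append((i1, i2, " ".join(s[i1:i2]), " ".join(h[j1:j2])))
    return edits

print(phrase_edits("she go to school yesterday", "she went to school yesterday"))
# -> [(1, 2, 'go', 'went')]
```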

Ground Truth for Grammatical Error Correction Metrics

TLDR
The first human evaluation of GEC system outputs is conducted, and it is shown that the rankings produced by metrics such as MaxMatch and I-measure do not correlate well with this ground truth.
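
A minimal sketch of how such a correlation check can be run: rank systems once by a metric and once by human judgment, then compute Kendall's tau over system pairs. The ranks below are invented purely to exercise the function; the paper's evaluation design is more involved.

```python
# Kendall's tau between a metric-induced ranking and a human ranking.
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    # rank_a / rank_b map each system name to its rank (1 = best).
    systems = list(rank_a)
    concordant = discordant = 0
    for x, y in combinations(systems, 2):
        agree = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        concordant += agree > 0
        discordant += agree < 0
    n = len(systems)
    return (concordant - discordant) / (n * (n - 1) / 2)

metric_rank = {"sysA": 1, "sysB": 2, "sysC": 3, "sysD": 4}  # invented
human_rank = {"sysA": 2, "sysB": 1, "sysC": 3, "sysD": 4}   # invented
print(kendall_tau(metric_rank, human_rank))  # ~0.67: one swapped pair
```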

Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English

TLDR
The annotation schema and the data collection and annotation process of NUCLE are described and an unpublished study of annotator agreement for grammatical error correction is reported on.

Adapting Grammatical Error Correction Based on the Native Language of Writers with Neural Network Joint Models

TLDR
This paper adapts a neural network joint model (NNJM) using L1-specific learner text and integrates it into a statistical machine translation (SMT) based GEC system and shows that adaptation achieves significant F0.5 score gains on English texts written by L1 Chinese, Russian, and Spanish writers.

Tense and Aspect Error Correction for ESL Learners Using Global Context

TLDR
This work regards the task as sequence labeling: each verb phrase in a document is labeled with tense/aspect depending on surrounding labels, and it is shown that the global context makes a moderate contribution to tense/aspect error correction.

Predicting Grammaticality on an Ordinal Scale

TLDR
This work constructs a statistical model of grammaticality using various linguistic features (e.g., misspelling counts, parser outputs, n-gram language model scores) and presents a new publicly available dataset of learner sentences judged for grammaticality on an ordinal scale.
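
A sketch of the feature side only, under strong simplifications: each sentence is mapped to cheap proxy features (where the paper uses misspelling counts, parser outputs, and n-gram LM scores), and an illustrative thresholding stands in for a fitted ordinal model on the 1-4 human scale.

```python
# Toy ordinal grammaticality predictor: proxy features plus hand-set
# thresholds. A real system would fit an ordinal model to human labels.
WORDS = {"she", "he", "go", "goes", "went", "to", "school", "the", "store", "."}

def features(sentence):
    toks = sentence.lower().split()
    oov = sum(t not in WORDS for t in toks)  # crude misspelling-count proxy
    return {"oov_rate": oov / max(len(toks), 1), "length": len(toks)}

def predict_ordinal(sentence, cut_points=(0.05, 0.15, 0.30)):
    # Map the proxy score onto the 1-4 scale via illustrative thresholds.
    badness = features(sentence)["oov_rate"]
    return 4 - sum(badness > c for c in cut_points)

print(predict_ordinal("she went to school ."))  # 4: all tokens in-vocabulary
print(predict_ordinal("she goed to skool ."))   # 1: two out-of-vocabulary tokens
```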

The CoNLL-2013 Shared Task on Grammatical Error Correction

TLDR
The task definition is given, the data sets are presented, and the evaluation metric and scorer used in the shared task are described; an overview of the various approaches adopted by the participating teams is then provided, along with the evaluation results.

Phrase-based Machine Translation is State-of-the-Art for Automatic Grammatical Error Correction

TLDR
It is found that a bare-bones phrase-based SMT setup with task-specific parameter tuning outperforms all previously published results for the CoNLL-2014 test set by a large margin and improves the state of the art to 49.49% M^2.
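
For reference, the M^2 figure quoted here is an F0.5 measure over phrase-level edits, weighting precision more heavily than recall. A minimal computation from edit counts (the example numbers are invented):

```python
# F-beta over edit counts; beta=0.5 gives the F0.5 used by the M^2 scorer.

def f_beta(tp, fp, fn, beta=0.5):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)

# A system proposing 100 edits, 60 matching gold, with 80 gold edits missed:
print(f_beta(tp=60, fp=40, fn=80))  # precision 0.60, recall ~0.43, F0.5 ~0.56
```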