Exploiting Unlabeled Data for Neural Grammatical Error Detection

  • Zhuoran Liu, Yang Liu
  • Published 28 November 2016
  • Computer Science
  • Journal of Computer Science and Technology
Identifying and correcting grammatical errors in text written by non-native writers has received increasing attention in recent years. Although a number of annotated corpora have been established to facilitate data-driven grammatical error detection and correction approaches, they are still limited in quantity and coverage because human annotation is labor-intensive, time-consuming, and expensive. In this work, we propose to utilize unlabeled data to train neural network based… 
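One common way to exploit unlabeled text for error detection, consistent with the artificial-error work cited further down this list, is to corrupt clean sentences so that each corruption yields token-level error labels for free. A minimal sketch, assuming a simple confusion-set corruption scheme (all names and the confusion sets are illustrative, not taken from the paper):

```python
import random

# Illustrative confusion sets for common learner errors
# (prepositions and articles); not from the paper.
CONFUSIONS = {
    "in": ["on", "at"],
    "on": ["in", "at"],
    "a": ["the", "an"],
    "the": ["a"],
}

def corrupt(tokens, p=0.3, rng=None):
    """Randomly replace confusable tokens in a clean sentence.

    Returns the corrupted token list plus per-token binary labels
    (1 = an error was introduced at this position, 0 = unchanged),
    giving supervised training pairs from unlabeled text.
    """
    rng = rng or random.Random(0)
    out, labels = [], []
    for tok in tokens:
        if tok in CONFUSIONS and rng.random() < p:
            out.append(rng.choice(CONFUSIONS[tok]))
            labels.append(1)
        else:
            out.append(tok)
            labels.append(0)
    return out, labels

clean = "she lives in a small town".split()
noisy, labels = corrupt(clean, p=1.0)
```

With `p=1.0`, every confusable token is replaced, so the labels mark exactly the positions of `in` and `a`; in practice a smaller `p` keeps the synthetic error rate closer to real learner text.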
Bidirectional LSTM Tagger for Latvian Grammatical Error Detection
This paper reports on the development of a grammar-error labeling system for the Latvian language. We choose to label six error types that are crucial for understanding a text, as noted in a survey by
Error Detection for Arabic Text Using Neural Sequence Labeling
The first experiments using neural network models for the task of error detection for Modern Standard Arabic text are presented, demonstrating that neural network architectures for error detection through sequence labeling can successfully be applied to Arabic text.
Deep Learning for Arabic Error Detection and Correction
The experimental results confirm that the proposed system significantly outperforms the performance of Microsoft Word 2013 and Open Office Ayaspell 3.4, which have been used in the literature for evaluating similar research.
A Neural Network Architecture for Detecting Grammatical Errors in Statistical Machine Translation
This paper presents a neural network architecture for detecting grammatical errors in statistical machine translation output, using monolingual morpho-syntactic word representations in combination with surface and syntactic context windows, and shows that this approach is portable to other languages.
Unlabeled Text Classification Optimization Algorithm Based on Active Self-Paced Learning
  • Tingyi Zheng, Li Wang
  • Computer Science
  • 2018 IEEE International Conference on Big Data and Smart Computing (BigComp)
  • 2018
An algorithm for learning from unlabeled text and very few labeled examples, based on the combination of a convolutional neural network (CNN), a one-vs-all SVM classifier, and Active Self-Paced Learning (ASPL).
Developing AI Tools For A Writing Assistant: Automatic Detection of dt-mistakes In Dutch
A lightweight, scalable model that predicts whether a Dutch verb ends in -d, -t, or -dt; its main advantages are the short training time and the potential to apply the same technique to other disambiguation tasks in Dutch or in other languages.
Assessing Grammatical Correctness in Language Learning
This work explores the problem of detecting alternative correct answers, i.e., cases where more than one inflected form of a lemma fits syntactically and semantically in a given context; it investigates the ability of pre-trained BERT to detect grammatical errors and then fine-tunes it using synthetic training data.
Looking for Low-proficiency Sentences in ELL Writing
Determining whether an author is writing in their native language (L1) or a second language (L2) is a problem that lies at the intersection of four traditional NLP tasks: native language
ENCORE: Ensemble Learning using Convolution Neural Machine Translation for Automatic Program Repair
This work proposes ENCORE, a new G&V (generate-and-validate) technique that uses ensemble learning on convolutional neural machine translation (NMT) models to automatically fix bugs in multiple programming languages, exploiting the randomness in hyper-parameter tuning to build multiple models that fix different bugs and combining them via ensemble learning.
Grammatical Error Checking Systems: A Review of Approaches and Emerging Directions
This review presents previous work on grammatical error correction and detection systems, the challenges associated with these systems, and suggested future directions.


Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction
It is demonstrated that an attention-based encoder-decoder model can be used for sentence-level grammatical error identification in the Automated Evaluation of Scientific Writing (AESW) Shared Task 2016; a character-level variant proves particularly effective, outperforming other AESW Shared Task results on its own and showing gains over a word-based counterpart.
Generating artificial errors for grammatical error correction
Experiments using the NUCLE corpus from the CoNLL 2013 shared task reveal that training on artificially created errors improves precision at the expense of recall, and that different types of linguistic information are better suited for correcting different error types.
Generating Confusion Sets for Context-Sensitive Error Correction
This paper focuses on correcting errors in preposition usage made by non-native English speakers using discriminative classifiers; it proposes several methods of restricting candidate sets, finding that restricting candidates to those observed in the non-native data improves both precision and recall.
A unified architecture for natural language processing: deep neural networks with multitask learning
We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic
The CoNLL-2013 Shared Task on Grammatical Error Correction
The task definition is given, the data sets are presented, and the evaluation metric and scorer used in the shared task are described; an overview of the various approaches adopted by the participating teams is then given, along with the evaluation results.
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
Qualitatively, the proposed RNN Encoder–Decoder model learns a semantically and syntactically meaningful representation of linguistic phrases.
Constrained Grammatical Error Correction using Statistical Machine Translation
This paper describes our use of phrase-based statistical machine translation (PBSMT) for the automatic correction of errors in learner text in our submission to the CoNLL 2013 Shared Task on
A Beam-Search Decoder for Grammatical Error Correction
A novel beam-search decoder for grammatical error correction that is able to perform correction of whole sentences with multiple and interacting errors while still taking advantage of powerful existing classifier approaches.
HOO 2012: A Report on the Preposition and Determiner Error Correction Shared Task
This paper reports on the HOO 2012 shared task on error detection and correction in the use of prepositions and determiners, where systems developed by 14 teams from around the world were evaluated on the same previously unseen errorful text.
Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation
Grammar error correction for Japanese particles using discriminative sequence conversion, which corrects erroneous particles by substitution, insertion, and deletion, is presented.