• Corpus ID: 16214547

Four types of context for automatic spelling correction

@article{Flor2012FourTO,
  title={Four types of context for automatic spelling correction},
  author={Michael Flor},
  journal={Trait. Autom. des Langues},
  year={2012},
  volume={53},
  pages={61--99}
}
  • Michael Flor
  • Published 2012
  • Computer Science
  • Trait. Autom. des Langues
This paper presents an investigation into using four types of contextual information to improve the accuracy of automatic correction of single-token non-word misspellings. The task is framed as contextually-informed re-ranking of correction candidates. Immediate local context is captured by word n-gram statistics from a Web-scale language model. The second approach measures how well a candidate correction fits in the semantic fabric of the local lexical neighborhood, using a very large… 
Leveraging known Semantics for Spelling Correction
TLDR
This work explores the use of spelling correction tools and language modeling to correct misspellings that often lead to errors in obtaining semantic forms, and shows that such tools can significantly reduce the number of unanalyzable cases.
A Benchmark Corpus of English Misspellings and a Minimally-supervised Model for Spelling Correction
TLDR
An annotated data set of 6,121 spelling errors in context, based on a corpus of essays written by English language learners, is presented, and a minimally-supervised context-aware approach to spelling correction is developed.
Unsupervised Context-Sensitive Spelling Correction of English and Dutch Clinical Free-Text with Word and Character N-Gram Embeddings
We present an unsupervised context-sensitive spelling correction method for clinical free-text that uses word and character n-gram embeddings. Our method generates misspelling replacement candidates
Survey of Automatic Spelling Correction
TLDR
The survey describes selected approaches in a common theoretical framework based on Shannon’s noisy channel for automatic spelling correction systems selected from papers indexed in Scopus and Web of Science from 1991 to 2019.
Automatic spelling correction for Russian social media texts
TLDR
This paper describes an automatic spelling correction system for Russian, using edit distance for candidate search and a combination of weighted edit distance and language model for candidate hypotheses selection and has won the first SpellRuEval competition for Russian spell checkers by all the metrics.
Grammatical Error Correction: Machine Translation and Classifiers
TLDR
An algorithmic approach is developed that combines the strengths of both machine learning classification and machine translation and is better at correcting complex mistakes.
Bootstrapped OCR error detection for a less-resourced language variant
TLDR
The chosen solution based on statistical affix analysis reaches an accuracy 10 points higher than existing morphological analysis systems on error detection, while a combination of fuzzy and approximate string search performs best for error correction.
Patterns of misspellings in L2 and L1 English: a view from the ETS Spelling Corpus
TLDR
It is found that the rate of misspellings decreases as writing proficiency (essay score) increases, both in TOEFL and in GRE, and depends on writing proficiency and not on NS/NNS distinction.
Patterns of misspellings in L2 and L1 English: a view from the ETS Spelling Corpus
This paper presents a study of misspellings, based on annotated data from the ETS Spelling corpus. The corpus consists of 3000 essays written by examinees, native (NS) and non-native speakers (NNS)

References

Producing an annotated corpus with automatic spelling correction
TLDR
An evaluation of the ConSpel system was conducted, using the data from the completed phase of the annotation project, and indicates that an advanced correction algorithm, which takes into account the local context of misspellings, achieves correction accuracy of 77% and consistently outperforms a baseline context-blind approach.
On using context for automatic correction of non-word misspellings in student essays
TLDR
A new spell-checking system that utilizes contextual information for automatic correction of non-word misspellings and corrects errors generated by non-native English writers with almost the same rate of success as it does for writers who are native English speakers.
Techniques for automatically correcting words in text
TLDR
Research aimed at correcting words in text has focused on three progressively more difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3) context-dependent word correction. The article also surveys documented findings on spelling error patterns.
Lexical postcorrection of OCR-results: the web as a dynamic secondary dictionary?
TLDR
Our experiments show that dynamic dictionaries retrieved via an automated analysis of the vocabulary of web pages from a given domain can significantly improve coverage for the given thematic area and help to improve the quality of lexical postcorrection methods.
Memory-based context-sensitive spelling correction at web scale
  • Andrew Carlson, Ian Fette
  • Computer Science
    Sixth International Conference on Machine Learning and Applications (ICMLA 2007)
  • 2007
TLDR
This work uses a novel correction algorithm and a massive database of training data to demonstrate higher accuracy on correcting real-word errors than previous work, and very high accuracy at a new task of ranking corrections to non-word errors given by a standard spelling correction package.
A Winnow-Based Approach to Context-Sensitive Spelling Correction
TLDR
This work presents an algorithm combining variants of Winnow and weighted-majority voting, and applies it to a problem in the aforementioned class: context-sensitive spelling correction, and finds that WinSpell achieves accuracies significantly higher than BaySpell was able to achieve in either the pruned or unpruned condition.
Using the Web for Language Independent Spellchecking and Autocorrection
TLDR
An end-to-end spellchecking and autocorrection system that does not require any manually annotated training data is designed and implemented; it outperforms baselines that use candidate corrections based on hand-curated dictionaries.
Exploiting Syntactic and Distributional Information for Spelling Correction with Web-Scale N-gram Models
TLDR
Experimental results show that introducing syntactic features into n-gram based models significantly reduces errors by up to 12.4% over the current state-of-the-art.
Ordering the suggestions of a spellchecker without using context
  • R. Mitton
  • Business
    Natural Language Engineering
  • 2009
TLDR
A series of experiments is described, beginning with a basic corrector that implements a well-known algorithm for reversing single simple errors, and making successive enhancements to take account of substring matches, pronunciation, known error patterns, syllable structure and word frequency.
Improving Query Spelling Correction Using Web Search Results
TLDR
A novel method is proposed for use of web search results to improve the existing query spelling correction models solely based on query logs by leveraging the rich information on the web related to the query and its top-ranked candidate.