Johnny Bigert

In many NLP applications, the robustness of an application's internal modules is a prerequisite for the success and usability of the system. The term robustness is somewhat vague, but in NLP it is often used in the sense of being robust against noisy, ill-formed, and partial natural language data. The full spectrum of robustness is defined by Menzel …
We describe two freeware programs for automatic evaluation. The first, AutoEval, greatly simplifies the data gathering, processing, and counting often involved in an evaluation. To this end, AutoEval includes a simple and powerful script language to describe the evaluation task to be carried out. The second program is called Missplel. It introduces …
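The snippet stops short of the details, but the idea of introducing artificial spelling errors into otherwise correct text can be sketched in a few lines. The following is a minimal, hypothetical sketch, not Missplel's actual implementation: the function names, the set of perturbation operations, and the error-rate parameter are all assumptions for illustration.

```python
import random

# Hypothetical sketch of Missplel-style noise injection: with probability
# `error_rate`, a word receives one random character-level perturbation
# (deletion, insertion, substitution, or transposition of adjacent letters).
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def perturb_word(word: str, rng: random.Random) -> str:
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["delete", "insert", "substitute", "transpose"])
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "insert":
        return word[:i] + rng.choice(ALPHABET) + word[i:]
    if op == "substitute":
        return word[:i] + rng.choice(ALPHABET) + word[i + 1:]
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]  # transpose

def introduce_errors(text: str, error_rate: float = 0.05, seed: int = 0) -> str:
    rng = random.Random(seed)
    return " ".join(
        perturb_word(w, rng) if rng.random() < error_rate else w
        for w in text.split()
    )

if __name__ == "__main__":
    print(introduce_errors("the quick brown fox jumps over the lazy dog", 0.3))
```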
This article describes an automatic evaluation procedure for NLP system robustness under the strain of noisy and ill-formed input. The procedure requires no manual work or annotated resources. It is language and annotation scheme independent and produces reliable estimates of the robustness of NLP systems. The only requirement is an estimate of the NLP …
This article presents a robust probabilistic method for the detection of context-sensitive spelling errors. The algorithm identifies less-frequent grammatical constructions and attempts to transform them into more-frequent constructions while retaining a similar syntactic structure. If the transformations result in low-frequency constructions, the text is …
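As a rough illustration of the transformation idea (not the published algorithm), the toy sketch below checks whether a part-of-speech trigram, or any syntactically similar variant of it, is frequent in a corpus; only if every variant stays rare is the span flagged. The frequency table, tag-similarity map, and threshold are invented for the example.

```python
# Toy sketch of frequency-based detection via transformations. The counts
# below stand in for statistics gathered from a tagged corpus; the
# similarity map and threshold are hypothetical.
TRIGRAM_FREQ = {
    ("DT", "JJ", "NN"): 900,
    ("DT", "NN", "VB"): 400,
    ("DT", "JJ", "NNS"): 300,
}
SIMILAR_TAGS = {"NN": ["NNS"], "NNS": ["NN"], "JJ": ["JJR"], "JJR": ["JJ"]}
THRESHOLD = 50

def looks_erroneous(trigram: tuple) -> bool:
    if TRIGRAM_FREQ.get(trigram, 0) >= THRESHOLD:
        return False                      # frequent construction: accept
    # try transformations that keep a similar syntactic structure
    for pos, tag in enumerate(trigram):
        for alt in SIMILAR_TAGS.get(tag, []):
            variant = trigram[:pos] + (alt,) + trigram[pos + 1:]
            if TRIGRAM_FREQ.get(variant, 0) >= THRESHOLD:
                return False              # a nearby construction is frequent
    return True                           # all variants rare: flag the span

if __name__ == "__main__":
    print(looks_erroneous(("DT", "JJ", "NN")))   # False: frequent as-is
    print(looks_erroneous(("DT", "JJR", "NN")))  # False: rescued by JJR -> JJ
    print(looks_erroneous(("NN", "NN", "NN")))   # True: rare and unrescued
```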
We address the topic of automatic evaluation of robustness and performance degradation in parsing systems. We focus on one aspect of robustness, namely ill-formed sentences and the impact of spelling errors on the different components of a parsing system. We propose an automated framework to evaluate robustness, where ill-formed and noisy data are introduced …
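One plausible way to set up such a framework, sketched under the assumption that the system's output on error-free text serves as the reference when no annotated resources are available, is to measure how quickly agreement with that reference drops as the error rate grows. The stub tagger and noise function below are placeholders, not the tools from the paper.

```python
# Hedged sketch: robustness read off as a degradation curve, comparing the
# system's output on noisy input against its own output on clean input.
import random

def noisy_copy(words, error_rate, rng):
    """Replace each word with a misspelled placeholder with prob. error_rate."""
    return [w + "x" if rng.random() < error_rate else w for w in words]

def stub_tagger(words):
    """Stand-in NLP system: 'tags' each word by its last character."""
    return [w[-1] for w in words]

def degradation_curve(words, rates=(0.0, 0.01, 0.02, 0.05, 0.10), seed=0):
    rng = random.Random(seed)
    reference = stub_tagger(words)          # output on clean input
    curve = {}
    for rate in rates:
        output = stub_tagger(noisy_copy(words, rate, rng))
        agree = sum(a == b for a, b in zip(reference, output)) / len(words)
        curve[rate] = agree                 # 1.0 means no degradation
    return curve

if __name__ == "__main__":
    text = ("the quick brown fox jumps over the lazy dog " * 20).split()
    for rate, agreement in degradation_curve(text).items():
        print(f"error rate {rate:.0%}: agreement {agreement:.2f}")
```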
This article focuses on the evaluation of a novel algorithm for the detection of context-sensitive spelling errors. We present a fully automatic evaluation procedure with no requirement for manual work or resources annotated with spelling errors. The evaluation method is applicable to any language and tag set, and is easily adaptable to other NLP systems …
Grammar errors and context-sensitive spelling errors in texts written by second language learners are hard to detect automatically. We have used three different approaches for grammar checking: manually constructed error detection rules, statistical differences between correct and incorrect texts, and machine learning of specific error types. The three …
The topic of this Master's thesis is the factorization of large integers using the number field sieve, the best factorization algorithm known today. We explain the theory behind the algorithm, giving examples of the algebraic structures involved. We give a brief survey of an implementation of the algorithm, which is later used in the experiments. We try to improve …
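The number field sieve itself cannot be condensed into a short snippet, but like Dixon's and Fermat's methods it ultimately rests on a congruence of squares: find x and y with x² ≡ y² (mod n) and x ≢ ±y (mod n), so that gcd(x − y, n) yields a nontrivial factor. The toy search below illustrates only that final step, not the sieve, and is meant for small example numbers.

```python
# Toy illustration of the congruence-of-squares principle underlying
# factoring algorithms such as the number field sieve. Brute-force search;
# practical only for tiny n.
from math import gcd, isqrt

def congruence_of_squares_factor(n: int):
    for x in range(isqrt(n) + 1, n):
        y2 = (x * x) % n
        y = isqrt(y2)
        if y * y == y2 and x != y and (x + y) % n != 0:
            d = gcd(x - y, n)
            if 1 < d < n:
                return d                  # nontrivial factor found
    return None

if __name__ == "__main__":
    print(congruence_of_squares_factor(8051))  # 8051 = 83 * 97
```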
This article describes an automatic evaluation procedure for NLP system robustness under the strain of noisy and ill-formed input. The procedure requires no manual work or annotated resources. It is language and annotation scheme independent and produces reliable estimates of the robustness and accuracy of NLP systems. The procedure was applied to five …