Evaluation of natural language processing systems: Issues and approaches

@article{Guida1986EvaluationON,
  title={Evaluation of natural language processing systems: Issues and approaches},
  author={Giovanni Guida and Giancarlo Mauri},
  journal={Proceedings of the IEEE},
  year={1986},
  volume={74},
  pages={1026-1035}
}
This paper encompasses two main topics: a broad and general analysis of the issue of performance evaluation of NLP systems and a report on a specific approach developed by the authors and experimented on a sample test case. More precisely, it first presents a brief survey of the major works in the area of NLP systems evaluation. Then, after introducing the notion of the life cycle of an NLP system, it focuses on the concept of performance evaluation and analyzes the scope and the major problems…
Evaluating Natural Language Processing Systems: An Analysis and Review
TLDR
This comprehensive state-of-the-art book is the first devoted to the important and timely issue of evaluating NLP systems, and provides a wide-ranging and careful analysis of evaluation concepts, reinforced with extensive illustrations.
A diagnostic tool for German syntax
TLDR
An ongoing effort to construct a catalogue of syntactic data exemplifying the major syntactic patterns of German to support the diagnosis of errors in the syntactic components of natural language processing (NLP) systems is described.
Evaluating natural language processing systems
TLDR
Designing customized methods for testing various NLP systems may be costly, so post hoc justification is needed.
Natural Language Sourcebook.
TLDR
The Sourcebook is a compilation of 197 processing problems addressed or handled by intelligent computer systems, classified into a scheme with an artificial intelligence bent and cross-referenced to companion schemes, one with a linguistic and one with a cognitive psychological perspective on the type of issues reflected in the problems.
Why Human Translators Still Sleep in Peace? (Four Engineering and Linguistic Gaps in NLP)
TLDR
This paper is a brief dissertation on four engineering and linguistic issues believed critical for a more striking success of NLP: extensive acquisition of the semantic lexicon, formal performance evaluation methods to evaluate systems, development of shell systems for rapid prototyping and customization, and finally a more linguistically motivated approach to word categorization.
Evaluating Natural Language Systems: A Sourcebook Approach
TLDR
Progress is reported in development of evaluation methodologies for natural language systems with a common classification of the problems in natural language understanding.
Issues in Performance Evaluation of Mathematical Notation Recognition Systems
TLDR
Issues that are discussed cover the reported performance evaluation experiments, the code availability, the nature of the mathematical notation, the extent of the coverage of mathematical recognition systems, and the quantification of performance evaluation results.
DFKI Workshop on Natural Language Systems: Reusability and Modularity, Saarbrücken, October 23, 1992, Proceedings
In this paper we give a rough sketch of the German grammar that was developed in the DISCO project. The description also includes some characteristics of the grammar formalism and of the various…
Evaluation: An assessment
TLDR
An editorial introduction to this Special Issue of Machine Translation dedicated to Evaluation is provided, the rationale for the Issue is described, the various contributions of the papers in this issue are outlined, and the main current approaches are given.
Constructing natural language interface applications to operating systems
TLDR
The presented linguistic stratification analysis has been employed in the design of a user interface management system for developing natural language interfaces to operating systems and is demonstrated through the development of a natural language interface for the Unix operating system.

References

Designing and automating the quality assessment of a knowledge-based system: The initial automated academic advisor experience
The automated academic advisor (AAA), a large practical artificial intelligence system currently under development, is introduced. Two parsers are described which were designed for use with the AAA.
Understanding Natural Language Through Parallel Processing of Syntactic and Semantic Knowledge: An Application to Data Base Query
TLDR
The core of the PARNAX system is constituted by the analyzer that includes parallel processing of syntactic and semantic knowledge, and it is argued that this feature allowed the system to reach a good linguistic coverage, still ensuring an acceptable degree of efficiency.
Experience with the Evaluation of Natural Language Question Answerers
TLDR
Two measurements, conceptual and linguistic completeness, are defined and discussed in this paper, and it is demonstrated that the conceptual coverage of natural language systems should be extended to better satisfy the needs and expectations of users.
Experience with ROBOT in 12 Commercial, Natural Language Data Base Query Applications
TLDR
The unexpected linguistic and semantic difficulties encountered in the 12 commercial applications to which ROBOT has been applied during the last year and a half are discussed.
Software Reliability Analysis Models
  • M. Ohba
  • Computer Science
  • IBM J. Res. Dev.
  • 1984
TLDR
Improvements to conventional software reliability analysis models by making the assumptions on which they are based more realistic are discussed, including the delayed S-shaped growth model, the inflection S-shaped model, and the hyperexponential model.
Software Reliability Analysis
TLDR
A case study is presented of the analysis of failure data from a Space Shuttle software project to predict the number of failures likely during a mission, and the subsequent verification of these predictions.
A formal basis for performance evaluation of natural language understanding systems
  • Linguistics