EasyEnglish: A Tool for Improving Document Quality

Abstract

We describe the authoring tool EasyEnglish, which is part of IBM's internal SGML editing environment, Information Development Workbench. EasyEnglish helps writers produce clearer and simpler English by pointing out ambiguity and complexity as well as performing some standard grammar checking. Where appropriate, EasyEnglish makes suggestions for rephrasings that may be substituted directly into the text by using the editor interface. EasyEnglish is based on a full parse by English Slot Grammar; this makes it possible to produce a higher degree of accuracy in error messages as well as handle a large variety of texts.

1 Introduction

Like most other big corporations today, IBM is interested in cost-effective, yet high-quality information dissemination. Every year, many pages of online and printed documentation are produced. No matter what part of the world the documentation is written in, it is normally first written in English, and then translated into all the other supported languages. IBM has developed a number of tools to help writers cope with this task of information development.

In this paper, we describe EasyEnglish, a tool that helps writers produce clearer and simpler English by pointing out ambiguity and complexity. Where appropriate, EasyEnglish makes suggestions for rephrasings.

The EasyEnglish system can be viewed as a "grammar checker++", in that standard grammar checking facilities such as spell-checking, word count (sentence length), and detection of passive constructions are available in addition to the checks for ambiguity. Furthermore, facilities for user-defined controlled vocabulary are available. In total, there are currently about forty checks.

EasyEnglish is part of IBM's internal Information Development Workbench (IDWB), an SGML-based document creation and document management system. ArborText's Adept editor is used with IDWB.¹ EasyEnglish summarizes the problems encountered in a given document by giving an overall rating, the Clarity Index (CI). The CI has to be in a certain range before the document can be accepted for publication.

EasyEnglish combines features from both standard grammar checkers and Controlled Language (CL) compliance checkers with checks for structural ambiguity in a way that we believe is general enough to be useful for any writer, not just technical writers.

It has been claimed that the restrictions found in CLs mostly reflect the inadequacies of the MT systems used in conjunction with CLs (Clémencin 1996; van der Eijk et al. 1996; Hayes et al. 1996). It is certainly the case that preprocessing a document with the same parser that is used for source analysis improves the MT results. EasyEnglish uses the same parser as LMT (McCord 1989a, 1989b), which offers an obvious advantage for MT results. Other MT systems, including the KANT system (Mitamura and Nyberg 1995; Nyberg and Mitamura 1996), also recognize this advantage.

However, we claim that a document that has been "EasyEnglished" is also easier to understand for native speakers as well as nonnative speakers of English. A similar point has been made for Caterpillar Technical English (Hayes et al. 1996). We think, however, that our approach is more general because our use of a broad-coverage, general English grammar² allows us to go beyond the concept of CL to look for more general types of ambiguities.

¹ EasyEnglish also works with the XEDIT editor on VM and the EPM editor on OS/2. An earlier version of EasyEnglish was written in Prolog; however, the current version is written in pure ANSI C, and hence the question of platform is mainly a matter of supplying an appropriate editor interface.

² English Slot Grammar (McCord 1980, 1990, 1993)

Cite this paper

@inproceedings{Bernth1997EasyEnglishAT, title={EasyEnglish: A Tool for Improving Document Quality}, author={Arendse Bernth}, booktitle={ANLP}, year={1997} }