Shortening the OED: experience with a grammar-defined database

@article{Blake1992ShorteningTO,
  title={Shortening the OED: experience with a grammar-defined database},
  author={G. Blake and T. Bray and F. Tompa},
  journal={ACM Trans. Inf. Syst.},
  year={1992},
  volume={10},
  pages={213--232}
}
Textual databases with highly variable structure can be usefully described by a grammar-defined model. One example of such a text is the Oxford English Dictionary. This paper describes a first attempt to apply technology based on this model to a real problem. A language called GOEDEL, which is a partial implementation of a set of grammar-defined database operators, was used to extract and alter a subset of the OED in order to assist the editors in their production of The Shorter Oxford English…

Text structure recognition using a region algebra
TLDR
This thesis proposes an efficient batch parsing model, characterizes the region algebras to which it applies, and proposes an alternative approach based on the type of region algebra that is often used as a query language for text databases.
Grammars++ for Modelling Information in Text
TLDR
Grammars provide a convenient means to describe the set of valid instances in a text database and can be used to specify database manipulation, including query, update, view definition, and index specification.
Transformation of structured documents
TLDR
The conclusion was that simple, local transformations can be automated or semi-automated, depending on whether additional information is needed, while global transformations are difficult to automate.
An Algebra for Structured Text Search and a Framework for its Implementation
TLDR
A query algebra is presented that expresses searches on structured text. It permits queries that harness document structure and manipulates arbitrary intervals of text, which are recognized in the text from implicit or explicit markup.
Retrieval from hierarchical texts by partial patterns
TLDR
This work describes a query language for retrieving information from collections of hierarchical text based on a tree pattern matching notion called tree inclusion, which allows easy expression of queries that use the structure and the content of the document.
Transformation of Structured Documents with the Use of Grammar
TLDR
The method uses grammars to define both the structure of documents and the transformations between structures, and its implementation involves certain modifications in a syntax-directed document processing system created by the authors.
Structured Document Transformations
TLDR
Alchemist is a transformation language based on TT grammars, which have been extended with semantic actions to make it possible to build full-scale transformations.
Views of Text
Text databases are becoming increasingly important in business applications. However, the diverse nature of text is neither widely understood nor appreciated. Some properties of simple document models…
Data Model for Document Transformation and Assembly (Extended Abstract)
TLDR
This paper shows a data model for transforming and assembling document information such as SGML or XML documents that simultaneously provides (1) powerful patterns and contextual conditions, and (2) schema transformation.
A language for queries on structure and contents of textual databases
TLDR
The key idea of the model is that a set-oriented query language based on operations on nearby structure elements of one or more hierarchies is quite expressive and efficiently implementable, being a good tradeoff between both goals.

References

Mind Your Grammar: a New Approach to Modelling Text
TLDR
The grammar-based model presented here builds on the traditional foundations of computer science, and particularly database theory and practice, and uses grammars as schemas and “parsed strings” as instances to create a database model for text-dominated database systems.
Making it short: The Shorter Oxford English Dictionary
Of the early history of the Shorter Oxford English Dictionary on Historical Principles little is known beyond the brief facts set out in the preface to the first edition: that specimens were…
Programming Languages: Design and Implementation
TLDR
This book explores the major issues in both design and implementation of modern programming languages and provides a basic introduction to the underlying theoretical models on which these languages are based.
Document Design with HiTeX: A Step beyond LaTeX
In a computerized environment, document design involves three major activities: creating a new design, implementing a design specification for a given formatter, and…
Office Document Architecture and Office Document Interchange Formats: Current Status of International Standardization
TLDR
The architectural model, the underlying processing model, and the principles of the interchange formats of the ECMA 101 and ISO drafts are introduced, and possibilities of further development indicated.
SGML handbook
TLDR
This paper introduces generalized markup, a model for generalized markup that automates the very labor-intensive and therefore time-heavy and expensive process of developing and distributing SGML documents.
No Silver Bullet: Essence and Accidents of Software Engineering
  • F. Brooks
  • Computer Science, Engineering
  • Computer
  • 1987
TLDR
This article shall try to show why there is no single development, in either technology or management technique, that by itself promises even one order-of-magnitude improvement in productivity, in reliability, in simplicity.
Proceedings
s: Keynote voordrachten 9 Abstracts: VK Prijs (voordrachten) 13s: VK Prijs (voordrachten) 13 Abstracts: VK Prijs (postermededelingen) 27s: VK…
PAT 3.3 User’s Guide
  • Centre for the New Oxford English Dictionary,
  • 1988
A User's Guide to the OED
  • A User's Guide to the OED
  • 1991