Ann A. Copestake

Multiword expressions are a key problem for the development of large-scale, linguistically sound natural language processing technology. This paper surveys the problem and some currently available analytic techniques. The various kinds of multiword expressions should be analyzed in distinct ways, including listing “words with spaces”, hierarchically…
The LinGO (Linguistic Grammars Online) project’s English Resource Grammar and the LKB grammar development environment are language resources which are freely available for download for any purpose, including commercial use (see http://lingo.stanford.edu). Executable programs and source code are both included. In this paper, we give an outline of the LinGO…
We develop a framework for formalizing semantic construction within grammars expressed in typed feature structure logics, including HPSG. The approach provides an alternative to the lambda calculus; it maintains much of the desirable flexibility of unification-based approaches to composition, while constraining the allowable operations in order to capture…
In this paper we discuss various aspects of systematic or conventional polysemy and their formal treatment within an implemented constraint-based approach to linguistic representation. We distinguish between two classes of systematic polysemy: constructional polysemy, where a single sense assigned to a lexical entry is contextually specialised, and sense…
Chemical named entities represent an important facet of biomedical text. We have developed a system to use character-based n-grams, Maximum Entropy Markov Models and rescoring to recognise chemical names and other such entities, and to make confidence estimates for the extracted entities. An adjustable threshold allows the system to be tuned to high…
We describe the lexical knowledge base system (LKB), which has been designed and implemented as part of the ACQUILEX project to allow the representation of multilingual syntactic and semantic information extracted from machine-readable dictionaries (MRDs), in such a way that it is usable by natural language processing (NLP) systems. The LKB's lexical…