We discuss a grammar development process used to generate the trees of the wide-coverage Lexicalized Tree Adjoining Grammar (LTAG) for English of the XTAG Project. The result of coupling Becker's metarules with a simple yet principled hierarchy of rule application, the approach has succeeded in generating the large set of verb trees in the grammar,…
In this paper, we present a tool that automatically extracts verb argument structure from the Penn Treebank as well as from other corpora annotated with the Penn Treebank release 2 conventions. More specifically, we examine each possible sequence of tags, both functional and categorial, and determine whether such a sequence indicates an…
In this paper, we propose a classification of grammar development strategies according to two criteria: handwritten versus automatically acquired grammars, and grammars based on a low versus high level of syntactic abstraction. Our classification yields four types of grammars. For each type, we discuss implementation and evaluation issues.
We describe an LR parser of parts of speech (and punctuation labels) for Tree Adjoining Grammars (TAGs) that resolves table conflicts greedily, with a limited amount of backtracking. We evaluate the parser on the Penn Treebank, showing that the method yields very fast parsers with at least reasonable accuracy, confirming the intuition that LR parsing…
This paper presents a novel approach to dictionary retrieval. The approach is based on a very efficient and scalable theoretical structure called Multi-Terminal Multi-valued Decision Diagrams (MTMDD). Such a tool allows the definition of very large, even multilingual, dictionaries without a significant increase in memory demands, and also with…
Coordination of phrases of different syntactic categories has posed a problem for generative systems based only on syntactic categories. Although some prefer to treat them as exceptional cases that should require some extra mechanism (as for elliptical constructions), or to allow for unrestricted cross-category coordination, they can be naturally derived in…
We report on an experiment in the automatic extraction of a Tree Adjoining Grammar from the WSJ corpus of the Penn Treebank. We use an automatic tool developed by Xia (2001), suitably adapted to our particular needs. Rather than addressing general aspects of automatic extraction, we focus on the problems we encountered in extracting a linguistically…