Data Set Used
The grammar matrix is an open-source starter-kit for the development of broad-coverage HPSGs. By using a type hierarchy to represent cross-linguistic generalizations and providing compatibility with other open-source tools for grammar engineering, evaluation, parsing and generation, it facilitates not only quick start-up but also rapid growth towards the…
The LinGO Redwoods initiative is a seed activity in the design and development of a new type of treebank. A treebank is a (typically hand-built) collection of natural language utterances and associated linguistic analyses; typical treebanks, as for example the widely recognized Penn, assign syntactic phrase structure or tectogrammatical dependency trees over…
In this paper we describe and evaluate different statistical models for the task of realization ranking, i.e. the problem of discriminating between competing surface realizations generated for a given input semantics. Three models are trained and tested: an n-gram language model, a discriminative maximum entropy model using structural features, and a…
We define broad-coverage semantic dependency parsing (SDP) as the task of recovering sentence-internal predicate–argument relationships for all content words, i.e. the semantic structure constituting the relational core of sentence meaning. Syntactic dependency parsing has seen great advances in the past decade, in part owing to relatively broad consensus…
The growing language technology industry needs […] suitability for a variety of applications.
We give a detailed account of an algorithm for efficient tactical generation from underspecified logical-form semantics, using a wide-coverage grammar and a corpus of real-world target utterances. Some earlier claims about chart realization are critically reviewed and corrected in the light of a series of practical experiments. As well as a set of…
This paper addresses two questions: (1) when a large deep processing resource developed for relatively closed domains is run over open text, what coverage does it have, and (2) what are the most effective and time-efficient ways of consolidating gaps in the coverage of such a resource?
This article details our experiments on HPSG parse disambiguation, based on the Redwoods treebank. Using existing and novel stochastic models, we evaluate the usefulness of different information sources for disambiguation – lexical, syntactic, and semantic. We perform careful comparisons of generative and discriminative models using equivalent features and…