The main asset of categorial grammar has traditionally been the close association between syntactic description on the one hand and a transparent and well-founded semantics on the other. Consequently, categorial grammar has primarily been favored by semantically oriented theorists and has arguably played an important role in the development of formal…
This article presents an algorithm for translating the Penn Treebank into a corpus of Combinatory Categorial Grammar (CCG) derivations augmented with local and long-range word–word dependencies. The resulting corpus, CCGbank, includes 99.4% of the sentences in the Penn Treebank. It is available from the Linguistic Data Consortium, and has been used to…
This paper addresses the problem of learning to map sentences to logical form, given training data consisting of natural language sentences paired with logical representations of their meaning. Previous approaches have been designed for particular natural languages or specific meaning representations; here we present a more general method. The approach…
We present an algorithm which translates the Penn Treebank into a corpus of Combinatory Categorial Grammar (CCG) derivations. To do this we have needed to make several systematic changes to the Treebank which have the effect of cleaning up a number of errors and inconsistencies. This process has yielded a cleaner treebank that can potentially be used in any…
"Two weeks later, Bonadea had already been his lover for a fortnight." (Robert Musil, Der Mann ohne Eigenschaften) A semantics of temporal categories in language and a theory of their use in defining the temporal relations between events both require a more complex structure on the domain underlying the meaning representations than is commonly assumed. This…
We consider the problem of learning factored probabilistic CCG grammars for semantic parsing from data containing sentences paired with logical-form meaning representations. Traditional CCG lexicons list lexical items that pair words and phrases with syntactic and semantic content. Such lexicons can be inefficient when words appear repeatedly with closely…
This paper compares a number of generative probability models for a wide-coverage Combinatory Categorial Grammar (CCG) parser. These models are trained and tested on a corpus obtained by translating the Penn Treebank trees into CCG normal-form derivations. According to an evaluation of unlabeled word-word dependencies, our best model achieves a performance…
The paper proposes a theory relating syntax, discourse semantics, and intonational prosody. The full range of English intonational tunes distinguished by Pierrehumbert and their semantic interpretation in terms of focus and information structure are discussed, including "discontinuous" themes and rhemes. The theory is based on Combinatory Categorial…
Part-of-speech (POS) induction is one of the most popular tasks in research on unsupervised NLP. Many different methods have been proposed, yet comparisons are difficult to make since there is little consensus on evaluation framework, and many papers evaluate against only one or two competitor systems. Here we evaluate seven different POS induction systems…
We describe an implemented system which automatically generates and animates conversations between multiple human-like agents with appropriate and synchronized speech, intonation, facial expressions, and hand gestures. Conversations are created by a dialogue planner that produces the text as well as the intonation of the utterances. The speaker/listener…