Mark Steedman

The main asset of categorial grammar has traditionally been the close association between syntactic description on the one hand and a transparent and well-founded semantics on the other. Consequently, categorial grammar has primarily been favored by semantically oriented theorists and has arguably played an important role in the development of formal …
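The syntax–semantics transparency described above can be illustrated with a toy sketch (all names and representations invented here, not code from any of the works listed): each lexical entry pairs a category string with a lambda term, and a single application rule advances syntax and semantics in lockstep.

```python
# Toy illustration of categorial grammar's transparent syntax-semantics
# pairing: function application combines categories and meanings together.
# The lexicon and meaning representation are invented for this sketch.

def strip_parens(cat):
    return cat[1:-1] if cat.startswith("(") and cat.endswith(")") else cat

def forward_apply(functor, argument):
    """X/Y : f  combined with  Y : a  gives  X : f(a)."""
    (fcat, fsem), (acat, asem) = functor, argument
    result, _, expected = fcat.partition("/")
    assert expected == acat, "category mismatch"
    return (strip_parens(result), fsem(asem))

def backward_apply(argument, functor):
    """Y : a  combined with  X\\Y : f  gives  X : f(a)."""
    (acat, asem), (fcat, fsem) = argument, functor
    result, _, expected = fcat.partition("\\")
    assert expected == acat, "category mismatch"
    return (strip_parens(result), fsem(asem))

lexicon = {
    "Marcel": ("NP", "marcel"),
    "proved": (r"(S\NP)/NP", lambda obj: lambda subj: f"prove({subj},{obj})"),
    "completeness": ("NP", "completeness"),
}

vp = forward_apply(lexicon["proved"], lexicon["completeness"])
sentence = backward_apply(lexicon["Marcel"], vp)
print(sentence)  # ('S', 'prove(marcel,completeness)')
```

Every syntactic combination step here is also a semantic one, which is the sense in which the grammar's semantics is "transparent".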
This article presents an algorithm for translating the Penn Treebank into a corpus of Combinatory Categorial Grammar (CCG) derivations augmented with local and long-range word–word dependencies. The resulting corpus, CCGbank, includes 99.4% of the sentences in the Penn Treebank. It is available from the Linguistic Data Consortium, and has been used to train …
A semantics of temporal categories in language and a theory of their use in defining the temporal relations between events both require a more complex structure on the domain underlying the meaning representations than is commonly assumed. This paper proposes an ontology based on such notions as causation and consequence, rather than on purely temporal …
The paper proposes a theory relating syntax, discourse semantics, and intonational prosody. The full range of English intonational tunes distinguished by Pierrehumbert and their semantic interpretation in terms of focus and information structure are discussed, including “discontinuous” themes and rhemes. The theory is based on Combinatory Categorial …
We describe an implemented system which automatically generates and animates conversations between multiple human-like agents with appropriate and synchronized speech, intonation, facial expressions, and hand gestures. Conversation is created by a dialogue planner that produces the text as well as the intonation of the utterances. The …
This paper addresses the problem of learning to map sentences to logical form, given training data consisting of natural language sentences paired with logical representations of their meaning. Previous approaches have been designed for particular natural languages or specific meaning representations; here we present a more general method. The approach …
This paper shows how to construct semantic representations from the derivations produced by a wide-coverage CCG parser. Unlike the dependency structures returned by the parser itself, these can be used directly for semantic interpretation. We demonstrate that well-formed semantic representations can be produced for over 97% of the sentences in unseen WSJ …
We consider the problem of learning factored probabilistic CCG grammars for semantic parsing from data containing sentences paired with logical-form meaning representations. Traditional CCG lexicons list lexical items that pair words and phrases with syntactic and semantic content. Such lexicons can be inefficient when words appear repeatedly with closely …
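The factoring idea can be sketched as follows (a toy illustration with invented names, not the paper's actual representation): shared templates carry the recurring category-plus-lambda shape, while per-word lexemes contribute only the constants, so verbs with parallel behavior no longer each need a full lexical entry.

```python
# Toy sketch of a factored lexicon: many transitive verbs reuse one template,
# so the lexicon stores one (template, constant) lexeme per verb instead of
# repeating a full category-plus-lambda entry. All names are invented.

templates = {
    # Transitive-verb template: instantiate with a predicate constant.
    "transitive": lambda pred: (r"(S\NP)/NP",
                                lambda obj: lambda subj: (pred, subj, obj)),
}

lexemes = {
    "borders": ("transitive", "border"),
    "crosses": ("transitive", "cross"),
}

def lookup(word):
    """Recombine a word's lexeme with its template into a full entry."""
    template_name, constant = lexemes[word]
    return templates[template_name](constant)

cat, sem = lookup("borders")
print(cat)                       # (S\NP)/NP
print(sem("oklahoma")("texas"))  # ('border', 'texas', 'oklahoma')
```

Adding a new transitive verb now costs one lexeme line, which is the efficiency gain the factored representation is after.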
Part-of-speech (POS) induction is one of the most popular tasks in research on unsupervised NLP. Many different methods have been proposed, yet comparisons are difficult to make since there is little consensus on evaluation framework, and many papers evaluate against only one or two competitor systems. Here we evaluate seven different POS induction systems …
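One metric commonly used when scoring an induced clustering against gold POS tags is many-to-one accuracy: map each induced cluster to its most frequent gold tag, then measure token-level accuracy under that mapping. A minimal sketch with invented data:

```python
# Many-to-one accuracy for POS induction evaluation: each induced cluster
# is mapped to its majority gold tag; tokens matching that tag count as
# correct. The tag sequences below are invented for illustration.

from collections import Counter, defaultdict

def many_to_one(induced, gold):
    by_cluster = defaultdict(Counter)
    for cluster, tag in zip(induced, gold):
        by_cluster[cluster][tag] += 1
    # Each cluster contributes the count of its most frequent gold tag.
    correct = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return correct / len(gold)

induced = [0, 0, 1, 1, 1, 2]
gold    = ["DT", "DT", "NN", "NN", "VB", "NN"]
print(many_to_one(induced, gold))  # 5/6, about 0.833
```

The metric rewards fine-grained clusterings (in the degenerate limit, one cluster per token scores 1.0), which is one reason evaluations in this area typically report several complementary measures.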