Daniel Jurafsky

Machine translation (MT) is the use of computers to automate some or all of the process of translating from one language to another. Translation, in its full generality, is a difficult, fascinating, and intensely human endeavor, as rich as any other area of human creativity. Consider the following passage from the end of Chapter 45 of the 18th-century novel The Story of the Stone, also …
Modern models of relation extraction for tasks like ACE are based on supervised learning of relations from small hand-labeled corpora. We investigate an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size. Our experiments use Freebase, a large semantic …
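A minimal sketch of the distant-supervision idea described above: any sentence mentioning an entity pair that the knowledge base lists as related is treated as a noisy positive example, its lexical features are pooled per entity pair, and an ordinary classifier is trained on the pooled examples. The toy knowledge base, sentences, and single "words between the mentions" feature are illustrative placeholders, not the paper's actual data or feature templates.

from collections import defaultdict
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Toy Freebase-style knowledge base: (entity1, entity2) -> relation label
kb = {
    ("Barack Obama", "Honolulu"): "place_of_birth",
    ("Bill Gates", "Microsoft"): "founder_of",
}
corpus = [
    "Barack Obama was born in Honolulu in 1961 .",
    "Honolulu is the hometown of Barack Obama .",
    "Bill Gates founded Microsoft with Paul Allen .",
]

def pair_features(sentence, e1, e2):
    """Toy feature: the words between the two mentions (a real system also
    encodes direction, syntax, and named-entity types)."""
    tokens = sentence.split()
    i, j = tokens.index(e1.split()[0]), tokens.index(e2.split()[0])
    lo, hi = sorted((i, j))
    return {f"between={w}": 1 for w in tokens[lo + 1:hi]}

# Pool features over every sentence that mentions a related entity pair;
# the relation label from the knowledge base supervises the pooled example.
pooled = defaultdict(dict)
for (e1, e2), relation in kb.items():
    for sent in corpus:
        if e1 in sent and e2 in sent:
            pooled[(e1, e2, relation)].update(pair_features(sent, e1, e2))

X_dicts = list(pooled.values())
y = [relation for (_, _, relation) in pooled]

vec = DictVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(X_dicts), y)

# Classify a new entity pair from the features of a sentence mentioning it.
new_feats = pair_features("Steve Jobs founded Apple in a garage .", "Steve Jobs", "Apple")
print(clf.predict(vec.transform([new_feats])))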
Human linguistic annotation is crucial for many natural language processing tasks but can be expensive and time-consuming. We explore the use of Amazon’s Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web. We investigate five tasks: affect recognition, …
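One simple way to make use of such cheap, redundant non-expert labels is to collect several annotations per item and take a majority vote, then measure agreement against expert gold labels. The items, labels, and gold annotations below are invented toy data, not the paper's.

from collections import Counter

# item id -> labels from several independent non-expert annotators
turk_labels = {
    "headline_1": ["joy", "joy", "surprise", "joy", "joy"],
    "headline_2": ["anger", "fear", "anger", "anger", "fear"],
}
gold = {"headline_1": "joy", "headline_2": "anger"}   # expert labels

def majority(labels):
    """Return the most frequent label among the annotators."""
    label, _count = Counter(labels).most_common(1)[0]
    return label

agree = sum(majority(v) == gold[k] for k, v in turk_labels.items())
print(f"majority vote matches gold on {agree}/{len(gold)} items")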
We present a system for identifying the semantic relationships, or semantic roles, filled by constituents of a sentence within a semantic frame. Given an input sentence and a target word and frame, the system labels constituents with either abstract semantic roles such as AGENT or PATIENT, or more domain-specific semantic roles such as SPEAKER, MESSAGE, and …
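A minimal sketch of the labeling step in this spirit: each candidate constituent is described by a few simple features (phrase type, position relative to the target, target lemma) and assigned the role most often seen with those features in training. The feature set, counts, and roles below are toy placeholders; the real system estimates probabilities over richer features with backoff.

from collections import Counter, defaultdict

# toy "training" observations: (phrase_type, position, target_lemma) -> role
observations = [
    (("NP", "before", "tell"), "SPEAKER"),
    (("NP", "before", "tell"), "SPEAKER"),
    (("NP", "after", "tell"), "ADDRESSEE"),
    (("SBAR", "after", "tell"), "MESSAGE"),
]
counts = defaultdict(Counter)
for feats, role in observations:
    counts[feats][role] += 1

def label_constituent(phrase_type, position, target):
    """Return the most probable role for this feature combination."""
    feats = (phrase_type, position, target)
    if counts[feats]:
        return counts[feats].most_common(1)[0][0]
    return None  # a real system backs off to coarser feature sets here

# "She told him that it was late." -- candidate constituents for target "tell"
print(label_constituent("NP", "before", "tell"))    # SPEAKER
print(label_constituent("SBAR", "after", "tell"))   # MESSAGE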
We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as STATEMENT, QUESTION, BACKCHANNEL, AGREEMENT, DISAGREEMENT, and APOLOGY. Our model detects and predicts dialogue acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialogue act sequence.
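A minimal sketch of combining per-utterance cue evidence with sequence coherence, HMM-style: toy emission scores stand in for the lexical and prosodic cue models, toy transition probabilities stand in for the dialogue-act "grammar", and Viterbi decoding picks the best tag sequence. All probabilities, cues, and the tag set here are invented for illustration.

import math

acts = ["STATEMENT", "QUESTION", "BACKCHANNEL"]
# toy transition probabilities P(next act | previous act)
trans = {
    "STATEMENT":   {"STATEMENT": 0.6, "QUESTION": 0.3, "BACKCHANNEL": 0.1},
    "QUESTION":    {"STATEMENT": 0.45, "QUESTION": 0.1, "BACKCHANNEL": 0.45},
    "BACKCHANNEL": {"STATEMENT": 0.6, "QUESTION": 0.3, "BACKCHANNEL": 0.1},
}
start = {"STATEMENT": 0.5, "QUESTION": 0.4, "BACKCHANNEL": 0.1}

def emission(utterance, act):
    """Toy lexical-cue likelihood; the real model also uses prosodic cues."""
    if utterance.endswith("?") and act == "QUESTION":
        return 0.8
    if utterance.lower().strip(" .") in {"uh-huh", "yeah", "right"} and act == "BACKCHANNEL":
        return 0.8
    if act == "STATEMENT":
        return 0.4
    return 0.1

def viterbi(utterances):
    """Standard Viterbi decoding over the toy model above."""
    V = [{a: math.log(start[a] * emission(utterances[0], a)) for a in acts}]
    back = []
    for u in utterances[1:]:
        col, ptr = {}, {}
        for a in acts:
            prev = max(acts, key=lambda p: V[-1][p] + math.log(trans[p][a]))
            col[a] = V[-1][prev] + math.log(trans[prev][a] * emission(u, a))
            ptr[a] = prev
        V.append(col)
        back.append(ptr)
    path = [max(acts, key=lambda a: V[-1][a])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi(["Did you see the game ?", "Yeah .", "It went to overtime ."]))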
Semantic taxonomies such as WordNet provide a rich source of knowledge for natural language processing applications, but are expensive to build, maintain, and extend. Motivated by the problem of automatically constructing and extending such taxonomies, in this paper we present a new algorithm for automatically learning hypernym (is-a) relations from text.
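A minimal sketch of the simplest ancestor of this idea: harvesting (hyponym, hypernym) pairs with a single Hearst-style lexico-syntactic pattern ("X such as Y"). The paper instead learns dependency-path features and a classifier over many such patterns automatically; the regex and sentences below are toy placeholders.

import re

sentences = [
    "He studies semantic taxonomies such as WordNet .",
    "The zoo keeps large mammals , such as elephants , rhinos and hippos .",
]

def hearst_pairs(sentence):
    """Return (hyponym, hypernym) pairs matched by one 'X such as Y' pattern."""
    pairs = []
    for m in re.finditer(r"(\w+)(?: ,)? such as ([\w ,]+)", sentence):
        hypernym = m.group(1)
        for hyponym in re.split(r" , | and ", m.group(2).strip(" ,")):
            if hyponym.strip():
                pairs.append((hyponym.strip(), hypernym))
    return pairs

for s in sentences:
    print(hearst_pairs(s))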
This paper details the coreference resolution system submitted by Stanford at the CoNLL-2011 shared task. Our system is a collection of deterministic coreference resolution models that incorporate lexical, syntactic, semantic, and discourse information. All these models use global document-level information by sharing mention attributes, such as gender and …
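A minimal sketch of the multi-pass, deterministic "sieve" idea in this spirit: high-precision passes run first, merged clusters share attributes such as gender, and later, lower-precision passes can use those shared attributes. The mentions, attributes, and the two sieves below are simplified toy stand-ins, not the Stanford system's actual passes.

mentions = [
    {"id": 0, "text": "Hillary Clinton", "head": "Clinton", "gender": "female"},
    {"id": 1, "text": "Clinton",         "head": "Clinton", "gender": None},
    {"id": 2, "text": "she",             "head": "she",     "gender": "female"},
]
clusters = {m["id"]: {m["id"]} for m in mentions}   # start: each mention alone

def merge(a, b):
    """Merge two clusters and propagate shared attributes (here: gender)."""
    union = clusters[a] | clusters[b]
    for mid in union:
        clusters[mid] = union
    genders = {mentions[mid]["gender"] for mid in union} - {None}
    if len(genders) == 1:
        g = genders.pop()
        for mid in union:
            mentions[mid]["gender"] = g

def head_match_sieve():
    """High-precision pass: merge mentions with identical head words."""
    for i in range(len(mentions)):
        for j in range(i):
            if mentions[i]["head"] == mentions[j]["head"]:
                merge(i, j)

def pronoun_sieve():
    """Lower-precision pass: attach pronouns to gender-compatible antecedents."""
    for i, m in enumerate(mentions):
        if m["text"].lower() in {"she", "he", "it", "they"}:
            for j in range(i - 1, -1, -1):
                if mentions[j]["gender"] == m["gender"]:
                    merge(i, j)
                    break

for sieve in (head_match_sieve, pronoun_sieve):   # precision-ordered passes
    sieve()
print({frozenset(c) for c in clusters.values()})  # final coreference clusters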
In this paper, we propose a machine learning algorithm for shallow semantic parsing, extending the work of Gildea and Jurafsky (2002), Surdeanu et al. (2003) and others. Our algorithm is based on Support Vector Machines which we show give an improvement in performance over earlier classifiers. We show performance improvements through a number of new …
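A minimal sketch of the SVM-based role classification step: each candidate constituent becomes a feature dictionary (phrase type, position, voice, head word, a small subset of the features such systems use) and a linear SVM assigns the role. The tiny training set and feature values are illustrative only.

from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

train = [
    ({"phrase": "NP", "position": "before", "voice": "active",  "head": "officer"}, "ARG0"),
    ({"phrase": "NP", "position": "after",  "voice": "active",  "head": "car"},     "ARG1"),
    ({"phrase": "PP", "position": "after",  "voice": "active",  "head": "street"},  "ARGM-LOC"),
    ({"phrase": "NP", "position": "after",  "voice": "passive", "head": "officer"}, "ARG0"),
]
X_dicts, y = zip(*train)
vec = DictVectorizer()
X = vec.fit_transform(X_dicts)           # one-hot encode the categorical features
clf = LinearSVC().fit(X, list(y))        # linear SVM role classifier

test = {"phrase": "NP", "position": "before", "voice": "active", "head": "driver"}
print(clf.predict(vec.transform([test])))   # predicted semantic role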
The natural language processing community has recently experienced a growth of interest in domain-independent shallow semantic parsing—the process of assigning a Who did What to Whom, When, Where, Why, How etc. structure to plain text. This process entails identifying groups of words in a sentence that represent these semantic arguments and assigning …
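To make the target output concrete, here is a sketch of the "Who did What to Whom, When, Where" structure such a parser produces: token spans identified as arguments of a predicate, each with a role label. The spans and labels are hand-written purely to illustrate the output format, not produced by a real parser.

sentence = "The officer stopped the car on the street yesterday".split()

# (start, end) token spans identified as arguments of the predicate "stopped"
proposition = {
    "predicate": "stopped",
    "arguments": [
        {"span": (0, 2), "role": "Who (ARG0)"},
        {"span": (3, 5), "role": "What (ARG1)"},
        {"span": (5, 8), "role": "Where (ARGM-LOC)"},
        {"span": (8, 9), "role": "When (ARGM-TMP)"},
    ],
}

for arg in proposition["arguments"]:
    start, end = arg["span"]
    print(f'{arg["role"]:18s} -> {" ".join(sentence[start:end])}')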
We propose a novel algorithm for inducing semantic taxonomies. Previous algorithms for taxonomy induction have typically focused on independent classifiers for discovering new single relationships based on hand-constructed or automatically discovered textual patterns. By contrast, our algorithm flexibly incorporates evidence from multiple classifiers over …
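A minimal sketch of the core idea of combining evidence from multiple relation classifiers when attaching a new term to a taxonomy: candidate parents are scored by summing log-likelihood-style evidence from a hypernym classifier and a coordinate-term ("sibling") classifier, and the best-scoring attachment wins. The taxonomy, classifier outputs, and smoothing constant are invented toy numbers, not the paper's model.

import math

taxonomy = {"animal": ["dog", "cat"], "vehicle": ["car", "truck"]}

# toy classifier outputs: probability that (new_term, x) stands in the relation
hypernym_prob = {("wolf", "animal"): 0.7, ("wolf", "vehicle"): 0.05}
coordinate_prob = {("wolf", "dog"): 0.8, ("wolf", "cat"): 0.6,
                   ("wolf", "car"): 0.1, ("wolf", "truck"): 0.1}

def attachment_score(term, parent):
    """Combine direct hypernym evidence with coordinate evidence from siblings."""
    score = math.log(hypernym_prob.get((term, parent), 0.01))
    for sibling in taxonomy[parent]:
        score += math.log(coordinate_prob.get((term, sibling), 0.01))
    return score

best = max(taxonomy, key=lambda parent: attachment_score("wolf", parent))
print("attach 'wolf' under:", best)   # -> animal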