Following the recent adoption by the machine translation community of automatic evaluation using the BLEU/NIST scoring process, we conduct an in-depth study of a similar idea for evaluating summaries. The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprisingly well with human evaluations, based on ...
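As a rough illustration of unigram co-occurrence scoring between a candidate summary and a reference summary, here is a minimal sketch. The abstract does not give the exact formula, so the clipped unigram recall used here, the whitespace tokenization, and the function name `unigram_recall` are all assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter


def unigram_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams that also occur in the candidate.

    Assumption: lowercase whitespace tokenization and clipped counts
    (a reference word is matched at most as often as it appears in
    the candidate). This is an illustrative choice, not the paper's.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clip each reference word's matches by its count in the candidate.
    overlap = sum(min(count, cand_counts[word]) for word, count in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0


score = unigram_recall("the cat sat on the mat", "the cat lay on the mat")
```

In this example five of the six reference tokens are matched, so `score` is 5/6.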
The goal of the OntoNotes project is to provide linguistic data annotated with a skeletal representation of the literal meaning of sentences including syntactic parse, predicate-argument structure, coreference, and word senses linked to an ontology, allowing a new generation of language understanding technologies to be developed with new functional ...
In this paper we explore the power of surface text patterns for open-domain question answering systems. In order to obtain an optimal set of patterns, we have developed a method for learning such patterns automatically. A tagged corpus is built from the Internet in a bootstrapping process by providing a few hand-crafted examples of each question type to ...
1 Motivation. The three motivations behind the Rand index [4], a general clustering evaluation metric, can be rephrased in coreference terms: (i) every mention is unequivocally assigned to a specific entity; (ii) entities are defined just as much by those mentions which they do not contain as by those mentions which they do contain; and (iii) all mentions ...
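For readers unfamiliar with the Rand index, the following is a minimal sketch of the standard pairwise definition applied to mention-to-entity assignments. The dictionary representation and the name `rand_index` are assumptions for illustration; the abstract is truncated and does not show how the authors adapt the metric to coreference.

```python
from itertools import combinations


def rand_index(gold: dict, predicted: dict) -> float:
    """Standard Rand index over mention pairs.

    gold, predicted: hypothetical mappings from mention id to entity
    label, covering the same set of mentions. A pair counts as an
    agreement when both assignments put the two mentions in the same
    entity, or both put them in different entities.
    """
    mentions = sorted(gold)
    agreements = 0
    pairs = 0
    for a, b in combinations(mentions, 2):
        same_in_gold = gold[a] == gold[b]
        same_in_pred = predicted[a] == predicted[b]
        agreements += same_in_gold == same_in_pred
        pairs += 1
    # With fewer than two mentions there are no pairs to disagree on.
    return agreements / pairs if pairs else 1.0


gold = {"m1": "A", "m2": "A", "m3": "B", "m4": "B"}
predicted = {"m1": "X", "m2": "X", "m3": "X", "m4": "Y"}
value = rand_index(gold, predicted)
```

Here three of the six mention pairs are treated consistently by both assignments, so `value` is 0.5. Pairs that are separated in both assignments count as agreements, which reflects point (ii) above.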
In order to produce a good summary, one has to identify the most relevant portions of a given text. We describe in this paper a method for automatically training topic signatures, sets of related words with associated weights organized around head topics, and illustrate with signatures we created with 6,194 TREC collection texts over 4 selected ...
Identifying sentiments (the affective parts of opinions) is a challenging problem. We present a system that, given a topic, automatically finds the people who hold opinions about that topic and the sentiment of each opinion. The system contains a module for determining word sentiment and another for combining sentiments within a sentence. We experiment with ...
This paper presents a method for identifying an opinion with its holder and topic, given a sentence in online news media texts. We introduce an approach of exploiting the semantic structure of a sentence, anchored to an opinion bearing verb or adjective. This method uses semantic role labeling as an intermediate step to label an opinion holder and topic ...
Though most text generators are capable of simply stringing together more than one sentence, they cannot determine which order will ensure a coherent paragraph. A paragraph is coherent when the information in successive sentences follows some pattern of inference or of knowledge with which the hearer is familiar. To signal such inferences, speakers usually ...
Although many algorithms have been developed to harvest lexical resources, few organize the mined terms into taxonomies. We propose (1) a semi-supervised algorithm that uses a root concept, a basic level concept, and recursive surface patterns to learn automatically from the Web hyponym-hypernym pairs subordinated to the root; (2) a Web-based concept ...
The automatic interpretation of noun-noun compounds is an important subproblem within many natural language processing applications and is an area of increasing interest. The problem is difficult, with disagreement regarding the number and nature of the relations, low inter-annotator agreement, and limited annotated data. In this paper, we present a novel ...