Data Set Used
While the mechanisms for conveying temporal information in language have been extensively studied by linguists, very little of this work has been done in the tradition of corpus linguistics. In this paper we discuss the outcomes of a research effort to build a corpus, called TIMEBANK, which is richly annotated to indicate events, times, and…
This paper describes the Second PASCAL Recognising Textual Entailment Challenge (RTE-2). We describe the RTE-2 dataset and give an overview of the submissions for the challenge. One of the main goals for this year's dataset was to provide more "realistic" text-hypothesis examples, based mostly on outputs of actual systems. The 23 submissions for the challenge…
The views, opinions, and/or findings contained in this report are those of the MITRE Corporation and should not be construed as an official Government position, policy, or decision, unless designated by other documentation.
We describe MITRE's two submissions to the RTE Challenge, intended to exemplify two different ends of the spectrum of possibilities. The first submission is a traditional system based on linguistic analysis and inference, while the second is inspired by alignment approaches from machine translation. We also describe our efforts to build our own entailment…
This paper introduces a set of guidelines for annotating time expressions with a canonicalized representation of the times they refer to, and describes methods for extracting such time expressions from multiple languages.
We describe our efforts to generate a large (100,000 instance) corpus of textual entailment pairs from the lead paragraph and headline of news articles. We manually inspected a small set of news stories in order to locate the most productive source of entailments, then built an annotation interface for rapid manual evaluation of further exemplars. With this…
The ability to communicate in natural language has long been considered a defining characteristic of human intelligence. Furthermore, we hold our ability to express ideas in writing as a pinnacle of this uniquely human language facility: it defies formulaic or algorithmic specification. So it comes as no surprise that attempts to devise computer programs…
In this paper, we report on Qaviar, an experimental automated evaluation system for question answering applications. The goal of our research was to find an automatically calculated measure that correlates well with human judges' assessment of answer correctness in the context of question answering tasks. Qaviar judges the response by computing recall…
Broadcast news is a rich domain for information extraction, but one that presents new challenges for evaluation. In this paper we present an overview of the first evaluation of information extraction from broadcast news that was conducted as part of the DARPA-funded Hub 4 1998 workshop. We discuss the work that was required to design and administer the…
This paper introduces a set of guidelines for annotating time expressions with a canonicalized representation of the times they refer to. Applications that can benefit from such an annotated corpus include information extraction (e.g., normalizing temporal references for database entry), question answering (answering "when" questions), summarization…