While the mechanisms for conveying temporal information in language have been extensively studied by linguists, very little of this work has been done in the tradition of corpus linguistics. In this paper we discuss the outcomes of a research effort to build a corpus, called TIMEBANK, which is richly annotated to indicate events, times, and ...
Historically, tailoring language processing systems to specific domains and languages for which they were not originally built has required a great deal of effort. Recent advances in corpus-based manual and automatic training methods have shown promise in reducing the time and cost of this porting process. These developments have focused even greater ...
As with several other veteran MUC participants, MITRE's Alembic system has undergone a major transformation in the past two years. The genesis of this transformation occurred during a dinner conversation at the last MUC conference, MUC-5. At that time, several of us reluctantly admitted that our major impediment to improved performance was reliance on ...
MiTAP (MITRE Text and Audio Processing) is a prototype system available for monitoring infectious disease outbreaks and other global events. MiTAP focuses on providing timely, multi-lingual, global information access to medical experts and individuals involved in humanitarian assistance and relief work. Multiple information sources in multiple languages are ...
For several years, chunking has been an integral part of MITRE's approach to information extraction. Our work exploits chunking in two principal ways. First, as part of our extraction system (Alembic) (Aberdeen et al., 1995), the chunker delineates descriptor phrases for entity extraction. Second, as part of our ongoing research in parsing, chunks provide ...
To support a range of textual annotation tasks, we have developed a new annotation tool called Callisto. To promote task-specific specialization of the interface and associated constraint checking, Callisto provides a facility for the independent development, compilation and installation of task module plug-ins (in the form of Java Archive jar ...
Alembic is a comprehensive information extraction system that has been applied to a range of tasks. These include the now-standard components of the formal MUC evaluations: name tagging (NE in MUC-6), name normalization (TE), and template generation (ST). The system has also been exploited to help segment and index broadcast video and was used for early ...