A model-theoretic coreference scoring scheme
This note describes a scoring scheme for the coreference task in MUC6. It improves on the original approach by: (1) grounding the scoring scheme in terms of a model; (2) producing more intuitive...
Overview of BioCreAtIvE: critical assessment of information extraction for biology
The first BioCreAtIvE assessment provided state-of-the-art performance results for a basic task (gene name finding and normalization), where the best systems achieved a balanced 80% precision / recall or better, which potentially makes them suitable for real applications in biology.
Natural language question answering: the view from here
The best systems are now able to answer more than two thirds of factual questions in this evaluation, with recent successes reported in a series of question-answering evaluations.
The TIPSTER SUMMAC Text Summarization Evaluation
The TIPSTER Text Summarization Evaluation (SUMMAC) has established definitively that automatic text summarization is very effective in relevance assessment tasks. Summaries as short as 17% of full...
Overview of BioCreative II gene normalization
Major advances for the BioCreative II gene normalization task include broader participation (20 versus 8 teams) and a pooled system performance comparable to human experts, at over 90% agreement; such systems show promise as tools to link the literature with biological databases.
Deep Read: A Reading Comprehension System
Initial work on Deep Read, an automated reading comprehension system that accepts arbitrary text input (a story) and answers questions about it, is described, with a baseline system that retrieves the sentence containing the answer 30–40% of the time.
Automating Coreference: The Role of Annotated Training Data
A study of interannotator agreement in the coreference task as defined by the Message Understanding Conference (MUC-6 and MUC-7) clarified and simplified the annotation specification, and an analysis of disagreement among several annotators concluded that only 16% of disagreements represented genuine disagreement about coreference.
Evaluating Message Understanding Systems: An Analysis of the Third Message Understanding Conference (MUC-3)
The purpose, history, and methodology of the conference are reviewed, the participating systems are summarized, issues of measuring system effectiveness are discussed, the linguistic phenomena tests are described, and a critical look at the evaluation in terms of the lessons learned is provided.
MITRE: description of the Alembic system used for MUC-6
As with several other veteran MUC participants, MITRE's Alembic system has undergone a major transformation in the past two years. The genesis of this transformation occurred during a dinner...
Overcoming barriers to NLP for clinical text: the role of shared tasks and the need for additional creative solutions
This issue of JAMIA focuses on natural language processing (NLP) techniques for clinical-text information extraction and shared tasks like the i2b2/VA Challenge, a shared-task challenge co-sponsored by the Veteran's Administration for the last 2 years.