Dina Demner-Fushman

[Table fragment; column headers not recovered] title: 85, 10, 5 | Title + 1st two sentences: 90, 5, 5 | Entire abstract: 86, 2, 12
… of the abstract varying in length: abstract title only, abstract title and first two sentences, and entire abstract text. Concepts in the title, in the introduction section of structured abstracts, or in the first two sentences in unstructured abstracts, are given higher confidence …
An experiment was performed at the National Library of Medicine® (NLM®) in word sense disambiguation (WSD) using the Journal Descriptor Indexing (JDI) methodology. The motivation is the need to solve the ambiguity problem confronting NLM's MetaMap system, which maps free text to terms corresponding to concepts in NLM's Unified Medical Language …
The ability to accurately model the content structure of text is important for many natural language processing applications. This paper describes experiments with generative models for analyzing the discourse structure of medical abstracts, which generally follow the pattern of “introduction”, “methods”, “results”, and “conclusions”. We demonstrate that …
This paper presents a hybrid approach to question answering in the clinical domain that combines techniques from summarization and information retrieval. We tackle a frequently-occurring class of questions that takes the form “What is the best drug treatment for X?” Starting from an initial set of MEDLINE citations, our system first identifies the drugs …
Following recent developments in the automatic evaluation of machine translation and document summarization, we present a similar approach, implemented in a measure called POURPRE, for automatically evaluating answers to definition questions. Until now, the only way to assess the correctness of answers to such questions involves manual determination of …
We describe a natural language processing system (Enhanced SemRep) to identify core assertions on pharmacogenomics in Medline citations. Extracted information is represented as semantic predications covering a range of relations relevant to this domain. The specific relations addressed by the system provide greater precision than that achievable with …
It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can …
The paradigm of evidence-based medicine (EBM) recommends that physicians formulate clinical questions in terms of the problem/population, intervention, comparison, and outcome. Together, these elements comprise a PICO frame. Although this framework was developed to facilitate the formulation of clinical queries, the ability of PICO structures to represent …
The biomedical community makes extensive use of text mining technology. In the past several years, enormous progress has been made in developing tools and methods, and the community has been witness to some exciting developments. Although the state of the community is regularly reviewed, the sheer volume of work related to biomedical text mining and the …
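As a rough illustration of the four PICO elements named in the abstract above, a frame could be modelled as a small record. This is a minimal sketch and not the representation used in the paper: the class name `PICOFrame`, its field names, the `missing_elements` helper, and the sample clinical question are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PICOFrame:
    """Hypothetical container for the four PICO elements of a clinical question."""

    problem: Optional[str] = None        # problem/population
    intervention: Optional[str] = None   # intervention under consideration
    comparison: Optional[str] = None     # alternative it is compared against
    outcome: Optional[str] = None        # outcome of interest

    def missing_elements(self) -> List[str]:
        """Return the names of elements that are unset or empty, in PICO order."""
        return [name for name, value in vars(self).items() if not value]


# Illustrative question only; the wording is invented for this sketch.
frame = PICOFrame(
    problem="adults with hypertension",
    intervention="ACE inhibitors",
    outcome="blood pressure reduction",
)

print(frame.missing_elements())
```

Leaving every field optional reflects that a clinical question may not state all four elements; in this example no comparison is given, so the helper reports that slot as unfilled.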