Research Paper: Automated Acquisition of Disease-Drug Knowledge from Biomedical and Clinical Documents: An Initial Study

Authors: Elizabeth S. Chen, George Hripcsak, Hua Xu, Marianthi Markatou, Carol Friedman
Journal of the American Medical Informatics Association (JAMIA), volume 15, issue 1

OBJECTIVE: Explore the automated acquisition of knowledge in biomedical and clinical documents using text mining and statistical techniques to identify disease-drug associations.

DESIGN: Biomedical literature and clinical narratives from the patient record were mined to gather knowledge about disease-drug associations. Two NLP systems, BioMedLEE and MedLEE, were applied to Medline articles and discharge summaries, respectively. Disease and drug entities were identified using the NLP systems in…

Automated Knowledge Acquisition from Clinical Narrative Reports

The preliminary study demonstrated that this method for knowledge acquisition of disease-symptom pairs from clinical reports is effective, and can be applied to detect other clinical associations, such as between diseases and medications.

Investigating drug–disease interactions in drug–symptom–disease triples via citation relations

The results showed that the citation relation‐based approach resulted in diverse types of biomedical entities and preserved topical consistency, and drug–disease pairs identified only via citation relations are interesting for clinical trials when they are examined using BITOLA.


This study concludes that a small number of expert-curated biomedical literature resources can prove sufficient for acquiring near-saturated disease-specific vocabularies.

Pattern-based Extraction of Disease Drug Combination Knowledge from Biomedical Literature

A word sense disambiguation method utilizing POS tagging is developed and used to extract disease and drug combination pairs from MEDLINE abstracts, and the patterns of association between drug combinations and diseases are analyzed.

A Knowledge Graph of Combined Drug Therapies Using Semantic Predications From Biomedical Literature: Algorithm Development

An automated, visual approach is developed to discover knowledge about combination therapies from biomedical literature, especially from studies with high-level evidence such as clinical trial reports and clinical practice guidelines.

Mining Patterns of Drug-Disease Association from Biomedical Texts

The goal of the paper, automatic extraction of drug-disease relations from biomedical texts, is fundamental to the study of drug repurposing, since many clinical case studies are published in an unstructured textual form.



Extracting Phenotypic Information from the Literature via Natural Language Processing

This work presents the development of a system called BioMedLEE that extracts a broad variety of phenotypic information from the biomedical literature by adapting MedLEE, an existing clinical information extraction NLP engine.

EDGAR: extraction of drugs, genes and relations from the biomedical literature.

This paper reports on the mechanisms for automatically generating assertions about drugs and genes relevant to cancer, and on a simple application: conceptual clustering of documents.

Automated knowledge extraction from the UMLS

Evaluations showed that the UMLS MRCOC knowledge has good sensitivity, especially regarding disease-drug relationships, and this knowledge is being used to produce disease-specific views of patients' electronic patient record.

Extracting Drug-Drug Interaction Articles from MEDLINE to Improve the Content of Drug Databases

The study showed that automated classification techniques have the potential to perform at least as well as PubMed in identifying drug-drug interaction articles.

Knowledge Discovery for Advanced Clinical Data Management and Analysis

Human experts and available domain expert systems could provide suggestions as to how to improve all major steps in the knowledge discovery process, such as evaluation of knowledge, choice of data mining methods, and data input.

Mining a clinical data warehouse to discover disease-finding associations using co-occurrence statistics

Two methods, the chi-square (χ²) statistic and the proportion confidence interval (PCI) method, are used to measure the dependence between pairs of diseases and findings, and heuristic cutoff values are used for association selection to construct knowledge bases of disease-finding relations.
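
As a rough illustration of the co-occurrence idea in this summary, the sketch below computes a Pearson chi-square statistic for a single disease-finding pair from a 2x2 table of document counts. It is a minimal sketch under stated assumptions, not the cited paper's implementation: the function name `chi_square_2x2`, the example counts, and the cutoff are all hypothetical, and the PCI method is not shown because the summary does not define it.

```python
def chi_square_2x2(n_both, n_disease_only, n_finding_only, n_neither):
    """Pearson chi-square statistic (no continuity correction) for a
    2x2 co-occurrence table of document counts.

    n_both          -- documents mentioning both the disease and the finding
    n_disease_only  -- documents mentioning the disease but not the finding
    n_finding_only  -- documents mentioning the finding but not the disease
    n_neither       -- documents mentioning neither
    """
    a, b, c, d = n_both, n_disease_only, n_finding_only, n_neither
    n = a + b + c + d
    # Product of the four marginal totals (two rows, two columns).
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A marginal total of zero means the pair cannot be assessed.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical counts for one disease-finding pair (not taken from any paper).
stat = chi_square_2x2(n_both=30, n_disease_only=10,
                      n_finding_only=20, n_neither=940)

# 3.84 is the standard 0.05 critical value for one degree of freedom.
# It is used here only for illustration; the cited work describes its
# own heuristic cutoffs, which are not given in this summary.
ILLUSTRATIVE_CUTOFF = 3.84
is_candidate = stat > ILLUSTRATIVE_CUTOFF
```

A large statistic only indicates that the two terms co-occur more or less often than independence would predict. It does not indicate the direction or the clinical meaning of the association, which is why a cutoff and later review are still needed.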

Supporting Discovery in Medicine by Association Rule Mining in Medline and UMLS

The evaluation showed that the approach for supporting discovery in medicine is successful, but also that some improvements are needed, especially on limiting the number of potential discoveries the system generates.

Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program

The UMLS Metathesaurus, the largest thesaurus in the biomedical domain, provides a representation of biomedical knowledge consisting of concepts classified by semantic type and both hierarchical and

MeSHmap: a text mining tool for MEDLINE

The proposed MeSHmap supports searches via PubMed followed by user-driven exploration of the MeSH terms and subheadings in the retrieved set, and has the potential to generate maps of entities such that the strength of the link between two entities in the map represents their similarity as expressed in the MeSH metadata of the MEDLINE documents.