Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation

  title={Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation},
  author={Antonio Jimeno-Yepes and Bridget T. McInnes and Alan R. Aronson},
  journal={BMC Bioinformatics},
  pages={223 - 223}
Background: Evaluation of Word Sense Disambiguation (WSD) methods in the biomedical domain is difficult because the available resources are either too small or too focused on specific types of entities (e.g. diseases or genes). We present a method that can be used to automatically develop a WSD test collection using the Unified Medical Language System (UMLS) Metathesaurus and the manual MeSH indexing of MEDLINE. We demonstrate the use of this method by developing such a data set, called MSH WSD…

An Efficient Method for Biomedical Word Sense Disambiguation Based on Web-Kernel Similarity

The authors propose a web-based kernel function for measuring the semantic relatedness between concepts, use it to disambiguate an expression against multiple candidate concepts, and evaluate a WSD algorithm based on this measure.

Preliminary Results for Biomedical Word Sense Disambiguation Based on Semantic Clustering

The aim is to design scalable unsupervised WSD methods for the semantic annotation of large biomedical corpora; the authors present a clustering-based method that takes advantage of the KR information encoded in the form of kernels.

Knowledge-Based Biomedical Word Sense Disambiguation: An Evaluation and Application to Clinical Document Classification

  • Vijay Garla, C. Brandt
  • Computer Science
    2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology
  • 2012
A knowledge-based WSD method that uses semantic similarity measures derived from the Unified Medical Language System (UMLS) is presented, and the contribution of WSD to clinical text classification is evaluated on the 2007 Computational Medicine Challenge corpus.

Supervised Learning and Knowledge-Based Approaches Applied to Biomedical Word Sense Disambiguation

The results show that word embedding models improved the disambiguation accuracy, proving to be a powerful resource in the WSD task.

Determining the difficulty of Word Sense Disambiguation

Disambiguation of entities in MEDLINE abstracts by combining MeSH terms with knowledge

This work proposes an automatic, light-weight method that processes MEDLINE abstracts at large scale and with high-quality output, and exploits MeSH terms and knowledge in UMLS to first identify unambiguous anchor entities, and then disambiguate remaining entities via heuristics.

Studying the correlation between different word sense disambiguation methods and summarization effectiveness in biomedical texts

It has been found that the use of WSD techniques has a positive impact on the results of the graph-based summarizer, and that, when both the WSD and summarization tasks are assessed over large and homogeneous evaluation collections, there exists a correlation between the overall results.

Integration of UMLS and MEDLINE in Unsupervised Word Sense Disambiguation

This work proposes the integration of the UMLS and MEDLINE to create concept profiles which are used to perform knowledge-based word sense disambiguation and shows a statistically significant improvement over other knowledge-based methods based on the UMLS on this data set.



Word sense disambiguation by selecting the best semantic type based on Journal Descriptor Indexing: Preliminary experiment

An experiment was performed at the National Library of Medicine in word sense disambiguation (WSD) using the Journal Descriptor Indexing (JDI) methodology, which is the basis for selecting the best meaning that is correlated to UMLS semantic types assigned to ambiguous concepts in the Metathesaurus.

Knowledge-based biomedical word sense disambiguation: comparison of approaches

Four approaches which rely on the UMLS Metathesaurus as the source of knowledge to perform word sense disambiguation (WSD) achieve better results, but the performance is still below statistical learning trained on manually produced data and below the maximum frequency sense baseline.

Supervised and knowledge-based methods for disambiguating terms in biomedical text using the UMLS and MetaMap

K-CUI explores the use of Concept Unique Identifiers (CUIs) as assigned by MetaMap, as features for a supervised learning method for word sense disambiguation and investigates four techniques to reduce the noise in the feature set by restricting which CUIs to include.

Generating quality word sense disambiguation test sets based on MeSH indexing

A MeSH-based automatic sense-tagging method is presented that preferentially annotates terms that are topical to the text, and it is suggested that, by cross-validating with 2 or 3 annotators, the method should be able to efficiently generate quality WSD test sets.

Disambiguating Ambiguous Biomedical Terms in Biomedical Narrative Text: An Unsupervised Method

This paper presents a two-phase unsupervised method to build a WSD classifier for an ambiguous biomedical term W; classifiers trained on the derived sense-tagged corpora achieved an overall accuracy of about 97%, with greater than 90% accuracy for each individual ambiguous term.

Research Paper: Automatic Resolution of Ambiguous Terms Based on Machine Learning and Conceptual Relations in the UMLS

An automatic method that constructs sense-tagged corpora for ambiguous terms in the UMLS using MEDLINE abstracts and can be used to automatically acquire knowledge needed for resolving ambiguity when mapping free-text to UMLS concepts is proposed.

An Unsupervised Vector Approach to Biomedical Term Disambiguation: Integrating UMLS and Medline

An unsupervised vector approach to disambiguate words in biomedical text that can be applied to all-word disambiguation is introduced.

Resolving abbreviations to their senses in Medline

An automatic analysis of Medline abstracts and linguistic methods are combined to build a dictionary of abbreviation/sense pairs and the system disambiguates abbreviations with a precision of 98.9% for a recall of 99.2% (98.5% accuracy).

Research Paper: A Multi-aspect Comparison Study of Supervised Word Sense Disambiguation

It was found that different aspects of supervised WSD depend on each other and the experiment method presented in the study can be used to select the best supervised WSD classifier for each ambiguous term.

Effects of information and machine learning algorithms on word sense disambiguation with small datasets