Enhanced LexSynonym Acquisition for Effective UMLS Concept Mapping

Authors: Chris J. Lu, Destinee L. Tormey, Lynn McCreedy, Allen C. Browne
Journal: Studies in Health Technology and Informatics
Concept mapping is important in natural language processing (NLP) for bioinformatics. The UMLS Metathesaurus provides a rich synonym thesaurus and is a popular resource for concept mapping. Query expansion using synonyms for subterm substitutions is an effective technique to increase recall for UMLS concept mapping. Synonyms used to substitute subterms are called element synonyms. The completeness and quality of both element synonyms and the UMLS synonym thesaurus are the key to success in such…
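The subterm-substitution idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synonym table, the concept index, the identifier `C-EXAMPLE-1`, and the function names `expand_query` and `map_to_concepts` are hypothetical and are not taken from the Lexical Tools or the UMLS. It also simplifies subterms to single words, whereas real subterms may span several words.

```python
from itertools import product

# Hypothetical element-synonym table (illustrative entries only).
ELEMENT_SYNONYMS = {
    "kidney": ["renal"],
    "stone": ["calculus"],
}

# Hypothetical stand-in for a synonym thesaurus lookup: term -> concept ID.
CONCEPT_INDEX = {
    "renal calculus": "C-EXAMPLE-1",
}


def expand_query(term):
    """Return every variant of `term` made by swapping words for element synonyms."""
    words = term.lower().split()
    # Each word may stay as-is or be replaced by any of its element synonyms.
    options = [[w] + ELEMENT_SYNONYMS.get(w, []) for w in words]
    return [" ".join(combo) for combo in product(*options)]


def map_to_concepts(term):
    """Look up each expanded variant in the toy concept index."""
    return {v: CONCEPT_INDEX[v] for v in expand_query(term) if v in CONCEPT_INDEX}


if __name__ == "__main__":
    print(expand_query("kidney stone"))
    print(map_to_concepts("kidney stone"))
```

In this toy setup the original query "kidney stone" has no entry in the index, but one of its expanded variants does, which is how substitution can raise recall. The number of variants grows multiplicatively with the number of substitutable words, so a real pipeline would likely need to bound or filter the expansion.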
Enhancing LexSynonym Features in the Lexical Tools
The effectiveness of this technique relies on the completeness and quality of both the element synonyms and the UMLS synonym thesaurus in the query expansion pipeline of UMLS concept mapping.
The Unified Medical Language System SPECIALIST Lexicon and Lexical Tools: Development and applications
The objective is to provide generic, broad coverage and a robust lexical system for NLP applications, and a novel multiword approach and other planned developments are proposed.
Enhanced Features in the SPECIALIST Lexicon - Antonyms
The objective is to develop a systematic approach to generating antonyms in the SPECIALIST Lexicon (hereafter, the Lexicon), with the aim of providing the generic and comprehensive antonym features needed by the NLP community.
Classification Types: A New Feature in the SPECIALIST Lexicon
The performance of automated consumer question understanding could be improved if the Lexicon provides informal terms with their cross-referenced (CR) formal terms (synonyms) in the same lexical record.
The Unified Medical Language System at 30 Years and How It Is Used and Published: Systematic Review and Content Analysis
  • Xia Jing
  • JMIR Medical Informatics
  • 2021
Background: The Unified Medical Language System (UMLS) has been a critical tool in biomedical and health informatics, and the year 2021 marks its 30th anniversary. The UMLS brings together many…
Medical Concept Normalization in User-Generated Text
A novel machine learning approach to normalize Adverse Drug Effect mentions in user-generated text to a standard vocabulary from a medical ontology is proposed, demonstrating competitive performance among current state-of-the-art techniques and suggesting the potential feasibility of the model in the medical concept normalization domain.
Using UMLS for electronic health data standardization and database design
The multistep mapping process developed and implemented is necessary to normalize electronic health data from multiple domains and sources into a common data model to support secondary use of data.
Normalizing Adverse Events using Recurrent Neural Networks with Attention.
  • Kahyun Lee, Özlem Uzuner
  • Computer Science, Medicine
  • AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science
  • 2020
This work proposes a novel neural network for AE normalization utilizing bidirectional long short-term memory (biLSTM) with an attention mechanism that generalizes to diverse datasets and outperforms widely used rule-based normalizers on a diverse set of narratives.
Normalization of Long-tail Adverse Drug Reactions in Social Media
This paper exploits the implicit semantics of rare ADRs for which there are few training samples, in order to detect the most similar class for the given ADR.


Piecewise Synonyms for Enhanced UMLS Source Terminology Integration
A new methodology, based on the notion of piecewise synonyms, for enhancing the process of concept discovery in the UMLS is presented, showing a 34% improvement over simple string matching.
Discovering missed synonymy in a large concept-oriented Metathesaurus
This paper reviews general methods for finding missed synonymy and describes several specific novel approaches which have been found effective.
Synonym, Topic Model and Predicate-Based Query Expansion for Retrieving Clinical Documents
Amongst the three expansion methods, the topic model-based method performed the best in terms of recall and F-measure, and was developed and tested for the retrieval of clinical documents.
Performance evaluation of unified medical language system®'s synonyms expansion to query PubMed
This study highlights the need for specific search tools for each type of user and use case, and proposes a method: expanding users' queries using Unified Medical Language System (UMLS) synonyms, i.e., all the terms gathered under one unique Concept Unique Identifier.
Development of Sub-Term Mapping Tools (STMT)
The Sub-Term Mapping Tools (STMT), developed at the National Library of Medicine (NLM), are used to find 1) all sub-terms, all prefixes, and the longest prefix in a specified corpus; 2) all sub-term…
Generating a Distilled N-Gram Set - Effective Lexical Multiword Building in the SPECIALIST Lexicon
A new systematic approach to lexical multiword acquisition from MEDLINE through filters and matchers based on empirical models is described, and improvement in recall or precision can be anticipated in NLP projects using the MEDLINE distilled n-gram set, the SPECIALIST Lexicon, and its applications.
Failure Analysis of MetaMap Transfer (MMTx)
A failure analysis was conducted to categorize the types of terms not correctly mapped by MMTx, and distinguish between classes of failures that may be easily rectified, such as alternative retrieval strategies to extract exact matches, and ones that require additional research.
Generating SD-Rules in the SPECIALIST Lexical Tools - Optimization for Suffix Derivation Rule Set
A methodology to select an optimized SD-Rule set that meets the requirement of 95% system precision with best system performance from SD candidate rules is described and results in better precision and recall for NLP applications using Lexical Tools derivational related flow components.
Query expansion using UMLS Tools for health information retrieval
Results from a comparison evaluation study indicated that the Mean Average Precisions (MAPs) with term-level expansion are higher than those with concept-level expansion and the String index with Term expansion has the highest MAPs for both 30 queries and short queries.
Sophia: An Expedient UMLS Concept Extraction Annotator
Sophia, a rapid UMLS concept extraction annotator, was developed to fulfill a mandate and address extraction where high throughput is needed while preserving performance, and is noted to be several fold faster than cTAKES and the scaled-out MetaMap service.