Corpus ID: 2990304

QuickUMLS: a Fast, Unsupervised Approach for Medical Concept Extraction

@inproceedings{Soldaini2016QuickUMLSAF,
  title={QuickUMLS: a Fast, Unsupervised Approach for Medical Concept Extraction},
  author={Luca Soldaini},
  year={2016}
}
Entity extraction is a fundamental step in many health informatics systems. In recent years, tools such as MetaMap and cTAKES have been widely used for medical concept extraction on medical literature and clinical notes; however, relatively little interest has been placed on their scalability to large datasets. In this work, we present QuickUMLS: a fast, unsupervised, approximate dictionary matching algorithm for medical concept extraction. The proposed method achieves similar precision and…
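The abstract describes the method only as approximate dictionary matching. As a rough illustration of that general idea, the sketch below scores token windows against dictionary terms using Jaccard similarity over character trigrams. The similarity measure, the threshold, the window size, the function names, and the placeholder concept IDs are all assumptions made for this example; nothing here is taken from the paper or from any QuickUMLS API.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of `text`, padded with spaces."""
    pad = " " * (n - 1)
    padded = f"{pad}{text.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def match_concepts(text, dictionary, threshold=0.7, max_window=3):
    """Find token windows in `text` that approximately match a dictionary term.

    `dictionary` maps a term string to a concept identifier (hypothetical
    placeholder IDs in this example). Returns (span, concept_id, score) tuples.
    """
    tokens = text.split()
    # Precompute n-gram sets for every dictionary term.
    term_grams = {term: char_ngrams(term) for term in dictionary}
    matches = []
    for start in range(len(tokens)):
        for size in range(1, max_window + 1):
            if start + size > len(tokens):
                break
            span = " ".join(tokens[start:start + size])
            span_grams = char_ngrams(span)
            for term, grams in term_grams.items():
                score = jaccard(span_grams, grams)
                if score >= threshold:
                    matches.append((span, dictionary[term], round(score, 2)))
    return matches


if __name__ == "__main__":
    # Placeholder concept IDs, not real UMLS identifiers.
    toy_dictionary = {"heart attack": "ID-001", "aspirin": "ID-002"}
    note = "patient had a heart atack and took aspirin"
    for span, concept_id, score in match_concepts(note, toy_dictionary):
        print(span, concept_id, score)
```

This brute-force version compares every window against every term, so it would not scale to a large vocabulary; a practical system would need an index over the n-grams to retrieve candidate terms quickly.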

Clinical named-entity recognition: A short comparison
TLDR
The preliminary results demonstrate that BioPortal performs well when extracting disorders and drugs, can provide clinical researchers with real clinical insights into patients' health patterns, and may allow the creation of a first version of an annotated dataset.
Clinical Concept Extraction with Lexical Semantics to Support Automatic Annotation
  • Asim Abbas, M. Afzal, +4 authors Seokhee Jeon
  • Medicine
  • International journal of environmental research and public health
  • 2021
TLDR
The proposed methodology significantly improves the performance of concept extraction from unstructured clinical narratives by exploiting linguistic and lexical semantic features, and can ease the automatic annotation of clinical data, which ultimately improves the performance of supervised data-driven applications trained with these data.
Clinical Phrase Mining with Language Models
TLDR
Experimental results on the MIMIC-III dataset show that the proposed CliniPhrase method can outperform the current state-of-the-art techniques by up to 18% in terms of F1 measure while being very efficient (up to 48 times faster).
MedLinker: Medical Entity Linking with Neural Representations and Dictionary Matching
TLDR
This paper explores how NLMs can be used for Medical Entity Linking with the recently introduced MedMentions dataset, and introduces a solution that performs competitively on semantic type linking, while improving the state-of-the-art on the more fine-grained task of concept linking.
Evaluation of Medical Concept Annotation Systems on Clinical Records
TLDR
This paper analyses and evaluates four annotation systems for the task of extracting medical concepts from clinical free-text documents, and finds that the concept recognition component of each system is highly sensitive to the quality of the text spans output by the system's concept extraction component.
Ontology-based enriched concept graphs for medical document classification
TLDR
Based on evaluation metrics of precision, recall and F1-scores, the proposed method achieves a significantly higher classification performance than other widely used similarity measures for similarity-based text classification.
A CNL-based Method for Detecting Disease Negation
Negation detection is a key feature in the processing of biomedical text, and it involves two steps: identifying a medical term of interest in text and identifying that that medical term is mentioned…
Mining Concepts for a COVID Interface Terminology for Annotation of EHRs
  • V. Keloth, Shuxin Zhou, +4 authors Y. Perl
  • Computer Science
  • 2020 IEEE International Conference on Big Data (Big Data)
  • 2020
TLDR
This work proposes to design a COVID interface terminology for the annotation of EHR notes of COVID-19 patients, and uses the techniques of concatenation and anchoring iteratively to extract high granularity phrases from the clinical text.
MedType: Improving Medical Entity Linking with Semantic Type Prediction
TLDR
This paper presents MedType, a fully modular system that prunes out irrelevant candidate concepts based on the predicted semantic type of an entity mention, and incorporates it into five off-the-shelf toolkits for medical entity linking and demonstrates that it consistently improves entity linking performance across several benchmark datasets.
A Survey of Named-Entity Recognition Methods for Food Information Extraction
TLDR
This paper gives an overview and a comparison of named-entity recognition methods in the food domain, which can be used for automated extraction of food information from text, and shows that FoodIE obtains very promising results compared to the other food named-entity recognition methods taken into consideration.