Scaling up WSD with Automatically Generated Examples
@inproceedings{Cheng2012ScalingUW, title={Scaling up WSD with Automatically Generated Examples}, author={Weiwei Cheng and Judita Preiss and Mark Stevenson}, booktitle={BioNLP@HLT-NAACL}, year={2012}, url={https://api.semanticscholar.org/CorpusID:1946135} }
This paper describes a large-scale WSD system based on automatically labeled examples generated using information from the UMLS Metathesaurus. The system is found to outperform a state-of-the-art unsupervised approach that also uses information from the Metathesaurus.
9 Citations
DALE: A Word Sense Disambiguation System for Biomedical Documents Trained using Automatically Labeled Examples
- 2013
Computer Science, Medicine
DALE (Disambiguation using Automatically Labeled Examples) is a supervised WSD system that can disambiguate a wide range of ambiguities found in biomedical documents and uses the UMLS Metathesaurus as both a sense inventory and a source of information for automatically generating labeled training examples.
The effect of word sense disambiguation accuracy on literature based discovery
- 2016
Computer Science, Medicine
This study reveals that LBD performance is sensitive to WSD accuracy, and concludes that WSD has the potential to improve the output of LBD systems by reducing the amount of spurious hidden knowledge that is generated.
Acronym Disambiguation in Clinical Notes from Electronic Health Records
- 2020
Medicine, Computer Science
This study introduces an unsupervised method for acronym disambiguation, the task of identifying the correct sense of acronyms in clinical EHR notes, and demonstrates that applying CASEml to medical notes improves the AUC of a phenotype algorithm for rheumatoid arthritis.
Evaluating knowledge-poor and knowledge-rich features in automatic classification: A case study in WSD
- 2012
Computer Science, Linguistics
This work evaluates the automatic disambiguation performance of five machine learning classifiers: Naive Bayes, Support Vector Machines, Decision Trees, KStar and Maximum Entropy.
Semantic Type Classification of Common Words in Biomedical Noun Phrases
- 2015
Computer Science, Medicine
The task of classifying common nouns into fine-grained semantic types is addressed: for instance, “condition” can be typed as “symptom and finding” or “configuration and setting”.
Knowledge based word-concept model estimation and refinement for biomedical text mining
- 2015
Computer Science, Medicine
The Unified Medical Language System at 30 Years and How It Is Used and Published: Systematic Review and Content Analysis
- 2021
Medicine, Computer Science
The results, although largely related to academia, demonstrate that UMLS achieves its intended uses successfully, in addition to achieving uses broadly beyond its original intentions.
Knowledge based word-concept model estimation and refinement for biomedical text mining
- 2015
Computer Science, Medicine
This research presents a probabilistic procedure to estimate the likelihood that a person’s immune system will decline with age and disease, and a histopathological assessment shows that the likelihood of an immune attack is low.
27 References
Unsupervised WSD based on Automatically Retrieved Examples: The Importance of Bias
- 2004
Computer Science
The “WordNet monosemous relatives” method is applied to automatically construct a web corpus, which is then used to train WSD algorithms that include supervised, minimally supervised, and fully unsupervised methods.
Disambiguation of ambiguous biomedical terms using examples generated from the UMLS Metathesaurus
- 2010
Computer Science, Medicine
Knowledge-based biomedical word sense disambiguation: comparison of approaches
- 2010
Computer Science, Medicine
Four approaches which rely on the UMLS Metathesaurus as the source of knowledge to perform word sense disambiguation (WSD) achieve better results, but the performance is still below statistical learning trained on manually produced data and below the maximum frequency sense baseline.
Self-training and co-training in biomedical word sense disambiguation
- 2011
Computer Science, Medicine
Preliminary results are presented for two semi-supervised learning algorithms on biomedical word sense disambiguation, both of which add relevant unlabeled examples to the training set; the optimal parameters are similar for each ambiguous word.
Graph-based Word Sense Disambiguation of biomedical documents
- 2010
Computer Science, Medicine
A graph-based approach to WSD in the biomedical domain, which makes use of knowledge from the Unified Medical Language System (UMLS) Metathesaurus represented as a graph, outperforms other methods that rely on the UMLS Metathesaurus alone.
Effects of information and machine learning algorithms on word sense disambiguation with small datasets
- 2005
Computer Science, Medicine
Research Paper: Automatic Resolution of Ambiguous Terms Based on Machine Learning and Conceptual Relations in the UMLS
- 2002
Computer Science
An automatic method is proposed that constructs sense-tagged corpora for ambiguous terms in the UMLS using MEDLINE abstracts and can be used to automatically acquire the knowledge needed for resolving ambiguity when mapping free text to UMLS concepts.
Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation
- 2010
Computer Science, Medicine
A method that can be used to automatically develop a WSD test collection using the Unified Medical Language System (UMLS) Metathesaurus and the manual MeSH indexing of MEDLINE is presented and allows the evaluation of WSD algorithms in the biomedical domain.
Using Corpus Statistics and WordNet Relations for Sense Identification
- 1998
Computer Science, Linguistics
A statistical classifier is described that combines topical context with local cues to identify a word sense and is used to disambiguate a noun, a verb, and an adjective.
Disambiguation of biomedical text using diverse sources of information
- 2008
Computer Science, Medicine
Disambiguation of biomedical terms benefits from the use of information from a variety of sources including linguistic features of the context in which the ambiguous term is used and domain-specific resources, such as UMLS.