A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text

@article{Schwartz2003ASA,
  title={A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text},
  author={Ariel S. Schwartz and Marti A. Hearst},
  journal={Pacific Symposium on Biocomputing},
  year={2003},
  pages={451-62}
}
The volume of biomedical text is growing at a fast rate, creating challenges for humans and computer systems alike. One of these challenges arises from the frequent use of novel abbreviations in these texts, thus requiring that biomedical lexical ontologies be continually updated. In this paper we show that the problem of identifying abbreviations' definitions can be solved with a much simpler algorithm than that proposed by other research efforts. The algorithm achieves 96% precision and 82…
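The abstract above does not spell out the algorithm, so the following is only a minimal Python sketch of one simple way to pair a parenthesized short form with the text preceding it: scan the short form and the candidate text from right to left, matching characters, and require the first short-form character to start a word. The function names `find_best_long_form` and `extract_pairs`, the regular expression, and the length thresholds are illustrative assumptions, not details taken from this page, and the sketch should not be read as the authors' implementation.

```python
import re


def find_best_long_form(short_form, candidate):
    """Return the shortest suffix of `candidate` that matches `short_form`, or None.

    Assumed matching rule (illustrative): walk both strings right to left; each
    alphanumeric short-form character must appear, in order, in the candidate,
    and the first short-form character must begin a word.
    """
    s_idx = len(short_form) - 1
    l_idx = len(candidate) - 1
    while s_idx >= 0:
        ch = short_form[s_idx].lower()
        if not ch.isalnum():
            s_idx -= 1  # ignore punctuation inside the short form
            continue
        # Move left until the character matches; for the first short-form
        # character, also insist that the match sits at the start of a word.
        while (l_idx >= 0 and candidate[l_idx].lower() != ch) or (
            s_idx == 0 and l_idx > 0 and candidate[l_idx - 1].isalnum()
        ):
            l_idx -= 1
        if l_idx < 0:
            return None  # ran out of candidate text without a match
        l_idx -= 1
        s_idx -= 1
    # Extend back to the start of the word containing the last match.
    start = candidate.rfind(" ", 0, l_idx + 1) + 1
    return candidate[start:]


def extract_pairs(sentence):
    """Find (short form, long form) pairs of the shape 'long form (SF)'.

    The filters below (2 to 10 characters, at most two words, at least one
    letter, and a window of min(n + 5, 2n) preceding words) are assumptions
    chosen for this sketch.
    """
    pairs = []
    for match in re.finditer(r"\(([^()]+)\)", sentence):
        short_form = match.group(1).strip()
        if not (2 <= len(short_form) <= 10):
            continue
        if len(short_form.split()) > 2:
            continue
        if not any(c.isalpha() for c in short_form):
            continue
        tokens = sentence[: match.start()].split()
        max_words = min(len(short_form) + 5, len(short_form) * 2)
        candidate = " ".join(tokens[-max_words:])
        long_form = find_best_long_form(short_form, candidate)
        if long_form:
            pairs.append((short_form, long_form))
    return pairs
```

For example, calling `extract_pairs("We use a hidden Markov model (HMM) for tagging.")` pairs `HMM` with `hidden Markov model`. Matching right to left with no scoring step keeps the sketch short, at the cost of missing irregular abbreviations whose letters do not appear in order in the definition.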
A New Alignment Algorithm to Identify Definitions Corresponding to Abbreviations in Biomedical Text
TLDR
This work proposes an algorithm analogous to pairwise sequence alignment, in which a penalty score is given if there are two unmatched characters, one from the abbreviation and one from the definition; in this way some irregular abbreviations are found.
Automatic extraction of abbreviation definitions based on general texts
  • Z. Zhou, G. Chen
  • Computer Science
  • 2013 10th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)
  • 2013
TLDR
An abbreviation definition identification algorithm is proposed, which employs a variety of rules and incorporates shallow parsing of the text to identify the most probable abbreviation definition from general texts.
Designing of an efficient algorithm for identifying Abbreviation definitions in biomedical Text
TLDR
A pattern-matching method for mining acronyms and their definitions from biomedical text by considering space-reduction heuristic constraints has been proposed and implemented, and it is faster and more efficient than the previous approaches.
Biomedical Abbreviation Recognition and Resolution by PROSA-MED
TLDR
Five systems are proposed that deal with the problem of discovering and disambiguating acronyms and their expanded forms; one of the systems clearly outperforms the others, both in detecting entities and in identifying relations between short and long forms.
Automatic Extraction for Creating a Lexical Repository of Abbreviations in the Biomedical Literature
TLDR
A hybrid approach combining lexical analysis techniques and the Support Vector Machine to create an automatically generated and maintained lexicon of abbreviations that outperforms the leading abbreviation algorithms, ExtractAbbrev and ALICE.
A comparison study of biomedical short form definition detection algorithms
TLDR
It is found that most systems have some difficulty detecting definitions for chemical/gene/protein symbols, where ALICE performs relatively better than the other two, possibly due to fine-tuning of the system for those symbols.
Automatic construction of biomedical abbreviations dictionary from text
TLDR
A new method is proposed for the automatic construction of a biomedical abbreviation dictionary from text by combining a string-matching algorithm and a search algorithm, based on the idea that readers often look up related articles to judge whether the long form of an abbreviation is correct.
Abbreviation definition identification based on automatic precision estimates
TLDR
An algorithm for abbreviation identification that uses a variety of strategies to identify the most probable definition for an abbreviation and also produces an estimated accuracy of the result, which is purely automatic.
A Proposed System to Identify and Extract Abbreviation Definitions in Spanish Biomedical Texts for the Biomedical Abbreviation Recognition and Resolution (BARR) 2017
TLDR
A rule-based system using an adapted version of the algorithm for extraction of abbreviations and their definitions from biomedical text proposed by Schwartz & Hearst is developed.
Recognizing Acronyms and their Definitions in Swedish Medical Texts
This paper addresses the task of recognizing acronym-definition pairs in Swedish (medical) texts as well as the compilation of a freely available sample of such manually annotated pairs. A material…

References

SHOWING 1-10 OF 44 REFERENCES
Research Paper: Creating an Online Dictionary of Abbreviations from MEDLINE
TLDR
An algorithm to identify abbreviations from text using a statistical learning algorithm, logistic regression, to score abbreviation expansions based on their resemblance to a training set of human-annotated abbreviations is developed.
Automatic Extraction of Acronym-meaning Pairs from MEDLINE Databases
TLDR
A system called ACROMED, part of a set of Information Extraction tools designed for processing and extracting information from abstracts in the Medline database, is presented; its performance is found to be better for biomedical texts than that of other acronym extraction systems designed for unrestricted text.
Research Paper: Mapping Abbreviations to Full Forms in Biomedical Articles
TLDR
The authors found that an average of 25 percent of abbreviations were defined in biomedical articles and that of a randomly selected subset of undefined abbreviations, 68 percent could be mapped to any of four abbreviation databases.
Semi-Supervised Maximum Entropy Based Approach to Acronym and Abbreviation Normalization in Medical Texts
TLDR
A method of automatically generating training data for Maximum Entropy (ME) modeling of abbreviations and acronyms is demonstrated and it is shown that using ME modeling is a promising technique for abbreviation and acronym normalization.
Extraction and Disambiguation of Acronym Meaning-Pairs in Medline
TLDR
This work presents a system called Acromed which finds acronym-meaning pairs as part of a set of information extraction tools designed for processing and extracting data from abstracts in the Medline database, and presents Polyfind, an algorithm for disambiguating polynyms, which uses a vector space model.
PNAD-CSS: a workbench for constructing a protein name abbreviation dictionary
TLDR
A workbench for protein name abbreviation dictionary (PNAD) building and a hybrid system composed of the PROPER System and the PNAD System to extract a pair of protein names and abbreviations.
SaRAD: a Simple and Robust Abbreviation Dictionary
  • Eytan Adar
  • Computer Science, Medicine
  • Bioinform.
  • 2004
TLDR
The Simple and Robust Abbreviation Dictionary (SaRAD) provides an easy-to-implement, high-performance tool for the construction of a biomedical symbol dictionary and results in a high-quality dictionary and toolset to disambiguate abbreviation symbols automatically.
Hybrid Text Mining for Finding Abbreviations and their Definitions
TLDR
This method employs pattern-based abbreviation rules in addition to text markers and cue words to find abbreviations and their definitions in free format texts with the advantages of high accuracy, high flexibility, wide coverage, and fast recognition.
A multi-level text mining method to extract biological relationships
TLDR
This paper presents a novel approach to extract relationships between multiple biological objects that are present in a text document that is both adaptable and scalable to new problems as opposed to rule-based methods.
Constructing Biological Knowledge Bases by Extracting Information from Text Sources
TLDR
A research effort aimed at automatically mapping information from text sources into structured representations, such as knowledge bases, is begun; the approach is to use machine-learning methods to induce routines for extracting facts from text.