Research Paper: Methods for Building Sense Inventories of Abbreviations in Clinical Notes

Abstract

OBJECTIVE To develop methods for building corpus-specific sense inventories of abbreviations occurring in clinical documents.

DESIGN A corpus of internal medicine admission notes was collected, and the instances of each clinical abbreviation in the corpus were grouped into sense clusters. One instance from each cluster was manually annotated to generate the final list of senses. Two clustering-based methods, Expectation Maximization (EM) and Farthest First (FF), and one random sampling method for sense detection were evaluated using a set of 12 clinical abbreviations.

MEASUREMENTS The clustering-based sense detection methods were evaluated using a set of clinical abbreviations that had been manually sense annotated. "Sense Completeness" and "Annotation Cost" were used to measure the performance of the different methods. Clustering error rates were also reported for the different clustering algorithms.

RESULTS A clustering-based semi-automated method was developed to build corpus-specific sense inventories for abbreviations in hospital admission notes. Evaluation demonstrated that this method could substantially reduce manual annotation cost and increase the completeness of sense inventories when compared with a manual annotation method using random samples.

CONCLUSION The authors developed an effective clustering-based method for building corpus-specific sense inventories for abbreviations in a clinical corpus. To the best of the authors' knowledge, this is the first time clustering technologies have been used to help build sense inventories of abbreviations in clinical text. The results demonstrated that the clustering-based method performed better than the manual annotation method using random samples for the task of building sense inventories of clinical abbreviations.
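The workflow summarized in the abstract (cluster the instances of an abbreviation, annotate one instance per cluster, then score the result) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each instance has already been turned into a numeric feature vector, uses a plain farthest-first traversal in place of whichever FF implementation the study used, and takes "Sense Completeness" to be the fraction of gold-standard senses found and "Annotation Cost" to be the number of instances annotated. The function names, toy vectors, and sense labels are all hypothetical.

```python
from math import dist


def farthest_first(points, k):
    """Return the indices of up to k centers chosen by farthest-first traversal."""
    # Assumption: start from the first instance so the result is deterministic.
    centers = [0]
    # Distance from every point to its nearest chosen center so far.
    nearest = [dist(p, points[0]) for p in points]
    while len(centers) < min(k, len(points)):
        # The next center is the point farthest from all current centers.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        nearest = [min(d, dist(p, points[nxt])) for d, p in zip(nearest, points)]
    return centers


def sense_completeness(annotated, gold):
    """Fraction of distinct gold senses covered by the annotated instances."""
    found = {gold[i] for i in annotated}
    return len(found) / len(set(gold))


# Hypothetical toy data: six instances of one abbreviation with three senses.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
gold = ["sense_a", "sense_a", "sense_a", "sense_b", "sense_b", "sense_c"]

# One instance per cluster (its center) is sent for manual annotation.
centers = farthest_first(points, 3)
completeness = sense_completeness(centers, gold)
annotation_cost = len(centers)

print(centers, completeness, annotation_cost)
```

In this toy case the three chosen instances come from three different senses, so all senses are recovered after annotating three of the six instances. Annotating three instances drawn at random could easily miss the rare sense, which is the kind of contrast the evaluation in the abstract measures.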

DOI: 10.1197/jamia.M2927

Cite this paper

@article{Xu2008ResearchPM,
  title={Research Paper: Methods for Building Sense Inventories of Abbreviations in Clinical Notes},
  author={Hua Xu and Peter D. Stetson and Carol Friedman},
  journal={Journal of the American Medical Informatics Association : JAMIA},
  year={2008},
  volume={16},
  number={1},
  pages={103--108}
}