Combining text classification and Hidden Markov Modeling techniques for categorizing sentences in randomized clinical trial abstracts.

Abstract

Randomized clinical trial (RCT) papers provide reliable information about the efficacy of medical interventions. Current keyword-based search methods for retrieving medical evidence overload users with irrelevant information, as these methods often do not take into consideration the semantics encoded within abstracts and the search query. Personalized semantic search, intelligent clinical question answering, and medical evidence summarization aim to solve this information overload problem. Most of these approaches would benefit significantly if the information available in the abstracts were structured into meaningful categories (e.g., background, objective, method, result, and conclusion). While many journals use a structured abstract format, the majority of RCT abstracts remain unstructured. We have developed a novel automated approach to structure RCT abstracts by combining text classification and Hidden Markov Modeling (HMM) techniques. The results of our approach (precision: 0.98, recall: 0.99) significantly outperform previously reported work on automated categorization of sentences in RCT abstracts.
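The abstract does not say how the text classifier and the HMM are combined. One common way to pair the two, shown here only as an illustrative sketch and not as the authors' method, is to treat per-sentence classifier probabilities as HMM emission scores and to decode the most likely label sequence with the Viterbi algorithm. Everything below is an assumption made for illustration: the function name `viterbi`, the forward-only transition scheme, and all probability values are hypothetical and are not taken from the paper.

```python
import math

LABELS = ["background", "objective", "method", "result", "conclusion"]


def viterbi(emissions, transitions, start):
    """Return the most likely label sequence for a list of sentences.

    emissions:   one dict per sentence mapping label -> classifier probability
    transitions: dict mapping previous label -> {next label: probability}
    start:       dict mapping label -> probability of opening the abstract
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[i][lab] is the best log score of any path ending in lab at sentence i.
    best = [{lab: log(start[lab]) + log(emissions[0][lab]) for lab in LABELS}]
    back = []  # back[i][lab] is the best predecessor of lab at sentence i + 1
    for scores in emissions[1:]:
        row, ptr = {}, {}
        for lab in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + log(transitions[p][lab]))
            row[lab] = best[-1][prev] + log(transitions[prev][lab]) + log(scores[lab])
            ptr[lab] = prev
        best.append(row)
        back.append(ptr)

    # Trace the best path backwards from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Hypothetical forward-only transitions: stay in a section with probability 0.5,
# otherwise move to any later section with equal probability; never move back.
n = len(LABELS)
transitions = {}
for i, prev in enumerate(LABELS):
    transitions[prev] = {lab: 0.0 for lab in LABELS}
    if i == n - 1:
        transitions[prev][prev] = 1.0
    else:
        transitions[prev][prev] = 0.5
        for later in LABELS[i + 1:]:
            transitions[prev][later] = 0.5 / (n - 1 - i)

start = {lab: 1.0 / n for lab in LABELS}

# Made-up classifier outputs for five sentences; the third is misread as background.
emissions = [
    {"background": 0.80, "objective": 0.05, "method": 0.05, "result": 0.05, "conclusion": 0.05},
    {"background": 0.10, "objective": 0.10, "method": 0.70, "result": 0.05, "conclusion": 0.05},
    {"background": 0.40, "objective": 0.05, "method": 0.35, "result": 0.15, "conclusion": 0.05},
    {"background": 0.05, "objective": 0.05, "method": 0.10, "result": 0.75, "conclusion": 0.05},
    {"background": 0.05, "objective": 0.05, "method": 0.05, "result": 0.05, "conclusion": 0.80},
]

classifier_only = [max(scores, key=scores.get) for scores in emissions]
decoded = viterbi(emissions, transitions, start)
```

In this toy input the classifier alone labels the third sentence as background, which would place background after method. Because the assumed transitions forbid moving backwards, the decoded sequence assigns that sentence to method instead. This shows how sequence information can correct isolated classification errors.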

Cite this paper

@article{Xu2006CombiningTC, title={Combining text classification and Hidden Markov Modeling techniques for categorizing sentences in randomized clinical trial abstracts.}, author={Rong Xu and Kaustubh Supekar and Yang Huang and Amar K. Das and Alan M. Garber}, journal={AMIA ... Annual Symposium proceedings. AMIA Symposium}, year={2006}, pages={824-8} }