Mining for associations between categorical data items in a clinical data repository.

Abstract

We present preliminary work on using simple two-way categorical tests to discover associations between categorical items in a clinical data repository. Initial results using the chi-square test yielded some diagnosis code associations that seemed plausible and several that did not. This may be due in part to the effect of sample size: with very large samples, the chi-square test can flag even weak associations as statistically significant. Tests more resistant to the effects of sample size may yield a higher fraction of plausible diagnosis code associations.
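
To illustrate the sample-size effect described in the abstract, here is a minimal Python sketch of a Pearson chi-square test on a 2x2 table of diagnosis code co-occurrence counts. It is not the authors' implementation: the function names, the example counts, and the use of the phi coefficient as a size-independent measure of association are all assumptions made for illustration. The abstract does not say which alternative tests the authors had in mind.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the
    2x2 table [[a, b], [c, d]].

    Hypothetical layout: a = patients with both codes, b = first code only,
    c = second code only, d = neither code.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    With 1 degree of freedom the statistic is a squared standard normal,
    so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


def phi_coefficient(chi2, n):
    """Phi coefficient: an effect size for 2x2 tables that does not grow
    with the number of observations (an assumed choice, not from the paper)."""
    return math.sqrt(chi2 / n)


# Made-up counts: the same weak association at two sample sizes.
small = (55, 45, 45, 55)          # 200 patients
large = (5500, 4500, 4500, 5500)  # 20,000 patients, identical proportions

for label, table in (("small", small), ("large", large)):
    chi2 = chi_square_2x2(*table)
    n = sum(table)
    print(label, "chi2 =", chi2, "p =", p_value_1df(chi2),
          "phi =", phi_coefficient(chi2, n))
```

In this made-up example both tables have the same proportions, so the phi coefficient is 0.1 in each case. The chi-square statistic, however, rises from 2.0 (p of about 0.16) to 200 (p far below 0.001) purely because the sample is 100 times larger, which is one way an implausible association can pass a significance threshold in a large repository.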

Cite this paper

@article{Dubey2007MiningFA, title={Mining for associations between categorical data items in a clinical data repository.}, author={Anil K. Dubey and Christopher Herrick and Shawn N. Murphy}, journal={AMIA ... Annual Symposium proceedings. AMIA Symposium}, year={2007}, pages={945} }