The Use of Multiple Imputation for Data Subject to Limits of Detection.

Abstract

Missing data due to limits of detection and limits of quantification are a common obstacle in epidemiological and biomedical research. We are interested in methodologies that provide unbiased and efficient estimates of these missing data while using popular statistical software. We describe a multiple imputation (MI) procedure for cross-sectional and longitudinal data that examines the sources of variation in hormone levels throughout the menstrual cycle, conditional on specific biomarkers. We describe the rationale, procedure, advantages and disadvantages of the multiple imputation procedure. We also provide a comparison to commonly used missing data procedures (complete case analysis and single imputation). We illustrate our approach using the BioCycle data, where we are interested in the effects of Vitamin E and Beta-carotene on Progesterone levels. We also evaluate the longitudinal impact of changes in Vitamin E on Progesterone levels over time. Finally, we demonstrate the advantages of using MI over complete case analysis or naive single replacement in both cross-sectional and longitudinal analyses where measurements below the limit of quantification (LOQ) are unreported. We also illustrate that, if available, the inclusion of data below the limit of detection (LOD) that are potentially deemed unreliable improves simple estimation substantially.
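As a rough illustration of the general idea, and not necessarily the authors' procedure, the sketch below imputes values below a detection limit several times and pools the resulting estimates with Rubin's rules. Everything named here is an assumption made for the example: the function names `impute_below_lod` and `pool_rubin`, the toy log-scale measurements, the detection limit, and the normal model with fixed `mu` and `sigma`. A proper MI would also draw the model parameters from their posterior for each imputation, so this simplified version understates the between-imputation variability.

```python
import random
from statistics import NormalDist, mean, variance


def impute_below_lod(values, lod, mu, sigma, rng):
    """Fill None entries (below-LOD readings) with draws from a
    Normal(mu, sigma) truncated above at lod. Hypothetical helper."""
    dist = NormalDist(mu, sigma)
    p_lod = dist.cdf(lod)  # probability mass below the detection limit
    completed = []
    for v in values:
        if v is None:
            # Inverse-CDF sampling with u in (0, p_lod] keeps the draw at or below lod.
            u = p_lod * (1.0 - rng.random())
            completed.append(min(dist.inv_cdf(u), lod))
        else:
            completed.append(v)
    return completed


def pool_rubin(estimates, variances):
    """Combine m completed-data estimates with Rubin's rules.
    Returns the pooled estimate and its total variance."""
    m = len(estimates)
    q_bar = mean(estimates)             # pooled point estimate
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Toy log-scale measurements; None marks a reading below the assumed LOD.
observed = [1.2, None, 0.8, 1.9, None, 1.4, 2.3, 0.6, None, 1.1]
LOD = 0.5               # assumed detection limit (log scale)
MU, SIGMA = 1.0, 0.7    # assumed model parameters, fixed for simplicity

rng = random.Random(2014)
estimates, variances = [], []
for _ in range(20):  # m = 20 imputed data sets
    completed = impute_below_lod(observed, LOD, MU, SIGMA, rng)
    estimates.append(mean(completed))
    variances.append(variance(completed) / len(completed))  # variance of the mean

pooled_mean, total_var = pool_rubin(estimates, variances)
print(f"pooled mean = {pooled_mean:.3f}, standard error = {total_var ** 0.5:.3f}")
```

By contrast, a complete case analysis would drop the `None` entries altogether, and a single replacement would fill each of them with one fixed value, so neither approach carries the uncertainty about the unobserved values into the final standard error.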

Cite this paper

@article{Harel2014TheUO, title={The Use of Multiple Imputation for Data Subject to Limits of Detection.}, author={Ofer Harel and Neil Perkins and Enrique F. Schisterman}, journal={Sri Lankan journal of applied statistics}, year={2014}, volume={5 4}, pages={227-246} }