RE: Leveraging Biospecimen Resources for Discovery or Validation of Markers for Early Cancer Detection.
- Stuart G Baker
- Journal of the National Cancer Institute
BACKGROUND High-throughput laboratory technologies coupled with sophisticated bioinformatics algorithms have tremendous potential for discovering novel biomarkers, or profiles of biomarkers, that could serve as predictors of disease risk, response to treatment, or prognosis. We discuss methodological issues in wedding high-throughput approaches for biomarker discovery with the case-control study designs typically used in biomarker discovery studies, focusing especially on nested case-control designs. METHODS We review principles of nested case-control study design in relation to biomarker discovery studies and describe how the efficiency of biomarker discovery can be affected by study design choices. We develop a simulated prostate cancer cohort data set and a series of biomarker discovery case-control studies nested within the cohort to illustrate how study design choices can influence the biomarker discovery process. RESULTS Common elements of nested case-control design, incidence density sampling and matching of controls to cases, are not typically factored correctly into biomarker discovery analyses, inducing bias in the discovery process. We illustrate how incidence density sampling and matching of controls to cases reduce the apparent specificity of truly valid biomarkers 'discovered' in a nested case-control study. We also propose and demonstrate a new case-control matching protocol, which we call 'antimatching', that improves the efficiency of biomarker discovery studies. CONCLUSIONS For a valid but as yet undiscovered biomarker(s), disjunctions between correctly designed epidemiologic studies and the practice of biomarker discovery reduce the likelihood that the true biomarker(s) will be discovered and increase the false-positive discovery rate.
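As an illustration of the incidence density sampling mentioned in the abstract, the following is a minimal Python sketch and not the authors' simulation. It assumes a toy cohort with exponentially distributed times to diagnosis and administrative censoring at 10 years. The function names (`simulate_cohort`, `incidence_density_sample`), the hazard rate, and the follow-up length are all hypothetical choices made for this example. For each case, controls are drawn from the risk set, that is, subjects still under follow-up and disease-free at the case's diagnosis time. A subject who later becomes a case can therefore serve as a control for an earlier case.

```python
import random


def simulate_cohort(n, rate=0.05, max_follow_up=10.0, seed=0):
    """Simulate a toy cohort (hypothetical parameters, for illustration only).

    Returns a list of (subject_id, exit_time, is_case) tuples, where
    exit_time is the diagnosis time for cases and the censoring time otherwise.
    """
    rng = random.Random(seed)
    cohort = []
    for subject_id in range(n):
        event_time = rng.expovariate(rate)  # latent time to diagnosis
        is_case = event_time <= max_follow_up
        exit_time = min(event_time, max_follow_up)
        cohort.append((subject_id, exit_time, is_case))
    return cohort


def incidence_density_sample(cohort, controls_per_case=1, seed=0):
    """Draw controls for each case from that case's risk set.

    The risk set at a case's diagnosis time contains every other subject
    whose exit time is later, so future cases remain eligible as controls.
    Returns a list of (case_id, case_time, [control_ids]) tuples.
    """
    rng = random.Random(seed)
    matched_sets = []
    for case_id, case_time, is_case in cohort:
        if not is_case:
            continue
        risk_set = [
            sid for sid, exit_time, _ in cohort
            if sid != case_id and exit_time > case_time
        ]
        k = min(controls_per_case, len(risk_set))
        controls = rng.sample(risk_set, k)
        matched_sets.append((case_id, case_time, controls))
    return matched_sets


# Example usage: 200 subjects, two controls per case.
cohort = simulate_cohort(200, seed=1)
matched_sets = incidence_density_sample(cohort, controls_per_case=2, seed=1)
```

Because each control is sampled only among subjects at risk when the case was diagnosed, an analysis of such data should account for the matched, time-dependent sampling (for example with conditional methods) rather than treat cases and controls as two simple random samples.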