Sampling, Amplification, and Resampling


Many biological experiments that measure the concentration levels of gene transcripts or protein molecules apply the Polymerase Chain Reaction (PCR) procedure to the gene or protein samples. To better model the results of these experiments, we propose a new sampling scheme for generating discrete data, called sampling, amplification, and resampling (SAR), and derive the asymptotic distribution of the SAR sample. We suggest new statistics for the test of association based on the new model and give their asymptotic distributions. We also compare the new model with the traditional multinomial model and show that the new model predicts a significantly larger variance for the SAR sample. This implies that, when applied to a SAR sample, tests based on the traditional model will have a type I error rate much higher than the nominal level.
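The variance comparison can be illustrated with a small simulation. The sketch below is not the model derived in the paper. It assumes two categories, an amplification step in which each molecule is duplicated independently with a fixed probability in every cycle, and a resampling step that draws with replacement from the amplified pool. The function names and all parameter values (p, n, cycles, efficiency, m) are hypothetical choices made only for illustration.

```python
import random
import statistics


def binomial(rng, n, p):
    """Draw a Binomial(n, p) count by summing n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def sar_proportion(rng, p, n, cycles, efficiency, m):
    """One SAR draw: sample, amplify, resample; return the category-1 proportion."""
    # Sampling: n molecules, each of category 1 with probability p.
    a = binomial(rng, n, p)
    b = n - a
    # Amplification (assumed model): in each cycle every molecule is
    # duplicated independently with probability `efficiency`.
    for _ in range(cycles):
        a += binomial(rng, a, efficiency)
        b += binomial(rng, b, efficiency)
    # Resampling: m draws with replacement from the amplified pool.
    return binomial(rng, m, a / (a + b)) / m


def compare_variances(trials=1000, seed=0):
    """Return (SAR variance, plain binomial variance) of the sample proportion."""
    rng = random.Random(seed)
    p, n, cycles, efficiency, m = 0.3, 50, 5, 0.8, 50  # illustrative values
    sar = [sar_proportion(rng, p, n, cycles, efficiency, m) for _ in range(trials)]
    plain = [binomial(rng, m, p) / m for _ in range(trials)]
    return statistics.variance(sar), statistics.variance(plain)


if __name__ == "__main__":
    var_sar, var_plain = compare_variances()
    print(f"SAR variance:      {var_sar:.5f}")
    print(f"Binomial variance: {var_plain:.5f}")
```

Under these assumptions the SAR proportion carries the variability of the initial sample, the amplification, and the resampling, so its empirical variance comes out larger than that of a direct binomial sample of the same final size.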

