Re-evaluating data quality of dog mitochondrial, Y chromosomal, and autosomal SNPs genotyped by SNP array
The advent of high-throughput and cost-effective genotyping platforms made genome-wide association (GWA) studies a reality. While the primary focus has been invested upon the improvement of reducing genotyping error, the problems associated with missing calls are largely overlooked. To probe into the effect of missing calls on GWAs, we demonstrated experimentally the prevalence and severity of the problem of missing call bias (MCB) in four genotyping technologies (Affymetrix 500 K SNP array, SNPstream, TaqMan, and Illumina Beadlab). Subsequently, we showed theoretically that MCB leads to biased conclusions in the subsequent analyses, including estimation of allele/genotype frequencies, the measurement of HWE and association tests under various modes of inheritance relationships. We showed that MCB usually leads to power loss in association tests, and such power change is greater than what could be achieved by equivalent reduction of sample size unbiasedly. We also compared the bias in allele frequency estimation and in association tests introduced by MCB with those by genotyping errors. Our results illustrated that in most cases, the bias can be greatly reduced by increasing the call-rate at the cost of genotyping error rate. The commonly used 'no-call' procedure for the observations of borderline quality should be modified. If the objective is to minimize the bias, the cut-off for call-rate and that for genotyping error rate should be properly coupled in GWA. We suggested that the ongoing QC cut-off for call-rate should be increased, while the cut-off for genotyping error rate can be reduced properly.