On the use of DNA pooling to estimate haplotype frequencies
@article{Wang2003OnTU,
title={On the use of DNA pooling to estimate haplotype frequencies},
author={Shuang Wang and Kenneth K. Kidd and Hongyu Zhao},
journal={Genetic Epidemiology},
year={2003},
volume={24}
}
Genome‐wide association studies may be necessary to identify genes underlying certain complex diseases. Because such studies can be extremely expensive, DNA pooling has been introduced, as it may greatly reduce the genotyping burden. Parallel to DNA pooling developments, the importance of haplotypes in genetic studies has been amply demonstrated in the literature. However, DNA pooling of a large number of samples may lose haplotype information among tightly linked genetic markers. Here, we…
51 Citations
Fast and accurate haplotype frequency estimation for large haplotype vectors from pooled DNA data
- Biology, Computer Science
- BMC Genetics
- 2012
This work presents an algorithm for haplotype frequency estimation from pooled data using a tree-based deterministic sampling technique that demonstrates superior performance in datasets with a large number of markers and could be the method of choice for haplotype frequency estimation in such datasets.
Estimating haplotype‐disease associations with pooled genotype data
- Biology
- Genetic epidemiology
- 2005
This work develops simple and efficient numerical algorithms for calculating the maximum likelihood estimators and their variances, and implements these algorithms in a freely available computer program, and shows that DNA pooling is highly efficient in studying haplotype‐disease associations.
Efficiency of single-nucleotide polymorphism haplotype estimation from pooled DNA
- Biology
- Proceedings of the National Academy of Sciences of the United States of America
- 2003
A method for obtaining maximum likelihood estimates of haplotype frequencies for different pool sizes is developed, the accuracy of these estimates is assessed, and it is shown that pooling DNA samples is efficient in estimating haplotype frequencies.
Testing linkage disequilibrium from pooled DNA: a contingency table perspective.
- Biology
- Statistics in medicine
- 2008
Wald-type tests for the linkage disequilibrium (LD) coefficient using pooled data are presented, and pooling is found to be inefficient in testing weak LD despite its efficiency in estimating haplotype frequencies.
PDA: Pooled DNA analyzer
- Biology
- BMC Bioinformatics
- 2005
This new multipoint testing procedure overcomes a computational bottleneck of conventional haplotype-oriented multipoint methods in DNA pooling analyses and can handle data sets having a large pool size and/or large numbers of polymorphic markers.
Estimating population haplotype frequencies from pooled SNP data using incomplete database information
- Computer Science, Biology
- Bioinform.
- 2009
A Bayesian model for estimating the haplotypes and their frequencies from pooled allelic observations is introduced that combines an idea of using database information for haplotype estimation with a computationally efficient multinormal approximation.
Maximum-parsimony haplotype frequencies inference based on a joint constrained sparse representation of pooled DNA
- Biology, Computer Science
- BMC Bioinformatics
- 2013
A method for maximum-parsimony haplotype frequency estimation from pooled DNA data based on the sparse representation of the DNA pools in a dictionary of haplotypes is developed and outperforms state-of-the-art methods such as HIPPO and HAPLOPOOL in datasets that contain pools with a small number of individuals.
Estimating Haplotype Frequencies by Combining Data from Large DNA Pools with Database Information
- Computer Science
- IEEE/ACM Transactions on Computational Biology and Bioinformatics
- 2011
A Bayesian haplotyping method for pooled DNA based on a continuous approximation of the multinomial distribution, similar to that of an EM-algorithm, which uses a multinormal approximation for the pooled allele frequencies, but which does not utilize prior information about the haplotypes.
Estimating population haplotype frequencies from pooled DNA samples using PHASE algorithm.
- Computer Science
- Genetics research
- 2008
A modified version of PHASE is presented for estimating population haplotype frequencies from pooled DNA data, and it is suggested that the PHASE algorithm is also a method of choice for pooled DNA data.
Using DNA pools for genotyping trios
- Biology
- Nucleic acids research
- 2006
Using this approach, future trio-based association studies may be able to increase the sample size by 50% for the same cost and thereby increase the power to detect associations.
References
Efficiency of DNA pooling to estimate joint allele frequencies and measure linkage disequilibrium
- Biology
- Genetic epidemiology
- 2002
Methods of analyzing pooled DNA samples to estimate the joint prevalence of variants at two or more loci are extended and the expected savings in numbers of assays required when pooling is utilized compared to individual testing are quantified.
Allele frequency distributions in pooled DNA samples: applications to mapping complex disease genes.
- Biology
- Genome research
- 1998
The studies show that accurate, quantitative data on allele frequencies, suitable for identifying markers for complex disorders, can be identified from pooled DNA samples, and this approach promises to drastically reduce the labor and cost of genotyping in the initial identification of disease loci.
Selective DNA pooling for determination of linkage between a molecular marker and a quantitative trait locus.
- Biology
- Genetics
- 1994
DNA pooling takes this one step further by pooling DNA from the selected individuals at each of the two phenotypic extremes, and basing the test for linkage on marker allele frequencies as estimated from the pooled samples only, which can reduce genotyping costs of marker-QTL linkage determination by up to two orders of magnitude.
The accuracy of statistical methods for estimation of haplotype frequencies: an example from the CD4 locus.
- Biology
- American journal of human genetics
- 2000
It is shown that the estimated frequencies of common haplotypes do not differ significantly with the use of phase-known versus phase-unknown data; however, rare haplotypes are occasionally miscalled when their presence/absence must be inferred, so frequency estimates based on the phase-unknown marker-typing results from unrelated individuals will be sufficient.
Accuracy of haplotype frequency estimation for biallelic loci, via the expectation-maximization algorithm for unphased diploid genotype data.
- Biology
- American journal of human genetics
- 2000
This study considers and explores sources of error between EM-derived haplotype frequency estimates and their population parameters, pointing out the relative impacts of sampling error and estimation error, and calling attention to the pronounced accuracy of EM estimates once sampling error has been accounted for.
Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population.
- Biology
- Molecular biology and evolution
- 1995
An expectation-maximization (EM) algorithm leading to maximum-likelihood estimates of molecular haplotype frequencies under the assumption of Hardy-Weinberg proportions is implemented and appears to be useful for the analysis of nuclear DNA sequences or highly variable loci.
Association mapping of disease loci, by use of a pooled DNA genomic screen.
- Biology
- American journal of human genetics
- 1997
Use of pooled DNA amplifications is reported for the accurate determination of marker-disease associations for both case-control and nuclear family-based data, including application of correction methods for stutter artifact and preferential amplification.
Prospects for whole-genome linkage disequilibrium mapping of common disease genes
- Biology
- Nature Genetics
- 1999
Recently, attention has focused on the use of whole-genome linkage disequilibrium (LD) studies to map common disease genes. Such studies would employ a dense map of single nucleotide polymorphisms…
Inference of haplotypes from PCR-amplified samples of diploid populations.
- Biology
- Molecular biology and evolution
- 1990
Details of the algorithm for extracting allelic sequences from population samples, along with some population-genetic considerations that influence the likelihood for success of the method, are presented here.
Use of pooled DNA samples to detect linkage disequilibrium of polymorphic restriction fragments and human disease: studies of the HLA class II loci.
- Biology
- Proceedings of the National Academy of Sciences of the United States of America
- 1985
Several specific polymorphic restriction fragments associated with IDDM were revealed by using this economical and rapid approach to search for informative RFLPs, and may be informative markers for IDDM susceptibility.