This work characterizes whole-genome patterns of common human DNA variation by genotyping 1,586,383 single-nucleotide polymorphisms (SNPs) in 71 Americans of European, African, and Asian ancestry, and indicates that these SNPs capture most common genetic variation as a result of linkage disequilibrium.
The proposed data structure for representing all distances in a graph is distributed in the sense that it may be viewed as assigning labels to the vertices, such that a query involving vertices u and v may be answered using only the labels of u and v.
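The labeling idea can be illustrated with a minimal sketch on an unweighted tree. This is not the scheme of the summarized work: it assumes a tree rather than a general graph, and it uses the naive label "root-to-vertex path", which is far from space-optimal. The names `build_labels`, `distance`, and the example tree are hypothetical. The point is only that `distance` receives two labels and never touches the graph.

```python
from collections import deque


def build_labels(adj, root):
    """Label each vertex of a tree with its root-to-vertex path (a tuple)."""
    labels = {root: (root,)}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in labels:
                labels[w] = labels[u] + (w,)
                queue.append(w)
    return labels


def distance(label_u, label_v):
    """Return the tree distance using only the two labels."""
    # Count the shared prefix; its last vertex is the lowest common ancestor.
    common = 0
    for a, b in zip(label_u, label_v):
        if a != b:
            break
        common += 1
    # Edges from each vertex up to the lowest common ancestor.
    return (len(label_u) - common) + (len(label_v) - common)


# Hypothetical example tree with edges 0-1, 0-2, 1-3, 1-4.
adj = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}
labels = build_labels(adj, 0)
print(distance(labels[3], labels[4]))
print(distance(labels[3], labels[2]))
```

Once the labels are computed, each query compares two tuples and needs no shared global structure, which is the sense in which such a representation is distributed.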
It is shown that for every fixed ε > 0, the GROUP-STEINER-TREE problem admits no efficient log^(2-ε) k approximation, where k denotes the number of groups (or, alternatively, the input size), unless NP has quasi-polynomial Las Vegas algorithms.
It is reported that surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases, and it is shown that combining a surname with other types of metadata, such as age and state, can be used to triangulate the identity of the target.
A meta-analysis of genome-wide association studies of coronary artery disease (CAD) comprising 22,233 individuals with CAD and 64,762 controls of European descent, followed by genotyping of top association signals, found 13 loci newly associated with CAD at P < 5 × 10^-8 and confirmed the association of 10 of 12 previously reported CAD loci.
A comprehensive assessment of the methods applied to both trios and to unrelated individuals, with a focus on genomic-scale problems, concludes that all the methods considered will provide highly accurate estimates of haplotypes when applied to trio data sets.
Improved algorithms are obtained for finding small vertex covers in bounded degree graphs and hypergraphs, together with an approximation algorithm for the weighted independent set problem that matches a recent result of Halldorsson.
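For context, the baseline that such results improve on can be sketched in a few lines. The code below is the classical maximal-matching 2-approximation for unweighted vertex cover, not the improved bounded-degree algorithms of the summarized work; the function names and the example graph are hypothetical.

```python
def vertex_cover_2approx(edges):
    """Classical 2-approximation: take both endpoints of a maximal matching."""
    cover = set()
    for u, v in edges:
        # An edge with no endpoint in the cover joins the matching.
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover


def is_vertex_cover(edges, cover):
    """Check that every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in edges)


# Hypothetical example: a 4-cycle, whose optimal cover has 2 vertices.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
cover = vertex_cover_2approx(edges)
print(sorted(cover), is_vertex_cover(edges, cover))
```

Any cover must contain at least one endpoint of every matched edge, so the returned set is at most twice the optimum size.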
Methods for local ancestry inference are introduced that leverage the structure of linkage disequilibrium in the ancestral population and that incorporate the constraint of Mendelian segregation when inferring local ancestry in nuclear family trios (LAMP-HAP).
The method leverages a new insight into the underlying structure of haplotypes that shows that SNPs are organized in highly correlated 'blocks' and is extremely efficient compared with previous methods such as PHASE and HAPLOTYPER.
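What "highly correlated" means for a pair of SNPs can be made concrete with the standard r² measure of linkage disequilibrium, r² = D² / (p_i(1 - p_i) p_j(1 - p_j)) with D = p_ij - p_i p_j. The sketch below only computes that statistic from phased haplotypes; it is not the phasing method of the summarized work, and the function name and the toy haplotypes are hypothetical.

```python
def r_squared(haplotypes, i, j):
    """Return r^2 between biallelic SNPs i and j over phased 0/1 haplotypes."""
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n          # allele-1 frequency at SNP i
    p_j = sum(h[j] for h in haplotypes) / n          # allele-1 frequency at SNP j
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n  # frequency of the 1-1 haplotype
    d = p_ij - p_i * p_j                             # disequilibrium coefficient D
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Hypothetical toy data: SNPs 0 and 1 always co-occur, SNP 2 is independent of SNP 0.
haplotypes = [(0, 0, 1), (0, 0, 0), (1, 1, 1), (1, 1, 0)]
print(r_squared(haplotypes, 0, 1))
print(r_squared(haplotypes, 0, 2))
```

Runs of adjacent SNPs with pairwise r² near 1 are what a block-based view of haplotype structure would group together.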