The variant call format and VCFtools
VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, and comparison, and also provides a general Perl API.
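As a rough illustration of the kind of record such utilities process, the sketch below splits one tab-separated VCF data line into its eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). It is not part of VCFtools or its Perl API; the function names `parse_info` and `parse_vcf_line` are hypothetical, and the sketch ignores header lines and per-sample genotype columns.

```python
def parse_info(info):
    """Parse the INFO column: 'key=value' pairs and bare flags, separated by ';'."""
    if info == ".":  # '.' denotes a missing value in VCF
        return {}
    parsed = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True  # bare keys are boolean flags
    return parsed


def parse_vcf_line(line):
    """Split one VCF data line into its eight fixed fields (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt, qual, flt, info = fields[:8]
    return {
        "chrom": chrom,
        "pos": int(pos),          # 1-based reference position
        "id": var_id,
        "ref": ref,
        "alt": alt.split(","),    # multiple ALT alleles are comma-separated
        "qual": None if qual == "." else float(qual),
        "filter": flt,
        "info": parse_info(info),
    }


record = parse_vcf_line("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2")
print(record["chrom"], record["pos"], record["alt"], record["info"]["DP"])
```

For real data, a maintained parser or VCFtools itself is the safer choice, since this sketch performs no validation.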
A global reference for human genetic variation
- Taras K. Oleksyk, Adam Auton, Gonçalo R. Abecasis, David M. Altshuler, Richard M. Durbin, David R. Bentley, Shane A. McCarthy
- 30 September 2015
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
An integrated map of genetic variation from 1,092 human genomes
It is shown that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites.
A map of human genome variation from population-scale sequencing
The pilot phase of the 1000 Genomes Project is presented, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms, and the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants are described.
A second generation human haplotype map of over 3.1 million SNPs
The Phase II HapMap is described, which characterizes over 3.1 million human single nucleotide polymorphisms genotyped in 270 individuals from four geographically diverse populations and includes 25–35% of common SNP variation in the populations surveyed, and increased differentiation at non-synonymous, compared to synonymous, SNPs is demonstrated.
An integrated map of structural variation in 2,504 human genomes
An integrated set of eight structural variant classes, comprising both balanced and unbalanced variants, is described; the set was constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations.
Genes mirror geography within Europe
Despite low average levels of genetic differentiation among Europeans, there is a close correspondence between genetic and geographic distances; indeed, a geographical map of Europe arises naturally as an efficient two-dimensional summary of genetic variation in Europeans.
A common sequence motif associated with recombination hot spots and genome instability in humans
Increased hot-spot resolution afforded by the Phase 2 HapMap and novel search methods are used to identify an extended family of motifs based around the degenerate 13-mer CCNCCNTNNCCNC, which is critical in recruiting crossover events to at least 40% of all human hot spots.
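To make the degenerate notation concrete, the following sketch scans a DNA string for the 13-mer CCNCCNTNNCCNC, treating N as any of A, C, G, or T. It is only a minimal pattern match on one strand, not the search method used in the paper; the function names `motif_pattern` and `find_motif` are hypothetical.

```python
import re

MOTIF = "CCNCCNTNNCCNC"  # degenerate 13-mer from the summary above


def motif_pattern(motif):
    """Translate a motif with N wildcards into a compiled regular expression."""
    body = "".join("[ACGT]" if base == "N" else base for base in motif)
    # A lookahead is used so that overlapping occurrences are all reported.
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, motif=MOTIF):
    """Return 0-based start positions of the motif on the given strand."""
    return [m.start() for m in motif_pattern(motif).finditer(sequence.upper())]


print(find_motif("AACCTCCCTAACCACGG"))  # prints [2]
```

Searching the reverse strand as well would require running the same scan on the reverse complement of the sequence.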
Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use
Evidence is reported for the involvement of many systems in tobacco and alcohol use, including genes involved in nicotinic, dopaminergic, and glutamatergic neurotransmission, which provide a solid starting point to evaluate the effects of these loci in model organisms and more precise substance use measures.
Genome-wide patterns of genetic variation in worldwide Arabidopsis thaliana accessions from the RegMap panel
- Matthew W Horton, Angela M. Hancock, J. Bergelson
- Environmental Science, Biology
- Nature Genetics
- 21 December 2011
The pattern of historical recombination in A. thaliana is characterized and an enrichment of hotspots in its intergenic regions and repetitive DNA is observed, which is consistent with the pattern that is observed for humans but which is strikingly different from that observed in other plant species.