• Publications
The Sequence Alignment/Map format and SAMtools
Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by
The variant call format and VCFtools
VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging and comparing, and also provides a general Perl API.
A global reference for human genetic variation
  • Adam Auton, Gonçalo R. Abecasis, David M. Altshuler, Richard M. Durbin, David R. Bentley, Shane A. McCarthy
  • Biology
  • 30 September 2015
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. It has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Merlin—rapid analysis of dense genetic maps using sparse gene flow trees
The multipoint engine for rapid likelihood inference (Merlin) is a computer program that uses sparse inheritance trees for pedigree analysis; it performs rapid haplotyping, genotype error detection and affected pair linkage analyses and can handle more markers than other pedigree analysis packages.
An integrated map of genetic variation from 1,092 human genomes
It is shown that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites.
A map of human genome variation from population-scale sequencing
The pilot phase of the 1000 Genomes Project is presented, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms, and the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants are described.
A second generation human haplotype map of over 3.1 million SNPs
The Phase II HapMap is described, which characterizes over 3.1 million human single nucleotide polymorphisms genotyped in 270 individuals from four geographically diverse populations and includes 25–35% of common SNP variation in the populations surveyed, and increased differentiation at non-synonymous, compared to synonymous, SNPs is demonstrated.
MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes
It is shown that genotype imputation of common variants using HapMap haplotypes as a reference is very accurate with either genome‐wide SNP data or smaller amounts of data typical in fine‐mapping studies, and it is illustrated how association analyses of unobserved variants will benefit from ongoing advances such as larger HapMap reference panels and whole genome shotgun sequencing technologies.
Genetic studies of body mass index yield new insights for obesity biology
A genome-wide association study and Metabochip meta-analysis of body mass index (BMI), a measure commonly used to define obesity and assess adiposity, in up to 339,224 individuals provide strong support for a role of the central nervous system in obesity susceptibility.
Biological, Clinical, and Population Relevance of 95 Loci for Blood Lipids
The results identify several novel loci associated with plasma lipids that are also associated with CAD and provide the foundation to develop a broader biological understanding of lipoprotein metabolism and to identify new therapeutic opportunities for the prevention of CAD.