The Sequence Alignment/Map format and SAMtools
Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by
The variant call format and VCFtools
VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, and comparison, and also provides a general Perl API.
A global reference for human genetic variation
  • Adam Auton, Gonçalo R. Abecasis, David M. Altshuler, Richard M. Durbin, David R. Bentley, +73 authors, Shane A. McCarthy
  • Biology, Medicine
  • Nature
  • 30 September 2015
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. It reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
An integrated map of genetic variation from 1,092 human genomes
It is shown that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites.
Haplotype-based variant detection from short-read sequencing
A Bayesian statistical framework capable of modeling multiallelic loci in sets of individuals with non-uniform copy number is developed, and its implementation in a haplotype-based variant detector, FreeBayes, is described.
ART: a next-generation sequencing read simulator
ART is a set of simulation tools that generate synthetic next-generation sequencing reads. This functionality is essential for testing and benchmarking tools for next-generation sequencing
An integrated map of structural variation in 2,504 human genomes
An integrated set of eight structural variant classes is described, comprising both balanced and unbalanced variants, constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations.
A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms
This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.
Demographic history and rare allele sharing among human populations
It is found that the majority of human genomic variable sites are rare and exhibit little sharing among diverged populations, emphasizing that replication of disease association for specific rare genetic variants across diverging populations must overcome both reduced statistical power because of rarity and higher population divergence.
BamTools: a C++ API and toolkit for analyzing and managing BAM files
BamTools is a software suite for programmers and end users that facilitates research analysis and data management using BAM files. It provides both the first publicly available C++ API for BAM file support and a command-line toolkit.