The variant call format and VCFtools
VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, and comparing, and also provides a general Perl API.
Mouse genomic variation and its effect on phenotypes and gene regulation
These sequences provide a starting point for a new era in the functional analysis of a key model organism and show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus.
A reference panel of 64,976 haplotypes for genotype imputation
A reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies.
BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data
BCFtools/RoH, an extension to the BCFtools software package, is presented and evaluated. It detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model, and is shown to have higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity.
Reference-based phasing using the Haplotype Reference Consortium panel
A new phasing algorithm, Eagle2, is introduced that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium; HRC) using a new data structure based on the positional Burrows-Wheeler transform.
The UK10K project identifies rare variants in health and disease
  • Klaudia Walter, +240 authors, Weihua Zhang
  • Biology, Medicine
  • Nature
  • 14 September 2015
Insights from sequencing whole genomes or exomes of nearly 10,000 individuals from extensively phenotyped population-based and disease collections are described, along with the population structure and functional annotation of rare and low-frequency variants.
High levels of RNA-editing site conservation amongst 15 laboratory mouse strains
In the Cds2 gene, evidence is found for RNA editing acting to preserve the ancestral transcript sequence despite genomic sequence divergence, showing that despite over two million years of evolutionary divergence, the sites edited and the level of editing at each site are remarkably consistent across the 15 strains.
Twelve years of SAMtools and BCFtools
The SAMtools and BCFtools packages represent a unique collection of tools that have been used in numerous other software projects and countless genomic pipelines, free for both non-commercial and commercial use.
A comparative phenotypic and genomic analysis of C57BL/6J and C57BL/6N mouse strains
Comparison of C57BL/6J and C57BL/6N demonstrates a range of phenotypic differences that have the potential to impact upon penetrance and expressivity of mutational effects in these strains.
Whole‐genome sequencing identifies EN1 as a determinant of bone density and fracture
Evidence is provided that low‐frequency non‐coding variants have large effects on BMD and fracture, thereby providing rationale for whole‐genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population.