Improving the Accuracy and Efficiency of Identity-by-Descent Detection in Population Data

Brian L. Browning and Sharon R. Browning, pp. 459-471
Segments of identity-by-descent (IBD) detected from high-density genetic data are useful for many applications, including long-range phase determination, phasing family data, imputation, IBD mapping, and heritability analysis in founder populations. We present Refined IBD, a new method for IBD segment detection. Refined IBD achieves both computational efficiency and highly accurate IBD segment reporting by searching for IBD in two steps. The first step (identification) uses the GERMLINE…

PIGS: improved estimates of identity-by-descent probabilities by probabilistic IBD graph sampling

A hybrid approach (PIGS) is developed, which combines the computational efficiency of pairwise methods with the power of multiway methods and leverages the IBD graph structure to compute the probability of IBD conditional on all pairwise estimates simultaneously.

Detecting identity by descent and estimating genotype error rates in sequence data.

Relationship Estimation from Whole-Genome Sequence Data

A new method is developed to identify and mask genomic regions with excess pairwise IBD in both the pedigree and control datasets using three established IBD methods: GERMLINE, fastIBD, and ISCA.

Fast and accurate shared segment detection and relatedness estimation in un-phased genetic data using TRUFFLE

TRUFFLE, a method that integrates computational techniques and statistical principles for the identification and visualization of identity-by-descent (IBD) segments in un-phased data, is developed; it skips the haplotype phasing step and relies instead on a simpler region-based approach.

A fast and simple method for detecting identity by descent segments in large-scale data

hap-IBD, a method for detecting identity-by-descent haplotype segments in large-scale genotype data, is shown to be the only method that can rapidly and accurately detect short 2-4 cM IBD segments in the full UK Biobank data.

Fast and Robust Identity-by-Descent Inference with the Templated Positional Burrows–Wheeler Transform

The templated positional Burrows–Wheeler transform (TPBWT) is presented to make fast IBD estimates robust to genotype and phasing errors, and its performance is compared against a widely used phase-free IBD inference approach that is robust to phase errors.

Reducing Pervasive False-Positive Identical-by-Descent Segments Detected by Large-Scale Pedigree Analysis

HaploScore, a novel, computationally efficient metric, is introduced; it scores IBD segments in proportion to the number of switch errors they contain and can improve the accuracy of segments reported by any IBD detection method, provided that estimates of the genotyping error rate and switch error rate are available.

Conflation of Short Identity-by-Descent Segments Bias Their Inferred Length Distribution

An estimator of the de novo mutation rate using IBD segments is presented and analyzed, and it is demonstrated that unmodeled conflation leads to underestimates of the ages of the common ancestors on these segments, and hence a significant overestimate of the mutation rate.



High-resolution detection of identity by descent in unrelated individuals.

Identity by descent estimation with dense genome‐wide genotype data

IBDLD overcomes the challenges of exact multipoint estimation of IBD in pedigrees of potentially large size and eliminates the difficulty of accommodating the background linkage disequilibrium (LD) that is present in high‐density genotype data.

Using identity by descent estimation with dense genotype data to detect positive selection

This work uses IBD to find signals of selection in the Maasai from Kinyawa, Kenya, taking advantage of statistical tools that can probabilistically estimate IBD sharing without having to thin genotype data because of linkage disequilibrium (LD), and that allow for both inbreeding and more than one allele to be shared IBD.

A method for detecting IBD regions simultaneously in multiple individuals--with applications to disease genetics.

A new Markov Chain Monte Carlo method is presented, based on a probabilistic model applicable to unphased SNP data, that can simultaneously infer IBD sharing among multiple individuals and is more powerful and accurate than several other non-pedigree based methods.

Detecting Rare Variant Associations by Identity-by-Descent Mapping in Case-Control Studies

The results suggest that IBD mapping may have higher power than association analysis of SNP data when multiple rare causal variants are clustered within a gene; however, for outbred populations, very large sample sizes may be required for genome-wide significance unless the causal variants have strong effects.

Detection of sharing by descent, long-range phasing and haplotype imputation

This work shows how to phase more than 1,000 SNPs simultaneously for a large fraction of the 35,528 Icelanders genotyped by Illumina chips, which is particularly powerful in studies of the inheritance of recurrent mutations and fine-scale recombinations in large sample sets.

Inferring Coancestry in Population Samples in the Presence of Linkage Disequilibrium

A hidden Markov model for ibd among a set of chromosomes is presented, and it is shown that, despite not incorporating LD, the model has been quite successful in detecting segments as small as 10^6 bp (1 Mbp); comparisons with fastIBD, which uses an LD model in estimating ibd, are also presented.

Identity by descent between distant relatives: detection and applications.

The principles behind methods for IBD segment detection are explained, recently developed methods are described, approaches to comparing methods are discussed, and an overview of applications is given.

Relatedness mapping and tracts of relatedness for genome‐wide data in the presence of linkage disequilibrium

A new method is presented for identifying IBD tracts among individuals from genome‐wide single nucleotide polymorphism data, using a continuous-time Markov model that accurately accounts for linkage disequilibrium through pairwise haplotype probabilities.