Enhanced Pedigree Error Detection

Lei Sun, Kenneth Wilder, Mary Sara McPeek. Human Heredity, pp. 99-110.
Accurate information on the relationships among individuals in a study is critical for valid linkage analysis. We extend the MLLR, EIBD, AIBS and IBS tests for detection of misspecified relationships to a broader range of relative pairs, and we improve the two-stage screening procedure for analyzing large data sets. We have developed software, PREST, which calculates the test statistics and performs the corresponding hypothesis tests for relationship misclassification in general outbred… 

Detecting Pedigree Relationship Errors.

  • Lei Sun
  • Biology
    Methods in molecular biology
  • 2017
Several allele-sharing and likelihood-based statistics proposed for efficiently extracting genealogical information from available genome-wide marker data are reviewed, along with the software package PREST, which implements these methods.

Improving Pedigree-based Linkage Analysis by Estimating Coancestry Among Families

It is shown that when families share a gene for a trait due to shared ancestry on the order of tens of generations, the method can detect a linkage signal when independent analyses of the families do not.

Linkage analysis without defined pedigrees

New fast and accurate algorithms for estimating global and local kinship coefficients from dense SNP genotypes are presented, which require only a single pass through the SNP genotype data and can be used to cluster individuals into pedigrees.

Non-identifiable Pedigrees and a Bayesian Solution

A general criterion for establishing whether a pair of pedigrees is non-identifiable and two easy-to-compute criteria guaranteeing identifiability are introduced, and a method for dealing with non-identifiable likelihoods is suggested: use Bayes' rule to obtain the posterior from the likelihood and prior.

The impact of data quality on the identification of complex disease genes: experience from the Family Blood Pressure Program

A protocol for large linkage studies is developed that reduces two sources of data error, pedigree structure errors and marker genotyping errors, and the linkage signals before and after data cleaning are used to illustrate the impact of missing and erroneous data.

PREST-plus identifies pedigree errors and cryptic relatedness in the GAW18 sample using genome-wide SNP data

Using the genome-wide single-nucleotide polymorphism (SNP) data, PREST-plus detects 7 mis-specified relative pairs, with their IBD estimates clearly deviating from the null expectations, and it identifies 4 cryptic related pairs involving 7 individuals from 6 families.

Multiple Genetic Variant Association Testing by Collapsing and Kernel Methods With Pedigree or Population Structured Data

Searching for rare genetic variants associated with complex diseases can be facilitated by enriching for diseased carriers of rare variants by sampling cases from pedigrees enriched for disease,

Pedigree and genotype errors in the Framingham Heart Study

The pedigree and genotype data from the Framingham Heart Study were examined for errors, and five Mendelian errors were found following the pedigree corrections.

Accurate Phasing of Pedigree Genotypes Using Whole Genome Sequence Data

A new method is presented for phasing genotypes from whole genome sequencing data in pedigrees: PULSAR (Phasing Using Lineage Specific Alleles / Rare variants), which is built upon the idea that alleles specific to a single founding chromosome within a pedigree are highly informative for identifying haplotypes that are identical-by-descent between individuals within a pedigree.



Detection of Misspecified Relationships in Inbred and Outbred Pedigrees

Genome screen data collected for linkage analysis can be used to detect pedigree errors, and a graphical method for error detection in complex inbred pedigrees is proposed, with application to the Hutterites.

A test statistic to detect errors in sib-pair relationships.

A test statistic is proposed that sums, over a large number of genetic markers, the number of alleles shared identical by state by a pair of individuals at each marker; the statistic identifies sibs as MZ twins.
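
The summation described in this summary can be sketched in a few lines. The following is a minimal illustration only, not the published statistic: it assumes genotypes are stored as unordered allele pairs with `None` for a missing call, the function names `ibs_count` and `ibs_sum` are hypothetical, and it omits the standardization against a null distribution that a real test would need.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical representation: an unordered pair of allele labels,
# or None when the genotype is missing at that marker.
Genotype = Optional[Tuple[str, str]]


def ibs_count(g1: Tuple[str, str], g2: Tuple[str, str]) -> int:
    """Return the number of alleles (0, 1 or 2) shared identical by state."""
    a, b = g1
    c, d = g2
    # Genotypes are unordered, so try both ways of pairing the alleles
    # and keep the pairing that matches the most alleles.
    direct = int(a == c) + int(b == d)
    crossed = int(a == d) + int(b == c)
    return max(direct, crossed)


def ibs_sum(ind1: Sequence[Genotype], ind2: Sequence[Genotype]) -> Tuple[int, int]:
    """Sum IBS counts over markers typed in both individuals.

    Returns (total shared alleles, number of markers used).
    """
    total = 0
    used = 0
    for g1, g2 in zip(ind1, ind2):
        if g1 is None or g2 is None:
            continue  # skip markers with a missing genotype
        total += ibs_count(g1, g2)
        used += 1
    return total, used


# Toy data for illustration only.
person_a = [("A", "B"), ("A", "A"), None, ("C", "D")]
person_b = [("B", "A"), ("A", "B"), ("A", "A"), ("E", "F")]
total_shared, markers_used = ibs_sum(person_a, person_b)
```

Dividing the total by twice the number of markers used gives the proportion of alleles shared, which is near 1 for MZ twins (barring genotyping error) and lower for other relative pairs.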

PedCheck: a program for identification of genotype incompatibilities in linkage analysis.

Four error-checking algorithms are implemented in a new computer program, PedCheck, which will assist researchers in identifying all Mendelian inconsistencies in pedigree data and will provide them with useful and detailed diagnostic information to help resolve the errors.
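
As an illustration of what a Mendelian inconsistency is, here is a minimal sketch of the simplest possible check, on a single fully genotyped parent-child trio at one autosomal marker. It is not one of PedCheck's four algorithms, which the summary does not describe; the function name `mendel_consistent` and the tuple genotype format are assumptions made for this example.

```python
from itertools import product
from typing import Tuple

Genotype = Tuple[str, str]  # hypothetical format: unordered pair of allele labels


def mendel_consistent(child: Genotype, father: Genotype, mother: Genotype) -> bool:
    """True if the child's genotype can be formed from one paternal allele
    and one maternal allele (autosomal marker, no missing data)."""
    target = sorted(child)
    # Enumerate every (paternal allele, maternal allele) transmission.
    return any(sorted((p, m)) == target for p, m in product(father, mother))


# Toy genotypes for illustration only.
ok = mendel_consistent(("A", "B"), ("A", "A"), ("B", "C"))
bad = mendel_consistent(("A", "A"), ("A", "B"), ("B", "B"))
```

Here `ok` is true because the child can receive A from the father and B from the mother, while `bad` is false because the mother carries no A allele. A check of this kind flags that an inconsistency exists but cannot say which of the three genotypes, or which relationship, is wrong.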

Identifying marker typing incompatibilities in linkage analysis.

Two methods for automatically identifying those individuals whose genotypes are most likely the cause of the inconsistencies in the pedigree are developed and implemented as a module of the pedigree analysis program package MENDEL.

Estimation of pairwise relationships in the presence of genotyping errors.

In calculating the likelihood for a putative relationship, this work proposes replacing the values for pi(x) used by Boehnke and Cox (1997, "Accurate inference of relationships in sib-pair linkage studies") with values that depend on e, where e denotes twice the approximate genotyping error rate.

Relationship Estimation in Affected Sib Pair Analysis of Late-Onset Diseases

  • H. Göring, J. Ott
  • Environmental Science
    European journal of human genetics : EJHG
  • 1997
It is demonstrated that eliminating false sib pairs increases the power to detect linkage in affected sib-pair studies, and it is shown that sibs, half-sibs and unrelated individuals can be distinguished from each other quite reliably using numbers of markers that should be available in most sib-pair studies.

Accurate inference of relationships in sib-pair linkage studies.

The number of markers required to accurately infer relationships typically encountered in a sib-pair study is explored, as a function of marker allele frequencies, marker spacing, and genotyping error rate, and it is concluded that very accurate inference of relationships can be achieved, given the marker data from even part of a genome scan.

Relationship estimation by Markov-process models in a sib-pair linkage study.

  • J. Olson
  • Mathematics
    American journal of human genetics
  • 1999
This work proposes multipoint methods based on a Markov-process model of allele sharing along the chromosome; the methods can be implemented with standard algorithms that compute multipoint marker allele-sharing probabilities for sib pairs.

On Relationship Inference Using Gamete Identity by Descent Data

Under the assumption that the crossovers follow a Poisson process, it is shown that the exact calculation of the likelihood of a particular relationship for a given gamete IBD datum is tractable.