Haplotyping as Perfect Phylogeny: A Direct Approach

@article{Bafna2003HaplotypingAP,
  title={Haplotyping as Perfect Phylogeny: A Direct Approach},
  author={Vineet Bafna and Dan Gusfield and Giuseppe Lancia and Shibu Yooseph},
  journal={Journal of computational biology : a journal of computational molecular cell biology},
  year={2003},
  volume={10},
  number={3-4},
  pages={323-340}
}
  • V. Bafna, D. Gusfield, G. Lancia, S. Yooseph
  • Published 2003
  • Computer Science
  • Journal of computational biology : a journal of computational molecular cell biology
A full haplotype map of the human genome will prove extremely valuable as it will be used in large-scale screens of populations to associate specific haplotypes with specific complex genetic-influenced diseases. A haplotype map project has been announced by NIH. The biological key to that project is the surprising fact that some human genomic DNA can be partitioned into long blocks where genetic recombination has been rare, leading to strikingly fewer distinct haplotypes in the population than… 
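To make the perfect phylogeny haplotyping (PPH) question in the title concrete, the following is a minimal brute-force sketch and not the algorithm of the paper. It assumes genotypes are encoded per site as 0 or 1 (homozygous) or 2 (heterozygous), and it uses the standard four-gamete test to decide whether a set of binary haplotypes fits an unrooted perfect phylogeny. The function names `phasings`, `passes_four_gamete_test`, and `brute_force_pph` are hypothetical, and the search is exponential in the number of heterozygous sites, so it is only suitable for toy inputs.

```python
from itertools import combinations, product


def phasings(genotype):
    """Yield every haplotype pair consistent with one genotype.

    Assumed encoding: 0 or 1 = homozygous site, 2 = heterozygous site.
    """
    het = [i for i, g in enumerate(genotype) if g == 2]
    if not het:
        h = tuple(genotype)
        yield (h, h)
        return
    # Pin the first heterozygous site to 0 on the first haplotype so that
    # mirrored pairs (a, b) and (b, a) are not both generated.
    for bits in product((0, 1), repeat=len(het) - 1):
        first, second = list(genotype), list(genotype)
        for site, b in zip(het, (0,) + bits):
            first[site] = b
            second[site] = 1 - b
        yield (tuple(first), tuple(second))


def passes_four_gamete_test(haplotypes):
    """Return True if no pair of sites shows all of 00, 01, 10, 11."""
    n_sites = len(haplotypes[0]) if haplotypes else 0
    for i, j in combinations(range(n_sites), 2):
        if len({(h[i], h[j]) for h in haplotypes}) == 4:
            return False
    return True


def brute_force_pph(genotypes):
    """Return one phasing whose haplotypes pass the test, or None."""
    for choice in product(*(list(phasings(g)) for g in genotypes)):
        haplotypes = [h for pair in choice for h in pair]
        if passes_four_gamete_test(haplotypes):
            return list(choice)
    return None


if __name__ == "__main__":
    print(brute_force_pph([(2, 2), (0, 1), (1, 0)]))
    print(brute_force_pph([(0, 0), (0, 1), (1, 0), (1, 1)]))
```

In the first call the phasing 00/11 of the genotype (2, 2) would create all four gametes together with the homozygous individuals, so the search settles on 01/10 instead. In the second call all four gametes are forced by homozygous genotypes, so no PPH solution exists.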
Empirical Exploration of Perfect Phylogeny Haplotyping and Haplotypers
TLDR
The next high-priority phase of human genomics will involve the development of a full Haplotype Map of the human genome. Results of using the method to find non-overlapping intervals where the haplotyping solution is highly reliable, as a function of the level of recombination in the data, are discussed.
A Note on Efficient Computation of Haplotypes via Perfect Phylogeny
TLDR
This short note addresses two questions that were left open about the perfect phylogeny haplotyping problem and shows that the problem is NP-hard using a reduction from Vertex Cover (Garey and Johnson, 1979).
Graph algorithms for the haplotyping problem
TLDR
A linear-time algorithm is introduced for the Perfect Phylogeny Haplotyping (PPH) problem that provides all the possible solutions from an input and is much faster than previous methods.
Algorithms for Imperfect Phylogeny Haplotyping (IPPH) with a Single Homoplasy or Recombination Event
TLDR
The haplotype inference problem is addressed explicitly, by allowing one recombination or homoplasy event in the model of haplotype evolution, and a polynomial time solution for one problem is provided using an additional, empirically-supported assumption.
Fast Perfect Phylogeny Haplotype Inference
TLDR
This work addresses the problem of reconstructing haplotypes in a population, given a sample of genotypes and assumptions about the underlying population, and proposes a different combinatorial approach exploiting intersections of sampled genotypes (considered as sets of candidate haplotypes).
Computational Problems in Perfect Phylogeny Haplotyping: Typing without Calling the Allele
TLDR
It is shown how to solve the problem in polynomial time by a reduction to the graph realization problem. Tree uniqueness is shown to imply uniquely determined haplotypes, up to inherent degrees of freedom, and a sufficient condition for the uniqueness is given.
Haplotype Inferring Via Galled-Tree Networks Is NP-Complete
TLDR
It is shown that, in general, haplotyping via galled-tree networks is NP-complete, and thus indeed hard.
On intractability of haplotype inferring via galled-tree networks
TLDR
A polynomial algorithm for haplotyping via imperfect phylogenies with a single homoplasy was presented, as well as a practical algorithm for haplotype inferring via galled-tree networks with one gall, showing that the hypergraph covering problem in the general case is NP-complete by reduction from 3-SAT.
Optimal imperfect phylogeny reconstruction and haplotyping (IPPH).
TLDR
The general IPPH problem is solved and it is shown for the first time that it is possible to infer optimal q-near-perfect phylogenies from diploid genotype data in polynomial time for any constant q, where q is the number of "extra" mutations required in the phylogeny beyond what would be present in a perfect phylogeny.
Haplotype reconstruction from genotype data using Imperfect Phylogeny
TLDR
The method leverages a new insight into the underlying structure of haplotypes that shows that SNPs are organized in highly correlated 'blocks' and is extremely efficient compared with previous methods such as PHASE and HAPLOTYPER.

References

SHOWING 1-10 OF 31 REFERENCES
Haplotyping as Perfect Phylogeny: a Direct Approach
TLDR
The algorithmic implications of the "no recombination in long blocks" observation for the problem of inferring haplotypes in populations are explored, and a very simple, easy-to-program algorithm is established that determines whether there is a PPH solution for input genotypes and produces a linear-space data structure to represent all of the solutions.
Haplotyping as perfect phylogeny: conceptual framework and efficient solutions
TLDR
This paper explores the algorithmic implications of the key "no-recombination in long blocks" observation for the problem of inferring haplotypes in populations, and observes that the no-recombination assumption is very powerful.
A Note on Efficient Computation of Haplotypes via Perfect Phylogeny
TLDR
This short note addresses two questions that were left open about the perfect phylogeny haplotyping problem and shows that the problem is NP-hard using a reduction from Vertex Cover (Garey and Johnson, 1979).
Inference of Haplotypes from Samples of Diploid Populations: Complexity and Algorithms
TLDR
The problem is NP-hard and, in fact, Max-SNP complete; it is shown that the reduction creates problem instances conforming to a severe restriction believed to hold in real data; and an approach based on that operation and (integer) linear programming works quickly and correctly on simulated data.
High-resolution haplotype structure in the human genome
TLDR
A high-resolution analysis of the haplotype structure across 500 kilobases on chromosome 5q31 using 103 single-nucleotide polymorphisms (SNPs) in a European-derived population offers a coherent framework for creating a haplotype map of the human genome.
Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms.
TLDR
A new Monte Carlo approach that can accurately and rapidly infer haplotypes for a large number of linked SNPs and is robust to the violation of Hardy-Weinberg equilibrium, to the presence of missing data, and to occurrences of recombination hotspots is proposed.
Large scale reconstruction of haplotypes from genotype data
TLDR
This paper presents results for a highly accurate method for haplotype resolution from genotype data which leverages a new insight into the underlying structure of haplotypes which shows that SNPs are organized in highly correlated "blocks".
Efficient reconstruction of haplotype structure via perfect phylogeny.
TLDR
A simple and efficient polynomial-time algorithm for inferring haplotypes from the genotypes of a set of individuals, assuming a perfect phylogeny, is presented. A hardness result is also presented for the problem of removing the minimum number of individuals from a population to ensure that the genotypes of the remaining individuals are consistent with a perfect phylogeny.
HAPLO: a program using the EM algorithm to estimate the frequencies of multi-site haplotypes.
TLDR
A FORTRAN program, HAPLO, implements the EM algorithm, a generalized iterative maximum likelihood approach to estimation that is useful when data are ambiguous and/or incomplete, to estimate haplotype frequencies from phenotype data on samples of unrelated individuals.
Inference of haplotypes from PCR-amplified samples of diploid populations.
  • A. Clark
  • Biology
    Molecular biology and evolution
  • 1990
TLDR
Details of the algorithm for extracting allelic sequences from population samples, along with some population-genetic considerations that influence the likelihood for success of the method, are presented here.