Pure Parsimony Xor Haplotyping

@article{Bonizzoni2010PurePX,
  title={Pure Parsimony Xor Haplotyping},
  author={Paola Bonizzoni and Gianluca Della Vedova and Riccardo Dondi and Yuri Pirola and Romeo Rizzi},
  journal={IEEE/ACM Transactions on Computational Biology and Bioinformatics},
  year={2010},
  volume={7},
  pages={598-610}
}
The haplotype resolution from xor-genotype data has been recently formulated as a new model for genetic studies. The xor-genotype data is a cheaply obtainable type of data distinguishing heterozygous from homozygous sites without identifying the homozygous alleles. In this paper, we propose a formulation based on a well-known model used in haplotype inference: pure parsimony. We exhibit exact solutions of the problem by providing polynomial time algorithms for some restricted cases and a fixed… 
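The two notions in the abstract can be illustrated with a minimal sketch. It assumes haplotypes are encoded as binary tuples, so that an xor-genotype is the position-wise XOR of an individual's two haplotypes (1 = heterozygous site, 0 = homozygous site). The function names `xor_genotype`, `explains`, and `min_haplotype_set` are hypothetical, and the exhaustive search is only a way to make the pure parsimony objective concrete on tiny inputs; it is not one of the paper's algorithms.

```python
from itertools import combinations, product


def xor_genotype(h1, h2):
    """Position-wise XOR of two binary haplotypes (tuples of 0/1)."""
    return tuple(a ^ b for a, b in zip(h1, h2))


def explains(haplotypes, xor_genotypes):
    """True if every xor-genotype is the XOR of two haplotypes in the set.

    Assumption: an individual may carry two copies of the same haplotype,
    which yields the all-zero xor-genotype.
    """
    realizable = {xor_genotype(a, b) for a in haplotypes for b in haplotypes}
    return all(g in realizable for g in xor_genotypes)


def min_haplotype_set(xor_genotypes):
    """Smallest haplotype set explaining the input, by exhaustive search.

    Exponential in the number of sites; usable only on toy instances.
    """
    sites = len(xor_genotypes[0])
    candidates = list(product((0, 1), repeat=sites))
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            if explains(subset, xor_genotypes):
                return subset
    return None
```

For instance, the three xor-genotypes `(1, 0)`, `(0, 1)`, and `(1, 1)` cannot be explained by two haplotypes, since two distinct haplotypes yield only one nonzero XOR, so the search returns a set of three.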

Maximum parsimony xor haplotyping by sparse dictionary selection
TLDR
This work proposes a framework of maximum parsimony inference of haplotypes based on the search of a sparse dictionary, and presents a greedy method that can effectively infer the haplotype pairs given a set of xor-genotypes augmented by a small number of regular genotypes.
Improved haplotype assembly using Xor genotypes.
COMBINATORIAL HAPLOTYPING PROBLEMS
TLDR
The collection of a large amount of genomic data, culminating in the completion of the Human Genome Project, has brought confirmation that the genetic makeup of humans (as well as other species) is remarkably well-conserved.
Parameterized Algorithms in Bioinformatics: An Overview
TLDR
This work surveys recent developments of parameterized algorithms and complexity for important NP-hard problems in bioinformatics, and covers sequence assembly and analysis, genome comparison and completion, and haplotyping and phylogenetics.

References

SHOWING 1-10 OF 27 REFERENCES
Computational Problems in Perfect Phylogeny Haplotyping: Xor-Genotypes and Tag SNPs
TLDR
It is shown how to resolve xor-genotypes under the perfect phylogeny model, and what degrees of freedom such resolutions have; it is also shown that the full genotypes of at most three individuals suffice to determine all haplotypes across the phylogeny.
Computational Problems in Perfect Phylogeny Haplotyping: Typing without Calling the Allele
TLDR
It is shown how to solve the problem in polynomial time by a reduction to the graph realization problem. The paper also shows that tree uniqueness implies uniquely determined haplotypes, up to inherent degrees of freedom, and gives a sufficient condition for the uniqueness.
Haplotyping Populations by Pure Parsimony: Complexity of Exact and Approximation Algorithms
TLDR
This paper proves that the problem is APX-hard and presents a 2^(k-1)-approximation algorithm for the case in which each genotype has at most k ambiguous positions, and gives a new integer-programming formulation that has (for the first time) a polynomial number of variables and constraints.
Haplotype Inference by Pure Parsimony
TLDR
The results are that the Pure Parsimony problem can be solved efficiently in practice for a wide range of problem instances of current interest in biology.
Haplotyping as perfect phylogeny: conceptual framework and efficient solutions
TLDR
This paper explores the algorithmic implications of the key "no-recombination in long blocks" observation, for the problem of inferring haplotypes in populations, and observes that the no-recombination assumption is very powerful.
Integer programming approaches to haplotype inference by pure parsimony
TLDR
A new polynomial-sized IP formulation is presented that is a hybrid between two existing IP formulations and inherits many of the strengths of both and can be extended in a variety of ways to allow errors in the input or model the structure of the population under consideration.
Islands of Tractability for Parsimony Haplotyping
TLDR
It is proved that the parsimony approach to haplotype inference, which calls for finding a set of haplotypes of minimum cardinality that explains an input set of genotypes, is APX-hard even in very restricted cases.
Shorelines of Islands of Tractability: Algorithms for Parsimony and Minimum Perfect Phylogeny Haplotyping Problems
TLDR
This work extends recent work by further mapping the interface between "easy" and "hard" instances, within the framework of (k, f)-bounded instances, and constructs polynomial-time approximation algorithms for both PH and MPPH, based on properties of the columns of the input matrix.
A haplotype map of the human genome
TLDR
A public database of common variation in the human genome: more than one million single nucleotide polymorphisms for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted.