Motivated by the problem in computational biology of reconstructing the series of chromosome inversions by which one organism evolved from another, we consider the problem of computing the shortest series of reversals that transform one permutation into another. The permutations describe the order of genes on corresponding chromosomes, and a reversal takes an ...
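To make the problem statement concrete, here is a minimal brute-force sketch, not the method of the paper. It assumes the standard notion of a reversal (inverting a contiguous segment of the permutation) and finds the fewest reversals by breadth-first search. The names `reverse_segment` and `reversal_distance` are hypothetical, and the search is exponential in the permutation length, so it is only usable on tiny inputs.

```python
from collections import deque


def reverse_segment(perm, i, j):
    """Return a copy of perm with positions i..j (inclusive) reversed."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


def reversal_distance(source, target):
    """Fewest reversals turning source into target, by exhaustive BFS.

    Assumption: a reversal inverts any contiguous segment of length >= 2.
    This is an illustration only; the state space grows factorially.
    """
    source, target = tuple(source), tuple(target)
    if sorted(source) != sorted(target):
        raise ValueError("source and target must contain the same elements")

    n = len(source)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        perm, dist = queue.popleft()
        if perm == target:
            return dist
        # Try every possible segment [i, j] with i < j.
        for i in range(n):
            for j in range(i + 1, n):
                nxt = reverse_segment(perm, i, j)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None  # not reached: any permutation can be sorted by reversals


if __name__ == "__main__":
    print(reversal_distance([3, 1, 2], [1, 2, 3]))
```

Because BFS explores permutations in order of the number of reversals applied, the first time the target is dequeued its depth is the minimum.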
The MSA program, written and distributed in 1989, is one of the few existing programs that attempt to find optimal alignments of multiple protein or DNA sequences. The MSA program implements a branch-and-bound technique together with a variant of Dijkstra's shortest paths algorithm to prune the basic dynamic programming graph. We have made substantial ...
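The idea of treating alignment as a shortest-path problem over the dynamic programming graph can be illustrated on the much simpler two-sequence case. The sketch below is not MSA's algorithm: it omits branch-and-bound entirely, handles only two sequences, and assumes unit mismatch and gap costs. The function name `alignment_cost` and its parameters are hypothetical.

```python
import heapq


def alignment_cost(a, b, mismatch=1, gap=1):
    """Minimum alignment cost of a and b via Dijkstra on the alignment grid.

    Node (i, j) means a[:i] has been aligned with b[:j]. Edges:
      diagonal   -> align a[i] with b[j] (cost 0 on match, else `mismatch`)
      down/right -> align a character with a gap (cost `gap`)
    Costs are assumed non-negative, as Dijkstra's algorithm requires.
    """
    goal = (len(a), len(b))
    best = {(0, 0): 0}
    heap = [(0, 0, 0)]  # entries are (cost, i, j)
    while heap:
        cost, i, j = heapq.heappop(heap)
        if (i, j) == goal:
            return cost
        if cost > best.get((i, j), float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        moves = []
        if i < len(a) and j < len(b):
            moves.append((i + 1, j + 1, 0 if a[i] == b[j] else mismatch))
        if i < len(a):
            moves.append((i + 1, j, gap))
        if j < len(b):
            moves.append((i, j + 1, gap))
        for ni, nj, step in moves:
            new_cost = cost + step
            if new_cost < best.get((ni, nj), float("inf")):
                best[(ni, nj)] = new_cost
                heapq.heappush(heap, (new_cost, ni, nj))
    return None  # not reached: the goal is always reachable


if __name__ == "__main__":
    print(alignment_cost("kitten", "sitting"))
```

Unlike filling the full dynamic programming table, Dijkstra's algorithm stops as soon as the goal node is settled, so grid cells that are more expensive to reach than the optimal alignment are never expanded.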
Software watermarking is a tool used to combat software piracy by embedding identifying information into a program. Most existing proposals for software watermarking have the shortcoming that the mark can be destroyed via fairly straightforward semantics-preserving code transformations. This paper introduces path-based watermarking, a new approach to ...
We study the problem of comparing two circular chromosomes that have evolved by chromosome inversion, assuming that the order of corresponding genes is known, as well as their orientation. Determining the minimum number of inversions is equivalent to finding the minimum number of reversals to sort a signed circular permutation, where a reversal takes an arbitrary ...
For as long as biologists have been computing alignments of sequences, the question of what values to use for scoring substitutions and gaps has persisted. While some choices for substitution scores are now common, largely due to convention, there is no standard for choosing gap penalties. An objective way to resolve this question is to learn the ...
"Simultaneous comparisons of three or more sequences related by a tree," in D. Sankoff and J. Kruskal (eds.), Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, 253-264, in the alphabet as a possibility for the (l+1)st position of the candidate sequence S. The final answer is obtained by looking at the vector for ...
We develop a novel and general approach to estimating the accuracy of protein multiple sequence alignments without knowledge of a reference alignment, and use our approach to address a new problem that we call parameter advising. For protein alignments, we consider twelve independent features that contribute to a quality alignment. An accuracy estimator is ...