The entire protein sequence database has been exhaustively matched. Definitive mutation matrices and models for scoring gaps were obtained from the matching and used to organize the sequence database as sets of evolutionarily connected components. The methods developed are general and can be used to manage sequence data generated by major genome sequencing…
The Lambert W function is defined to be the multivalued inverse of the function $w \mapsto w e^{w}$. It has many applications in pure and applied mathematics, some of which are briefly described here. We present a new discussion of the complex branches of W, an asymptotic expansion valid for all branches, an efficient numerical procedure for evaluating the function to…
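As a rough illustration of what evaluating W numerically can look like, the sketch below computes only the principal real branch for x >= -1/e using Halley's iteration on f(w) = w e^w - x. The function name `lambert_w0`, the starting guesses, and the tolerance are assumptions made for this example; the abstract is truncated, so this should not be read as the procedure the paper itself describes.

```python
import math

def lambert_w0(x, tol=1e-12, max_iter=50):
    """Principal real branch of Lambert W for x >= -1/e (illustrative sketch)."""
    if x < -1.0 / math.e:
        raise ValueError("the principal branch is real only for x >= -1/e")
    if x == 0.0:
        return 0.0
    # Starting guesses (an assumption of this sketch, chosen for simplicity).
    if x < 0.0:
        # Series around the branch point x = -1/e, where W = -1.
        p = math.sqrt(2.0 * max(0.0, math.e * x + 1.0))
        w = -1.0 + p - p * p / 3.0
    elif x < math.e:
        w = math.log1p(x)
    else:
        lx = math.log(x)
        w = lx - math.log(lx)
    for _ in range(max_iter):
        if w == -1.0:
            break  # exactly at the branch point; the Halley step is undefined
        ew = math.exp(w)
        f = w * ew - x
        # Halley's step for f(w) = w * e^w - x.
        step = f / (ew * (w + 1.0) - (w + 2.0) * f / (2.0 * w + 2.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w
```

A quick sanity check is that `w = lambert_w0(x)` satisfies `w * math.exp(w) ≈ x`; for example, `lambert_w0(math.e)` should be very close to 1.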
We introduce a family of simple and fast algorithms for solving the classical string matching problem, string matching with don't care symbols and complement symbols, and multiple patterns. In addition we solve the same problems allowing up to k mismatches. Among the features of these algorithms are that they are real time algorithms, they…
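One bit-parallel scheme that fits this description is the shift-and technique, sketched below for exact matching with an optional don't-care symbol. Whether this matches the algorithms of the truncated abstract in detail is an assumption; the function name `shift_and_search` and the `wildcard` parameter are hypothetical, and the k-mismatch and multiple-pattern variants are not covered.

```python
def shift_and_search(text, pattern, wildcard=None):
    """Return start indices of all matches of pattern in text (shift-and sketch).

    Bit i of `state` is set when pattern[:i + 1] matches the text ending at the
    current position. Pattern characters equal to `wildcard` match anything.
    """
    m = len(pattern)
    if m == 0:
        return []
    masks = {}        # character -> bitmask of pattern positions holding it
    wild_mask = 0     # bitmask of don't-care positions
    for i, ch in enumerate(pattern):
        if wildcard is not None and ch == wildcard:
            wild_mask |= 1 << i
        else:
            masks[ch] = masks.get(ch, 0) | (1 << i)
    accept = 1 << (m - 1)
    state = 0
    hits = []
    for j, ch in enumerate(text):
        # Extend every partial match by one character and start a new one.
        state = ((state << 1) | 1) & (masks.get(ch, 0) | wild_mask)
        if state & accept:
            hits.append(j - m + 1)
    return hits
```

Each text character is processed with a constant number of word operations (for patterns that fit in a machine word), and the text is never re-read, which is the sense in which such a scan runs in real time.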
In aligning homologous protein sequences, it is generally assumed that amino acid substitutions subsequent in time occur independently of amino acid substitutions previous in time, i.e. that patterns of mutation are similar at low and high sequence divergence. This assumption is examined here and shown to be incorrect in an interesting way. Separate…
The exhaustive matching of the protein sequence database makes possible a broadly based study of insertions and deletions (indels) during divergent evolution. In this study, the probability of a gap in an alignment of a pair of homologous protein sequences was found to increase with the evolutionary distance measured in PAM units (number of accepted point…
We have developed an algorithm for identifying proteins at the sub-microgram level without sequence determination by chemical degradation. The protein, usually isolated by one- or two-dimensional gel electrophoresis, is digested by enzymatic or chemical means and the masses of the resulting peptides are determined by mass spectrometry. The resulting mass…
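The general idea of matching measured peptide masses against a sequence database can be sketched as follows. Everything specific here is an assumption made for illustration: trypsin-like cleavage (after K or R, not before P), standard monoisotopic residue masses, a fixed mass tolerance, a simple hit count as the score, and the toy sequences used in the usage note. The abstract is truncated before it describes the actual scoring, so this is not the authors' method.

```python
import re

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_peptides(seq):
    """Cleave after K or R unless the next residue is P (assumed digestion rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", seq) if p]

def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide: residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def count_matches(observed, seq, tol=0.5):
    """Number of observed masses within tol of some theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(seq)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)

def best_candidate(observed, database, tol=0.5):
    """Name of the database entry explaining the most observed masses."""
    return max(database, key=lambda name: count_matches(observed, database[name], tol))
```

With a hypothetical two-entry database `{"protA": "AKRPGK", "protB": "GGGKAAAR"}` and observed masses equal to the digest of `protB`, `best_candidate` picks `protB`. A realistic scorer would weight matches by how likely they are to occur by chance rather than simply counting them.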
MOTIVATION: We announce the availability of the second release of Darwin v. 2.0, an interpreted computer language especially tailored to researchers in the biosciences. The system is a general tool applicable to a wide range of problems. RESULTS: This second release improves Darwin version 1.6 in several ways: it now contains (1) a larger set of libraries…
As part of a study of the general issue of complexity of comparison based problems, as well as interest in the specific problem, we consider the task of performing the basic priority queue operations on a heap. We show that in the worst case: lg lg n + O(1) comparisons are necessary and sufficient to insert an element into a heap. (This improves the…
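One way to see why about lg lg n comparisons can suffice for insertion: the keys on the path from a new leaf to the root are already sorted, and that path has about lg n nodes, so the new key's position can be found by binary search. The sketch below does this for a 0-indexed array min-heap; the function name `heap_insert` and the comparison counter are assumptions of this example, and only key comparisons are reduced (the number of element moves is still proportional to the path length).

```python
def heap_insert(heap, key):
    """Insert key into a 0-indexed binary min-heap; return key comparisons used."""
    heap.append(key)
    # Indices of the ancestors of the new leaf, from its parent up to the root.
    path = []
    i = len(heap) - 1
    while i > 0:
        i = (i - 1) // 2
        path.append(i)
    # Keys along `path` are non-increasing toward the root, so the ancestors
    # larger than `key` form a prefix of `path`. Binary search for its length.
    lo, hi = 0, len(path)
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if heap[path[mid]] > key:
            lo = mid + 1
        else:
            hi = mid
    # Shift the `lo` larger ancestors down one level and place the key.
    pos = len(heap) - 1
    for k in range(lo):
        heap[pos] = heap[path[k]]
        pos = path[k]
    heap[pos] = key
    return comparisons
```

For a heap of about 1000 elements the path has at most 9 ancestors, so each insertion uses at most 4 key comparisons, compared with up to 9 for the usual sift-up.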