A new class of metrics appropriate for measuring effective similarity relations between sequences, say one type of similarity per metric, is studied. We propose a new "normalized information… (More)
MOTIVATION
Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth… (More)
We give the rst non-trivial approximation algorithms for the Steiner tree problem and the generalized Steiner network problem on general directed graphs. These problems have several applications in… (More)
Genome informatics. International Conference on…
2003
Extending the single optimized spaced seed of PatternHunter to multiple ones, PatternHunter II simultaneously remedies the lack of sensitivity of Blastn and the lack of speed of Smith-Waterman, for… (More)
MOTIVATION
Traditional sequence distances require an alignment and therefore are not directly applicable to the problem of whole genome phylogeny where events such as rearrangements make full length… (More)
This paper presents a collection of string algorithms that are at the core of several biological problems such as discovering potential drug targets, creating diagnostic probes, universal primers or… (More)
We consider the following problem: given a collection of strings <italic>s<subscrpt>1</subscrpt></italic>,…, <italic>s<subscrpt>m</subscrpt></italic>, find the shortest string <italic>s</italic> such… (More)
This paper presents a novel linear programming approach to do protein 3-dimensional (3D) structure prediction via threading. Based on the contact map graph of the protein 3D structure template, the… (More)