Sparse Dynamic Programming for Longest Common Subsequence from Fragments

@article{Baker2002SparseDP,
  title={Sparse Dynamic Programming for Longest Common Subsequence from Fragments},
  author={Brenda S. Baker and Raffaele Giancarlo},
  journal={J. Algorithms},
  year={2002},
  volume={42},
  pages={231-254}
}
Sparse Dynamic Programming has emerged as an essential tool for the design of efficient algorithms for optimization problems coming from such diverse areas as computer science, computational biology, and speech recognition. We provide a new sparse dynamic programming technique that extends the Hunt-Szymanski paradigm for the computation of the longest common subsequence (LCS) and apply it to solve the LCS from Fragments problem: given a pair of strings X and Y (of length n and m, respectively… 
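
For orientation, the Hunt-Szymanski paradigm that the abstract refers to can be sketched in a few lines of Python. This is a minimal illustration of the classic threshold-based LCS length computation, not the fragment-based technique of the paper; the function name `lcs_length_hunt_szymanski` and the variable names are chosen here for illustration only. The sketch visits only the matching position pairs of the two strings, so its running time is roughly proportional to the number of matches times a logarithmic factor, plus the time to index the second string.

```python
from bisect import bisect_left
from collections import defaultdict


def lcs_length_hunt_szymanski(x: str, y: str) -> int:
    """Return the length of a longest common subsequence of x and y.

    Illustrative sketch of the threshold approach; names are hypothetical.
    """
    # positions[c] lists the indices j with y[j] == c, in increasing order.
    positions = defaultdict(list)
    for j, ch in enumerate(y):
        positions[ch].append(j)

    # thresholds[k] is the smallest end index in y of a common subsequence
    # of length k + 1 found so far; the list stays strictly increasing.
    thresholds = []
    for ch in x:
        # Visit the matches for this character from right to left so that
        # a single character of x is never used twice in one subsequence.
        for j in reversed(positions.get(ch, ())):
            k = bisect_left(thresholds, j)
            if k == len(thresholds):
                thresholds.append(j)  # extends the longest subsequence
            else:
                thresholds[k] = j     # finds a smaller end index for length k + 1
    return len(thresholds)
```

Calling `lcs_length_hunt_szymanski("ABCBDAB", "BDCABA")` returns 4. Only positions where the two strings match are touched, which is what makes the approach sparse.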

Application of the A* Algorithm to Solve the Longest Common Subsequence from Fragments Problem
TLDR
This study proposes a new method for the LCS from fragments problem that uses a tree-searching strategy, the A* algorithm; it helps filter out fragments that cannot appear in any solution and finds a solution efficiently.
Effective Sparse Dynamic Programming Algorithms for Merged and Block Merged LCS Problems
TLDR
A hybrid algorithm is constructed that combines the advantages of the proposed algorithms and previous state-of-the-art algorithms to give the best result in every case, in terms of both time and space efficiency.
New efficient algorithms for the merged LCS problem with and without block constraints using sparse dynamic programming
  • A. Rahman, M. S. Rahman
  • Computer Science
    2012 15th International Conference on Computer and Information Technology (ICCIT)
  • 2012
TLDR
The longest common subsequence problem has been widely studied and is used to determine the relationship between sequences; an algorithm is proposed for a variation of the problem in which a block constraint arises.
$LCSk$++: Practical similarity metric for long strings
TLDR
A new metric for measuring the similarity of long strings is presented, together with an algorithm that computes $LCSk$++ with complexity $O((|X|+|Y|)\log(|X|+|Y|))$ for strings $X$ and $Y$ under a realistic random model.
Algorithms for Computing the Longest Parameterized Common Subsequence
TLDR
This paper defines a generalization of variants of LCS, the longest parameterized common subsequence (LPCS) problem, and shows how to solve it in O(n^2) and O(n + R log n) time.
Fast and simple algorithms for computing both $LCS_{k}$ and $LCS_{k+}$
TLDR
A single algorithm is presented that computes both $LCS_k$ and $LCS_{k+}$, outperforms the state-of-the-art algorithms in terms of runtime complexity, and requires only basic data structures.
A basic analysis toolkit for biological sequences
TLDR
This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks, that includes algorithms for string matching and alignment problems, and consists of C/C++ library functions as well as Perl library functions.

References

SHOWING 1-10 OF 26 REFERENCES
A fast algorithm for computing longest common subsequences
TLDR
An algorithm for finding the longest common subsequence of two sequences of length n which has a running time of O((r + n) log n), where r is the total number of ordered pairs of positions at which the two sequences match.
Chaining multiple-alignment fragments in sub-quadratic time
TLDR
This work describes a multiple-sequence alignment algorithm for determining the highest-scoring alignment that can be obtained by chaining together non-overlapping subalignments selected from a given collection of such 'fragments', making it the first sub-quadratic sparse dynamic programming algorithm for the case K > 2.
Sparse dynamic programming II: convex and concave cost functions
Dynamic programming solutions to two recurrence equations, used to compute a sequence alignment from a set of matching fragments between two strings, and to predict RNA secondary structure, are
String Editing and Longest Common Subsequences
TLDR
Serial and parallel algorithmic solutions for the string editing problem for input strings x and y are described, which models a variety of problems arising in such diverse areas as text and speech processing, geology and, last but not least, molecular biology.
Parameterized Pattern Matching: Algorithms and Applications
  • B. Baker
  • Computer Science
    J. Comput. Syst. Sci.
  • 1996
TLDR
This paper investigates parameterized pattern matching via parameterized suffix trees (p-suffix trees) and gives two algorithms for constructing p-suffix trees: one that runs in linear time for fixed alphabets, and another that uses auxiliary data structures and runs in O(n log(n)) time for variable alphabets, where n is input length.
Sparse dynamic programming I: linear cost functions
TLDR
Dynamic programming solutions to a number of different recurrence equations for sequence comparison and for RNA secondary structure prediction are considered, when the weight functions used in the recurrences are taken to be linear.
Sparse Dynamic Programming for Evolutionary-Tree Comparison
TLDR
This paper shows that the maximum agreement-subtree problem reduces to unary weighted bipartite matching ($\UWBM$) with an $O(n^{1+o(1)})$ additive overhead and reduces linearly to $\MAST$, and shows that this algorithm is optimal unless $\UWBM$ can be solved in near linear time.
Introduction to computational biology - maps, sequences, and genomes: interdisciplinary statistics
TLDR
This chapter discusses mapping with Real Data Cloning and Clone Libraries, and physical maps and clone libraries, and the challenges faced in mapping with real data.
Serial computations of Levenshtein distances
TLDR
This chapter focuses on the problem of evaluating a longest common subsequence, which is expressively equivalent to the simple form of the Levenshtein distance.