The Dyck Language Edit Distance Problem in Near-Linear Time

@article{Saha2014TheDL,
  title={The Dyck Language Edit Distance Problem in Near-Linear Time},
  author={Barna Saha},
  journal={2014 IEEE 55th Annual Symposium on Foundations of Computer Science},
  year={2014},
  pages={611-620}
}
  • B. Saha
  • Published 18 October 2014
  • Computer Science
  • 2014 IEEE 55th Annual Symposium on Foundations of Computer Science
Given a string σ over alphabet Σ and a grammar G defined over the same alphabet, how many minimum number of repairs (insertions, deletions and substitutions) are required to map σ into a valid member of G? The seminal work of Aho and Peterson in 1972 initiated the study of this language edit distance problem providing a dynamic programming algorithm for context free languages that runs in O(|G|2n3) time, where n is the string length and G is the grammar size. While later improvements reduced… Expand
Faster Language Edit Distance, Connection to All-pairs Shortest Paths and Related Problems
  • B. Saha
  • Mathematics, Computer Science
  • ArXiv
  • 2014
TLDR
It is shown that exact computation of language edit distance in true sub-cubic time will imply a truly sub- cubic algorithm for all-pairs shortest paths, a long-standing open question and result in a breakthrough for a large range of problems in graphs and matrices due to sub-Cubic equivalence. Expand
Language Edit Distance and Maximum Likelihood Parsing of Stochastic Grammars: Faster Algorithms and Connection to Fundamental Graph Problems
  • B. Saha
  • Mathematics, Computer Science
  • 2015 IEEE 56th Annual Symposium on Foundations of Computer Science
  • 2015
TLDR
This paper gives the first such algorithm that computes language edit distance almost optimally and designs the very first subcubic (Õ(nω)) algorithm that given an arbitrary stochastic context free grammar, and a string returns a nearly-optimal maximum likelihood parsing of that string. Expand
Language Edit Distance Approximation via Amnesic Dynamic Programming
In 1975, a breakthrough result of L. Valiant showed that parsing context free grammars can be reduced to Boolean matrix multiplication, resulting in a running time of O(n) for parsing where ω ≤ 2.373Expand
Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultralinear Grammars Are Where Parsing Becomes Hard!
TLDR
Additive approximation algorithms for language edit distance are studied, providing two explicit combinatorial algorithms to obtain a string with minimum edit distance with performance dependencies on either the number of non-linear productions, k^*, or theNumber of nested non- linear production, k, used in the optimal derivation. Expand
Sublinear Algorithms for Gap Edit Distance
TLDR
An algorithm for distinguishing whether the edit distance is at most t or at least t^2 (the quadratic gap problem) in time Õ(n/t+t^3). Expand
Improved bounds for testing Dyck languages
TLDR
This paper introduces a new problem called Truestring Equivalence, which is easily reducible to the $2$-type Dyck language property testing problem, and shows a lower bound of $n$ to the power of $1/5$. Expand
Fast & Space-Efficient Approximations of Language Edit Distance and RNA Folding: An Amnesic Dynamic Programming Approach
  • B. Saha
  • Mathematics, Computer Science
  • 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)
  • 2017
TLDR
These are the first algorithms that break the cubic-time barriers of any combinatorial algorithm, and quadratic-space barrier of any algorithm significantly improving upon their long-standing space and time complexities. Expand
An Improved Algorithm for The k-Dyck Edit Distance Problem
TLDR
These algorithms combine several new structural properties of the Dyck edit distance problem, a refined algorithm for fast $(\min,+)$ matrix product, and a careful modification of ideas used in Valiant's parsing algorithm. Expand
If the Current Clique Algorithms are Optimal, So is Valiant's Parser
TLDR
It is proved that any improvement on Valiant' s algorithm, even for constant size grammars, would imply a breakthrough algorithm for the k-Clique problem: given a graph on n nodes, decide if there are k that form a clique. Expand
Approximating Edit Distance within Constant Factor in Truly Sub-Quadratic Time
TLDR
An algorithm with running time Õ(n^2-2/7) that approximates the edit distance within a constant factor poly(log n) is provided. Expand
...
1
2
3
...

References

SHOWING 1-10 OF 38 REFERENCES
Testing membership in parenthesis languages
TLDR
A sublinear-time algorithm for testing all Dyck languages and a testing algorithm for the context free language LREV, where Σ is a fixed alphabet, which almost matches the lower bound given by Alon et al. Expand
Approximating edit distance efficiently
TLDR
Algorithms are developed that solve gap versions of the edit distance problem: given two strings of length n with the promise that their edit distance is either at most k or greater than /spl lscr/, decide which of the two holds and develop an n/sup 3/7/-approximation quasilinear time algorithm. Expand
Incremental String Comparison
TLDR
This paper considers the following incremental version of comparing two sequences A and B to determine their longest common subsequence (LCS) or the edit distance between them, and obtains O(nk) algorithms for the longest prefix approximate match problem, the approximate overlap problem, and cyclic string comparison. Expand
Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity
TLDR
The lower bound is the first to expose hardness of edit distance stemming from the input strings being ``repetitive'', which means that many of their substrings are approximately identical, and provides the first rigorous separation between edit distance and Ulam distance. Expand
Approximately Matching Context-Free Languages
  • E. Myers
  • Mathematics, Computer Science
  • Inf. Process. Lett.
  • 1995
TLDR
An O(PN2(N + log P)) algorithm for approximately matching a string of length N and a context-free language specified by a grammar of size P is given, which generalizes the Cocke-Younger-Kasami algorithm for determining membership in a context -free language. Expand
Approximating edit distance in near-linear time
TLDR
This is the first sub-polynomial approximation algorithm for this problem that runs in near-linear time, improving on the state-of-the-art n<sup>(1/3+o(1)) approximation. Expand
Regular languages are testable with a constant number of queries
TLDR
This paper discusses testability of more complex languages and shows that the query complexity required for testing context free languages cannot be bounded by any function of /spl epsiv/. Expand
Recognizing well-parenthesized expressions in the streaming model
TLDR
It is proved that this one-pass algorithm for Dyck(2) is optimal, up to a log(n) factor, even when two-sided error is allowed, and conjecture that a similar bound holds for any constant number of passes over the input. Expand
An Introduction to Formal Language Theory
TLDR
This volume intended to serve as a text for upper undergraduate and graduate level students and special emphasis is given to the role of algebraic techniques in formal language theory through a chapter devoted to the fixed point approach to the analysis of context-free languages. Expand
Streaming Algorithms for Recognizing Nearly Well-Parenthesized Expressions
TLDR
It is proved that any randomized one-pass algorithm that makes error at most k/n requires at least Ω(k log(n/k))) space to accept strings which are exactly k-away from strings in 1-turn-Dyck2 and to reject strings which is exactly k + 2-awayfrom strings in1- turn-DYck2. Expand
...
1
2
3
4
...