On the Construction of Classes of Suffix Trees for Square Matrices: Algorithms and Applications

  • Raffaele Giancarlo, Roberto Grossi
  • Inf. Comput.
Given an n × n TEXT matrix with entries defined over an ordered alphabet σ, we introduce 4^(n−1) classes of index data structures for TEXT. Those indices are informally the two-dimensional analog of the suffix tree of a string [15], allowing on-line searches and statistics to be performed on TEXT. We provide one simple algorithm that efficiently builds any chosen index in those classes in O(n^2 log n) worst-case time using O(n^2) space. The algorithm can be modified to require optimal O(n^2) expected…
On-Line Construction of Two-Dimensional Suffix Trees in O(n^2 log n) Time
The main contribution in this paper is an O(log n) factor improvement in the time complexity of the GG algorithm, making it optimal for unbounded alphabets and leading to a major simplification of the GG algorithm.
A Simple Construction of Two-Dimensional Suffix Trees in Linear Time
A new and simple algorithm to construct two-dimensional suffix trees in linear time by applying the skew scheme to square matrices is proposed.
Generalizations of suffix arrays to multi-dimensional matrices
Linear-Time Construction of Two-Dimensional Suffix Trees
This work presents the first linear-time algorithm for constructing two-dimensional suffix trees, a compacted trie that represents all suffixes of S in two dimensions.
Optimal on-line search and sublinear time update in string matching
  • P. Ferragina, R. Grossi
  • Computer Science
    Proceedings of IEEE 36th Annual Foundations of Computer Science
  • 1995
This work presents the first dynamic algorithm that achieves optimal time to find the occ occurrences of P, and sublinear time per update, i.e. O(√(n+y)), in the worst case.
The Burrows-Wheeler Transform : Ten Years Later
The FM-index is a succinct text index needing only O(H_k n) bits of space, with the constant factor depending only logarithmically on σ, which means in practice for all but very small alphabets, even with huge texts.
One-dimensional and multi-dimensional substring selectivity estimation
This paper uses pruned count-suffix trees (PSTs) as the basic data structure for substring selectivity estimation and presents a novel technique called MO (Maximal Overlap), which is both practical and substantially superior to competing algorithms.
Multi-Dimensional Substring Selectivity Estimation
This paper develops a space- and time-efficient probabilistic algorithm to construct multi-dimensional pruned count-suffix trees directly and demonstrates experimentally, using real data sets, that MO is substantially superior to GNO in the quality of the estimate.
A Note on a Tree-Based 2D Indexing
This work presents the transformation of 2D structures into the form of a tree, preserving the context of each element of the structure, and achieves the properties analogous to the results obtained in tree pattern matching and string indexing.
Two-dimensional pattern matching with rotations
An upper and lower bound on the number of such different possible rotated patterns is proved, given an m × m array and an n × n array over some finite alphabet Σ, yielding an O(n^2 m^3) time algorithm for pattern matching with rotation.


An Index Data Structure For Matrices, with Applications to Fast Two-Dimensional Pattern Matching
It is shown that the s-trees are optimal for space and within a log factor optimal for time.
Suffix arrays: a new method for on-line string searches
A new and conceptually simple data structure, called a suffix array, for on-line string searches is introduced in this paper, and it is believed that suffix arrays will prove to be better in practice than suffix trees for many applications.
A Generalization of the Suffix Tree to Square Matrices, with Applications
A new data structure is described, the Lsuffix tree, which generalizes McCreight's suffix tree for a string to a square matrix and gives efficient algorithms for the static versions of the following dual problems that arise in low-level image processing and visual databases.
Fast Algorithms for Finding Nearest Common Ancestors
An algorithm for a random access machine with uniform cost measure (and a bound of $\Omega (\log n)$ on the number of bits per word) that requires time per query and preprocessing time is presented, assuming that the collection of trees is static.
A Space-Economical Suffix Tree Construction Algorithm
A new algorithm is presented for constructing auxiliary digital search trees to aid in exact-match substring searching. This algorithm has the same asymptotic running time bound as previously
The Design and Analysis of Computer Algorithms
This text introduces the basic data structures and programming techniques often used in efficient algorithms, and covers use of lists, push-down stacks, queues, trees, and graphs.
Fast Parallel and Serial Approximate String Matching
Organization and maintenance of large ordered indices
The index organization described allows retrieval, insertion, and deletion of keys in time proportional to log_k I, where I is the size of the index and k is a device-dependent natural number such that the performance of the scheme becomes near optimal.
An Improved Algorithm for Approximate String Matching
A new algorithm for finding all occurrences of the pattern string in the text string with at most k differences is presented and both its theoretical and practical variants improve the known algorithms.