Absent Subsequences in Words

Maria Kosche, Tore Koß, Florin Manea, Stefan Siemer
An absent factor of a string w is a string u which does not occur as a contiguous substring (a.k.a. factor) inside w. We extend this well-studied notion and define absent subsequences: a string u is an absent subsequence of a string w if u does not occur as a subsequence (a.k.a. scattered factor) inside w. Of particular interest to us are minimal absent subsequences, i.e., absent subsequences whose every proper subsequence is not absent, and shortest absent subsequences, i.e., absent subsequences of…
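The definitions in the abstract can be illustrated with a small Python sketch. This is a naive check written directly from the definitions, not the algorithms of the paper; the function names are hypothetical and chosen only for illustration. Minimality is tested by deleting one letter at a time, which suffices because every proper subsequence of u is a subsequence of some one-letter deletion of u.

```python
def is_subsequence(u: str, w: str) -> bool:
    """Return True if u occurs in w as a subsequence (scattered factor)."""
    it = iter(w)
    # 'c in it' advances the iterator, so letters are matched left to right.
    return all(c in it for c in u)


def is_absent_subsequence(u: str, w: str) -> bool:
    """u is an absent subsequence of w if it does not occur as a subsequence."""
    return not is_subsequence(u, w)


def is_minimal_absent_subsequence(u: str, w: str) -> bool:
    """u is absent from w, but every one-letter deletion of u occurs in w."""
    if is_subsequence(u, w):
        return False
    return all(is_subsequence(u[:i] + u[i + 1:], w) for i in range(len(u)))


if __name__ == "__main__":
    w = "abab"
    for u in ["aba", "bba", "aaa", "bbab"]:
        print(u, is_absent_subsequence(u, w), is_minimal_absent_subsequence(u, w))
```

For w = "abab", the word "bba" is absent and minimal, while "bbab" is absent but not minimal, since deleting its last letter leaves "bba", which is still absent.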

Subsequences in Bounded Ranges: Matching and Analysis Problems

In this paper, we consider a variant of the classical algorithmic problem of checking whether a given word v is a subsequence of another word w. More precisely, we consider the problem of

Computing Longest (Common) Lyndon Subsequences

This work proposes algorithms for finding the longest common Lyndon subsequence of two strings of length n in O(n) time with O(n) space, or online in O(nσ) space and time.

Longest (Sub-)Periodic Subsequence

An algorithm computing the longest periodic subsequence of a string of length n in O(n^7) time with O(n^4) words of space is presented, and improvements are obtained when restricting the exponents or extending the search.

m-Nearly k-Universal Words - Investigating Simon Congruence

Determining the index of the Simon congruence is a long outstanding open problem. Two words u and v are called Simon congruent if they have the same set of scattered factors, which are parts of the

Combinatorial Algorithms for Subsequence Matching: A Survey

In this paper we provide an overview of a series of recent results regarding algorithms for searching for subsequences in words or for the analysis of the sets of subsequences occurring in a word.



Absent words in a sliding window with applications

Parallelising the Computation of Minimal Absent Words

Experimental results show that a multiprocessing implementation of this algorithm can accelerate the overall computation by more than a factor of two compared to state-of-the-art approaches, and it is shown that the implementation achieves near-optimal speed-ups.

Constructing Strings Avoiding Forbidden Substrings

These algorithms are motivated by data privacy, and in particular, by the data sanitization process, and can be directly applied to solve the reachability and shortest path problems in complete de Bruijn graphs in the presence of forbidden edges or of forbidden paths.

Linear-time computation of minimal absent words using suffix array

A new linear-time and linear-space algorithm for the computation of minimal absent words based on the suffix array is presented and experimental results show that this implementation outperforms the one by Pinho et al.

Words and forbidden factors

Using minimal absent words to build phylogeny

Computing DAWGs and Minimal Absent Words in Linear Time for Integer Alphabets

The directed acyclic word graph (DAWG) of a string y is the smallest (partial) DFA which recognizes all suffixes of y and has only O(n) nodes and edges. It is shown that the set MAW(y) of all minimal absent words of y can be computed in optimal O(n + |MAW(y)|) time and O(n) working space for integer alphabets.
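To make the set MAW(y) concrete, here is a brute-force Python sketch. It uses the standard definition: a word u is a minimal absent word of y if u is not a factor of y, but both u without its first letter and u without its last letter are factors of y. This is not the DAWG-based linear-time method summarized above; it enumerates all factors explicitly and is only suitable for short strings. The function name and the optional `alphabet` parameter are assumptions made for illustration.

```python
def minimal_absent_words(y: str, alphabet=None) -> list[str]:
    """Brute-force MAW(y): absent words whose longest proper prefix and
    longest proper suffix both occur in y as factors."""
    sigma = sorted(set(y)) if alphabet is None else sorted(alphabet)
    n = len(y)
    # All factors of y, including the empty word.
    factors = {y[i:j] for i in range(n + 1) for j in range(i, n + 1)}
    maws = set()
    # Every minimal absent word is a factor extended by one letter.
    for f in factors:
        for c in sigma:
            u = f + c
            if u not in factors and u[1:] in factors:
                maws.add(u)
    return sorted(maws)


if __name__ == "__main__":
    print(minimal_absent_words("abaab", alphabet="ab"))
```

Passing a larger alphabet than the letters of y also reports single letters that never occur in y, since their only proper factor is the empty word.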

Internal Shortest Absent Word Queries

An O((n/k) · log log_σ n)-size data structure is presented, which can be constructed in O(n log_σ n) time and answers queries in O(log log_σ k) time.

Alignment-free sequence comparison using absent words