# Longest Common Prefixes with k-Errors and Applications

@inproceedings{Ayad2018LongestCP,
title={Longest Common Prefixes with k-Errors and Applications},
author={Lorraine A. K. Ayad and Panagiotis Charalampopoulos and Costas S. Iliopoulos and Solon P. Pissis},
booktitle={SPIRE},
year={2018}
}
• Published in SPIRE 13 January 2018
• Computer Science
Although real-world text datasets, such as DNA sequences, are far from being uniformly random, average-case string searching algorithms perform significantly better than worst-case ones in most applications of interest. […] We show that our technique is applicable to several algorithmic problems in computational biology and elsewhere.
15 Citations

### Faster Algorithms for Longest Common Substring

• Computer Science
ESA
• 2021
An $O(n \log^{k-1/2} n)$-time algorithm is shown, which stems from a recursive heavy-path decomposition technique that was first introduced in the seminal paper of Cole et al.

### Time-Space Tradeoffs for Finding a Long Common Substring

• Computer Science
CPM
• 2020
A significant speed-up is obtained for instances where the length of the sought LCS is large, based on techniques originating from the LCS with Mismatches problem, on space-efficient locally consistent parsing, and on the structure of maximal repetitions in the input documents.

### Longest Property-Preserved Common Factor

• Mathematics, Computer Science
SPIRE
• 2018
This paper considers two fundamental string properties, square-free factors and periodic factors, under two different settings (one per property), and presents linear-time solutions for both settings.

### Linear-Time Algorithm for Long LCF with k Mismatches

• Computer Science, Mathematics
CPM
• 2018
In the Longest Common Factor with $k$ Mismatches (LCF$_k$) problem, we are given two strings $X$ and $Y$ of total length $n$, and we are asked to find a pair of maximal-length factors, one of $X$ and

### Faster Algorithms for 1-Mappability of a Sequence

• Computer Science, Mathematics
COCOA
• 2017
Two new algorithms that require worst-case time and space for integer alphabets of size $m=\Omega(\log_\sigma n)$ are presented, thus greatly improving the state of the art.

### supporting time-optimal queries with $O(\log^2 n)$ time for updates

• Computer Science, Mathematics
• 2019
The techniques developed can be applied to obtain fully dynamic algorithms for all of the analogously restricted dynamic variants of problems on strings and are applied to computing the solution for a string with a given set of k edits, which leads to answering internal queries on a string.

### Dynamic and Internal Longest Common Substring

• Materials Science
Algorithmica
• 2020
The first solution to the fully dynamic LCS problem requiring sublinear time in n per edit operation is presented, and dynamic sublinear-time algorithms for both the longest palindrome and Lyndon factorization of a string after a single edit operation are developed.

### Pattern Masking for Dictionary Matching

• Computer Science
ISAAC
• 2021
It is shown, through a reduction from the well-known $k$-Clique problem, that a decision version of the PMDM problem is NP-complete, even for strings over a binary alphabet.

### SMART: SuperMaximal approximate repeats tool

• Biology
Bioinform.
• 2020
This talk will present SMART, a tool based on recent algorithmic advances implemented in C++ to compute supermaximal k-mismatch repeats directly and show that the elements SMART outputs are statistically much more significant than the output of the state-of-the-art tools.

## References

### Longest Common Prefixes with k-Mismatches and Applications

• Computer Science
SOFSEM
• 2018
The proposed algorithm for computing the longest prefix of each suffix of a given string of length $n$ over a constant-sized alphabet of size $\sigma$ that occurs elsewhere in the string with Hamming distance at most $k$ can be directly applied to the problem of genome mappability.
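To make the quantity in this summary concrete, here is a minimal brute-force sketch in Python. It is not the algorithm of the cited paper: it simply compares every pair of suffixes, so it takes roughly cubic time in the worst case. The function names `lcp_k_mismatches` and `longest_prefix_elsewhere` are hypothetical and chosen only for illustration.

```python
def lcp_k_mismatches(s: str, i: int, j: int, k: int) -> int:
    """Length of the longest common prefix of s[i:] and s[j:]
    when up to k mismatching positions are tolerated."""
    n = len(s)
    length = 0
    mismatches = 0
    while i + length < n and j + length < n:
        if s[i + length] != s[j + length]:
            mismatches += 1
            if mismatches > k:
                break  # the (k+1)-th mismatch ends the prefix
        length += 1
    return length


def longest_prefix_elsewhere(s: str, k: int) -> list[int]:
    """For each suffix s[i:], the length of its longest prefix that occurs
    at some other position j != i with Hamming distance at most k.
    Brute force over all pairs (illustrative only)."""
    n = len(s)
    return [
        max((lcp_k_mismatches(s, i, j, k) for j in range(n) if j != i), default=0)
        for i in range(n)
    ]


if __name__ == "__main__":
    print(longest_prefix_elsewhere("abab", 0))
    print(lcp_k_mismatches("abcabd", 0, 3, 1))
```

Occurrences are allowed to overlap the suffix itself in this sketch; only the starting positions must differ.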

### Longest Common Prefix with Mismatches

An algorithm is proposed that computes, for each text suffix, the length of its longest prefix that occurs elsewhere in the text with at most one mismatch, and a second algorithm is described and analysed that uses a greedy strategy to reduce the amount of computation.

### Longest Common Substring with Approximately k Mismatches

A conditional lower bound based on SETH, implying that there is little hope of improving existing solutions, is shown; a strongly subquadratic-time 2-approximation algorithm for the longest common substring with k mismatches problem is obtained, and conditional hardness of improving its approximation ratio is shown.

### Deterministic Indexing for Packed Strings

• Computer Science
CPM
• 2017
A new string index is created in the deterministic and packed setting such that, given a packed pattern string of length $m$, the authors can support queries in (deterministic) time $O(m/\alpha + \log m + \log\log\sigma)$, where $\alpha = w/\log\sigma$ is the number of characters packed in a word of size $w = \Theta(\log n)$.

### Algorithmic Framework for Approximate Matching Under Bounded Edits with Applications to Sequence Analysis

• Computer Science
RECOMB
• 2018
A novel algorithmic framework is presented for solving approximate sequence matching problems that permit a bounded total number k of mismatches, insertions, and deletions; it is expected to be a broadly applicable theoretical tool and may inspire the design of practical heuristics and software.

### A Provably Efficient Algorithm for the k-Mismatch Average Common Substring Problem

• Computer Science
J. Comput. Biol.
• 2016
This article presents the first provably efficient algorithm for the k-mismatch average common substring (ACS$_k$) problem that takes $O(n)$ space and $O(n \log^k n)$ time in the worst case for any constant k.

### Optimal suffix tree construction with large alphabets

• M. Farach
• Computer Science
Proceedings 38th Annual Symposium on Foundations of Computer Science
• 1997
This work builds suffix trees in linear time for integer alphabets, closing the gap left by Weiner's algorithm, which matches a trivial $\Omega(n \log n)$-time lower bound based on sorting in the comparison model.

### Suffix arrays: a new method for on-line string searches

• Computer Science
SODA '90
• 1990
A new and conceptually simple data structure, called a suffix array, for on-line string searches is introduced in this paper, and it is believed that suffix arrays will prove to be better in practice than suffix trees for many applications.
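As an illustration of the data structure named here, the following Python sketch builds a suffix array by directly sorting suffixes and then locates a pattern with two binary searches. This is a simplified teaching version, not the construction from the cited paper; the names `build_suffix_array` and `find_occurrences` are hypothetical.

```python
def build_suffix_array(text: str) -> list[int]:
    """Starting positions of all suffixes of text in lexicographic order.
    Naive construction by sorting suffix strings (fine for short inputs)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """All starting positions of pattern in text, found by binary search
    over the suffix array for the block of suffixes prefixed by pattern."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)
    print(find_occurrences(text, sa, "ana"))
```

Each query performs a logarithmic number of prefix comparisons of length at most the pattern length, which is the basic search behavior that makes the structure attractive for on-line searches.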

### kmacs: the k-mismatch average common substring approach to alignment-free sequence comparison

• Biology, Computer Science
Bioinform.
• 2014
This work describes kmacs, an efficient implementation of this idea based on generalized enhanced suffix arrays, and presents a greedy heuristic to approximate the length of such k-mismatch substrings by considering longest common substrings with k mismatches.
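The greedy idea in this summary can be sketched as follows, under the assumption that the heuristic first finds a longest exact match starting at a given position and then extends it without gaps until the allowed number of mismatches is exhausted. The Python below is a brute-force illustration of that assumption, not the kmacs implementation (which relies on generalized enhanced suffix arrays); the function names are hypothetical.

```python
def longest_exact_match(s1: str, i: int, s2: str) -> tuple[int, int]:
    """Length and position j in s2 of a longest exact match between
    a prefix of s1[i:] and a substring of s2 (j = -1 if none)."""
    best_len, best_j = 0, -1
    for j in range(len(s2)):
        length = 0
        while (i + length < len(s1) and j + length < len(s2)
               and s1[i + length] == s2[j + length]):
            length += 1
        if length > best_len:
            best_len, best_j = length, j
    return best_len, best_j


def greedy_k_mismatch_length(s1: str, i: int, s2: str, k: int) -> int:
    """Greedy estimate: take a longest exact match starting at s1[i],
    then extend it gap-free, tolerating up to k mismatches."""
    length, j = longest_exact_match(s1, i, s2)
    if j < 0:
        return 0  # assumption of this sketch: no exact seed, no extension
    mismatches = 0
    while i + length < len(s1) and j + length < len(s2):
        if s1[i + length] != s2[j + length]:
            if mismatches == k:
                break  # the next mismatch would exceed the budget
            mismatches += 1
        length += 1
    return length


if __name__ == "__main__":
    print(greedy_k_mismatch_length("ACGTAC", 0, "ACGAAC", 1))
```

Because the exact seed is fixed before extension, the result is a lower bound on the true longest k-mismatch match at that position rather than the optimum.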