A guided tour to approximate string matching
TLDR
This work surveys current techniques for string matching that allows errors, focusing on online searching and mostly on edit distance. It explains the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms.
Searching in metric spaces
TLDR
This work presents a unified view of all the known proposals to organize metric spaces, so that they can be understood under a common framework, and gives a quantitative definition of the elusive concept of "intrinsic dimensionality".
Compressed full-text indexes
TLDR
The relationship between text entropy and the regularities that show up in index structures and permit compressing them is explained, and the most relevant self-indexes are covered, focusing on how they exploit text compressibility to achieve compact structures that can efficiently solve various search problems.
Flexible pattern matching in strings - practical on-line search algorithms for texts and biological sequences
TLDR
This book presents a practical approach to string matching problems, focusing on the algorithms and implementations that perform best in practice, and includes all of the most significant new developments in complex pattern searching.
Effective Proximity Retrieval by Ordering Permutations
TLDR
A new probabilistic proximity search algorithm for range and K-nearest neighbor (K-NN) searching in both coordinate and metric spaces is introduced; it predicts closeness between elements according to how they order their distances toward a distinguished set of anchor objects.
Fast and flexible string matching by combining bit-parallelism and suffix automata
TLDR
A new automaton to recognize suffixes of patterns with classes of characters is introduced. It seems well suited to computational biology applications, since it is the fastest algorithm to search on DNA sequences and flexible searching is an important problem in that area.
A compact space decomposition for effective metric indexing
TLDR
This paper presents a simple index called list of clusters (LC), which is shown to require little space, to be suitable both for main and secondary memory implementations, and most importantly to be very resistant to the intrinsic dimensionality of the data set.
Fully Functional Static and Dynamic Succinct Trees
TLDR
A simple and flexible data structure is proposed, called the range min-max tree, that reduces the large number of relevant tree operations considered in the literature to a few primitives that are carried out in constant time on polylog-sized trees.
Storage and Retrieval of Highly Repetitive Sequence Collections
TLDR
New static and dynamic full-text indexes are developed that are able to capture the fact that a collection is highly repetitive, and that require space basically proportional to the length of one typical sequence plus the total number of edit operations.
Compressed representations of sequences and full-text indexes
TLDR
A compressed representation of integer sequences is combined with a compression boosting technique to design compressed full-text indexes that scale well with the size of the input alphabet Σ; the FM-index presented is the first that removes the alphabet-size dependence from all query times.