A guided tour to approximate string matching
- G. Navarro
- Computer Science, CSUR
- 1 March 2001
This work surveys the current techniques to cope with the problem of string matching that allows errors, and focuses on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms.
Searching in metric spaces
- Edgar Chávez, G. Navarro, Ricardo Baeza-Yates, J. Marroquín
- Computer Science, CSUR
- 1 September 2001
This work presents a unified view of all the known proposals to organize metric spaces, so that they can be understood under a common framework, and gives a quantitative definition of the elusive concept of "intrinsic dimensionality".
Compressed full-text indexes
- G. Navarro, V. Mäkinen
- Computer Science, CSUR
- 12 April 2007
The relationship between text entropy and the regularities that show up in index structures and permit compressing them is explained, and the most relevant self-indexes are covered, focusing on how they exploit text compressibility to achieve compact structures that can efficiently solve various search problems.
Flexible pattern matching in strings - practical on-line search algorithms for texts and biological sequences
- G. Navarro, M. Raffinot
- Computer Science
- 27 May 2002
This book presents a practical approach to string matching problems, focusing on the algorithms and implementations that perform best in practice, and includes all of the most significant new developments in complex pattern searching.
Effective Proximity Retrieval by Ordering Permutations
- Edgar Chávez, Karina Figueroa, G. Navarro
- Computer Science, IEEE Transactions on Pattern Analysis and Machine…
- 1 September 2008
A new probabilistic proximity search algorithm for range and K-nearest neighbor (K-NN) searching in both coordinate and metric spaces is introduced; it predicts closeness between elements according to how they order their distances toward a distinguished set of anchor objects.
A compact space decomposition for effective metric indexing
- Edgar Chávez, G. Navarro
- Computer Science, Pattern Recognition Letters
- 1 July 2005
Fast and flexible string matching by combining bit-parallelism and suffix automata
- G. Navarro, M. Raffinot
- Computer Science, JEAL
- 31 December 2000
A new automaton to recognize suffixes of patterns with classes of characters is introduced. It seems well suited to computational biology applications, since it is the fastest algorithm to search on DNA sequences and flexible searching is an important problem in that area.
Fully Functional Static and Dynamic Succinct Trees
- K. Sadakane, G. Navarro
- Computer Science, TALG
- 6 May 2009
A simple and flexible data structure is proposed, called the range min-max tree, that reduces the large number of relevant tree operations considered in the literature to a few primitives that are carried out in constant time on polylog-sized trees.
Storage and Retrieval of Highly Repetitive Sequence Collections
- V. Mäkinen, G. Navarro, Jouni Sirén, Niko Välimäki
- Biology, J. Comput. Biol.
- 8 April 2010
New static and dynamic full-text indexes are developed that are able to capture the fact that a collection is highly repetitive, and that require space basically proportional to the length of one typical sequence plus the total number of edit operations.
Compressed representations of sequences and full-text indexes
- P. Ferragina, G. Manzini, V. Mäkinen, G. Navarro
- Computer Science, TALG
- 1 May 2007
The FM-index presented is the first to remove the alphabet-size dependence from all query times, and the compressed representation of integer sequences is combined with a compression boosting technique to design compressed full-text indexes that scale well with the size of the input alphabet Σ.