Gonzalo Navarro

Learn More
We survey the current techniques to cope with the problem of string matching that allows errors. This is becoming a more and more relevant issue for many fast growing areas such as information retrieval and computational biology. We focus on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its(More)
The problem of searching the elements of a set that are close to a given query element under some similarity criterion has a vast number of applications in many branches of computer science, from pattern recognition to textual and multimedia information retrieval. We are interested in the rather general case where the similarity criterion defines a metric(More)
Full-text indexes provide fast substring search over large text collections. A serious problem of these indexes has traditionally been their space consumption. A recent trend is to develop indexes that exploit the compressibility of the text, so that their size is a function of the compressed text length. This concept has evolved into <i>self-indexes</i>,(More)
Now, we come to offer you the right catalogues of book to open. flexible pattern matching in strings practical on line search algorithms for texts and biological sequences is one of the literary work in this world in suitable to be reading material. That's not only this book gives reference, but also it will show you the amazing benefits of reading a book.(More)
We present a new indexing method for the approximate string matching problem. The method is based on a suffix array combined with a partitioning of the pattern. We analyze the resulting algorithm and show that the average retrieval time is , for some that depends on the error fraction tolerated and the alphabet size . It is shown that for approximately ,(More)
The metric space model abstracts many proximity search problems, from nearest-neighbor classifiers to textual and multimedia information retrieval. In this context, an index is a data structure that speeds up proximity queries. However, indexes lose their efficiency as the intrinsic data dimensionality increases. In this paper we present a simple index(More)
We propose a new data structure to search in metric spaces. A metric space is formed by a collection of objects and a distance function defined among them which satisfies the triangle inequality. The goal is, given a set of objects and a query, retrieve those objects close enough to the query. The complexity measure is the number of distances computed to(More)
Given a sequence <i>S</i> &equals; <i>s</i><sub>1</sub><i>s</i><sub>2</sub>&#8230;<i>s</i><sub><i>n</i></sub> of integers smaller than <i>r</i> &equals; <i>O</i>(polylog(<i>n</i>)), we show how <i>S</i> can be represented using <i>nH</i><sub>0</sub>(<i>S</i>) &plus; <i>o</i>(<i>n</i>) bits, so that we can know any <i>s</i><sub><i>q</i></sub>, as well as(More)
We introduce a new probabilistic proximity search algorithm for range and A"-nearest neighbor (A"-NN) searching in both coordinate and metric spaces. Although there exist solutions for these problems, they boil down to a linear scan when the space is intrinsically high dimensional, as is the case in many pattern recognition tasks. This, for example, renders(More)
With few exceptions, proximity search algorithms in metric spaces based on the use of pivots select them at random among the objects of the metric space. However, it is well known that the way in which the pivots are selected can drastically affect the performance of the algorithm. Between two sets of pivots of the same size, better chosen pivots can(More)