The problem of searching the elements of a set that are close to a given query element under some similarity criterion has a vast number of applications in many branches of computer science, from pattern recognition to textual and multimedia information retrieval. We are interested in the rather general case where the similarity criterion defines a metric…

We survey the current techniques to cope with the problem of string matching that allows errors. This is becoming a more and more relevant issue for many fast growing areas such as information retrieval and computational biology. We focus on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its…
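As a rough illustration of the edit-distance formulation mentioned in this abstract, the sketch below shows the textbook dynamic-programming recurrence and its online search variant, in which a match may start at any text position. The function names `edit_distance` and `approx_search` are hypothetical and chosen for this example; this is the classical quadratic-time baseline, not any specific algorithm from the survey.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def approx_search(pattern: str, text: str, k: int) -> list[int]:
    """0-based end positions in text where pattern matches with at most k errors."""
    m = len(pattern)
    # col[i] = fewest errors to match pattern[:i] against some suffix of the text read so far
    col = list(range(m + 1))
    ends = []
    for pos, ch in enumerate(text):
        new = [0]  # the empty pattern prefix matches anywhere at no cost
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new.append(min(col[i] + 1, new[i - 1] + 1, col[i - 1] + cost))
        col = new
        if col[m] <= k:
            ends.append(pos)
    return ends
```

For example, `approx_search("abc", "xabdx", 1)` reports the end positions 2 and 3, since both `ab` and `abd` are within one error of the pattern.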

We present nrgrep ("nondeterministic reverse grep"), a new pattern matching tool designed for efficient search of complex patterns. Unlike previous tools of the grep family, such as agrep and Gnu grep, nrgrep is based on a single and uniform concept: the bit-parallel simulation of a nondeterministic suffix automaton. As a result, nrgrep can find from simple…
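The idea of simulating a nondeterministic suffix automaton with bit operations can be sketched for the simplest case, a plain exact pattern, using the BNDM scheme. This is only an illustration under that assumption: the function name `bndm_search` is hypothetical, the code is not taken from nrgrep, and it covers none of the complex patterns the tool supports.

```python
def bndm_search(pattern: str, text: str) -> list[int]:
    """Start positions of exact occurrences of pattern in text (BNDM-style scan)."""
    m, n = len(pattern), len(text)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    # masks[c] has bit (m-1-i) set when pattern[i] == c
    masks: dict[str, int] = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << (m - 1 - i))
    full = (1 << m) - 1
    high = 1 << (m - 1)
    occurrences = []
    pos = 0
    while pos <= n - m:
        j = m      # characters of the window still unread (read right to left)
        last = m   # shift to apply once this window is finished
        state = full
        while state:
            state &= masks.get(text[pos + j - 1], 0)
            j -= 1
            if state & high:
                # the part of the window read so far is a prefix of the pattern
                if j > 0:
                    last = j
                else:
                    occurrences.append(pos)
                    break
            state = (state << 1) & full
        pos += last
    return occurrences
```

The window is read backwards, and the scan stops as soon as the characters read are not a substring of the pattern, which is what lets the search skip text characters. Python integers are unbounded, so the explicit `full` mask stands in for the fixed machine word a real implementation would use.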

Full-text indexes provide fast substring search over large text collections. A serious problem of these indexes has traditionally been their space consumption. A recent trend is to develop indexes that exploit the compressibility of the text, so that their size is a function of the compressed text length. This concept has evolved into self-indexes,…

The metric space model abstracts many proximity search problems, from nearest-neighbor classifiers to textual and multimedia information retrieval. In this context, an index is a data structure that speeds up proximity queries. However, indexes lose their efficiency as the intrinsic data dimensionality increases. In this paper we present a simple index…

We propose a new data structure to search in metric spaces. A metric space is formed by a collection of objects and a distance function defined among them, which satisfies the triangular inequality. The goal is, given a set of objects and a query, retrieve those objects close enough to the query. The number of distances computed to achieve this goal is the…
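To make the role of the triangle inequality concrete, the following sketch filters candidates for a range query using precomputed distances to a few pivots. It is a generic pivot-table technique shown under stated assumptions, not the data structure this paper proposes; the names `build_pivot_table` and `range_search` are hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
Dist = Callable[[T, T], float]


def build_pivot_table(objects: Sequence[T], pivots: Sequence[T],
                      dist: Dist) -> list[list[float]]:
    """Precompute the distance from every object to every pivot."""
    return [[dist(obj, p) for p in pivots] for obj in objects]


def range_search(query: T, radius: float, objects: Sequence[T],
                 pivots: Sequence[T], table: list[list[float]],
                 dist: Dist) -> tuple[list[T], int]:
    """Return the objects within radius of query and the number of distances computed."""
    q_to_p = [dist(query, p) for p in pivots]
    computed = len(pivots)
    results = []
    for obj, row in zip(objects, table):
        # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o) for every pivot p,
        # so if the lower bound already exceeds the radius, o cannot be an answer.
        if any(abs(dq - do) > radius for dq, do in zip(q_to_p, row)):
            continue
        computed += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, computed
```

With points on a line and `dist = lambda a, b: abs(a - b)`, querying for 8 with radius 2 over `[0, 3, 7, 10, 15]` and the single pivot 0 computes three distances instead of five. The count of distance evaluations is returned because it is the cost measure the abstract refers to.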

We present a new algorithm for on-line approximate string matching. The algorithm is based on the simulation of a non-deterministic finite automaton built from the pattern and using the text as input. This simulation uses bit operations on a RAM machine with word length w = (log n) bits, where n is the text size. This is essentially similar to the model used…
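A minimal sketch of what simulating the pattern automaton with bit operations can look like is given below. It uses the classic row-wise scheme, one bit vector per error level, in the style known from agrep, and is explicitly not the state packing this paper introduces. The name `bitparallel_approx_search` is hypothetical, and Python's unbounded integers replace the fixed machine word assumed in the abstract.

```python
def bitparallel_approx_search(pattern: str, text: str, k: int) -> list[int]:
    """0-based end positions in text where pattern matches with at most k errors."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    # masks[c] has bit i set when pattern[i] == c
    masks: dict[str, int] = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)
    full = (1 << m) - 1
    accept = 1 << (m - 1)
    # rows[e] bit i set: pattern[:i+1] matches a suffix of the text read so far with <= e errors.
    # Before any text is read, pattern[:i+1] is reachable with i+1 deletions.
    rows = [((1 << e) - 1) & full for e in range(k + 1)]
    ends = []
    for pos, ch in enumerate(text):
        mask = masks.get(ch, 0)
        old_prev = rows[0]
        rows[0] = ((rows[0] << 1) | 1) & mask
        new_prev = rows[0]
        for e in range(1, k + 1):
            old_cur = rows[e]
            rows[e] = ((((old_cur << 1) | 1) & mask)  # matching character
                       | old_prev                     # extra character in the text
                       | ((old_prev << 1) | 1)        # substituted character
                       | ((new_prev << 1) | 1)        # pattern character skipped
                       ) & full
            old_prev, new_prev = old_cur, rows[e]
        if rows[k] & accept:
            ends.append(pos)
    return ends
```

Each text character costs a constant number of word operations per error level, so the whole scan takes on the order of k·n operations when the pattern fits in one machine word.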

We introduce a new probabilistic proximity search algorithm for range and K-nearest neighbor (K-NN) searching in both coordinate and metric spaces. Although there exist solutions for these problems, they boil down to a linear scan when the space is intrinsically high-dimensional, as is the case in many pattern recognition tasks. This, for example, renders…

A model to query document databases by both their content and structure is presented. The goal is to obtain a query language that is expressive in practice while being efficiently implementable, features not present at the same time in previous work. The key ideas of the model are a set-oriented query language based on operations on nearby structure…