Ricardo A. Baeza-Yates

Learn More
The problem of searching the elements of a set that are close to a given query element under some similarity criterion has a vast number of applications in many branches of computer science, from pattern recognition to textual and multimedia information retrieval. We are interested in the rather general case where the similarity criterion defines a metric(More)
We introduce a family of simple and fast algorithms for solving the classical string matching problem, string matching with don't care symbols and complement symbols, and multiple patterns. In addition we solve the same problems allowing up to <italic>k</italic> mismatches. Among the features of these algorithms are that they are real time algorithms, they(More)
In this paper we study the trade-offs in designing efficient caching systems for Web search engines. We explore the impact of different approaches, such as static vs. dynamic caching, and caching query results vs.caching posting lists. Using a query log spanning a whole year we explore the limitations of caching and we demonstrate that caching posting lists(More)
In this paper we propose a method that, given a query submitted to a search engine, suggests a list of related queries. The related queries are based in previously issued queries, and can be issued by the user to the search engine to tune or redirect the search process. The method proposed is based on a query clustering process in which groups of(More)
In this paper we initiate a new area of study dealing with the best way to search a possibly unbounded region for an object. The model for our search algorithms is that we must pay costs proportional to the distance of the next probe position relative to our current position. This model is meant to give a realistic cost measure for a robot moving in the(More)
We present a new data structure, called the xed-queries tree, for the problem of nding all elements of a xed set that are close, under some distance function, to a query element. Fixed-queries trees can be used for any distance function, not necessarily even a metric, as long as it satisses the triangle inequality. We give an analysis of several performance(More)