Carolina Bonacic

Learn More
Large scale data centers for crawlers are able to maintain a very large number of active http connections in order to download as fast as possible the usually huge number of web pages from given sections of the WWW. This generates a continuous stream of new URLs of documents to be downloaded and it is clear that the associated work-load can only be served(More)
This paper proposes a strategy to organize metricspace query processing in multicore search nodes as understood in the context of search engines running on clusters of computers. The strategy is applied in each search node to process all active queries visiting the node as part of their solution which, in general, for each query is computed from the(More)
Text search is a classical problem in Computer Science, with many data-intensive applications. For this problem, suffix arrays are among the most widely known and used data structures, enabling fast searches for phrases, terms, substrings and regular expressions in large texts. Potential application domains for these operations include large-scale search(More)
0167-8191/$ see front matter 2010 Elsevier B.V doi:10.1016/j.parco.2010.02.001 * Corresponding author. Address: Yahoo! Researc E-mail addresses: mmarin@yahoo-inc.com (M. M A parallel query processing method is proposed for the design and construction of web search engines to efficiently deal with dynamic variations in query traffic. The method allows for(More)
We describe and evaluate the performance of a parallel search engine that is able to cope efficiently with concurrent read/write operations. Read operations come in the usual form of queries submitted to the search engine and write ones come in the form of new documents added to the text collection in an on-line manner, namely the insertions are embedded(More)
With the emergence of multi-core CPU (or Chip-level MultiProcessor -CMP-), it is essential to develop techniques that capitalize on CMP’s advantages to speed up very demanding applications of parallel computing such as Web search engines. In particular, for this application and given the huge amount of computational resources deployed at data centers, it is(More)
In this paper we present strategies and experiments that show how to take advantage of the multi-threading parallelism available in Chip Multithreading (CMP) processors in the context of efficient query processing for search engines. We show that scalable performance can be achieved by letting the search engine go synchronous so that batches of queries can(More)
Search nodes are single-purpose components of large Web search engines and their efficient implementation is critical to sustain thousands of queries per second and guarantee individual query response times within a fraction of a second. Current technology trends indicate that search nodes ought to be implemented as multi-threaded multi-core systems. The(More)