Scalability of the Nutch search engine

José E. Moreira, Maged M. Michael, Dilma Da Silva, Doron Shiloach, Parijat Dube, Li Zhang
Nutch is an open source search engine that is gaining popularity in the commercial world. The Nutch architecture lends itself to a wide range of parallelization techniques. Multiple backend servers can be used both to partition the corpus of search data, thus increasing the rate of queries serviced, and to increase the size of the search data while preserving the service rate. Alternatively, multiple search engines can operate in parallel, further increasing the query rate. In this…
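
The partitioning idea in the abstract can be illustrated with a minimal sketch. It is not Nutch code and does not use the Nutch API: the `Backend` class, the `partition` and `frontend_search` helpers, the round-robin placement, and the term-count scoring are all assumptions made for illustration. Each backend searches only its own slice of the corpus, and a frontend fans the query out and merges the partial top-k lists.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


class Backend:
    """Hypothetical backend server holding one partition of the corpus."""

    def __init__(self, docs):
        self.docs = docs  # maps doc_id -> document text

    def search(self, query, k):
        # Toy scoring (an assumption): count occurrences of the query terms.
        terms = query.lower().split()
        hits = []
        for doc_id, text in self.docs.items():
            words = text.lower().split()
            score = sum(words.count(t) for t in terms)
            if score > 0:
                hits.append((score, doc_id))
        # Return only this partition's k best hits.
        return heapq.nlargest(k, hits)


def partition(corpus, n):
    """Spread documents round-robin over n backends (illustrative policy)."""
    parts = [{} for _ in range(n)]
    for i, (doc_id, text) in enumerate(sorted(corpus.items())):
        parts[i % n][doc_id] = text
    return [Backend(p) for p in parts]


def frontend_search(backends, query, k=3):
    """Send the query to every backend in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        partials = list(pool.map(lambda b: b.search(query, k), backends))
    merged = heapq.nlargest(k, (hit for part in partials for hit in part))
    return [doc_id for _score, doc_id in merged]


if __name__ == "__main__":
    corpus = {
        "a": "nutch search engine",
        "b": "search search nutch",
        "c": "cooking recipes",
        "d": "search",
    }
    print(frontend_search(partition(corpus, 2), "search"))
```

Because every backend returns its own top k hits, the merged top k matches what a single backend over the whole corpus would return, while each backend scans only a fraction of the documents. Running several such frontend and backend groups side by side would correspond to the abstract's second option of multiple search engines operating in parallel.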
This paper has 36 citations.