Web Search for a Planet: The Google Cluster Architecture

@article{Barroso2003WebSF,
  title={Web Search for a Planet: The Google Cluster Architecture},
  author={Luiz Andr{\'e} Barroso and Jeffrey Dean and Urs H{\"o}lzle},
  journal={IEEE Micro},
  year={2003},
  volume={23},
  pages={22--28}
}
Amenable to extensive parallelization, Google's web search application lets different queries run on different processors and, by partitioning the overall index, also lets a single query use multiple processors. To handle this workload, Google's architecture features clusters of more than 15,000 commodity-class PCs with fault-tolerant software. This architecture achieves superior performance at a fraction of the cost of a system built from fewer, but more expensive, high-end servers.
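
The two levels of parallelism described in the abstract (independent queries on different processors, and one query spread over a partitioned index) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `build_shards`, `search_shard`, and `search`, the modulo rule for placing documents on shards, and the use of a thread pool to stand in for separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def build_shards(docs, num_shards):
    """Partition documents into shards, each holding its own inverted index."""
    shards = [{} for _ in range(num_shards)]
    for doc_id, text in docs.items():
        # Assumed placement rule: simple modulo on the document id.
        index = shards[doc_id % num_shards]
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(doc_id)
    return shards


def search_shard(index, terms):
    """Return the doc ids in one shard that contain every query term."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def search(shards, query, pool):
    """Fan a single query out to all shards in parallel and merge the hits."""
    terms = query.lower().split()
    partial = pool.map(search_shard, shards, [terms] * len(shards))
    return sorted(set().union(*partial))


docs = {
    0: "google cluster architecture",
    1: "commodity pc cluster",
    2: "web search cluster",
    3: "fault tolerant software",
}
shards = build_shards(docs, num_shards=2)
with ThreadPoolExecutor(max_workers=2) as pool:
    print(search(shards, "cluster", pool))      # [0, 1, 2]
    print(search(shards, "web cluster", pool))  # [2]
```

Because each shard answers from its own slice of the index, adding shards shortens the work per query, while running several `search` calls at once corresponds to serving different queries on different processors.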


Motivating a Distributed System of Commodity Machines 1

This report examines the price/performance benefit of using a large cluster of commodity machines rather than server level hardware for certain large scale software applications. A number of tools

Query Processing in Highly-Loaded Search Engines

A novel dropping strategy is introduced, based on machine learned performance predictors to select the queries to drop in order to sustain the largest possible query rate with a relative degradation in effectiveness.

Load balancing for term-distributed parallel retrieval

Methods for load balancing in term-distributed parallel architectures are examined, and a suite of techniques for reducing net querying costs are proposed, which allow a 30% improvement in query throughput when tested on an eight-node parallel computer system.

A Hybrid Distributed Architecture for Indexing

Test results confirmed that indexing performance is directly related to the size of the hybrid grid and intranet networking does not play a major role, and a system-efficiency and cost-effectiveness comparison of a grid and a multiprocessor machine showed that for workloads of modest to large sizes, the grid architecture delivers better throughput per unit cost than the multiprocessor.

ROAR: increasing the flexibility and performance of distributed search

Rendezvous On a Ring (ROAR) is introduced, a novel distributed algorithm that enables on-the-fly re-configuration of the partitioning level that can add and remove servers without stopping the system, cope with server failures, and provide good load-balancing even with a heterogeneous server pool.

The Impact of Novel Computing Architectures on Large-Scale Distributed Web Information Retrieval Systems

K-model, a computational model to properly evaluate algorithms designed for such hardware, is introduced and the impact of K-model rules on algorithm design is studied, to evaluate the benefits and compare the complexity of a solution built using properly designed techniques, and the existing ones.

Automatic management of partitioned, replicated search services

The distributed search architecture that underlies Twitter user search, a service for discovering relevant accounts on the popular microblogging service, makes use of the principle that eliminates the distinction between failure and other anticipated service disruptions, which leads to greater robustness and fault-tolerance.

Exploiting Hybrid Parallelism in Web Search Engines

A hybrid technique based on MPI and OpenMP is proposed, devised to take advantage of the multithreading facilities provided by CMP nodes for search engines under high query traffic.

Optimized Inverted List Assignment in Distributed Search Engine Architectures

This work analyzes search engine query traces in order to optimize the assignment of index data to the nodes in the system, such that terms frequently occurring together in queries are also often collocated on the same node.

Performance Improvements for Search Systems Using an Integrated Cache of Lists+Intersections

A static cache that works simultaneously as list and intersection cache is proposed, offering a more efficient way of handling cache space, along with effective strategies to select the term pairs that should populate the cache.

References


Memory system characterization of commercial workloads

This study characterizes the memory system behavior of these workloads through a large number of architectural experiments on Alpha multiprocessors augmented with full system simulations to determine the impact of architectural trends.

The Anatomy of a Large-Scale Hypertextual Web Search Engine

A Single-Chip Multiprocessor

Presents the case for billion-transistor processor architectures that will consist of chip multiprocessors (CMPs): multiple (four to 16) simple, fast processors on one chip, and all processors share a larger level-two cache.

Piranha: a scalable architecture based on single-chip multiprocessing

This paper describes the Piranha system, a research prototype being developed at Compaq that aggressively exploits chip multiprocessing by integrating eight simple Alpha processor cores along with a

Hyper-threading technology architecture and microarchitecture: a hypertext history

TPC Benchmark C Full Disclosure Report for IBM eserver xSeries 440 using Microsoft SQL Server 2000 Enterprise Edition and Microsoft Windows .NET Datacenter Server 2003, TPC-C Version 5.0

..., and K. Olukotun, "A Single-Chip Multiprocessor," Computer, vol. 30, 1997

"Piranha: A Scalable Architecture Based on Single-Chip Multiprocessing," Proc. 27th ACM Int'l Symp. Computer Architecture, 2000

"Hyper-Threading Technology Architecture and Microarchitecture: A Hypertext History," Intel Technology J., 2002