End-to-end Fault Containment in Scalable Shared-memory Multiprocessors

Dan Teodosiu
Current shared-memory multiprocessors suffer from an inherent fragility, since a single hardware or system software failure can cause the entire machine to crash. This dissertation describes a combination of hardware and software techniques that can be used to provide fault containment for large-scale shared-memory machines. With fault containment, the impact of a fault remains limited to only a small portion of the system, while the remaining good parts can continue operating normally after…


