Unreliable failure detectors for reliable distributed systems

  • T. Chandra, S. Toueg
  • Published 1996
  • Computer Science
  • J. ACM
  • We introduce the concept of unreliable failure detectors and study how they can be used to solve Consensus in asynchronous systems with crash failures. We characterise unreliable failure detectors in terms of two properties: completeness and accuracy. We show that Consensus can be solved even with unreliable failure detectors that make an infinite number of mistakes, and determine which ones can be used to solve Consensus despite any number of crashes, and which ones require a majority of…
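The two properties named in the abstract can be made concrete with a small sketch. The code below is not taken from the paper: it assumes a heartbeat-and-timeout design, and the class name `HeartbeatDetector` and all of its fields and methods are hypothetical. A crashed peer stops sending heartbeats and so is eventually suspected (the completeness side). A slow but live peer may be suspected by mistake; the sketch then retracts the suspicion and doubles that peer's timeout, so that mistakes become rarer if message delays eventually stabilise (the accuracy side).

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatDetector:
    """Hypothetical timeout-based failure detector run by one monitoring process."""

    peers: list[str]
    initial_timeout: float = 1.0  # assumed starting patience, in seconds
    timeout: dict[str, float] = field(default_factory=dict)
    last_seen: dict[str, float] = field(default_factory=dict)
    suspected: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Every peer starts unsuspected, with the same timeout.
        for peer in self.peers:
            self.timeout[peer] = self.initial_timeout
            self.last_seen[peer] = 0.0

    def on_heartbeat(self, peer: str, now: float) -> None:
        """Record a heartbeat; if the peer was suspected, that was a mistake."""
        self.last_seen[peer] = now
        if peer in self.suspected:
            self.suspected.discard(peer)
            self.timeout[peer] *= 2  # be more patient with this peer from now on

    def check(self, now: float) -> set[str]:
        """Suspect every peer whose last heartbeat is older than its timeout."""
        for peer in self.peers:
            if now - self.last_seen[peer] > self.timeout[peer]:
                self.suspected.add(peer)
        return set(self.suspected)
```

In this sketch the detector is allowed to be wrong: a suspicion is only a hint that may later be withdrawn, which is the sense in which such a detector is unreliable.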
    2,775 Citations


    Fail-aware failure detectors
    • C. Fetzer, F. Cristian
    • Computer Science
    • Proceedings 15th Symposium on Reliable Distributed Systems
    • 1996
    • 18
    • Highly Influenced
    Implementing unreliable failure detectors with unknown membership
    • 65
    Unreliable intrusion detection in distributed computations
    • D. Malkhi, M. Reiter
    • Computer Science
    • Proceedings 10th Computer Security Foundations Workshop
    • 1997
    • 126
    Failure detectors in omission failure environments
    • 115
    The weakest failure detector for solving consensus
    • 318
    Revisiting Failure Detection and Consensus in Omission Failure Environments
    • 56
    A New Failure Detector to Detect Failures in a Distributed System
    • 2015
    • Highly Influenced
    A realistic look at failure detectors
    • 69
    • Highly Influenced
    The Weakest Failure Detector for Solving Election Problems in Asynchronous Distributed Systems
    • S. Park
    • Computer Science
    • EurAsia-ICT
    • 2002
    • 1


    Impossibility of distributed consensus with one faulty process
    • 2,789
    • Highly Influential
    Using process groups to implement failure detection in asynchronous environments
    • 229
    • Highly Influential
    Implementing fault-tolerant services using the state machine approach: a tutorial
    • 2,364
    • Highly Influential
    Using Failure Detectors to Solve Consensus in Asynchronous Shared-Memory Systems (Extended Abstract)
    • 89
    • Highly Influential
    Knowledge and common knowledge in a distributed environment
    • 400
    • Highly Influential