The Consensus Problem in Unreliable Distributed Systems (A Brief Survey)

  • Michael J. Fischer
  • Published in FCT 21 August 1983
  • Mathematics
Agreement problems involve a system of processes, some of which may be faulty. A fundamental problem of fault-tolerant distributed computing is for the reliable processes to reach a consensus. We survey the considerable literature on this problem that has developed over the past few years and give an informal overview of the major theoretical results in the area. 
Consensus and Membership in Synchronous and Asynchronous Distributed Systems
This paper surveys two fundamental agreement problems that must be solved in dependable distributed systems and considers the existing literature on consensus and membership protocols designed according to various assumptions and the advantages and drawbacks of the different approaches.
On the reliability of consensus-based fault-tolerant distributed computing systems
This work uses a stochastic model of processor failure times to investigate design choices such as replication level, protocol running time, randomized versus deterministic protocols, fault detection, and authentication, and uses the probability with which a system produces the correct output as an evaluation criterion.
Formal analysis of consensus protocols in asynchronous distributed systems
This paper formalizes an abstract model of the underlying failure detection protocols and, building upon this abstract model, formalizes the two consensus protocols and proves that both algorithms satisfy the properties of "uniform agreement", "uniform integrity", "termination", and "uniform validity".
Distributed Consensus Revisited
The Agreement Problem in Unreliable Scale-Free Networks
A new Byzantine agreement (BA) protocol is proposed that adapts to the scale-free network (SFN) environment and derives its limit of allowable faulty components while maintaining the minimum number of message exchanges.
Consensus in asynchronous distributed systems
This paper focuses on the models proposed to overcome the impossibility of deterministically reaching consensus in an asynchronous system when even a single fault occurs, and on the research that originated from them.
The anatomy study of consensus agreement in MANETs
Stopping Times of Distributed Consensus Protocols: A Probabilistic Analysis
Design of fault tolerant distributed systems: the fail-controlled approach
There are several advantages in the use of distributed systems. In general, they offer greater modularity, extensibility, and resource sharing than integrated systems. However, new


Impossibility of distributed consensus with one faulty process
It is shown that, in the asynchronous consensus problem, every protocol has the possibility of nontermination, even with only one faulty process.
On the minimal synchronism needed for distributed consensus
The proofs expose general heuristic principles that explain why consensus is possible in certain models but not possible in others, and several critical system parameters, including various synchronicity conditions, are identified.
Polynomial algorithms for multiple processor agreement
It is proved that no matter what kind of information is exchanged, there is no way to reach agreement with fewer than t+1 rounds of exchange, where t is the upper bound on the number of faults.
Reaching agreement is a primitive of distributed computing. While this poses no problem in an ideal, failure-free environment, it imposes certain constraints on the capabilities of an actual
Another advantage of free choice (Extended Abstract): Completely asynchronous agreement protocols
This work exhibits a probabilistic solution for this problem, which guarantees that as long as a majority of the processes continues to operate, a decision will be made (Theorem 1).
This paper will combine Byzantine Agreement with Two-Phase Commit, using observations of Lamport to provide a method to cope with failure within a given time bound and present algorithms that overcome multiple failures and guarantee a unanimous commit or abort among all the correctly operating processors, subject only to certain limits on the number of failures that can occur.
Using Time Instead of Timeout for Fault-Tolerant Distributed Systems.
A general method is described for implementing a distributed system with any desired degree of fault tolerance; a solution to the "Byzantine Generals" problem is assumed.
Synchronizing clocks in the presence of faults
Three algorithms are described for maintaining clock synchrony in a distributed multiprocess system where each process has its own clock. They work in the presence of arbitrary clock or process failures, including “two-faced clocks” that present different values to different processes.
Unanimity in an unknown and unreliable environment
  • D. Dolev
  • Computer Science
    22nd Annual Symposium on Foundations of Computer Science (sfcs 1981)
  • 1981
It is proved that, independently of the model, unanimity is achievable if and only if the number of faulty processors in the system is less than one half of the connectivity of the system's network.
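The connectivity condition in this summary can be written as a one-line check. This is a minimal sketch that encodes only the condition as stated above, not the paper's full result or proof; the function name `unanimity_achievable` and its parameters are hypothetical, chosen for illustration.

```python
def unanimity_achievable(faulty: int, connectivity: int) -> bool:
    """Return True when faulty < connectivity / 2.

    Assumption: `connectivity` is the connectivity of the network as used in
    the summary above. Integer arithmetic avoids floating-point rounding.
    """
    if faulty < 0 or connectivity < 0:
        raise ValueError("faulty and connectivity must be non-negative")
    return 2 * faulty < connectivity
```

Under this condition, a network of connectivity 3 admits one faulty processor, while a network of connectivity 2 admits none.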
Reaching Agreement in the Presence of Faults
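It is shown that the problem is solvable for, and only for, n ≥ 3m + 1, where m is the number of faulty processors and n is the total number. It is also shown that if faulty processors can refuse to pass on information but cannot falsely relay information, the problem is solvable for arbitrary n ≥ m ≥ 0. This weaker assumption can be approximated in practice using cryptographic methods.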
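The bound n ≥ 3m + 1 is easy to explore with a few lines of arithmetic. The following is a minimal sketch, not code from the paper; the function names `agreement_solvable` and `max_faulty` are hypothetical and serve only to illustrate the inequality.

```python
def agreement_solvable(n: int, m: int) -> bool:
    """Return True when n >= 3m + 1, the condition quoted above.

    n is the total number of processors, m the number of faulty ones.
    """
    if n < 0 or m < 0:
        raise ValueError("n and m must be non-negative")
    return n >= 3 * m + 1


def max_faulty(n: int) -> int:
    """Largest m satisfying n >= 3m + 1, i.e. floor((n - 1) / 3)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 3
```

For instance, four processors can tolerate one faulty processor under this bound, while three cannot.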