Zab: High-performance broadcast for primary-backup systems
@article{Junqueira2011ZabHB, title={Zab: High-performance broadcast for primary-backup systems}, author={Flavio Paiva Junqueira and Benjamin C. Reed and Marco Serafini}, journal={2011 IEEE/IFIP 41st International Conference on Dependable Systems \& Networks (DSN)}, year={2011}, pages={245-256} }
Zab is a crash-recovery atomic broadcast algorithm we designed for the ZooKeeper coordination service. ZooKeeper implements a primary-backup scheme in which a primary process executes client operations and uses Zab to propagate the corresponding incremental state changes to backup processes. Due to the dependence of an incremental state change on the sequence of changes previously generated, Zab must guarantee that if it delivers a given state change, then all other changes it depends upon must…
329 Citations
Improving the Latency and Throughput of ZooKeeper Atomic Broadcast
- Computer Science, ICCSW
- 2017
Two easy-to-implement Zab variants are presented, called ZabAC and ZabAA, designed to offer small atomic-broadcast latencies and to reduce the processing load on the primary node that plays a leading role in Zab.
Improving ZooKeeper Atomic Broadcast Performance When a Server Quorum Never Crashes
- Computer Science
- 2018
Three variations of Zab are proposed, and coin-tossing in ZooKeeper is found to have the potential to perform better than Zab, particularly at high workloads, because of the least-restricted Zab fault assumptions.
Improving ZooKeeper Atomic Broadcast Performance by Coin Tossing
- Computer Science, EPEW
- 2017
The coin-tossing Zab version (ZabCT) meets all requirements essential for the crash-tolerance provisions within Zab, which can be adopted in any ZabCT implementation, and the dual objectives of performance gains and traffic reduction can be accomplished.
Mechanisms for improving ZooKeeper Atomic Broadcast performance
- Computer Science
- 2018
Two main limitations that prevent existing systems such as Apache ZooKeeper from achieving a higher write performance are identified and three variations of Zab are proposed, which are all capable of reaching an agreement in fewer communication steps than Zab.
Dynamic Reconfiguration of Primary/Backup Clusters
- Computer Science, USENIX Annual Technical Conference
- 2012
A new reconfiguration protocol is described, recently implemented in Apache ZooKeeper, that fully automates configuration changes and minimizes any interruption in service to clients while maintaining data consistency.
Brief Announcement: Consensus and Efficient Passive Replication
- Computer Science, DISC
- 2012
Using the Paxos consensus protocol to implement passive replication requires taking care of peculiar constraints, and Paxos does not necessarily preserve the dependency between A and the delivery of δAB.
Make the Leader Work: Executive Deferred Update Replication
- Computer Science, 2014 IEEE 33rd International Symposium on Reliable Distributed Systems
- 2014
EDUR streamlines transaction certification with the broadcast protocol, which improves overall performance and scalability compared to deferred update replication based on total order broadcast (TOB).
Rollup: Non-Disruptive Rolling Upgrade
- Computer Science
- 2015
Although Rollup builds upon existing lower-bound results in terms of load and time, its key contribution is to bridge the gap between a long body of theoretical results and recent system achievements through the rolling upgrade application.
Scalable coordination of distributed in-memory transactions
- Computer Science
- 2016
It is experimentally demonstrated that transaction latency and throughput scale considerably well when an atomic multicast service is offered to transaction nodes by a crash-tolerant ensemble of dedicated nodes and that using such a service is the most scalable approach compared to practices advocated in the literature.
Acuerdo: Fast Atomic Broadcast over RDMA
- Computer Science, ICPP
- 2022
Acuerdo is built from the ground up to perform communication using one-sided RDMA writes, which do not use the CPU of the remote machine, and is explicitly designed to minimize waiting on the critical path.
References
Vertical paxos and primary-backup replication
- Computer Science, PODC '09
- 2009
It is shown how primary-backup systems in current use can be viewed, and shown to be correct, as instances of Vertical Paxos algorithms, in which reconfiguration can occur in the middle of reaching agreement on an individual state-machine command.
Reliable and total order broadcast in the crash-recovery model
- Computer Science, J. Parallel Distributed Comput.
- 2005
Viewstamped Replication: A New Primary Copy Method to Support Highly-Available Distributed Systems
- Computer Science, PODC '88
- 1988
This paper presents a new replication algorithm that has desirable performance properties, based on the primary copy technique, and uses a special kind of timestamp called a viewstamp to detect lost information.
ZooKeeper: Wait-free Coordination for Internet-scale Systems
- Computer Science, USENIX Annual Technical Conference
- 2010
ZooKeeper provides a per-client guarantee of FIFO execution of requests and linearizability for all requests that change the ZooKeeper state, enabling the implementation of a high-performance processing pipeline with read requests being satisfied by local servers.
Atomic Broadcast in Asynchronous Crash-Recovery Distributed Systems and Its Use in Quorum-Based Replication
- Computer Science, IEEE Trans. Knowl. Data Eng.
- 2003
It is shown that atomic broadcast can be implemented requiring few additional log operations in excess of those required by the consensus, and how additional log operations can improve the protocol in terms of faster recovery and better throughput.
A new look at atomic broadcast in the asynchronous crash-recovery model
- Computer Science, 24th IEEE Symposium on Reliable Distributed Systems (SRDS'05)
- 2005
The paper proposes a new specification of atomic broadcast in the crash-recovery model that makes it possible to distinguish between a uniform and a non-uniform version of atomic broadcast, and is thus more efficient.
Chain Replication for Supporting High Throughput and Availability
- Computer Science, OSDI
- 2004
Besides outlining the chain replication protocols themselves, simulation experiments explore the performance characteristics of a prototype implementation and several object-placement strategies (including schemes based on distributed hash table routing) are discussed.
The Chubby lock service for loosely-coupled distributed systems
- Computer Science, OSDI '06
- 2006
The paper describes the initial design and expected use, compares it with actual use, and explains how the design had to be modified to accommodate the differences.
Efficient message ordering in dynamic networks
- Computer Science, PODC '96
- 1996
The algorithm always allows processors to initiate messages, even when they are not members of a connected majority component in the network, so that messages can eventually become totally ordered even if their initiator is never a member of a majority component.
Omega Meets Paxos: Leader Election and Stability Without Eventual Timely Links
- Computer Science, DISC
- 2005
This paper provides a realization of distributed leader election without having any eventual timely links, and an extension of the protocol provides leader stability, which guarantees against arbitrary demotion of a qualified leader and avoids performance penalties associated with leader changes in schemes such as Paxos.