Recoverable Mutual Exclusion: [Extended Abstract]

@article{Golab2016RecoverableME,
  title={Recoverable Mutual Exclusion: [Extended Abstract]},
  author={Wojciech M. Golab and Aditya Ramaraju},
  journal={Proceedings of the 2016 ACM Symposium on Principles of Distributed Computing},
  year={2016}
}
  • W. Golab, A. Ramaraju
  • Published 25 July 2016
  • Computer Science
  • Proceedings of the 2016 ACM Symposium on Principles of Distributed Computing
Mutex locks have traditionally been the most common mechanism for protecting shared data structures in parallel programs. However, the robustness of such locks against process failures has not been studied thoroughly. Most (user-level) mutex algorithms are designed around the assumption that processes are reliable, meaning that a process may not fail while executing the lock acquisition and release code, or while inside the critical section. If such a failure does occur, then the liveness… 
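The failure scenario described in the abstract can be illustrated with a small, single-threaded sketch. This is not the paper's algorithm; `SimpleLock`, `Crash`, `critical_section`, and `demo` are hypothetical names introduced only for illustration, and the `owner` field merely stands in for shared state that would survive a process failure. The sketch shows a process failing inside the critical section, a second process that can then never acquire the lock, and one way a restarted process might release the lock if it can tell from shared state that it still holds it.

```python
class SimpleLock:
    """Toy test-and-set style lock; owner is None when the lock is free."""

    def __init__(self):
        self.owner = None

    def try_acquire(self, pid):
        # Stand-in for an atomic test-and-set. This sketch is single-threaded,
        # so no real atomicity is needed or provided.
        if self.owner is None:
            self.owner = pid
            return True
        return False

    def release(self, pid):
        # Only the recorded owner may release the lock.
        if self.owner == pid:
            self.owner = None


class Crash(Exception):
    """Hypothetical marker for a process failure."""


def critical_section(lock, pid, crash=False):
    if not lock.try_acquire(pid):
        return False
    if crash:
        # The process fails inside the critical section and never calls release().
        raise Crash(pid)
    lock.release(pid)
    return True


def demo():
    lock = SimpleLock()
    try:
        critical_section(lock, pid=1, crash=True)
    except Crash:
        pass
    # Process 2 cannot get in: the lock is still held by the crashed process 1.
    blocked = not any(lock.try_acquire(2) for _ in range(1000))
    # Assumption: process 1 restarts and can see in shared state that it holds the lock.
    if lock.owner == 1:
        lock.release(1)
    recovered = lock.try_acquire(2)
    return blocked, recovered
```

Calling `demo()` reports whether process 2 was blocked while the crashed owner held the lock, and whether it could acquire the lock once the restarted owner released it.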

Recoverable mutual exclusion
TLDR
This work formalizes the problem of recoverable mutual exclusion and proposes several solutions that vary both in their assumptions regarding hardware support for synchronization and in their efficiency. The solutions are more robust in that they do not restrict where a process may crash, and they provide stricter guarantees in terms of efficiency.
Recoverable FCFS Mutual Exclusion with Wait-Free Recovery
TLDR
The main features of the Recoverable Mutual Exclusion algorithm are that it satisfies FCFS, it ensures that each process recovers in a wait-free manner, and in the absence of failures, it guarantees a worst-case Remote Memory Reference (RMR) complexity of O(lg n) on both Cache Coherent (CC) and Distributed Shared Memory (DSM) machines.
An Adaptive Approach to Recoverable Mutual Exclusion
TLDR
This work presents a new algorithm for solving the RME problem whose RMR complexity gradually adapts to the number of failures that have occurred in the system recently, given by [EQUATION] where F denotes the total number of failed processes in the recent past.
Adaptive and Fair Transformation for Recoverable Mutual Exclusion
TLDR
This work presents a framework that transforms any algorithm that solves the RME problem into an algorithm that can also simultaneously adapt to (a) the number of processes competing for the lock and (b) the number of failures that have occurred in the recent past, while maintaining the correctness and performance properties of the underlying RME algorithm.
Recoverable Mutual Exclusion with Abortability
TLDR
This work presents the first RME algorithm where a process has the ability to abort executing the algorithm, if it decides to give up its request for a shared resource before being granted access to that resource.
Detectable recovery of lock-free data structures
TLDR
The analysis reveals that understanding the actual persistence cost of an algorithm in machines with real NVMM is more complicated than previously thought and requires a thorough evaluation, since the impact of different persistence instructions on performance may vary greatly.
Optimal Recoverable Mutual Exclusion Using only FASAS
TLDR
This work presents the first Recoverable Mutual Exclusion algorithm whose Remote Memory Reference (RMR) complexity is optimal for both Cache-Coherent (CC) and Distributed Shared Memory (DSM) machines.
Brief Announcement: Detectable Sequential Specifications for Recoverable Shared Objects
TLDR
A detectable sequential specification (DSS) is formalized using a detectable recoverable lock-free queue algorithm and its performance is evaluated on a multiprocessor equipped with Intel Optane persistent memory.
Memory Reclamation for Recoverable Mutual Exclusion
TLDR
This work presents the first “general” recoverable algorithm for memory reclamation in the context of recoverable mutual exclusion, which can be plugged into any RME algorithm very easily and preserves all correctness properties and most desirable properties of the algorithm.
The Recoverable Consensus Hierarchy
TLDR
This collection of results exhibits the first separation between the simultaneous and independent crash-recovery failure models with respect to the computability of consensus, and implies that result cannot be generalized to all primitives at level 2 in the conventional consensus hierarchy.

References

Recoverable user-level mutual exclusion
TLDR
This work presents an algorithm which can ensure the successful registration of ownership of a spin lock, regardless of where processes fail, and proves it works even on the weak memory consistency models implemented by many modern multiprocessor computer systems.
Recovering scalable spin locks
We present a mechanism for making a scalable spin lock protocol, the MCS lock recoverable, thereby ensuring that a lock never becomes permanently unavailable, even if one or more processes using the
Scalable queue-based spin locks with timeout
TLDR
It is demonstrated that it is possible to obtain both scalability and bounded waiting, using variants of the queue-based locks of Craig, Landin, and Hagersten, and of Mellor-Crummey and Scott.
A fast, scalable mutual exclusion algorithm
TLDR
A new algorithm for N-process mutual exclusion that requires only read and write operations and that has O(log N) time complexity, where “time” is measured by counting remote memory references.
The communication requirements of mutual exclusion
TLDR
It is shown that there does not exist a scalable mutual exclusion protocol that uses only read and write operations, and the results suggest that many current generation microprocessors have instruction sets that are not well-suited to performing mutual exclusion in a shared memory environment.
Algorithms for mutual exclusion
TLDR
All of the algorithms in this book have been rewritten in a single language and restructured so that they are easy to understand and compare, and the principles guiding their design are stressed.
Wait-free synchronization
TLDR
A hierarchy of objects is derived such that no object at one level has a wait-free implementation in terms of objects at lower levels, and it is shown that atomic read/write registers, which have been the focus of much recent attention, are at the bottom of the hierarchy.
Queue locks on cache coherent multiprocessors
TLDR
This work presents a method to characterize the performance of proposed queue lock algorithms and applies it to previously published algorithms, concluding that the M lock is the best overall queue lock for the class of architectures studied.
A new fast-path mechanism for mutual exclusion
TLDR
The long-open problem of designing a read/write mutual exclusion algorithm with O(1) time complexity in the absence of contention and O(log N) time complexity under contention is closed by presenting a fast-path mechanism that achieves these time complexity bounds when used in conjunction with Yang and Anderson's arbitration-tree algorithm.
Whole-system persistence
Today's databases and key-value stores commonly keep all their data in main memory. A single server can have over 100 GB of memory, and a cluster of such servers can have 10s to 100s of TB. However,