On the inherent weakness of conditional synchronization primitives

@inproceedings{Ellen2004OnTI,
  title={On the inherent weakness of conditional synchronization primitives},
  author={Faith Ellen and Danny Hendler and Nir Shavit},
  booktitle={PODC '04},
  year={2004}
}
The "wait-free hierarchy" classifies multiprocessor synchronization primitives according to their power to solve consensus. The classification assigns a number n to each synchronization primitive, where n is the maximal number of processes for which deterministic wait-free consensus can be solved using instances of the primitive and read/write registers. Conditional synchronization primitives, such as Compare-and-Swap and Load-Linked/Store-Conditional, can implement deterministic… 
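To make the notion of "solving consensus" with a conditional primitive concrete, here is a minimal sketch of the textbook compare-and-swap consensus protocol. It is an illustration only and is not taken from the paper. The names `CASRegister`, `decide`, and `run_consensus` are hypothetical, and because Python exposes no hardware CAS, the register is simulated with a lock that stands in for the atomicity of the primitive.

```python
import threading

# Sentinel marking a register that no process has written yet.
_EMPTY = object()


class CASRegister:
    """Simulated compare-and-swap register (assumption: a lock models
    the atomicity that real hardware CAS would provide)."""

    def __init__(self, initial=_EMPTY):
        self._value = initial
        self._lock = threading.Lock()

    def compare_and_swap(self, expected, new):
        # Atomically install `new` only if the register still holds `expected`.
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

    def read(self):
        with self._lock:
            return self._value


def decide(register, proposal):
    """Each process tries once to install its proposal, then adopts
    whatever value the register holds. Only the first CAS can succeed,
    so every process returns the same proposed value."""
    register.compare_and_swap(_EMPTY, proposal)
    return register.read()


def run_consensus(proposals):
    """Run one thread per proposal and collect each thread's decision."""
    register = CASRegister()
    decisions = [None] * len(proposals)

    def worker(i):
        decisions[i] = decide(register, proposals[i])

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(len(proposals))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions
```

Each process takes a bounded number of steps regardless of what the others do, which is the wait-free property, and the protocol works for any number of processes. That is the sense in which a Compare-and-Swap object sits at the top of the hierarchy described above.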

Replacing competition with cooperation to achieve scalable lock-free FIFO queues
TLDR
This paper formalizes the notion of competitiveness of a synchronizing statement, which can be used as a measure of the scalability of concurrent implementations, and presents a new queue implementation, the Speculative Pairing (SP) queue, which decreases competitiveness by using Fetch-And-Increment (FAI) instead of CAS.
On the Cost of Concurrency in Transactional Memory
TLDR
The Transactional Memory abstraction is proposed as a synchronization mechanism that relieves the programmer of the overhead of reasoning about data conflicts that may arise from concurrent operations without severely limiting the program's performance.
Wait-Free CAS-Based Algorithms: The Burden of the Past
TLDR
It is proved that CAS does not allow implementing wait-free and linearizable visible objects in the infinite model with a space complexity bounded by the number of active processes (i.e., those that have operations in progress on the object).
ActiveMonitor: Asynchronous Monitor Framework for Scalability and Multi-Object Synchronization
TLDR
This work presents ActiveMonitor - a framework that allows multi-object synchronization without global locks, and improves parallelism by exploiting asynchronous execution of critical sections, and shows that on most of these problems, ActiveMonitor based programs outperform programs implemented using Java's reentrant-lock and condition constructs.
Fair synchronization
Poly-Logarithmic Adaptive Algorithms Require Unconditional Primitives
TLDR
Any collect algorithm must perform Ω(k) steps in an execution with total contention k in O(log log n); the lower bound applies to snapshot and renaming, both one-shot and long-lived.
Multithreading Strategies for Replicated Objects
TLDR
It is concluded that replication middleware should implement reconfigurable multithreading strategies, as there is no optimal one-size-fits-all solution.
Transactional data structures
TLDR
Transactional Data Structures are introduced, which are data structures that permit access to past versions, although not all accesses succeed, and form the basis of a concurrent programming solution that supports database-type transactions in memory.
Scaling mount concurrency : scalability and progress in concurrent algorithms
TLDR
In this dissertation, it is shown that existing instruction set architectures must be extended to allow general scalable algorithms to be built, and a reasonably scalable implementation of a map built on the widely-available compare-and-swap primitive is presented.
Language Support and Compiler Optimizations for Object-Based Software Transactional Memory
TLDR
It is concluded that appropriate language support and high-quality compiler optimizations are necessary for the success of any STM system, and the first language extensions and compiler support for transactional boosting are proposed.

References

Wait-free synchronization
TLDR
A hierarchy of objects is derived such that no object at one level has a wait-free implementation in terms of objects at lower levels, and it is shown that atomic read/write registers, which have been the focus of much recent attention, are at the bottom of the hierarchy.
A fast, scalable mutual exclusion algorithm
TLDR
A new algorithm for N-process mutual exclusion that requires only read and write operations and that has O(log N) time complexity, where "time" is measured by counting remote memory references.
Contention in shared memory algorithms
TLDR
The first formal complexity model for contention in shared-memory multiprocessors is introduced and certain counting networks outperform conventional single-variable counters at high levels of contention, providing the first formal model explaining this phenomenon.
The communication requirements of mutual exclusion
TLDR
It is shown that there does not exist a scalable mutual exclusion protocol that uses only read and write operations, and the results suggest that many current generation microprocessors have instruction sets that are not well-suited to performing mutual exclusion in a shared memory environment.
A time complexity lower bound for randomized implementations of some shared objects
TLDR
The lower bound implies that for any shared object O implemented using any oblivious universal construction, no matter what O’s type is, in the worst case a process performs Ω(log n) operations on shared…
Time and space lower bounds for non-blocking implementations (preliminary version)
TLDR
Time and space complexity lower bounds are shown, which improve on some of the space complexity lower bounds of Fich, Herlihy & Shavit [FHS93].
An improved lower bound for the time complexity of mutual exclusion
TLDR
It is suggested that, from an asymptotic standpoint, comparison primitives are no better than reads and writes when implementing local-spin mutual exclusion algorithms, and may not be the best choice to provide in hardware if one is interested in scalable synchronization.
Time and Space Lower Bounds for Nonblocking Implementations
TLDR
Time and space complexity lower bounds of any randomized nonblocking n-process implementation of any object in set A from any combination of objects in set B are shown, showing the near optimality of some known wait-free implementations in terms of space complexity.
Lower bounds for adaptive collect and related objects
TLDR
It is proved that if a collect algorithm is f-adaptive to total contention, namely, its step complexity is f(k), where k is the number of processes that ever took a step, then it uses Ω(f⁻¹(n)) multi-writer registers, where n is the total number of processes in the system.
Linearizability: a correctness condition for concurrent objects
TLDR
This paper defines linearizability, compares it to other correctness conditions, presents and demonstrates a method for proving the correctness of implementations, and shows how to reason about concurrent objects, given they are linearizable.