Synchronization algorithms for shared-memory multiprocessors

@article{Graunke1990SynchronizationAF,
  title={Synchronization algorithms for shared-memory multiprocessors},
  author={Gary Graunke and Shreekant S. Thakkar},
  journal={Computer},
  year={1990},
  volume={23},
  pages={60-69}
}
A performance evaluation of the Symmetry multiprocessor system revealed that the synchronization mechanism did not perform well for highly contested locks, like those found in certain parallel applications. Several software synchronization mechanisms, designed to reduce contention for the lock, were developed and evaluated on the Symmetry multiprocessor system using a hardware monitor. The mechanisms remain valuable even when changes are made to the hardware synchronization…

High performance synchronization algorithms for multiprogrammed multiprocessors
TLDR
The design and evaluation of scalable scheduler-conscious mutual exclusion locks, reader-writer locks, and barriers are described, and it is shown that by sharing information across the kernel/application interface the authors can improve the performance of scheduler-oblivious implementations by more than an order of magnitude.
A Distributed Hardware Mechanism for Process Synchronization on Shared-Bus Multiprocessors
TLDR
A new technique is presented that uses distributed hardware locking queues to reduce both contention and latency to the minimum values obtainable on a shared bus.
Scalable spin locks for multiprogrammed systems
TLDR
Two queue-based locks that recover from in-queue preemption are presented, demonstrating that high-performance software spin locks are compatible with multiprogramming on both large-scale and bus-based machines.
On Synchronization Patterns in Parallel Programs
TLDR
The impact of techniques to obtain locks under high contention in a more realistic framework using a sample of real parallel programs running on a shared-bus multiprocessor system is considered.
Efficient Software Synchronization on Large Cache Coherent Multiprocessors
TLDR
A method to characterize the performance of proposed queue lock algorithms is presented and applied to previously published algorithms; the authors conclude that the M lock is the best overall queue lock for the class of architectures studied.
A low-latency scalable locking algorithm for shared memory multiprocessors
TLDR
Unlike existing queue-based locks which suffer from high latency in low contention situations, experimental results show that the shared array lock has low latency and good scalability.
Scheduler-conscious synchronization
TLDR
It is found that while it is possible to avoid pathological performance problems using previously proposed kernel mechanisms, a modest additional widening of the kernel/user interface can make scheduler-conscious synchronization algorithms significantly simpler and faster, with performance on dedicated machines comparable to that of scheduler-oblivious algorithms.
Queue locks on cache coherent multiprocessors
TLDR
A method to characterize the performance of proposed queue lock algorithms is presented and applied to previously published algorithms; the authors conclude that the M lock is the best overall queue lock for the class of architectures studied.
Synchronization on Cray-T3E Virtual Shared Memory
We consider algorithms for implementing mutual exclusion on the Cray-T3E virtual shared memory using various atomic operations. Our implementations of Anderson's and MCS Lock minimize network
Algorithms for scalable synchronization on shared-memory multiprocessors
TLDR
The principal conclusion is that contention due to synchronization need not be a problem in large-scale shared-memory multiprocessors; the existence of scalable algorithms greatly weakens the case for costly special-purpose hardware support for synchronization and provides a case against so-called “dance hall” architectures.

References

The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors
  • T. Anderson
  • Computer Science
    IEEE Trans. Parallel Distributed Syst.
  • 1990
The author examines the questions of whether there are efficient algorithms for software spin-waiting given hardware support for atomic instructions, or whether more complex kinds of hardware support
Dynamic Decentralized Cache Schemes for MIMD Parallel Processors
TLDR
It appears that moderately large parallel processors can be designed by employing the principles presented in this paper, and both schemes feature decentralized consistency control and dynamic type classification of the datum cached.
A survey of synchronization methods for parallel computers
An examination is given of how traditional synchronization methods influence the design of MIMD (multiple-instruction multiple-data-stream) multiprocessors. She provides an overview of MIMD
Efficient synchronization primitives for large-scale cache-coherent multiprocessors
TLDR
A set of efficient primitives for process synchronization in multiprocessors that make use of synchronization bits to provide a simple mechanism for mutual exclusion and to implement Fetch and Add with combining in software rather than hardware is proposed.
The NYU Ultracomputer—Designing an MIMD Shared Memory Parallel Computer
We present the design for the NYU Ultracomputer, a shared-memory MIMD parallel machine composed of thousands of autonomous processing elements. This machine uses an enhanced message switching network
A fast mutual exclusion algorithm
A new solution to the mutual exclusion problem is presented that, in the absence of contention, requires only seven memory accesses. It assumes atomic reads and atomic writes to shared registers.
Lovett and S.S. Thakkar, “The Symmetry Multiprocessor System”
  • 1988
The RP3 System
  • Proc. Int'l Conf Parallel Processing
  • 1986