The scalable commutativity rule: designing scalable software for multicore processors

@article{Clements2013TheSC,
  title={The scalable commutativity rule: designing scalable software for multicore processors},
  author={Austin T. Clements},
  journal={Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles},
  year={2013}
}
  • A. Clements
  • Published 3 November 2013
  • Computer Science
  • Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
What fundamental opportunities for scalability are latent in interfaces, such as system call APIs? Can scalability opportunities be identified even before any implementation exists, simply by considering interface specifications? To answer these questions, this paper introduces the following rule: Whenever interface operations commute, they can be implemented in a way that scales. This rule aids developers in building more scalable software starting from interface design and carrying on through… 
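As a rough illustration of the rule (a sketch of my own, not code from the paper), consider two ID-allocation interfaces: "return the lowest unused ID" does not commute, so every call must observe shared state, while "return any unused ID" commutes, so each thread can draw from a private range and its calls are conflict-free. Names such as alloc_any and IDS_PER_THREAD are illustrative only. Compile with `cc -pthread`.

/* Sketch: a commuting spec admits a conflict-free, per-thread implementation. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NTHREADS 4
#define IDS_PER_THREAD 1000000L

/* Non-commutative spec: lowest unused ID -> a single shared cache line. */
static atomic_long next_lowest;
long alloc_lowest(void) { return atomic_fetch_add(&next_lowest, 1); }

/* Commutative spec: any unused ID -> per-thread, conflict-free ranges. */
typedef struct { long next, limit; } id_range;
static long alloc_any(id_range *r) { return r->next < r->limit ? r->next++ : -1; }

static void *worker(void *arg) {
    size_t tid = (size_t)arg;
    id_range r = { tid * IDS_PER_THREAD, (tid + 1) * IDS_PER_THREAD };
    long last = -1;
    for (long i = 0; i < IDS_PER_THREAD; i++)
        last = alloc_any(&r);              /* touches only private memory */
    printf("thread %zu allocated up to id %ld\n", tid, last);
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (size_t i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (size_t i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return 0;
}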
The scalable commutativity rule
TLDR
This paper formalizes the scalable commutativity rule and proves it correct for any machine on which conflict-free operations scale, such as current cache-coherent multicore machines; the rule enables a better design process for scalable software.
Automating the Proof that Data Structure Implementations Commute with mn-Differencing
TLDR
Techniques to automatically prove the correctness of method commutativity conditions from data structure implementations are described, including a reduction to reachability that decomposes the problem using mn-differencing relations and observational equivalence relations.
Reducing Commutativity Verification to Reachability with Differencing Abstractions
TLDR
A novel algorithm is described that reduces the problem to reachability, so that off-the-shelf program analysis tools can perform the reasoning necessary for proving commutativity, and abstracts away effects of methods that would be the same regardless of the order.
Veracity: Declarative Multicore Programming with Commutativity
TLDR
It is shown that commute conditions can be synthesized even for nonlinear programs; the expectation that concurrency speedups grow as the computation increases is confirmed; and the work is applied to a small in-memory filesystem and a crowdfund blockchain smart contract.
Decomposing Data Structure Commutativity Proofs with mn-Differencing
TLDR
This paper introduces a novel decomposition to improve the task of verifying method-pair commutativity conditions from data structure implementations, and incorporates this decomposition into a proof rule, as well as an automata-theoretic reduction for commutativity verification.
Automatic Generation of Precise and Useful Commutativity Conditions (Extended Version)
TLDR
A fully automated technique to generate commutativity conditions for a range of data structures including Set, HashTable, Accumulator, Counter, and Stack is designed and implemented in the prototype open-source tool Servois.
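As a hand-written sketch of what such a condition looks like (my own illustration, not Servois output; set_t, commute_condition, and actually_commute are made-up names), a Set's add(x) and contains(y) commute exactly when x != y or y is already in the set, which can be checked against ground truth by running both orders from the same start state:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define CAP 64
typedef struct { int elems[CAP]; int n; } set_t;

static bool set_contains(const set_t *s, int v) {
    for (int i = 0; i < s->n; i++) if (s->elems[i] == v) return true;
    return false;
}
static bool set_add(set_t *s, int v) {          /* returns true if inserted */
    if (set_contains(s, v)) return false;
    s->elems[s->n++] = v;
    return true;
}

/* Candidate condition: add(x) and contains(y) commute iff x != y or y in S. */
static bool commute_condition(const set_t *s, int x, int y) {
    return x != y || set_contains(s, y);
}

/* Ground truth: execute both orders from the same state and compare results. */
static bool actually_commute(set_t start, int x, int y) {
    set_t a = start, b = start;
    bool a_add = set_add(&a, x), a_has = set_contains(&a, y);
    bool b_has = set_contains(&b, y), b_add = set_add(&b, x);
    return a_add == b_add && a_has == b_has &&
           a.n == b.n && memcmp(a.elems, b.elems, sizeof a.elems) == 0;
}

int main(void) {
    set_t s = { {7}, 1 };                       /* S = {7} */
    int cases[][2] = { {3, 3}, {3, 5}, {7, 7} };
    for (int i = 0; i < 3; i++) {
        int x = cases[i][0], y = cases[i][1];
        printf("add(%d)/contains(%d): condition=%d actual=%d\n",
               x, y, commute_condition(&s, x, y), actually_commute(s, x, y));
    }
    return 0;
}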
Automating the Choice of Consistency Levels in Replicated Systems
TLDR
This work presents SIEVE, a tool that relieves Java programmers of this error-prone decision process, allowing applications to automatically extract good performance when possible, while resorting to strong consistency whenever required by the target semantics.
Commutativity race detection
TLDR
A commutativity race occurs in a given execution when two library method invocations can happen concurrently yet do not commute; a new logical fragment is presented for specifying commutativity conditions that guarantees a constant number of comparisons per method invocation.
Commutativity Condition Refinement
TLDR
It is shown that one can pose the commutativity question in a way that does not introduce additional quantifiers, via a mechanized lifting of a (potentially partial) specification to an equivalent total specification.
NrOS: Effective Replication and Sharing in an Operating System
TLDR
NrOS is a new OS kernel with a safer approach to synchronization that runs many POSIX programs and is primarily constructed as a simple, sequential kernel with no concurrency, making it easier to develop and reason about its correctness.

References

Showing 1–10 of 92 references
Commutativity analysis: a new analysis technique for parallelizing compilers
TLDR
This article presents a new analysis technique, commutativity analysis, for automatically parallelizing computations that manipulate dynamic, pointer-based data structures and presents performance results for the generated parallel code running on the Stanford DASH machine.
Exploring the limits of disjoint access parallelism
TLDR
The design and architecture of a prototype tool that provides insights about critical sections is described; built on the Pin binary rewriting engine, it works on unmodified x86 binaries and considers both the amount of contention for a particular lock and the potential amount of disjoint-access parallelism.
Experience distributing objects in an SMMP OS
TLDR
An object-oriented structure that minimizes sharing by providing a natural mapping from independent requests to independent code paths and data structures, and the selective partitioning, distribution, and replication of object implementations in order to improve locality are found to be effective in improving scalability of SMMP operating systems.
Making asynchronous parallelism safe for the world
TLDR
A parallel programming model is proposed that allows asynchronous threads of control but restricts shared-memory accesses and other side effects so as to prevent the program's behavior from depending on any accidents of execution order that can arise from the indeterminacy of the asynchronous process model.
Commutative set: a language extension for implicit parallel programming
TLDR
A generalized semantic-commutativity-based programming extension, called Commutative Set (COMMSET), and associated compiler technology enable multiple forms of parallelism and yield well-performing parallelizations in cases where they were inapplicable or non-performing before.
Commutativity-based concurrency control for abstract data types
  • W. Weihl
  • Computer Science
    [1988] Proceedings of the Twenty-First Annual Hawaii International Conference on System Sciences. Volume II: Software track
  • 1988
TLDR
Two novel concurrency control algorithms for abstract data types are presented, and it is proved that both algorithms ensure a local atomicity property called dynamic atomicity, which means that they can be used in combination with any other algorithms that also ensure dynamic atomicity.
Scalable address spaces using RCU balanced trees
TLDR
A new design for increasing the concurrency of kernel operations on a shared address space is contributed by exploiting read-copy-update (RCU) so that soft page faults can both run in parallel with operations that mutate the same address space and avoid contending with other page faults on shared cache lines.
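A minimal read-copy-update-flavored sketch of my own (not the paper's balanced-tree code; the mapping type stands in for a toy VM area): readers follow an atomically published pointer and never block, while the writer copies, modifies the copy, and publishes it with a single pointer swing. Reclamation of old versions, which real RCU must defer, is handled here only by joining the reader first.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct { long lo, hi; } mapping;           /* a toy "VM area" */
static _Atomic(mapping *) current;

static void *reader(void *arg) {
    (void)arg;
    for (int i = 0; i < 5; i++) {
        mapping *m = atomic_load(&current);        /* conflict-free read */
        printf("reader sees [%ld, %ld)\n", m->lo, m->hi);
    }
    return NULL;
}

int main(void) {
    mapping *m0 = malloc(sizeof *m0);
    *m0 = (mapping){0, 4096};
    atomic_store(&current, m0);                    /* publish initial version */

    pthread_t r;
    pthread_create(&r, NULL, reader, NULL);

    /* Writer: copy, modify the copy, publish with one atomic pointer swing. */
    mapping *m1 = malloc(sizeof *m1);
    *m1 = *atomic_load(&current);
    m1->hi = 8192;
    atomic_store(&current, m1);                    /* readers switch over */

    pthread_join(r, NULL);
    /* Real RCU defers freeing m0 until no reader can still hold it. */
    free(m0);
    free(m1);
    return 0;
}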
CUTE: a concolic unit testing engine for C
TLDR
A method to represent and track constraints that capture the behavior of a symbolic execution of a unit with memory graphs as inputs is developed and an efficient constraint solver is proposed to facilitate incremental generation of such test inputs.
Laws of order: expensive synchronization in concurrent algorithms cannot be eliminated
TLDR
It is proved that it is impossible to build concurrent implementations of classic and ubiquitous specifications such as sets, queues, stacks, mutual exclusion and read-modify-write operations, that completely eliminate the use of expensive synchronization.
Corey: An Operating System for Many Cores
TLDR
This paper proposes three operating system abstractions (address ranges, kernel cores, and shares) that allow applications to control inter-core sharing and to take advantage of the likely abundance of cores by dedicating cores to specific operating system functions.