Transactional Mutex Locks

@inproceedings{Dalessandro2010TransactionalML,
  title={Transactional Mutex Locks},
  author={Luke Dalessandro and David Dice and Michael L. Scott and Nir Shavit and Michael F. Spear},
  booktitle={European Conference on Parallel Processing},
  year={2010}
}
Mutual exclusion (mutex) locks limit concurrency but offer low single-thread latency. Software transactional memory (STM) typically has much higher latency, but scales well. We present transactional mutex locks (TML), which attempt to achieve the best of both worlds for read-dominated workloads. We also propose compiler optimizations that reduce the latency of TML to within a small fraction of mutex overheads. Our evaluation of TML, using microbenchmarks on the x86 and SPARC architectures… 
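The abstract does not spell out the mechanism, so the following is only a minimal sketch. It assumes the design usually attributed to TML: a single global sequence counter that is even when no writer is active and odd while one writer holds it. All names here (`Tml`, `Tx`, `TmlAbort`, `atomically`) are hypothetical, the shared locations are modeled as `std::atomic` values for simplicity, and none of the compiler optimizations mentioned above are attempted.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical sketch of a sequence-lock-style TML; not the authors' code.
struct TmlAbort {};  // thrown to signal that the transaction must restart

class Tml {
public:
    struct Tx {
        std::uint64_t snapshot = 0;  // counter value seen at begin
        bool writer = false;         // true once this transaction holds the lock
    };

    // Wait until no writer is active, then remember the (even) counter value.
    void begin(Tx& tx) {
        tx.writer = false;
        do { tx.snapshot = seq_.load(); } while (tx.snapshot & 1);
    }

    // Read a location, then abort if any writer has started since begin.
    template <typename T>
    T read(Tx& tx, const std::atomic<T>& loc) {
        T value = loc.load();
        if (!tx.writer && seq_.load() != tx.snapshot) throw TmlAbort{};
        return value;
    }

    // First write upgrades to writer by moving the counter from even to odd;
    // after that, writes go in place and the transaction cannot abort.
    template <typename T>
    void write(Tx& tx, std::atomic<T>& loc, T value) {
        if (!tx.writer) {
            std::uint64_t expected = tx.snapshot;
            if (!seq_.compare_exchange_strong(expected, tx.snapshot + 1))
                throw TmlAbort{};
            tx.writer = true;
        }
        loc.store(value);
    }

    // A writer releases the lock by making the counter even again;
    // a read-only transaction has nothing to do.
    void commit(Tx& tx) {
        if (tx.writer) { seq_.fetch_add(1); tx.writer = false; }
    }

    std::uint64_t sequence() const { return seq_.load(); }

private:
    std::atomic<std::uint64_t> seq_{0};
};

// Retry loop: run `body` until it commits without a TmlAbort.
template <typename F>
void atomically(Tml& tml, F body) {
    for (;;) {
        Tml::Tx tx;
        tml.begin(tx);
        try {
            body(tx);
            tml.commit(tx);
            return;
        } catch (const TmlAbort&) {
            // Conflict detected before any in-place write: just retry.
        }
    }
}
```

Under these assumptions a read-only transaction never modifies shared metadata, which is what makes the scheme attractive for read-dominated workloads. A caller would wrap its accesses as `atomically(tml, [&](Tml::Tx& tx) { int v = tml.read(tx, x); tml.write(tx, x, v + 1); });`.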

FASTLANE: Streamlining Transactions for Low Thread Counts

Preliminary evaluation results indicate that the approach provides promising performance at low thread counts: FASTLANE almost systematically wins over a classical STM in the 2-4 threads range, and often performs better than sequential execution of the non-instrumented version of the same application.

FastLane: improving performance of software transactional memory for low thread counts

Evaluation results indicate that the approach provides promising performance at low thread counts: FastLane almost systematically wins over a classical STM in the 1-6 threads range, and often performs better than sequential execution of the non-instrumented version of the same application starting with 2 threads.

Exploiting Off-the-Shelf Virtual Memory Mechanisms to Boost Software Transactional Memory

It is claimed that the hardware and kernel-level mechanisms that already support virtual memory on commodity computers can also play an unexpected role: to fulfill the core concurrency control needs of an STM.

TrC-MC: Decentralized Software Transactional Memory for Multi-multicore Computers

K. Chan, Cho-Li Wang. 2011 IEEE 17th International Conference on Parallel and Distributed Systems, 2011.

This paper applies two design changes, namely zone partitioning and timestamp extension, to optimize an existing decentralized algorithm and finds it as much as several times faster than the state-of-the-art software transactional memory system.

Remote Invalidation: Optimizing the Critical Path of Memory Transactions

Remote Invalidation (or RInval) is a new STM algorithm that improves STM performance through remote execution of commit and invalidation routines and through cache-aligned communication. It also reduces the overhead of spin locking and cache misses on shared locks.

Remote Transaction Commit: Centralizing Software Transactional Memory Commits

Remote Transaction Commit (or RTC), a mechanism for executing commit phases of STM transactions, is introduced, which decreases the overheads of spinning on locks during commit and enables exploiting the benefits of coarse-grained locking algorithms and Bloom filter-based algorithms.

On Improving Transactional Memory: Optimistic Transactional Boosting, Remote Execution, and Hybrid Transactions

This dissertation designs an optimistic methodology for transactional boosting to specifically enhance the performance of transactional data structures, and proposes a hybrid TM solution which exploits the new HTM features of Intel's currently released Haswell processor.

Lightweight, robust adaptivity for software transactional memory

This work presents a low-overhead system for adapting between STM implementations, which enables adaptivity between different parameterizations of a given algorithm, and it allows adapting between the use of transactions and coarse-grained locks.

References

Code Generation and Optimization for Transactional Memory Constructs in an Unmanaged Language

This system is the first to demonstrate that transactions integrate well with an unmanaged language, and can perform as well as fine-grain locking while providing the programming ease of coarse-grain locking even in an unmanaged environment.

Transactional Locking II

This paper introduces the transactional locking II (TL2) algorithm, a software transactional memory (STM) algorithm based on a combination of commit-time locking and a novel global version-clock based validation technique, which is ten-fold faster than a single lock.

Compiler and runtime support for efficient software transactional memory

A high-performance software transactional memory (STM) system integrated into a managed runtime environment is presented. Its JIT compiler is the first to optimize the overheads of STM, and novel techniques for enabling JIT optimizations on STM operations are shown.

Software Transactional Memory: Why Is It Only a Research Toy?

It is observed that the overall performance of TM is significantly worse at low levels of parallelism, which is likely to limit the adoption of this programming paradigm.

Scalable Techniques for Transparent Privatization in Software Transactional Memory

A dynamic hybrid of PVRs and strict in-order commits is stable and reasonably fast across a wide range of load parameters, but the remaining overheads are high enough to suggest the need for programming model or architectural support.

Optimizing transactions for captured memory

This paper proposes runtime and compiler optimizations to elide STM barriers to captured memory and implements those optimizations in the Intel C++ STM compiler.

Reducing Memory Ordering Overheads in Software Transactional Memory

This work proposes compiler optimizations that can safely eliminate many fence instructions and obtains a reduction of up to 89% in the number of fences, and 20% in per-transaction latency, for common transactional benchmarks.

RingSTM: scalable transactions with a single atomic instruction

The RingSTM system is the first STM that is inherently livelock-free and privatization-safe while at the same time permitting parallel writeback by concurrent disjoint transactions.

McRT-STM: a high performance software transactional memory system for a multi-core runtime

A software transactional memory (STM) system that is part of McRT, an experimental Multi-Core RunTime, and a detailed performance analysis of various STM design tradeoffs such as pessimistic versus optimistic concurrency, undo logging versus write buffering, and cache line based versus object based conflict detection are presented.

Optimizing memory transactions

A new 'direct access' implementation that avoids searching thread-private logs is introduced, compiler optimizations to reduce the amount of logging are developed, and a series of GC-time techniques to compact the logs generated by long-running atomic blocks are presented.