Embedded-TM: Energy and complexity-effective hardware transactional memory for embedded multicore systems

@article{Ferri2010EmbeddedTMEA,
  title={Embedded-TM: Energy and complexity-effective hardware transactional memory for embedded multicore systems},
  author={Cesare Ferri and Samantha Wood and Tali Moreshet and R. Iris Bahar and Maurice Herlihy},
  journal={J. Parallel Distributed Comput.},
  year={2010},
  volume={70},
  pages={1042--1052}
}

Characterizing Energy Consumption in Hardware Transactional Memory Systems

This work characterizes the performance and energy consumption of two well-known Hardware Transactional Memory systems that employ opposite policies for data versioning and conflict management, and finds that although Lazy-Lazy beats Eager-Eager on average, performance deviates considerably depending on the particular characteristics of each application.

Hardware transactional memory on multi-processor FPGA platform

This paper proposes a hardware transactional memory (HTM) that exploits both version and conflict management and offers up to a 14% improvement in clock cycles over an HTM scheme that exploits only conflict management.

TMbox: A Flexible and Reconfigurable 16-Core Hybrid Transactional Memory System

This paper evaluates a 16-core Hybrid Transactional Memory implementation based on the TinySTM-ASF proposal on a Virtex-5 FPGA and accelerates three benchmarks written to investigate TM.

Energy-Efficient and High-Performance Lock Speculation Hardware for Embedded Multicore Systems

This article proposes Embedded-Spec, a hardware solution for supporting transparent lock speculation, without the requirement for special supporting instructions, and concludes that for resource-constrained platforms, lock speculation can provide real benefits in terms of improved concurrency and energy efficiency.

On the design of energy‐efficient hardware transactional memory systems

This work characterizes the performance and energy consumption of two well‐known hardware transactional memory systems that employ opposite policies for data versioning and conflict management, and finds that even though lazy‐lazy beats eager‐eager on average, performance deviates considerably depending on the particular characteristics of each application and the settings of both systems.

Energy-Performance Tradeoffs in Software Transactional Memory

This work characterizes the behavior of three state-of-the-art lock-based STM algorithms, along with three different conflict resolution schemes, and proposes a DVFS-based technique that can be integrated into the resolution policies to improve the energy-delay product (EDP).

Investigating Transactional Memory for High Performance Embedded Systems

A Transaction Management Unit for Hardware Transactional Memories enables three different contention management strategies, which can be applied according to the workload, and enables unbounded transactions in terms of size.

SoC-TM: Integrated HW/SW support for transactional memory programming on embedded MPSoCs

  • C. Ferri, A. Marongiu, M. Herlihy
  • Computer Science
    2011 Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)
  • 2011
This proposal leverages a Hardware Transactional Memory (HTM) design, based on a dedicated HW module for conflict management, whose functionality is exposed to the software through compiler directives, implemented as an extension to the popular OpenMP programming model.

A Flexible Hybrid Transactional Memory Multicore on FPGA

A 16-core Hybrid Transactional Memory implementation based on the TinySTM-ASF proposal on a Virtex-5 FPGA is evaluated and three benchmarks written to investigate TM are accelerated.

Configurable Version Management Hardware Transactional Memory for Multi-processor Platform

This paper proposes a hardware transactional memory (HTM) with interchangeable version management that is targeted for embedded applications and is area-efficient compared to current implementations that apply cache coherence protocols.

References

Energy and Throughput Efficient Transactional Memory for Embedded Multicore Systems

It is shown that the victim cache scheme can provide up to a 4X improvement in energy-delay product, compared to a traditional HTM scheme that uses a separate transactional cache.

On the energy-efficiency of software transactional memory

Experimental results show that the proposed novel scratchpad-based energy-aware STM design strategies can achieve an energy improvement of up to ~36% with regard to the base STM for applications characterized by short-lived transactions and relatively high abort rate.

A hardware/software framework for supporting transactional memory in a MPSoC environment

This paper demonstrates a complete hardware transactional memory solution for an embedded multi-core architecture consisting of a cache-coherent ARM-based cluster, similar to ARM's MPCore. It evaluates the architectural framework over a set of different system and application settings and shows that transactional memory is a promising solution, even for resource-constrained embedded multiprocessors.

Energy reduction in multiprocessor systems using transactional memory

It is shown that transactional memory has an advantage in terms of energy consumption over locks, but that this advantage largely depends on the system architecture, the contention level, and the policy of conflict resolution.

Cache coherence tradeoffs in shared-memory MPSoCs

This work aims at providing a comparative energy and performance analysis of cache-coherence support schemes in MPSoCs by exploring different cache-coherent shared-memory communication schemes for a number of cache configurations and workloads.

Scratchpad memory: a design alternative for cache on-chip memory in embedded systems

The results clearly establish scratchpad memory as a low-power alternative in most situations, with an average energy reduction of 40% and an average area-time reduction of 46% compared with the cache memory.

Power and performance tradeoffs using various caching strategies

  • R. I. Bahar, G. Albera, S. Manne
  • Computer Science
    Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379)
  • 1998
It is shown that, by using buffers, energy consumption of the memory subsystem may be reduced by as much as 13% for certain data cache configurations and by as much as 23% for certain instruction cache configurations without adversely affecting processor performance or on-chip energy consumption.

Power/Performance Hardware Optimization for Synchronization Intensive Applications in MPSoCs

This paper explores optimization techniques of the synchronization mechanisms for MPSoCs based on complex interconnect (network-on-chip) by introducing a HW module, the synchronization-operation buffer (SB), which queues and manages the requests issued by the processors.

Analyzing on-chip communication in a MPSoC environment

The simulation environment proved capable of a detailed comparative analysis between two industry-standard communication architectures, under realistic workloads and different system configurations, pointing out the impact of fine grained architectural mismatches on macroscopic performance differences.

Performance Pathologies in Hardware Transactional Memory

The authors identify a set of performance pathologies that could degrade performance in proposed HTM designs and suggest that improving conflict resolution could eliminate these pathologies, so that designers can build robust HTM systems.