The emphasis in microprocessor design has shifted from high performance to a combination of high performance and low power. Until recently, this trend was mostly true for uniprocessors. In this work we focus on new energy consumption issues unique to multiprocessor systems: synchronization of accesses to shared memory. We investigate and compare different ...
We investigate how transactional memory can be adapted for embedded systems. We consider energy consumption and complexity to be driving concerns in the design of these systems and therefore adapt simple hardware transactional memory (HTM) schemes in our architectural design. We propose several different cache structures and contention management schemes to ...
Two overriding concerns in the development of embedded MPSoCs are ease of programming and hardware complexity. In this paper we present SoC-TM, an integrated HW/SW solution for transactional programming on embedded MPSoCs. Our proposal leverages a Hardware Transactional Memory (HTM) design, based on a dedicated HW module for conflict management, ...
One important way in which multiprocessors differ from uniprocessors is in the need to provide programmers the ability to synchronize concurrent access to memory. Transactional memory was proposed as a way of improving throughput, especially when the rate of synchronization conflict is low. In this paper we explore power implications of transactional ...
We evaluate the energy efficiency and performance of a number of synchronization mechanisms adapted for embedded devices. We focus on simple hardware accelerators for common software synchronization patterns. We compare the energy efficiency of a range of shared memory benchmarks using both spin-locks and a simple hardware transactional memory. In most ...
We propose a new design for an energy-efficient hardware transactional memory (HTM) system for power-aware embedded devices. Prior hardware transactional memory designs proposed a small, fully-associative transactional cache at the same level as the L1 cache. We propose an alternative design that unifies the transactional and L1 caches, and provides a small ...
Manufacturers are focusing on multiprocessor-system-on-a-chip (MPSoC) architectures in order to provide increased concurrency, rather than increased clock speed, for both large-scale and embedded systems. Traditionally, lock-based synchronization is provided to support concurrency; however, managing locks can be very difficult and error-prone. In ...
Many-cores, processors with hundreds of cores, are becoming increasingly popular in general-purpose computing, yet power is a limiting factor in their performance. In this paper, we compare the power and performance of two design points in the many-core processor domain. The XMT general-purpose processor provides a significant runtime advantage on irregular ...
Roughly ninety percent of all microprocessors manufactured in any one year are intended for embedded devices such as cameras, cell phones, or machine controllers. We evaluate the energy efficiency and performance of spin-locks and simple hardware transactional memory on embedded devices. In most cases, transactional memory provides both significantly ...