Learn More
The emphasis in microprocessor design has shifted from high performance, to a combination of high performance and low power. Until recently, this trend was mostly true for uniprocessors. In this work we focus on new energy consumption issues unique to multiprocessor systems: synchronization of accesses to shared memory. We investigate and compare different(More)
We investigate how transactional memory can be adapted for embedded systems. We consider energy consumption and complexity to be driving concerns in the design of these systems and therefore adapt simple hardware transactional memory (HTM) schemes in our architectural design. We propose several different cache structures and contention management schemes to(More)
Two overriding concerns in the development of embedded MPSoCs are ease of programming and hardware complexity. In this paper we present <b>SoC-TM</b>, an integrated HW/SW solution for transactional programming on embedded MPSoCs. Our proposal leverages a Hardware Transactional Memory (HTM) design, based on a dedicated HW module for conflict management,(More)
Manufacturers are focusing on multiprocessor-system-on-a-chip (MPSoC) architectures in order to provide increased concurrency, rather than increased clock speed, for both large-scale as well as embedded systems. Traditionally lock-based synchronization is provided to support concurrency; however, managing locks can be very difficult and error prone. In(More)
We evaluate the energy-efficiency and performance of a number of synchronization mechanisms adapted for embedded devices. We focus on simple hardware accelerators for common software synchronization patterns. We compare the energy efficiency of a range of shared memory benchmarks using both spin-locks and a simple hardware transactional memory. In most(More)
—Many-cores, processors with 100s of cores, are becoming increasingly popular in general-purpose computing, yet power is a limiting factor in their performance. In this paper, we compare the power and performance of two design points in the many-core processor domain. The XMT 1 general-purpose processor provides significant runtime advantage on irregular(More)
We propose a new design for an energy-efficient hardware transactional memory (HTM) system for power-aware embedded devices. Prior hardware transactional memory designs proposed a small, fully-associative transactional cache at the same level as the L1 cache. We propose an alternative design that unifies the transactional and L1 caches, and provides a small(More)
—High-end embedded systems, like their general-purpose counterparts, are turning to many-core cluster-based shared-memory architectures that provide a shared memory abstraction subject to non-uniform memory access (NUMA) costs. In order to keep the cores and memory hierarchy simple, many-core embedded systems tend to employ simple, scratchpad-like memories,(More)
— Current trends in microprocessor designs indicate increasing pipeline depth in order to keep up with higher clock frequencies and increased architectural complexity. Speculatively issued instructions are particularly sensitive to increases in pipeline depth. In this paper, we use load hit speculation as an example, and evaluate its cost-effectiveness as(More)