Learn More
Multi-core processors naturally exploit thread-level par-allelism (TLP). However, extracting instruction-level paral-lelism (ILP) from individual applications or threads is still a challenge as application mixes in this environment are nonuniform. Thus, multi-core processors should be flexible enough to provide high throughput for uniform parallel(More)
Page-based virtual memory improves programmer productivity, security, and memory utilization, but incurs performance overheads due to costly page table walks after TLB misses. This overhead can reach 50% for modern workloads that access increasingly vast memory with stagnating TLB sizes. To reduce the overhead of virtual memory, this paper proposes(More)
The continuously increasing gap between processor and memory speeds is a serious limitation to the performance achievable by future microprocessors. Currently, processors tolerate long-latency memory operations largely by maintaining a high number of in-flight instructions. In the future, this may require supporting many hundreds, or even thousands, of(More)
Transactional Memory aims to provide a programming model that makes parallel programming easier. Hardware implementations of transactional memory (HTM) suffer from fewer overheads than implementations in software, and refinements in conflict management strategies for HTM allow for even larger improvements. In particular, lazy conflict management has been(More)
Transactional Memory (TM) is being studied widely as a new technique for synchronizing concurrent accesses to shared memory data structures for use in multi-core systems. Much of the initial work on TM has been evaluated using microbenchmarks and application kernels; it is not clear whether conclusions drawn from these workloads will apply to larger(More)
— With the continued scaling of NAND flash and multi-level cell technology, flash-based storage has gained widespread use in systems ranging from mobile platforms to enterprise servers. However, the robustness of NAND flash cells is an increasing concern, especially at nanometer-regime process geometries. NAND flash memory bit error rate increases(More)
In this paper, we present a Haskell Transactional Memory benchmark to provide a comprehensive application suite for the use of Software Transactional Memory (STM) researchers. We develop a framework to profile the execution of the benchmark applications and to collect detailed runtime data on their transactional behavior, running them on a 128-core(More)
—Translation Lookaside Buffers (TLBs) are ubiquitously used in modern architectures to cache virtual-to-physical mappings and, as they are looked up on every memory access, are paramount to performance scalability. The emergence of chip-multiprocessors (CMPs) with per-core TLBs, has brought the problem of TLB coherence to front stage. TLBs are kept coherent(More)