The gem5 simulator
TLDR
The high level of collaboration on the gem5 project, combined with the previous success of the component parts and a liberal BSD-like license, makes gem5 a valuable full-system simulation tool.
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
TLDR
The Wisconsin Multifacet Project has created a simulation toolset to characterize and evaluate the performance of multiprocessor hardware systems commonly used as database and web servers and has released a set of timing simulator modules for modeling the timing of the memory system and microprocessors.
LogTM: log-based transactional memory
TLDR
This paper presents a new implementation of transactional memory, log-based transactional memory (LogTM), that makes commits fast by storing old values to a per-thread log in cacheable virtual memory and storing new values in place.
Frequent Pattern Compression: A Significance-Based Compression Scheme for L2 Caches
TLDR
This work proposes and evaluates a simple significance-based compression scheme that has a low compression and decompression overhead and provides comparable compression ratios to more complex schemes that have higher cache hit latencies.
LogTM-SE: Decoupling Hardware Transactional Memory from Caches
TLDR
This paper proposes a hardware transactional memory system called LogTM Signature Edition (LogTM-SE), which uses signatures to summarize a transaction's read- and write-sets, detects conflicts on coherence requests (eager conflict detection), and allows cache victimization, unbounded nesting, thread context switching and migration, and paging.
Adaptive cache compression for high-performance processors
  • A. Alameldeen, D. Wood
  • Computer Science
    Proceedings. 31st Annual International Symposium…
  • 19 June 2004
TLDR
An adaptive policy that dynamically adapts to the costs and benefits of cache compression is developed and it is shown that compression can improve performance for memory-intensive commercial workloads by up to 17%.
Managing Wire Delay in Large Chip-Multiprocessor Caches
TLDR
This paper develops L2 cache designs for CMPs that incorporate block migration, stride-based prefetching between L1 and L2 caches, and presents a hybrid design, combining all three techniques, that improves performance by an additional 2% to 19% over prefetching alone.
A Primer on Memory Consistency and Cache Coherence
TLDR
This primer provides readers with a basic understanding of consistency and coherence, and presents both high-level concepts and specific, concrete examples from real-world systems.
DBMSs on a Modern Processor: Where Does Time Go?
TLDR
This paper examines four commercial DBMSs running on an Intel Xeon and NT 4.0 and introduces a framework for analyzing query execution time, and finds that database developers should not expect the overall execution time to decrease significantly without addressing stalls related to subtle implementation issues.
Implementation techniques for main memory database systems
TLDR
This paper considers the changes necessary to permit a relational database system to take advantage of large amounts of main memory, evaluates AVL vs B+-tree access methods and hash-based query processing strategies vs sort-merge, and studies recovery issues when most or all of the database fits in main memory.