The gem5 simulator
The high level of collaboration on the gem5 project, combined with the previous success of the component parts and a liberal BSD-like license, makes gem5 a valuable full-system simulation tool.
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
The Wisconsin Multifacet Project has created a simulation toolset to characterize and evaluate the performance of multiprocessor hardware systems commonly used as database and web servers and has released a set of timing simulator modules for modeling the timing of the memory system and microprocessors.
LogTM: log-based transactional memory
- Kevin E. Moore, J. Bobba, Michelle J. Moravan, M. Hill, D. Wood
- Computer Science
- The Twelfth International Symposium on High…
- 27 February 2006
This paper presents a new implementation of transactional memory, log-based transactional memory (LogTM), that makes commits fast by storing old values to a per-thread log in cacheable virtual memory and storing new values in place.
LogTM-SE: Decoupling Hardware Transactional Memory from Caches
- Luke Yen, J. Bobba, D. Wood
- Computer Science
- IEEE 13th International Symposium on High…
- 10 February 2007
This paper proposes a hardware transactional memory system called LogTM Signature Edition (LogTM-SE), which uses signatures to summarize a transaction's read- and write-sets and detects conflicts on coherence requests (eager conflict detection). It allows cache victimization, unbounded nesting, thread context switching and migration, and paging.
Efficiently enabling conventional block sizes for very large die-stacked DRAM caches
- Gabriel H. Loh, M. Hill
- Computer Science
- 44th Annual IEEE/ACM International Symposium on…
- 3 December 2011
Die-stacking technology enables multiple layers of DRAM to be integrated with multicore processors. A promising use of stacked DRAM is as a cache, since its capacity is insufficient to be all of main…
Amdahl's Law in the Multicore Era
- M. Hill
- Computer Science
- Computer
- 1 July 2008
Augmenting Amdahl's law with a corollary for multicore hardware makes it relevant to future generations of chips with multiple processor cores. Obtaining optimal multicore performance will require…
A Primer on Memory Consistency and Cache Coherence
This primer provides readers with a basic understanding of consistency and coherence, presenting both high-level concepts and specific, concrete examples from real-world systems.
Evaluating Associativity in CPU Caches
Although all-associativity simulation is theoretically less efficient than forest simulation or stack simulation (a commonly used simulation algorithm), in practice it is not much slower and allows the simulation of many more caches with a single pass through an address trace.
DBMSs on a Modern Processor: Where Does Time Go?
This paper examines four commercial DBMSs running on an Intel Xeon and NT 4.0 and introduces a framework for analyzing query execution time, and finds that database developers should not expect the overall execution time to decrease significantly without addressing stalls related to subtle implementation issues.
Weaving Relations for Cache Performance
This paper proposes a new data organization model called PAX (Partition Attributes Across) that significantly improves cache performance by grouping together all values of each attribute within each page, and demonstrates that in-page data placement is the key to high cache performance.