• Publications
• High performance cache replacement using re-reference interval prediction (RRIP)
  TLDR: This paper proposes cache replacement using Re-reference Interval Prediction (RRIP).
• Adaptive insertion policies for high performance caching
  TLDR: We show that simple changes to the insertion policy can significantly reduce cache misses for memory-intensive workloads, bridging two-thirds of the gap between LRU and OPT.
• Scheduling heterogeneous multi-cores through performance impact estimation (PIE)
  TLDR: This paper proposes Performance Impact Estimation (PIE) as a mechanism to predict which workload-to-core mapping is likely to provide the best performance.
• SHiP: Signature-based Hit Predictor for high performance caching
  TLDR: We propose a Signature-based Hit Predictor (SHiP) to learn the re-reference behavior of cache lines belonging to each signature.
• DRAMsim: a memory system simulator
  TLDR: We introduce DRAMsim, a detailed and highly configurable C-based memory system simulator whose memory-system parameters can be easily varied.
• Adaptive insertion policies for managing shared caches
  TLDR: We propose a Thread-Aware Dynamic Insertion Policy (TADIP) that can take into account the memory requirements of each of the concurrently executing applications.
• CAMEO: A Two-Level Memory Organization with Capacity of Main Memory and Flexibility of Hardware-Managed Cache
  TLDR: We propose CAMEO, a hardware-based Cache-like Memory Organization that not only makes stacked DRAM visible as part of the memory address space but also exploits data locality on a fine-grained basis.
• MCM-GPU: Multi-chip-module GPUs for continued performance scalability
  TLDR: In this paper we demonstrate that package-level integration of multiple GPU modules to build larger logical GPUs can enable continuous performance scaling beyond Moore's law.
• CoLT: Coalesced Large-Reach TLBs
  TLDR: We propose Coalesced Large-Reach TLBs (CoLT), which leverage intermediate contiguity in page allocations to coalesce multiple virtual-to-physical page translations into single TLB entries rather than the hundreds required for the constituent base pages.
• Achieving Non-Inclusive Cache Performance with Inclusive Caches: Temporal Locality Aware (TLA) Cache Management Policies
  TLDR: We show that the limited performance of inclusive caches is mostly due to inclusion victims—lines that are evicted from the core caches to satisfy the inclusion property—and not the reduced cache capacity of the hierarchy due to duplication of data.