Publications
High performance cache replacement using re-reference interval prediction (RRIP)
TLDR: This paper proposes cache replacement using Re-reference Interval Prediction (RRIP).
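The idea can be illustrated with a small sketch. The following Python fragment is a simplified, single-set model of a static RRIP policy with 2-bit re-reference prediction values; the way count, counter width, and all names are illustrative rather than taken from the paper.

```python
# Simplified single-set model of static RRIP with 2-bit re-reference
# prediction values (RRPVs). Widths, way count, and names are illustrative.
RRPV_MAX = 3        # 3 = predicted re-reference in the distant future
RRPV_INSERT = 2     # new lines start with a long, but not distant, prediction

class RRIPSet:
    def __init__(self, ways=8):
        self.tags = [None] * ways
        self.rrpv = [RRPV_MAX] * ways

    def access(self, tag):
        if tag in self.tags:                 # hit: predict near-immediate reuse
            self.rrpv[self.tags.index(tag)] = 0
            return "hit"
        while True:                          # miss: look for a distant-future victim
            for way, value in enumerate(self.rrpv):
                if value == RRPV_MAX:
                    self.tags[way] = tag
                    self.rrpv[way] = RRPV_INSERT
                    return "miss"
            self.rrpv = [v + 1 for v in self.rrpv]   # age all lines and retry

# A one-off scan ("B".."E") does not immediately displace the reused line "A".
s = RRIPSet(ways=4)
for t in ["A", "B", "C", "D", "A", "E", "A"]:
    print(t, s.access(t))
```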
Adaptive insertion policies for high performance caching
TLDR: We show that simple changes to the insertion policy can significantly reduce cache misses for memory-intensive workloads, bridging two-thirds of the gap between LRU and OPT.
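As a rough illustration of the kind of insertion-policy change the paper studies, the sketch below implements bimodal insertion for one set: most misses are placed at the LRU end of the recency stack, a small fraction at MRU. The epsilon value and names are illustrative, and the full proposal additionally chooses between this and conventional LRU insertion at run time via set dueling.

```python
import random

BIP_EPSILON = 1 / 32      # fraction of misses inserted at MRU (value is illustrative)

class BIPSet:
    """One cache set with bimodal insertion: most misses go to the LRU slot."""
    def __init__(self, ways=8):
        self.ways = ways
        self.stack = []                       # stack[0] = MRU, stack[-1] = LRU

    def access(self, tag):
        if tag in self.stack:                 # hit: promote to MRU as usual
            self.stack.remove(tag)
            self.stack.insert(0, tag)
            return "hit"
        if len(self.stack) == self.ways:      # miss in a full set: evict the LRU line
            self.stack.pop()
        if random.random() < BIP_EPSILON:     # rare MRU insertion keeps some adaptivity
            self.stack.insert(0, tag)
        else:
            self.stack.append(tag)            # usual case: insert at the LRU position
        return "miss"
```

The effect is that a long streaming access pattern cannot flush the whole set, because its lines mostly enter at the LRU position and are evicted first.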
Scheduling heterogeneous multi-cores through performance impact estimation (PIE)
TLDR: This paper proposes Performance Impact Estimation (PIE) as a mechanism to predict which workload-to-core mapping is likely to provide the best performance.
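The scheduling decision that such a predictor supports can be sketched as follows: given a per-workload estimate of performance on each core type, pick the mapping with the best predicted throughput. The CPI numbers and workload names below are made up; the paper derives the estimates from run-time counters rather than assuming them.

```python
from itertools import permutations

estimated_cpi = {            # workload -> (CPI on big core, CPI on little core), made up
    "mcf":   (2.0, 2.6),     # memory-bound: the big core adds little
    "gcc":   (1.0, 1.8),     # compute-bound: benefits most from the big core
    "bzip2": (0.9, 1.3),
}

def best_mapping(workloads, n_big, n_little):
    """Exhaustively try which workloads get the big cores (fine at toy scale)."""
    best, best_tput = None, -1.0
    for order in permutations(workloads):
        on_big, on_little = order[:n_big], order[n_big:n_big + n_little]
        tput = sum(1.0 / estimated_cpi[w][0] for w in on_big) + \
               sum(1.0 / estimated_cpi[w][1] for w in on_little)
        if tput > best_tput:
            best, best_tput = (on_big, on_little), tput
    return best, best_tput

# With these numbers the compute-bound workload ends up on the big core.
print(best_mapping(["mcf", "gcc", "bzip2"], n_big=1, n_little=2))
```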
SHiP: Signature-based Hit Predictor for high performance caching
TLDR: We propose a Signature-based Hit Predictor (SHiP) to learn the re-reference behavior of cache lines belonging to each signature.
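The learning structure can be sketched as a table of saturating counters indexed by a signature (for example, the PC of the instruction that filled the line), trained by hits and by evictions of never-reused lines; the counter then steers the insertion prediction. Table organization, counter width, and function names below are illustrative.

```python
from collections import defaultdict

SHCT_MAX = 7                               # 3-bit saturating counters (illustrative)
shct = defaultdict(lambda: 1)              # signature -> re-reference counter

def on_fill(sig):
    """A new line remembers its signature and whether it has been reused."""
    return {"sig": sig, "reused": False}

def on_hit(line):
    """A hit means lines from this signature do get re-referenced."""
    line["reused"] = True
    shct[line["sig"]] = min(SHCT_MAX, shct[line["sig"]] + 1)

def on_evict(line):
    """Eviction without reuse means this signature tends to bring in dead blocks."""
    if not line["reused"]:
        shct[line["sig"]] = max(0, shct[line["sig"]] - 1)

def insertion_prediction(sig):
    """Zero counter => predict no re-reference (e.g. insert with a distant RRPV)."""
    return "distant" if shct[sig] == 0 else "intermediate"
```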
DRAMsim: a memory system simulator
TLDR: We introduce DRAMsim, a detailed and highly configurable C-based memory-system simulator whose memory-system parameters can be varied easily.
Adaptive insertion policies for managing shared caches
TLDR: We propose a Thread-Aware Dynamic Insertion Policy (TADIP) that can take into account the memory requirements of each of the concurrently executing applications.
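A rough sketch of thread-aware policy selection: each thread has its own saturating policy-selector counter, trained by misses in a few dedicated sample sets and consulted by the remaining follower sets when inserting that thread's lines. Counter width, policy names, and the update convention are illustrative.

```python
PSEL_BITS = 10
PSEL_MAX = (1 << PSEL_BITS) - 1
PSEL_MID = PSEL_MAX // 2

psel = {}   # thread id -> saturating policy-selector counter

def note_sample_miss(tid, policy):
    """Misses in sets dedicated to a policy are charged to that policy for this thread."""
    c = psel.get(tid, PSEL_MID)
    if policy == "mru_insert":          # conventional insertion sample set missed
        c = min(PSEL_MAX, c + 1)
    else:                               # bimodal-insertion sample set missed
        c = max(0, c - 1)
    psel[tid] = c

def follower_policy(tid):
    """Follower sets use whichever policy is currently winning for this thread."""
    return "mru_insert" if psel.get(tid, PSEL_MID) < PSEL_MID else "bip_insert"
```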
MCM-GPU: Multi-chip-module GPUs for continued performance scalability
TLDR: In this paper we demonstrate that package-level integration of multiple GPU modules to build larger logical GPUs can enable continuous performance scaling beyond Moore's law.
CAMEO: A Two-Level Memory Organization with Capacity of Main Memory and Flexibility of Hardware-Managed Cache
TLDR: We propose CAMEO, a hardware-based Cache-like Memory Organization that not only makes stacked DRAM visible as part of the memory address space but also exploits data locality on a fine-grained basis.
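A much-simplified sketch of a line-granularity organization in this spirit: every cache-line-sized block belongs to a small congruence group with exactly one slot in fast (stacked) memory, a per-group table records which member currently occupies that slot, and on a fast-memory miss the requested line is swapped in, so the full capacity stays visible to software. The group size, table name, and the omission of the actual data movement are all simplifications.

```python
LINES_PER_GROUP = 4                     # 1 fast slot + 3 slow slots per group (illustrative)

class LineLocationTable:
    def __init__(self, n_groups):
        # entry g = index of the group member currently sitting in fast memory
        self.fast_member = [0] * n_groups

    def access(self, line_addr):
        group = line_addr // LINES_PER_GROUP
        member = line_addr % LINES_PER_GROUP
        if self.fast_member[group] == member:
            return "fast-memory hit"
        # Fast-memory miss: swap the requested line with the current occupant
        # of the group's fast slot and record the new occupant.
        self.fast_member[group] = member
        return "fast-memory miss (lines swapped)"

llt = LineLocationTable(n_groups=1024)
print(llt.access(5))   # group 1, member 1 -> miss, line swapped into fast memory
print(llt.access(5))   # now a fast-memory hit
```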
CoLT: Coalesced Large-Reach TLBs
TLDR: We propose Coalesced Large-Reach TLBs (CoLT), which leverage the intermediate contiguity that memory allocators naturally produce to coalesce multiple virtual-to-physical page translations into single TLB entries rather than the hundreds required for the constituent base pages.
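The coalescing idea can be sketched as follows: when a run of consecutive virtual pages maps to consecutive physical pages, one TLB entry describes the whole run. The entry format, the forward-only growth of the run, and the window size below are illustrative simplifications.

```python
class CoalescedEntry:
    def __init__(self, base_vpn, base_ppn, count):
        self.base_vpn, self.base_ppn, self.count = base_vpn, base_ppn, count

    def translate(self, vpn):
        if self.base_vpn <= vpn < self.base_vpn + self.count:
            return self.base_ppn + (vpn - self.base_vpn)   # same offset within the run
        return None

def coalesce(page_table, vpn, max_span=8):
    """On a TLB miss, grow a run of contiguous VPN->PPN mappings starting at vpn."""
    base_vpn, base_ppn = vpn, page_table[vpn]
    count = 1
    while (count < max_span and base_vpn + count in page_table
           and page_table[base_vpn + count] == base_ppn + count):
        count += 1
    return CoalescedEntry(base_vpn, base_ppn, count)

# Example: four contiguous base pages collapse into a single TLB entry.
pt = {0x100: 0x500, 0x101: 0x501, 0x102: 0x502, 0x103: 0x503, 0x104: 0x900}
entry = coalesce(pt, 0x100)
assert entry.count == 4 and entry.translate(0x102) == 0x502
```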
Fairness-aware scheduling on single-ISA heterogeneous multi-cores
TLDR: We propose fairness-aware scheduling for single-ISA heterogeneous multi-cores, and explore two flavors for doing so.
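One plausible flavor of such a policy, sketched without implying it matches the paper's exact proposals, is equal-progress-style scheduling: track each thread's slowdown relative to running on the big core alone and hand the big core each quantum to the thread that is furthest behind. The bookkeeping and the demo numbers below are made up for illustration.

```python
def pick_big_core_thread(progress):
    """progress[tid] = (instructions a big-core-only run would have retired by now,
                        instructions actually retired so far)."""
    def slowdown(tid):
        ideal, actual = progress[tid]
        return ideal / actual if actual else float("inf")
    return max(progress, key=slowdown)   # largest slowdown gets the big core next

progress = {
    "t0": (1000, 900),   # slowdown 1.11: has had plenty of big-core time
    "t1": (1000, 600),   # slowdown 1.67: furthest behind
    "t2": (1000, 750),   # slowdown 1.33
}
print(pick_big_core_thread(progress))   # -> "t1"
```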