A dual grain hit-miss detector for large Die-Stacked DRAM caches

Abstract

Die-Stacked DRAM caches offer the promise of improved performance and reduced energy by capturing a larger fraction of an application's working set than on-die SRAM caches. However, given that their latency is only 50% lower than that of main memory, DRAM caches considerably increase latency for misses. They also incur a significant energy overhead for remote lookups in snoop-based multi-socket systems. Ideally, it would be possible to detect in advance that a request will miss in the DRAM cache and thus selectively bypass it. This work proposes a "dual grain filter" which successfully predicts whether an access is a hit or a miss in most cases. Experimental results with commercial and scientific workloads show that a 158KB dual-grain filter can correctly predict data block residency for 85% of all accesses to a 256MB DRAM cache. As a result, average off-die latency with our filter is within 8% of that possible with a perfectly accurate filter, which is impractical to implement.
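The latency trade-off described in the abstract can be illustrated with a simple back-of-the-envelope model. The sketch below is not the paper's evaluation methodology. It assumes that a DRAM cache lookup costs half the main-memory latency (as the abstract states), that a miss without bypass pays the lookup and then the memory access serially, that a predicted miss goes straight to memory, and that the predictor is equally accurate on hits and misses. The function names, the 50% hit rate, and the treatment of mispredicted hits (served from memory, ignoring dirty data) are illustrative assumptions.

```python
def avg_latency_no_filter(hit_rate, mem=1.0, cache=0.5):
    """Average off-die latency when every request probes the DRAM cache first.

    Assumption: a miss pays the cache lookup and then the memory access.
    """
    miss_rate = 1.0 - hit_rate
    return hit_rate * cache + miss_rate * (cache + mem)


def avg_latency_with_filter(hit_rate, accuracy, mem=1.0, cache=0.5):
    """Average off-die latency with a hit-miss predictor in front of the cache.

    Assumptions (hypothetical, for illustration only):
      - accuracy is the same for true hits and true misses;
      - a predicted miss bypasses the cache and costs `mem`;
      - a hit mispredicted as a miss is served from memory at cost `mem`
        (this ignores the correctness issue of dirty blocks in the cache);
      - a miss mispredicted as a hit pays `cache + mem`.
    """
    miss_rate = 1.0 - hit_rate
    correct_hit = hit_rate * accuracy * cache
    wrong_on_hit = hit_rate * (1.0 - accuracy) * mem
    correct_miss = miss_rate * accuracy * mem
    wrong_on_miss = miss_rate * (1.0 - accuracy) * (cache + mem)
    return correct_hit + wrong_on_hit + correct_miss + wrong_on_miss


if __name__ == "__main__":
    hit_rate = 0.5  # illustrative value, not taken from the paper
    print("no filter:      ", avg_latency_no_filter(hit_rate))
    print("85% accurate:   ", avg_latency_with_filter(hit_rate, 0.85))
    print("perfect filter: ", avg_latency_with_filter(hit_rate, 1.0))
```

Latencies are normalized to the main-memory latency, so the output only shows the relative ordering of the three cases under these assumptions. It does not reproduce the paper's reported results.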

Cite this paper

@article{ElNacouzi2013ADG,
  title={A dual grain hit-miss detector for large Die-Stacked DRAM caches},
  author={Michel El-Nacouzi and Islam Atta and Misel-Myrto Papadopoulou and Jason Zebchuk and Natalie D. Enright Jerger and Andreas Moshovos},
  journal={2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)},
  year={2013},
  pages={89-92}
}