Filtering Directory Lookups in CMPs

Abstract

Coherence protocols consume a significant fraction of power in determining which coherence action should take place. In this paper we focus on CMPs with a shared cache and a directory-based coherence protocol implemented as a duplicate of the local cache tags. We observe that a large fraction of directory lookups miss because the requested block is not cached in any local cache. We propose adding a filter in front of the directory to reduce the number of lookups to this structure. The filter identifies whether the current block was last accessed as data or as an instruction. With this information, looking up the whole directory can be avoided for most accesses. We evaluate the filter in a CMP with 8 in-order processors, each with 4 threads, and a memory hierarchy with a shared L2 cache. We show that a filter whose size is 3% of the tag array of the shared cache avoids more than 70% of all comparisons performed by directory lookups, with a performance loss of just 0.2% for SPLASH2 and 1.5% for Specweb2005. On average, the number of 15-bit comparisons avoided per cycle is 54 out of 77 for SPLASH2 and 29 out of 41 for Specweb2005. In both cases, the filter requires less than one 1-bit read per cycle.
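As a quick consistency check on the figures quoted above, the per-cycle comparison counts can be turned into fractions and compared against the "more than 70%" claim. This is only an arithmetic sketch in Python: the dictionary and function names are illustrative and do not come from the paper, and the only inputs are the four numbers stated in the abstract.

```python
# Per-cycle 15-bit comparison counts quoted in the abstract:
# (comparisons avoided, total comparisons without the filter).
reported = {
    "SPLASH2": (54, 77),
    "Specweb2005": (29, 41),
}


def avoided_fraction(avoided: int, total: int) -> float:
    """Fraction of directory tag comparisons that the filter avoids."""
    return avoided / total


# Hypothetical helper structure, used only for this illustration.
fractions = {
    name: avoided_fraction(avoided, total)
    for name, (avoided, total) in reported.items()
}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of comparisons avoided")
# prints "SPLASH2: 70.1% of comparisons avoided"
# prints "Specweb2005: 70.7% of comparisons avoided"
```

Both ratios land just above 70%, which matches the headline claim for each workload.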

DOI: 10.1016/j.micpro.2011.08.006


Cite this paper

@article{Bosque2010FilteringDL, title={Filtering Directory Lookups in CMPs}, author={Ana Bosque and V{\'i}ctor Vi{\~n}als and Pablo Ib{\'a}{\~n}ez and Jos{\'e} Mar{\'i}a Llaber{\'i}a}, journal={2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools}, year={2010}, pages={207-216} }