Directoryless shared memory coherence using execution migration

Abstract

We introduce the concept of deadlock-free migration-based coherent shared memory to the NUCA family of architectures. Migration-based architectures move threads among cores to guarantee sequential semantics in large multicores. Using an execution migration (EM) architecture, we achieve performance comparable to directory-based architectures without using directories: avoiding automatic data replication significantly reduces cache miss rates, while a fast network-level thread migration scheme takes advantage of shared data locality to reduce the remote cache accesses that limit traditional NUCA performance. EM area and energy consumption are very competitive, and, on average, EM outperforms a directory-based MOESI baseline by 1.3× and a traditional S-NUCA design by 1.2×. We argue that with EM, scaling performance has much lower cost and design complexity than in directory-based coherence and traditional NUCA architectures: by merely scaling network bandwidth from 256-bit to 512-bit flits, the performance of our architecture improves by an additional 13%, while the baselines show negligible improvement.
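The core idea in the abstract, moving the thread to the data instead of replicating the data, can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's implementation: the names `EMMachine`, `Thread`, and `home_core`, the core count, the line size, and the striped address-to-core mapping are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

NUM_CORES = 4    # hypothetical core count, chosen for illustration
LINE_BYTES = 64  # hypothetical cache-line size


def home_core(addr: int) -> int:
    """Assumed placement policy: cache lines are striped across cores."""
    return (addr // LINE_BYTES) % NUM_CORES


@dataclass
class Thread:
    core: int            # core the thread is currently running on
    migrations: int = 0  # number of times the thread has moved


class EMMachine:
    """Toy model: each address lives in exactly one core's cache."""

    def __init__(self) -> None:
        # One private store per core; no line is ever replicated.
        self.caches = [dict() for _ in range(NUM_CORES)]

    def _migrate_if_needed(self, thread: Thread, addr: int) -> int:
        target = home_core(addr)
        if thread.core != target:
            # Instead of fetching a copy, the thread moves to the data.
            thread.core = target
            thread.migrations += 1
        return target

    def load(self, thread: Thread, addr: int) -> int:
        core = self._migrate_if_needed(thread, addr)
        return self.caches[core].get(addr, 0)

    def store(self, thread: Thread, addr: int, value: int) -> None:
        core = self._migrate_if_needed(thread, addr)
        self.caches[core][addr] = value
```

In this model a store by one thread and a later load by another both execute on the same core, so there is only ever one copy of each line and no directory is needed to track sharers. Consecutive accesses to lines with the same home core cause no further migrations, which is the locality effect the abstract relies on.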

Cite this paper

@inproceedings{Lis2011DirectorylessSM,
  title={Directoryless shared memory coherence using execution migration},
  author={Mieszko Lis and Keun Sup Shim and Myong Hyon Cho and Omer Khan and Srinivas Devadas},
  year={2011}
}