Rhymes: A shared virtual memory system for non-coherent tiled many-core architectures
Processor vendors are integrating more and more cores into their chips. These many-core processors usually implement hardware coherence mechanisms, but when the core count reaches hundreds or more, it becomes prohibitively difficult to design and verify efficient hardware coherence support. At the same time, many parallel applications, for example RMS applications, show little data sharing, which suggests that providing a complex hardware coherence implementation wastes hardware budget and design effort. Moreover, in some increasingly important domains, such as server and cloud computing, multiple applications may run on a single many-core chip. Each of those applications requires coherence among the cores it runs on, but not across different applications. This indicates a strong requirement for dynamically reconfigurable coherence domains, which are extremely hard to support with hardware-only mechanisms. In addition, hardware coherence is believed to be too complex to support heterogeneous platforms such as combined CPU-GPU systems, regardless of whether the GPU is integrated or discrete. In this paper, we argue that software-managed coherence is a better choice than hardware coherence for many-core processors. We believe that software-managed coherence can make better use of silicon, efficiently support emerging applications, dynamically reconfigure coherence domains, and, most importantly, still provide performance comparable to hardware coherence. We implemented a prototype system with software-managed coherence over a partially shared address space, and it shows promising results.