The trend for multicore processors is towards increasing numbers of cores, with hundreds of cores, i.e., large-scale chip multiprocessors (LCMPs), possible in the future. The key to realizing the potential of LCMPs is the cache hierarchy, so studying how memory performance will scale is crucial. Reuse distance (RD) analysis can help architects do this. In(More)
This paper describes our experience with profiling and optimizing physical locality for the distributed shared cache (DSC) in Tilera's Tile multicore processor. Our approach uses the Tile Processor's hardware performance measurement counters (PMCs) to acquire page-level access pattern profiles. A key problem we address is imprecise PMC interrupts. Our(More)
Researchers have proposed numerous directory techniques to address multicore scalability whose behavior depends on the CPU's particular configuration, e.g. core count and cache size. As CPUs continue to scale, it is essential to explore the directory's architecture dependences. However, this is challenging using detailed simulation given the large number of(More)
To enable performance improvements in a power-efficient manner, computer architects have been building CPUs that exploit greater amounts of thread-level parallelism. A key consideration in such CPUs is properly designing the on-chip cache hierarchy. Unfortunately, this can be hard to do, especially for CPUs with high core counts and large amounts of cache.(More)
The trend for multicore CPUs is towards increasing core count. One of the key limiters to scaling will be the on-chip directory cache. Our work investigates moving portions of the directory away from the cores, perhaps to off-chip DRAM, where ample capacity exists. While such multi-level directory caches exhibit increased latency, several aspects of(More)
Title of dissertation: Studying the Impact of Multicore Processor Scaling on Cache Coherence Directories via Reuse Distance Analysis. Minshu Zhao, Doctor of Philosophy, 2015. Dissertation directed by: Professor Donald Yeung, Department of Electrical and Computer Engineering. Directories are one key part of a processor's cache coherence hardware, and constitute(More)
Researchers have proposed numerous techniques to improve the scalability of coherence directories. The effectiveness of these techniques depends not only on application behavior but also on the CPU's configuration, for example, its core count and cache size. As CPUs continue to scale, it is essential to explore the directory's application and(More)