Automatic Resource Scheduling with Latency Hiding for Parallel Stencil Applications on GPGPU Clusters


Overlapping computation with communication is key to accelerating stencil applications on parallel computers, especially GPU clusters. However, writing code that achieves this overlap is a time-consuming part of stencil application development. To address this problem, we developed an automatic code generation tool that produces a parallel stencil application with latency…
DOI: 10.1109/IPDPS.2012.57

