We describe a highly optimized implementation of MPI domain decomposition in a GPU-enabled, general-purpose molecular dynamics code, HOOMD-blue (Anderson and Glotzer, 2013). Our approach is inspired by a traditional CPU-based code, LAMMPS (Plimpton, 1995), but is implemented within a code that was designed for execution on GPUs from the start (Anderson et ...
GPU computing has revolutionized HPC by bringing the performance of the supercomputer to the desktop. Attractive price, performance, and power characteristics allow multiple GPUs to be plugged into both desktop machines and supercomputer nodes for increased performance. Excellent performance and scalability can be achieved for some problems using ...
On modern GPU clusters, the role of the CPUs is often restricted to controlling the GPUs and handling MPI communication. The unused computing power of the CPUs, however, can be considerable for computations whose performance is bounded by memory traffic. This paper investigates the challenges of simultaneous usage of CPUs and GPUs for computation. Our ...