In heterogeneous supercomputers such as TSUBAME2.5, GPUs on some nodes in GPU batch queues are left idle even though jobs are waiting in the queues. This is caused by the GPU resource-assignment fragmentation problem. For example, when each node has three GPUs, as in TSUBAME2.5, if a node has already been assigned to a job requesting two GPUs...
Remote GPU execution has been proven to increase GPU occupancy and reduce job waiting time in multi-GPU batch-queue systems by allowing jobs to use remote GPUs when not enough unoccupied local GPUs are available. However, for GPU-communication-intensive applications, remote GPU communication overhead may account for more than 70% of the...