The Speedup-Test: a statistical methodology for programme speedup analysis and computation
Reproducibility in High Performance Computing (HPC) has been discussed for some time, but more work is needed to cover the latest accelerators that equip the fastest supercomputers, such as those on the TOP500 list. In this paper, we replicate a performance evaluation carried out using an N-Body OpenMP parallel application on a Xeon Phi accelerator. We also compare the obtained performance with that of a similar N-Body CUDA application. In addition to intriguing results concerning the number of hardware threads on the Xeon Phi, we find, comparing against Nvidia boards under the same load, that execution on the Xeon Phi is slower than on the Nvidia K20 and GTX 760 accelerators.