An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor

Abstract

Modern OpenMP threading techniques are used to convert the MPI-only Hartree-Fock code in the GAMESS program to a hybrid MPI/OpenMP algorithm. Two separate implementations are considered; they differ in whether key data structures, namely the density and Fock matrices, are shared or replicated among threads. All implementations are benchmarked on a supercomputer of 3,000 Intel® Xeon Phi™ processors. With 64 cores per processor, scaling numbers are reported on up to 192,000 cores. The hybrid MPI/OpenMP implementation reduces the memory footprint by approximately 200 times compared to the legacy code. The MPI/OpenMP code was shown to run up to six times faster than the original for a range of molecular system sizes.
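
The distinction between replicating and sharing the Fock matrix among threads can be illustrated with a minimal sketch. This is not the GAMESS code: the function names `fock_replicated` and `fock_shared`, the flat list of `(i, j, value)` contributions, and the use of Python threads with a lock in place of OpenMP constructs are all assumptions made for illustration. It only shows that both strategies give the same matrix while the replicated variant stores one copy per thread.

```python
import threading

def _run(work, n_threads):
    # Start one worker per thread id and wait for all of them.
    threads = [threading.Thread(target=work, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def fock_replicated(contribs, n, n_threads):
    """Hypothetical: every thread owns a private n*n accumulator."""
    private = [[0.0] * (n * n) for _ in range(n_threads)]

    def work(tid):
        # Round-robin split of the contributions across threads.
        for i, j, v in contribs[tid::n_threads]:
            private[tid][i * n + j] += v

    _run(work, n_threads)
    # Reduce the private copies into one matrix.
    fock = [sum(col) for col in zip(*private)]
    return fock, n_threads * n * n  # matrix, elements stored

def fock_shared(contribs, n, n_threads):
    """Hypothetical: all threads update one n*n accumulator."""
    fock = [0.0] * (n * n)
    lock = threading.Lock()

    def work(tid):
        for i, j, v in contribs[tid::n_threads]:
            with lock:  # serialize updates to the shared matrix
                fock[i * n + j] += v

    _run(work, n_threads)
    return fock, n * n  # matrix, elements stored
```

In this toy setting the shared variant stores `n_threads` times fewer matrix elements at the cost of synchronized updates; the actual memory and speed figures reported in the abstract come from the authors' benchmarks, not from this sketch.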

DOI: 10.1145/3126908.3126956

Cite this paper

@article{Mironov2017AnEM, title={An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor}, author={Vladimir Mironov and Yuri Alexeev and Kristopher Keipert and Michael D'mello and Alexander Moskovsky and Mark S. Gordon}, journal={CoRR}, year={2017}, volume={abs/1708.00033} }