Evaluating and Improving the Performance and Scheduling of HPC Applications in Cloud

Abstract

Cloud computing is emerging as a promising alternative to supercomputers for some high-performance computing (HPC) applications. With cloud as an additional deployment option, HPC users and providers are faced with the challenges of dealing with highly heterogeneous resources, where the variability spans a wide range of processor configurations, interconnects, virtualization environments, and pricing models. In this paper, we take a holistic viewpoint to answer the question: why and who should choose cloud for HPC, for what applications, and how should cloud be used for HPC? To this end, we perform a comprehensive performance and cost evaluation and analysis of running a set of HPC applications on a range of platforms, varying from supercomputers to clouds. Further, we improve the performance of HPC applications in cloud by optimizing HPC applications' characteristics for cloud and cloud virtualization mechanisms for HPC. Finally, we present novel heuristics for online application-aware job scheduling in multi-platform environments. Experimental results and simulations using CloudSim show that current clouds cannot substitute for supercomputers but can effectively complement them. Significant improvement in average turnaround time (up to 2X) and throughput (up to 6X) can be attained using our intelligent application-aware dynamic scheduling heuristics compared to single-platform or application-agnostic scheduling.

DOI: 10.1109/TCC.2014.2339858

Cite this paper

@article{Gupta2016EvaluatingAI, title={Evaluating and Improving the Performance and Scheduling of HPC Applications in Cloud}, author={Abhishek Gupta and Paolo Faraboschi and Filippo Gioachin and Laxmikant V. Kal{\'e} and Richard Kaufmann and Bu-Sung Lee and Verdi March and Dejan S. Milojicic and Chun Hui Suen}, journal={IEEE Transactions on Cloud Computing}, year={2016}, volume={4}, pages={307-321} }