Preparing HPC Applications for Exascale: Challenges and Recommendations

  • Erika Ábrahám, Constantine Bekas, Ivona Brandić, Samir Genaim, Einar Broch Johnsen, Ivan Kondov, Sabri Pllana, Achim Streit
  • Published 24 March 2015
  • Computer Science
  • 2015 18th International Conference on Network-Based Information Systems
While the HPC community is working towards the first Exaflop computer (expected around 2020), only a few HPC applications are able to fully exploit the capabilities of Petaflop systems, even though the Petaflop milestone was reached in 2008. In this paper we argue that efforts for preparing HPC applications for Exascale should start before such systems become available. We identify challenges that need to be addressed and recommend solutions in key areas of interest, including…


Towards a Comprehensive Framework for Telemetry Data in HPC Environments

This work introduces a conceptual model and a software framework to collect, store, analyze, and exploit streams of telemetry data generated by HPC systems and their applications and shows how this framework can be integrated with HPC platform architectures and how it enables common application execution strategies.

HPC-Smart Infrastructures: A Review and Outlook on Performance Analysis Methods and Tools

Performance analysis tools and techniques for HPC applications and systems, as used by researchers and HPC benchmarking suites, are reviewed, and a qualitative comparison of various tools used for the performance analysis of HPC applications is provided.

Seastar: A Comprehensive Framework for Telemetry Data in HPC Environments

Seastar is introduced: a conceptual model and a software framework to collect, store, analyze, and exploit streams of telemetry data generated by HPC systems and their applications. The work also shows how it enables common application execution strategies.

Challenges of Process Migration to Support Distributed Exascale Computing Environment

By analyzing the reasons for dynamic and interactive requirements, a new model for process migration is presented; adding parameters to the process migration definition allows process migration to be used in distributed Exascale computing systems.

Programming languages for data-Intensive HPC applications: A systematic mapping study

Event Management and Monitoring Framework for HPC Environments using ServiceNow and Prometheus

An event management and monitoring framework is presented that addresses the operational needs of the future pre-exascale systems at the Lawrence Berkeley National Laboratory's NERSC and integrates the Operations Monitoring and Notification Infrastructure with the Prometheus, Grafana and ServiceNow platforms.

Towards a Framework for Monitoring and Analyzing High Performance Computing Environments Using Kubernetes and Prometheus

  • Nitin Sukhija, Elizabeth Bautista
  • Computer Science
    2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI)
  • 2019
A new architecture for the Operations Monitoring and Notification Infrastructure (OMNI) at NERSC is described that enables proactive monitoring and management at scale by integrating state-of-the-art technology, such as Kubernetes, Prometheus, Grafana, and other predictive platforms with data from metrics, sensors, and analytics engines.

Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: programming productivity, performance, and energy consumption

This paper empirically studies the characteristics of OpenMP, OpenACC, OpenCL, and CUDA with respect to programming productivity, performance, and energy, and uses the homegrown tool CodeStat to evaluate programming productivity.

Mimic: Fast Recovery from Data Corruption Errors in Stencil Computations

  • Anis Alazzawe, K. Kant
  • Computer Science
    2019 IEEE 38th International Performance Computing and Communications Conference (IPCCC)
  • 2019
This paper presents a computational model, referred to as mimic replication, that provides resilience against SDC errors through dynamic reexecution of processes that are vulnerable to having their data tainted due to a detected latent error. It also provides an analytical model that allows a tradeoff between resource and energy consumption and resilience.

The past, present and future of scalable computing technologies trends and paradigms: A survey

The paper highlights that exascale and quantum computing are the most recent topics in achieving high performance computing. Both technologies have advantages and disadvantages, so the paper recommends a hybrid system in which quantum computing serves as an accelerator for existing high performance computing systems.



Changing computing paradigms towards power efficiency

This work describes its efforts towards a power-efficient computing paradigm that combines low- and high-precision arithmetic, and showcases the widely used kernel of solving systems of linear equations that finds numerous applications in scientific and engineering disciplines as well as in large-scale data analytics, statistics and machine learning.

Software Challenges in Extreme Scale Systems

The implications of the concurrency and energy efficiency challenges for future software on Extreme Scale Systems are discussed, along with the importance of software-hardware co-design in addressing the fundamental challenges of application enablement on such systems.

The DEEP Project - Pursuing Cluster-Computing in the Many-Core Era

The DEEP project is presented with an emphasis on the DEEP programming environment, which integrates the offloading functionality given by the MPI standard with an abstraction layer based on the task-based OmpSs programming paradigm.

The International Exascale Software Project roadmap

The work of the community to prepare for the challenges of exascale computing is described, ultimately combining their efforts in a coordinated International Exascale Software Project.

AutoTune: A Plugin-Driven Approach to the Automatic Tuning of Parallel Applications

The AutoTune project is extending Periscope, an automatic distributed performance analysis tool developed by Technische Universität München, with plugins for performance and energy efficiency tuning, to be able to tune serial and parallel codes for multicore and manycore architectures.

Automatic performance analysis of hybrid MPI/OpenMP applications

  • F. Wolf, B. Mohr
  • Computer Science
    Eleventh Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2003. Proceedings.
  • 2003

KOJAK - A Tool Set for Automatic Performance Analysis of Parallel Programs

Today’s parallel computers with SMP nodes provide both multithreading and message passing as their modes of parallel execution, so the developer of parallel programs is still required to filter out relevant parts from a huge amount of low-level information shown in numerous displays and map that information onto program abstractions without tool support.

Tools for Power-Energy Modelling and Analysis of Parallel Scientific Applications

An integrated framework to profile, monitor, model and analyze power dissipation in parallel MPI and multi-threaded scientific applications is presented, together with a statistical software module that inspects the execution trace of the application to calculate the parameters of an accurate model for the global energy consumption.

VMeter: Power modelling for virtualized clouds

  • Ata E. H. Bohra, V. Chaudhary
  • Computer Science
    2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)
  • 2010
A novel power modelling technique, VMeter, is presented. It is based on online monitoring of system resources that correlate highly with total power consumption, and it predicts the instantaneous power consumption of an individual VM hosted on a physical node in addition to the full system power consumption.

Programmability and performance portability aspects of heterogeneous multi-/manycore systems

Three complementary approaches that can provide both portability and an increased level of abstraction for the programming of heterogeneous multicore systems are discussed, and it is shown how they could complement each other in an integrated programming framework for such systems.