Transitioning from file-based HPC workflows to streaming data pipelines with openPMD and ADIOS2

@inproceedings{Poeschel2021TransitioningFF,
  title={Transitioning from file-based HPC workflows to streaming data pipelines with openPMD and ADIOS2},
  author={Franz Poeschel and Juncheng E and William F. Godoy and Norbert Podhorszki and Scott Klasky and Greg Eisenhauer and Philip E. Davis and Lipeng Wan and Ana Gainaru and Junmin Gu and Fabian Koller and Ren{\'e} Widera and Michael Bussmann and Axel Huebl},
  booktitle={SMC},
  year={2021}
}
This paper aims to create a transition path from file-based IO to streaming-based workflows for scientific applications in an HPC environment. By using the openPMD-api, traditional workflows limited by filesystem bottlenecks can be overcome and flexibly extended for in situ analysis. The openPMD-api is a library for the description of scientific data according to the Open Standard for Particle-Mesh Data (openPMD). Its approach towards recent challenges posed by hardware heterogeneity lies in…
Organizing Large Data Sets for Efficient Analyses on HPC Systems
TLDR
This work explores the performance of reading and writing operations involving one such scientific application on two different supercomputers, and demonstrates that the querying functionality in ADIOS can effectively support common selective data analysis operations, such as conditional histograms.
Modeling of advanced accelerator concepts
TLDR
The current status and future needs of AAC systems are summarized and several key aspects of high-performance computing, including performance, portability, scalability, advanced algorithms, scalable I/Os and In-Situ analysis are reported on.
Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules
TLDR
This work uses HydraGNN, an in-house library for large-scale GCNN training, and uses ADIOS, a high-performance data management framework for efficient storage and reading of large molecular graph data, to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap.
Snowmass 21 Accelerator Modeling Community White Paper by the Beam and Accelerator Modeling Interest Group (BAMIG)
(1) Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; (2) Cornell University, Ithaca, NY 14853, USA; (3) Thomas Jefferson National Accelerator Facility, Newport News, VA 23606, USA; (4) Argonne

References

Extending the Publish/Subscribe Abstraction for High-Performance I/O and Data Management at Extreme Scale
TLDR
Using the publish/subscribe model afforded by ADIOS, a set of services that connect data format, metadata, queries, data reduction, and high-performance delivery are demonstrated that enable the dynamic capabilities that will be required for exascale data management.
Improving I/O Performance for Exascale Applications Through Online Data Layout Reorganization
TLDR
It is shown that by understanding application I/O patterns and carefully designing data layouts the authors can increase read performance by more than 80 percent, and two online data layout reorganization approaches for achieving good tradeoffs between read and write performance are introduced.
On the Scalability of Data Reduction Techniques in Current and Upcoming HPC Systems from an Application Perspective
TLDR
A scaling law characterizing performance bottlenecks in state-of-the-art approaches for data reduction is presented, and multi-threaded data transformations for the I/O library ADIOS are proposed as a feasible way to trade underutilized host-side compute potential on heterogeneous systems for reduced I/O latency.
Flux: A Next-Generation Resource Management Framework for Large HPC Centers
TLDR
This paper details the design of Flux and describes and evaluates the initial prototyping effort of the key run-time components, showing that the run-time prototype provides strong and predictable scalability.
The ALPINE In Situ Infrastructure: Ascending from the Ashes of Strawman
TLDR
This paper introduces ALPINE, a flyweight in situ infrastructure designed for leading-edge supercomputers, and has support for both distributed-memory and shared-memory parallelism.
DataStager: scalable data staging services for petascale applications
TLDR
Experimental evaluations of the flexible ‘DataStager’ framework establish both the necessity of intelligent data staging and the high performance of the approach, using the GTC fusion modeling code and benchmarks running on 1000+ processors.
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems
TLDR
The design and key differences of the Summit and Sierra systems are discussed, and several CPU-, network- and memory-bound analytics codes and GPU-bound deep learning codes achieve up to an 11X and 79X speedup per node, respectively, over Titan.
Improving Performance of M-to-N Processing and Data Redistribution in In Transit Analysis and Visualization
TLDR
By leveraging design characteristics, which facilitate an “intelligent” mapping from M-to-N, significant performance gains are possible in terms of several different metrics, including time-to-solution and amount of data moved.
Hello ADIOS: the challenges and lessons of developing leadership class I/O frameworks
TLDR
The startling observations made in the last half decade of I/O research and development are described, and some of the challenges that remain as the Exascale era approaches are detailed.
Lightsource Unified Modeling Environment (LUME), a Start-to-End Simulation Ecosystem
TLDR
The platform is built with an open, well-documented architecture so that science groups around the world can contribute specific experimental designs and software modules, advancing both their scientific interests and a broader knowledge of the opportunities provided by the exceptional capabilities of X-ray FELs.