Enabling particle applications for exascale computing platforms

  title={Enabling particle applications for exascale computing platforms},
  author={Susan M. Mniszewski and J. F. Belak and Jean-Luc Fattebert and Christian Francisco Andres Negre and Stuart R. Slattery and Adetokunbo Adedoyin and Robert Francis Bird and Choong-Seock Chang and Guangye Chen and St{\'e}phane Ethier and Shane Fogerty and Salman Habib and Christoph Junghans and Damien Lebrun-Grandi{\'e} and Jamaludin Mohd-Yusof and Stan G. Moore and Daniel Osei-Kuffuor and Steven J. Plimpton and Adrian Pope and Samuel Temple Reeve and L. F. Ricketson and Aaron Scheinberg and Amil Yograj Sharma and Michael E. Wall},
  journal={The International Journal of High Performance Computing Applications},
  pages={572--597}
  • S. Mniszewski, J. Belak, +21 authors M. Wall
  • Published 1 July 2021
  • Computer Science
  • The International Journal of High Performance Computing Applications
The Exascale Computing Project (ECP) is invested in co-design to assure that key applications are ready for exascale computing. Within ECP, the Co-design Center for Particle Applications (CoPA) is addressing challenges faced by particle-based applications across four “sub-motifs”: short-range particle–particle interactions (e.g., those which often dominate molecular dynamics (MD) and smoothed particle hydrodynamics (SPH) methods), long-range particle–particle interactions (e.g., electrostatic… 
Machine learning accelerated particle-in-cell plasma simulations
Particle-In-Cell (PIC) methods are frequently used for kinetic, high-fidelity simulations of plasmas. Implicit formulations of PIC algorithms feature strong conservation properties, up to numerical…


Impacts of Multi-GPU MPI Collective Communications on Large FFT Computation
This paper analyzes the limitations of collective MPI communication for the computation of fast Fourier transforms (FFTs) and proposes a new FFT library, heFFTe (Highly Efficient FFTs for Exascale), which supports heterogeneous architectures and yields considerable speedups compared with CPU libraries while maintaining good weak and strong scalability.
Warp-X: A new exascale computing platform for beam–plasma simulations
  • J. Vay, A. Almgren, +12 authors W. Zhang
  • Physics, Computer Science
  • Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment
  • 2018
This paper presents the components of the new WarpX software, including the new Particle-In-Cell Scalable Application Resource (PICSAR) and the redesigned adaptive mesh refinement library AMReX, which are combined with redesigned elements of the Warp code.
An efficient mixed-precision, hybrid CPU-GPU implementation of a nonlinearly implicit one-dimensional particle-in-cell algorithm
This work presents a very efficient, mixed-precision, hybrid CPU-GPU implementation of the 1D implicit PIC algorithm. It exploits a fundamental feature of the method, the segregation of particle-orbit computations from the field solver, while remaining fully self-consistent.
Implementing a neural network interatomic model with performance portability for emerging exascale architectures
This work re-implements a neural network interatomic model in CabanaMD, an MD proxy application built on libraries developed for performance portability, and shows significantly improved on-node scaling in this complex kernel compared with a current LAMMPS implementation.
Modeling Dilute Solutions Using First-Principles Molecular Dynamics: Computing more than a Million Atoms with over a Million Cores
Using a robust new algorithm, this work has developed an O(N) complexity solver for electronic structure problems with fully controllable numerical error, allowing for very accurate FPMD simulations of more than a million atoms on over a million cores.
Efficient parallel linear scaling construction of the density matrix for Born-Oppenheimer molecular dynamics.
We present an algorithm for the calculation of the density matrix that for insulators scales linearly with system size and parallelizes efficiently on multicore, shared memory platforms with small…
Performance Optimizations of Recursive Electronic Structure Solvers targeting Multi-Core Architectures (LA-UR-20-26665)
The scientific application of interest here is the Basic Matrix Library (BML), which provides a single interface for linear algebra operations frequently used in the Quantum Molecular Dynamics (QMD) community. Several optimization strategies are introduced into these micro-kernels.
An energy- and charge-conserving, implicit, electrostatic particle-in-cell algorithm
A main development in this study is the nonlinear elimination of the new-time particle variables (positions and velocities). This elimination, termed particle enslavement, results in a nonlinear formulation with memory requirements comparable to those of a fluid computation and affords substantial freedom with regard to the particle orbit integrator.
The Universe at extreme scale: Multi-petaflop sky simulation on the BG/Q
  • S. Habib, V. Morozov, +9 authors Z. Lukic
  • Computer Science, Physics
  • 2012 International Conference for High Performance Computing, Networking, Storage and Analysis
  • 2012
HACC simulations at these scales will for the first time enable tracking individual galaxies over the entire volume of a cosmological survey.
Long-Time Dynamics through Parallel Trajectory Splicing.
A novel simulation technique, Parallel Trajectory Splicing (ParSplice), aims to address the atomistic evolution of materials over long time scales through the timewise parallelization of long trajectories. It is illustrated through the study of topology changes in Ag42Cu13 core-shell nanoparticles.