Ookami: Deployment and Initial Experiences

@article{Burford2021OokamiDA,
  title={Ookami: Deployment and Initial Experiences},
  author={Andrew Burford and A. Calder and David Carlson and B. Chapman and Firat Coskun and Tony Curtis and Catherine Feldman and R. Harrison and Yan Kang and Benjamin Michalowicz and Eric Raut and E. Siegmann and Daniel G. Wood and R. L. DeLeon and Mathew Jones and N. Simakov and Joseph P. White and Dossay Oryspayev},
  journal={Practice and Experience in Advanced Research Computing},
  year={2021}
}
Ookami [3] is a computer technology testbed supported by the United States National Science Foundation. It provides researchers with access to the A64FX processor developed by Fujitsu [17] in collaboration with RIKEN [35, 37] for the Japanese path to exascale computing, as deployed in Fugaku [36], the fastest computer in the world [34]. By focusing on crucial architectural details, the ARM-based, multi-core, 512-bit SIMD-vector processor with ultrahigh-bandwidth memory promises to retain…

A64FX - Your Compiler You Must Decide!
TLDR
Tests of three state-of-the-art compiler suites against a broad set of benchmarks show that orders of magnitude in performance can be gained by deviating from the recommended usage model of the A64FX compute nodes.

References

A performance analysis of the first generation of HPC‐optimized Arm processors
TLDR
Performance results from Isambard, the first production supercomputer based on Arm CPUs optimized specifically for HPC, are presented, along with node‐level benchmark results comparing ThunderX2 with mainstream CPUs, including Intel Skylake and Broadwell, as well as Xeon Phi.
An Introduction to the MPI Standard
TLDR
The Message Passing Interface is a portable message-passing standard that facilitates the development of parallel applications and libraries and forms a possible target for compilers of languages such as High Performance Fortran.
The Software development process of FLASH, a multiphysics simulation code
  • A. Dubey, K. Antypas, +10 authors K. Weide
  • Computer Science
  • 2013 5th International Workshop on Software Engineering for Computational Science and Engineering (SE-CSE)
  • 2013
TLDR
The FLASH code has evolved into a modular and extensible scientific simulation software system over the decade of its existence, and there has been an upsurge in contributions by external users, some of which provide significant new capability.
The HPC Challenge (HPCC) benchmark suite
TLDR
This tutorial will introduce attendees to HPCC, provide tools to examine differences in HPC architectures, and give hands-on training that will hopefully lead to better understanding of parallel environments.
Porting and Evaluation of a Distributed Task-driven Stencil-based Application
Alternative programming models and runtimes are increasing in popularity and maturity. This allows porting and comparing, on competitive grounds, emerging parallel approaches against the traditional…
A Workload Analysis of NSF's Innovative HPC Resources Using XDMoD
TLDR
A detailed workload analysis of the portfolio of supercomputers comprising the NSF Innovative HPC program is reported, in order to characterize its past and current workload and to look for trends showing how the broad portfolio of computational science research is being supported and how it is changing over time.
Open XDMoD: A Tool for the Comprehensive Management of High-Performance Computing Resources
TLDR
These tools enable the comprehensive management of HPC resources, allowing HPC center personnel to ensure that the resource is operating efficiently and to determine what applications are running, how efficiently they're running, and what resources they're consuming, all of which are important to optimizing the HPC system.
Legion: Expressing locality and independence with logical regions
TLDR
A runtime system that dynamically extracts parallelism from Legion programs, using a distributed, parallel scheduling algorithm that identifies both independent tasks and nested parallelism.
Introducing OpenSHMEM: SHMEM for the PGAS community
TLDR
An OpenSHMEM specification is proposed to help tie together a number of divergent implementations of SHMEM that are currently available, and there will be a wider availability of a PGAS library model on current and future architectures.
Flash: An adaptive mesh hydrodynamics code for modeling astrophysical thermonuclear flashes
We report on the completion of the first version of a new-generation simulation code, FLASH. The FLASH code solves the fully compressible, reactive hydrodynamic equations and allows for the use of…