Overview of the IBM Blue Gene/P Project

IBM Blue Gene Team. IBM J. Res. Dev.

IBM announced the Blue Gene/P system as the leading offering in its massively parallel Blue Gene supercomputer line, succeeding the Blue Gene/L system. The Blue Gene/P system is designed to scale to at least 262,144 quad-processor nodes, with a peak performance of 3.56 petaflops. More significantly, the Blue Gene/P system enables this unprecedented scaling via architectural and design choices that maximize performance per watt, performance per square foot, and mean time between failures…
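
The quoted peak figure can be sanity-checked with simple arithmetic. The sketch below takes the node count and the four cores per node from the abstract; the 850 MHz clock and the four floating-point operations per core per cycle are assumptions that do not appear in the text above.

```python
# Back-of-the-envelope check of the quoted Blue Gene/P peak performance.
# From the abstract: 262,144 nodes, four processor cores per node.
# Assumptions (not stated in the abstract): 850 MHz core clock and
# 4 floating-point operations per core per cycle.

NODES = 262_144
CORES_PER_NODE = 4
CLOCK_HZ = 850e6          # assumed clock frequency
FLOPS_PER_CYCLE = 4       # assumed per-core issue rate


def peak_flops(nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Return the theoretical peak in floating-point operations per second."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle


total_cores = NODES * CORES_PER_NODE
peak = peak_flops(NODES, CORES_PER_NODE, CLOCK_HZ, FLOPS_PER_CYCLE)

print(f"total cores: {total_cores:,}")
print(f"peak: {peak / 1e15:.3f} petaflops")
```

Under these assumptions the result lands at roughly 3.56 petaflops across about one million cores, which is consistent with the figure in the abstract.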

Early Experience on the Blue Gene/Q Supercomputing System

The Argonne Leadership Computing Facility (ALCF) is home to Mira, a 10 PF Blue Gene/Q (BG/Q) system that offers several new opportunities for tuning and scaling scientific applications.

Measuring power consumption on IBM Blue Gene/P

The integration of Blue Gene power monitoring capabilities into system-level tools like LLview is described, and some results of analyzing the production workload at Research Center Jülich (FZJ) are highlighted.

Hybrid Parallel Programming for Blue Gene/P

This work presents optimizations of a Grid-based projector-augmented wave method software, GPAW, for the Blue Gene/P architecture, and demonstrates a hybrid programming model, which is clearly beneficial compared to the original flat programming model.

Understanding Application Performance via Micro-benchmarks on Three Large Supercomputers: Intrepid, Ranger and Jaguar

A performance comparison is presented of three of the fastest machines in the world: IBM’s Blue Gene/P installation at ANL (Intrepid), the Sun-InfiniBand cluster at TACC (Ranger), and Cray’s XT4 installation at ORNL (Jaguar).

Toward message passing for a million processes: characterizing MPI on a massive scale Blue Gene/P

The communication performance of the message passing interface (MPI) implementation on 32 racks of the largest Blue Gene/P (BG/P) system in the United States (80% of the total system size) is characterized and various interesting insights into it are revealed.

A system level view of Petascale I/O on IBM Blue Gene/P

This work describes the tuning and scaling behavior of the GPFS parallel file system on JUGENE, the largest IBM Blue Gene/P installation worldwide and the first PetaFlop/s HPC resource within the European PRACE Research Infrastructure.

Performance Evaluation of Massively Parallel Systems Using SPEC OMP Suite

An extensive evaluation study of the performance peaks and scalability of these two modern architectures using SPEC OMP benchmarks is presented.

Analyzing Checkpointing Trends for Applications on the IBM Blue Gene/P System

This paper studies the checkpointing overhead seen by applications running on leadership-class machines such as the IBM Blue Gene/P at Argonne National Laboratory, and designs a methodology to assist users in understanding and choosing checkpointing frequency and reducing the overhead incurred.
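
As a rough illustration of the trade-off behind choosing a checkpointing frequency, the sketch below uses Young's classic first-order approximation. This is a swapped-in textbook formula, not the methodology of the paper summarized above, and the checkpoint cost and mean time between failures are hypothetical values chosen only for the example.

```python
import math


def young_interval(checkpoint_cost_s, mtbf_s):
    """Young's approximation for the compute time between checkpoints."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


def expected_overhead(interval_s, checkpoint_cost_s, mtbf_s):
    """First-order fraction of time lost to checkpoint writes plus rework.

    The first term is the cost of writing one checkpoint per interval; the
    second is the expected recomputation after a failure, which on average
    strikes halfway through an interval.
    """
    return checkpoint_cost_s / interval_s + interval_s / (2.0 * mtbf_s)


# Hypothetical inputs: a 5-minute checkpoint and a 24-hour system MTBF.
COST_S = 300.0
MTBF_S = 24 * 3600.0

best = young_interval(COST_S, MTBF_S)
print(f"suggested interval: {best / 3600:.2f} h")
for interval in (best / 4, best, best * 4):
    frac = expected_overhead(interval, COST_S, MTBF_S)
    print(f"interval {interval / 3600:5.2f} h -> overhead {frac:.1%}")
```

Checkpointing much more often than the suggested interval spends time on writes, while checkpointing much less often spends it on recomputation after failures; the model ignores restart time and assumes the checkpoint cost is small relative to the MTBF.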

Parallel I/O Performance for Application-Level Checkpointing on the Blue Gene/P System

This study shows that rbIO and coIO result in a 100× improvement over previous checkpointing approaches on up to 65,536 processors of the Blue Gene/P using GPFS, and demonstrates a 25× production performance improvement for NekCEM.

Optimization of MPI_Allreduce on the Blue Gene/Q supercomputer

This paper presents techniques to optimize the MPI_Allreduce collective operation by building ten different edge-disjoint spanning trees on the ten torus links, and by using the Quad Processing SIMD unit in the BG/Q cores to accelerate the summing of network packets with local buffers.



Blue Gene/L programming and operating environment

The system software stack for BG/L creates a programming and operating environment that harnesses the raw power of this architecture with great effectiveness and specializes the services provided by each component of the system architecture to deliver high performance to applications.

Vectorization for SIMD architectures with alignment constraints

This paper presents a compilation scheme that systematically vectorizes loops in the presence of misaligned memory references, and proposes several techniques to minimize the number of data reorganization operations generated.

An integrated simdization framework using virtual vectors

This paper proposes a simdization framework that addresses several orthogonal aspects of simdization, such as alignment handling, simdization of loops with mixed data lengths, and SIMD parallelism extraction from different program scopes (from basic blocks to inner loops).

Blue Matter: Approaching the Limits of Concurrency for Classical Molecular Dynamics

A novel spatial-force decomposition for N-body simulations is described, for which the authors observe O(sqrt(p)) communication scaling and which has enabled Blue Matter to approach the effective limits of concurrency for molecular dynamics using particle-mesh methods for handling electrostatic interactions.

The BlueGene/L supercomputer and quantum chromodynamics

This work describes the methods for performing quantum chromodynamics (QCD) simulations that sustain up to 20% of the peak performance on BlueGene supercomputers and proposes that QCD should be used as a new, powerful HPC benchmark.

Extending stability beyond CPU millennium: a micron-scale atomistic simulation of Kelvin-Helmholtz instability

Improvements in three key areas for massively parallel computation such as on BlueGene/L (BG/L): fault tolerance, application kernel optimization, and highly efficient parallel I/O, enable the first micron-scale simulation of a Kelvin-Helmholtz instability using molecular dynamics.

Climate, Ocean, and Sea Ice Modeling: POP; see http://climate.lanl.gov/Models

HPC Challenge Awards Competition