On the role of burst buffers in leadership-class storage systems

@article{Liu2012OnTR,
  title={On the role of burst buffers in leadership-class storage systems},
  author={Ning Liu and Jason Cope and Philip H. Carns and Christopher D. Carothers and Robert B. Ross and Gary Grider and Adam Crume and Carlos Maltzahn},
  journal={2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST)},
  year={2012},
  pages={1-11}
}
  • Ning Liu, Jason Cope, C. Maltzahn
  • Published 16 April 2012
  • Computer Science
  • 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST)
The largest-scale high-performance computing (HPC) systems are stretching parallel file systems to their limits in terms of aggregate bandwidth and numbers of clients. To further sustain the scalability of these file systems, researchers and HPC storage architects are exploring various storage system designs. One proposed storage system design integrates a tier of solid-state burst buffers into the storage system to absorb application I/O requests. In this paper, we simulate and explore this storage… 


An empirical study of I/O separation for burst buffers in HPC systems
Hybrid flash arrays for HPC storage systems: An alternative to burst buffers
  • T. Petersen, J. Bent
  • Computer Science
    2017 IEEE High Performance Extreme Computing Conference (HPEC)
  • 2017
TLDR
This paper proposes an alternative architecture that is hardware managed, user transparent, file system agnostic, and that only buffers small IO while allowing large sequential IO to access the disks directly and achieves comparable results to the reported burst buffer numbers.
Evaluation and Performance Modeling of a Burst Buffer Solution
TLDR
This paper evaluates an emerging technology, DataDirect Networks' (DDN) Infinite Memory Engine (IME), and investigates in which parameter range burst buffers are able to counteract the widening performance gap between compute and I/O.
Integration of Burst Buffer in High-level Parallel I/O Library for Exa-scale Computing Era
  • Kai-yuan Hou, Reda Al-Bahrani, W. Liao
  • Computer Science
    2018 IEEE/ACM 3rd International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS)
  • 2018
TLDR
An I/O driver in PnetCDF is developed that uses a log-based format to store individual I/O requests on the burst buffer and shows that I/O aggregation is a promising role for burst buffers in high-level I/O libraries.
Evaluating Burst Buffer Placement in HPC Systems
TLDR
This work contributes a provisioning system to provide accurate, multi-tenant simulations that model realistic application and storage workloads from HPC systems and analyses the impact of these designs on latency, I/O phase lengths, contention for network and storage devices, and choice of network topology.
BurstMem: A high-performance burst buffer system for scientific applications
TLDR
The design of BurstMem is introduced, a high-performance burst buffer system that provides a storage framework with efficient storage and communication management strategies and is able to speed up the I/O performance of scientific applications by up to 8.5× on leadership computer systems.
Sizing and Partitioning Strategies for Burst-Buffers to Reduce IO Contention
TLDR
It is shown that the general sharing problem of guaranteeing fair performance for all applications is NP-complete, and polynomial-time algorithms are proposed for the special case of finding the optimal buffer size such that no application is slowed down due to PFS contention.
Data Elevator: Low-Contention Data Movement in Hierarchical Storage System
  • Bin Dong, S. Byna, N. Keen
  • Computer Science
    2016 IEEE 23rd International Conference on High Performance Computing (HiPC)
  • 2016
TLDR
This paper proposes a new system, named Data Elevator, for transparently and efficiently moving data in hierarchical storage, which reduces the resource contention on BB servers via offloading the data movement from a fixed number of BB server nodes to compute nodes.
What size should your Burst-Buffers be?
Burst-Buffers are high throughput, small size intermediate storage systems typically based on SSDs or NVRAM that are designed to be used as a potential buffer between the computing nodes of a
What Size Should Your Buffers to Disks be?
Burst-Buffers are high throughput, small size intermediate storage systems typically based on SSDs or NVRAM that are designed to be used as a potential buffer between the computing nodes of a

References

Workload characterization of a leadership class storage cluster
TLDR
This paper characterizes the scientific workloads of the world's fastest HPC (High Performance Computing) storage cluster, Spider, at the Oak Ridge Leadership Computing Facility (OLCF), and shows that the read and write I/O bandwidth usage as well as the inter-arrival time of requests can be modeled as a Pareto distribution.
Modeling a Leadership-Scale Storage System
TLDR
The Co-design of Exascale Storage System (CODES) framework for evaluating exascale storage system design points is presented, and the use of CODES through simulations of an existing petascale storage system is demonstrated.
Scalable I/O forwarding framework for high-performance computing systems
TLDR
An I/O protocol and API for shipping function calls from compute nodes to I/O nodes are described, and a quantitative analysis of the overhead associated with I/O forwarding is presented.
Building a parallel file system simulator
TLDR
A parallel file system simulator that can simulate parallel file systems at very large scale and provide preliminary results that are encouraging both in terms of fidelity and simulation scalability is described.
DASH-IO: an empirical study of flash-based IO for HPC
TLDR
DASH is a new Teragrid resource aggressively leveraging flash technology (and also distributed shared memory technology) to fill the latency gap and shows that performance can be improved by 9x with appropriate existing technologies and probably further improved by future ones.
On the Role of NVRAM in Data-intensive Architectures: An Evaluation
TLDR
This work explores the potential of future NVRAM technologies to store program state at performance comparable to DRAM and shows that within a couple of technology generations, a system architecture with local high-performance NVRAM will be able to effectively augment DRAM to support highly concurrent data-intensive applications with large memory footprints.
Zest Checkpoint storage system for large supercomputers
TLDR
The PSC has developed a prototype distributed file system infrastructure that vastly accelerates aggregated write bandwidth on large compute platforms and has prototyped a scalable solution that will be directly applicable to future petascale compute platforms having on the order of 10^6 cores.
Managing Variability in the IO Performance of Petascale Storage Systems
  • J. Lofstead, F. Zheng, M. Wolf
  • Computer Science
    2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
  • 2010
TLDR
These measurements motivate developing a 'managed' IO approach using adaptive algorithms varying the IO system workload based on current levels and use areas, which achieves higher overall performance and less variability in both a typical usage environment and with artificially introduced levels of 'noise'.
File System Workload Analysis For Large Scientific Computing Applications
TLDR
This work re-examines the workload characteristics in parallel computing environments in the light of recent technology advances and new applications, and finds that current file systems are not well optimized for file sharing.
Parallel I/O and the metadata wall
TLDR
This work presents results showing the performance of metadata operations with standard disk equipment and with solid-state storage hardware, and extrapolates whether the evolution in hardware alone will be sufficient to limit the effects of this I/O Metadata Wall.