FusionFS: Toward supporting data-intensive scientific applications on extreme-scale high-performance computing systems

@article{Zhao2014FusionFSTS,
  title={FusionFS: Toward supporting data-intensive scientific applications on extreme-scale high-performance computing systems},
  author={Dongfang Zhao and Zhao Zhang and Xiaobing Zhou and Tonglin Li and Ke Wang and Dries Kimpe and Philip H. Carns and Robert B. Ross and Ioan Raicu},
  journal={2014 IEEE International Conference on Big Data (Big Data)},
  year={2014},
  pages={61-70}
}
State-of-the-art, yet decades-old, architecture of high-performance computing systems has its compute and storage resources separated. It is thus limited for modern data-intensive scientific applications, because every I/O operation must be transferred via the network between the compute and storage resources. In this paper we propose an architecture that has a distributed storage layer local to the compute nodes. This layer is responsible for most of the I/O operations and saves extreme amounts of…
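As a rough illustration of the two design points the paper leads with (distributed metadata plus node-local data placement), here is a minimal Python sketch. It is a toy under stated assumptions: FusionFS itself is implemented as a FUSE file system with metadata managed by the ZHT distributed hash table, and the names below (MetadataRing, write_local, metadata_put) are invented for illustration.

    import hashlib

    class MetadataRing:
        """Spread file metadata across nodes by hashing the path, so
        metadata operations from many clients hit many servers at once."""
        def __init__(self, nodes):
            self.nodes = nodes

        def owner(self, path):
            digest = hashlib.sha1(path.encode()).hexdigest()
            return self.nodes[int(digest, 16) % len(self.nodes)]

    def write_local(ring, local_node, path, data, local_store, metadata_put):
        # Local-write principle: the file's bytes land on the writer's own
        # node, so no data crosses the network on the write path...
        local_store[path] = data
        # ...while only a small metadata record travels to the hashed owner.
        metadata_put(ring.owner(path), path,
                     {"location": local_node, "size": len(data)})

    ring = MetadataRing(["node0", "node1", "node2", "node3"])
    store, meta = {}, []
    write_local(ring, "node2", "/sim/out.dat", b"field data", store,
                lambda owner, path, rec: meta.append((owner, path, rec)))

The point of the sketch is the separation of concerns: data placement is always local (fast), while metadata placement is hashed (scalable), which is what makes concurrent writes from thousands of nodes feasible.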
Citations

Storage Support for Data-Intensive Applications on Large Scale High-Performance Computing Systems
TLDR
This work proposes a new architecture with node-local persistent storage, called FusionFS, with two major design principles: maximal metadata concurrency and optimal file write, both of which are crucial to HPC applications.
Storage Support for Data-Intensive Applications on Extreme-Scale HPC Systems
TLDR
This work proposes a storage architecture with node-local disks for HPC systems, which is called FusionFS, with two major design principles: maximal metadata concurrency and optimal file write, both of which are crucial to HPC applications.
High-Performance Storage Support for Scientific Big Data Applications on the Cloud
TLDR
This work analyzes and evaluates four representative file systems and elaborates on the design and implementation of FusionFS, which employs a scalable approach to managing both metadata and data, in addition to its unique features of cooperative caching, dynamic compression, GPU-accelerated data redundancy, lightweight provenance, and parallel serialization.
masFS: File System Based on Memory and SSD in Compute Nodes for High Performance Computers
TLDR
This paper designs masFS, a novel file system for HPC that exploits available memory and SSD resources on compute nodes with little interference to applications running on those nodes, and implements and deploys it on TH-1A.
I/O load balancing for big data HPC applications
TLDR
A global mapper on the Lustre Metadata Server is designed, which gathers runtime statistics from key storage components on the I/O path and applies Markov-chain modeling and a minimum-cost maximum-flow algorithm to decide where data should be placed.
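As a concrete illustration of the flow-based placement step named above, the sketch below builds a small flow network with networkx and solves it with max_flow_min_cost. The job names, capacities, and congestion weights are invented stand-ins for the runtime statistics the global mapper would gather; this is not the paper's implementation.

    import networkx as nx

    G = nx.DiGraph()
    # Source -> jobs: each job must place some units of data.
    for job, demand in [("job1", 4), ("job2", 3)]:
        G.add_edge("src", job, capacity=demand, weight=0)
    # Jobs -> storage targets: edge cost models current congestion on that path.
    for job in ["job1", "job2"]:
        for ost, congestion in [("ost1", 1), ("ost2", 5)]:
            G.add_edge(job, ost, capacity=4, weight=congestion)
    # Storage targets -> sink: capacity caps what each target can absorb.
    for ost in ["ost1", "ost2"]:
        G.add_edge(ost, "sink", capacity=5, weight=0)

    flow = nx.max_flow_min_cost(G, "src", "sink")
    print(flow["job1"])  # {'ost1': 4, 'ost2': 0}: job1's data avoids the congested target

The min-cost solution naturally steers traffic away from congested targets, while the capacity constraints keep any single target from being overloaded.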
UniviStor: Integrated Hierarchical and Distributed Storage for HPC
TLDR
UniviStor is introduced, a data management service offering a unified view of storage layers; it provides performance optimizations and data structures tailored for distributed and hierarchical data placement, interference-aware data movement scheduling, adaptive data striping, and lightweight workflow management.
Experiences of Converging Big Data Analytics Frameworks with High Performance Computing Systems
TLDR
This paper addresses the critical question of how to accelerate complex applications that contain both data-intensive and compute-intensive workloads on the Tianhe-2 system by deploying an in-memory file system as data-access middleware, and proposes a shared map-output shuffle strategy and a file-metadata cache layer to alleviate the impact of the metadata bottleneck.
WatCache: a workload-aware temporary cache on the compute side of HPC systems
TLDR
This paper designs a workload-aware node allocation method that assigns fast storage devices to jobs according to their I/O requirements and merges each job's devices into a separate temporary cache space; it also implements a metadata caching strategy that reduces the metadata overhead of I/O requests to improve the performance of small I/O.
A BeeGFS-Based Caching File System for Data-Intensive Parallel Computing
TLDR
The solution unifies data access for both the internal storage and external file systems under a uniform namespace, improves storage performance by exploiting data locality across storage tiers, and increases data sharing between compute nodes and across applications.
Experimental evaluation of a flexible I/O architecture for accelerating workflow engines in cloud environments
TLDR
The results show that Hercules provides a scalable I/O solution with remarkable performance, especially for write operations, compared with classic I/O approaches for high-performance computing in cloud environments.

References

Showing 1-10 of 182 references
HyCache+: Towards Scalable High-Performance Caching Middleware for Parallel File Systems
TLDR
A distributed storage middleware deployed directly on the compute nodes, which allows I/O to effectively leverage the high bisection bandwidth of the high-speed interconnect of massively parallel high-end computing systems, together with a two-phase mechanism to cache the hot data of parallel applications, called 2-Layer Scheduling (2LS).
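A minimal sketch of the two-layer idea follows, assuming a plain LRU policy between a fast node-local layer and the remote parallel file system; HyCache+'s actual 2LS heuristics are more involved than this toy, and the class name is invented.

    from collections import OrderedDict

    class TwoLayerStore:
        def __init__(self, fast_capacity, slow_store):
            self.fast = OrderedDict()      # node-local SSD/RAM layer
            self.fast_capacity = fast_capacity
            self.slow = slow_store         # stands in for the parallel FS

        def read(self, key):
            if key in self.fast:           # hot data served locally
                self.fast.move_to_end(key)
                return self.fast[key]
            value = self.slow[key]         # miss: fetch from the parallel FS
            self._admit(key, value)
            return value

        def write(self, key, value):
            self._admit(key, value)        # writes absorbed by the fast layer

        def _admit(self, key, value):
            self.fast[key] = value
            self.fast.move_to_end(key)
            while len(self.fast) > self.fast_capacity:
                cold_key, cold_val = self.fast.popitem(last=False)
                self.slow[cold_key] = cold_val   # demote the coldest item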
Towards Data Intensive Many-Task Computing
TLDR
This chapter defines an abstract model for data diffusion, defines and implements scheduling policies with heuristics that optimize real-world performance, and develops a competitive online cache-eviction policy to enable data-intensive many-task computing.
Scalable I/O forwarding framework for high-performance computing systems
TLDR
An I/O protocol and API for shipping function calls from compute nodes to I/O nodes are described, and a quantitative analysis of the overhead associated with I/O forwarding is presented.
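The general shape of function shipping can be shown in a few lines: the compute node serializes the call instead of executing it, and a dedicated I/O node replays it against the file system. This is only a sketch of the idea; the real framework speaks a compact wire protocol over the HPC interconnect, and the names ship/serve here are invented.

    import io, json

    def ship(call):                    # compute-node side: no local FS access
        return json.dumps(call)

    def serve(wire, files):            # I/O-node side: executes on its behalf
        call = json.loads(wire)
        if call["op"] == "write":
            buf = files.setdefault(call["path"], io.BytesIO())
            buf.write(call["data"].encode())
            return len(call["data"])

    files = {}
    n = serve(ship({"op": "write", "path": "/out/x", "data": "hello"}), files)
    assert n == 5

Shipping calls to a few I/O nodes reduces the number of file-system clients at scale, at the cost of the forwarding overhead the paper quantifies.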
Scalable in situ scientific data encoding for analytical query processing
TLDR
DIRAQ is proposed, a parallel in situ, in-network data encoding and reorganization technique that enables the transformation of simulation output into a query-efficient form with negligible runtime overhead to the simulation run.
Towards high-performance and cost-effective distributed storage systems with information dispersal algorithms
TLDR
Results show that, for both the HPC and cloud computing communities, IDA-based methods on current commodity hardware could outperform data replication in some cases, and would completely surpass data replication given the growing computational capacity of multi-/many-core processors (e.g., Intel Xeon Phi, NVIDIA GPUs).
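The space argument behind IDA can be made concrete with a toy (k=2, n=3) dispersal using XOR parity; production information dispersal uses general erasure codes (e.g. Reed-Solomon), but the trade-off is the same: 1.5x storage here versus 3x for triple replication, paid for with encoding computation.

    def disperse(data: bytes):
        half = (len(data) + 1) // 2
        a, b = data[:half], data[half:].ljust(half, b"\0")
        parity = bytes(x ^ y for x, y in zip(a, b))
        return a, b, parity      # any 2 of the 3 chunks rebuild the data

    def rebuild_from_a_and_parity(a, parity, original_len):
        b = bytes(x ^ y for x, y in zip(a, parity))
        return (a + b)[:original_len]

    blob = b"checkpoint block"
    a, b, p = disperse(blob)
    assert rebuild_from_a_and_parity(a, p, len(blob)) == blob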
iTransformer: Using SSD to Improve Disk Scheduling for High-performance I/O
TLDR
iTransformer is proposed, a scheme that employs a small SSD to schedule requests for the data on disk; it can improve the I/O throughput of the cluster by 35% on average for MPI-IO benchmarks with various data access patterns.
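A sketch of the staging idea follows, assuming the SSD buffers pending requests and releases them to the disk in offset order (a classic elevator pass), turning a random stream into a mostly sequential one; the function name and numbers are invented.

    def elevator_dispatch(staged, head_position):
        """staged: list of (disk_offset, payload) buffered on the SSD."""
        ahead  = sorted(r for r in staged if r[0] >= head_position)
        behind = sorted((r for r in staged if r[0] < head_position),
                        reverse=True)
        return ahead + behind    # sweep forward, then back, minimizing seeks

    batch = [(900, "w3"), (120, "w1"), (480, "w2"), (50, "w0")]
    print([off for off, _ in elevator_dispatch(batch, head_position=300)])
    # [480, 900, 120, 50]: one forward sweep, then one reverse sweep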
Design and analysis of data management in scalable parallel scripting
TLDR
A scalable MTC data management system that uses aggregated compute-node-local storage for more efficient data movement strategies, delivering BLAST performance better than mpiBLAST at various scales up to 32,768 cores while preserving the flexibility of the original BLAST application.
Damaris: How to Efficiently Leverage Multicore Parallelism to Achieve Scalable, Jitter-free I/O
TLDR
A new approach to I/O is proposed, called Damaris, which leverages dedicated I/O cores on each multicore SMP node, along with the use of shared memory, to efficiently perform asynchronous data processing and I/O in order to hide I/O variability (jitter).
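The dedicated-core pattern reduces, in a Python analogy, to one writer draining a queue while compute threads never block on storage. Damaris itself pins a real core per multicore node and communicates through shared memory rather than Python threads; this is only the shape of the idea.

    import queue, threading

    outbox = queue.Queue()

    def io_worker(sink):
        while True:
            item = outbox.get()
            if item is None:
                break                    # shutdown sentinel
            sink.append(item)            # stands in for the real write path

    sink = []
    t = threading.Thread(target=io_worker, args=(sink,), daemon=True)
    t.start()
    for step in range(3):
        outbox.put(f"variables@step{step}")   # compute returns immediately
    outbox.put(None)
    t.join()
    assert len(sink) == 3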
On the role of burst buffers in leadership-class storage systems
TLDR
It is shown that burst buffers can accelerate the application-perceived throughput to the external storage system and can reduce the amount of external storage bandwidth required to meet a desired application-perceived throughput goal.
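The bandwidth argument is simple arithmetic, shown below with invented numbers: a burst buffer absorbs a checkpoint at full speed and drains it over the whole compute phase, so the external system only needs the average rate rather than the peak rate.

    checkpoint_bytes   = 200e12    # 200 TB written per checkpoint
    burst_duration_s   = 300       # the app would block ~5 min unbuffered
    compute_interval_s = 3600      # the next checkpoint comes an hour later

    peak_bw  = checkpoint_bytes / burst_duration_s    # ~667 GB/s external
    drain_bw = checkpoint_bytes / compute_interval_s  # ~56 GB/s external
    print(f"external bandwidth need drops {peak_bw / drain_bw:.0f}x")  # 12x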
The quest for scalable support of data-intensive workloads in distributed systems
TLDR
An abstract model for data diffusion is defined, new scheduling policies with heuristics to optimize real-world performance are introduced, and a competitive online cache-eviction policy is developed to explore the feasibility of data diffusion.