Black-Box Problem Diagnosis in Parallel File Systems

Michael P. Kasick, Jiaqi Tan, Rajeev Gandhi, Priya Narasimhan
We focus on automatically diagnosing different performance problems in parallel file systems by identifying, gathering, and analyzing OS-level, black-box performance metrics on every node in the cluster. Our peer-comparison diagnosis approach compares the statistical attributes of these metrics across I/O servers to identify the faulty node. We develop a root-cause analysis procedure that further analyzes the affected metrics to pinpoint the faulty resource (storage or network), and demonstrate…
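The peer-comparison idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a single per-server metric (a hypothetical I/O wait time in milliseconds) and swaps in a simple median and median-absolute-deviation (MAD) test as the statistical comparison. The function name `find_outlier_nodes`, the `threshold` and `min_scale` parameters, and the sample values are all made up for illustration.

```python
from statistics import median


def find_outlier_nodes(metric_by_node, threshold=3.0, min_scale=1e-9):
    """Return nodes whose metric deviates strongly from their peers.

    Assumption (not from the paper): a node is suspicious when its
    distance from the peer median exceeds `threshold` times the MAD.
    """
    values = list(metric_by_node.values())
    peer_median = median(values)
    # Median absolute deviation: a spread estimate that a single
    # faulty node cannot easily distort.
    mad = median([abs(v - peer_median) for v in values])
    # Guard against a zero spread when all peers report the same value.
    scale = max(mad, min_scale)
    return [
        node
        for node, value in metric_by_node.items()
        if abs(value - peer_median) > threshold * scale
    ]


# Hypothetical per-server I/O wait times in milliseconds.
sample = {"oss1": 4.1, "oss2": 3.9, "oss3": 4.3, "oss4": 27.5, "oss5": 4.0}
print(find_outlier_nodes(sample))
```

The median and MAD are used here, rather than the mean and standard deviation, because the faulty server's own value would otherwise inflate the baseline it is compared against.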
This paper has 60 citations.

Publications citing this paper.

An Online Performance Anomaly Detector in Cluster File Systems

2010 3rd International Symposium on Parallel Architectures, Algorithms and Programming • 2010

Latent fault detection in large scale services

IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012) • 2012

Proctor: Detecting and Investigating Interference in Shared Datacenters

2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) • 2018

TOBTD: Throughput debugging in total-order broadcast systems

2018 IEEE Middle East and North Africa Communications Conference (MENACOMM) • 2018


