Stack Trace Analysis for Large Scale Debugging

@inproceedings{Arnold2007StackTA,
  title={Stack Trace Analysis for Large Scale Debugging},
  author={Dorian C. Arnold and Dong H. Ahn and Bronis R. de Supinski and Gregory L. Lee and Barton P. Miller and Martin Schulz},
  booktitle={2007 IEEE International Parallel and Distributed Processing Symposium},
  year={2007},
  pages={1--10}
}
We present the Stack Trace Analysis Tool (STAT) to aid in debugging extreme-scale applications. STAT can reduce problem exploration spaces from thousands of processes to a few by sampling stack traces to form process equivalence classes, groups of processes exhibiting similar behavior. We can then use full-featured debuggers on representatives from these behavior classes for root cause analysis. STAT scalably collects stack traces over a sampling period to assemble a profile of the application…
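
The equivalence-class idea in the abstract can be illustrated with a minimal Python sketch. This is not STAT's implementation: it assumes each process contributes a single sampled trace, treats two processes as equivalent only when their call paths are identical, and ignores the scalable collection step entirely. The function names (`equivalence_classes`, `representatives`) and the sample data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def equivalence_classes(traces):
    """Group process ranks whose sampled stack traces are identical.

    `traces` maps a rank (int) to its call path, outermost frame first.
    Returns a dict from call path (tuple of frame names) to a sorted rank list.
    """
    classes = defaultdict(list)
    for rank, frames in sorted(traces.items()):
        classes[tuple(frames)].append(rank)
    return dict(classes)


def representatives(classes):
    """Pick the lowest rank of each class as the process to inspect further."""
    return sorted(min(ranks) for ranks in classes.values())


# Hypothetical sample: four ranks, one of which sits in a different call path.
sample = {
    0: ["main", "solve", "MPI_Barrier"],
    1: ["main", "solve", "MPI_Barrier"],
    2: ["main", "solve", "compute", "MPI_Recv"],
    3: ["main", "solve", "MPI_Barrier"],
}

classes = equivalence_classes(sample)
reps = representatives(classes)
```

Under these assumptions, the four sampled processes collapse into two classes, so a full-featured debugger would only need to be attached to one representative of each.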

Citations

Publications citing this paper.
SHOWING 1-10 OF 101 CITATIONS

Overcoming Scalability Challenges for Tool Daemon Launching

  • 2008 37th International Conference on Parallel Processing
  • 2008
CITES METHODS & BACKGROUND

A Scalable Prescriptive Parallel Debugging Model

  • 2015 IEEE International Parallel and Distributed Processing Symposium
  • 2015
CITES METHODS

PGDB: A Debugger for MPI Applications

  • XSEDE '14
  • 2014
CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

Lessons learned at 208K: Towards debugging millions of cores

  • 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
  • 2008
CITES METHODS & BACKGROUND

Runtime MPI Correctness Checking with a Scalable Tools Infrastructure

CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

MPI Runtime Error Detection with MUST: Advanced Error Reports

  • Parallel Tools Workshop
  • 2012
CITES METHODS & BACKGROUND

CITATION STATISTICS

  • 9 Highly Influenced Citations
