Clayton Chandler

System- and application-level failures can be characterized by analyzing the relevant log files. The resulting data can then inform studies of, and future developments for, mission-critical, large-scale computational architectures, in fields such as failure prediction, reliability modeling, performance modeling, and power awareness.
A major barrier to research and development on resilient HPC applications is the substantial lack of reliability and performance data from extreme-scale computing distributions. In order to develop an understanding of how and why highly scaled HPC applications are encountering…