Online Fault and Anomaly Detection for Large-Scale Scientific Workflows

Abstract

Scientific workflows are an enabler of complex scientific analyses. Large-scale scientific workflows are executed on complex parallel and distributed resources, where many things can fail. Application scientists need to track the status of their workflows in real time, detect execution anomalies automatically, and perform troubleshooting -- without logging…
DOI: 10.1109/HPCC.2011.55

15 Figures and Tables