The future of scientific workflows

@article{Deelman2018TheFO,
  title={The future of scientific workflows},
  author={Ewa Deelman and Tom Peterka and Ilkay Altintas and Christopher D. Carothers and Kerstin Kleese van Dam and Kenneth Moreland and M. Parashar and Lavanya Ramakrishnan and Michela Taufer and Jeffrey S. Vetter},
  journal={The International Journal of High Performance Computing Applications},
  year={2018},
  volume={32},
  pages={159--175}
}
Today’s computational, experimental, and observational sciences rely on computations that involve many related tasks. The success of a scientific mission often hinges on the computer automation of these workflows. In April 2015, the US Department of Energy (DOE) invited a diverse group of domain and computer scientists from national laboratories supported by the Office of Science, the National Nuclear Security Administration, from industry, and from academia to review the workflow requirements… 

Incorporating Scientific Workflows in Computing Research Processes
TLDR
This special issue aspires to increase awareness of the benefits of workflows to enhance computational and data-enabled research and to foster the exchange of lessons learned and good practices that can benefit the community.
The Collaborative Research Center FONDA
TLDR
The design and setup of the Collaborative Research Center (CRC) 1404 “FONDA – Foundations of Workflows for Large-Scale Scientific Data Analysis” is described, in which roughly 50 researchers jointly investigate new technologies, algorithms, and models to increase the portability, adaptability, and dependability of DAWs executed over distributed infrastructures.
Toward a Methodology and Framework for Workflow-Driven Team Science
TLDR
A conceptual design toward the development of methodologies and services for effective workflow-driven collaborations is presented, namely the PPoDS methodology for collaborative workflow development and the SmartFlows Services for smart execution in a rapidly evolving cyberinfrastructure ecosystem.
Toward Understanding I/O Behavior in HPC Workflows
TLDR
This paper presents an approach to augment the I/O efficiency of the individual tasks of workflows by combining workflow description frameworks with system I/O telemetry data, and demonstrates how real-world applications and workflows could benefit from the approach.
Workflows Community Summit: Bringing the Scientific Workflows Community Together
TLDR
This report documents and organizes the wealth of information provided by the participants before, during, and after the workflow summit, and develops a view of the state of the art and identifies crucial research challenges in the workflow community.
Lifecycle Support for Scientific Investigations: Integrating Data, Computing, and Workflows
TLDR
This paper highlights use cases that contributed to DEEDS development and concludes with lessons learned from a process that joined experiences and perspectives from diverse science domains.
Toward Common Components for Open Workflow Systems
TLDR
This work investigates the design and implementation of open workflow systems for supercomputers based upon common components by examining the different types of workflows and workflow management systems, reviewing the perspective of a large supercomputing facility, examining the common features and problems of workflow management systems, and finally presenting a proposed solution based on the concept of common building blocks.
Towards Performant Workflows, Monitoring and Measuring
TLDR
The existing state of workflow monitoring is discussed, and strategies to improve on the information captured are suggested, to capture the most important aspects of modern research computing.
Exploration of Workflow Management Systems Emerging Features from Users Perspectives
TLDR
This work analyzes the applicability of the two models of workflow management by carefully describing each model and by providing an examination of the different variations of WMSs that fall under the task-driven model.

References

SHOWING 1-10 OF 186 REFERENCES
Characterizing and profiling scientific workflows
Pegasus, a workflow management system for science automation
PANORAMA: An approach to performance modeling and diagnosis of extreme-scale workflows
TLDR
The central contribution of this article is a description of the PANORAMA approach for modeling and diagnosing the run-time performance of complex scientific workflows, which integrates extreme-scale systems testbed experimentation, structured analytical modeling, and parallel systems simulation into a comprehensive workflow framework called Pegasus.
Towards Case-Based Support for e-Science Workflow Generation by Mining Provenance
TLDR
E-Science workflows as a CBR domain is introduced, key technical issues are sketched, and directions towards addressing these issues are illustrated through ongoing research on Phala, a system which supports workflow generation by aiding re-use of portions of prior workflows.
Kepler: an extensible system for design and execution of scientific workflows
TLDR
The Kepler scientific workflow system provides domain scientists with an easy-to-use yet powerful system for capturing scientific workflows (SWFs), a formalization of the ad-hoc process that a scientist may go through to get from raw data to publishable results.
Taverna: lessons in creating a workflow environment for the life sciences
Life sciences research is based on individuals, often with diverse skills, assembled into research groups. These groups use their specialist expertise to address scientific problems. The in silico
A Provenance-based Adaptive Scheduling Heuristic for Parallel Scientific Workflows in Clouds
TLDR
An adaptive scheduling heuristic for parallel execution of scientific workflows in the cloud that is based on three criteria: total execution time (makespan), reliability and financial cost is introduced.
Failure prediction and localization in large scientific workflows
TLDR
Methods for guiding remediation to stochastic errors through predictions of the impact on application performance and techniques for isolating systematic sources of failures are described.
The Trident Scientific Workflow Workbench
TLDR
The ability to utilize both local and cloud resources for storage and execution, as well as services such as provenance, monitoring, logging and scheduling workflows over clusters are illustrated.