WfChef: Automated Generation of Accurate Scientific Workflow Generators

Tainã Coleman, Henri Casanova, Rafael Ferreira da Silva
2021 IEEE 17th International Conference on eScience (eScience)
Scientific workflow applications have become mainstream and their automated and efficient execution on large-scale compute platforms is the object of extensive research and development. For these efforts to be successful, a solid experimental methodology is needed to evaluate workflow algorithms and systems. A foundation for this methodology is the availability of realistic workflow instances. Dozens of workflow instances for a few scientific applications are available in public repositories… 

WfCommons: A Framework for Enabling Scientific Workflow Research and Development
WfCommons is a framework that provides a collection of tools for analyzing workflow executions, producing generators of synthetic workflows, and simulating workflow executions. The workflow generators automatically constructed by the framework are found to generate representative workflows both at the same scale as available real-world workflows and at larger scales.


Pegasus, a workflow management system for science automation
An integrated view of the Pegasus system is provided, showing the capabilities that have been developed over time in response to application needs and to the evolution of scientific computing platforms.
Community resources for enabling and evaluating research in distributed scientific workflows
  • 10th IEEE International Conference on e-Science, ser. eScience’14, 2014, pp. 177–184.
WorkflowHub: Community Framework for Enabling Scientific Workflow Research and Development
WorkflowHub is a community framework that provides a collection of tools for analyzing workflow execution traces, producing realistic synthetic workflow traces, and simulating workflow executions. The framework is found to generate representative workflow traces at scales larger than those of available workflow traces.
Workflows Community Summit: Bringing the Scientific Workflows Community Together
This report documents and organizes the wealth of information provided by the participants before, during, and after the workflow summit, develops a view of the state of the art, and identifies crucial research challenges in the workflow community.
Developing accurate and scalable simulators of production workflow management systems with WRENCH
WRENCH is presented, a WMS simulation framework whose objectives are accurate and scalable simulations and easy simulation software development, and the extent to which WRENCH achieves these objectives is determined.
Lessons Learned from the Chameleon Testbed
The Chameleon testbed is a case study in adapting the cloud paradigm for computer science research, and a case is made that utilizing mainstream technology in research testbeds can increase efficiency without compromising on functionality.
Serverless execution of scientific workflows: Experiments with HyperFlow, AWS Lambda and Google Cloud Functions
Prototype workflow executor functions using AWS Lambda and Google Cloud Functions, coupled with the HyperFlow workflow engine, are developed. They can run workflow tasks on AWS and Google infrastructures and feature capabilities such as data staging to/from S3 or Google Cloud Storage and execution of custom application binaries.
A characterization of workflow management systems for extreme-scale applications
A novel characterization of workflow management systems using features commonly associated with extreme-scale computing applications is presented, and 15 popular workflow management systems are classified in terms of workflow execution models, heterogeneous computing environments, and data access methods.
Application skeletons: Construction and use in eScience
The Application Skeleton is presented, a simple yet powerful tool to build synthetic applications that represent real applications, with runtime and I/O close to those of the real applications.
Scientific workflows: moving across paradigms
  • ACM Computing Surveys (CSUR), vol. 49, no. 4, pp. 1–39, 2016.