Investigating reproducibility and tracking provenance – A genomic workflow case study
An in-silico experiment can be naturally specified as a workflow of activities implementing, in a standardized environment, the process of data and control analysis. A workflow has the advantage to be reproducible, traceable and compositional by reusing other workflows. In order to support the daily work of a bioscientist, several Workflow Management Systems (WMSs) have been proposed in bioinformatics. Generally, these systems centralize the workflow enactment and do not exploit standard process definition languages to describe, in order to be reusable, workflows. While almost all WMSs require heavy stand-alone applications to specify new workflows, only few of them provide a web-based process definition tool. We have developed BioWMS, a Workflow Management System that supports, through a web-based interface, the definition, the execution and the results management of an in-silico experiment. BioWMS has been implemented over an agent-based middleware. It dynamically generates, from a user workflow specification, a domain-specific, agent-based workflow engine. Our approach exploits the proactiveness and mobility of the agent-based technology to embed, inside agents behaviour, the application domain features. Agents are workflow executors and the resulting workflow engine is a multiagent system – a distributed, concurrent system – typically open, flexible, and adaptative. A demo is available at http://litbio.unicam.it:8080/biowms . BioWMS, supported by Hermes mobile computing middleware, guarantees the flexibility, scalability and fault tolerance required to a workflow enactment over distributed and heterogeneous environment. BioWMS is funded by the FIRB project LITBIO (Laboratory for Interdisciplinary Technologies in Bioinformatics).