Taverna: lessons in creating a workflow environment for the life sciences

@article{Oinn2006TavernaLI,
  title={Taverna: lessons in creating a workflow environment for the life sciences},
  author={Thomas M. Oinn and R. Mark Greenwood and Matthew Addis and M. Nedim Alpdemir and Justin Ferris and Kevin Glover and Carole A. Goble and Antoon Goderis and Duncan Hull and Darren Marvin and Peter Li and Phillip W. Lord and Matthew R. Pocock and Martin Senger and Robert Stevens and Anil Wipat and Chris Wroe},
  journal={Concurrency and Computation: Practice and Experience},
  year={2006},
  volume={18}
}
  • T. Oinn, R. Greenwood, C. Wroe
  • Published 25 August 2006
  • Environmental Science
  • Concurrency and Computation: Practice and Experience
Life sciences research is based on individuals, often with diverse skills, assembled into research groups. These groups use their specialist expertise to address scientific problems. The in silico experiments undertaken by these research groups can be represented as workflows involving the co‐ordinated use of analysis programs and information repositories that may be globally distributed. With regard to Grid computing, the requirements relate to the sharing of analysis and information… 
Open source workflow systems in life sciences informatics
TLDR
Although SWMS, including open source ones, have several open issues, their unique features and strong momentum clearly suggest that it is only a matter of time before they are adopted in even more scientific fields.
Open source workflow systems for the development of complex computational experiments
TLDR
This thesis thoroughly reviews the Scientific Workflows Management Systems field and investigates in detail popular open source workflow systems from a scientific applicability perspective, and implements a complex computational experiment from the life sciences field.
Scientific Workflow Management -- For Whom?
TLDR
This paper reflects on the usage scenarios of scientific workflow management, based on the practical experience of heavy users of workflow technology from communities in three scientific domains: Astrophysics, Heliophysics and Biomedicine.
Knowledge Discovery for Biology with Taverna
TLDR
The myGrid project has the potential to integrate and aggregate workflow outcomes, and reason over provenance logs to identify new experimental insights, and to build and export a Semantic Web of experiments that contributes to Knowledge Discovery for Taverna users and for the scientific community as a whole.
Scientific Workflows
TLDR
A taxonomy of workflow management system (WMS) characteristics is proposed, including aspects previously overlooked, that frames a review of prevalent WMSs used by the scientific community, elucidates their evolution to handle the challenges arising with the emergence of the “fourth paradigm,” and identifies research needed to maintain progress.
eScience Workflows 9 years Out: Converging on a Vision
TLDR
It is pointed out that workflows have high adoption costs; it requires a team of vested interests to bring about a success and the overhead of workflow creation, execution, and management is simply too high.
Scientific workflows with the jABC framework
TLDR
It is described how the use of the PROPHETS synthesis plugin can enable a semantics-based simplification of the workflow design process, and how the Cinco SCCE Meta-Tooling Suite can be used to generate tailored workflow management tools.
Trends in Use of Scientific Workflows: Insights from a Public Repository and Recommendations for Best Practice
TLDR
A wide variety of workflow systems and publicly available workflows on the public repository myExperiment are analyzed in order to promote open discourse and access to scientific methods as well as data and it is hoped that understanding the usage of workflows and developing a set of recommended best practices will lead to increased contribution to the public domain.
Discovering Scientific Workflows: The myExperiment Benchmarks
TLDR
This study investigates current practices in workflow sharing, re-use and discovery amongst life scientists chiefly using the Taverna workflow management system and develops a benchmark specifically for the evaluation of workflow discovery.

References

Contextualised Workflow Execution in MyGrid
TLDR
The benefits that derive from the provision of integrated access to contextual information that links the phases of a problem-solving activity are illustrated, so that the steps of a solution do not happen in isolation, but rather as the components of a coherent whole.
Experiences with e-Science workflow specification and enactment in bioinformatics
TLDR
The EPSRC-funded myGrid project has developed a graphical toolset and workflow enactor which uses its own high-level representation of a process flow, including specification of processing units, data transfers and execution constraints.
An Ontology-Driven Framework for Data Transformation in Scientific Workflows
TLDR
This paper defines a generic framework for transforming heterogeneous data within scientific workflows, which relies on a formalized ontology, which serves as a simple, unstructured global schema.
Using Semantic Web Technologies for Representing E-science Provenance
TLDR
This work explores the use of Semantic Web technologies such as RDF and ontologies to support the representation of provenance, and uses existing initiatives such as Jena and LSID to generate and store such material.
Performing In Silico Experiments on the Grid: A Users' Perspective
TLDR
The myGrid project is introduced, the nature of an in silico experiment for the bioinformatics domain is explored, and the general user requirements for an empirical e-Scientist are reviewed.
A Life Scientist's Gateway to Distributed Data Management and Computing: The PathPort/ToolBus Framework
TLDR
PathPort, short for Pathogen Portal, employs a generic, web-services based framework to deal with some of the problems identified by the bioinformatics community, using two major components.
Knowledge Integration
A Knowledge-Based Approach to Interactive Workflow Composition
TLDR
This work has developed a system called CAT (Composition Analysis Tool) that analyzes workflows and generates error messages and suggestions in order to help users compose complete and consistent workflows.
Workflow Patterns
TLDR
A number of workflow patterns are described that address what the authors believe to be comprehensive workflow functionality, providing the basis for an in-depth comparison of a number of commercially available workflow management systems.