OpenAlea: scientific workflows combining data analysis and simulation

@inproceedings{Pradal2015OpenAleaSW,
  title={OpenAlea: scientific workflows combining data analysis and simulation},
  author={Christophe Pradal and Christian Fournier and Patrick Valduriez and Sarah Cohen-Boulakia},
  booktitle={Proceedings of the 27th International Conference on Scientific and Statistical Database Management},
  year={2015}
}
Analyzing biological data (e.g., annotating genomes, assembling NGS data...) may involve very complex and interlinked steps in which several tools are combined. Scientific workflow systems have reached a level of maturity that enables them to support the design and execution of such in-silico experiments, making them increasingly popular in the bioinformatics community. However, in some emerging application domains such as systems biology, developmental biology or ecology, the…

Portability of Scientific Workflows in NGS Data Analysis: A Case Study
TLDR
This work describes efforts to port a state-of-the-art workflow for the detection of specific variants in whole-exome sequencing of mice to the scientific workflow system SaasFee, which can execute workflows on (multi-core) stand-alone servers or on clusters of arbitrary sizes using Hadoop.
yggdrasil: a Python package for integrating computational models across languages and scales
  • Meagan Lang
  • Computer Science, Biology
    in silico Plants
  • 2019
TLDR
Yggdrasil (previously cis_interface) is a Python package for running integration networks with connections between models across languages and scales; it can be used to connect computational models from any domain.
Dealing with multi‐source and multi‐scale information in plant phenomics: the ontology‐driven Phenotyping Hybrid Information System
TLDR
The open‐source Phenotyping Hybrid Information System (PHIS) is proposed for plant phenotyping experiments in various categories of installations and has the potential for rapid diffusion because of its ability to integrate, manage and visualize multi‐source and multi‐scale data.
Efficient Execution of Scientific Workflows in the Cloud Through Adaptive Caching
TLDR
An adaptive caching solution for data-intensive workflows in the cloud based on a new scientific workflow management architecture that automatically manages the storage and reuse of intermediate data and adapts to the variations in task execution times and output data size is proposed.
Challenges of Translating HPC Codes to Workflows for Heterogeneous and Dynamic Environments
TLDR
This paper explains, through the CFD use case, how to transform the parallel code and exhibits challenges to 'unfold' the task graph dynamically in order to improve the overall performance of the workflow engine.
Distributed Caching of Scientific Workflows in Multisite Cloud
TLDR
This paper proposes a solution for distributed caching of scientific workflows in a multisite cloud, implemented in the OpenAlea workflow system, together with cache-aware distributed scheduling algorithms.
Scientific workflows: Past, present and future

References

Scientific workflow systems - can one size fit all?
  • V. Curcin, M. Ghanem
  • Computer Science
    2008 Cairo International Biomedical Engineering Conference
  • 2008
TLDR
This paper provides a high-level framework for comparing the systems based on their control flow and data flow properties, with a view to both informing future research in the area by academic researchers and facilitating the selection of the most appropriate system for a specific application task by practitioners.
Data-centric iteration in dynamic workflows
Cuneiform: a Functional Language for Large Scale Scientific Data Analysis
TLDR
This work presents Cuneiform, a novel language for large-scale scientific data analysis. The language, including tool support for programming, workflow visualization, debugging, logging, and provenance-tracing, and the parallel execution engine Hi-WAY are fully implemented.
Enabling Scientific Workflow Reuse through Structured Composition of Dataflow and Control-Flow
TLDR
A generic framework, based on scientific workflow templates and frames, is proposed for embedding control-flow intensive subtasks within dataflow process networks. It can enable scientific workflows that are more robust and at the same time more reusable, since the embedding of frames and templates yields more structured and modular workflow designs.
Algebraic dataflows for big data analysis
TLDR
This paper illustrates how a big data processing dataflow can be modeled using the algebra and yields performance gains of up to 19.6% using algebraic optimizations in the dataflow and up to 39.1% of time saved on a user steering scenario.
Taverna, Reloaded
TLDR
This paper describes how the recently overhauled technical architecture of Taverna addresses issues of efficiency, scalability, and extensibility, and presents performance results based on a collection of synthetic workflows, as well as a concrete case study involving a production workflow in the area of cancer research.
OpenAlea: a visual programming and component-based software platform for plant modelling.
TLDR
An open-source platform, OpenAlea, is presented that provides a user-friendly environment for modellers and advanced deployment methods, along with the use of the platform to assemble several heterogeneous model components and to rapidly prototype a complex modelling scenario.
Technical Note SciDAC-SPA-TN-2003-01: On Providing Declarative Design and Programming Constructs for Scientific Workflows based on Process Networks
TLDR
This technical note describes in some detail the structure of the PromoterIdentification-Workflow (PIW) demonstrated at SSDBM [ABB+03], and proposes a simple solution to this problem, based on a functional programming approach.
Lambda Calculus as a Workflow Model
TLDR
This paper explains why lambda calculus is an appropriate model for workflow representation, and how a suitably efficient implementation can provide a wide range of capabilities to developers.