Many scientific disciplines are now data and information driven, and new scientific knowledge is often gained by scientists putting together data analysis and knowledge discovery "pipelines". A related trend is that more and more scientific communities realize the benefits of sharing their data and computational services, and are thus contributing to a ...
Most scientists conduct analyses and run models in several different software and hardware environments, mentally coordinating the export and import of data from one environment to another. The Kepler scientific workflow system provides domain scientists with an easy-to-use yet powerful system for capturing scientific workflows (SWFs). SWFs are a ...
The tools used to analyze scientific data are often distinct from those used to archive, retrieve, and query data. A scientific workflow environment, however, allows one to seamlessly combine these functions within the same application. This increase in capability is accompanied by an increase in complexity, especially in workflow tools like Kepler, which ...
Ecology is inherently cross-disciplinary, drawing together many types of information to address questions about the natural world. Finding and integrating relevant data to assist in these analyses is crucial, but is difficult owing to ambiguous terminology and the lack of sufficient information about datasets. Ontologies provide a formal mechanism for ...
We introduce and describe scientific workflows, i.e., executable descriptions of automatable scientific processes such as computational science simulations and data analyses. Scientific workflows are often expressed in terms of tasks and their (dataflow) dependencies. This chapter first provides an overview of the characteristic features of scientific ...
Keywords: Scientific workflows; Sensors; Near real-time data access; Data analysis; Terrestrial ecology; Oceanography
Environmental sensor networks are now commonly being deployed within environmental observatories and as components of smaller-scale ecological and environmental experiments. Effectively using data from these sensor networks ...
The ecological sciences represent a challenging community from the perspective of scientific data management. Ecological data are collected by investigators who are spread out over a large geographic area and who are using a wide variety of research protocols and data handling techniques. The resulting heterogeneous data are stored in autonomous database ...
Domain scientists synthesize different data and computing resources to solve their scientific problems. Making use of distributed execution within scientific workflows is a growing and promising way to achieve better execution performance and efficiency. This paper presents a high-level distributed execution framework, which is designed based on the ...