MapReduce provides a parallel and scalable programming model for data-intensive business and scientific applications. MapReduce and its de facto open source project, called Hadoop, support parallel processing on large datasets with capabilities including automatic data partitioning and distribution, load balancing, and fault tolerance management. Meanwhile, …
Large-scale production of biopharmaceuticals by current bioreactor techniques is limited by low transgenic efficiency and low expression of foreign proteins. In general, a bacterial artificial chromosome (BAC) harboring most regulatory elements is capable of overcoming the limitations, but transferring BAC into donor cells is difficult. We describe here the …
Domain scientists synthesize different data and computing resources to solve their scientific problems. Making use of distributed execution within scientific workflows is a growing and promising way to achieve better execution performance and efficiency. This paper presents a high-level distributed execution framework, which is designed based on the …
Making use of distributed execution within scientific workflows is a growing and promising methodology to achieve better execution performance. We have implemented a distributed execution framework in the Kepler scientific workflow environment, called Master-Slave Distribution, to distribute sub-workflows to a common distributed environment, namely ad-hoc …
To clarify the mechanisms underlying the pancreatic β-cell response to varying glucose concentrations ([G]), electrophysiological findings were integrated into a mathematical cell model. The Ca²⁺ dynamics of the endoplasmic reticulum (ER) were also improved. The model was validated by demonstrating quiescent potential, burst-interburst electrical events …
Service composition is becoming the dominant paradigm for developing Web service applications. It is important to ensure that a service composition complies with the requirements for the application. A rigorous compliance checking approach usually needs the requirements to be specified in property specification formalisms such as temporal logics, which are …
Next-generation DNA sequencing machines are generating a very large amount of sequence data with applications in many scientific challenges and placing unprecedented demands on traditional single-processor bioinformatics algorithms. Middleware and technologies for scientific workflows and data-intensive computing promise new capabilities to enable rapid …
End-user service composition is a promising way to ensure flexible, quick and personalized information provision and utilization, and consequently to better cope with spontaneous business requirements. For end-users to compose services directly, issues like service granularity, service organization and business-level semantics are critical. End-users will …
With the increasing popularity of Cloud computing, there are more and more requirements for scientific workflows to utilize Cloud resources. In this paper, we present our preliminary work and experiences on enabling the interaction between the Kepler scientific workflow system and the Amazon Elastic Compute Cloud (EC2). A set of EC2 actors and Kepler …
In the Big Data era, workflow systems need to embrace data parallel computing techniques for efficient data analysis and analytics. We present an easy-to-use, scalable approach to build and execute Big Data applications using actor-oriented modeling in data parallel computing. We use two bioinformatics use cases for next-generation sequencing data analysis …