Scientists, engineers, and statisticians must execute domain-specific application programs many times on large collections of file-based data. This activity requires complex orchestration and data management as data is passed to, from, and among application invocations. Distributed and parallel computing resources can accelerate such processing, but their…
In this paper we show the possibility of using very mild stochastic damping to stabilize long time step integrators for Newtonian molecular dynamics. More specifically, stable and accurate integrations are obtained for damping coefficients that are only a few percent of the natural decay rate of processes of interest, such as the velocity autocorrelation…
In this paper we present the Coaster System, an automatically deployed node provisioning (Pilot Job) system for grids, clouds, and ad hoc desktop-computer networks that supports file staging, on-demand opportunistic multi-node allocation, remote logging, and remote monitoring. The Coaster System has previously been shown [32] to work at scales of…
Scripting is often used in science to create applications via the composition of existing programs. Parallel scripting systems allow the creation of such applications, but each system introduces the need to adopt a somewhat specialized programming model. We present an alternative scripting approach, AMFS Shell, that lets programmers express parallel…
High-performance computing (HPC) and distributed systems rely on a diverse collection of system software to provide application services, including file systems, schedulers, and web services. Such system software services must manage highly concurrent requests, interact with a wide range of resources, and scale well in order to be successful.…
Over the past few years, the increasing amounts of data produced by large-scale simulations have motivated a shift from traditional offline data analysis to in situ analysis and visualization. In situ processing began as the coupling of a parallel simulation with an analysis or visualization library, motivated primarily by avoiding the high cost of…
Sharing data and storage space in a distributed system remains a difficult task for ordinary users, who are constrained to the fixed abstractions and resources provided by administrators. To remedy this situation, we introduce the concept of a tactical storage system (TSS) that separates storage abstractions from storage resources, leaving users free to…
Although the grid allows the researcher to tap a vast amount of resources, the complexity involved in utilizing this power can make it unwieldy and time-consuming. The Grid Interface for Parameter Sweeps and Exploration (GIPSE) toolset aims to solve this issue by freeing users from script debugging, storage issues, and other minutiae involved in managing…
We seek to enable efficient large-scale parallel execution of applications in which a shared filesystem abstraction is used to couple many tasks. Such parallel scripting (many-task computing, MTC) applications suffer poor performance and utilization on large parallel computers because of the volume of filesystem I/O and a lack of appropriate…
Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant programming challenge. One approach is to structure applications with an upper layer of many loosely-coupled coarse-grained tasks, each comprising a tightly-coupled parallel function or program. "Many-task" programming models such as functional…