Douglas Thain

Since 1984, the Condor project has enabled ordinary users to do extraordinary computing. Today, the project continues to explore the social and technical problems of cooperative computing on scales ranging from the desktop to the world-wide computational grid. In this chapter, we provide the history and philosophy of the Condor project and describe how it…
Large-scale hardware-supported multithreading, an attractive means of increasing computational power, benefits significantly from low per-thread costs. Hardware support for lightweight threads is a developing area of research. Each architecture with such support provides a unique interface, which hinders both development for these architectures and comparisons between them. A…
We present the design, implementation, and evaluation of the Batch-Aware Distributed File System (BAD-FS), a system designed to orchestrate large, I/O-intensive batch workloads on remote computing clusters distributed across the wide area. BAD-FS consists of two novel components: a storage layer that exposes control of traditionally fixed policies such as…
Eucalyptus, OpenNebula, and Nimbus are three major open-source cloud-computing software platforms. The overall function of these systems is to manage the provisioning of virtual machines for a cloud providing infrastructure-as-a-service. These various open-source projects provide an important alternative for those who do not wish to use a commercially…
Although modern parallel and distributed computing systems provide easy access to large amounts of computing power, it is not always easy for non-expert users to harness these large systems effectively. A large workload composed in what seems to be the obvious way by a naive user may accidentally abuse shared resources and achieve very poor performance. To…
Access to remote data is one of the principal challenges of grid computing. While performing I/O, grid applications must be prepared for server crashes, performance variations, and exhausted resources. To achieve high throughput in such a hostile environment, applications need a resilient service that moves data while hiding errors and latencies. We…
Traditional distributed filesystem technologies designed for local and campus area networks do not adapt well to wide area Grid computing environments. To address this problem, we have built the Chirp distributed filesystem, which is designed from the ground up to meet the needs of Grid computing. Chirp is easily deployed without special privileges,…
Interposition agents are a well-known device for attaching legacy applications to distributed systems. However, agents are difficult to build and are often large, monolithic pieces of software which are suited only to limited applications or systems. We solve this problem with Bypass, a language and a tool for quickly building multiple small agents that can…