Tim Rühl

Orca is a portable, object-based distributed shared memory (DSM) system. This article studies and evaluates the design choices made in the Orca system and compares Orca with other DSMs. The article gives a quantitative analysis of Orca's coherence protocol (based on write-updates with function shipping), the totally ordered group communication protocol, the …
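A minimal sketch of the write-update-with-function-shipping idea referred to above, assuming all replicas live in one simulated address space: a write operation is broadcast (here, simply applied in the same order to every replica) and re-executed locally, instead of shipping the modified data. All identifiers are invented for illustration and are not Orca's actual code.

/*
 * Write-update coherence with function shipping (illustrative sketch):
 * the operation itself is "broadcast" and re-executed on every replica
 * of the shared object, rather than shipping updated data.
 */
#include <stdio.h>

#define NUM_REPLICAS 4          /* one replica of the shared object per process */

typedef struct { int value; } shared_object_t;

/* an "operation" is a function applied to a local replica */
typedef void (*operation_fn)(shared_object_t *obj, int arg);

static void op_add(shared_object_t *obj, int arg) { obj->value += arg; }

/* totally ordered broadcast, simulated: apply the same operation,
 * in the same order, to every replica (function shipping) */
static void broadcast_write(shared_object_t replicas[], operation_fn op, int arg)
{
    for (int r = 0; r < NUM_REPLICAS; r++)
        op(&replicas[r], arg);
}

int main(void)
{
    shared_object_t replicas[NUM_REPLICAS] = {{0}};

    broadcast_write(replicas, op_add, 5);   /* ship the operation, not the data */
    broadcast_write(replicas, op_add, 3);

    /* reads are purely local: every replica holds the same value */
    printf("replica 0 sees %d, replica 3 sees %d\n",
           replicas[0].value, replicas[3].value);
    return 0;
}

In the real system the broadcast would travel over the totally ordered group communication protocol mentioned in the abstract, which is what guarantees that all replicas apply operations in the same order.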
LFC is a new, low-level communication substrate for Myrinet, designed to support the development of high-performance communication software for parallel systems. LFC is novel in two ways. First, it exploits Myrinet's programmable network interface (NI) to implement flow control, forward multicast traffic, reduce the overhead of network interrupts, and to …
The Distributed ASCI Supercomputer (DAS) is a homogeneous wide-area distributed system consisting of four cluster computers at different locations. DAS has been used for research on communication software, parallel languages and programming systems, schedulers, parallel applications, and distributed applications. The paper gives a preview of the most …
This paper studies the implementation of efficient multicast protocols for Myrinet, a switched, wormhole-routed, gigabit-per-second network technology. Since Myrinet does not support multicasting in hardware, multicast services must be implemented in software. We present a new, efficient, and reliable software multicast protocol that uses the network …
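The general technique behind NI-level software multicast can be sketched as spanning-tree forwarding: a network interface that receives a multicast packet delivers it to its host and forwards it to its children in a precomputed tree. The tree shape, node count, and function names below are invented for illustration; this is not the paper's actual protocol, which must also handle reliability and buffer management.

/*
 * Spanning-tree multicast forwarding (illustrative sketch): each node
 * delivers an incoming multicast packet locally and forwards it to its
 * children in a fixed tree, as a software substitute for hardware multicast.
 */
#include <stdio.h>

#define NUM_NODES    8
#define MAX_CHILDREN 2          /* binary multicast tree for this sketch */

/* children[n] holds the tree children of node n (-1 = unused slot) */
static const int children[NUM_NODES][MAX_CHILDREN] = {
    {1, 2}, {3, 4}, {5, 6}, {7, -1}, {-1, -1}, {-1, -1}, {-1, -1}, {-1, -1}
};

static void deliver_to_host(int node, const char *msg)
{
    printf("node %d: delivered \"%s\"\n", node, msg);
}

/* what each network interface would do on packet arrival:
 * hand the packet to the host and forward it down the tree */
static void ni_receive(int node, const char *msg)
{
    deliver_to_host(node, msg);
    for (int c = 0; c < MAX_CHILDREN; c++)
        if (children[node][c] >= 0)
            ni_receive(children[node][c], msg);   /* "send" to child NI */
}

int main(void)
{
    ni_receive(0, "multicast payload");           /* root injects the packet */
    return 0;
}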
This paper surveys the design issues for user-level network interface protocols for modern high-speed networks such as Myrinet. It first explains the principles of such protocols through a simple, unreliable protocol. Next, six important design issues are discussed in more detail: data transfers, address translation, protection, control transfers, …
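To make the "simple, unreliable protocol" framing concrete, here is an illustrative host-side sketch in which the application posts send descriptors into a ring shared with the NI and polls a receive ring instead of taking interrupts. The structures, sizes, and names are assumptions for the sketch, not any particular NI's interface.

/*
 * Host side of a toy, unreliable user-level NI protocol: descriptors are
 * exchanged through rings shared with the NI; sends may be dropped when
 * the ring is full, and receives are discovered by polling.
 */
#include <stdio.h>
#include <string.h>

#define RING_SIZE   8
#define PAYLOAD_MAX 64

typedef struct {
    int  valid;                     /* ownership flag: 1 = NI may consume  */
    int  dest;
    char payload[PAYLOAD_MAX];
} descriptor_t;

static descriptor_t send_ring[RING_SIZE];
static descriptor_t recv_ring[RING_SIZE];
static int send_head, recv_tail;

/* host side: post a packet; no acknowledgement, hence "unreliable" */
static int ul_send(int dest, const char *data)
{
    descriptor_t *d = &send_ring[send_head % RING_SIZE];
    if (d->valid) return -1;        /* ring full: drop (unreliable)        */
    d->dest = dest;
    strncpy(d->payload, data, PAYLOAD_MAX - 1);
    d->valid = 1;                   /* hand ownership to the NI            */
    send_head++;
    return 0;
}

/* host side: poll the receive ring (avoids interrupts) */
static int ul_poll(char *out)
{
    descriptor_t *d = &recv_ring[recv_tail % RING_SIZE];
    if (!d->valid) return 0;
    strcpy(out, d->payload);
    d->valid = 0;                   /* return the slot to the NI           */
    recv_tail++;
    return 1;
}

int main(void)
{
    /* in a real system the NI moves descriptors between rings via DMA;
     * here we just copy one over to exercise the host-side interface */
    ul_send(1, "hello");
    recv_ring[0] = send_ring[0];

    char buf[PAYLOAD_MAX];
    if (ul_poll(buf)) printf("received: %s\n", buf);
    return 0;
}

Even this toy touches the design issues the abstract lists: where the payload lives and how it reaches the NI (data transfers, address translation), who may touch the rings (protection), and whether the host polls or is interrupted (control transfers).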
Panda is a virtual machine designed to support portable implementations of parallel programming systems. It provides communication primitives and thread support to higher-level layers (such as a runtime system). We have used Panda to implement four parallel programming systems: Orca, data parallel Orca, PVM, and SR. The paper describes our experiences in …
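As an illustration of the layering described above, the sketch below exposes only threads and message passing as a small "virtual machine" interface and builds a trivial runtime on top of it. The API names are made up for the sketch and are not Panda's actual primitives.

/*
 * Layering sketch: a tiny "virtual machine" interface offering threads and
 * message passing; the runtime layer below it uses only that interface.
 */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* ---- virtual-machine layer: threads + messages ------------------------ */
typedef struct {
    char            buf[64];
    int             full;
    pthread_mutex_t lock;
    pthread_cond_t  cond;
} vm_channel_t;

static void vm_channel_init(vm_channel_t *c)
{
    memset(c, 0, sizeof *c);
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
}

static void vm_send(vm_channel_t *c, const char *msg)
{
    pthread_mutex_lock(&c->lock);
    strncpy(c->buf, msg, sizeof c->buf - 1);
    c->full = 1;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

static void vm_receive(vm_channel_t *c, char *out)
{
    pthread_mutex_lock(&c->lock);
    while (!c->full)
        pthread_cond_wait(&c->cond, &c->lock);
    strcpy(out, c->buf);
    c->full = 0;
    pthread_mutex_unlock(&c->lock);
}

static void vm_thread_create(pthread_t *t, void *(*fn)(void *), void *arg)
{
    pthread_create(t, NULL, fn, arg);
}

/* ---- runtime-system layer: uses only the VM interface ----------------- */
static vm_channel_t chan;

static void *worker(void *arg)
{
    (void)arg;
    vm_send(&chan, "result from worker");
    return NULL;
}

int main(void)
{
    char reply[64];
    pthread_t t;

    vm_channel_init(&chan);
    vm_thread_create(&t, worker, NULL);
    vm_receive(&chan, reply);
    printf("runtime got: %s\n", reply);
    pthread_join(t, NULL);
    return 0;
}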
Clusters of workstations are often claimed to be a good platform for parallel processing, especially if a fast network is used to interconnect the workstations. Indeed, high performance can be obtained for low-level message passing primitives on modern networks like ATM and Myrinet. Most application programmers, however, want to use higher-level …
We systematically evaluate the performance of five implementations of a single, user-level communication interface. Each implementation makes different architectural assumptions about the reliability of the network hardware and the capabilities of the network interface. The implementations differ accordingly in their division of protocol tasks between host …
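A hedged sketch of how one send interface can sit on top of different reliability assumptions: when the network hardware is assumed reliable, send() just injects the packet; when it is not, the sender keeps a copy for retransmission until it is acknowledged. The structure and names are invented; the paper's five implementations also shift such tasks between host and network interface, which this single-address-space sketch does not show.

/*
 * One send interface, two architectural assumptions: a reliable-network
 * variant keeps no retransmission state, an unreliable-network variant
 * buffers each packet until it is acknowledged.
 */
#include <stdio.h>
#include <stdbool.h>
#include <string.h>

#define WINDOW 4

typedef struct {
    bool network_is_reliable;   /* architectural assumption of this variant */
    char unacked[WINDOW][64];   /* retransmission buffer (unreliable case)  */
    int  next_seq;
} channel_t;

static void inject_packet(int seq, const char *data)
{
    printf("wire: seq=%d data=%s\n", seq, data);   /* stand-in for DMA to NI */
}

static void send_msg(channel_t *ch, const char *data)
{
    int seq = ch->next_seq++;
    if (!ch->network_is_reliable)
        strncpy(ch->unacked[seq % WINDOW], data, 63);  /* keep for retransmit */
    inject_packet(seq, data);
}

static void handle_ack(channel_t *ch, int seq)
{
    if (!ch->network_is_reliable)
        ch->unacked[seq % WINDOW][0] = '\0';           /* release the slot    */
}

int main(void)
{
    channel_t reliable   = { .network_is_reliable = true  };
    channel_t unreliable = { .network_is_reliable = false };

    send_msg(&reliable, "no retransmit state kept");
    send_msg(&unreliable, "buffered until acked");
    handle_ack(&unreliable, 0);
    return 0;
}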
Clusters of workstations are a popular platform for high-performance computing. For many parallel applications, efficient use of a fast interconnection network is essential for good performance. Several modern System Area Networks include programmable network interfaces that can be tailored to perform protocol tasks that otherwise would need to be done by …