Craig B. Stunkel

The interconnect plays a key role in both the cost and performance of large-scale HPC systems. The cost of future high-bandwidth electronic interconnects is expected to increase due to expensive optical transceivers needed between switches. We describe a potentially cheaper and more power-efficient approach to building high-performance interconnects.
Trace-driven simulation is an important aid in performance analysis of computer systems. Capturing address traces for these simulations is a difficult problem for single processors and particularly for multicomputers. Even when existing trace methods can be used on multicomputers, the amount of collected data typically grows with the number of processors, …
Multidestination message passing has been proposed as an attractive mechanism for efficiently implementing multicast and other collective operations on direct networks. However, applying this mechanism to switch-based parallel systems is non-trivial. In this paper we propose alternative switch architectures with differing buffer organizations to implement …
Switch-based interconnects are used in a number of application domains including parallel system interconnects, local area networks, and wide area networks. However, very few switches have been designed that are suitable for more than one of these application domains. Such a switch must offer both extremely low latency and very high throughput for a variety …
This paper proposes a new approach for implementing fast multicast and broadcast in multistage interconnection networks (MINs) with multiport encoded multidestination worms. For a MIN with k × k switches and n stages, such worms use n header flits each. One flit is used for each stage of the network, and it indicates the output ports to which a multicast message …
We describe the adaptive source routing (ASR) method, which is a first attempt to combine adaptive routing and source routing methods. In ASR, the adaptivity of each packet is determined at the source processor. Every packet can be routed in a fully adaptive, partially adaptive, or non-adaptive manner, all within the same network at the same time. We …
This paper describes the architecture of a third-generation switching element that may appear in future IBM RS/6000 SP interconnection networks. In this paper, this ASIC is referred to as the Switch3 switch chip. Like its predecessors, Switch3 is an 8-port device implementing output queuing using the high-utilization central-buffering technique. However, …