The design of a new adaptive virtual cut-through router for torus networks is presented in this paper. With much lower VLSI costs than adaptive wormhole routers, the adaptive Bubble router is even faster than deterministic wormhole routers based on virtual channels. This has been achieved by combining a low-cost deadlock avoidance mechanism for virtual(More)
A router design for torus networks that significantly reduces message latency over traditional wormhole routers is presented in this paper. This new router implements virtual cut-through switching and fully-adaptive minimal routing. Packet deadlock is avoided by providing escape ways governed by Bubble flow control, a mechanism that guarantees enough free(More)
As the size of parallel computers and the number of sources per router node increase, congestion inside the interconnection network rises significantly. In such systems, packet injection must be restricted in order to prevent throughput degradation at high loads. This work evaluates three congestion control mechanisms on adaptive cut-through torus(More)
Supercomputer performance is highly dependent on its interconnection subsystem design. In this paper we study how different architectural approaches to router design affect system performance when running real parallel applications. A thorough methodology has been employed to quantify this impact. Architectural router decisions have been chosen taking(More)
The performance of an interconnection network is measured by two metrics: average latency and peak network throughput. Network throughput is the total number of packets delivered per unit of time. Most synthetic network loads consist of sources injecting at the same given rate, using traffic patterns such as random, permutations or hot spot, which reflect(More)
This paper explores the suitability of dense circulant graphs of degree four for the design of on-chip interconnection networks. Networks based on these graphs reduce the torus diameter by a factor of 1/√2, which translates into significant performance gains for unicast traffic. In addition, they are clearly superior to tori when managing collective(More)
A strategy to implement adaptive routing in irregular networks is presented and analyzed in this work. A simple and widely applicable deadlock avoidance method, applied to a ring embedded in the network topology, constitutes the basis of this high-performance packet switching. This adaptive router improves the network capabilities by allocating more(More)
Any simulation-based evaluation of an interconnection network proposal requires a good characterization of the workload. Synthetic traffic patterns based on independent traffic sources are commonly used to measure performance in terms of average latency and peak throughput. As they do not capture the level of self-throttling that occurs in most parallel(More)
Many simulation-based performance studies of interconnection networks are carried out using synthetic workloads under the assumption of independent traffic sources. We show that this assumption, although it may be useful for some traffic patterns, can lead to deceptive performance results for loads beyond saturation. Network throughput varies so much amongst(More)
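
The diameter comparison in the circulant-graph abstract above can be checked on a small instance. The sketch below is illustrative only and is not taken from the paper. It assumes that a "dense circulant graph of degree four" is the circulant C_N(t, t+1) with N = 2t² + 2t + 1 nodes, and it compares that graph with a square 2D torus of the same size. The function names (`diameter`, `circulant_neighbors`, `torus_neighbors`) are hypothetical helpers introduced here.

```python
from collections import deque


def diameter(n, neighbors):
    """Return the eccentricity of node 0, found by breadth-first search.

    Both graphs used below are vertex-transitive, so the eccentricity
    of any single node equals the graph diameter.
    """
    dist = {0: 0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    assert len(dist) == n, "graph is not connected"
    return max(dist.values())


def circulant_neighbors(n, jumps):
    """Neighbor function for the circulant graph C_n(jumps): u is linked to u +/- j."""
    return lambda u: [(u + s * j) % n for j in jumps for s in (1, -1)]


def torus_neighbors(k):
    """Neighbor function for a k x k 2D torus with nodes numbered row-major."""
    def nb(u):
        x, y = divmod(u, k)
        return [
            ((x + 1) % k) * k + y,
            ((x - 1) % k) * k + y,
            x * k + (y + 1) % k,
            x * k + (y - 1) % k,
        ]
    return nb


# Assumed dense-circulant parameters: t = 3 gives N = 25, which is also a 5 x 5 torus.
t = 3
n = 2 * t * t + 2 * t + 1
d_circ = diameter(n, circulant_neighbors(n, (t, t + 1)))
d_torus = diameter(n, torus_neighbors(5))
print(d_circ, d_torus, d_circ / d_torus)
```

For this 25-node case the circulant graph has diameter 3 and the torus has diameter 4, a ratio of 0.75. The 1/√2 figure quoted in the abstract would then be the limit this ratio approaches as the network grows, under the assumption stated above.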