Torsten Mehlan

This paper introduces Netgauge, an extensible open-source framework for implementing network benchmarks. The structure of Netgauge abstracts and explicitly separates communication patterns from communication modules. As a result of this separation of concerns, new benchmark types and new network protocols can be added independently to Netgauge. We describe...
Large-scale parallel applications performing global synchronization may spend a significant amount of execution time waiting for the completion of a barrier operation. Consequently, numerous research works have focused on reducing the communication costs of synchronization primitives. However, so far there has been no exhaustive comparison of barrier...
Accurate models of parallel computation are often crucial to optimize parallel algorithms for their running time. In general, the easier the model is to use and the smaller the number of parameters and interdependencies among them, the more inaccuracies are introduced by simplification. On the other hand, an overly complex model is unusable. We show that it is...
To leverage high-speed interconnects like InfiniBand, it is important to minimize the communication overhead. The most interfering overhead is the registration of communication memory. In this paper, we present our analysis of the memory registration process inside the Mellanox InfiniBand driver and possible ways out of this bottleneck. We evaluate and...
The performance of the barrier operation can be crucial for many parallel codes. Distributed shared memory systems in particular have to synchronize frequently to ensure the proper ordering of memory accesses. The barrier operation is often performed on top of point-to-point messages and the best algorithm scales with O(log₂ P · L) in the LogP model. We...
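As a rough illustration of where the log₂ P factor comes from, the following Python sketch computes the round count and per-round partners of a dissemination barrier built from point-to-point messages. The choice of the dissemination algorithm, the function names, and the cost estimate of one latency L per round are assumptions made for illustration; the abstract does not say which algorithm it refers to.

```python
def dissemination_rounds(p: int) -> int:
    """Number of rounds for p processes: ceil(log2(p)), computed with integers."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return (p - 1).bit_length()


def partners(rank: int, p: int) -> list[tuple[int, int]]:
    """(send_to, receive_from) for each round k, using a distance of 2**k."""
    return [((rank + 2**k) % p, (rank - 2**k) % p)
            for k in range(dissemination_rounds(p))]


def barrier_latency_estimate(p: int, latency: float) -> float:
    """Simplified estimate: one network latency L per round (overheads ignored)."""
    return dissemination_rounds(p) * latency


if __name__ == "__main__":
    for p in (2, 8, 9, 64):
        print(p, dissemination_rounds(p), barrier_latency_estimate(p, 1.0))
```

In each round every process sends one message and waits for one, so the number of rounds, and with it the estimated latency, grows logarithmically with P.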
We present a microbenchmark suite to evaluate InfiniBand™ implementations with regard to single message performance and the addressing of many hosts. We use a 1:n communication pattern to assess the latency and bandwidth for all different combinations of InfiniBand™ transport services and functions. The results gathered in this study are used to...
Open MPI is a recent open source development project which combines features of different MPI implementations. These features include fault tolerance, multi-network support, grid support and a component architecture which ensures extensibility. The TUC Hardware Barrier is a special-purpose low-latency barrier network based on commodity hardware. We show...
The Virtual Interface Architecture (VIA) was introduced to define a common set of features that are suitable to build high-speed networks. Today the interface of VIA serves as an access point to a wide range of system area networks. M-VIA is software that provides the VIA interface on top of several Ethernet cards. The overhead of TCP/IP protocols is avoided...
The architecture of the IBM Cell BE processor represents a new approach to designing CPUs. The fast execution of legacy software takes a back seat to achieving very high performance for new scientific software. The Cell BE consists of 9 independent cores and represents a promising new architecture for HPC systems. The programmer has to write...