Bernard Tourancheau

High-speed networks now provide very high performance. Software evolves slowly, and the old protocol stacks are no longer adequate for these communication speeds. When bandwidth increases, latency should decrease by as much in order to keep the system balanced. With current network technology, the main bottleneck is most of the time …
Block-cyclic distribution seems well suited to most linear algebra algorithms, and this type of data distribution was chosen for the ScaLAPACK library as well as for the HPF language. But one has to choose a good compromise for the block size (to achieve good computation and communication efficiency and good load balancing). This choice heavily …
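The block-size trade-off mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simplified 1-D block-cyclic layout (ScaLAPACK itself distributes matrices over a 2-D process grid), and the function names `block_cyclic_owner` and `load_per_process` are hypothetical helpers written for illustration, not part of any library.

```python
def block_cyclic_owner(i, block_size, num_procs):
    """Map global index i to (owner process, local index) in a 1-D block-cyclic layout.

    Illustrative helper (assumption): consecutive blocks of `block_size`
    elements are dealt out to processes in round-robin order.
    """
    block = i // block_size            # which global block holds index i
    owner = block % num_procs          # blocks are assigned cyclically
    local_block = block // num_procs   # position of that block on its owner
    local_index = local_block * block_size + i % block_size
    return owner, local_index


def load_per_process(n, block_size, num_procs):
    """Count how many of n elements each process owns."""
    counts = [0] * num_procs
    for i in range(n):
        owner, _ = block_cyclic_owner(i, block_size, num_procs)
        counts[owner] += 1
    return counts


if __name__ == "__main__":
    # Larger blocks mean fewer, bigger messages but a coarser load balance.
    print(load_per_process(10, 2, 3))
    print(load_per_process(10, 1, 3))
```

With 10 elements on 3 processes, the per-process counts are more even for a block size of 1 than for a block size of 2, which is the load-balancing side of the compromise; the communication and computation side favors larger blocks.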
Emerging many-core processors, such as CUDA-capable NVIDIA GPUs, are promising platforms for regular parallel algorithms such as the Lattice Boltzmann Method (LBM). Since global memory on graphics devices has high latency and LBM is data intensive, the memory access pattern is an important factor in achieving good performance. Whenever possible, global memory loads …
In this paper we present a scalable protocol for conducting periodic probes of network performance in a way that minimizes collisions between separate probes. The goal of the protocol is to enable active performance monitoring of large-scale distributed computational systems and networks. We use the protocol to generate time series of measurement data that …