Clustered computing environments, although becoming the predominant high-performance computing platform of choice, continue to grow in complexity. It is relatively easy to achieve good performance with real-world MPI applications on such platforms, but obtaining the best possible MPI performance is still an extremely difficult task, requiring painstaking…
The performance optimization of scientific applications usually requires in-depth knowledge of the hardware and software. A performance tuning mechanism is suggested to automatically tune OpenACC parameters to adapt to the execution environment on a given system. A historic-learning-based methodology is suggested to prune the parameter search space for a…
The Abstract Data and Communication Library (ADCL) is an adaptive communication library that optimizes application-level collective communication operations at runtime. For a given communication pattern, the library provides a large number of implementations and incorporates runtime selection logic to choose the implementation leading to the highest…
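The runtime selection idea described above can be illustrated with a minimal sketch. It assumes the simplest possible strategy: measure every candidate a few times and keep the cheapest one. The names `select_implementation`, `wall_time`, `measure`, and `trials` are hypothetical and are not part of ADCL's API; the abstract does not say which selection logic ADCL actually uses.

```python
import time
from typing import Callable, Dict

Impl = Callable[[], None]


def wall_time(impl: Impl) -> float:
    """Run one candidate once and return its elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    impl()
    return time.perf_counter() - start


def select_implementation(
    candidates: Dict[str, Impl],
    measure: Callable[[Impl], float] = wall_time,
    trials: int = 3,
) -> str:
    """Return the name of the candidate with the lowest total measured cost.

    Hypothetical helper: exhaustive measurement is an assumed strategy,
    not necessarily the selection logic used by ADCL.
    """
    if not candidates:
        raise ValueError("at least one candidate implementation is required")
    if trials < 1:
        raise ValueError("trials must be at least 1")

    totals = {name: 0.0 for name in candidates}
    # Interleave the candidates across trials so that transient system
    # noise is spread over all of them rather than hitting just one.
    for _ in range(trials):
        for name, impl in candidates.items():
            totals[name] += measure(impl)

    # Every candidate ran the same number of trials, so comparing totals
    # is equivalent to comparing mean costs.
    return min(totals, key=lambda name: totals[name])
```

A caller would pass a dictionary that maps implementation names to zero-argument callables, each performing the same communication pattern in a different way, and then use the returned name for the rest of the run. The `measure` parameter is injectable so that the cost function can be swapped out, for example for a deterministic one in tests.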
Emerging computing systems have a wide variety of hardware and software components influencing the performance of parallel applications, presenting end-users with a (nearly) unique execution environment on each parallel machine. One of the big challenges of High Performance Computing is therefore to develop portable and efficient codes for any execution…
An explicit marching-on-in-time (MOT)-based time-domain volume integral equation (TDVIE) solver has recently been developed for characterizing transient electromagnetic wave interactions on arbitrarily shaped dielectric bodies (A. Al-Jarro et al., IEEE Trans. Antennas Propag., vol. 60, no. 11, 2012). The solver discretizes the spatio-temporal convolutions…
Minimizing the communication costs associated with a parallel application is a key challenge for the scalability of petascale and future exascale applications. This paper introduces the notion of a personalized MPI library that is customized for a particular application and platform. The work is based on the Open MPI communication library, which has a large…
Graphics processing units (GPUs) are gradually becoming mainstream in high-performance computing, as their ability to improve the performance of a wide range of scientific applications many times over relative to multi-core CPUs has been clearly demonstrated. In this paper, implementation and performance-tuning details for porting an…