Ramachandra C. Nanjegowda

OpenMP relies heavily on barrier synchronization to coordinate the work of threads that are performing the computations in a parallel region. A good implementation of barriers is thus an important part of any implementation of this API. As the number of cores in shared and distributed shared memory machines continues to grow, the quality of the barrier …
OpenMP is a de facto standard API for shared memory programming with widespread vendor support and a large user base. The OpenMP Architecture Review Board has sanctioned an interface specification known as the “OpenMP Runtime API for Profiling” to enable tools to collect performance data for OpenMP programs. This paper describes the interface …
OpenSHMEM is a recently introduced open standard for all SHMEM libraries. In this paper we discuss the different aspects of porting the NAS Parallel Benchmarks from their MPI 1 implementations to those that use the new OpenSHMEM library API. We compare performance and scalability of these unoptimized OpenSHMEM NAS benchmarks with their MPI 1, and in some …
Developing shared memory parallel programs using OpenMP is straightforward, but getting good performance in terms of speedup and scalability can be difficult. This paper demonstrates the functionality of a collector-based dynamic optimization framework called DARWIN that uses collected performance data as feedback to affect the behavior of the program …