A Partitioned Global Address Space (PGAS) approach treats a distributed system as if the memory were shared on a global level. Given such a global view on memory, the user may program applications very much like shared memory systems. This greatly simplifies the tasks of developing parallel applications, because no explicit communication has to be specified… (More)
—The Partitioned Global Address Space (PGAS) approach is a promising programming model in high performance parallel computing that combines the advantages of distributed memory systems and shared memory systems. The PGAS model has been used on a variety of hardware platforms in the form of PGAS programming languages like Unified Parallel C (UPC), Chapel and… (More)
In this paper we present a simple but useful profiling tool for OpenMP applications similar in spirit to the MPI profiler mpiP . We describe the implementation of our tool and demonstrate its functionality on a number of test applications.
SCALASCA is a performance toolset that has been specifically designed to analyze parallel application behavior on large-scale systems, but is also well-suited for small-and medium-scale HPC platforms. SCALASCA offers an incremen-tal performance-analysis process that integrates runtime summaries with in-depth studies of concurrent behavior via event tracing,… (More)
Analyzing the scalability behavior and the overheads of Open-MP applications is an important step in the development process of scientific software. Unfortunately, few tools are available that allow an exact quantification of OpenMP related overheads and scalability characteristics. We present a methodology in which we define four overhead categories that… (More)
DASH is a realization of the PGAS (partitioned global address space) model in the form of a C++ template library. Operator overloading is used to provide global-view PGAS semantics without the need for a custom PGAS (pre-)compiler. The DASH library is implemented on top of our runtime system DART, which provides an abstraction layer on top of existing… (More)
Performance analysis of applications on supercomputers require scalable tools. The Periscope environment applies a distributed automatic online analysis and thus scales to thousands of processors. This article gives an overview of the Periscope system, from the performance property specification , via the search process, to the integration with two… (More)
—As supercomputers are being built from an ever increasing number of processing elements, the effort required to achieve a substantial fraction of the system peak performance is continuously growing. Tools are needed that give developers and computing center staff holistic indicators about the resource consumption of applications and potential performance… (More)