Simultaneous Multithreading machines fetch and execute instructions from multiple instruction streams to increase system utilization and speed up the execution of jobs. When there are more jobs in the system than there is hardware to support simultaneous execution, the operating system scheduler must choose the set of jobs to coschedule. This paper…
Cycle-accurate simulation is far too slow for modeling the expected performance of full parallel applications on large HPC systems. And just running an application on a system and observing wall-clock time tells you nothing about why the application performs as it does (and is anyway impossible on yet-to-be-built systems). Here we present a framework for…
Scientific applications will have to scale to many thousands of processor cores to reach petascale. Therefore, it is crucial to understand the factors that affect their scalability. Here we examine the strong scaling of four representative codes that exhibit different behaviors on four machines. We demonstrate the efficiency and analytic power of our…
SPECFEM3D_GLOBE is a spectral-element application enabling the simulation of global seismic wave propagation in 3D anelastic, anisotropic, rotating and self-gravitating Earth models at unprecedented resolution. A fundamental challenge in global seismology is to model the propagation of waves with periods between 1 and 2 seconds, the highest frequency…
The size of supercomputers in numbers of processors is growing exponentially. Today's largest supercomputers have upwards of a hundred thousand processors and tomorrow's may have on the order of one million. The applications that run on these systems commonly coordinate their parallel activities via MPI; a trace of these MPI communication events is an…
The Gordon data-intensive supercomputer entered service in 2012 as an allocable computing system in the NSF Extreme Science and Engineering Discovery Environment (XSEDE) program. Gordon has several innovative features that make it ideal for data-intensive computing, including 1,024 compute nodes based on Intel's Sandy Bridge (Xeon E5)…