We present our implementation of the HPC Challenge Class II (productivity) benchmarks in the Charm++ [1] programming paradigm. Our submission focuses on explaining how over-decomposed, message-driven, migratable objects enhance the clarity of expression of parallel programs and also enable the runtime system to deliver portable performance. Our submission …
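As a rough illustration of the over-decomposition idea, the sketch below creates many more migratable work objects than processors and lets a balancer place them by measured load. It is plain Python, not Charm++ itself (which expresses chares in C++ with interface files), and every name in it is hypothetical.

    # Illustrative sketch (not Charm++): over-decomposition creates many more
    # migratable work objects than processors, so a runtime can rebalance by
    # migrating objects between processors. All names here are hypothetical.

    import heapq

    def greedy_rebalance(object_loads, num_procs):
        """Assign each object to the currently least-loaded processor."""
        heap = [(0.0, p) for p in range(num_procs)]  # (total load, processor)
        heapq.heapify(heap)
        placement = {}
        # Placing the heaviest objects first gives a better greedy packing.
        for obj, load in sorted(object_loads.items(), key=lambda kv: -kv[1]):
            total, p = heapq.heappop(heap)
            placement[obj] = p
            heapq.heappush(heap, (total + load, p))
        return placement

    # 64 objects over-decomposed onto 4 processors, with uneven per-object loads.
    loads = {i: 1.0 + (i % 7) * 0.3 for i in range(64)}
    print(greedy_rebalance(loads, 4))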
As we move to exascale machines, both peak power demand and total energy consumption have become prominent challenges. A significant portion of that power and energy consumption is devoted to cooling, which we strive to minimize in this work. We propose a scheme based on a combination of limiting processor temperatures using Dynamic Voltage and Frequency Scaling (DVFS) …
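A minimal sketch of the temperature-limiting ingredient, assuming the standard Linux sysfs interfaces for thermal readings and cpufreq frequency caps; the 60 °C cap and 100 MHz step are illustrative values, not the paper's actual control policy.

    # Minimal temperature-capped DVFS loop using standard Linux sysfs files.
    # Requires root to write the frequency cap; stop with Ctrl-C. The cap and
    # step values below are hypothetical.

    import time

    THERMAL = "/sys/class/thermal/thermal_zone0/temp"                   # millidegrees C
    MAX_FREQ = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq"  # kHz

    T_CAP_MC = 60_000   # cap core temperature at 60 C (illustrative)
    STEP_KHZ = 100_000  # adjust the cap in 100 MHz steps (illustrative)

    def read_int(path):
        with open(path) as f:
            return int(f.read().strip())

    def write_int(path, value):
        with open(path, "w") as f:
            f.write(str(value))

    while True:
        temp = read_int(THERMAL)
        freq = read_int(MAX_FREQ)
        if temp > T_CAP_MC:
            write_int(MAX_FREQ, freq - STEP_KHZ)   # too hot: lower the cap
        elif temp < T_CAP_MC - 5_000:
            write_int(MAX_FREQ, freq + STEP_KHZ)   # headroom: raise it again
        time.sleep(1)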
The Canadian c-spine rule (CCR) allows safe, reproducible use of radiography in alert, stable patients with potential c-spine injury in the emergency setting [Stiell, I., Clement, C., McKnight, R., Brison, R., Schull, M., Lowe, B., Worthington, J., Eisenhauer, M., Cass, D., Greenberg, G., MacPhail, I., Dreyer, J., Lee, J., Bandiera, G., Reardon, M., …
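The CCR itself is a three-step decision algorithm, so its published logic can be written down directly. The sketch below is a simplified illustration for readers, not a clinical tool, and its parameter names are hypothetical.

    # Sketch of the CCR's published three-step logic (Stiell et al.),
    # simplified for illustration only; not for clinical use.

    def ccr_needs_radiography(age, dangerous_mechanism, paresthesias,
                              simple_rear_end_mvc, sitting_in_ed, ambulatory,
                              delayed_neck_pain, midline_tenderness,
                              can_rotate_45_both_ways):
        # Step 1: any high-risk factor mandates radiography.
        if age >= 65 or dangerous_mechanism or paresthesias:
            return True
        # Step 2: a low-risk factor must be present before testing rotation.
        low_risk = (simple_rear_end_mvc or sitting_in_ed or ambulatory
                    or delayed_neck_pain or not midline_tenderness)
        if not low_risk:
            return True
        # Step 3: active 45-degree rotation both ways clears the patient.
        return not can_rotate_45_both_ways

    # Example: a 70-year-old triggers the step-1 high-risk criterion.
    print(ccr_needs_radiography(
        age=70, dangerous_mechanism=False, paresthesias=False,
        simple_rear_end_mvc=False, sitting_in_ed=True, ambulatory=False,
        delayed_neck_pain=False, midline_tenderness=False,
        can_rotate_45_both_ways=True))   # True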
Providing homogeneous access ('services') to heterogeneous environmental data distributed across heterogeneous computing systems on a wide area network requires a robust information paradigm that can mediate between differing storage and information formats. While there are a number of ISO standards that provide some guidance on how to do this, the …
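One way to picture the mediation idea is an adapter layer that exposes a single service contract over differing storage formats. The sketch below is a hypothetical Python illustration of that pattern, not the ISO-based implementation discussed here.

    # Hypothetical sketch of the mediation idea: one homogeneous interface,
    # with per-format adapters behind it. Not the actual implementation.

    from abc import ABC, abstractmethod

    class DatasetService(ABC):
        """Homogeneous access contract, independent of storage format."""
        @abstractmethod
        def variables(self): ...
        @abstractmethod
        def read(self, variable): ...

    class CsvAdapter(DatasetService):
        def __init__(self, path):
            import csv
            with open(path, newline="") as f:
                rows = list(csv.DictReader(f))
            self._cols = {k: [r[k] for r in rows] for k in (rows[0] if rows else {})}
        def variables(self):
            return list(self._cols)
        def read(self, variable):
            return self._cols[variable]

    class InMemoryAdapter(DatasetService):
        """Stand-in for a binary format such as NetCDF."""
        def __init__(self, arrays):
            self._arrays = arrays
        def variables(self):
            return list(self._arrays)
        def read(self, variable):
            return self._arrays[variable]

    # Client code sees one 'service' regardless of the underlying format.
    ds = InMemoryAdapter({"temp": [280.1, 281.4]})
    for v in ds.variables():
        print(v, ds.read(v))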
This paper presents scalable algorithms and data structures for adaptive mesh refinement computations. We describe a novel mesh restructuring algorithm that uses a constant number of collectives regardless of the refinement depth. To further increase scalability, we describe a localized hierarchical coordinate-based …
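A hierarchical coordinate-based block index can be illustrated with a quadtree path encoding, where each refinement level appends two bits, so parent and child relations are local bit operations with no global lookup table. This is a generic sketch, not necessarily the paper's exact encoding.

    # Quadtree path encoding for AMR block IDs: the ID records the quadrant
    # chosen at each refinement level, so parent/child/coordinate queries are
    # purely local bit manipulation. Generic illustration only.

    def child_id(block_id, quadrant):
        """Append one refinement level; quadrant in 0..3 (two bits: y, x)."""
        return (block_id << 2) | quadrant

    def parent_id(block_id):
        return block_id >> 2

    def coordinates(block_id, depth):
        """Recover integer (x, y) at a given depth by de-interleaving bits."""
        x = y = 0
        for level in range(depth):
            quad = (block_id >> (2 * level)) & 0b11
            x |= (quad & 1) << level
            y |= ((quad >> 1) & 1) << level
        return x, y

    root = 1  # sentinel leading bit marks the depth implicitly
    b = child_id(child_id(root, 3), 0)   # refine twice: quadrant 3, then 0
    print(bin(b), parent_id(b) == child_id(root, 3), coordinates(b, 2))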
AIM To determine the potential of the Canadian Cervical Spine Rule (CCR) to safely reduce the number of cervical spine (c-spine) radiographs performed in the UK emergency department setting. METHODS The study was conducted in two UK emergency departments with a combined annual attendance of >150,000 adult patients. Over the 24-month trial period, 148 …
The design and manufacture of present-day CPUs cause inherent variation in supercomputer architectures, such as differences in the power draw and temperature of the chips. The variation also manifests itself as frequency differences among processors under Turbo Boost dynamic overclocking. This variation can lead to unpredictable and suboptimal performance in tightly …
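A common mitigation, sketched below, is to measure each processor's sustained frequency and partition work in proportion to it; this is a generic illustration of speed-proportional partitioning, not necessarily the scheme developed in the paper.

    # Speed-proportional work partitioning: hand out iterations in proportion
    # to each processor's measured sustained rate, so frequency variation
    # under Turbo Boost does not leave fast chips waiting on slow ones.

    def partition(total_work, measured_ghz):
        total = sum(measured_ghz)
        shares = [int(total_work * f / total) for f in measured_ghz]
        shares[0] += total_work - sum(shares)   # rounding remainder to rank 0
        return shares

    # Four 'identical' chips running at different sustained frequencies.
    print(partition(1_000_000, [3.4, 3.1, 3.3, 2.9]))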
This short paper outlines the key components of the NERC DataGrid: a discovery service, a vocabulary service, and a software stack deployed both centrally, to provide a data discovery portal, and at data providers, to provide local portals and data and metadata services.
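The vocabulary-service idea can be sketched as a mapping from providers' local terms onto shared canonical terms, so that one discovery query matches records from every provider. Everything below is hypothetical and heavily simplified, not the NERC DataGrid software.

    # Hypothetical sketch of a vocabulary service feeding a discovery query:
    # providers' local terms map to canonical terms, so one query spans all
    # providers. Simplified illustration only.

    VOCAB = {  # local term -> canonical term
        "sea surface temp": "sea_surface_temperature",
        "SST": "sea_surface_temperature",
        "wind speed": "wind_speed",
    }

    RECORDS = [
        {"provider": "A", "variable": "SST"},
        {"provider": "B", "variable": "sea surface temp"},
        {"provider": "C", "variable": "wind speed"},
    ]

    def discover(canonical_term):
        return [r for r in RECORDS
                if VOCAB.get(r["variable"]) == canonical_term]

    print(discover("sea_surface_temperature"))   # matches providers A and B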
Dense LU factorization is a prominent benchmark used to rank the performance of supercomputers. Many implementations use block-cyclic distributions of matrix blocks onto a two-dimensional process grid. The process grid dimensions drive a trade-off between communication and computation and are architecture- and implementation-sensitive. The critical panel …
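The block-cyclic mapping mentioned here is standard: block (i, j) of the matrix is owned by process (i mod P, j mod Q) on a P × Q grid, and the choice of P and Q drives the trade-off the abstract describes. A small sketch:

    # Standard 2D block-cyclic mapping of matrix blocks onto a P x Q process
    # grid: block (i, j) is owned by process (i mod P, j mod Q). Varying P
    # and Q shifts the communication/computation trade-off.

    def owner(i, j, P, Q):
        return (i % P, j % Q)

    P, Q, nblocks = 2, 3, 6
    for i in range(nblocks):
        print(" ".join(f"({r},{c})" for r, c in
                       (owner(i, j, P, Q) for j in range(nblocks))))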
Termination detection is relevant for signaling completion (all processors are idle and no messages are in flight) of many operations in distributed systems, including work stealing algorithms, dynamic data exchange, and dynamically structured computations. In the face of growing supercomputers and the increasing likelihood that each job may encounter faults, …
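The classic difficulty is that every processor can look idle while a message is still in flight; count-based detectors therefore compare total sends and receives over two consecutive control waves. The sketch below shows that check in simplified form; it is not the fault-tolerant protocol this paper develops.

    # Count-based termination check, simplified: completion holds only when
    # every process is idle AND total sends equal total receives, and the
    # observation is stable across two consecutive control waves (so no
    # in-flight message can be missed between snapshots).

    def wave(processes):
        """One control wave: gather (all idle, sends, receives)."""
        idle = all(p["idle"] for p in processes)
        sent = sum(p["sent"] for p in processes)
        recv = sum(p["received"] for p in processes)
        return idle, sent, recv

    def terminated(processes):
        first = wave(processes)
        second = wave(processes)   # in a real system, taken after the first completes
        return first == second and first[0] and first[1] == first[2]

    procs = [{"idle": True, "sent": 5, "received": 5},
             {"idle": True, "sent": 3, "received": 3}]
    print(terminated(procs))   # True: all idle, counts balanced and stable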