As file systems reach the petabyte scale, users and administrators are increasingly interested in acquiring high-level analytical information for file management and analysis. Two particularly important tasks are the processing of aggregate and top-k queries which, unfortunately, cannot be quickly answered by hierarchical file systems such as ext3 and …
As the HPC community moves into the exascale computing era, application energy is becoming as large a concern as performance. Optimizing for energy will be essential in the effort to overcome the limited power envelope. Existing efforts to optimize energy in applications employ Dynamic Voltage and Frequency Scaling (DVFS) to maximize energy savings in …
As the HPC community moves into the exascale computing era, application energy has become a big concern. Tuning for energy will be essential in the effort to overcome the limited power envelope. How is tuning for lower energy related to tuning for faster execution? Understanding that relationship can guide both performance and energy tuning for exascale. In …
Power, energy, and compute time are all important metrics that can act as either objectives or constraints in program or system optimization. Recent microprocessors include sensors (counters) for monitoring these metrics as well as on-chip system controllers that may use this information. Code optimization is relatively straightforward if the measurements …
Recent developments in modern computational accelerators like Graphics Processing Units (GPUs) and coprocessors provide great opportunities for making scientific applications run faster than ever before. However, efficient parallelization of scientific code using new programming tools like CUDA requires a high level of expertise that is not available to …
Computational modeling of cardiac electrophysiology is a powerful tool for studying arrhythmia mechanisms. In particular, cardiac models are useful for gaining insights into experimental studies, and in the foreseeable future they will be used by clinicians to improve therapy for patients suffering from complex arrhythmias. Such models are highly …
In this paper, we present a study on the parallelization of the shortest path graph kernel from machine learning theory. We first present a fast sequential implementation of the graph kernel, which we refer to as Fast Computation of Shortest Path Kernel (FCSP). Then we explore two different parallelization schemes on the CPU and four different implementations …
Performance tuning is an ongoing activity at most HPC sites. Small performance improvements can save thousands of dollars. Run-to-run performance variations significantly impact performance tuning. Not being able to tell which code version is faster (or more energy efficient) in a single run greatly increases the computational expense and uncertainty for …
The HPC community is striving to achieve exascale computing within a power cap of 20 megawatts. This paper studies the impact of power-capped environments on compiler-transformed programs. The impact of CPU clock modulation (a mechanism for reducing CPU frequency) on program variants of several Polybench benchmarks is studied. Our evaluation shows at least …
Based on anisometric noble-metal nanocrystals, a universal fabrication protocol for preparing 3D supercrystals with controlled orientation on a chip has been developed. A comparison of the surface-enhanced Raman scattering (SERS) behavior of 3D nanorod supercrystals aligned vertically and parallel to the chip indicates that the SERS-enhancing ability and …