High productivity is critical in harnessing the power of high-performance computing systems to solve science and engineering problems. Bridging the gap between hardware complexity and software limitations is a challenge. Despite significant progress in programming languages, compilers, and performance tools, tuning an application remains largely a …
Torus networks are commonly used in massively parallel computers, and their performance often becomes the constraint on total application performance. In an asymmetric torus network in particular, traffic along the longest axis is the performance bottleneck for all-to-all communication, so it is important to schedule the longest-axis traffic smoothly. …
Video on demand (VOD) systems are becoming popular. They have special requirements for receiving VOD data steadily, but common general communication protocols cannot meet their requirements. Several new protocols have been designed to fulfill this service requirement, but it takes a long time for a protocol to be supported from end to end. We propose a VOD …
Simultaneous Multi-Thread (SMT) techniques are becoming popular because they increase the efficiency of CPU resource usage by allowing multiple threads to run on a single physical processor at a very fine granularity. Emerging real-time applications, however, may not benefit from the SMT techniques because those techniques often compromise the predictable …
To optimize various high performance computing (HPC) programs on complex supercomputer architectures, source-to-source optimization tools are becoming important to support network-architecture-specific and application-specific performance optimization in addition to compiler optimizations. Because of the trial-and-error nature of performance optimization work and …
Multicore processors are becoming dominant in the high performance computing (HPC) area, so multithread programming with OpenMP is becoming a key to good performance on such processors, though debugging problems remain. In particular, it is difficult to detect data races among threads with nondeterministic results, thus calling for tools to detect data …
Video on demand (VOD) systems are becoming popular. They have special requirements for receiving VOD data steadily, but common general communication protocols cannot meet their requirements. Several new protocols have been designed to fulfill this service requirement, but it takes a long time for a protocol to be supported from …
Deploying an application onto a target platform for high performance often demands manual tuning by experts. As machine architectures become increasingly complex, tuning becomes even more challenging and calls for systematic approaches. In our earlier work we presented a prototype that efficiently combines expert knowledge, static analysis, and runtime …