As supercomputers grow, understanding their behavior and performance has become increasingly challenging. New hurdles in scalability, programmability, power consumption, reliability, cost, and cooling are emerging, along with new technologies such as 3D integration, GP-GPUs, silicon photonics, and other "game changers". Currently, the HPC community lacks a…
Field programmable gate arrays (FPGAs) have long been an attractive alternative to microprocessors for computing tasks, as long as floating-point arithmetic is not required. Fueled by the advance of Moore's law, FPGAs are rapidly reaching sufficient densities to enhance peak floating-point performance as well. The question, however, is how much of this…
Due to their generic and highly programmable nature, FPGAs provide the ability to implement a wide range of applications. However, it is this nonspecific nature that has limited the use of FPGAs in scientific applications that require floating-point arithmetic. Even simple floating-point operations consume a large amount of computational resources. In this…
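To make concrete why even a simple floating-point add is expensive in generic logic, here is a minimal sketch (not from the paper) of the steps a floating-point adder performs, restricted to positive, normal, finite doubles and truncating rather than applying IEEE round-to-nearest. Each step maps to dedicated hardware (barrel shifters, wide adders, normalization logic) that an FPGA must synthesize from lookup tables.

```c
/* Simplified floating-point addition for positive, normal, finite
 * doubles; truncates instead of rounding and ignores subnormals,
 * infinities, and NaNs.  Illustrative only. */
#include <stdint.h>
#include <string.h>
#include <stdio.h>

static double fp_add_sketch(double x, double y)
{
    uint64_t a, b;
    memcpy(&a, &x, 8);
    memcpy(&b, &y, 8);
    if (a < b) { uint64_t t = a; a = b; b = t; }  /* ensure exp(a) >= exp(b) */

    int64_t  ea = (a >> 52) & 0x7FF;              /* unpack exponents        */
    int64_t  eb = (b >> 52) & 0x7FF;
    uint64_t ma = (a & 0xFFFFFFFFFFFFFULL) | (1ULL << 52);  /* implicit 1   */
    uint64_t mb = (b & 0xFFFFFFFFFFFFFULL) | (1ULL << 52);

    int64_t shift = ea - eb;                      /* align: barrel shifter   */
    mb = (shift > 63) ? 0 : (mb >> shift);

    uint64_t m = ma + mb;                         /* wide integer add        */
    int64_t  e = ea;
    if (m >> 53) { m >>= 1; e++; }                /* normalize the carry-out */

    uint64_t r = ((uint64_t)e << 52) | (m & 0xFFFFFFFFFFFFFULL);  /* repack */
    double out;
    memcpy(&out, &r, 8);
    return out;
}

int main(void)
{
    printf("%g\n", fp_add_sketch(1.5, 2.25));     /* prints 3.75             */
    return 0;
}
```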
Given the logic density of modern FPGAs, it is feasible to use FPGAs for floating-point applications. However, it is important that any floating-point units that are used be highly optimized. This paper introduces an open source library of highly optimized floating-point units for Xilinx FPGAs. The units are fully IEEE compliant and achieve approximately…
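For reference, "fully IEEE compliant" at double precision means implementing the binary64 encoding below together with subnormals, signed zeros, infinities, NaNs, and the rounding modes; this is standard IEEE-754 background, not a claim from the abstract.

```latex
% Value of a normal IEEE-754 binary64 word: sign bit s, 11-bit biased
% exponent e, 52-bit fraction f.
\[
  v \;=\; (-1)^{s}\left(1 + \frac{f}{2^{52}}\right) \times 2^{\,e-1023},
  \qquad s \in \{0,1\},\quad 0 < e < 2047,\quad 0 \le f < 2^{52}
\]
```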
Architectural Modifications to Enhance the Floating-Point Performance of FPGAs. With the density of Field Programmable Gate Arrays (FPGAs) steadily increasing, FPGAs have reached the point where they are capable of implementing complex floating-point applications. However, their general-purpose nature has limited the use of FPGAs in scientific applications…
A fundamental challenge for supercomputer architecture is that DRAM cannot feed processors data as fast as they can consume it. Therefore, many applications are memory-bandwidth bound. As the number of cores per chip increases and traditional DDR DRAM speeds stagnate, the problem is only getting worse. A variety of non-DDR 3D memory technologies…
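As an illustration of what "memory-bandwidth bound" means (a standard example, not taken from this abstract), consider the STREAM-style triad below: each iteration performs 2 flops but moves 24 bytes, so at B bytes/s of sustained DRAM bandwidth the kernel can never exceed B/12 flop/s, no matter how many cores share the chip.

```c
/* STREAM-style triad: performance is set by memory bandwidth, not by
 * the FPUs.  Per iteration: 2 flops, 24 bytes moved (load b[i], load
 * c[i], store a[i]) -- an arithmetic intensity of 1/12 flop per byte. */
#include <stddef.h>

void triad(double *restrict a, const double *restrict b,
           const double *restrict c, double s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + s * c[i];
}
```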
Designing supercomputer architectures and applications is becoming more difficult because of their increased size and complexity, because of new technologies, and because of new constraints such as power and thermal limits. The Structural Simulation Toolkit (SST) is an architectural simulation framework designed to assist in the design, evaluation, and…
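The sketch below shows the kind of event-driven core loop that component-based architectural simulators such as SST are built around; it is a generic illustration in C, not SST's actual API.

```c
/* Minimal discrete-event simulation loop: events carry a timestamp
 * and a handler, and the simulator repeatedly dispatches the earliest
 * pending event.  Generic sketch only. */
#include <stdio.h>
#include <stdlib.h>

typedef struct event {
    unsigned long time;                /* simulated time             */
    void (*handler)(struct event *);   /* component callback         */
    struct event *next;                /* sorted singly-linked queue */
} event_t;

static event_t *queue = NULL;

static void schedule(event_t *ev)      /* insert in timestamp order  */
{
    event_t **p = &queue;
    while (*p && (*p)->time <= ev->time)
        p = &(*p)->next;
    ev->next = *p;
    *p = ev;
}

static void run(void)                  /* main simulation loop       */
{
    while (queue) {
        event_t *ev = queue;
        queue = ev->next;
        ev->handler(ev);
        free(ev);
    }
}

static void tick(event_t *ev)
{
    printf("event at t=%lu\n", ev->time);
}

int main(void)
{
    for (unsigned long t = 30; t > 0; t -= 10) {
        event_t *ev = malloc(sizeof *ev);
        ev->time = t;
        ev->handler = tick;
        schedule(ev);
    }
    run();                             /* prints t=10, 20, 30        */
    return 0;
}
```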
Power and energy concerns are motivating chip manufacturers to consider future hybrid-core processor designs that combine a small number of traditional cores optimized for single-thread performance with a large number of simpler cores optimized for throughput performance. This trend is likely to impact the way compute resources for network protocol…
Advances in FPGA technology have led to dramatic improvements in double precision floating-point performance. Modern FPGAs boast several GigaFLOPs of raw computing power. Unfortunately, this computing power is distributed across 30 floating-point units with over 10 cycles of latency each. The user must find two orders of magnitude more parallelism than is…
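The "two orders of magnitude" claim follows from a Little's-law style argument (the framing is our gloss; the figures, 30 units with roughly 10 cycles of latency, are the abstract's): keeping every pipelined unit busy requires one independent operation per unit per cycle of latency.

```latex
\[
  \text{operations in flight} \;=\; N_{\text{units}} \times L
  \;\approx\; 30 \times 10 \;=\; 300
\]
```

Compared with the handful of independent operations needed to saturate a single pipelined CPU FPU, that is roughly two orders of magnitude more parallelism the user must expose.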