David H. Bailey

A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of five parallel kernels and three simulated application benchmarks. Together they mimic the computation and data movement characteristics of large-scale computational fluid dynamics (CFD) applications. The principal…
This paper describes the "fractional Fourier transform", which admits computation by an algorithm whose complexity is proportional to that of the fast Fourier transform (FFT) algorithm. Whereas the discrete Fourier transform (DFT) is based on integral roots of unity e^(-2πi/n), the fractional Fourier transform is based on fractional roots of unity e^(-2πiα), where α is arbitrary…
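For illustration, here is a minimal NumPy sketch that evaluates such a fractional transform directly from its defining sum; the function name, the sign convention, and the test are assumptions made for this sketch, and the paper's point is that the same quantity can be obtained with FFT-like O(n log n) cost rather than the O(n^2) loop shown here.

import numpy as np

def fractional_dft(x, alpha):
    # Directly evaluate G_k = sum_j x_j * exp(-2*pi*i*j*k*alpha) for k = 0..n-1.
    # With alpha = 1/n this reduces to the ordinary DFT; for arbitrary alpha it
    # is a transform based on the fractional root of unity exp(-2*pi*i*alpha).
    x = np.asarray(x, dtype=complex)
    n = x.size
    j = np.arange(n)
    k = j.reshape(-1, 1)
    return (x * np.exp(-2j * np.pi * j * k * alpha)).sum(axis=1)

# Sanity check: alpha = 1/n recovers the standard DFT.
x = np.random.rand(8)
assert np.allclose(fractional_dft(x, 1.0 / 8), np.fft.fft(x))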
Benchmark results for the Numerical Aerodynamic Simulation (NAS) Program at NASA Ames Research Center, which is dedicated to advancing the science of computational aerodynamics, are presented. The benchmark performance results are for the Y-MP, Y-MP EL, and C-90 systems from Cray Research; the TC2000 from Bolt Beranek and Newman; the Gamma iPSC/860 from…
Conventional algorithms for computing large one-dimensional fast Fourier transforms (FFTs), even those algorithms recently developed for vector and parallel computers, are largely unsuitable for systems with external or hierarchical memory. The principal reason for this is the fact that most FFT algorithms require at least m complete passes…
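The concern here is memory traffic: each of those m passes strides through the entire data set. One well-known remedy is a blocked ("four-step") factorization that replaces the strided passes with a few sweeps over large contiguous blocks. A minimal NumPy sketch of that idea follows; the function name, the row-major n1 x n2 layout, and the output ordering are choices made for this sketch, not details taken from the paper.

import numpy as np

def four_step_fft(x, n1, n2):
    # Length n = n1*n2 DFT via a blocked "four-step" factorization. The data
    # are viewed as an n1 x n2 matrix, and each step sweeps that matrix in
    # large contiguous blocks instead of making log2(n) strided passes.
    n = n1 * n2
    a = np.asarray(x, dtype=complex).reshape(n1, n2)
    a = np.fft.fft(a, axis=0)                        # 1: n2 column FFTs of length n1
    k1 = np.arange(n1).reshape(-1, 1)
    j2 = np.arange(n2).reshape(1, -1)
    a *= np.exp(-2j * np.pi * k1 * j2 / n)           # 2: twiddle-factor scaling
    a = np.fft.fft(a, axis=1)                        # 3: n1 row FFTs of length n2
    return a.T.reshape(n)                            # 4: transpose to natural order

# Sanity check against NumPy's FFT.
x = np.random.rand(32)
assert np.allclose(four_step_fft(x, 4, 8), np.fft.fft(x))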
In 2003, DARPA's High Productivity Computing Systems (HPCS) program released the HPC Challenge (HPCC) benchmark suite. It examines the performance of HPC architectures using well-known computational kernels with a variety of memory access patterns. Consequently, HPCC results bound the performance of real applications as a function of memory access characteristics and define performance…
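As a toy illustration of what different memory access patterns means, the sketch below contrasts a contiguous, bandwidth-bound update with a scattered, latency-bound one, in the spirit of the suite's STREAM-style and RandomAccess-style kernels; the function names and problem sizes are illustrative only and are not taken from HPCC itself.

import numpy as np

def stream_triad(a, b, c, alpha):
    # Contiguous, bandwidth-bound access: a[i] = b[i] + alpha * c[i].
    a[:] = b + alpha * c

def random_update(table, indices, values):
    # Scattered, latency-bound access: XOR-update the table at random locations.
    table[indices] ^= values

n = 1 << 20
a, b, c = np.zeros(n), np.random.rand(n), np.random.rand(n)
stream_triad(a, b, c, 3.0)

table = np.zeros(n, dtype=np.uint64)
idx = np.random.randint(0, n, size=n)
vals = np.random.randint(0, 1 << 32, size=n, dtype=np.uint64)
random_update(table, idx, vals)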
A quad-double number is an unevaluated sum of four IEEE double precision numbers, capable of representing at least 212 bits of significand. We present the algorithms for various arithmetic operations (including the four basic operations and various algebraic and transcendental operations) on quad-double numbers. The performance of the algorithms,…
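Such multi-component formats are assembled from error-free floating-point transformations. The short sketch below shows two standard ones (Knuth's two-sum and Dekker's quick-two-sum) in Python; it is background illustration only, not code from the paper, whose algorithms operate on four-component expansions rather than the single pair shown here.

def two_sum(a, b):
    # Error-free transformation (Knuth): return (s, e) with s = fl(a + b) and
    # a + b = s + e exactly, where e is the rounding error of the sum.
    s = a + b
    v = s - a
    e = (a - (s - v)) + (b - v)
    return s, e

def quick_two_sum(a, b):
    # Faster variant (Dekker), valid when |a| >= |b|.
    s = a + b
    e = b - (s - a)
    return s, e

# The error term recovers what a single double cannot hold.
s, e = two_sum(1.0, 1e-30)
assert s == 1.0 and e == 1e-30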
This article describes the design rationale, a C implementation, and conformance testing of a subset of the new Standard for the BLAS (Basic Linear Algebra Subprograms): Extended and Mixed Precision BLAS. Permitting higher internal precision and mixed input/output types and precisions allows us to implement some algorithms that are simpler, more accurate,…
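The idea of higher internal precision can be shown with a toy dot product: narrow inputs and outputs, wide accumulation, one final rounding. The sketch below illustrates that principle only; the function names are invented here and this is not the BLAS interface described in the article.

import numpy as np

def dot_mixed(x, y):
    # float32 inputs, float64 internal accumulation, float32 output.
    acc = 0.0                              # Python float is an IEEE double
    for xi, yi in zip(x, y):
        acc += float(xi) * float(yi)       # products and sums carried in double
    return np.float32(acc)                 # round once, at the end

def dot_narrow(x, y):
    # Same dot product accumulated entirely in float32, for comparison.
    acc = np.float32(0.0)
    for xi, yi in zip(x, y):
        acc = np.float32(acc + xi * yi)
    return acc

x = np.random.rand(10000).astype(np.float32)
y = np.random.rand(10000).astype(np.float32)
ref = np.dot(x.astype(np.float64), y.astype(np.float64))
print(abs(dot_mixed(x, y) - ref), abs(dot_narrow(x, y) - ref))  # mixed is typically closer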
Bailey's work is supported by the Director, Office of Computational and Technology Research, Division of Mathematical, Information, and Computational Sciences of the U.S. Department of Energy, under contract number DE-AC03-76SF00098. We propose a theory to explain the random behavior of the digits in the expansions of fundamental mathematical constants. At the…