Paul N. Swarztrauber

Several multiprocessor FFTs are developed in this paper for both vector multiprocessors with shared memory and the hypercube. Two FFTs for vector multiprocessors are given that compute an ordered transform and have a stride of one except for a single "link" step. Since multiple FFTs provide additional options for both vectorization and distribution, we show …
The fast Fourier transform (FFT) is often used to compute numerical approximations to continuous Fourier and Laplace transforms. However, a straightforward application of the FFT to these problems often requires a large FFT to be performed, even though most of the input data to this FFT may be zero and only a small fraction of the output data may be of …
We examine design alternatives for ordered FFT algorithms on massively parallel hypercube multiprocessors such as the Connection Machine. Particular emphasis is placed on reducing communication, which is known to dominate the overall computing time. To this end we combine the order and computational phases of the FFT and also use sequence-to-processor maps …
The original Cooley-Tukey FFT was published in 1965 and presented for sequences with length N equal to a power of two. However, in the same paper they noted that their algorithm could be generalized to composite N in which the length of the sequence was a product of small primes. In 1967, Bergland presented an algorithm for composite N and variants of his …
The general theory of compatibility conditions for the differentiability of solutions to initial-boundary value problems is well known. This paper introduces the application of that theory to numerical solutions of partial differential equations and its ramifications on the performance of high-order methods. Explicit application of boundary conditions (BCs) …