James C. Clippinger

This paper presents an architecture-independent method for performing BMMC (bit-matrix-multiply/complement) permutations on distributed-memory multiprocessors. All interprocessor communication uses the MPI function MPI_Sendrecv_replace(). The number of elements and the number of processors must both be powers of 2, with at least one element per processor, and there is no inherent upper bound...
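For readers unfamiliar with the term, a BMMC permutation sends the element at n-bit source index x to target index y = A x XOR c, where A is a nonsingular n-by-n bit matrix and c is an n-bit complement vector, with arithmetic over GF(2). The sketch below illustrates only that index mapping on a single in-memory list. It is not the paper's method: the function names and the packing of each matrix row into an integer bitmask are assumptions made for illustration, and the interprocessor exchange via MPI_Sendrecv_replace() is not shown.

```python
def bmmc_target(x, rows, c):
    """Return y = A x XOR c over GF(2) for source index x.

    rows[i] is row i of A packed as an integer bitmask (bit j holds A[i][j]).
    Bit i of A x is the parity of the bits shared by rows[i] and x.
    This representation is an assumption made for this sketch.
    """
    y = 0
    for i, row in enumerate(rows):
        parity = bin(row & x).count("1") & 1
        y |= parity << i
    return y ^ c


def bmmc_permute(data, rows, c):
    """Apply the BMMC permutation to a list whose length is 2**len(rows)."""
    n = len(rows)
    if len(data) != 1 << n:
        raise ValueError("len(data) must equal 2**n for an n-by-n matrix")
    out = [None] * len(data)
    seen = set()
    for x, value in enumerate(data):
        y = bmmc_target(x, rows, c)
        if y in seen:
            # Two sources collided, so A is singular and this is not a permutation.
            raise ValueError("matrix is singular over GF(2)")
        seen.add(y)
        out[y] = value
    return out


# Bit reversal on 8 elements: row i of A selects bit (n - 1 - i) of x, and c = 0.
n = 3
bit_reversal_rows = [1 << (n - 1 - i) for i in range(n)]
permuted = bmmc_permute(list(range(8)), bit_reversal_rows, 0)
```

With these inputs, `permuted` is `[0, 4, 2, 6, 1, 5, 3, 7]`, the familiar bit-reversal ordering. A nonzero c additionally flips the target-index bits that are set in c.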